Meta releases Llama 4 Maverick 17B-128E multimodal model on Hugging Face
Meta released Llama 4 Maverick, a mixture-of-experts (MoE) model with 17B active parameters and 128 experts, on Hugging Face. The model supports image-text-to-text tasks, making it a multimodal open-weights release. It is part of the Llama 4 generation, Meta's latest open-weights frontier effort built on an MoE architecture.
Related events (8)
Meta releases Llama 4 Maverick 17B-128E multimodal instruct model on Hugging Face
Meta released Llama 4 Maverick, a mixture-of-experts (MoE) model with 17B active parameters and 128 experts, as an image-text-to-text instruct model on Hugging Face. The model accepts multimodal inputs and supports multiple languages, including Arabic, German, and English. With more than 28K downloads and 493 likes shortly after release, it is seeing significant early adoption.
Meta releases Llama 4 Scout 17B-16E multimodal model on Hugging Face
Meta released Llama 4 Scout, a 17B active parameter model with 16 experts (mixture-of-experts architecture), on Hugging Face. The model supports image-text-to-text tasks, making it a multimodal open-weights release. With over 14,000 downloads and 249 likes shortly after release, it is seeing meaningful early adoption.
Meta releases Llama 4 Scout 17B-16E instruct model on Hugging Face
Meta released Llama 4 Scout, a mixture-of-experts instruct model with 17B active parameters, 16 experts, and image-text-to-text (multimodal) capabilities, published on Hugging Face under the meta-llama organization. The model supports multiple languages, including Arabic, German, and English. With over 420K downloads and 1,300 likes shortly after release, it is seeing significant community uptake.
Meta releases Llama 3.2 90B Vision multimodal model on Hugging Face
Meta released Llama 3.2 90B Vision, a large multimodal model supporting image-text-to-text tasks, published on Hugging Face under the meta-llama organization. The model is part of the Llama 3.2 family and supports English, German, and French. This is a significant open-weights multimodal release from Meta, extending the Llama 3 series with vision capabilities at the 90B parameter scale.
Meta releases Llama 3.2 11B Vision multimodal model on Hugging Face
Meta released Llama 3.2 11B Vision, an open-weights image-text-to-text model, on Hugging Face. The model is part of the Llama 3.2 family and supports multiple languages including English, German, and French. This represents Meta's entry into open-weights multimodal models at the 11B parameter scale.
Meta releases Llama 3.2 90B Vision-Instruct multimodal model
Meta released Llama 3.2 90B Vision-Instruct, a large multimodal model supporting image-text-to-text tasks, on Hugging Face. The model is part of the Llama 3.2 family and supports English and German. It has 858 downloads and 358 likes, and represents Meta's open-weights push into vision-language capabilities at the 90B parameter scale.
Meta releases Llama 3.2 11B Vision Instruct multimodal model
Meta released Llama 3.2 11B Vision Instruct on Hugging Face, an open-weights multimodal model supporting image-text-to-text tasks. The model is part of the Llama 3.2 family and supports English and German. With over 157K downloads and 1,600 likes, it has seen substantial community adoption.
Llama 3.2 Multimodal and Edge Models Launch on Hugging Face
Meta released Llama 3.2, introducing vision-capable multimodal models alongside lightweight models optimized for on-device inference. Hugging Face published a blog post covering integration support, model availability, and deployment options across the ecosystem. The release marks Meta's first open-weights multimodal Llama models, adding image understanding to the Llama family. Smaller 1B and 3B parameter variants target edge and mobile deployment scenarios.



