4 · Hugging Face Blog · 1mo ago

New in llama.cpp: Model Management

llama.cpp has added new model management capabilities, described in a Hugging Face blog post from ggml-org. The post covers changes to how models are handled within the llama.cpp inference framework. It is a tooling update relevant to the open-source local inference ecosystem.

Open Weights Progress Inference Economics Agent and Tool Ecosystem ggml-org llama.cpp Hugging Face

Related guides (4)

Hugging Face

Hugging Face: The Home of Open-Source AI

Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating

Inference Economics · Topic guide

Inference Economics: The Cost Structure of Running AI Models in Production

Related events (8)

4 · Hugging Face Blog · 1mo ago

Llama 3.2 in Keras

Hugging Face published a blog post detailing the integration of Meta's Llama 3.2 models into the Keras framework. The post covers how developers can use Keras to load, fine-tune, and run inference with Llama 3.2, expanding the ecosystem of tools available for working with the model. This represents a tooling/framework integration update rather than a new capability announcement.

Open Weights Progress Agent and Tool Ecosystem Keras Llama 3.2 Hugging Face +1 more

8 · Hugging Face Blog · 1mo ago

Welcome Llama 3 - Meta's new open LLM

Hugging Face published a blog post welcoming Meta's Llama 3 release, covering the new open-weights large language models. Llama 3 represents a significant update to Meta's open model family, with improved capabilities over Llama 2. The post covers integration and availability on the Hugging Face platform.

Frontier Model Releases Open Weights Progress Llama 2 Llama 3 Hugging Face +2 more

8 · Hugging Face Blog · 1mo ago

GGML and llama.cpp Join Hugging Face to Ensure Long-Term Progress of Local AI

GGML and llama.cpp, the foundational open-source libraries enabling efficient local inference of large language models, are joining Hugging Face. The move is intended to secure the long-term development and sustainability of the projects that underpin much of the local and on-device AI ecosystem. It also consolidates key open-weights inference infrastructure under the Hugging Face umbrella.

Open Weights Progress Inference Economics Georgi Gerganov llama.cpp Hugging Face +2 more

8 · Hugging Face Blog · 1mo ago

Llama 3.2 Multimodal and Edge Models Launch on Hugging Face

Meta released Llama 3.2, introducing vision-capable multimodal models alongside lightweight models optimized for on-device inference. Hugging Face published a blog post covering integration support, model availability, and deployment options across the ecosystem. The release marks Meta's first open-weights multimodal Llama models, adding image understanding to the Llama family. Smaller 1B and 3B parameter variants target edge and mobile deployment scenarios.

Frontier Model Releases Open Weights Progress Llama 3.2 Hugging Face Meta +3 more

4 · Hugging Face Blog · 1mo ago

Introduction to ggml

This Hugging Face blog post introduces ggml, a C-based tensor library that underpins popular inference runtimes like llama.cpp and whisper.cpp. It explains ggml's design philosophy, quantization support, and how it enables efficient on-device inference for large language models. The post serves as an educational overview for developers looking to understand or build on the ggml ecosystem.

Open Weights Progress Inference Economics whisper.cpp llama.cpp Hugging Face +2 more

7 · Meta Llama · 11d ago

Meta releases Llama 3.2 11B Vision Instruct multimodal model

Meta released Llama 3.2 11B Vision Instruct on Hugging Face, an open-weights multimodal model supporting image-text-to-text tasks. The model is part of the Llama 3.2 family and supports English and German. With over 157K downloads and 1,600 likes, it has seen substantial community adoption.

Open Weights Progress Multimodal Progress Hugging Face Meta Llama 3.2 90B Vision-Instruct

9 · Hugging Face Blog · 1mo ago

Llama 3.1 Released: 405B, 70B & 8B Models with Multilinguality and Long Context

Meta released Llama 3.1, a family of open-weights models at three scales (405B, 70B, 8B) featuring multilingual support and extended context windows. The 405B model represents Meta's largest open-weights release to date, positioning it as a frontier-class open model. Hugging Face published a blog post covering the release, integration details, and deployment options across the ecosystem.

Long Context Evolution Frontier Model Releases Llama 3.1 70B Meta Llama 3.1 405B Hugging Face +5 more

7 · Meta Llama · 11d ago

Meta releases Llama 3.2 90B Vision multimodal model on Hugging Face

Meta released Llama 3.2 90B Vision, a large multimodal model supporting image-text-to-text tasks, published on Hugging Face under the meta-llama organization. The model is part of the Llama 3.2 family and supports English, German, and French. This is a significant open-weights multimodal release from Meta, extending the Llama 3 series with vision capabilities at the 90B parameter scale.

Frontier Model Releases Open Weights Progress Llama 3.2 90B Vision Hugging Face Meta +1 more