6 · Hugging Face Blog · 1mo ago

Falcon-Edge: 1.58-bit Quantized Language Model Series from TII

Technology Innovation Institute (TII) has released Falcon-Edge, a series of language models operating at 1.58-bit precision, targeting edge deployment scenarios. The models are designed to be fine-tunable despite extreme quantization, positioning them as practical options for resource-constrained environments. This release extends the Falcon model family into the ultra-low-bit regime, following broader industry interest in BitNet-style ternary weight models.

Frontier Model Releases Open Weights Progress Inference Economics BitNet 1.58-bit quantization Falcon-Edge Hugging Face Technology Innovation Institute

Related guides (3)

Hugging Face

Hugging Face: The Home of Open-Source AI

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Related events (8)

5 · Hugging Face Blog · 1mo ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Hugging Face published a blog post describing a method for fine-tuning large language models down to 1.58-bit precision, referencing the BitNet b1.58 quantization scheme. The post covers tooling and workflows that make extreme quantization more accessible via the Hugging Face ecosystem. This represents a practical guide to applying ternary-weight quantization ({-1, 0, 1}) to existing models through fine-tuning rather than training from scratch.

Open Weights Progress Inference Economics Transformers 1.58-bit quantization Hugging Face +1 more
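The ternary scheme mentioned in the item above can be illustrated with a minimal sketch. It assumes the absmean rounding described in the BitNet b1.58 paper (scale by the mean absolute weight, round, clip to [-1, 1]); the function names and sample weights are hypothetical, and this is not the blog post's own code. The "1.58-bit" label follows from three possible values per weight carrying log2(3) ≈ 1.58 bits.

```python
import math

def absmean_ternary_quantize(weights, eps=1e-8):
    """Quantize a flat list of floats to {-1, 0, 1} with one shared scale.

    Hypothetical helper: assumes absmean scaling as in BitNet b1.58.
    """
    # Shared scale: the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale

def dequantize(ternary, scale):
    """Map ternary codes back to approximate float weights."""
    return [t * scale for t in ternary]

# Three states per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)

# Illustrative weights only, not taken from any real model.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
ternary, scale = absmean_ternary_quantize(weights)

print(f"bits per weight: {bits_per_weight:.3f}")
print(ternary)  # → [1, 0, 1, -1, 0, -1]
```

In fine-tuning setups of this kind, the rounding is typically applied in the forward pass while full-precision weights are kept for the gradient update; that detail is an assumption here, not something the summary states.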

6 · Hugging Face Blog · 1mo ago

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

TII UAE has released Falcon-H1, a new family of hybrid-head language models combining attention and state-space mechanisms to improve efficiency and performance. The models are published on Hugging Face and represent TII's latest iteration in the Falcon series. The hybrid architecture targets better inference economics and competitive benchmark results relative to model size.

Frontier Model Releases Open Weights Progress Hugging Face Hybrid-Head Architecture Falcon-H1 +2 more

8 · Hugging Face Blog · 1mo ago

Falcon 180B Released: New Open-Weights Frontier Model

Technology Innovation Institute (TII) has released Falcon 180B, a 180-billion-parameter open-weights language model announced via Hugging Face. At the time of release, it was positioned as the largest publicly available open-weights model and had been trained on 3.5 trillion tokens. The model is available on the Hugging Face Hub for research and commercial use under a custom license.

Frontier Model Releases Open Weights Progress Hugging Face Technology Innovation Institute Falcon 180B +1 more

6 · Hugging Face Blog · 1mo ago

Falcon 2: 11B Parameter Pretrained LLM and VLM Trained on 5T+ Tokens Across 11 Languages

Technology Innovation Institute (TII) has released Falcon 2, an 11B-parameter language model pretrained on over 5 trillion tokens spanning 11 languages. The release includes both a base language model and a vision-language model (VLM) variant. It updates the Falcon model family with expanded multilingual and multimodal capabilities.

Frontier Model Releases Open Weights Progress Falcon Hugging Face Falcon 2 VLM +2 more

5 · Hugging Face Blog · 1mo ago

Falcon-Arabic: A Breakthrough in Arabic Language Models

TII UAE has released Falcon-Arabic, a language model specifically designed for Arabic. The announcement highlights it as a significant advancement in Arabic NLP capabilities. The source item has minimal body content, so specific technical details about model size, training data, or benchmark performance are not available.

Frontier Model Releases Open Weights Progress Falcon-H1-Arabic Hugging Face Technology Innovation Institute

7 · Mistral AI News · 20d ago

Mistral AI Releases Ministral 3B and 8B Edge Models

Mistral AI has introduced two new small language models, Ministral 3B and Ministral 8B, targeting on-device and edge computing use cases. Both models support up to 128k context length, and Mistral claims state-of-the-art performance in the sub-10B parameter category, reporting that they outperform comparable models from Google and Meta on its internal benchmarks. Ministral 8B features an interleaved sliding-window attention mechanism for memory-efficient inference and is priced at $0.1/M tokens via API, while Ministral 3B is priced at $0.04/M tokens. Weights for Ministral 8B Instruct are available for research use, with commercial licensing available on request.

Long Context Evolution Frontier Model Releases Mistral AI Gemma 2 9B Ministral 8B +12 more

6 · Hugging Face Blog · 1mo ago

Falcon LLM Integrated into Hugging Face Ecosystem

Hugging Face announced the integration of the Falcon language models (Falcon-7B and Falcon-40B) into its ecosystem, including model hosting, inference APIs, and tooling support. Falcon, developed by the Technology Innovation Institute (TII), topped the Open LLM Leaderboard at the time of release. The post covers usage patterns, fine-tuning guidance, and deployment options within the Hugging Face stack.

Open Weights Progress Inference Economics Falcon-7B Open LLM Leaderboard Falcon-40B +3 more

6 · Hugging Face Blog · 1mo ago

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Hugging Face published a blog post detailing the integration of 4-bit quantization via bitsandbytes into the Transformers library, enabling large language models to run on consumer-grade hardware. The post covers NF4 (NormalFloat4) data type and double quantization techniques from the QLoRA paper, which together reduce memory footprint significantly while preserving model quality. It demonstrates how users can load models like LLaMA in 4-bit precision and fine-tune them using QLoRA with minimal code changes.

Open Weights Progress Inference Economics Transformers NF4 (NormalFloat4) QLoRA +4 more
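
The memory reduction described in the item above can be made concrete with a rough back-of-the-envelope sketch. It assumes the block sizes reported in the QLoRA paper (one quantization constant per 64 weights, and a second-level constant per 256 first-level constants); the 7B parameter count and the helper name are hypothetical, and the estimate covers weight storage only, not activations, KV cache, or optimizer state.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (10^9 bytes); weights only."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9   # hypothetical 7B-parameter model, for illustration
BLOCK_SIZE = 64    # assumed weights per quantization block (QLoRA paper)
DQ_BLOCK = 256     # assumed constants per second-level block (QLoRA paper)

# NF4 without double quantization: 4 bits per weight plus one
# 32-bit constant shared by each block of 64 weights.
nf4_plain = 4 + 32 / BLOCK_SIZE

# With double quantization: constants are stored in 8 bits, plus one
# 32-bit second-level constant per 256 first-level constants.
nf4_double = 4 + 8 / BLOCK_SIZE + 32 / (BLOCK_SIZE * DQ_BLOCK)

for label, bits in [
    ("16-bit baseline", 16),
    ("NF4", nf4_plain),
    ("NF4 + double quantization", nf4_double),
]:
    print(f"{label}: {bits:.3f} bits/param, "
          f"{weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

Under these assumptions, double quantization trims the per-parameter overhead of the quantization constants from 0.5 bits to roughly 0.127 bits, which is where the additional savings over plain 4-bit storage come from.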