5 · Hugging Face Blog · 1mo ago

Granite 4.1 LLMs: How They're Built

IBM has published a blog post on Hugging Face detailing how its Granite 4.1 language models were built. The post covers the architectural and training decisions behind the new model family. As a tier-2 source, it offers insight into IBM's continued investment in open enterprise LLMs but lacks the technical depth of a primary research paper.

Related events (8)

5 · Hugging Face Blog · 1mo ago

Granite 4.0 Nano: Just how small can you go?

IBM has released Granite 4.0 Nano, a small-footprint language model in the Granite 4.0 family, published via the Hugging Face blog. The post explores the capabilities and trade-offs of pushing model size to its lower limits while maintaining practical utility. This release is part of IBM's ongoing effort to develop efficient, enterprise-deployable AI models under the Granite brand.

5 · Hugging Face Blog · 1mo ago

Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents

IBM released Granite 4.0 3B Vision, a compact multimodal model targeting enterprise document understanding tasks. The model is hosted on Hugging Face and positioned for deployment in resource-constrained enterprise environments. As a 3B-parameter vision-language model, it competes in the small-but-capable segment increasingly favored for on-premise and edge deployments.

4 · Hugging Face Blog · 1mo ago

Optimizing your LLM in production

A Hugging Face blog post covers practical techniques for optimizing large language models in production environments. The post likely addresses inference efficiency methods such as quantization, batching, caching, and hardware utilization strategies. It serves as a practitioner-oriented guide for deploying LLMs at scale.

5 · Hugging Face Blog · 1mo ago

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context

IBM released Granite Embedding Multilingual R2, an open-weights (Apache 2.0) multilingual embedding model with a 32K context window, claiming best-in-class retrieval quality among sub-100M-parameter models. The model is positioned for enterprise RAG and retrieval use cases across multiple languages. It is hosted on and announced via Hugging Face.

7 · Hugging Face Blog · 1mo ago

Welcome Gemma - Google's new open LLM

Google released Gemma, a family of open-weight large language models, announced via the Hugging Face blog. The models are positioned as Google's entry into the open-weights LLM space, following the success of models like Llama 2. This release marks a significant strategic move by Google to compete in the open-source AI ecosystem.

7 · Hugging Face Blog · 1mo ago

Welcome Gemma 2 - Google's new open LLM

Google released Gemma 2, a new open-weights large language model, announced via the Hugging Face blog. The post covers integration with the Hugging Face ecosystem and highlights the model's capabilities. Gemma 2 represents Google's continued investment in open-weight model releases to compete in the open-source LLM space.

4 · Hugging Face Blog · 1mo ago

Deploy LLMs with Hugging Face Inference Endpoints

Hugging Face published a guide on deploying large language models using its Inference Endpoints service. The post covers how to set up scalable, production-ready LLM deployments with minimal infrastructure overhead. It targets developers looking to move from experimentation to hosted inference without managing raw compute.

5 · Interconnects · 1mo ago

OLMo Hybrid and Future LLM Architectures

Interconnects covers the latest OLMo hybrid model release and discusses emerging trends in open-source post-training tooling. The piece examines architectural directions for future large language models. As a tier-2 commentary source, it provides analysis rather than primary research findings.