Almanac
← Events
4Hugging Face Blog·1mo ago

Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms

Hugging Face has introduced AnyLanguageModel, a unified Swift API that abstracts over both local on-device LLMs and remote LLM endpoints on Apple platforms (iOS, macOS). The library aims to simplify developer integration by providing a single interface regardless of whether inference runs locally or via a cloud API. This is positioned as a tooling release targeting the Apple developer ecosystem for AI-powered app development.

Related guides (3)

Related events (8)

5Hugging Face Blog·1mo ago·source ↗

Releasing Swift Transformers: Run On-Device LLMs in Apple Devices

Hugging Face released Swift Transformers, a Swift library enabling on-device LLM inference on Apple hardware (iOS, macOS) via Core ML. The library provides a pipeline abstraction for text generation and supports models converted to Core ML format. This extends the Hugging Face ecosystem to Apple's native development environment, lowering the barrier for deploying LLMs on Apple Silicon devices.

5Hugging Face Blog·1mo ago·source ↗

SmolLM: Hugging Face Releases Blazingly Fast Small Language Models

Hugging Face introduces SmolLM, a family of small language models designed for on-device and edge deployment with high speed and competitive performance. The models are positioned as efficient alternatives for resource-constrained environments. The release includes model weights and associated tooling on the Hugging Face Hub.

4Hugging Face Blog·1mo ago·source ↗

LLM Inference on Edge: Running LLMs via React Native on Mobile Devices

A Hugging Face blog post provides a practical guide to running large language models on-device using React Native for mobile phones. The post covers edge inference patterns, tooling setup, and deployment considerations for mobile LLM execution. This represents growing ecosystem support for on-device AI inference as an alternative to cloud-based deployment.

3Github Trending·8d ago·source ↗

mlx-lm: LLM inference library for Apple MLX framework trending on GitHub

mlx-lm is an open-source Python library for running LLMs using Apple's MLX framework, designed for Apple Silicon hardware. The repository has accumulated 5,817 stars with 43 new stars today, indicating steady community interest. It represents a key piece of the Apple-native ML inference ecosystem.

4Hugging Face Blog·1mo ago·source ↗

Deploy LLMs with Hugging Face Inference Endpoints

Hugging Face published a guide on deploying large language models using their Inference Endpoints service. The post covers how to set up scalable, production-ready LLM deployments with minimal infrastructure overhead. It targets developers looking to move from experimentation to hosted inference without managing raw compute.

8Hugging Face Blog·1mo ago·source ↗

GGML and llama.cpp Join Hugging Face to Ensure Long-Term Progress of Local AI

GGML and llama.cpp, the foundational open-source libraries enabling efficient local inference of large language models, are joining Hugging Face. This move is intended to secure long-term development and sustainability of the projects that underpin much of the local/on-device AI ecosystem. The acquisition or integration represents a significant consolidation of key open-weights inference infrastructure under the Hugging Face umbrella.

5Hugging Face Blog·1mo ago·source ↗

Fine-tune Any LLM from the Hugging Face Hub with Together AI

Together AI has announced an integration with Hugging Face that enables fine-tuning of any model from the Hugging Face Hub directly through Together AI's platform. This partnership expands access to fine-tuning infrastructure for open-weight models without requiring users to manage their own compute. The integration targets developers and enterprises seeking managed fine-tuning workflows for a broad range of open-source LLMs.

5Hugging Face Blog·1mo ago·source ↗

From OpenAI to Open LLMs with Messages API on Hugging Face

Hugging Face's Text Generation Inference (TGI) now supports an OpenAI-compatible Messages API, enabling developers to switch from OpenAI models to open-weight LLMs with minimal code changes. The integration allows existing OpenAI SDK users to point their client at Hugging Face endpoints by changing only the base URL and model name. This lowers the migration barrier for teams wanting to self-host or use open models while retaining familiar tooling.