Releasing Swift Transformers: Run On-Device LLMs in Apple Devices
Hugging Face released Swift Transformers, a Swift library enabling on-device LLM inference on Apple platforms (iOS, macOS) via Core ML. The library provides a pipeline abstraction for text generation and supports models converted to Core ML format. This extends the Hugging Face ecosystem to Apple's native development environment, lowering the barrier for deploying LLMs on Apple Silicon devices.
Related events (8)
Swift Transformers Reaches 1.0 – and Looks to the Future
Hugging Face's Swift Transformers library has reached version 1.0, marking a stable release milestone for running transformer models natively on Apple platforms. The announcement covers the library's current capabilities and future roadmap for on-device inference on iOS and macOS. This represents a significant step for deploying open-weight models in Apple ecosystem applications without server-side inference.
Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms
Hugging Face has introduced AnyLanguageModel, a unified Swift API that abstracts over both local on-device LLMs and remote LLM endpoints on Apple platforms (iOS, macOS). The library aims to simplify developer integration by providing a single interface regardless of whether inference runs locally or via a cloud API. This is positioned as a tooling release targeting the Apple developer ecosystem for AI-powered app development.
SmolLM: Hugging Face Releases Blazingly Fast Small Language Models
Hugging Face introduces SmolLM, a family of small language models designed for on-device and edge deployment with high speed and competitive performance. The models are positioned as efficient alternatives for resource-constrained environments. The release includes model weights and associated tooling on the Hugging Face Hub.
Swift Diffusers: Fast Stable Diffusion for Mac
Hugging Face published a blog post introducing Swift Diffusers, a native macOS/iOS application for running Stable Diffusion models locally on Apple Silicon hardware. The post covers optimizations leveraging Apple's Core ML framework to accelerate inference on Mac. This represents an effort to bring on-device diffusion model inference to consumer Apple hardware without cloud dependency.
WWDC 24: Running Mistral 7B with Core ML
This Hugging Face blog post, tied to WWDC 2024, covers running Mistral 7B on Apple devices using Core ML. It addresses on-device inference of a 7B-parameter open-weights model using Apple's ML framework. This represents a practical deployment pattern for running capable open-weights LLMs locally on Apple Silicon hardware.
mlx-lm: LLM inference library for Apple MLX framework trending on GitHub
mlx-lm is an open-source Python library for running LLMs using Apple's MLX framework, designed for Apple Silicon hardware. The repository has accumulated 5,817 stars with 43 new stars today, indicating steady community interest. It represents a key piece of the Apple-native ML inference ecosystem.
Faster Stable Diffusion with Core ML on iPhone, iPad, and Mac
Hugging Face published a blog post detailing optimizations for running Stable Diffusion models via Core ML on Apple devices including iPhone, iPad, and Mac. The post covers techniques to accelerate on-device inference using the Apple Neural Engine and the Core ML framework. This represents progress in deploying capable diffusion models at the edge without cloud dependency.
Using Stable Diffusion with Core ML on Apple Silicon
Hugging Face published a guide on running Stable Diffusion models via Apple's Core ML framework on Apple Silicon hardware. The post covers converting diffusion model weights to Core ML format and integrating them into the Diffusers library for on-device inference. This represents an early effort to enable efficient local image generation on consumer Apple hardware without requiring cloud GPU resources.


