5·Hugging Face Blog·1mo ago

Swift Transformers Reaches 1.0 – and Looks to the Future

Hugging Face's Swift Transformers library has reached version 1.0, marking a stable release milestone for running transformer models natively on Apple platforms. The announcement covers the library's current capabilities and future roadmap for on-device inference on iOS and macOS. This represents a significant step for deploying open-weight models in Apple ecosystem applications without server-side inference.

Related events (8)

5·Hugging Face Blog·1mo ago

Releasing Swift Transformers: Run On-Device LLMs in Apple Devices

Hugging Face released Swift Transformers, a Swift library enabling on-device LLM inference on Apple hardware (iOS, macOS) via Core ML. The library provides a pipeline abstraction for text generation and supports models converted to Core ML format. This extends the Hugging Face ecosystem to Apple's native development environment, lowering the barrier for deploying LLMs on Apple Silicon devices.

7·Hugging Face Blog·1mo ago

Transformers v5: Simple model definitions powering the AI ecosystem

Hugging Face has announced Transformers v5, a major version update to its flagship open-source library. The release focuses on simplified model definitions and architectural improvements to the codebase. As one of the most widely used ML libraries in the ecosystem, this update has broad implications for researchers and practitioners building on top of the Transformers framework.

5·Hugging Face Blog·1mo ago

Transformers.js v4: Now Available on NPM

Hugging Face has released Transformers.js v4, a major version update to its JavaScript library for running transformer models in the browser and Node.js, now published on NPM. The release likely includes updated model support, performance improvements, and API changes. This continues the trend of bringing ML inference capabilities directly to JavaScript environments without requiring a Python backend.

4·Hugging Face Blog·1mo ago

How Hugging Face Sped Up Transformer Inference 100x for API Customers

Hugging Face describes engineering optimizations that achieved up to 100x speedups in transformer inference for its hosted API customers. The post covers techniques applied to accelerate model serving at scale. This is a 2021 article documenting early inference optimization work on Hugging Face's inference API product.

6·Hugging Face Blog·1mo ago

Transformers.js v3: WebGPU Support, New Models & Tasks, and More

Hugging Face released Transformers.js v3, a major update to its JavaScript inference library enabling on-device ML in browsers and Node.js. The release adds WebGPU backend support for hardware-accelerated inference, expands the supported model and task catalog, and improves overall performance. This brings browser-side AI inference closer to parity with native runtimes for a wider range of use cases.

6·Hugging Face Blog·1mo ago

License to Call: Introducing Transformers Agents 2.0

Hugging Face announced Transformers Agents 2.0, a major update to its agent framework built on top of the Transformers library. The release introduces new abstractions for tool use, multi-step reasoning, and agent orchestration, positioning it as a production-ready framework for building AI agents. The update reflects growing ecosystem investment in standardized agent tooling patterns.

4·Hugging Face Blog·1mo ago

Introducing swift-huggingface: The Complete Swift Client for Hugging Face

Hugging Face has released swift-huggingface, a Swift client library for interacting with the Hugging Face platform and its APIs. The library targets Apple ecosystem developers, enabling native iOS/macOS integration with Hugging Face model inference, Hub access, and related services. This extends Hugging Face's multi-language SDK ecosystem to Swift.

5·Hugging Face Blog·1mo ago

Transformers Backend Integration in SGLang

Hugging Face has announced an integration that allows SGLang, a high-performance LLM serving framework, to use the Transformers library as a backend. This enables models supported by Transformers to be served through SGLang's inference engine, combining SGLang's optimized serving capabilities with the broad model coverage of the Transformers ecosystem. The integration lowers the barrier for deploying a wide range of models with production-grade inference infrastructure.