Transformers v5: Simple model definitions powering the AI ecosystem
Hugging Face has announced Transformers v5, a major version update to its flagship open-source library. The release focuses on simplified model definitions and architectural improvements to the codebase. Because Transformers is one of the most widely used ML libraries in the ecosystem, the update has broad implications for researchers and practitioners who build on it.
Related events (8)
The Transformers Library: Standardizing Model Definitions
Hugging Face published a blog post outlining its approach to standardizing model definitions within the Transformers library. The post addresses how the library structures and maintains model code to ensure consistency, reproducibility, and ease of integration across a wide range of architectures. The work is relevant to practitioners who build on or contribute to the Transformers framework.
Tokenization in Transformers v5: Simpler, Clearer, and More Modular
Hugging Face's Transformers v5 introduces a redesigned tokenization system aimed at being simpler, clearer, and more modular. The blog post outlines architectural changes to how tokenizers are structured and used within the library. This represents a significant API and design evolution for one of the most widely used ML frameworks in the ecosystem.
Transformers.js v4: Now Available on NPM
Hugging Face has released Transformers.js v4, a major version update to its JavaScript library for running transformer models in the browser and Node.js, now published on NPM. The release likely includes updated model support, performance improvements, and API changes. This continues the trend of bringing ML inference capabilities directly to JavaScript environments without requiring a Python backend.
License to Call: Introducing Transformers Agents 2.0
Hugging Face announced Transformers Agents 2.0, a major update to their agent framework built on top of the Transformers library. The release introduces new abstractions for tool use, multi-step reasoning, and agent orchestration, positioning it as a production-ready framework for building AI agents. The update reflects growing ecosystem investment in standardized agent tooling patterns.
Transformers.js v3: WebGPU Support, New Models & Tasks, and More
Hugging Face released Transformers.js v3, a major update to its JavaScript inference library enabling on-device ML in browsers and Node.js. The release adds WebGPU backend support for hardware-accelerated inference, expands the supported model and task catalog, and improves overall performance. This brings browser-side AI inference closer to parity with native runtimes for a wider range of use cases.
Introducing Decision Transformers on Hugging Face
Hugging Face has introduced support for Decision Transformers, a framework that casts offline reinforcement learning as a sequence modeling problem using transformer architectures. The blog post covers the conceptual basis of Decision Transformers and their integration into the Hugging Face ecosystem. This represents an early step in bringing RL-based model paradigms into the standard ML tooling stack.
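To make the "offline RL as sequence modeling" idea concrete, here is a minimal sketch of how a recorded episode might be flattened into a token sequence. It assumes the return-to-go conditioning described in the original Decision Transformer paper, which the summary above does not spell out. The helper names `returns_to_go` and `to_sequence` are hypothetical and are not part of the Hugging Face API.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each timestep to the end of the episode."""
    rtg: List[float] = []
    running = 0.0
    # Walk the episode backwards, accumulating the remaining return.
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg


def to_sequence(
    states: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per timestep into one flat sequence.

    Hypothetical helper for illustration only: a real model would embed each
    element rather than keep labelled tuples.
    """
    sequence: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", g), ("state", s), ("action", a)])
    return sequence


if __name__ == "__main__":
    # A toy two-step episode with placeholder states and actions.
    for token in to_sequence(["s0", "s1"], ["a0", "a1"], [1.0, 2.0]):
        print(token)
```

A transformer trained on sequences like this learns to predict the next action given the past and a target return, so at inference time the desired return can be supplied as the first element.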
Swift Transformers Reaches 1.0 – and Looks to the Future
Hugging Face's Swift Transformers library has reached version 1.0, marking a stable release milestone for running transformer models natively on Apple platforms. The announcement covers the library's current capabilities and future roadmap for on-device inference on iOS and macOS. This represents a significant step for deploying open-weight models in Apple ecosystem applications without server-side inference.
Transformers Backend Integration in SGLang
Hugging Face has announced an integration that allows SGLang, a high-performance LLM serving framework, to use the Transformers library as a backend. This enables models supported by Transformers to be served through SGLang's inference engine, combining SGLang's optimized serving capabilities with the broad model coverage of the Transformers ecosystem. The integration lowers the barrier for deploying a wide range of models with production-grade inference infrastructure.



