6·Hugging Face Blog·1mo ago

Hugging Face Transformers Code Agent Beats GAIA Benchmark

Hugging Face reports that its Transformers-based code agent has achieved a top score on the GAIA benchmark, a challenging evaluation for general AI assistants that requires multi-step reasoning and tool use. The result positions Hugging Face's open agent framework competitively against proprietary systems. The post details the agent architecture and tooling approach behind the score.

Evaluation and Benchmarking Open Weights Progress Agent and Tool Ecosystem Transformers Code Agent GAIA Hugging Face

Related guides (4)

Hugging Face

Hugging Face: The Home of Open-Source AI


Open Weights Progress·Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier


Agent and Tool Ecosystem·Topic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating


Evaluation and Benchmarking·Topic guide

Evaluation and Benchmarking: The Shifting Yardstick of AI Capability


Related events (8)

6·Hugging Face Blog·1mo ago

Gaia2 and ARE: Empowering the community to study agents

Hugging Face has released Gaia2 and the Agents Research Environments (ARE) framework, aimed at enabling the research community to study and benchmark AI agents. The post describes new tools and datasets for evaluating agent capabilities, building on the original GAIA benchmark. The release expands the agent evaluation ecosystem with community-oriented tooling.

Evaluation and Benchmarking Agent and Tool Ecosystem GAIA2 GAIA Hugging Face +1 more

5·Hugging Face Blog·2d ago

Hugging Face benchmarks open models on agentic tool-use tasks

Hugging Face published a blog post examining whether open models are sufficiently capable for agentic use cases, focusing on benchmarking them against real-world tooling. The post addresses the practical question of which open-weights models can reliably handle tool-calling and multi-step agentic workflows. This is relevant to practitioners evaluating open models for agent deployments.

Evaluation and Benchmarking Open Weights Progress Hugging Face +1 more

4·Hugging Face Blog·1mo ago

CUGA on Hugging Face: Democratizing Configurable AI Agents

IBM Research has released CUGA (Configurable Generalist Agent) on Hugging Face, positioning it as a framework for building configurable AI agents. The announcement is an IBM Research post published on the Hugging Face blog. Details on architecture, benchmarks, and specific capabilities are not covered in this summary.

Enterprise Deployment Patterns Agent and Tool Ecosystem IBM Research Hugging Face CUGA

6·Hugging Face Blog·1mo ago

License to Call: Introducing Transformers Agents 2.0

Hugging Face announced Transformers Agents 2.0, a major update to their agent framework built on top of the Transformers library. The release introduces new abstractions for tool use, multi-step reasoning, and agent orchestration, positioning it as a production-ready framework for building AI agents. The update reflects growing ecosystem investment in standardized agent tooling patterns.

Open Weights Progress Enterprise Deployment Patterns Transformers Transformers Agents 2.0 Hugging Face +1 more

4·Hugging Face Blog·1mo ago

Habana Labs and Hugging Face Partner to Accelerate Transformer Model Training

Habana Labs and Hugging Face announced a partnership to accelerate transformer model training on Habana's Gaudi AI processors. The collaboration aims to integrate Hugging Face's Transformers library with Habana's hardware, offering an alternative to GPU-based training infrastructure. This represents an early effort to diversify the AI training hardware ecosystem beyond NVIDIA's dominance.

Training Infrastructure Inference Economics Habana Labs Gaudi Hugging Face Transformers +2 more

4·Hugging Face Blog·1mo ago

How Hugging Face Sped Up Transformer Inference 100x for API Customers

Hugging Face describes engineering optimizations that achieved up to 100x speedups in transformer inference for its hosted API customers. The post covers techniques applied to accelerate model serving at scale. This is a 2021 article documenting early inference optimization work on Hugging Face's Inference API product.

Inference Economics Enterprise Deployment Patterns Transformers Hugging Face Inference API Hugging Face

4·Hugging Face Blog·11d ago

Hugging Face demonstrates agent chaining two Spaces to build a 3D Paris gallery

A Hugging Face blog post describes an agent that autonomously chains two Hugging Face Spaces to generate a 3D gallery of Paris, illustrating multi-step tool use and Space-to-Space orchestration. The demo showcases how agents can compose existing hosted ML tools without custom infrastructure. This is a practical capability demonstration relevant to the agent-tool ecosystem.

Agent and Tool Ecosystem Hugging Face Spaces Hugging Face Mishig Davaadorj

4·Hugging Face Blog·1mo ago

Hugging Face and Graphcore Partner for IPU-Optimized Transformers

Hugging Face and Graphcore announced a partnership to optimize Transformer models for Graphcore's Intelligence Processing Unit (IPU) hardware. The collaboration aims to make IPU-accelerated inference and training accessible through the Hugging Face ecosystem. This represents an early effort to broaden AI hardware options beyond GPU-dominated infrastructure.

Training Infrastructure Inference Economics Transformers Graphcore Hugging Face +1 more