Almanac
5Hugging Face Blog·1mo ago

Bringing Serverless GPU Inference to Hugging Face Users via Cloudflare Workers AI

Hugging Face and Cloudflare have partnered to bring serverless GPU inference to Hugging Face users through Cloudflare Workers AI. The integration allows developers to run Hugging Face models on Cloudflare's global edge network without managing GPU infrastructure. This represents an expansion of serverless inference options for the Hugging Face ecosystem, lowering the barrier to deploying ML models at scale.

Related events (8)

5Hugging Face Blog·1mo ago

Serverless Inference with Hugging Face and NVIDIA NIM

Hugging Face and NVIDIA have partnered to offer serverless inference via NVIDIA NIM microservices on DGX Cloud infrastructure. The integration allows developers to run optimized model inference without managing GPU infrastructure, combining Hugging Face's model hub with NVIDIA's inference optimization stack. This represents an expansion of the existing Hugging Face–NVIDIA partnership into managed inference services.

4Hugging Face Blog·1mo ago

Hugging Face Adds Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita

Hugging Face has expanded its serverless inference provider ecosystem by integrating three new partners: Hyperbolic, Nebius AI Studio, and Novita. These providers offer API-based inference for models hosted on the Hugging Face Hub, increasing the options available to developers for deploying open-weights models without managing infrastructure. The expansion reflects growing competition in the inference-as-a-service market targeting open-source AI workloads.

4Hugging Face Blog·1mo ago

Featherless AI Joins Hugging Face Inference Providers

Hugging Face has added Featherless AI as a new inference provider in its Inference Providers ecosystem. Featherless AI specializes in serverless inference for open-weight models, expanding the range of third-party compute options available through the Hugging Face platform. This integration allows developers to route model inference requests to Featherless AI directly via the Hugging Face API and model hub.

6Hugging Face Blog·1mo ago

Hugging Face Launches Inference Providers on the Hub

Hugging Face has introduced Inference Providers on the Hub, a feature that allows users to run models hosted on the Hub through third-party inference providers directly from the platform. This integration consolidates access to multiple inference backends under a unified interface, reducing friction for developers who want to deploy or test models at scale. The announcement positions Hugging Face as a marketplace layer connecting model authors with inference infrastructure providers.

6Hugging Face Blog·1mo ago

Hugging Face and NVIDIA Launch Training Cluster as a Service

Hugging Face and NVIDIA have announced a joint "Training Cluster as a Service" offering that provides managed GPU cluster access for AI model training. The collaboration aims to let organizations use large-scale training infrastructure without managing hardware directly. It pairs a major AI platform with a leading GPU manufacturer to address enterprise training infrastructure needs.

5Hugging Face Blog·1mo ago

Google Cloud TPUs made available to Hugging Face users

Hugging Face has announced the availability of Google Cloud TPUs for its Inference Endpoints and Spaces products. This integration allows Hugging Face users to deploy and run models on TPU hardware directly through the Hugging Face platform. The move expands the hardware options available to developers and researchers working with large models on Hugging Face infrastructure.

5Hugging Face Blog·1mo ago

Deploy models on AWS Inferentia2 from Hugging Face

Hugging Face has announced support for deploying models on AWS Inferentia2 via Hugging Face Inference Endpoints. The integration allows users to deploy popular open-weight models on AWS's custom ML accelerator chips directly from the Hugging Face Hub. This expands the hardware options available for cost-effective inference beyond standard GPU instances.

5Hugging Face Blog·1mo ago

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Hugging Face has announced an integration with NVIDIA DGX Cloud that lets users train models on H100 GPU clusters directly through the Hugging Face platform. The partnership simplifies access to high-end training infrastructure by removing the need for users to manage cloud provisioning themselves. It continues the push to lower the barrier to large-scale model training for the broader ML community.