3 · Hugging Face Blog · 1mo ago

Rocket Money x Hugging Face: Scaling Volatile ML Models in Production

Rocket Money partnered with Hugging Face to deploy and scale ML models in production, addressing challenges around volatile workloads. The case study covers infrastructure patterns for handling unpredictable demand in a fintech ML context. It represents a practical deployment example of Hugging Face's enterprise inference and hosting offerings.

Inference Economics · Enterprise Deployment Patterns · Rocket Money · Hugging Face

Related guides (3)

Hugging Face

Hugging Face: The Home of Open-Source AI

Enterprise Deployment Patterns (Topic guide)

Enterprise Deployment Patterns: From AI Demo to Production Reality

Inference Economics (Topic guide)

Inference Economics: The Cost of Running AI in Production

Related events (8)

5 · Hugging Face Blog · 1mo ago

Introducing HUGS - Scale your AI with Open Models

Hugging Face announced HUGS (Hugging Face Generative Services), a new product aimed at helping enterprises scale AI deployments using open models. The service appears to target production inference infrastructure for open-weight models, positioning Hugging Face as a managed deployment layer. This is a product launch in the enterprise AI infrastructure space, competing with managed inference offerings from other providers.

Open Weights Progress · Inference Economics · HUGS · Hugging Face

4 · Hugging Face Blog · 1mo ago

Deploy LLMs with Hugging Face Inference Endpoints

Hugging Face published a guide on deploying large language models using their Inference Endpoints service. The post covers how to set up scalable, production-ready LLM deployments with minimal infrastructure overhead. It targets developers looking to move from experimentation to hosted inference without managing raw compute.

Inference Economics · Enterprise Deployment Patterns · Hugging Face Inference Endpoints · Hugging Face

4 · Hugging Face Blog · 1mo ago

Accelerating Hugging Face Transformers with AWS Inferentia2

Hugging Face published a blog post detailing how to accelerate Transformer model inference using AWS Inferentia2, Amazon's second-generation ML inference chip. The post covers integration patterns between the Hugging Face ecosystem and the Neuron SDK for deploying models on Inferentia2 hardware. This represents a practical guide for enterprise and cloud-based inference deployment using dedicated AI accelerators.

Training Infrastructure · Inference Economics · AWS Inferentia2 · Hugging Face Transformers · Hugging Face

4 · Hugging Face Blog · 1mo ago

Deploy Hugging Face Models Easily with Amazon SageMaker

Hugging Face and Amazon SageMaker announced an integration enabling streamlined deployment of Hugging Face models via SageMaker's managed infrastructure. The partnership provides dedicated Hugging Face Deep Learning Containers on AWS, simplifying the path from model hub to production inference. This represents an early milestone in the enterprise deployment pattern of hosted model hubs integrating with cloud ML platforms.

Inference Economics · Enterprise Deployment Patterns · Amazon SageMaker · Hugging Face Deep Learning Containers · Hugging Face

5 · Hugging Face Blog · 1mo ago

Hugging Face Teams Up with Protect AI: Enhancing Model Security for the ML Community

Hugging Face has announced a partnership with Protect AI to improve security for machine learning models hosted on the platform. The collaboration aims to address vulnerabilities in model files and supply chain risks that affect the broader ML community. Specific details about the technical implementation and scope of the security enhancements are not provided in the available content.

AI Safety Research · Enterprise Deployment Patterns · Protect AI · Hugging Face

5 · Hugging Face Blog · 1mo ago

Databricks + Hugging Face Integration Achieves Up to 40% Faster LLM Training and Tuning

Databricks and Hugging Face have published a case study describing their integration that delivers up to 40% faster training and fine-tuning of large language models. The collaboration leverages Databricks' distributed compute infrastructure alongside Hugging Face's model hub and training libraries. This represents a practical infrastructure optimization for enterprise teams running LLM workloads on Databricks.

Training Infrastructure · Enterprise Deployment Patterns · Databricks · Hugging Face

5 · Hugging Face Blog · 1mo ago

Deploy models on AWS Inferentia2 from Hugging Face

Hugging Face has announced support for deploying models on AWS Inferentia2 via Hugging Face Inference Endpoints. The integration allows users to deploy popular open-weight models on AWS's custom ML accelerator chips directly from the Hugging Face Hub. This expands the hardware options available for cost-effective inference beyond standard GPU instances.

Inference Economics · Enterprise Deployment Patterns · Hugging Face Inference Endpoints · AWS Inferentia2 · Hugging Face

5 · Hugging Face Blog · 1mo ago

The Partnership: Amazon SageMaker and Hugging Face

Hugging Face and Amazon announced a partnership integrating Hugging Face models and tools natively into Amazon SageMaker. This collaboration enables developers to train and deploy Hugging Face Transformers models directly within SageMaker's managed ML infrastructure. The partnership represents an early major cloud-provider integration for Hugging Face, expanding enterprise access to open-source NLP models.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Amazon SageMaker · Hugging Face Transformers · Hugging Face