5Hugging Face Blog·1mo ago

Experimenting with Automatic PII Detection on the Hub using Presidio

Hugging Face describes an experiment integrating Microsoft's Presidio library for automatic personally identifiable information (PII) detection across datasets hosted on the Hub. The effort aims to flag or redact sensitive data before it can be used in model training pipelines. This represents a practical infrastructure-level approach to data governance and privacy compliance for open ML datasets.
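As a rough illustration of what flagging or redacting PII in dataset rows involves, here is a minimal sketch. It does not use Presidio: it swaps in two simple regular expressions, and the pattern table and the function names (`find_pii`, `redact`, `flag_rows`) are hypothetical, chosen only for this example. Real detectors such as Presidio combine many recognizers, and these patterns will miss most real-world PII.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# This is a regex stand-in, not Presidio's recognizers.
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text):
    """Return (entity_type, start, end) tuples for every match, sorted by start."""
    hits = []
    for entity_type, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((entity_type, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[1])


def redact(text):
    """Replace each detected span with a <ENTITY_TYPE> placeholder."""
    pieces = []
    cursor = 0
    for entity_type, start, end in find_pii(text):
        if start < cursor:
            continue  # skip a span that overlaps one already redacted
        pieces.append(text[cursor:start])
        pieces.append(f"<{entity_type}>")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def flag_rows(rows):
    """Return the indices of rows that contain at least one detected span."""
    return [i for i, row in enumerate(rows) if find_pii(row)]
```

Flagging (`flag_rows`) leaves the data untouched and only reports where to look, while redaction (`redact`) rewrites the text. Which one fits depends on whether the dataset owner or the platform is expected to act on the finding.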

Related events (8)

6Hugging Face Blog·1mo ago

Hugging Face Launches Inference Providers on the Hub

Hugging Face has introduced Inference Providers on the Hub, a feature that allows users to run models hosted on the Hub through third-party inference providers directly from the platform. This integration consolidates access to multiple inference backends under a unified interface, reducing friction for developers who want to deploy or test models at scale. The announcement positions Hugging Face as a marketplace layer connecting model authors with inference infrastructure providers.

6OpenAI Blog·1mo ago

Introducing OpenAI Privacy Filter

OpenAI has released an open-weight model called Privacy Filter designed to detect and redact personally identifiable information (PII) in text. The model is described as achieving state-of-the-art accuracy on PII detection tasks. This is OpenAI's first open-weight release focused specifically on data privacy and compliance use cases.

3Hugging Face Blog·1mo ago

Huggy Lingo: Using Machine Learning to Improve Language Metadata on the Hugging Face Hub

Hugging Face introduced Huggy Lingo, a machine learning pipeline designed to automatically detect and fill in missing language metadata for models and datasets on the Hub. The system addresses a significant gap where many uploaded repositories lack proper language tags, making discovery and filtering difficult. By applying language identification models to repository contents, the project aims to improve the overall quality and searchability of the Hub's metadata.

4Hugging Face Blog·1mo ago

Hugging Face and FriendliAI Partner to Supercharge Model Deployment on the Hub

Hugging Face and FriendliAI have announced a partnership to integrate FriendliAI's inference infrastructure directly into the Hugging Face Hub. The collaboration aims to simplify and accelerate model deployment for developers accessing models through the Hub. This expands the ecosystem of inference providers available on Hugging Face's platform.

5Hugging Face Blog·1mo ago

Hugging Face Teams Up with Protect AI: Enhancing Model Security for the ML Community

Hugging Face has announced a partnership with Protect AI to improve security for machine learning models hosted on the platform. The collaboration aims to address vulnerabilities in model files and supply chain risks that affect the broader ML community. The available content gives no specifics on the technical implementation or the scope of the security enhancements.

5Hugging Face Blog·1mo ago

Announcing Evaluation on the Hub

Hugging Face announced Evaluation on the Hub, a new feature enabling users to evaluate any model on any dataset directly within the Hugging Face Hub infrastructure. The tool aims to lower the barrier to standardized model evaluation by integrating evaluation workflows into the existing model and dataset hosting platform. This represents an infrastructure step toward more accessible and reproducible benchmarking in the ML community.

5Hugging Face Blog·1mo ago

Running Privacy-Preserving Inferences on Hugging Face Endpoints

Hugging Face has published a blog post describing the integration of Fully Homomorphic Encryption (FHE) with its Inference Endpoints service, enabling privacy-preserving ML inference where data remains encrypted throughout computation. The approach allows clients to send encrypted inputs to a hosted model without the server ever seeing plaintext data. This represents a practical deployment of FHE-based ML, a technique that has historically been too slow for production use but is gaining traction with recent optimizations.

4Hugging Face Blog·1mo ago

Creating Privacy Preserving AI with Substra

This Hugging Face blog post covers Substra, a federated learning framework developed by Owkin for privacy-preserving AI. The post describes how Substra enables collaborative model training across institutions without sharing raw data, targeting healthcare and biomedical use cases. It represents a practical deployment pattern for federated learning in sensitive data environments.
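To make the idea of training across institutions without sharing raw data concrete, here is a minimal sketch of federated averaging (FedAvg), a common aggregation step in federated learning. It is not Substra's API, and the summary does not say which aggregation strategy Substra uses. The function name `federated_average` and the plain-list weight format are assumptions made for this illustration.

```python
def federated_average(client_updates):
    """Combine locally trained weights into one global weight vector.

    client_updates: list of (weights, num_samples) pairs, where weights is a
    list of floats produced by training on one institution's private data.
    Only these weights and sample counts leave each site, never the raw data.
    """
    total_samples = sum(num_samples for _, num_samples in client_updates)
    if total_samples == 0:
        raise ValueError("at least one client must contribute samples")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_samples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send weight vectors of equal length")
        # Each client's contribution is scaled by its share of the total data.
        for i, value in enumerate(weights):
            averaged[i] += value * num_samples / total_samples
    return averaged


# Hypothetical example: two hospitals, the second with three times the data.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Weighting by sample count keeps a small site from pulling the global model as strongly as a large one. A production framework would add secure transport, access control, and audit trails on top of this step.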