
HuggingFace
huggingface-47eb5f3a · 18 events · first seen 1mo ago
Aliases: HuggingFace, huggingface_hub
Recent events (18)
huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning
Hugging Face has released huggingface_hub v1.0, marking a major milestone for the Python client library that underpins access to the Hugging Face Hub ecosystem. The v1.0 designation signals API stability and maturity after five years of development. This library is a foundational piece of open-source ML infrastructure, enabling model downloads, dataset access, and repository management across the broader ML community.
FineVideo: Behind the Scenes — HuggingFace Video Dataset Release
HuggingFace published a behind-the-scenes account of FineVideo, a curated dataset aimed at advancing video understanding in AI/ML models. The post details the data collection, annotation, and curation methodology used to build the dataset. FineVideo is positioned as a resource for training and evaluating multimodal video models.
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
HuggingFace introduces WebSight, a dataset designed to train vision-language models to convert web screenshots into HTML/CSS code. The dataset enables multimodal models to perform screenshot-to-code tasks, a capability relevant to UI automation and web development agents. This work represents a targeted dataset contribution for a specific multimodal grounding task.
Qwen-Image: 20B MMDiT Image Foundation Model with Native Text Rendering
Alibaba's Qwen team has released Qwen-Image, a 20B-parameter MMDiT (Multimodal Diffusion Transformer) image generation foundation model. The model claims significant advances in complex text rendering, including multi-line layouts, paragraph-level semantics, and fine-grained typographic detail in both alphabetic and logographic scripts. It also supports precise image editing and is accessible via Qwen Chat and open-weight repositories on HuggingFace and ModelScope.
Qwen2.5-Math Process Reward Model for Mathematical Reasoning Supervision
Alibaba's Qwen team introduces a process reward model (PRM) aimed at improving the reliability of mathematical reasoning in LLMs by supervising intermediate reasoning steps rather than only final answers. The work addresses the problem of models producing plausible but flawed intermediate derivations even when reaching correct conclusions. The release includes model weights on HuggingFace and ModelScope alongside a GitHub repository.
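The difference between step-level and final-answer supervision can be sketched in a few lines. This is a minimal illustration, not the Qwen implementation: the per-step scores are made-up numbers, and the `min` and product aggregations are common choices assumed here rather than details confirmed by the release.

```python
from math import prod


def aggregate_step_scores(step_scores, how="min"):
    """Collapse per-step correctness scores into one solution-level score."""
    if not step_scores:
        raise ValueError("need at least one step score")
    if how == "min":
        # A solution is only as trustworthy as its weakest step.
        return min(step_scores)
    if how == "prod":
        # Treat the steps as independent and multiply their scores.
        return prod(step_scores)
    raise ValueError(f"unknown aggregation: {how}")


# Hypothetical per-step scores in [0, 1] for two candidate solutions that
# both reach the correct final answer; solution_b has a flawed middle step.
solution_a = [0.95, 0.90, 0.92]
solution_b = [0.95, 0.20, 0.97]

score_a = aggregate_step_scores(solution_a)
score_b = aggregate_step_scores(solution_b)
```

An outcome-only check would rate both solutions equally because their final answers match. The step-level score ranks `solution_a` above `solution_b`, which is the failure case the summary describes.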
Qwen2.5-LLM: Alibaba releases open-weight language models from 0.5B to 72B
Alibaba's Qwen team releases the Qwen2.5 series of decoder-only dense language models, open-sourcing seven variants spanning 0.5B to 72B parameters. The release targets production use cases in the 10-30B range and mobile deployments at 3B scale. This represents a significant expansion of the open-weights frontier from a Tier 1 Chinese AI lab.
DeepSeek V4 Preview Release: 1.6T-param Pro and 284B Flash Models with 1M Context, Open-Sourced
DeepSeek has released DeepSeek-V4 as an open-weights preview, comprising two MoE variants: V4-Pro (1.6T total / 49B active parameters) and V4-Flash (284B total / 13B active parameters). Both models support a 1M-token context by default, enabled by a novel architecture that combines token-wise compression with DeepSeek Sparse Attention (DSA). V4-Pro claims open-source SOTA on agentic coding benchmarks and math/STEM/coding performance rivaling top closed-source models, while V4-Flash offers near-parity reasoning at lower cost and latency. The API was live at release with OpenAI and Anthropic compatibility, and legacy model endpoints will be retired in July 2026.
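As a quick check on what the total/active split means, the arithmetic below computes the share of parameters active per token for each variant. The figures are the ones quoted above (with 1.6T written as 1600B); the helper name `active_fraction` is only illustrative.

```python
def active_fraction(total_params_b, active_params_b):
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_params_b / total_params_b


# (total, active) parameter counts in billions, as quoted in the summary.
models = {
    "V4-Pro": (1600.0, 49.0),
    "V4-Flash": (284.0, 13.0),
}

fractions = {
    name: active_fraction(total, active)
    for name, (total, active) in models.items()
}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of parameters active per token")
```

By these numbers, V4-Pro activates about 3% of its parameters per token and V4-Flash about 4.6%.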
Mistral 7B: Open-Weights 7B Model Outperforming Llama 2 13B
Mistral AI released Mistral 7B, a 7.3B-parameter language model under the Apache 2.0 license that outperforms Llama 2 13B on all evaluated benchmarks and Llama 1 34B on many of them. The model employs Grouped-Query Attention (GQA) for faster inference and Sliding Window Attention (SWA) to handle longer sequences at reduced cost, achieving roughly a 2x speed improvement at 16k sequence length. A fine-tuned chat variant, Mistral 7B Instruct, outperforms all 7B chat models on MT-Bench and is competitive with 13B-class chat models. The release includes deployment support for AWS, GCP, Azure, HuggingFace, and local use via vLLM.
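The cost saving from Sliding Window Attention comes from limiting how many earlier positions each token can attend to. The sketch below builds such a mask for toy sizes. The sequence length and window are illustrative values, and counting the current token as part of the window is an assumed convention, not a detail taken from the release.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Attention is causal (j <= i) and limited to the most recent `window`
    positions, counting the current one.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=6, window=3)

# Number of keys each query position attends to.
attended_per_query = [sum(row) for row in mask]
```

With full causal attention the per-query counts would grow as 1, 2, 3, 4, 5, 6. Here they cap at the window size, so attention cost per token stops growing with sequence length.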
Mistral AI Releases Codestral: 22B Open-Weight Code Generation Model
Mistral AI has released Codestral, a 22B open-weight model explicitly designed for code generation, supporting 80+ programming languages with a 32k context window. The model is available under a non-production license on HuggingFace, with commercial licenses available on request, and is accessible via a dedicated API endpoint (codestral.mistral.ai) free during an 8-week beta. Codestral claims state-of-the-art performance on RepoBench, HumanEval, and fill-in-the-middle benchmarks, outperforming DeepSeek Coder 33B and matching or exceeding GPT-4-Turbo on some language-specific evals. Integrations are available with LlamaIndex, LangChain, Continue.dev, and Tabnine for IDE-based developer workflows.
Mistral AI Releases Mathstral 7B: Math-Specialized Model with SOTA Reasoning in Size Category
Mistral AI has released Mathstral 7B, a math- and STEM-specialized model built on Mistral 7B and developed in collaboration with Project Numina. The model scores 56.6% on MATH and 63.47% on MMLU in standard evaluation, rising to 74.59% on MATH when a reward model selects among 64 sampled candidates, a form of inference-time compute scaling. Weights are open on HuggingFace and compatible with mistral-inference and mistral-finetune tooling.
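Selecting among 64 candidates with a reward model is a best-of-N scheme: sample many answers, score each, keep the top one. The sketch below shows only the selection step. The candidate list and `toy_reward` are hypothetical stand-ins for real model samples and a learned reward model.

```python
def best_of_n(candidates, reward_fn):
    """Return the candidate with the highest reward score."""
    return max(candidates, key=reward_fn)


# Hypothetical stand-in for 64 sampled answers; a real setup would draw
# these from the language model.
possible_answers = ["12", "15", "18"]
candidates = [
    {"id": i, "answer": possible_answers[i % 3]}
    for i in range(64)
]


def toy_reward(candidate):
    # Placeholder scorer that prefers the answer "15"; a real reward model
    # would score the full worked solution.
    return 1.0 if candidate["answer"] == "15" else 0.0


chosen = best_of_n(candidates, toy_reward)
```

The extra accuracy is paid for at inference time: 64 generations plus 64 reward-model calls per problem instead of one generation.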
GLM-5.1 Open-Weights Model Targets Long-Running Agentic Tasks; Andrew Ng on Coding Agent Acceleration by Software Domain
Z.ai released GLM-5.1, an open-weights mixture-of-experts LLM (754B total / 40B active parameters) designed for sustained agentic coding tasks lasting up to eight hours, featuring iterative planning-execution-evaluation loops with thousands of tool calls. The model claims top open-weights performance on the Artificial Analysis Intelligence Index and SWE-Bench Pro, and is available under the MIT license via HuggingFace. The accompanying editorial by Andrew Ng offers a tiered framework for how much coding agents accelerate different categories of software work (frontend most, then backend, then infrastructure, and research least), with practical implications for team organization. A secondary item references data-center opposition and LLM helpfulness failure modes.
Qwen2.5-Turbo Extends Context Length to 1M Tokens
Alibaba's Qwen team has released Qwen2.5-Turbo, extending the model's context window from 128K to 1 million tokens (approximately 1 million English words). The update includes optimizations for both model capabilities and inference performance at extreme context lengths. The model is available via API and through HuggingFace and ModelScope demos.
CodeQwen1.5: Alibaba's Open-Source Code LLM Release
Alibaba's Qwen team released CodeQwen1.5, an open-source large language model specialized for code generation and programming assistance. The release is positioned as a transparent, accessible alternative to proprietary coding assistants like GitHub Copilot, addressing concerns around cost, privacy, security, and copyright. The model is available on GitHub, HuggingFace, and ModelScope.
Qwen1.5-32B: Alibaba's 30B-Parameter Capstone for the Qwen1.5 Series
Alibaba's Qwen team released Qwen1.5-32B, a ~30 billion parameter open-weights language model positioned as the capstone of the Qwen1.5 series. The model targets the emerging consensus around 30B parameters as an optimal balance between performance, memory footprint, and inference efficiency. It is released alongside code on GitHub, weights on HuggingFace and ModelScope, and an interactive demo.
Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters
Alibaba's Qwen team releases Qwen1.5-MoE-A2.7B, a mixture-of-experts model with only 2.7 billion activated parameters that claims performance parity with 7B dense models such as Mistral 7B and Qwen1.5-7B. It activates roughly one-third as many parameters at inference as those dense models, offering significant compute efficiency gains. This release follows growing industry interest in MoE architectures sparked by Mixtral, and the model is available on GitHub, HuggingFace, and ModelScope.
DeepSeek-V2.5: Merged Open-Source Model Combining General and Coding Capabilities
DeepSeek has released DeepSeek-V2.5, an open-source model that merges DeepSeek-V2-Chat-0628 and DeepSeek-Coder-V2-0724 into a single unified model. The release improves general conversational capabilities, coding performance, instruction-following, and writing tasks while also strengthening safety properties: the overall safety score rises from 74.4% to 82.6%, and the safety spillover rate falls from 11.3% to 4.6%. The model is available via backward-compatible API endpoints (deepseek-chat and deepseek-coder) and on HuggingFace, retaining features like Function Calling, FIM completion, and JSON output. Benchmark results show improvements on HumanEval Python and LiveCodeBench, though SWE-verified performance remains an acknowledged weak area.
Mistral Large 2 (123B): New Frontier Model with 128k Context, Multilingual and Code Capabilities
Mistral AI releases Mistral Large 2, a 123-billion-parameter model with a 128k context window, supporting 80+ coding languages and over a dozen natural languages. The model claims competitive performance with GPT-4o, Claude 3 Opus, and Llama 3 405B on code generation, reasoning, and multilingual benchmarks, while targeting cost-efficient single-node inference. Weights are available under a Mistral Research License for non-commercial use, with a commercial license required for self-deployment. The model is accessible via Mistral's la Plateforme API (mistral-large-2407), HuggingFace, and Google Cloud Vertex AI.
Z.ai's GLM-5.1 Open-Weights Model Targets Multi-Hour Agentic Coding Tasks with Iterative Self-Evaluation
Z.ai released GLM-5.1, a 754B parameter mixture-of-experts open-weights model optimized for long-running agentic coding tasks, capable of cycling through planning, execution, and strategy revision hundreds of times over sessions lasting up to eight hours. The model achieves top open-weights scores on the Artificial Analysis Intelligence Index and third place on Arena's Code leaderboard, while leading SWE-Bench Pro in Z.ai's own evaluations at 58.4 percent. Weights are available on HuggingFace under MIT license, with API pricing roughly 40 percent higher than its predecessor but still below comparable proprietary models. No technical report has been published, leaving architecture and training details undisclosed.