7 · arXiv cs.AI (Artificial Intelligence) · 1mo ago

Toto 2.0: Open-Weights Time Series Foundation Models Demonstrate Scaling Laws from 4M to 2.5B Parameters

Datadog releases Toto 2.0, a family of five open-weights time series forecasting models ranging from 4M to 2.5B parameters, with forecast quality improving consistently as model size grows. The models achieve state-of-the-art results on three benchmarks: BOOM (observability), GIFT-Eval (general-purpose), and TIME (contamination-resistant). The release includes architectural details, a u-muP hyperparameter transfer pipeline, and all base checkpoints under the Apache 2.0 license.

Training Infrastructure · Frontier Model Releases · Evaluation and Benchmarking · Open Weights Progress · Toto 2.0 · GIFT-Eval · TIME · Datadog · BOOM · u-muP

Related guides (3)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action


Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier


Training Infrastructure · Topic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI


Related events (8)

9 · OpenAI Blog · 1mo ago

OpenAI Releases gpt-oss-120b and gpt-oss-20b Open-Weight Models Under Apache 2.0

OpenAI is releasing two open-weight language models, gpt-oss-120b and gpt-oss-20b, under the Apache 2.0 license. OpenAI claims the models outperform similarly sized open models on reasoning tasks and offer strong tool use capabilities. They are optimized for efficient deployment on consumer hardware, positioning them as cost-effective alternatives in the open-weights ecosystem.

Frontier Model Releases · Open Weights Progress · Apache 2.0 · GPT-OSS 120B · OpenAI

9 · OpenAI Blog · 1mo ago

OpenAI Releases gpt-oss-120b and gpt-oss-20b Open-Weight Reasoning Models

OpenAI has published model cards for gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models released under the Apache 2.0 license alongside a dedicated gpt-oss usage policy. This marks a significant move by OpenAI into the open-weights space, offering both a large 120B parameter model and a smaller 20B variant. The release signals a strategic shift for OpenAI, which has historically kept its frontier models proprietary.

Frontier Model Releases · Open Weights Progress · gpt-oss usage policy · Apache 2.0 · GPT-OSS 120B

6 · The Batch · 20d ago

Kimi K2.6: Moonshot AI's 1T-Parameter Vision-Language Model Matches Open-Weights Peers, Trails Top Closed Models

Moonshot AI released Kimi K2.6, a 1-trillion-parameter mixture-of-experts vision-language model with 32B active parameters. It is designed for long-horizon autonomous coding sessions lasting multiple days and for multi-agent orchestration that scales to 300 parallel subagents executing up to 4,000 steps. On the Artificial Analysis Intelligence Index it scores 54, roughly level with Qwen3.6 Max Preview and DeepSeek-V4-Pro at 52, while trailing closed models such as GPT-5.5 and Claude Opus 4.7. Weights are freely downloadable from Hugging Face under a modified MIT license permitting commercial use, and API access is priced at $0.95/$0.16/$4.00 per million input/cached/output tokens. Notable features include a 256K-token context window, native INT4 quantization, a 'preserve thinking' mode for multi-turn reasoning continuity, and a research-preview 'claw groups' feature enabling cross-developer agent collaboration.
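
To make the per-million-token API prices quoted above concrete, here is a minimal Python sketch of a cost estimate. The helper name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes cached tokens are billed separately from regular input tokens, which the summary does not confirm.

```python
from decimal import Decimal

# USD prices per one million tokens, as quoted in the summary above.
PRICE_PER_MILLION = {
    "input": Decimal("0.95"),
    "cached": Decimal("0.16"),
    "output": Decimal("4.00"),
}


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one workload.

    Assumes each token is billed in exactly one category.
    """
    usage = {
        "input": input_tokens,
        "cached": cached_tokens,
        "output": output_tokens,
    }
    million = Decimal(1_000_000)
    # Decimal avoids binary floating-point rounding on currency values.
    return sum(PRICE_PER_MILLION[kind] * Decimal(count) / million
               for kind, count in usage.items())


if __name__ == "__main__":
    # Illustrative workload: 2M fresh input, 10M cached, 0.5M output tokens.
    print(estimate_cost(2_000_000, 10_000_000, 500_000))
```

Under these assumptions the illustrative workload costs $1.90 + $1.60 + $2.00 = $5.50, so output tokens account for the largest share of the bill despite being the smallest share of tokens.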

Frontier Model Releases · Evaluation and Benchmarking · Artificial Analysis Intelligence Index · Claude Opus 4.6 · Qwen3.6 Max Preview

8 · Hugging Face Blog · 1mo ago

Falcon 180B Released: New Open-Weights Frontier Model

Technology Innovation Institute (TII) has released Falcon 180B, a 180-billion parameter open-weights language model announced via Hugging Face. At the time of release, it was positioned as the largest publicly available open-weights model, trained on 3.5 trillion tokens. The model is available on Hugging Face Hub for research and commercial use under a custom license.

Frontier Model Releases · Open Weights Progress · Hugging Face · Technology Innovation Institute · Falcon 180B

5 · OpenAI Blog · 1mo ago

GPT-2: 6-Month Follow-Up — 774M Parameter Model Released

OpenAI released the 774 million parameter version of GPT-2 as part of its staged release strategy, following the 124M model in February and 355M model in May 2019. The release is accompanied by an open-source legal agreement to facilitate model-sharing partnerships between organizations. OpenAI also published a technical report on coordinating with the AI research community around publication norms and staged disclosure practices.

Frontier Model Releases · Open Weights Progress · GPT-2 124M · GPT-2 · OpenAI

8 · Qwen Research · 1mo ago

Qwen2.5-LLM: Alibaba releases open-weight language models from 0.5B to 72B

Alibaba's Qwen team releases the Qwen2.5 series of decoder-only dense language models, open-sourcing seven variants spanning 0.5B to 72B parameters. The release targets production use cases in the 10-30B range and mobile deployments at 3B scale. This represents a significant expansion of the open-weights frontier from a Tier 1 Chinese AI lab.

Frontier Model Releases · Open Weights Progress · Qwen2.5 · Alibaba · Qwen Team

7 · arXiv · cs.CL · 19d ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

This paper reframes parameter-efficient fine-tuning (PEFT) not merely as a cheaper alternative to full fine-tuning, but as a substrate for persistent, instance-specific personal models layered atop shared foundation models. The authors analyze three scaling axes: Scale Up (stronger base models amplifying adapter utility), Scale Down (minimum viable adapter size), and Scale Out (managing millions of concurrent adapted instances). They introduce MinT as an infrastructure reference for adapter identity, versioning, provenance, evaluation, and serving at scale.

Training Infrastructure · Inference Economics · LoRA · Parameter-Efficient Fine-Tuning · MinT

7 · The Batch · 19d ago

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports a context of up to 1 million tokens. Nvidia claims the fastest inference speed in the model's size class, at 442 tokens/second, and the model leads open-weights models on the PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models. The investment is framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.

Frontier Model Releases · Open Weights Progress · Nemotron 3 Super 120B-A12B · Nemotron 3 Ultra-500B-A50B · PivotRL