Entity · model

GPT-OSS 120B

modelactivegpt-oss-120b-19c677a7·6 events·first seen May 18, 2026

Aliases: GPT-OSS 120B, gpt-oss-120b, gpt-oss-120B, GPT-OSS-120B

Co-occurring entities

More like this (12)

gpt-oss-20b GPT-OSS gpt-oss-safeguard GPT-J 6B GPT-4o GPT-4o mini GPT-2 124M GPT-4b micro GPT-4V GPT-Neo-2.7B GPT-5.4 mini GPT-4.1 mini

Recent events (6)

5arXiv · cs.CL·Jul 15, 2026·source ↗

EcoSpec: Cost-aware speculative decoding for MoE models reduces expert memory traffic

Researchers introduce EcoSpec, a speculative decoding framework that incorporates predicted marginal expert activation cost into draft-token selection for sparse Mixture-of-Experts LLMs. The key insight is that standard confidence-driven draft selection causes 'expert scattering'—routing draft tokens to disjoint experts increases memory traffic and undermines speculative decoding speedups. EcoSpec uses a lightweight expert predictor and dynamic expert buffer to favor draft paths that reuse already-loaded experts, achieving up to 1.62× end-to-end decoding speedup. Evaluations cover DeepSeek-V3.1 (671B), Qwen3-235B-A22B, and GPT-OSS-120B across reasoning, coding, QA, and dialogue tasks.

Training Infrastructure Inference Economics speculative decoding DeepSeek V4 GPT-OSS 120B +2 more

7The Batch·Jun 2, 2026·source ↗

Alibaba releases Qwen3.5 open-weights vision-language model family with MoE architecture across eight sizes

Alibaba released the Qwen3.5 family of eight open-weights vision-language models ranging from 0.8B to 397B parameters, built on a mixture-of-experts architecture with mixed attention and Gated DeltaNet layers. The flagship Qwen3.5-397B-A17B outperforms GPT-5.2, Claude 4.5 Opus, and Gemini-3 Pro on 28 of 44 vision benchmarks, while the 9B model surpasses OpenAI's gpt-oss-120B on most language tasks. Open weights are available under Apache 2.0, with hosted agentic variants (Qwen3.5-Plus, Qwen3.5-Flash) available via Alibaba Cloud. The release is notable for strong small-model efficiency and comes amid reported team departures following the Qwen3 rollout.

Frontier Model Releases Open Weights Progress GPT-5.2 Alibaba Cloud Model Studio Claude Opus 4.6 +10 more

5The Batch·Jun 1, 2026·source ↗

Researchers at UT-Austin and Google Model Human Decision-Making in Rock-Paper-Scissors

Researchers from UT-Austin and Google used AlphaEvolve, an evolutionary code-optimization method, to synthesize interpretable Python programs that predict move-by-move decisions of LLMs and humans playing rock-paper-scissors against bots. They found that Gemini 2.5 Pro, Gemini 2.5 Flash, and GPT-4.1 share similar sequential-pattern-tracking strategies that are more systematic than typical human play, while GPT-OSS 120B and humans relied on simpler opponent-move-frequency heuristics. The study demonstrates that code synthesis from behavioral data can serve as an interpretability tool for LLM decision-making, revealing that LLMs do not simply mimic human strategies.

Evaluation and Benchmarking AI Safety Research Google Gemini-2.5-Flash-Lite AlphaEvolve +6 more

9Openai Blog·May 20, 2026·source ↗

OpenAI Releases gpt-oss-120b and gpt-oss-20b Open-Weight Reasoning Models

OpenAI has published model cards for gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models released under the Apache 2.0 license alongside a dedicated gpt-oss usage policy. This marks a significant move by OpenAI into the open-weights space, offering both a large 120B parameter model and a smaller 20B variant. The release signals a strategic shift for OpenAI, which has historically kept its frontier models proprietary.

Frontier Model Releases Open Weights Progress gpt-oss usage policy Apache 2.0 GPT-OSS 120B +4 more

9Openai Blog·May 20, 2026·source ↗

OpenAI Releases gpt-oss-120b and gpt-oss-20b Open-Weight Models Under Apache 2.0

OpenAI is releasing two open-weight language models, gpt-oss-120b and gpt-oss-20b, under the Apache 2.0 license. The models are claimed to outperform similarly sized open models on reasoning tasks and feature strong tool use capabilities. They are optimized for efficient deployment on consumer hardware, positioning them as cost-effective alternatives in the open-weights ecosystem.

Frontier Model Releases Open Weights Progress Apache 2.0 GPT-OSS 120B OpenAI +3 more

8Mistral Ai News·May 18, 2026·source ↗

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.

Long Context Evolution Frontier Model Releases Mistral AI Mistral Small 4 Pixtral +14 more