exploitbench-870055b5 · 1 event · first seen · Aliases: ExploitBench
OpenAI announced a preview of three vision-language models, GPT-5.6 Sol, Terra, and Luna, listed in descending order of capability and price. The preview is currently available only to U.S. government-approved organizations via the API and Codex. GPT-5.6 Sol, the flagship tier, features a new 'max reasoning' mode and an 'ultra mode' that spawns multiple subagents for multi-step tasks. It achieved state-of-the-art results on Terminal-Bench 2.1 (91.9%) and approached Claude Mythos 5 on ExploitBench. The models include layered biosecurity and cybersecurity guardrails. Independent evaluations from METR and SecureBio yielded mixed but notable findings, most notably a nearly 10-point jump in biology knowledge over GPT-5.5 and ambiguous autonomous task-duration results from METR. A wider public release is planned within weeks.