APEX-Agents-AA

benchmark · active · provisional · apex-agents-aa-6ab2469f · 2 events · first seen 18d ago

Aliases: APEX-Agents-AA

Recent events (2)

6 · The Batch · 18d ago

Google Launches Gemini 3.5 Flash: Mid-Tier Model With Agentic Gains at 3x Higher Price

Google released Gemini 3.5 Flash at Google I/O 2026, a mixture-of-experts multimodal model with adjustable reasoning levels, thought preservation across multi-turn conversations, and a 1M-token context window. The model tops the APEX-Agents-AA and MMMU-Pro benchmarks among Flash-tier models but trails leading frontier models on overall intelligence, knowledge, and coding. Pricing is $1.50/$9.00 per million input/output tokens, three times the cost of its predecessor, Gemini 3 Flash, which raises questions about whether Google is positioning Flash as a mid-tier rather than a budget offering. Independent testing found that it costs more in practice than Gemini 3.1 Pro, despite Google's claims of competitive pricing.
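
To make the pricing arithmetic above concrete, here is a minimal sketch of how a per-request cost follows from the quoted per-million-token rates. The `Pricing` class and the token counts are hypothetical illustrations, not part of any Google API. The predecessor's rates are not quoted in the summary; they are inferred here only from the "three times" statement and should be treated as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per one million tokens (hypothetical helper for illustration)."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of one request with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates quoted in the summary: $1.50 input / $9.00 output per million tokens.
GEMINI_3_5_FLASH = Pricing(input_per_million=1.50, output_per_million=9.00)

# Assumption: inferred by dividing the quoted rates by three, since the
# summary says the new price is three times the predecessor's.
PREDECESSOR_IMPLIED = Pricing(input_per_million=1.50 / 3,
                              output_per_million=9.00 / 3)

# Hypothetical request: 200k input tokens and 50k output tokens.
new_cost = GEMINI_3_5_FLASH.cost(200_000, 50_000)
old_cost = PREDECESSOR_IMPLIED.cost(200_000, 50_000)
print(f"new: ${new_cost:.2f}, implied predecessor: ${old_cost:.2f}")
```

Real-world cost also depends on how many tokens a model actually consumes per task, so list prices alone do not settle which model is cheaper in practice.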

6 · The Batch · 18d ago

Gemini 3.5 Flash Launch, AI FDE Job Trends, AI Act Delays, and Agent-Driven Web Traffic

Google launched Gemini 3.5 Flash, a mid-tier multimodal mixture-of-experts model with improved agentic capabilities, visual understanding, and speed, priced at $1.50/$9.00 per million input/output tokens, three times the cost of its predecessor, Gemini 3 Flash. The model supports a context window of up to 1M tokens, adjustable reasoning levels, and thought preservation across multi-turn conversations, and it tops the Artificial Analysis APEX-Agents-AA and MMMU-Pro benchmarks. The issue also covers Andrew Ng's commentary on the rise of AI Forward Deployed Engineers versus the broader AI Engineer role, plus news items on EU AI Act implementation delays and AI agents driving measurable shifts in online traffic.