OmniGameArena
omnigamearena-24881536 · 1 event · first seen 8d ago
Aliases: OmniGameArena
Recent events (1)
OmniGameArena: UE5 benchmark for VLM game agents with multi-round improvement dynamics
Researchers introduce OmniGameArena, a real-time benchmark of twelve Unreal Engine 5 games spanning solo, PvP, and cooperative play. It evaluates vision-language model (VLM) agents under unified protocols across commercial VLMs, open-weight VLMs, and specialized game policies. The benchmark also defines the Improvement Dynamics Curve (IDC), an agentic-reflection harness in which a tool-using LLM autonomously refines skill prompts over multiple rounds, showing how agent performance evolves and generalizes rather than reporting only a single cold-start score. Twelve VLM agents are evaluated on the leaderboard, and four top agents are further analyzed under IDC. The work addresses gaps in existing game benchmarks, which report only single-attempt scores and lack multi-agent or cooperative evaluation modes.
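The multi-round refinement behind IDC can be pictured with a minimal sketch. The summary does not say how the curve is computed, so everything here is an assumption for illustration: the function `improvement_curve`, the `play` and `reflect` callables, and the toy scoring rule are hypothetical and are not part of OmniGameArena's actual API.

```python
from typing import Callable, List


def improvement_curve(
    play: Callable[[str], float],
    reflect: Callable[[str, float], str],
    initial_prompt: str,
    rounds: int,
) -> List[float]:
    """Return one score per round; index 0 is the cold-start score.

    Hypothetical harness: `play` runs an episode with the given skill
    prompt and returns a score, `reflect` returns a revised prompt.
    """
    prompt = initial_prompt
    scores: List[float] = []
    for _ in range(rounds):
        score = play(prompt)             # evaluate the current skill prompt
        scores.append(score)
        prompt = reflect(prompt, score)  # refine the prompt for the next round
    return scores


# Toy stand-ins (assumptions): the score counts the "tip" tokens in the
# prompt, and each reflection step appends one more tip.
def toy_play(prompt: str) -> float:
    return float(prompt.count("tip"))


def toy_reflect(prompt: str, score: float) -> str:
    return prompt + " tip"


curve = improvement_curve(toy_play, toy_reflect, "play the game", rounds=4)
print(curve)
```

In a real harness, `play` would run a game episode and `reflect` would call a tool-using LLM. The point of the sketch is only that the result is a sequence of per-round scores, of which a single-attempt benchmark would report just the first.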