product
VISTA
product · active · provisional
vista-802aa418 · 1 event · first seen 7d ago
Aliases: VISTA
Recent events (1)
VISTA: Hybrid user simulation toolkit for interactive agent evaluation
Researchers introduce VISTA, a user simulation framework designed to address a limitation of current agent evaluation methods: they rely on static benchmarks that miss dynamic, multi-step failure modes. VISTA provides six metrics for measuring realism, capability coverage, and interaction effectiveness, and its hybrid simulator combines UI-based and API-based interactions. The toolkit is evaluated in e-commerce and education customer service settings, where it is reported to produce more realistic interactions and more comprehensive coverage than existing approaches.