EVA
eva-4467f6dd·2 events·first seen 1mo agoAliases: EVA
Co-occurring entities
More like this (12)
Recent events (2)
A New Framework for Evaluating Voice Agents (EVA)
ServiceNow AI has published a blog post on Hugging Face introducing EVA, a new evaluation framework designed specifically for voice agents. The framework appears to address gaps in existing evaluation methodologies for assessing voice-based AI agent performance. As voice agents become more prevalent in enterprise and consumer settings, standardized evaluation protocols are increasingly important for benchmarking progress.
Data Points: OpenAI shuts down Sora, Anthropic multi-agent harness, EVA voice benchmark, Arm AGI CPU, White House AI preemption proposal
OpenAI is shutting down its Sora text-to-video platform without explanation, ending a major Disney licensing deal worth up to $1 billion and eliminating video capabilities from ChatGPT amid Hollywood copyright tensions. Anthropic published details on a multi-agent harness enabling Claude to build full-stack applications over multi-hour sessions using a planner-generator-evaluator architecture. ServiceNow AI Research released EVA, an open-source two-dimensional benchmark for voice agents measuring both task accuracy and conversational experience quality. Additional items cover Arm's first self-designed data center CPU (AGI CPU) co-developed with Meta, and the Trump Administration's legislative proposal for a federal AI framework that would preempt state AI laws.