Entity · benchmark

EVA

benchmark · active · eva-4467f6dd · 2 events · first seen May 18, 2026

Aliases: EVA

Co-occurring entities

ServiceNow AI Research, ClawBot, Playwright, Disney, Claude, Arm, Arm AGI CPU, Trump Administration, OpenClaw, Baidu, Alibaba, Whisper large-v3, WeChat, OpenAI, Tencent, Sora, Meta, Anthropic, ServiceNow AI, Hugging Face

More like this (12)

PEVA, EVA-Bench Data 2.0, EvoStruct, EEVEE, EMO, VideoVAE+, AlphaEvolve, EnvFactory, Ecom-RLVE, EG-VQA, Point-E, EVAgent

Recent events (2)

7 · The Batch · Jun 2, 2026

Data Points: OpenAI shuts down Sora, Anthropic multi-agent harness, EVA voice benchmark, Arm AGI CPU, White House AI preemption proposal

OpenAI is shutting down its Sora text-to-video platform without explanation, ending a major Disney licensing deal worth up to $1 billion and eliminating video capabilities from ChatGPT amid Hollywood copyright tensions. Anthropic published details on a multi-agent harness enabling Claude to build full-stack applications over multi-hour sessions using a planner-generator-evaluator architecture. ServiceNow AI Research released EVA, an open-source two-dimensional benchmark for voice agents measuring both task accuracy and conversational experience quality. Additional items cover Arm's first self-designed data center CPU (AGI CPU) co-developed with Meta, and the Trump Administration's legislative proposal for a federal AI framework that would preempt state AI laws.

Training Infrastructure, Frontier Model Releases, ServiceNow AI Research, ClawBot, Playwright

4 · Hugging Face Blog · May 18, 2026

A New Framework for Evaluating Voice Agents (EVA)

ServiceNow AI has published a blog post on Hugging Face introducing EVA, a new evaluation framework designed specifically for voice agents. The framework appears to address gaps in existing evaluation methodologies for assessing voice-based AI agent performance. As voice agents become more prevalent in enterprise and consumer settings, standardized evaluation protocols are increasingly important for benchmarking progress.

Evaluation and Benchmarking, Agent and Tool Ecosystem, ServiceNow AI, Hugging Face, EVA