Entity · benchmark

TAC (Travel Agent Compassion)

benchmark · active · tac-travel-agent-compassion--07a98639 · 1 event · first seen Jun 17, 2026

Aliases: TAC (Travel Agent Compassion)

Co-occurring entities

GPT-5.2 · Claude Opus 4.6 · DeepSeek V4 · Inspect Scout · Gemini-2.5-Flash-Lite · EU General-Purpose AI Code of Practice · OpenAI · GPT-5.5 · Anthropic

Recent events (1)

5 · arXiv · cs.CL · Jun 17, 2026

TAC benchmark finds frontier AI agents avoid animal-exploitative travel options less often than chance

Researchers introduce TAC (Travel Agent Compassion), the first agentic benchmark testing whether AI agents avoid animal-exploitative options when booking travel on behalf of users. Across 48 scenarios spanning six exploitation categories, all seven evaluated frontier models score below the 64% chance baseline, with the best performer (Claude Opus 4.7) at 53%. Adding a single welfare-aware sentence to the system prompt yields large gains for Claude and GPT-5.5 (47 to 63 percentage points) but has minimal effect on DeepSeek and Gemini models. The study highlights a gap between the welfare reasoning models express in text responses and the decisions they make when acting as agents.

Evaluation and Benchmarking · AI Safety Research · GPT-5.2 · Claude Opus 4.6 · DeepSeek V4 · +8 more