Inspect Scout

product · active · inspect-scout-5e357dc5 · 1 event · first seen Jun 17, 2026

Aliases: Inspect Scout

Co-occurring entities

GPT-5.2 Claude Opus 4.6 DeepSeek V4 TAC (Travel Agent Compassion) Gemini-2.5-Flash-Lite EU General-Purpose AI Code of Practice OpenAI GPT-5.5 Anthropic

More like this (12)

Llama 4 Scout · XLSCOUT · Llama 4 Scout 17B-16E · SkillSpector · Scan, Pilot, Scale · OpenInspect · MCP Inspector · SCOPE · AgentSpec · shot-scraper · Semgrep · AgentScope

Recent events (1)

5 · arXiv · cs.CL · Jun 17, 2026

TAC benchmark finds frontier AI agents avoid animal-exploitative travel options less often than chance

Researchers introduce TAC (Travel Agent Compassion), the first agentic benchmark testing whether AI agents avoid animal-exploitative options when booking travel on behalf of users. Across 48 scenarios spanning six exploitation categories, all seven evaluated frontier models score below the 64% chance baseline, with the best performer (Claude Opus 4.7) at 53%. A single welfare-aware sentence in the system prompt yields dramatic gains in Claude and GPT-5.5 (47-63 percentage points) but minimal effect on DeepSeek and Gemini models. The study highlights a gap between models' text-response welfare reasoning and their agentic decision-making behavior.

Evaluation and Benchmarking AI Safety Research GPT-5.2 Claude Opus 4.6 DeepSeek V4