6 · arXiv cs.CL (Computation and Language) · 12d ago

CHAP: Collaborative Human-Agent Protocol for structured human-AI accountability in multi-agent deployments

Researchers from BrightbeamAI introduce CHAP (Collaborative Human-Agent Protocol), a protocol specification for formalizing human-agent collaboration in production multi-agent systems. CHAP defines shared workspaces, structured override events with diffs and rationales, non-repudiable signed approvals, and an append-only evidence log, filling a gap left by MCP (tool access) and A2A (agent-to-agent interoperability). The protocol ships with a reference implementation, conformance suite, and worked examples. It targets high-stakes deployments in domains like clinical decisions, contracts, and code where human judgment must be auditable and replayable.
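To make two of the listed components concrete, here is a minimal Python sketch of an override event (diff plus rationale) recorded in an append-only, hash-chained evidence log. It is an illustration under stated assumptions, not the CHAP schema or its reference implementation: the field names, the `EvidenceLog` class, and the `override_event` helper are all hypothetical. Signed approvals are left out because the Python standard library has no asymmetric signatures, and a shared-secret MAC would not be non-repudiable.

```python
import difflib
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical starting value for the hash chain (not taken from the CHAP spec).
GENESIS = "0" * 64


def _digest(payload: dict, prev_hash: str) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


@dataclass
class EvidenceLog:
    """Append-only log: each entry commits to the one before it."""
    entries: list = field(default_factory=list)

    def append(self, payload: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(payload, prev_hash)
        self.entries.append(
            {"payload": payload, "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(entry["payload"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


def override_event(reviewer: str, agent_output: str,
                   human_output: str, rationale: str) -> dict:
    """Build a structured override: who changed what, and why (hypothetical fields)."""
    diff = list(difflib.unified_diff(
        agent_output.splitlines(), human_output.splitlines(),
        fromfile="agent", tofile="human", lineterm="",
    ))
    return {"type": "override", "reviewer": reviewer,
            "diff": diff, "rationale": rationale}


if __name__ == "__main__":
    log = EvidenceLog()
    log.append(override_event(
        "reviewer-1", "Dose: 50 mg", "Dose: 25 mg", "Renal impairment",
    ))
    print(log.verify())
```

Chaining each entry to the hash of its predecessor is one common way to make a log tamper-evident and replayable; a production system would also need signatures and durable storage, which this sketch does not attempt.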

AI Safety Research · Agent and Tool Ecosystem · BrightbeamAI · Collaborative Human-Agent Protocol · MCP · A2A Protocol

Related guides (2)

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

7 · Anthropic News · 19d ago

Anthropic publishes framework for safe and trustworthy agent development

Anthropic released a formal framework for responsible agent development, articulating principles around human oversight, transparency, value alignment, and privacy for autonomous AI agents. The document draws on Claude Code as a reference implementation and cites enterprise deployments at Trellix and Block as real-world examples. The framework is positioned as a contribution to emerging industry standards for agentic AI systems, acknowledging open technical challenges in value alignment measurement and oversight calibration.

AI Safety Research · Regulatory Developments · Block · Claude Code · Trellix · +2 more

6 · arXiv cs.AI · 9d ago

AgentBeats: Standardized Agent Evaluation via A2A and MCP Protocols

A new arXiv preprint proposes Agentified Agent Assessment (AAA), a framework where evaluation is performed by judge agents interacting through standardized protocols—A2A for task management and MCP for tool access—rather than bespoke benchmark harnesses. The authors introduce AgentBeats as a concrete implementation, validated through a five-month open competition with 298 judge agents and 467 subject agents across 12 categories, plus a coding-agent case study. The work addresses fragmentation in agent evaluation by decoupling assessment logic from agent implementation, enabling reproducible and interoperable benchmarking.

Evaluation and Benchmarking · Agent and Tool Ecosystem · AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility · AgentBeats · MCP · +1 more

5 · The Batch · 18d ago

Andrew Ng proposes Stack Overflow-style knowledge sharing for AI coding agents via chub

Andrew Ng describes the vision for chub (Context Hub), a CLI tool providing up-to-date API documentation to coding agents, which reached over 5,000 GitHub stars in its first week. The piece argues for a Stack Overflow-like feedback loop where agents that discover bugs or better API usage patterns can contribute learnings back to shared documentation. Ng also references Moltbook, a Reddit-like social network for agents recently acquired by Meta, as inspiration for agent-to-agent knowledge sharing. The post outlines early-stage work on agentic deep research to expand chub's documentation collection from under 100 to nearly 1,000 documents.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · DeepLearning.AI · Xin Ye · Rohit Prasad · +4 more

4 · Hugging Face Blog · 1mo ago

MCP for Research: How to Connect AI to Research Tools

Hugging Face published a blog post explaining how the Model Context Protocol (MCP) can be used to connect AI agents to research tools and data sources. The post covers practical patterns for integrating AI with academic and scientific workflows using MCP as a standardized interface layer. This is a commentary/tutorial piece aimed at researchers looking to extend AI agent capabilities into domain-specific tooling.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Hugging Face · Anthropic · Model Context Protocol

3 · GitHub Trending · 13d ago

agent-teams-ai: multi-agent orchestration framework with kanban-style oversight

A TypeScript open-source project on GitHub implements a multi-agent system where autonomous agents handle tasks, communicate with each other, and review each other's work, while the user supervises via a kanban board. The framework supports 200+ models across 75+ LLM providers including Codex, Claude, and OpenCode. It has accumulated 1,189 stars with 56 added today, suggesting growing community interest.

Agent and Tool Ecosystem · agent-teams-ai · OpenAI · Anthropic

6 · arXiv cs.AI · 16d ago

Benchmark Agent: Autonomous system for end-to-end benchmark construction

Researchers introduce Benchmark Agent, a fully autonomous agentic system that orchestrates the complete benchmark construction pipeline — from query analysis and subtask design to data annotation and quality control. The system was used to produce 15 benchmarks spanning text understanding, multimodal understanding, and domain-specific reasoning, with evaluation via human judges, LLM-as-a-judge, and consistency checks. The work addresses two persistent problems in the field: the labor intensity of benchmark creation and rapid performance saturation after release. Code and a demo will be publicly released.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Benchmark Everything Everywhere All at Once · Benchmark Agent

8 · Anthropic News · 1mo ago

Anthropic Open-Sources the Model Context Protocol (MCP)

Anthropic has released the Model Context Protocol (MCP), an open standard enabling secure, two-way connections between AI assistants and external data sources such as business tools, content repositories, and development environments. The protocol introduces a client-server architecture with SDKs, local MCP server support in Claude Desktop, and a repository of pre-built connectors for systems like GitHub, Slack, Google Drive, and Postgres. Early adopters include Block and Apollo, with development tool companies Zed, Replit, Codeium, and Sourcegraph integrating MCP into their platforms. The goal is to replace fragmented, per-source integrations with a single universal protocol, improving context availability for AI agents.

Inference Economics · Enterprise Deployment Patterns · Justin Spahr-Summers · David Soria Parra · Zed · +10 more

4 · Hugging Face Blog · 1mo ago

Tiny Agents: an MCP-powered agent in 50 lines of code

Hugging Face published a blog post demonstrating how to build a minimal AI agent using the Model Context Protocol (MCP) in approximately 50 lines of code. The post showcases how MCP enables agents to discover and invoke tools dynamically, reducing the boilerplate required for agentic workflows. This serves as both a tutorial and a commentary on MCP's role in simplifying agent-tool integration in the current ecosystem.

Agent and Tool Ecosystem · Hugging Face · Tiny Agents · Model Context Protocol