Almanac
benchmark

MCP-Universe

benchmark · active · provisional · mcp-universe-32162629 · 1 event · first seen 5d ago

Aliases: MCP-Universe

Recent events (1)

6 · arXiv · cs.CL · 5d ago

HyperTool: Unified executable MCP-style interface reduces step-wise tool call overhead for LLM agents

HyperTool introduces a unified executable interface that lets an LLM agent issue multiple tool calls within a single code block, keeping intermediate dataflow out of the main reasoning trace. This addresses an 'execution-granularity mismatch': step-wise atomic tool calls consume context and force the model to manage low-level operations. On the MCP-Universe benchmark, HyperTool more than doubles accuracy for Qwen3-32B (15.69% → 35.29%) and more than triples it for Qwen3-8B (9.93% → 33.33%), outperforming GPT-OSS and Kimi-k2.5.
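
The contrast between step-wise calls and a single executable block can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tools `search_flights` and `get_baggage_fee`, the helpers `run_stepwise` and `run_code_block`, and the use of a plain list as the "reasoning trace" are hypothetical and are not taken from HyperTool or MCP-Universe. It only shows why batching calls inside one block leaves fewer intermediate results in the trace.

```python
# Hypothetical tools: names and return values are illustrative assumptions,
# not part of HyperTool or the MCP-Universe benchmark.
def search_flights(origin, dest):
    return [
        {"id": "F1", "price": 420},
        {"id": "F2", "price": 310},
        {"id": "F3", "price": 515},
    ]


def get_baggage_fee(flight_id):
    return {"F1": 30, "F2": 60, "F3": 0}[flight_id]


TOOLS = {"search_flights": search_flights, "get_baggage_fee": get_baggage_fee}


def run_stepwise(trace):
    """Step-wise style: every atomic call and its raw result enters the trace."""
    flights = search_flights("SFO", "JFK")
    trace.append(("search_flights", flights))
    totals = {}
    for flight in flights:
        fee = get_baggage_fee(flight["id"])
        trace.append(("get_baggage_fee", fee))
        totals[flight["id"]] = flight["price"] + fee
    best = min(totals, key=totals.get)
    return best, totals[best]


def run_code_block(trace, code):
    """Block style: run one model-written block; only `result` enters the trace.

    exec() is used purely for illustration. Running untrusted model output
    this way is unsafe; a real system would need a sandbox.
    """
    namespace = dict(TOOLS)  # expose the tools as plain functions
    exec(code, namespace)
    trace.append(("code_block", namespace["result"]))
    return namespace["result"]


# A block the model might write: intermediate values stay inside the block.
BLOCK = """
flights = search_flights("SFO", "JFK")
totals = {f["id"]: f["price"] + get_baggage_fee(f["id"]) for f in flights}
best = min(totals, key=totals.get)
result = (best, totals[best])
"""

if __name__ == "__main__":
    stepwise_trace, block_trace = [], []
    print(run_stepwise(stepwise_trace), len(stepwise_trace))
    print(run_code_block(block_trace, BLOCK), len(block_trace))
```

In this toy setup both styles reach the same answer, but the step-wise trace holds four entries (one search plus three fee lookups) while the block trace holds one. The gap grows with the number of intermediate calls, which is the overhead the summary above describes.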