benchmark
MAS-PromptBench
benchmark · active · provisional
mas-promptbench-74048eb1 · 1 event · first seen 7h ago
Aliases: MAS-PromptBench
Recent events (1)
MAS-PromptBench: Systematic study of prompt optimization in multi-agent LLM systems
A new arXiv preprint introduces MAS-PromptBench, a benchmark and accompanying study of when, and by how much, system-prompt optimization improves multi-agent LLM systems (MAS). The authors evaluate two prompt optimizers across MAS configurations that vary in task, workflow, communication protocol, and team size. The results show that prompt optimization can yield significant gains, but they also expose open challenges, most notably a search space that grows exponentially with the number of agents.
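To give a rough sense of the search-space growth mentioned above, here is a minimal sketch under a simplifying assumption that does not come from the preprint: each agent independently picks one system prompt from a fixed pool of candidates. The function name `joint_prompt_space` and the pool size of 10 are hypothetical and chosen only for illustration.

```python
def joint_prompt_space(num_agents: int, candidates_per_agent: int) -> int:
    """Count joint prompt assignments for a team of agents.

    Assumption (illustrative, not from the preprint): every agent chooses
    exactly one prompt from its own pool of `candidates_per_agent` options,
    independently of the other agents.
    """
    if num_agents < 0 or candidates_per_agent < 0:
        raise ValueError("both arguments must be non-negative")
    return candidates_per_agent ** num_agents


if __name__ == "__main__":
    # Hypothetical pool of 10 candidate prompts per agent.
    for team_size in (1, 2, 4, 8):
        print(team_size, joint_prompt_space(team_size, 10))
```

Under this assumption, each additional agent multiplies the number of joint configurations by the pool size, which is why exhaustively searching every combination quickly becomes impractical for larger teams.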