5 · arXiv cs.AI (Artificial Intelligence) · 23d ago

SwarmHarness: Decentralized Skill-Based Task Routing Protocol for AI Agent Networks

SwarmHarness is a proposed decentralized protocol enabling AI compute sharing and task routing across heterogeneous nodes (workstations, inference servers, edge devices) without a central coordinator. It combines a DHT-based registry for peer discovery, a utility-function router dispatching tasks by capability/load/latency/trust, and a Shapley-value-based credit mechanism to align incentives among participating nodes. The system is designed as a foundational primitive for autonomous multi-agent networks where agents can hire compute, route subtasks, and settle credits without human intermediation. It positions itself against existing approaches like Golem, BrokerChain, BOINC, and Petals by integrating decentralization with a native incentive layer.
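The summary names two mechanisms that are easier to follow with a small example: a utility-function router and a Shapley-value credit split. The sketch below is illustrative only and is not taken from the paper. The `Node` fields, the `WEIGHTS` values, the latency cap, and the functions `utility`, `route`, and `shapley` are all hypothetical names and assumptions. The Shapley computation is the exact textbook form (average marginal contribution over all orderings), which is only practical for a handful of nodes.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Node:
    name: str
    skills: FrozenSet[str]  # capabilities the node advertises
    load: float             # 0.0 (idle) to 1.0 (saturated)
    latency_ms: float       # measured round-trip time
    trust: float            # 0.0 to 1.0 reputation score


# Assumed weights; the summary does not state the real utility function.
WEIGHTS = {"load": 0.4, "latency": 0.3, "trust": 0.3}
MAX_LATENCY_MS = 500.0  # assumed cap used to normalise latency


def utility(node: Node) -> float:
    """Weighted score: lower load, lower latency, higher trust is better."""
    latency_score = max(0.0, 1.0 - node.latency_ms / MAX_LATENCY_MS)
    return (WEIGHTS["load"] * (1.0 - node.load)
            + WEIGHTS["latency"] * latency_score
            + WEIGHTS["trust"] * node.trust)


def route(nodes: List[Node], skill: str) -> Node:
    """Filter by capability, then pick the highest-utility node."""
    capable = [n for n in nodes if skill in n.skills]
    if not capable:
        raise LookupError(f"no node offers skill {skill!r}")
    return max(capable, key=utility)


def shapley(players: List[str],
            value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


if __name__ == "__main__":
    a = Node("a", frozenset({"embed"}), load=0.2, latency_ms=50.0, trust=0.9)
    b = Node("b", frozenset({"embed", "ocr"}), load=0.9, latency_ms=20.0, trust=0.95)
    print(route([a, b], "embed").name)

    # Made-up coalition values: what each set of nodes earns on a job.
    v = {frozenset(): 0.0, frozenset({"a"}): 10.0,
         frozenset({"b"}): 20.0, frozenset({"a", "b"}): 40.0}
    print(shapley(["a", "b"], lambda c: v[c]))
```

Filtering on capability before scoring keeps a busy but capable node eligible while excluding nodes that lack the skill altogether. A real deployment would presumably need an approximate Shapley estimate, since the exact form enumerates every ordering of the participants.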

Training Infrastructure · Inference Economics · Agent and Tool Ecosystem · SwarmCredit · Golem · SwarmRegistry · SwarmHarness · Distributed Hash Table · Shapley values · SwarmRouter · Petals · BrokerChain · BOINC · HarnessAPI

Related guides (3)

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer


Training Infrastructure · Topic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI


Inference Economics · Topic guide

Inference Economics: The Cost of Running AI in Production


Related events (8)

3 · GitHub Trending · 22d ago

Awesome Harness Engineering: Curated List for AI Agent Infrastructure

A GitHub repository aggregating resources on AI agent harness engineering, covering tools, patterns, evaluations, memory systems, MCP (Model Context Protocol), permissions, observability, and orchestration. The list has accumulated 1,318 stars with 39 added today, indicating moderate community traction. It serves as a reference index rather than original research or tooling.

Evaluation and Benchmarking · Agent and Tool Ecosystem · ai-boost/awesome-harness-engineering · Model Context Protocol

6 · arXiv · cs.LG · 25d ago

From Model Scaling to System Scaling: Scaling the Harness in Agentic AI

This paper argues that the next major bottleneck in agentic AI is system-level design—what the authors call 'scaling the harness'—rather than continued model scaling alone. The agent harness encompasses memory substrates, context constructors, skill-routing layers, orchestration loops, and verification/governance components that together translate model capability into long-horizon behavior. The authors identify three core bottlenecks (context governance, trustworthy memory, dynamic skill routing) and propose harness-level benchmarks measuring trajectory quality, memory hygiene, and verification cost. They introduce CheetahClaws, a Python-native reference harness, and compare it against Claude Code and OpenClaw.

Evaluation and Benchmarking · Inference Economics · SafeRL-Lab · dynamic skill routing · Scaling the Harness (paper) · +8 more

6 · arXiv · cs.CL · 1mo ago

Code as Agent Harness: A Survey of Code as Operational Substrate for Agentic AI Systems

This survey paper introduces the concept of 'code as agent harness,' framing code not merely as output but as the operational infrastructure for LLM-based agents—covering reasoning, action, environment modeling, and execution-based verification. The authors organize the analysis across three layers: harness interface, harness mechanisms (planning, memory, tool use, feedback control), and scaling to multi-agent systems. Applications span coding assistants, GUI/OS automation, embodied agents, scientific discovery, and enterprise workflows. Open challenges include evaluation beyond task success, verification under incomplete feedback, and human oversight for safety-critical actions.

Evaluation and Benchmarking · AI Safety Research · embodied agents · large language models · Code as Agent Harness · +6 more

7 · arXiv · cs.CL · 24d ago

SIA: Self-Improving AI via Joint Harness and Weight Updates

SIA proposes a self-improving loop in which a Feedback-Agent simultaneously updates both the scaffold (harness) and the model weights of a task-specific agent, unifying two previously disjoint research lines: meta-agent scaffold rewriting and test-time training. The system is evaluated on three benchmarks (Chinese legal charge classification, GPU kernel optimization, and single-cell RNA denoising), with reported results over baselines of a 56.6% gain, a 91.9% runtime reduction, and a 502% gain, respectively. The paper argues that harness updates shape agentic behavior while weight updates instill domain intuition that prompting alone cannot provide, and that combining both levers consistently outperforms either alone.

Frontier Model Releases · Evaluation and Benchmarking · LawBench · SIA (Self Improving AI) · harness update · +4 more

5 · GitHub Trending · 19d ago

ruvnet/ruflo: Agent Meta-Harness for Claude with Multi-Agent Swarm Coordination

Ruflo is an open-source TypeScript framework positioning itself as a meta-harness for Claude-based multi-agent systems. It features adaptive memory, swarm intelligence coordination, RAG integration, and native Claude Code/Codex integration. The project has accumulated 57,231 stars with 354 added today, indicating significant community traction.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · RAG · ruflo · Claude · +3 more

5 · The Batch · 28d ago

Hermes Agent Challenges OpenClaw on Token Usage Leaderboard; Agent Self-Improvement Highlighted

Hermes Agent, an open-source AI agent from Nous Research launched in February 2026, has surpassed OpenClaw on OpenRouter's daily token consumption leaderboard. Hermes Agent differentiates itself through a memory architecture and automatic skill-building capability using the SKILL.md format, enabling self-improvement as a core agentic feature. It supports local and cloud deployment, integrates with ~20 messaging services, and works with a wide variety of LLMs via the Agent Communication Protocol. The piece also covers Andrew Ng's commentary on Harvard's grade-capping policy, which is tangential to AI/ML.

Open Weights Progress · Agent and Tool Ecosystem · DeepLearning.AI · OpenClaw · Agent Communication Protocol · +5 more

5 · arXiv · cs.CL · 25d ago

PolyGnosis 2.0: Multi-Agent Architecture for Prediction Market Intelligence via Harness Engineering

PolyGnosis 2.0 introduces a multi-agent system that synthesizes Polymarket prediction market signals with GDELT OSINT streams to identify 'Perspective Mismatches' as trading signals. The paper rigorously evaluates agentic harness engineering techniques—reflection loops, tool-calling, divide-and-conquer partitioning, and chain-of-thought—in high-noise financial domains. Key empirical findings include that structural partitioning is necessary for multi-dimensional alignment, but unconstrained terminal reflection induces logical drift, and a pervasive consensus bias emerges across agent configurations. The authors identify a Pareto-optimal configuration achieving professional-grade analytical precision with minimized latency and token overhead.

Evaluation and Benchmarking · Agent and Tool Ecosystem · PolyGnosis 2.0 · Divide-and-Conquer Partitioning · Harness Engineering · +4 more

6 · The Batch · 8d ago

Andrew Ng introduces OpenCoworker, an open-source desktop AI agent harness

Andrew Ng and collaborators Rohit Prasad and Devika Verma have released OpenCoworker, a free open-source desktop agent built by extending the aisuite library to support agent harnesses. The tool allows users to connect frontier LLMs (OpenAI, Anthropic, Google) or local models via Ollama to desktop tasks including file access, messaging, and workflow automation, with privacy as a design priority. Ng frames this as a response to data-retention concerns with commercial desktop agents, citing Anthropic's Fable release as a recent example of policy opacity. The post also provides a concise overview of the current desktop agent landscape and the shift toward LLM-driven agentic loops.

Open Weights Progress · Agent and Tool Ecosystem · Ollama · DeepLearning.AI · aisuite · +7 more