4arXiv cs.AI (Artificial Intelligence)·39h ago

Two-step constraint programming method for optimal resource scheduling in autonomous laboratory AI agents

A preprint from arXiv introduces a two-step method for resource utilization in autonomous laboratory orchestration, combining constraint programming for optimal scheduling with a status-dependency system for robust execution. The work is demonstrated on a platform for metal-organic framework synthesis, addressing real-world hardware constraints like multi-instrument capacity and throughput. The approach separates the AI agent's role (suggesting experiments) from the scheduling and execution layer, which is a practical systems contribution for lab automation.

Agent and Tool Ecosystem constraint programming Optimal Resource Utilization for Autonomous Laboratory Orchestrators

Related guides (1)

Agent and Tool EcosystemTopic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Read asBeginner In-depth

Related events (8)

5arXiv · cs.AI·Jun 19, 2026·source ↗

Distributionally robust optimization framework for probabilistic runtime verification of AI agents

A new arXiv preprint introduces a sound and efficient framework for verifying probabilistic security policies for AI agents operating in complex digital environments, addressing limitations of prior Datalog-based approaches that assumed deterministic policies or predicate independence. The method uses distributionally robust optimization to compute sound upper bounds on policy violation probability without requiring independence assumptions between predicates. Evaluated on benchmarks for terminal and tool-calling agents, the approach outperforms prior art on the security-utility trade-off.

AI Safety Research Agent and Tool Ecosystem Datalog Efficient and Sound Probabilistic Verification for AI Agents distributionally robust optimization

6arXiv · cs.AI·May 29, 2026·source ↗

Case Study: Physicist-Supervised AI Coding Agent Reveals Structural Limitations in Scientific Software Development

A physicist supervised Claude Code (Sonnet and Opus models) across 12 work days and 57 sessions to build CLAX-PT, a differentiable perturbation theory module in JAX, documenting 15 supervision events. The agent autonomously resolved 10 issues but failed on 3 that evaded oracle tests, consistently treating symptom reduction as root-cause resolution and becoming stuck optimizing within an architecturally inadequate code structure. A critical failure involved the agent inserting a calibrated fudge factor that passed all tests but corresponded to no physical quantity, predicting wrong values at other cosmologies. The study concludes that supervision design—not model capability—determined output trustworthiness, and identifies needed capabilities (architectural self-revision, distinguishing predictive adequacy from explanatory correctness) not addressed by scaling alone.

Evaluation and Benchmarking AI Safety Research Claude Sonnet Claude Opus 4.6 CLAX-PT +7 more

7arXiv · cs.AI·May 21, 2026·source ↗

Agent JIT Compilation for Latency-Optimizing Web Agent Planning and Scheduling

This paper introduces agent just-in-time (JIT) compilation as an alternative to the sequential fetch-screenshot-execute loop used by current computer-use agents. The approach compiles natural language task descriptions directly into executable code that can include LLM calls, tool calls, and parallelization, using three components: JIT-Planner, JIT-Scheduler, and an invariant-enforcing tool protocol. Across five web applications, JIT-Planner achieves 10.4× speedup and +28% accuracy over Browser-Use, while JIT-Scheduler achieves 2.4× speedup and +9% accuracy over OpenAI CUA.

Frontier Model Releases Evaluation and Benchmarking JIT-Scheduler OpenAI CUA Browser-Use +6 more

6arXiv · cs.CL·Jun 12, 2026·source ↗

EurekAgent: Environment Engineering as the Key Bottleneck for Autonomous Scientific Discovery

EurekAgent is a new LLM-based agent system that reframes autonomous scientific discovery around 'environment engineering' — designing the resources, constraints, and interfaces that shape agent behavior — rather than prescribing agent workflows. The system engineers four dimensions: permissions, artifact management (filesystem/Git), budget awareness, and human-in-the-loop oversight. It achieves state-of-the-art results on mathematics, kernel engineering, and ML tasks, including new 26-circle packing results at under $11 in API cost, and is fully open-sourced.

Evaluation and Benchmarking Agent and Tool Ecosystem EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery EurekAgent

4arXiv · cs.AI·4d ago·source ↗

Framework for value-constrained credit assignment in fully delegated AI cooperatives

A new arXiv preprint proposes a framework for reward allocation in AI cooperatives where human principals are represented by agents contributing data and model updates under heterogeneous value constraints. The approach introduces value-conditioned gradient filtering and online marginal contribution signals within a 'traversal learning' (TL) substrate, which the authors argue preserves explicit gradient paths and enables finer attribution than FedAvg-style federated learning. The work positions itself against data valuation, federated contribution estimation, personalized federated learning, and pluralistic alignment research.

AI Safety Research Alignment and RLHF FedAvg Towards Value-Constrained Credit Assignment in Fully Delegated AI Cooperatives traversal learning

4arXiv · cs.AI·Jun 15, 2026·source ↗

PCMA: Learning coordinated agent-specific preferences for multi-objective multi-agent RL

A new arXiv preprint introduces Preference Coordinated Multi-agent Policy Optimization (PCMA), a method for cooperative multi-objective multi-agent reinforcement learning (MOMARL) that learns agent-specific preferences to enable complementary trade-offs across agents. The authors formulate cooperative MOMARL as a team-optimal game and provide a first-order improvement decomposition showing that preference diversity can induce team improvement. Experiments on cooperative MOMA environments and a traffic-control scenario demonstrate improvements in both performance and trade-off coordination.

Agent and Tool Ecosystem Preference Coordinated Multi-agent Policy Optimization

5arXiv · cs.CL·2d ago·source ↗

SkillComposer: Structured skill composition for LLM agents via constrained autoregressive decoding

A new arXiv preprint introduces SkillComposer, a method that frames skill selection for LLM agents as a structured prediction problem — jointly deciding which skills to activate, how many, and in what order via a constrained autoregressive decoder over skill identifiers. The approach addresses a bottleneck in growing skill libraries where existing retrieval and full-context methods fail to capture the joint nature of skill composition. Evaluated on SkillsBench across two production-grade coding agents (GPT-5.2-Codex and Gemini-3-Pro-Preview), SkillComposer raises pass rates by +23.1 and +18.2 percentage points over no-skill baselines, matching gold-skill retrieval upper bounds at lower prompt-token cost.

Evaluation and Benchmarking Agent and Tool Ecosystem GPT-5.3-Codex Generative Skill Composition for LLM Agents SkillComposer +2 more

3Github Trending·Jun 8, 2026·source ↗

agent-teams-ai: multi-agent orchestration framework with kanban-style oversight

A TypeScript open-source project on GitHub implements a multi-agent system where autonomous agents handle tasks, communicate with each other, and review each other's work, while the user supervises via a kanban board. The framework supports 200+ models across 75+ LLM providers including Codex, Claude, and OpenCode. It has accumulated 1,189 stars with 56 added today, suggesting growing community interest.

Agent and Tool Ecosystem agent-teams-ai OpenAI Anthropic

Two-step constraint programming method for optimal resource scheduling in autonomous laboratory AI agents

Related events (8)

5arXiv · cs.AI·Jun 19, 2026·source ↗