RedAct framework protects procedural skills in agent execution traces via selective redaction and watermarking
Researchers introduce RedAct, a framework for releasing agent execution traces without exposing proprietary procedural skills (tool invocations, decision logic, error-recovery strategies). The system localizes sensitive information, rewrites traces while preserving audit-critical evidence, and embeds behavioral watermarks for provenance tracking. To evaluate the approach, the authors construct CapTraceBench, a benchmark of 75 long-horizon tasks and 154 skills across seven domains. RedAct reduces normalized skill transfer from 44.7–67.1% on raw traces to below the no-skill baseline, while watermark detection achieves 93.6–100% true positive rate with under 2% false alarms.
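For readers unfamiliar with the detection metrics quoted above, the following is a minimal sketch of how a true positive rate and a false alarm rate are conventionally computed from a detector's decisions. It is not RedAct's evaluation code: the function name `detection_rates` and the toy data are hypothetical, chosen only to illustrate the standard definitions.

```python
from typing import Sequence


def detection_rates(
    is_watermarked: Sequence[bool], flagged: Sequence[bool]
) -> tuple[float, float]:
    """Return (true_positive_rate, false_alarm_rate) for a binary detector.

    is_watermarked: ground truth for each trace (hypothetical labels).
    flagged: the detector's decision for each trace.
    """
    if len(is_watermarked) != len(flagged):
        raise ValueError("inputs must have the same length")

    positives = sum(1 for y in is_watermarked if y)
    negatives = len(is_watermarked) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one watermarked and one clean trace")

    # Watermarked traces that the detector correctly flagged.
    true_pos = sum(1 for y, p in zip(is_watermarked, flagged) if y and p)
    # Clean traces that the detector wrongly flagged (false alarms).
    false_pos = sum(1 for y, p in zip(is_watermarked, flagged) if not y and p)

    return true_pos / positives, false_pos / negatives


# Toy data invented for illustration: 4 watermarked and 4 clean traces.
truth = [True, True, True, True, False, False, False, False]
decisions = [True, True, True, False, False, False, True, False]

tpr, far = detection_rates(truth, decisions)
print(f"TPR={tpr:.2%}  false alarm rate={far:.2%}")
```

Under these definitions, the reported result corresponds to flagging 93.6–100% of watermarked traces while flagging fewer than 2% of unwatermarked ones.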
Related events (8)
Reversa: A Multi-Agent Framework for Reverse Engineering Legacy Software into AI-Readable Operational Specifications
Reversa is a multi-agent pipeline framework that converts legacy software systems into traceable operational specifications suitable for use by AI coding agents. The framework employs specialized agents for surface mapping, module analysis, implicit rule extraction, architecture synthesis, and specification review, with mechanisms for traceability, confidence marking, and gap preservation. An exploratory case study on migrating an ATM system from COBOL to Go produced 517 confidence-indexed claims, 53 Gherkin parity scenarios, and a partial reconstruction plan, though final validation was not completed. The system is distributed as a Node.js CLI and is positioned relative to literature on reverse engineering, LLM-based documentation, and software agents.
ProAct: Proactive Agent Architecture Using Idle-Time Compute to Anticipate User Needs
ProAct is a proactive agent architecture that uses idle time between user interactions to predict upcoming needs, pre-fetch information, and resolve knowledge gaps before queries are issued. The system analyzes dialogue history and persistent memory to iteratively acquire relevant information in advance. Evaluated on the new ProActEval benchmark (200 scenarios, 40 domains), ProAct reduces required turns by 14.8%, user effort by 11.7%, and hallucination rates by 28.1% compared to reactive baselines. The work also achieves state-of-the-art reflective accuracy on MemBench.
DataCOPE: Unsupervised skill discovery framework for data-analytic agents
Researchers introduce DataCOPE, an unsupervised verifier-guided framework for discovering reusable procedural skills in data-analytic agents without labeled supervision or parameter updates. The system coordinates three components—a data-analytic agent, an unsupervised verifier, and a skill manager for contrastive skill distillation—with task-specific verifier instantiations for report-style and reasoning-style analysis. Evaluated on Deep Data Research and DABStep benchmarks, DataCOPE improves mean scores by 9.71% and 32.30% respectively across four model settings. The approach addresses a key bottleneck in agentic data analysis: acquiring reliable skill supervision at scale.
Deep Research System Card
OpenAI has published the system card for its Deep Research capability, detailing pre-release safety work including external red teaming and frontier risk evaluations conducted under the Preparedness Framework. The document outlines identified risk areas and the mitigations implemented before deployment. This is the formal safety disclosure accompanying the Deep Research product launch.
RePro: Retrospective Progress-Aware Self-Refinement for LLM Agent Training
Researchers introduce RePro (Retrospective Progress-Aware Training), a framework addressing the gap between step-wise RL optimization and metacognitive task-progress awareness in LLM agents. The approach uses a forward-then-reflect rollout paradigm where agents execute actions online and then retrospectively assess step-wise progress given the completed trajectory and known outcome. Evaluated on WebShop, ALFWorld, and Sokoban, RePro achieves up to 12% absolute success rate gains over baseline Qwen-family models without requiring continuous external supervision.
agent-skills: Secure Validated Skill Registry for AI Coding Agents
agent-skills is a TypeScript-based open-source skill registry that extends AI coding agents, including Claude Code, Cursor, GitHub Copilot, and Antigravity, with validated, reusable capabilities. The project provides a structured way to add skills across multiple coding agent platforms, with a focus on security and validation. It is gaining traction, with 3,767 total stars and 225 stars added today.
Red-Teaming Large Language Models
This Hugging Face blog post introduces red-teaming as a safety evaluation methodology for large language models, explaining how adversarial testing can surface harmful outputs, biases, and failure modes before deployment. It covers techniques for systematically probing LLMs to elicit problematic behaviors and discusses the role of red-teaming in responsible AI development. The post serves as an educational overview aimed at practitioners working on LLM safety.
VeriTrace: Cognitive-Graph Framework with Explicit Regulatory Loops for Deep Research Agents
VeriTrace introduces a cognitive-graph framework for deep research agents that replaces implicit LLM reasoning over intermediate representations with three explicit regulatory loops: interpretive update, deviation feedback, and schema revision. The system addresses contamination and error propagation in evolving mental models during complex multi-step research tasks. Using Qwen3.5-27B backbones, VeriTrace improves over the strongest matched baseline by 4.22 pp on DeepResearch Bench Insight and by 5.9 pp in Overall win rate on DeepConsult. With Config-DeepSeek, it achieves the strongest reproducible open-source result on DeepResearch Bench.


