Data2Story: Multi-agent framework for end-to-end data journalism with verifiable claims
Researchers introduce Data2Story (Data Journalist Agent), a multi-agent framework that orchestrates specialized roles to transform raw data into multimodal news articles. A key innovation is an Inspector module that grounds every claim back to data, code, or external references, enabling verifiability. The system also generates interactive multimodal outputs (maps, audio) rather than static text and charts. Evaluation across 18 articles with 53 human participants shows competitive quality versus expert-written pieces, with particular strength in transparency and auditability, though human journalists retain an edge in editorial angle and creative design.
Related events (8)
Data Intelligence Agents (DIA): Autonomous coding agents for enterprise data integration and SQL generation
Researchers present Data Intelligence Agents (DIA), a production-deployed system of three autonomous coding agents (Data Interpreter, Schema Creator, Query Generator) that automate enterprise data integration workflows. Rather than generating text, the agents produce, execute, validate, and repair concrete artifacts (code, schemas, SQL), using shared memory to reuse experience. The Query Generator is evaluated on seven SQL benchmarks spanning four dialects and task categories, matching or surpassing the best published results on all seven. The system is deployed in production for enterprise customers, making it a notable applied research contribution.
DataCOPE: Unsupervised skill discovery framework for data-analytic agents
Researchers introduce DataCOPE, an unsupervised verifier-guided framework for discovering reusable procedural skills in data-analytic agents without labeled supervision or parameter updates. The system coordinates three components—a data-analytic agent, an unsupervised verifier, and a skill manager for contrastive skill distillation—with task-specific verifier instantiations for report-style and reasoning-style analysis. Evaluated on Deep Data Research and DABStep benchmarks, DataCOPE improves mean scores by 9.71% and 32.30% respectively across four model settings. The approach addresses a key bottleneck in agentic data analysis: acquiring reliable skill supervision at scale.
AINews: Agents for Everything Else — Codex for Knowledge Work, Claude for Creative Work
A Latent Space daily AI news digest reflecting on how coding agents are expanding beyond software development into knowledge work and creative work. The piece uses OpenAI Codex and Anthropic Claude as anchoring examples of agents 'breaking containment' from their original coding and assistant niches. Published as commentary on a quieter news day, it surveys the broadening agent ecosystem.
Study finds AI disclosure designs in newsrooms fail readers, proposes user-agency-centered alternatives
An arXiv paper examines how newsrooms disclose AI involvement in news content, finding that neither brief labels nor detailed disclosures succeed in building reader trust. A controlled experiment with 34 readers shows that detailed disclosures trigger a 'transparency dilemma' that can reduce trust, while one-line labels create an information gap that takes cognitive effort to fill. Readers instead preferred disclosure designs centered on user agency, including detail-on-demand interactions, proportional AI-ratio visualizations, and explicit 'no AI' labels. The author frames this as a design problem for the HCI community rather than a journalism ethics problem alone.
Datasette Agent
Simon Willison describes a Datasette Agent, an AI agent built on top of the Datasette data exploration tool. The post appears to demonstrate an agent capable of querying and reasoning over SQLite databases via natural language. This represents a practical deployment of LLM-powered tooling for data analysis workflows.
DABStep: Data Agent Benchmark for Multi-step Reasoning
Hugging Face introduces DABStep, a benchmark designed to evaluate data agents on multi-step reasoning tasks. The benchmark targets agentic systems that must perform complex, sequential data operations rather than single-step queries. It aims to fill a gap in evaluation tooling for realistic data analysis workflows involving tool use and chained reasoning.
CodeAgents + Structure: A Better Way to Execute Actions
Hugging Face published a blog post exploring the combination of code-based agents with structured outputs to improve action execution reliability. The post examines how enforcing structured generation can reduce errors and improve the robustness of agentic code execution pipelines. This represents a practical engineering approach to making code agents more dependable in production settings.
Agents-K1: End-to-end knowledge orchestration pipeline for agent-native scientific knowledge graphs
Agents-K1 is a new pipeline that converts raw scientific documents into structured knowledge graphs for use by LLM-based research agents, addressing the gap where existing systems reduce papers to abstracts and flat citation edges. The system integrates a multimodal parser, a 4B information-extraction model trained with GRPO, and a tri-source agent interface combining web search, graph retrieval, and cross-document traversal. The authors process 2.46 million scientific papers to produce Scholar-KG, releasing a one-million-paper subset. Experiments show improvements in scientific information extraction, knowledge graph construction, and multi-hop reasoning.

