5 · arXiv cs.CL (Computation and Language) · 10d ago

Data2Story: Multi-agent framework for end-to-end data journalism with verifiable claims

Researchers introduce Data2Story (Data Journalist Agent), a multi-agent framework that orchestrates specialized roles to transform raw data into multimodal news articles. A key innovation is an Inspector module that grounds every claim back to data, code, or external references, enabling verifiability. The system also generates interactive multimodal outputs (maps, audio) rather than static text and charts. Evaluation across 18 articles with 53 human participants shows competitive quality versus expert-written pieces, with particular strength in transparency and auditability, though human journalists retain an edge in editorial angle and creative design.

Agent and Tool Ecosystem · Multimodal Progress · Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories · Data Journalist Agent

Related guides (2)

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

6 · arXiv · cs.AI · 2d ago

Data Intelligence Agents (DIA): Autonomous coding agents for enterprise data integration and SQL generation

Researchers present Data Intelligence Agents (DIA), a production-deployed system of three autonomous coding agents (Data Interpreter, Schema Creator, Query Generator) that automate enterprise data integration workflows. Rather than generating text, the agents produce, execute, validate, and repair concrete artifacts (code, schemas, SQL) with shared memory for experience reuse. The Query Generator is evaluated across seven SQL benchmarks spanning four dialects and task categories, matching or surpassing best published results on all seven. The system is deployed in production for enterprise customers, making it a notable applied research contribution.

Evaluation and Benchmarking · Enterprise Deployment Patterns · Data Intelligence Agents: Interpreting, Modeling, and Querying Enterprise Data via Autonomous Coding Agents · Data Intelligence Agents

5 · arXiv · cs.CL · 15d ago

DataCOPE: Unsupervised skill discovery framework for data-analytic agents

Researchers introduce DataCOPE, an unsupervised verifier-guided framework for discovering reusable procedural skills in data-analytic agents without labeled supervision or parameter updates. The system coordinates three components—a data-analytic agent, an unsupervised verifier, and a skill manager for contrastive skill distillation—with task-specific verifier instantiations for report-style and reasoning-style analysis. Evaluated on Deep Data Research and DABStep benchmarks, DataCOPE improves mean scores by 9.71% and 32.30% respectively across four model settings. The approach addresses a key bottleneck in agentic data analysis: acquiring reliable skill supervision at scale.

Evaluation and Benchmarking · Agent and Tool Ecosystem · DABStep · Deep Research · DataCOPE

4 · Latent Space · 1mo ago

AINews: Agents for Everything Else — Codex for Knowledge Work, Claude for Creative Work

A Latent Space daily AI news digest reflects on the expansion of coding agents beyond software development into knowledge work and creative work. The piece uses OpenAI Codex and Anthropic Claude as anchoring examples of agents 'breaking containment' from their original coding and assistant niches. Published as commentary on a quieter news day, it surveys the broadening agent ecosystem.

Frontier Model Releases · Agent and Tool Ecosystem · Claude · OpenAI · OpenAI Codex

4 · arXiv · cs.AI · 10d ago

Study finds AI disclosure designs in newsrooms fail readers, proposes user-agency-centered alternatives

A paper from arXiv examines how newsrooms disclose AI involvement in news content, finding that neither brief labels nor detailed disclosures achieve the goal of building reader trust. A controlled experiment with 34 readers shows detailed disclosures trigger a 'transparency dilemma' that can reduce trust, while one-line labels create an information gap requiring cognitive effort to fill. Readers instead preferred disclosure designs centered on user agency, including detail-on-demand interactions, proportional AI-ratio visualizations, and explicit 'no AI' labels. The author frames this as a design problem for the HCI community rather than a journalism ethics problem alone.

AI Safety Research · Enterprise Deployment Patterns · Designed by Journalists, but Is It for Readers? Rethinking AI Disclosures and Transparency in News

4 · Simon Willison's Weblog · 1mo ago

Datasette Agent

Simon Willison describes a Datasette Agent, an AI agent built on top of the Datasette data exploration tool. The post appears to demonstrate an agent capable of querying and reasoning over SQLite databases via natural language. This represents a practical deployment of LLM-powered tooling for data analysis workflows.

Agent and Tool Ecosystem · SQLite · Simon Willison · Datasette

5 · Hugging Face Blog · 1mo ago

DABStep: Data Agent Benchmark for Multi-step Reasoning

Hugging Face introduces DABStep, a benchmark designed to evaluate data agents on multi-step reasoning tasks. The benchmark targets agentic systems that must perform complex, sequential data operations rather than single-step queries. It aims to fill a gap in evaluation tooling for realistic data analysis workflows involving tool use and chained reasoning.

Evaluation and Benchmarking · Agent and Tool Ecosystem · DABStep · Hugging Face

5 · Hugging Face Blog · 1mo ago

CodeAgents + Structure: A Better Way to Execute Actions

Hugging Face published a blog post exploring the combination of code-based agents with structured outputs to improve action execution reliability. The post examines how enforcing structured generation can reduce errors and improve the robustness of agentic code execution pipelines. This represents a practical engineering approach to making code agents more dependable in production settings.

Inference Economics · Agent and Tool Ecosystem · structured output generation · Hugging Face · CodeAgents

6 · arXiv · cs.AI · 8d ago

Agents-K1: End-to-end knowledge orchestration pipeline for agent-native scientific knowledge graphs

Agents-K1 is a new pipeline that converts raw scientific documents into structured knowledge graphs for use by LLM-based research agents, addressing the gap where existing systems reduce papers to abstracts and flat citation edges. The system integrates a multimodal parser, a 4B information-extraction model trained with GRPO, and a tri-source agent interface combining web search, graph retrieval, and cross-document traversal. The authors process 2.46 million scientific papers to produce Scholar-KG, releasing a one-million-paper subset. Experiments show improvements in scientific information extraction, knowledge graph construction, and multi-hop reasoning.

Evaluation and Benchmarking · Agent and Tool Ecosystem · GRPO · Agents-K1 · Scholar-KG