DABStep
dabstep-ec52d9a5·2 events·first seen 28d agoAliases: DABStep
Co-occurring entities
More like this (12)
Recent events (2)
DABStep: Data Agent Benchmark for Multi-step Reasoning
Hugging Face introduces DABStep, a benchmark designed to evaluate data agents on multi-step reasoning tasks. The benchmark targets agentic systems that must perform complex, sequential data operations rather than single-step queries. It aims to fill a gap in evaluation tooling for realistic data analysis workflows involving tool use and chained reasoning.
DataCOPE: Unsupervised skill discovery framework for data-analytic agents
Researchers introduce DataCOPE, an unsupervised verifier-guided framework for discovering reusable procedural skills in data-analytic agents without labeled supervision or parameter updates. The system coordinates three components—a data-analytic agent, an unsupervised verifier, and a skill manager for contrastive skill distillation—with task-specific verifier instantiations for report-style and reasoning-style analysis. Evaluated on Deep Data Research and DABStep benchmarks, DataCOPE improves mean scores by 9.71% and 32.30% respectively across four model settings. The approach addresses a key bottleneck in agentic data analysis: acquiring reliable skill supervision at scale.