Almanac
dataset

SERA

dataset · active · provisional · sera-8918c259 · 1 event · first seen 41h ago

Aliases: SERA

Recent events (1)

6 · arXiv · cs.AI · 41h ago

OpenThoughts-Agent: Open data curation pipeline for broadly capable agentic models

The OpenThoughts-Agent (OT-Agent) project releases a fully open data curation pipeline for training agentic language models, addressing the gap left by prior efforts (SWE-Smith, SERA, Nemotron-Terminal) that each target a single benchmark. The team runs over 100 controlled ablation experiments and assembles a 100K-example training set. Fine-tuning Qwen3-32B on this set yields 44.8% average accuracy across seven agentic benchmarks, a 3.9-percentage-point improvement over the strongest existing open agentic model (Nemotron-Terminal-32B at 40.9%). The training data, pipeline, experimental data, and models are publicly released at openthoughts.ai.