dataset
Orca-Math
dataset · active · provisional
orca-math-a882a766 · 1 event · first seen 26h ago
Aliases: Orca-Math
Recent events (1)
Semi-supervised framework scales LLM reasoning with minimal labeled data via lightweight verifier
A new arXiv preprint proposes a semi-supervised framework for training LLMs to reason with very few labeled examples, using a lightweight classifier to judge the validity of intermediate reasoning traces. An entropy-based confidence threshold filters unreliable pseudo-labels before fine-tuning. Experiments on math reasoning (Orca-Math subset) and visual QA (GQA) show accuracy comparable to that obtained with 10-15x more labeled data. The approach reduces dependence on expensive answer-level supervision by turning verification into a data-creation mechanism.
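As a rough illustration of the entropy-based filtering step described above, the following minimal sketch keeps only pseudo-labels for which a verifier's output distribution has low Shannon entropy. It is not the preprint's implementation: the function names, the two-class `[p_invalid, p_valid]` layout, and the `max_entropy` value of 0.3 nats are all assumptions chosen for the example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def filter_pseudo_labels(candidates, max_entropy=0.3):
    """Keep candidates whose verifier distribution is confident enough.

    candidates: list of (trace, probs) pairs, where probs is the
    hypothetical verifier output [p_invalid, p_valid].
    max_entropy: assumed cutoff in nats; lower means stricter filtering.
    Returns a list of (trace, label) pairs, with label = argmax of probs.
    """
    kept = []
    for trace, probs in candidates:
        if entropy(probs) <= max_entropy:
            label = max(range(len(probs)), key=probs.__getitem__)
            kept.append((trace, label))
    return kept


# Illustrative, made-up verifier outputs for three reasoning traces.
candidates = [
    ("trace-a", [0.05, 0.95]),  # confident: valid
    ("trace-b", [0.50, 0.50]),  # maximally uncertain, filtered out
    ("trace-c", [0.97, 0.03]),  # confident: invalid
]
pseudo_labeled = filter_pseudo_labels(candidates)
```

In this sketch a uniform distribution has entropy ln 2 (about 0.693 nats) and is dropped, while sharply peaked distributions pass the cutoff. Whether confidently invalid traces are kept as negative examples or discarded is a design choice the summary does not specify.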