Entity · dataset

Orca-Math

dataset · active · orca-math-a882a766 · 1 event · first seen Jun 16, 2026

Aliases: Orca-Math

Co-occurring entities

GQA · Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier

More like this (12)

ORCA-bench · stablyai/orca · NuminaMath · Minerva Math · Big-Math · MathVista · Axiom Math · DAPO-Math · OrpQuant / ORP · MATH-MCQA · DeepMath · AdvancedMathBench

Recent events (1)

5 · arXiv · cs.CL · Jun 16, 2026

Semi-supervised framework scales LLM reasoning with minimal labeled data via lightweight verifier

A new arXiv preprint proposes a semi-supervised framework for training LLMs to reason from very few labeled examples, using a lightweight classifier to judge the validity of intermediate reasoning traces. An entropy-based confidence threshold filters out unreliable pseudo-labels before fine-tuning. Experiments on math reasoning (a subset of Orca-Math) and visual QA (GQA) show accuracy comparable to training with 10-15x more labeled data. The approach reduces dependence on expensive answer-level supervision by turning verification into a data-creation mechanism.
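
The entropy-based filtering step can be illustrated with a minimal sketch. This is not the preprint's implementation: the function names, the two-class (valid, invalid) verifier output, and the 0.3-nat threshold are all assumptions chosen for illustration. The idea shown is only that a pseudo-label is kept when the verifier's distribution has low entropy and favors "valid".

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_pseudo_labels(candidates, max_entropy=0.3):
    """Return ids of traces the (hypothetical) verifier confidently accepts.

    candidates: list of (trace_id, (p_valid, p_invalid)) pairs.
    max_entropy: assumed confidence threshold; the preprint's value is unknown.
    """
    kept = []
    for trace_id, probs in candidates:
        p_valid, p_invalid = probs
        confident = entropy(probs) <= max_entropy  # low entropy = confident
        if confident and p_valid > p_invalid:      # and judged valid
            kept.append(trace_id)
    return kept

# Illustrative verifier outputs (made-up numbers, not results from the paper).
candidates = [
    ("trace-a", (0.95, 0.05)),  # confident and valid: kept
    ("trace-b", (0.60, 0.40)),  # uncertain: dropped
    ("trace-c", (0.05, 0.95)),  # confident but invalid: dropped
]
selected = filter_pseudo_labels(candidates)
```

Only the traces in `selected` would go on to fine-tuning in this sketch. A lower `max_entropy` trades pseudo-label volume for reliability.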

Evaluation and Benchmarking · Alignment and RLHF · GQA · Scaling LLM Reasoning from Minimal Labels: A Semi-Supervised Framework with a Lightweight Verifier · Orca-Math