GASING pedagogy-guided CoT training enables strong arithmetic reasoning in 86M-parameter GPT-2 model
Researchers train a small 86M-parameter GPT-2 decoder from scratch using Chain-of-Thought supervision derived from GASING, an Indonesian left-to-right arithmetic pedagogy, without any reinforcement learning. The model achieves over 80% accuracy on held-out arithmetic problems and competes with substantially larger models. Mechanistic analyses reveal two emergent capabilities: an explicit procedural pathway and a subsequent associative 'mental arithmetic' capacity that bypasses step-by-step computation. The work suggests that pedagogically structured training data can yield efficient arithmetic capability at small scale.
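To make "left-to-right arithmetic" concrete, here is a minimal sketch of how a step-by-step addition trace could be generated, working from the highest place value down. It is an illustration only: the function name, the trace wording, and the place-value scheme are assumptions made for this example, and the summary above does not describe the actual GASING procedure or the paper's Chain-of-Thought format.

```python
from typing import List, Tuple


def left_to_right_addition_trace(a: int, b: int) -> Tuple[List[str], int]:
    """Add two non-negative integers place by place, highest place first.

    Hypothetical illustration of a left-to-right trace; this is NOT the
    paper's GASING-derived format, which the summary does not specify.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")

    # Pad both numbers to the same width so digits line up by place value.
    width = max(len(str(a)), len(str(b)))
    digits_a = str(a).zfill(width)
    digits_b = str(b).zfill(width)

    steps: List[str] = []
    running = 0
    for i, (da, db) in enumerate(zip(digits_a, digits_b)):
        place = 10 ** (width - 1 - i)  # e.g. 100, then 10, then 1
        term_a = int(da) * place
        term_b = int(db) * place
        partial = term_a + term_b
        new_running = running + partial
        steps.append(
            f"{term_a} + {term_b} = {partial}; "
            f"running total {running} + {partial} = {new_running}"
        )
        running = new_running
    return steps, running


if __name__ == "__main__":
    trace, total = left_to_right_addition_trace(47, 38)
    for line in trace:
        print(line)
    print("answer:", total)
```

Each trace line is one explicit intermediate step of the kind a model can be supervised on. A model that emits only the final answer would skip these lines entirely, which is the contrast the summary draws between the procedural pathway and the later "mental arithmetic" capacity.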
Related events (8)
OpenAI Trains System Solving Grade School Math Problems at ~55% Accuracy
OpenAI released a system for solving grade school math word problems that achieves roughly twice the accuracy of a fine-tuned GPT-3 model. The system scored 55% on a sample test on which 9-to-12-year-olds scored 60%, suggesting near-human performance on elementary math. This work represents an early milestone in neural network mathematical reasoning capabilities.
Advancing science and math with GPT-5.2
OpenAI has released GPT-5.2, described as its strongest model for mathematics and science, achieving state-of-the-art results on GPQA Diamond and FrontierMath benchmarks. The announcement highlights practical research applications including solving an open theoretical problem and generating verified mathematical proofs. The post positions GPT-5.2 as a meaningful step toward AI-assisted scientific discovery.
Improving Mathematical Reasoning with Process Supervision
OpenAI trained a model achieving state-of-the-art mathematical problem solving by rewarding each correct reasoning step (process supervision) rather than only the final answer (outcome supervision). This approach improves performance on math benchmarks and carries an alignment benefit by training models to produce human-endorsed chain-of-thought reasoning. The work highlights a potential synergy between capability improvements and alignment techniques.
Reasoning models struggle to control their chains of thought, and that's good
OpenAI introduces CoT-Control, a framework for evaluating how well reasoning models can deliberately manipulate or suppress their chain-of-thought outputs. The finding that models struggle to control their CoT is framed as a positive safety property, reinforcing the argument that visible reasoning traces serve as a meaningful monitorability safeguard. This contributes to ongoing research on whether chain-of-thought transparency is a reliable alignment and oversight tool.
DreamReasoner-8B: Block-size curriculum learning enables long-CoT reasoning in diffusion language models
Researchers introduce DreamReasoner-8B, an open-source block diffusion language model trained with a block-size curriculum learning strategy that gradually transitions from fine-grained to coarse-grained block sizes during training. The work identifies a critical failure mode: training with large block sizes severely degrades reasoning, while small block sizes preserve it. The proposed curriculum bridges this gap, achieving math and code reasoning performance competitive with Qwen3-8B while retaining the parallel decoding efficiency of block diffusion models. The model and code are publicly released.
OpenAI GPT-next Solves 80-Year-Old Erdős Planar Unit Distance Problem for Under $1000
A Latent Space AINews digest reports that OpenAI's GPT-next model disproved the Erdős planar unit distance conjecture, an 80-year-old open problem in combinatorial geometry, at a compute cost under $1000. The digest characterizes the day as quiet overall but highlights this result as a notable piece of AI-assisted mathematics and a meaningful capability demonstration at the intersection of AI and formal mathematics.
New ways to learn math and science in ChatGPT
OpenAI is adding interactive visual explanations for math and science topics to ChatGPT, allowing users to explore formulas and variables in real time. The feature targets students and learners, representing an expansion of ChatGPT's educational capabilities. This is a product-layer enhancement rather than a new model or core capability release.
GPT-5 and the future of mathematical discovery
UCLA Professor Ernest Ryu collaborated with GPT-5 to solve an open problem in optimization theory, representing a concrete example of AI-assisted mathematical research. The announcement highlights GPT-5's capability in formal reasoning and scientific discovery beyond standard benchmarks. This is an OpenAI blog post showcasing a real-world research outcome involving a frontier model.

