paper
Arithmetic Pedagogy for Language Models
paper · active · provisional
arithmetic-pedagogy-for-language-models-e0c17f7f · 1 event · first seen 13d ago
Aliases: Arithmetic Pedagogy for Language Models
More like this (12)
Reasoning Language Models
Language Modeling Loss
Transformer Language Models
Civil Court Simulation with Large Language Models
Reinforcement Learning for Language Models
7B language model
Language Model Finetuning
generative language modeling
AnyLanguageModel
Language Models Compare Quantities Using Number-specific and Unit-specific Heuristics
The Value Axis: Language Models Encode Whether They're on the Right Track
encoder-only language models
Recent events (1)
GASING pedagogy-guided CoT training enables strong arithmetic reasoning in 86M-parameter GPT-2 model
Researchers train a small 86M-parameter GPT-2 decoder from scratch using Chain-of-Thought supervision derived from GASING, an Indonesian left-to-right arithmetic pedagogy, without any reinforcement learning. The model achieves over 80% accuracy on held-out arithmetic problems and competes with substantially larger models. Mechanistic analyses reveal two emergent capabilities: an explicit procedural pathway and a subsequent associative 'mental arithmetic' capacity that bypasses step-by-step computation. The work suggests that pedagogically structured training data can yield efficient arithmetic capability at small scale.
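To make the idea of left-to-right, step-by-step arithmetic supervision more concrete, the following is a minimal sketch of how a chain-of-thought trace for addition could be generated by working from the most significant digit down. The summary above does not specify GASING's exact procedure or the trace format used for training, so the function name `left_to_right_addition_trace`, the step wording, and the running-total layout are all illustrative assumptions, not the researchers' actual data pipeline.

```python
from __future__ import annotations


def left_to_right_addition_trace(a: int, b: int) -> tuple[list[str], int]:
    """Add two non-negative integers from the highest place value down.

    Hypothetical illustration only: the step format is an assumption,
    not the trace format used in the work summarized above.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")

    # Pad both operands to the same number of digits.
    width = max(len(str(a)), len(str(b)))
    digits_a = str(a).zfill(width)
    digits_b = str(b).zfill(width)

    steps: list[str] = []
    running_total = 0
    for i, (x, y) in enumerate(zip(digits_a, digits_b)):
        # Place value of the current column: hundreds, tens, ones, ...
        place = 10 ** (width - 1 - i)
        partial = (int(x) + int(y)) * place
        # Carries are absorbed by adding the partial sum to the running total.
        running_total += partial
        steps.append(
            f"{int(x) * place} + {int(y) * place} = {partial}; "
            f"running total {running_total}"
        )
    return steps, running_total


if __name__ == "__main__":
    trace, total = left_to_right_addition_trace(478, 256)
    for line in trace:
        print(line)
    print("answer:", total)
```

For 478 + 256 this sketch emits three steps (hundreds, tens, ones) and ends with a running total of 734. A trace of this kind could serve as the supervised target text for a small decoder, with the final answer appended after the steps.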