Arithmetic Pedagogy for Language Models

paper · active · provisional · arithmetic-pedagogy-for-language-models-e0c17f7f · 1 event · first seen 13d ago

Recent events (1)

5 · arXiv · cs.AI · 13d ago

GASING pedagogy-guided CoT training enables strong arithmetic reasoning in 86M-parameter GPT-2 model

Researchers train a small 86M-parameter GPT-2 decoder from scratch using chain-of-thought (CoT) supervision derived from GASING, an Indonesian left-to-right arithmetic pedagogy, without any reinforcement learning. The model achieves over 80% accuracy on held-out arithmetic problems and is competitive with substantially larger models. Mechanistic analyses reveal two emergent capabilities: an explicit procedural pathway that works through the steps, and an associative "mental arithmetic" capacity that emerges later and bypasses step-by-step computation. The work suggests that pedagogically structured training data can yield efficient arithmetic capability at small scale.
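
To make "left-to-right" concrete, here is a minimal sketch of how a step-by-step addition trace could be generated by working from the most significant digit down, rather than right to left with carries. The function name, the trace wording, and the restriction to addition of non-negative integers are assumptions made for illustration only; the summary does not specify the actual GASING procedure or the CoT format used in training.

```python
from __future__ import annotations


def left_to_right_addition_trace(a: int, b: int) -> tuple[int, list[str]]:
    """Add two non-negative integers place by place, most significant first.

    Hypothetical illustration only: this is not the paper's trace format.
    Returns the final sum and a list of human-readable steps.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")

    # Pad both numbers to the same width so digits line up by place value.
    width = max(len(str(a)), len(str(b)))
    a_digits = str(a).zfill(width)
    b_digits = str(b).zfill(width)

    running = 0
    steps: list[str] = []
    for i, (da, db) in enumerate(zip(a_digits, b_digits)):
        place = 10 ** (width - 1 - i)  # e.g. 100, then 10, then 1
        partial = (int(da) + int(db)) * place
        running += partial  # any carry is absorbed into the running total
        steps.append(
            f"{int(da) * place} + {int(db) * place} = {partial}; "
            f"running total {running}"
        )
    return running, steps


if __name__ == "__main__":
    total, steps = left_to_right_addition_trace(478, 256)
    for line in steps:
        print(line)
    print("answer:", total)
```

Each step in such a trace adds one place value to a running total, so the intermediate values stay close to the final answer throughout. That is one plausible reason a left-to-right scheme could suit a model that emits tokens left to right, though the summary itself does not make this claim.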