Dango: A Strictly L1-Only Large Language Model for Studying Second Language Acquisition

Recent events (1)

arXiv · cs.CL · 2d ago

Dango: A 1.8B LLM trained exclusively on Japanese to study L1-to-L2 language transfer

Researchers introduce Dango, a 1.8B-parameter decoder-only LLM pretrained strictly on Japanese (L1) and fine-tuned on LLM-generated English (L2) learning lessons to simulate second language acquisition. A key contribution is a filtering method to remove L2 contamination from ostensibly monolingual pretraining corpora. Evaluations show Dango produces human-like L2 error patterns, outperforming multilingual and unfiltered baselines. The model, data, and code are released for computational SLA research.