technique

Randomized YaRN

technique · active · provisional · randomized-yarn-6c1a7cd5 · 1 event · first seen 2d ago

Aliases: Randomized YaRN

Co-occurring entities

BABILong, Multi-Round Coreference Resolution, YaRN

More like this (12)

YaRN Automatic Domain Randomization Random Coding WY Algorithm dynamics randomization randomized collage RAG randomized coordinate descent ComoRAG fastRAG QRData MedRLM

Recent events (1)

5 · arXiv · cs.CL · 2d ago

Randomized YaRN improves LLM length generalization for long-context reasoning

Researchers propose Randomized YaRN, a training method that combines YaRN-based positional extrapolation with randomized positional encodings and a length curriculum to improve LLM generalization to long contexts. Models trained on sequences shorter than 8K tokens show consistent reasoning improvements at context lengths from 16K to 128K on the BABILong and MRCR (Multi-Round Coreference Resolution) benchmarks. The key insight is that exposing a model to out-of-distribution positional representations during short-context training helps it generalize to far longer contexts at inference time.
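
The summary does not say how positions are randomized, so the following is only a rough sketch of one common form of randomized positional encoding: for a short training sequence, draw an ordered subset of position ids from a much larger range, so that the model sees large position values without seeing long inputs. The function name `sample_random_positions`, the lengths used, and the uniform sampling scheme are all assumptions made for illustration; they are not taken from the paper, and the YaRN scaling and length curriculum are not shown.

```python
import random


def sample_random_positions(seq_len, max_position, rng):
    """Draw seq_len distinct position ids from [0, max_position), in ascending order.

    Hypothetical helper: uniform sampling without replacement is an assumption,
    not the scheme confirmed by the source.
    """
    if seq_len > max_position:
        raise ValueError("seq_len must not exceed max_position")
    # Sorting keeps the relative order of tokens intact while stretching the gaps.
    return sorted(rng.sample(range(max_position), seq_len))


rng = random.Random(0)   # fixed seed so the sketch is reproducible
train_len = 8            # hypothetical short training sequence length
target_len = 128         # hypothetical position range to cover
positions = sample_random_positions(train_len, target_len, rng)
print(positions)
```

In a setup like this, `positions` would replace the usual `0, 1, ..., train_len - 1` ids fed to the positional encoding during training, while inference would use ordinary consecutive ids.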

Long Context Evolution Evaluation and Benchmarking BABILong Multi-Round Coreference Resolution YaRN +1 more