Entity · benchmark

OOLONG-PAIRS

benchmark · active · oolong-pairs-231f3cfc · 1 event · first seen Jun 2, 2026

Aliases: OOLONG-PAIRS

Co-occurring entities

MIT · Tim Kraska · BrowseComp · Omar Khattab · Qwen3-4B · Qwen3-Coder-480B-A35B-Instruct · CodeAct · Recursive Language Models (RLMs) · Alex L. Zhang · GPT-5.5

More like this (12)

Oolong · Oolong-Synthetic · OWL · Pair-In, Pair-Out (PIPO) · OLMo2 · OLMo · OAT · Pair M-dist · Pair Opt-dist · omlx · Loong · ALOHA

Recent events (1)

The Batch · Jun 2, 2026

Recursive Language Models Offer Path To Dramatically Expand Beyond the Context Window

MIT researchers Alex L. Zhang, Tim Kraska, and Omar Khattab propose Recursive Language Models (RLMs), a framework that offloads long-context processing to an external Python REPL environment, so that the model fetches and manages text chunks programmatically through generated code. The root model spawns submodel instances to handle subtasks and recursively aggregates their outputs. Evaluated on benchmarks that require reasoning over documents of up to 11 million tokens, RLMs substantially outperform both base models and competing agentic strategies such as retrieval and summarization agents. For example, RLM-GPT-5 achieved 91.3% on BrowseComp+, where GPT-5 alone could not produce an answer, and roughly 50% accuracy on OOLONG-PAIRS at 1 million tokens, versus near-zero for baseline approaches.
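The recursive pattern described in this summary can be illustrated with a minimal sketch. This is not the authors' implementation: in the framework as summarized, the root model writes its own code in a REPL, whereas here the decomposition is hard-coded. The names `recursive_answer`, `split_lines`, `toy_model`, and the `max_lines` window are all hypothetical, and `toy_model` is a plain keyword filter standing in for a real model call.

```python
from typing import Callable, List

# Hypothetical stand-in for a model call: (query, context) -> answer text.
ModelFn = Callable[[str, str], str]


def split_lines(lines: List[str], max_lines: int) -> List[List[str]]:
    """Cut a list of lines into consecutive chunks of at most max_lines."""
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


def recursive_answer(query: str, lines: List[str], model: ModelFn,
                     max_lines: int = 4, depth: int = 0,
                     max_depth: int = 8) -> str:
    """Answer `query` over `lines`, recursing when the context is too long."""
    # Base case: the context fits the toy window, or the depth budget is
    # spent (in which case the context is truncated to the window).
    if len(lines) <= max_lines or depth >= max_depth:
        return model(query, "\n".join(lines[:max_lines]))

    # Recursive case: hand each chunk to a sub-call, as a root model might
    # hand subtasks to submodel instances.
    partials = [
        recursive_answer(query, chunk, model, max_lines, depth + 1, max_depth)
        for chunk in split_lines(lines, max_lines)
    ]

    # Aggregate: the partial answers become the context of a new call,
    # which may itself recurse if the merged text is still too long.
    merged = [ln for part in partials for ln in part.splitlines() if ln]
    return recursive_answer(query, merged, model, max_lines,
                            depth + 1, max_depth)


def toy_model(query: str, context: str) -> str:
    """Assumed placeholder for an LLM: keep the lines that mention the query."""
    return "\n".join(ln for ln in context.splitlines() if query in ln)


# Usage: 20 records, far more than the 4-line toy window.
records = [
    f"record {i}: label=" + ("spam" if i % 5 == 0 else "ham")
    for i in range(20)
]
result = recursive_answer("spam", records, toy_model, max_lines=4)
print(result)
```

The depth limit exists only to guarantee termination when the merged partial answers do not shrink; a real system would need a more careful policy than truncation.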
