Entity · technique

test-time compute scaling

technique · active · test-time-compute-scaling-8579cf5a · 2 events · first seen May 21, 2026

Aliases: test-time compute scaling

Co-occurring entities

latent reasoning · Chain-of-Thought Reasoning · Reasoning in Memory (RiM) · curriculum learning · task-conditioned attractors · latent dynamical systems · Sudoku-Extreme · Equilibrium Reasoners (EqR)

More like this (12)

test-time compute · Test-time Compute Search · inference-time compute scaling · test-time training · inference-time compute · Test-Time Finetuning (TTFT) · idle-time compute · Rethinking Inference-Time Scaling in Local Computer-Use Agents: Failure Modes and Compute Tradeoffs · volunteer compute · temperature scaling · Test-Time Scaling for Small VLMs on Multilingual Visual MCQ · instruction tuning

Recent events (2)

6 · arXiv · cs.AI · May 29, 2026

Reasoning in Memory (RiM): Latent Reasoning via Working Memory Blocks in LLMs

RiM introduces a latent reasoning method that replaces autoregressive chain-of-thought token generation with fixed sequences of special 'memory block' tokens, allowing LLMs to perform internal computation without externalizing intermediate steps. These memory blocks are processed in a single forward pass rather than generated autoregressively, improving compute efficiency at test time. Training uses a two-stage curriculum: first grounding memory blocks by predicting explicit reasoning steps, then discarding step-level supervision and refining answers iteratively. Experiments across multiple model families and sizes show RiM matches or exceeds existing latent reasoning methods.

Evaluation and Benchmarking · Inference Economics · latent reasoning · Chain-of-Thought Reasoning · Reasoning in Memory (RiM) · +3 more
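The contrast the RiM summary draws, between generating reasoning tokens one at a time and processing a fixed block of memory tokens in one pass, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the token naming scheme (`<mem_b_i>`), the function names, and the simplification that cost is counted in sequential forward passes. It is not the paper's implementation.

```python
from typing import List


def build_latent_input(prompt_tokens: List[str], num_blocks: int, block_size: int) -> List[str]:
    """Append a fixed sequence of placeholder memory-block tokens to the prompt.

    The token names are hypothetical; the point is only that the sequence is
    known in advance and does not depend on any generated output.
    """
    memory = [f"<mem_{b}_{i}>" for b in range(num_blocks) for i in range(block_size)]
    return prompt_tokens + memory


def sequential_passes_cot(num_reasoning_tokens: int) -> int:
    """Autoregressive chain-of-thought: one forward pass per generated token."""
    return num_reasoning_tokens


def sequential_passes_memory(num_blocks: int, block_size: int) -> int:
    """Fixed memory tokens are all known up front, so one pass covers them."""
    return 1


tokens = build_latent_input(["What", "is", "2+2", "?"], num_blocks=2, block_size=3)
print(len(tokens))                     # 4 prompt tokens + 6 memory tokens
print(sequential_passes_cot(6))        # six sequential passes for six CoT tokens
print(sequential_passes_memory(2, 3))  # one pass for the same number of memory tokens
```

The sketch ignores the iterative answer refinement mentioned in the summary, which would add further passes.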

7 · arXiv · cs.LG · May 21, 2026

Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning

This paper introduces Equilibrium Reasoners (EqR), a framework that formalizes test-time compute scaling through learned task-conditioned attractors in latent space, where stable fixed points correspond to valid solutions. EqR scales along two axes, depth (more iterations) and breadth (aggregating stochastic trajectories), without requiring external verifiers or task-specific priors. On Sudoku-Extreme, unrolling up to 40,000 equivalent layers boosts accuracy from 2.6% (feedforward baseline) to over 99%. The work provides a mechanistic lens for understanding why iterative latent models generalize beyond memorized patterns.

Long Context Evolution · Evaluation and Benchmarking · task-conditioned attractors · latent dynamical systems · Sudoku-Extreme · +3 more
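The two scaling axes in the EqR summary, depth (more iterations toward a fixed point) and breadth (aggregating several stochastic trajectories), can be sketched with a toy example. This is not the EqR model: the update map below is hand-written rather than learned, the "task" is simply finding the square root of `x`, and the names `step`, `run_trajectory`, and `solve` are hypothetical. It only shows how an input-conditioned map with a stable fixed point turns extra iterations and extra trajectories into a better answer without an external verifier.

```python
import random
import statistics


def step(z: float, x: float) -> float:
    """One update of a toy task-conditioned map whose stable fixed point is sqrt(x)."""
    return 0.5 * (z + x / z)


def run_trajectory(x: float, depth: int, rng: random.Random, noise: float = 0.01) -> float:
    """Iterate the map `depth` times from a random start, with small noise per step."""
    z = rng.uniform(0.5, 10.0)  # random positive initial state
    for _ in range(depth):
        z = step(z, x)
        # Stochastic perturbation; abs() and the epsilon keep the state positive.
        z = abs(z + rng.gauss(0.0, noise)) + 1e-9
    return z


def solve(x: float, depth: int, breadth: int, seed: int = 0) -> float:
    """Depth = iterations per trajectory; breadth = trajectories aggregated by median."""
    rng = random.Random(seed)
    finals = [run_trajectory(x, depth, rng) for _ in range(breadth)]
    return statistics.median(finals)


# More depth moves each trajectory toward the attractor at sqrt(9) = 3;
# more breadth averages out the per-step noise.
print(solve(9.0, depth=1, breadth=1))
print(solve(9.0, depth=20, breadth=64))
```

The median is used here as one simple aggregation choice; the summary does not say how EqR combines trajectories.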