Entropy-Cut Metropolis-Hastings

technique · active · entropy-cut-metropolis-hastings-60346d0a · 1 event · first seen May 29, 2026

Aliases: Entropy-Cut Metropolis-Hastings

Co-occurring entities

power distribution · MATH500 · GPQA Diamond · AIME26 · Metropolis-Hastings · HumanEval

More like this (12)

Metropolis-Hastings · Maximum Entropy Random Walk · Certified Parallel-in-Time Sinkhorn for Dynamic Entropic Optimal Transport · Entropy-Aware Dense Pruning · Semantic Entropy · Minimax Density · Conditional Scale Entropy · Entropy Regularization · Hamiltonian leapfrog map · Gradient Equilibrium · Marchenko-Pastur distribution · Minimal Markovization via Stable Quotients in Holonomy-Cover Decision Processes

Recent events (1)

7 · arXiv · cs.LG · May 29, 2026

Entropy-Cut Metropolis-Hastings: Sampling-Based Reasoning Without RL Training

This paper introduces Entropy-Cut Metropolis-Hastings (ECMH), an algorithm that samples from a "power distribution" over a base language model's outputs to elicit strong reasoning without reinforcement-learning post-training. Rather than cutting reasoning traces at uniformly random positions, ECMH uses next-token entropy as a proxy for consequential decision points (e.g., the choice of proof strategy) and resamples the trace from those positions. The authors prove that the mixing time scales with the number of decisions rather than the number of tokens, and they report consistent improvements over RL-trained models on MATH500, HumanEval, GPQA Diamond, and AIME26.
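The summary does not spell out the algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes the power distribution is proportional to p(x)^alpha for a base model p, that a cut position t is drawn with probability proportional to the next-token entropy at t, and that the suffix from t onward is resampled from the base model. Under those assumptions the Metropolis-Hastings acceptance ratio is (p(new suffix) / p(old suffix))^(alpha - 1), multiplied by the ratio of cut probabilities for position t in the new and old traces. The toy model `next_token_probs` and all other names and constants are hypothetical.

```python
import math
import random

VOCAB = 3   # toy vocabulary size (hypothetical)
LENGTH = 4  # toy sequence length (hypothetical)


def next_token_probs(prefix):
    """Toy stand-in for a base language model's next-token distribution."""
    if len(prefix) % 3 == 0:
        return [0.5, 0.3, 0.2]   # high-entropy "decision" position
    prev = prefix[-1]
    if prev == 2:
        return [0.3, 0.3, 0.4]   # flatter, so entropy depends on the prefix
    probs = [0.05] * VOCAB
    probs[prev] = 0.9            # low entropy: mostly copy the previous token
    return probs


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cut_probs(seq):
    """Assumed cut rule: probability of cutting at t is proportional to entropy at t."""
    weights = [entropy(next_token_probs(seq[:t])) for t in range(len(seq))]
    total = sum(weights)
    return [w / total for w in weights]


def suffix_logp(seq, t):
    """Log-probability under the base model of seq[t:] given seq[:t]."""
    return sum(math.log(next_token_probs(seq[:i])[seq[i]])
               for i in range(t, len(seq)))


def sample_suffix(prefix, rng):
    """Complete a prefix to full length by sampling from the base model."""
    seq = list(prefix)
    while len(seq) < LENGTH:
        probs = next_token_probs(seq)
        seq.append(rng.choices(range(VOCAB), weights=probs)[0])
    return seq


def ecmh_step(seq, alpha, rng):
    """One MH step targeting p(x)**alpha with an entropy-weighted cut position."""
    cut = cut_probs(seq)
    t = rng.choices(range(LENGTH), weights=cut)[0]
    proposal = sample_suffix(seq[:t], rng)
    cut_new = cut_probs(proposal)
    # Target ratio times reverse/forward suffix proposal ratio.
    log_ratio = (alpha - 1.0) * (suffix_logp(proposal, t) - suffix_logp(seq, t))
    # The cut distribution depends on the trace, so correct for it as well.
    log_ratio += math.log(cut_new[t]) - math.log(cut[t])
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return seq


def run_chain(steps, alpha, seed=0):
    """Run the chain from a base-model sample and return every visited state."""
    rng = random.Random(seed)
    seq = sample_suffix([], rng)
    samples = []
    for _ in range(steps):
        seq = ecmh_step(seq, alpha, rng)
        samples.append(tuple(seq))
    return samples
```

Calling `run_chain(20000, alpha=4.0)` on this toy model concentrates the samples on the highest-probability sequence far more than direct sampling from the base model would. The cut-probability correction is included because, in this sketch, the entropy profile can differ between the old and new traces; whether the paper handles this the same way is not stated in the summary.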

Frontier Model Releases · Evaluation and Benchmarking · power distribution · MATH500 · Entropy-Cut Metropolis-Hastings