Almanac
model

Large Reasoning Models

modelactivelarge-reasoning-models-f542c2fc·1 events·first seen 28d ago

Aliases: Large Reasoning Models

Co-occurring entities

More like this (12)

Recent events (1)

6arXiv · cs.CL·28d ago·source ↗

Probe Trajectories Reveal Reasoning Dynamics in Large Reasoning Models

This paper investigates whether hidden representations of Large Reasoning Models (LRMs) can predict future model behavior by analyzing probe trajectories—the continuous evolution of concept probabilities across Chain-of-Thought reasoning tokens. The authors find that temporal trajectory features (volatility, trend, steady-state) significantly outperform single static probes, with max-pooling achieving up to 95% AUROC across safety and mathematics domains. Two methodological insights are offered: template-based training data matches dynamically generated responses in quality, and pooling strategy is critical to probe performance. The work positions probe trajectories as a complementary safety monitoring framework for LRMs where CoT faithfulness cannot be assumed.