
What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis



Co-occurring entities

EIT-NLP · Unified Latent Probe

More like this (12)

latent chain-of-thought
chain-of-thought monitoring
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
Agentic Chain-of-Thought Steering
Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models
When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models
Chain-of-Thought Reasoning
Open Chain of Thought Leaderboard
Chain-of-Thought Fine-Tuning
Chain-of-Thought Monitorability Evaluation Suite
Chain-of-Thought Self-Consistency
Latent Reasoning with Normalizing Flows

Recent events (1)

arXiv · cs.CL · 47h ago

Information-theoretic analysis of supervision in latent chain-of-thought reasoning

This paper analyzes latent chain-of-thought (CoT) reasoning, in which reasoning occurs in continuous hidden states rather than discrete text, through an information-theoretic lens. It identifies a 'dual collapse' failure mode involving gradient attenuation and representational drift. The authors decompose process supervision into Trajectory Supervision and Space Supervision, and introduce the Unified Latent Probe (ULP) to quantify the mutual information between latent trajectories and explicit reasoning steps. Experiments reveal an 'Information-Performance Binding': reasoning accuracy depends on information fidelity in the latent chain, which suggests that supervision should shift from geometric imitation toward mutual information maximization.
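
To make the idea of "mutual information between latent states and explicit reasoning steps" concrete, here is a minimal sketch of a plug-in mutual information estimate over paired discrete samples. This is not the paper's Unified Latent Probe, whose construction is not described in this summary. It assumes, purely for illustration, that each latent state has already been discretized into a cluster id; the names `mutual_information`, `steps`, `faithful`, and `collapsed` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))), written in terms of raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: ids of the explicit reasoning steps, paired with
# cluster ids of the latent states (assumed to be discretized beforehand).
steps = [0, 1, 2, 3] * 2
faithful = [0, 1, 2, 3] * 2   # latent clusters track the steps one-to-one
collapsed = [0] * 8           # every latent state falls in the same cluster

mi_faithful = mutual_information(faithful, steps)
mi_collapsed = mutual_information(collapsed, steps)
print(mi_faithful)   # equals log(4), about 1.386 nats
print(mi_collapsed)  # 0.0
```

In this toy setting a latent chain that tracks the explicit steps carries the full log(4) nats about them, while a collapsed chain carries none, whatever its geometry looks like.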

Evaluation and Benchmarking · Alignment and RLHF · EIT-NLP · Unified Latent Probe · What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis