Unified Latent Probe

technique · active · provisional · unified-latent-probe-44fc8eb1 · 1 event · first seen 47h ago

Aliases: Unified Latent Probe

Co-occurring entities

EIT-NLP · What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis

More like this (12)

Reverse Probing · Text-Only Probe · hidden state probing · Chain-Text Probe · Probe Trajectories · MemProbe · probing classifiers · logistic regression probes · paired-scenario forced-choice probe · DeepSeek-Prover-V2-7B · MMLU-Pro · Latent-Anchored GRPO

Recent events (1)

5 · arXiv · cs.CL · 47h ago

Information-theoretic analysis of supervision in latent chain-of-thought reasoning

This paper analyzes Latent Chain-of-Thought (CoT) reasoning, in which reasoning occurs in continuous hidden states rather than discrete text, through an information-theoretic lens. It identifies a 'dual collapse' failure mode involving gradient attenuation and representational drift. The authors decompose process supervision into Trajectory Supervision and Space Supervision, and introduce the Unified Latent Probe (ULP) to quantify the mutual information between latent trajectories and explicit reasoning steps. Experiments reveal an 'Information-Performance Binding': reasoning accuracy depends on the fidelity of the information carried in the latent chain. The authors suggest that supervision should therefore shift from geometric imitation toward mutual information maximization.
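
The summary does not describe how the ULP is built, so the following is only a rough illustration of the general idea of using a probe to estimate mutual information. It is not the authors' method. It assumes a plain linear softmax probe and the standard variational bound I(Z; Y) >= H(Y) - CE, where CE is the probe's held-out cross-entropy in nats. All function names (`fit_linear_probe`, `mi_lower_bound`, `make_toy_data`) and the synthetic "latent states" are hypothetical stand-ins.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def fit_linear_probe(z, y, num_classes, lr=0.2, steps=500):
    # Full-batch gradient descent on softmax cross-entropy.
    n, d = z.shape
    w = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[y]
    for _ in range(steps):
        grad = (softmax(z @ w + b) - onehot) / n
        w -= lr * (z.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b

def cross_entropy(z, y, w, b):
    # Mean negative log-likelihood of the true labels, in nats.
    probs = softmax(z @ w + b)
    return float(-np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12)))

def label_entropy(y, num_classes):
    # Empirical entropy H(Y) of the label distribution, in nats.
    counts = np.bincount(y, minlength=num_classes)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def mi_lower_bound(z_train, y_train, z_test, y_test, num_classes):
    # Estimate of the bound I(Z; Y) >= H(Y) - CE on held-out data.
    w, b = fit_linear_probe(z_train, y_train, num_classes)
    return label_entropy(y_test, num_classes) - cross_entropy(z_test, y_test, w, b)

def make_toy_data(n, num_classes, dim, noise, rng):
    # Hypothetical stand-in: each "latent state" is a noisy copy of a
    # class center, and the label plays the role of an explicit step.
    centers = rng.normal(size=(num_classes, dim))
    y = rng.integers(0, num_classes, size=n)
    z = centers[y] + noise * rng.normal(size=(n, dim))
    return z, y

rng = np.random.default_rng(0)
z, y = make_toy_data(2000, 4, 8, 0.5, rng)
z_tr, y_tr, z_te, y_te = z[:1500], y[:1500], z[1500:], y[1500:]

# Latents that carry label information versus row-shuffled latents
# (shuffling breaks the pairing, so the true mutual information is zero).
informative = mi_lower_bound(z_tr, y_tr, z_te, y_te, 4)
shuffled = mi_lower_bound(rng.permutation(z_tr), y_tr,
                          rng.permutation(z_te), y_te, 4)
print(f"informative latents: {informative:.3f} nats")
print(f"shuffled latents:    {shuffled:.3f} nats")
```

In this toy setup the informative latents should yield an estimate close to H(Y), while the shuffled ones should sit near zero. Because the estimate comes from a finite sample and a linear probe, it can be slightly negative and may understate the information available to a more expressive probe.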

Evaluation and Benchmarking · Alignment and RLHF · EIT-NLP · Unified Latent Probe · What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis