technique
Semantic Neighbor Mixing
technique · active · provisional
semantic-neighbor-mixing-5d532786 · 1 event · first seen 7d ago
Aliases: Semantic Neighbor Mixing
More like this (12)
Sparse Mixture-of-Experts
When Does Mixing Help? Analyzing Query Embedding Interpolation in Multilingual Dense Retrieval
Supervised Semantic Differential
Fast Adaptive Semantic Entropy
mixture-density networks
Data Mixture Surgery
MDA (Mixture-Density Ambiguity)
The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model
Graph Neural Network leveraging Higher-order Class Label Connectivity for Heterophilous Graphs
Pair M-dist
Tying the Loop -- Tied Expert Layers in Mixture-of-Experts Language Models
Semantic Triplet Restoration
Recent events (1)
N-GRPO: Semantic Neighbor Mixing for Improved Policy Optimization in LLM Reasoning
A new arXiv preprint introduces N-GRPO, an exploration strategy for the GRPO reinforcement learning framework. It aims to improve solution diversity during rollout by mixing the embeddings of anchor tokens with those of their nearest semantic neighbors, rather than relying on token-level sampling or random noise. Evaluated on DeepSeek-R1-Distill-Qwen models of various sizes, the method shows consistent improvements on math reasoning benchmarks and in out-of-distribution generalization. The work targets a known limitation in RLHF-style training: redundant rollout trajectories that reduce the effective learning signal.
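The core idea in the summary, blending an anchor token's embedding with its nearest semantic neighbors, can be illustrated with a minimal NumPy sketch. This is not the preprint's implementation: the function name `mix_with_neighbors`, the use of cosine similarity, the neighbor count `k`, the mixing weight `alpha`, and the random embedding table are all assumptions made for illustration.

```python
import numpy as np

def mix_with_neighbors(embeddings, anchor_id, k=3, alpha=0.8):
    """Blend one token embedding with the mean of its k nearest neighbors.

    embeddings: (vocab_size, dim) array; anchor_id: row index of the anchor.
    k and alpha are illustrative hyperparameters, not values from the preprint.
    """
    # Cosine similarity between the anchor and every row of the table
    # (assumed metric; the preprint may define neighbors differently).
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sims = unit @ unit[anchor_id]
    sims[anchor_id] = -np.inf  # exclude the anchor itself

    # Indices of the k most similar tokens, most similar first.
    neighbor_ids = np.argsort(sims)[-k:][::-1]
    neighbor_mean = embeddings[neighbor_ids].mean(axis=0)

    # Convex combination: alpha = 1.0 returns the anchor unchanged.
    mixed = alpha * embeddings[anchor_id] + (1.0 - alpha) * neighbor_mean
    return mixed, neighbor_ids

# Hypothetical toy embedding table standing in for a model's input embeddings.
rng = np.random.default_rng(0)
table = rng.normal(size=(50, 8))
mixed, neighbors = mix_with_neighbors(table, anchor_id=7)
```

In this sketch the perturbation stays close to the anchor in embedding space because it is drawn from nearby tokens, which is the intuition the summary contrasts with random noise. How anchors are selected and how the mixed embedding enters the rollout are not described in the summary and are left out here.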