technique
Dual-Evidence Gradient Purification
technique · active
dual-evidence-gradient-purification-ca5eb1ca · 1 event · first seen 26d ago
Aliases: Dual-Evidence Gradient Purification
More like this (12)
Evolved Policy Gradients
Policy Gradient Methods
gradient accumulation
Gradient Labs
policy gradient
Deep Double Descent
Denoising Diffusion Policy Optimization
EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading
Integrated Gradients
behavioral-gradient validator
gradient flow dynamics
On-Policy Distillation (OPD)
Recent events (1)
TextReg: Regularization Framework for Mitigating Prompt Distributional Overfitting in LLM Optimization
TextReg addresses a failure mode in iterative prompt optimization, termed prompt distributional overfitting, in which LLM-rewritten prompts grow longer, accumulate narrow rules, and generalize poorly. The authors formalize this via 'representational inefficiency,' a dual-factor measure that decomposes prompt inefficiency into capacity cost and scope narrowness. TextReg applies a soft-penalty regularization framework with three components: Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update. On reasoning benchmarks, it achieves up to +11.8% out-of-distribution (OOD) accuracy over TextGrad and +16.5% over REVOLVE.
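The summary above does not give TextReg's formulas, so the following is only an illustrative sketch of what a soft penalty over the two named factors could look like. Everything in it is an assumption made for illustration and not the authors' method: the names `PromptCandidate`, `capacity_cost`, `regularized_score`, and `select`, the word-count proxy for capacity cost, the externally supplied `scope_narrowness` score, and the weights `lam_capacity` and `lam_scope`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PromptCandidate:
    text: str
    task_loss: float         # lower is better, e.g. 1 - validation accuracy
    scope_narrowness: float  # hypothetical score in [0, 1], supplied by the caller


def capacity_cost(prompt: str, budget_words: int = 200) -> float:
    """Hypothetical proxy for capacity cost: word count relative to a budget."""
    return len(prompt.split()) / budget_words


def regularized_score(
    cand: PromptCandidate,
    lam_capacity: float = 0.1,  # assumed penalty weight, not from the source
    lam_scope: float = 0.1,     # assumed penalty weight, not from the source
) -> float:
    """Task loss plus soft penalties; nothing is hard-rejected."""
    return (
        cand.task_loss
        + lam_capacity * capacity_cost(cand.text)
        + lam_scope * cand.scope_narrowness
    )


def select(candidates: Iterable[PromptCandidate]) -> PromptCandidate:
    """Pick the candidate with the lowest regularized score."""
    return min(candidates, key=regularized_score)


# A short, general prompt versus a long, rule-heavy rewrite that fits
# the training distribution slightly better.
short = PromptCandidate(" ".join(["word"] * 20), task_loss=0.20, scope_narrowness=0.1)
long_ = PromptCandidate(" ".join(["rule"] * 400), task_loss=0.18, scope_narrowness=0.9)
best = select([short, long_])
```

Under these assumed weights the shorter prompt is selected even though its raw task loss is slightly higher. Because the penalty is soft, a longer prompt can still win if its accuracy gain outweighs the added cost.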