
CORE (Contrastive Reflection)

technique · active · provisional · core-contrastive-reflection--9a8214e0 · 1 event · first seen 20d ago

Aliases: CORE (Contrastive Reflection)


Recent events (1)

7 · arXiv · cs.AI · 20d ago

CORE: Contrastive Reflection for Sample-Efficient Reasoning Improvement

CORE (Contrastive Reflection) is a non-parametric learning algorithm that improves LLM reasoning by contrasting successful and unsuccessful reasoning traces and distilling the differences into compact natural-language "insights" about reasoning strategy. Across four reasoning tasks and under fixed rollout budgets, CORE outperforms both parametric baselines (GRPO/RLVR) and non-parametric baselines (GEPA, episodic RAG, MemRL), achieving comparable or better gains with as few as five training samples. The method is also more context-efficient than prompt-optimization approaches, and it stores learned knowledge as interpretable natural-language descriptions rather than as raw traces or weight updates. The results suggest that contrastive distillation of reasoning traces may be a more efficient route to self-improvement than traditional fine-tuning.
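The loop described above (contrast a successful trace with a failed one, distill an insight, reuse the insights as context) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Trace` and `InsightStore` types, the same-task pairing rule, the cap on stored insights, and the prompt layout are all assumptions, and `stub_reflect` is a hypothetical stand-in for the LLM call that would write each insight.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Trace:
    task: str        # the problem that was attempted
    reasoning: str   # the reasoning the model produced
    success: bool    # whether the final answer was verified correct


def contrastive_pairs(traces: List[Trace]) -> List[Tuple[Trace, Trace]]:
    """Pair every successful trace with every failed trace on the same task.

    Assumption: contrasts are only drawn within a task.
    """
    by_task: Dict[str, Tuple[List[Trace], List[Trace]]] = {}
    for t in traces:
        wins, losses = by_task.setdefault(t.task, ([], []))
        (wins if t.success else losses).append(t)
    pairs = []
    for wins, losses in by_task.values():
        pairs.extend((w, l) for w in wins for l in losses)
    return pairs


@dataclass
class InsightStore:
    """Non-parametric memory: short text insights, no weight updates."""
    reflect: Callable[[Trace, Trace], str]  # hypothetical LLM-backed reflector
    max_insights: int = 8                   # assumed cap to keep context small
    insights: List[str] = field(default_factory=list)

    def learn(self, traces: List[Trace]) -> None:
        for good, bad in contrastive_pairs(traces):
            insight = self.reflect(good, bad).strip()
            if insight and insight not in self.insights:
                self.insights.append(insight)
        # Keep only the most recent insights.
        self.insights = self.insights[-self.max_insights:]

    def build_prompt(self, question: str) -> str:
        if not self.insights:
            return question
        notes = "\n".join(f"- {i}" for i in self.insights)
        return f"Insights from earlier attempts:\n{notes}\n\nQuestion: {question}"


def stub_reflect(good: Trace, bad: Trace) -> str:
    # Placeholder: a real system would ask an LLM to explain why `good`
    # succeeded where `bad` failed.
    return f"On '{good.task}': prefer '{good.reasoning}' over '{bad.reasoning}'."


traces = [
    Trace("17 * 24", "split 24 into 20 + 4", True),
    Trace("17 * 24", "estimate as 20 * 20", False),
    Trace("is 91 prime?", "assume odd numbers are prime", False),
]
store = InsightStore(reflect=stub_reflect)
store.learn(traces)
print(store.build_prompt("19 * 23"))
```

In this sketch the third trace contributes nothing because it has no successful counterpart on the same task. What persists between tasks is a short list of sentences rather than raw traces or gradient updates, which is the property the summary highlights.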