CORA

technique · active · provisional · cora-166b5b1b · 1 event · first seen 2d ago

Aliases: CORA

Recent events (1)

5 · arXiv · cs.CL · 2d ago

CORA: Consistency-Oriented Reasoning Alignment addresses thinking-answer gap in multimodal RLVR

Researchers identify and analyze a systematic inconsistency between reasoning traces and final answers in large vision-language models trained with reinforcement learning with verifiable rewards (RLVR), showing that the problem persists throughout Group Relative Policy Optimization (GRPO) training and at inference. They propose CORA, which introduces a lightweight plug-and-play consistency reward model and a Hybrid Reward Advantage Splitting (HRAS) mechanism to coordinate task and consistency optimization. Experiments across multimodal reasoning benchmarks show CORA improves both task performance and reasoning faithfulness.
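
The summary does not say how HRAS combines the two reward signals, so the following is only an illustrative sketch, not the paper's method. It assumes one plausible reading: the standard GRPO group-relative advantage (reward minus group mean, divided by group standard deviation) is computed separately for the task reward and the consistency reward, and the two are then summed with a weight. The names `group_advantages`, `split_advantages`, and `consistency_weight`, along with the toy reward values, are hypothetical.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """GRPO-style group-relative advantage: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def split_advantages(task_rewards, consistency_rewards, consistency_weight=0.5):
    """Hypothetical combination rule (an assumption, not taken from the source):
    normalize each reward stream within the group on its own, then add them,
    so neither signal's scale swamps the other."""
    task_adv = group_advantages(task_rewards)
    cons_adv = group_advantages(consistency_rewards)
    return [t + consistency_weight * c for t, c in zip(task_adv, cons_adv)]


# Toy group of four sampled responses to one prompt (made-up values).
# task: 1.0 if the final answer is correct, else 0.0
# consistency: 1.0 if the reasoning trace supports the final answer, else 0.0
task = [1.0, 1.0, 0.0, 0.0]
consistency = [1.0, 0.0, 1.0, 0.0]
adv = split_advantages(task, consistency)
```

In this toy group, the response that is both correct and consistent receives the largest advantage, and the one that is neither receives the smallest; a correct but inconsistent response still ranks above an incorrect but consistent one because the consistency term is down-weighted.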