Almanac
dataset

Perceptually Perturbed Judgment Dataset

dataset · active · provisional · perceptually-perturbed-judgment-dataset-d9fae811 · 1 event · first seen 15d ago

Aliases: Perceptually Perturbed Judgment Dataset

Recent events (1)

6 · arXiv · cs.AI · 15d ago

Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling

This paper identifies and analyzes 'Perceptual Judgment Bias' in multimodal LLM judges, where models anchor on response text rather than visual evidence when the two conflict. The authors introduce a Perceptually Perturbed Judgment Dataset using counterfactual responses to isolate perceptual errors, and a training framework combining GRPO-based reward modeling with batch-ranking objectives. Experiments on MLLM-as-a-Judge benchmarks show improved perceptual fidelity, ranking coherence, and alignment with human evaluation.
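The summary above does not spell out the reward or the data schema, so the following is only a minimal sketch under stated assumptions. It shows how a judge's scores on a faithful response and its perceptually perturbed counterfactual could be turned into a reward, and how rewards within one sampled group are centered and scaled in the GRPO style. The `PerturbedPair` record, the `preference_reward` rule, and the example scores are hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class PerturbedPair:
    """Hypothetical record: a faithful response and its counterfactual."""
    question: str
    original: str   # response consistent with the image
    perturbed: str  # same response with one visual detail altered


def preference_reward(score_original: float, score_perturbed: float) -> float:
    """Assumed reward: 1.0 only if the judge scores the faithful response strictly higher."""
    return 1.0 if score_original > score_perturbed else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style normalization: center and scale rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


pair = PerturbedPair(
    question="What color is the car?",
    original="The car is red.",
    perturbed="The car is blue.",
)

# Made-up judge scores (original, perturbed) from four sampled judgments of `pair`.
judge_scores = [(8.0, 3.0), (6.0, 6.0), (4.0, 7.0), (9.0, 2.0)]
rewards = [preference_reward(o, p) for o, p in judge_scores]
advantages = group_relative_advantages(rewards)
```

In this sketch a judgment that ties or prefers the perturbed response earns no reward, so judgments that follow the visual evidence receive positive advantages relative to the rest of the group. The batch-ranking objective mentioned in the summary is not covered here.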