Entity · benchmark

ClinHallu

benchmark · active · clinhallu-cad4fe2d · 1 event · first seen Jun 15, 2026

Aliases: ClinHallu

More like this (12)

BenHalluEval · BenHalluScore · LegalHalluLens · HalluTruthQA · ClinEnv · CLI-Hub · cc-haha · HLE · HealthClaw · Hallucinations Leaderboard · MCLASH · ECL

Recent events (1)

5 · arXiv · cs.AI · Jun 15, 2026

ClinHallu benchmark diagnoses stage-wise hallucinations in medical multimodal LLM reasoning

Researchers from Alibaba DAMO Academy introduce ClinHallu, a benchmark of 7,031 validated instances designed to identify where hallucinations originate within medical MLLM reasoning pipelines. Each instance is annotated with a structured reasoning trace decomposed into Visual Recognition, Knowledge Recall, and Reasoning Integration stages, with stage-replacement interventions to measure the causal impact of correcting each stage. The paper also demonstrates that trace-supervised fine-tuning reduces stage-wise hallucinations, offering both diagnostic and mitigation value for clinical AI systems.
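
The stage-replacement idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `Trace` fields, the `replace_stage` and `stage_effects` helpers, the `toy_answer_fn` stub, and the two toy instances are all hypothetical names and data invented for illustration. The sketch assumes each instance has a model-generated trace and a reference trace, swaps one stage at a time from the reference into the model trace, and reports the change in answer accuracy relative to the unmodified trace.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple

# Stage names follow the three stages named in the summary; the field
# layout itself is an assumption made for this sketch.
STAGES = ("visual_recognition", "knowledge_recall", "reasoning_integration")


@dataclass(frozen=True)
class Trace:
    visual_recognition: str
    knowledge_recall: str
    reasoning_integration: str


def replace_stage(model_trace: Trace, reference_trace: Trace, stage: str) -> Trace:
    """Return a copy of model_trace with one stage taken from the reference."""
    return replace(model_trace, **{stage: getattr(reference_trace, stage)})


def stage_effects(
    pairs: List[Tuple[Trace, Trace]],
    gold_answers: List[str],
    answer_fn: Callable[[Trace], str],
) -> Dict[str, float]:
    """Accuracy gain from correcting each stage in isolation."""

    def accuracy(traces: List[Trace]) -> float:
        correct = sum(answer_fn(t) == g for t, g in zip(traces, gold_answers))
        return correct / len(gold_answers)

    baseline = accuracy([model for model, _ in pairs])
    return {
        stage: accuracy([replace_stage(m, r, stage) for m, r in pairs]) - baseline
        for stage in STAGES
    }


def toy_answer_fn(trace: Trace) -> str:
    """Hypothetical stand-in for a model: answers 'A' only if no stage is flawed."""
    stages = (getattr(trace, s) for s in STAGES)
    return "B" if any("WRONG" in s for s in stages) else "A"


# Toy data only: instance 1 has a visual error, instance 2 a knowledge error.
reference = Trace("lesion in left lobe", "finding suggests X", "therefore X")
pairs = [
    (Trace("WRONG: right lobe", "finding suggests X", "therefore X"), reference),
    (Trace("lesion in left lobe", "WRONG: suggests Y", "therefore X"), reference),
]
effects = stage_effects(pairs, gold_answers=["A", "A"], answer_fn=toy_answer_fn)
print(effects)
```

On this toy data, correcting the visual stage or the knowledge stage each recovers one of the two instances, while correcting the reasoning stage changes nothing. In a real setting `answer_fn` would re-run the model conditioned on the edited trace rather than apply a string check.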

Evaluation and Benchmarking · AI Safety Research · Alibaba DAMO Academy · ClinHallu