paper
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
paper · active · provisional
measuring-epistemic-resilience-of-llms-under-misleading-medical-context-d10c8be9 · 1 event · first seen 6d ago
Aliases: Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
More like this (12)
Reassessing High-Performing LLMs on Polish Medical Exams: True Competence or Bias-Driven Performance?
Clinically Grounded Privacy Evaluation of Medical LMs
The Masked Advantage: Uncovering Local-Language Access to Cultural Knowledge in LLMs
Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond
EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading
Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?
Open Medical-LLM Leaderboard
Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions
LLM-Guided Evolution for Medical Decision Pipelines
Beyond Third-Person Audits: Situated Interaction Auditing for User-Centered LLM Bias Research
How reliable are LLMs when it comes to playing dice?
How reliable are LLMs when it comes to playing dice?
Recent events (1)
MedMisBench: LLMs show fragile epistemic resilience under misleading medical context
Researchers introduce MedMisBench, a benchmark of 10,932 medical questions paired with 48,889 misleading context injections, to measure whether LLMs maintain correct medical judgment under adversarial pressure. Across 11 model configurations, mean accuracy drops from 71.1% to 38.0% when misleading context is injected, and authority-framed falsehoods reach a 69.5% attack success rate. A 14-member international clinical panel flagged serious potential harm in 38.2% of reviewed cases. The work argues that existing medical benchmarks measure knowledge but not robustness to manipulation, which exposes a structural gap in LLM safety evaluation for healthcare.
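To make the reported figures easier to interpret, here is a minimal Python sketch of how an accuracy drop and an attack success rate could be computed from per-question results. The summary does not state how MedMisBench defines attack success, so the definition used here (the share of questions answered correctly without injection that become incorrect with it) is an assumption, and the function names and the toy data are hypothetical.

```python
def accuracy(correct_flags):
    """Fraction of questions answered correctly."""
    return sum(correct_flags) / len(correct_flags)


def attack_success_rate(clean_correct, attacked_correct):
    """ASSUMED definition, not taken from the paper: among questions the
    model answers correctly without misleading context, the fraction it
    answers incorrectly once the misleading context is injected."""
    eligible = 0
    flipped = 0
    for clean, attacked in zip(clean_correct, attacked_correct):
        if clean:
            eligible += 1
            if not attacked:
                flipped += 1
    return flipped / eligible if eligible else 0.0


def accuracy_drop(clean_acc, attacked_acc):
    """Return (absolute drop, relative drop) between two accuracy values."""
    absolute = clean_acc - attacked_acc
    relative = absolute / clean_acc
    return absolute, relative


# Hypothetical toy results for four questions (not benchmark data).
toy_clean = [True, True, True, False]
toy_attacked = [True, False, False, False]
toy_asr = attack_success_rate(toy_clean, toy_attacked)

# Mean accuracies reported in the summary, in percent.
abs_drop, rel_drop = accuracy_drop(71.1, 38.0)
print(f"toy attack success rate: {toy_asr:.3f}")
print(f"absolute drop: {abs_drop:.1f} points, relative drop: {rel_drop:.1%}")
```

Applied to the reported means, the drop from 71.1% to 38.0% is about 33.1 percentage points, or roughly 46.6% of the baseline accuracy. The two views answer different questions, so it helps to state which one is meant when comparing models.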