Clinically Grounded Privacy Evaluation of Medical LMs

paper · active · clinically-grounded-privacy-evaluation-of-medical-lms-355bdf58 · 1 event · first seen Jun 9, 2026

More like this (12)

Clinician-Level Agreement Without Clinical Caution: LLM Evaluator Limits in Medical AI Benchmarking
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
Online Safety Monitoring for LLMs
Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond
MedRLM
Towards Root Memories: Benchmarking and Enhancing Implicit Logical Memory Retrieval for Personalized LLMs
Trade-offs in Medical LLM Adaptation: An Empirical Study in French QA
RAS: Measuring LLM Safety Through Refusal Alignment
Groc-PO: Grounded Context Preference Optimization for Truthful Multimodal LLMs
Beyond Third-Person Audits: Situated Interaction Auditing for User-Centered LLM Bias Research
Reassessing High-Performing LLMs on Polish Medical Exams: True Competence or Bias-Driven Performance?
CM-LRS

Recent events (1)

6 · arXiv · cs.CL · Jun 9, 2026

Clinically grounded privacy evaluation framework reveals high memorization risk in medical LMs

Researchers introduce a tiered adversarial framework for evaluating privacy leakage in medical language models, moving beyond simple training-text recovery to realistic clinical threat models. Applied to an LM pretrained on 378k clinical notes, the framework finds that routine encounter metadata (name, DOB, provider, visit date) is enough to elicit high verbatim memorization and to recover sensitive diagnoses (AUROC 0.91 for abortion, 0.81 for HIV). The study also finds that exact-match memorization overstates disclosure risk, because 36% of memorized tokens reflect templated documentation rather than patient-specific content. The work offers a practical, contextual privacy evaluation methodology for medical LMs trained on longitudinal patient data.
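To make the two reported quantities concrete, here is a minimal Python sketch of how an AUROC for a diagnosis-recovery attack and a templated share of memorized tokens could be computed. It is an illustration under stated assumptions, not the paper's implementation: the function names `auroc` and `templated_share`, the per-patient attack scores, and the per-token flags are all hypothetical, and the toy numbers are not results from the study.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (patient has the diagnosis)
    or 0 (patient does not). Hypothetical helper, not from the paper.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def templated_share(memorized: Sequence[bool], templated: Sequence[bool]) -> float:
    """Fraction of memorized tokens that are also flagged as template text.

    Both inputs are per-token flags of equal length (an assumption made
    for this sketch; the paper may define these differently).
    """
    flags = [t for m, t in zip(memorized, templated) if m]
    if not flags:
        return 0.0
    return sum(flags) / len(flags)


# Toy attack scores for four patients; the first two have the diagnosis.
scores = [0.9, 0.2, 0.8, 0.1]
labels = [1, 1, 0, 0]
print(auroc(scores, labels))  # 0.75: three of four positive/negative pairs are ranked correctly
```

An AUROC of 0.5 corresponds to an attacker who cannot rank patients with the diagnosis above those without it, and 1.0 to perfect ranking, which is why values such as 0.91 indicate substantial leakage. A templated share near 0.36 would mean roughly a third of exact-match memorized tokens come from boilerplate rather than patient-specific text.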

Evaluation and Benchmarking · AI Safety Research · Clinically Grounded Privacy Evaluation of Medical LMs