MedCaseReasoning

dataset · active · provisional · medcasereasoning-8402ddd5 · 1 event · first seen 18d ago

Aliases: MedCaseReasoning

Recent events (1)

5 · arXiv · cs.CL · 18d ago

MedCase-Structured: A Text-to-FHIR Dataset for Benchmarking Diagnostic Reasoning in Clinically Realistic EHR Settings

The paper introduces a pipeline for converting unstructured clinical text into HL7 FHIR R4 bundles, enabling evaluation of LLMs in realistic electronic health record settings. Applied to the MedCaseReasoning dataset, it produces MedCase-Structured, a synthetic benchmark achieving valid FHIR generation for 82.5% of cases. Key finding: LLMs show consistently lower diagnostic accuracy on structured FHIR inputs compared to plain text, underscoring the gap between standard benchmarks and real-world clinical deployment conditions.
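For readers unfamiliar with FHIR, the following is a minimal sketch of what a bundle-shaped output and a validity rate might look like. It is not the paper's pipeline: the helper names (build_bundle, looks_like_bundle, valid_rate) are hypothetical, the findings are passed in by hand rather than extracted by an LLM, and the check is only a shallow structural test, far weaker than real FHIR R4 validation.

```python
import uuid


def build_bundle(gender: str, findings: list[str]) -> dict:
    """Assemble a FHIR-style Bundle dict with one Patient and one Condition per finding.

    Hypothetical helper: a real pipeline would extract these fields from case text.
    """
    patient_url = f"urn:uuid:{uuid.uuid4()}"
    entries = [{
        "fullUrl": patient_url,
        "resource": {"resourceType": "Patient", "gender": gender},
    }]
    for text in findings:
        entries.append({
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": {
                "resourceType": "Condition",
                # Each Condition points back to the Patient entry.
                "subject": {"reference": patient_url},
                "code": {"text": text},
            },
        })
    return {"resourceType": "Bundle", "type": "collection", "entry": entries}


def looks_like_bundle(obj) -> bool:
    """Shallow structural check only; an assumption standing in for a real validator."""
    if not isinstance(obj, dict) or obj.get("resourceType") != "Bundle":
        return False
    entries = obj.get("entry")
    if not isinstance(entries, list) or not entries:
        return False
    return all(
        isinstance(e, dict)
        and isinstance(e.get("resource"), dict)
        and "resourceType" in e["resource"]
        for e in entries
    )


def valid_rate(candidates: list) -> float:
    """Fraction of candidates that pass the shallow check."""
    if not candidates:
        return 0.0
    return sum(looks_like_bundle(c) for c in candidates) / len(candidates)


if __name__ == "__main__":
    good = build_bundle("female", ["fever", "maculopapular rash"])
    bad = {"resourceType": "Patient"}  # not a Bundle, so it fails the check
    print(valid_rate([good, bad]))
```

A figure such as the 82.5% reported above would correspond to a rate of this kind computed over the whole dataset, presumably with a much stricter validator than the one sketched here.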