Almanac
AUPRC

benchmark · active · provisional · auprc-4d9aa7c3 · 2 events · first seen 20d ago

Aliases: AUPRC

Recent events (2)

5 · arXiv · cs.AI · 20d ago

Reverse Probing: Supervised Token-level Uncertainty Quantification for LLMs in Clinical Text

The paper introduces Reverse Probing, an uncertainty quantification framework for clinical text summarization. Rather than sampling new outputs, it estimates token-level uncertainty from pre-existing labeled summaries. It extracts uncertainty signals from four categories of internal model activations, treating text as a probe into the model's internal state. On two expert-annotated clinical datasets, it outperforms eight adapted baselines on all metrics, achieving up to 4× higher AUPRC while reducing inference time and compute. Feature analysis identifies delta energy and neighborhood context as the most consistent predictors of uncertainty across models.

4 · arXiv · cs.CL · 15d ago

Evidence-Augmented ML for Self-Harm Surveillance in Emergency Department Triage Notes

Researchers developed a three-stage pipeline that combines traditional machine learning with LLM-based screening and evidence extraction to detect self-harm in Australian emergency department triage notes. The system achieved AUPRCs of around 0.88 in both internal and external validation, and transferred to two external hospital sites without site-specific retraining. It also identifies the primary self-harm method with 95% accuracy, enabling more granular public health surveillance than binary classification alone.
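
For readers unfamiliar with the metric both events report, here is a minimal sketch of one common way to compute AUPRC: the step-wise "average precision" estimate. It is not taken from either paper. The function name `average_precision` and the toy labels and scores are hypothetical, and the sketch assumes binary labels and distinct scores (tied scores would need grouping by threshold).

```python
def average_precision(labels, scores):
    """Step-wise AUPRC estimate (average precision) for binary labels.

    Assumes labels are 0/1 and scores are distinct; illustrative only.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where recall increases.
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 2 positives among 8 examples.
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)  # rough AUPRC of a random ranker
print(f"AUPRC (average precision): {ap:.3f}")
print(f"Positive-class prevalence: {prevalence:.3f}")
```

Because a random ranker's AUPRC is roughly the positive-class prevalence, absolute values such as 0.88 and relative gains such as 4× are easiest to interpret alongside the class balance of the dataset in question.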