eating disorder safety evaluation

benchmark · active · provisional · eating-disorder-safety-evaluation-66513209 · 1 event · first seen 15d ago

Aliases: eating disorder safety evaluation

Recent events (1)

5 · arXiv · cs.CL · 15d ago

Systematic Evaluation of LLM Safety Failures on Eating Disorder Queries with Clinician Feedback

This paper investigates how LLMs respond to queries from users with eating disorders, finding that specific linguistic cues in prompts increase the likelihood of unsafe model responses. Working with clinical ED experts, the authors systematically vary risk levels in user prompts to measure the extent to which LLMs uncritically adapt to potentially dangerous inputs. The study highlights a gap between perceived model safety and actual harm facilitation in sensitive health contexts.