Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond
paper · active · provisional
security-and-privacy-prompts-in-the-wild-what-users-ask-llms-and-how-llms-respond-9f470643 · 1 event · first seen 7h ago
Aliases: Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond
More like this (12)
Clinically Grounded Privacy Evaluation of Medical LMs
Beyond Third-Person Audits: Situated Interaction Auditing for User-Centered LLM Bias Research
The Masked Advantage: Uncovering Local-Language Access to Cultural Knowledge in LLMs
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context
frontier LLMs
LLM Safety Leaderboard
Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions
Flaws in the LLM Automation Narrative
Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?
SpeechLLM
LLM CLI
Be My Tutor: On-Policy Co-Distillation for Mutual LLM Improvement via Peer Feedback
Recent events (1)
Study of security and privacy prompts in the wild reveals LLM response quality gaps and inconsistency
Researchers analyzed 14,727 security and privacy (S&P) prompts drawn from WildChat's 3.2M real user-LLM conversations, categorized them into nine topic areas, and evaluated response quality on 270 advice-seeking prompts. Commercial models substantially outperformed open-weight models (98% of GPT responses were 'good enough' vs. 47% for Llama 4), but even high-performing commercial models gave inconsistent responses across repeated runs of the same prompt. The study is the first to analyze real user S&P queries to LLMs rather than expert-authored test sets, and it surfaces both a capability gap and a reliability concern.