MosaicLeaks

benchmark · active · provisional · mosaicleaks-5cf08717 · 1 event · first seen 2d ago

Aliases: MosaicLeaks

Recent events (1)

Hugging Face Blog · 2d ago

MosaicLeaks: Benchmark for evaluating secret-keeping in research agents

ServiceNow published a post on the Hugging Face Blog introducing MosaicLeaks, a benchmark that evaluates whether research agents keep sensitive information confidential while carrying out tasks. The work targets a specific safety and alignment concern for agentic systems: information leakage during multi-step research workflows. It adds to the growing body of work on agent safety and trustworthiness in enterprise contexts.