Entity · dataset

Dynaword

dataset · active · dynaword-2343d64d · 1 event · first seen Jun 5, 2026

Aliases: Dynaword

Co-occurring entities

PropMe · SimpleTrace · DFM Decoder · Common Pile · infini-gram · Comma

More like this (12)

WordPiece · WordVoice · WordNet · Word2World · DevicesWorld · DAIC-WOZ · WWDC · Bag of Words (BoW) · WY Algorithm · Qwen Code · TDW-MAT · Word Coverage Score (WCS)

Recent events (1)

5 · arXiv · cs.CL · Jun 5, 2026

PropMe framework distinguishes memorization capability from propensity in LLMs

A new arXiv preprint introduces PropMe, a framework that separates whether LLMs can be forced to reproduce training data (capability) from whether they do so under ordinary use (propensity). The authors also release SimpleTrace, a lightweight pipeline using infini-gram to attribute model outputs to training corpora. Evaluating two open models on Common Pile and Dynaword, they find a consistent gap: adversarial prefix attacks elicit strong memorization, but propensity scores remain low in non-adversarial settings. The paper argues memorization audits should report both worst-case extractability and ordinary leakage propensity.
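The capability/propensity distinction in the summary above can be illustrated with a toy sketch. It is not the paper's method and does not use the infini-gram API: a plain in-memory set of word n-grams stands in for the corpus index, and the function names (`build_index`, `overlap_score`, `capability`, `propensity`), the choice of n = 3, the max/mean aggregation, and the example strings are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> List[NGram]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(corpus: Iterable[str], n: int) -> Set[NGram]:
    """Toy stand-in for a corpus index: the set of all word n-grams."""
    index: Set[NGram] = set()
    for doc in corpus:
        index.update(ngrams(doc.split(), n))
    return index


def overlap_score(output: str, index: Set[NGram], n: int) -> float:
    """Fraction of the output's n-grams that occur verbatim in the corpus."""
    grams = ngrams(output.split(), n)
    if not grams:
        return 0.0
    return sum(g in index for g in grams) / len(grams)


def capability(outputs: List[str], index: Set[NGram], n: int) -> float:
    """Assumed worst-case view: highest overlap over adversarial outputs."""
    return max((overlap_score(o, index, n) for o in outputs), default=0.0)


def propensity(outputs: List[str], index: Set[NGram], n: int) -> float:
    """Assumed ordinary-use view: mean overlap over non-adversarial outputs."""
    scores = [overlap_score(o, index, n) for o in outputs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: one training document and two sets of model outputs.
N = 3
index = build_index(["the quick brown fox jumps over the lazy dog"], N)
adversarial_outputs = ["the quick brown fox jumps over the lazy dog"]
ordinary_outputs = [
    "a fox is a small wild animal",
    "the quick brown fox is a common phrase",
]

cap = capability(adversarial_outputs, index, N)
prop = propensity(ordinary_outputs, index, N)
print(f"capability={cap:.2f} propensity={prop:.2f}")
```

In this toy setup the adversarial output reproduces the training document in full, while the ordinary outputs share only a few trigrams with it, so the two numbers diverge in the direction the summary describes. A real audit would replace the in-memory set with a corpus-scale index and use the paper's own definitions of both scores.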

Evaluation and Benchmarking · AI Safety Research · PropMe · SimpleTrace · Dynaword