IMPACTeen: Annotated dataset for social influence detection in adolescent communication contexts
IMPACTeen is a new Polish/English bilingual dataset of 1,021 social influence scenarios targeting adolescent communication contexts, with 5,100 annotation records from five distinct annotator perspectives (teenagers, parents, psychologists, communication experts, teachers). The dataset covers influence techniques, intentions, consequences, and resistance, and was constructed via constrained LLM generation followed by human editing. It is intended to support research on social influence detection, annotator disagreement modeling, cross-lingual NLP, and LLM training and evaluation.
Related events (8)
Annotated dataset for enthymeme detection in political tweets with disagreement-aware training
Researchers present a dataset of 1,482 politically controversial tweets annotated by five annotators for enthymemes (arguments with unstated premises or conclusions) and designed to study label variation rather than eliminate it. Annotation guidelines are grounded in Walton's argumentation schemes, and the paper includes a complexity analysis of cognitive load in the task. Preliminary experiments show that models trained on annotator disagreement outperform those trained on hard majority-vote labels, suggesting value in preserving annotation disagreement for subjective NLP tasks.
CATCH-ME dataset: multilingual multi-turn counterspeech against hate speech and misinformation for RAG systems
Researchers introduce CATCH-ME, a large-scale expert-curated multilingual dataset of multi-turn dialogues addressing the intersection of hate speech and misinformation across five languages and seven marginalized groups. The dataset is anchored in verified external knowledge (fact-checking articles and NGO reports) with document- and chunk-level span annotations, making it directly usable for RAG-based counterspeech systems. It addresses a gap in existing resources, which are limited to single-turn English dialogues, and is intended to improve the factual grounding and persuasiveness of LLM-generated counterspeech.
Persuasion Index: Theory-grounded taxonomy and open-source tool for analyzing rhetorical persuasion
Researchers introduce Persuasion Index (PI), a 15-dimension taxonomy of persuasive rhetorical cues grounded in psychology and communication theory, implemented via 55 sub-features using lexicons and rule-based detectors. PI is evaluated on four public datasets across domains and shown to provide interpretable, computationally lightweight predictive signal for persuasion-related outcomes. The framework is released as an open-source package and web interface, with stated applications including AI safety and detection of information manipulation.
Interaction SSD: Modeling Annotator Identity Effects on Hate Speech Semantic Gradients
This paper introduces Interaction SSD, an extension of Supervised Semantic Differential that tests how semantic meaning varies across moderating variables such as annotator group identity. Applied to the UC Berkeley Measuring Hate Speech corpus, the method detects that annotator racial identity significantly moderates hate-speech judgments, with a shared gradient distinguishing dehumanizing hostility from counter-speech and an interaction gradient revealing group-linked differences in predictive semantic cues. The approach makes moderated meaning-outcome relationships statistically testable and interpretable through standard SSD tooling.
CommunityFact: A Dynamic, Multilingual, Multi-domain Benchmark for Misinformation Detection in the Wild
CommunityFact is a refreshable benchmark for misinformation detection containing 15,992 standalone claims across five languages and two domains, designed to address limitations of static benchmarks. The authors evaluate ten LLMs under varying inference-time conditions including chain-of-thought reasoning and web-search augmentation, finding that web access yields the largest performance gains. A key finding is that web-enabled LLMs' source-selection policies are systematically misaligned with sources that human Community Notes raters converge on, a gap addressable through retrieval expansion or pruning. The benchmark also proposes using Community Notes as a training signal for claim-conditioned source suggesters.
AI-Mediated Communication Can Steer Collective Opinion via LLM Editing Biases
This paper demonstrates empirically that LLMs from multiple model families introduce directional biases when editing human-written texts on contested topics (e.g., nudging texts toward support for gun control and against atheism). The authors develop a mathematical opinion-dynamics model showing that these biases are amplified through social networks, shifting collective opinion at scale. An audit of X's 'Explain this post' feature finds evidence of pro-life bias in Grok's outputs on abortion content, traced to specific design choices. The paper concludes with implications for EU legislative efforts on AI-mediated communication.
LLM-Assisted Discovery of ADHD Signals in Turkish Teacher Narratives Beyond Rating Scales
This study analyzes de-identified Turkish teacher evaluation forms from clinical ADHD assessments, comparing predictive signals from structured rating scales (CTRS-R:S) and open-ended teacher narratives. The authors find that structured and narrative information encode complementary signals, with minimal overlap between cases missed by each modality. An LLM-assisted theme discovery pipeline reveals distinct attention, behavioral, and family-related patterns in narratives that structured scales miss, demonstrating NLP's potential to augment traditional ADHD screening.
ALMANAC dataset provides action-level mental model annotations for studying human-agent collaboration
Researchers introduce ALMANAC, a dataset of 2,987 collaboration actions drawn from the Map Task dyadic routing paradigm, each annotated with theory-informed mental model labels covering self-reasoning, perceived partner intent, and perceived team goal. The dataset targets a gap in LLM agent training data: current agents are optimized for task completion but lack process-level collaborative competence grounded in mental model alignment. Six LLMs are benchmarked on predicting human next-turn behavior and mental model states. The work provides a resource for evaluating and potentially training agents toward more human-like collaborative reasoning.
