arXiv cs.CL (Computation and Language) · 25h ago

Dataset and analysis of scam trends and rail paths from Reddit self-disclosure narratives

Researchers build a dataset of 21,304 Reddit posts from scam-related subreddits to analyze yearly trends in scam types and multi-stage rail paths from 2023 to 2025. An LLM-assisted annotation method labels 1,800 posts for scam chain analysis, and a topic model examines community support behavior. The work is primarily a social science and NLP contribution to fraud detection research rather than an AI capability or infrastructure advance.