paper
Automated reproducibility assessments in the social and behavioral sciences using large language models
paper · active · provisional
automated-reproducibility-assessments-in-the-social-and-behavioral-sciences-using-large-language-models-6323e3a8 · 1 event · first seen 5d ago
Aliases: Automated reproducibility assessments in the social and behavioral sciences using large language models
More like this (12)
Civil Court Simulation with Large Language Models
The Shibboleth Effect: Auditing the Cross-Lingual Distributional Skew of Large Language Models
Exploring Adversarial Robustness and Safety Alignment in Multilingual Multi-Modal Large Language Models
Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application
large language models
large language model agents
Multimodal Large Language Models
Reinforcement Learning for Language Models
multi-turn language models
Large Language Models (frontier)
8B autoregressive language model
Adaptive Multi-Resolution Procedural Knowledge Compression for Large Language Models
Recent events (1)
LLMs automate reproducibility assessments in social and behavioral sciences, outperforming human reanalysts
An arXiv preprint reports that an LLM pipeline can automate reproducibility assessments of published social and behavioral science studies. The pipeline recovered the original effect sizes in 41% of cases (vs. 34% for human reanalysts) and reached the same qualitative conclusion in 96% of cases (vs. 74% for humans). The evaluation covered 76 published studies with predefined claims. The results suggest that LLMs could serve as a scalable tool for systematically auditing empirical research, reducing the resources that traditional reproducibility efforts require.
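To make the two reported metrics concrete, the following is a minimal Python sketch of how an effect-size recovery rate and a qualitative-conclusion agreement rate could be computed over a set of reanalyses. The summary above does not state the preprint's actual criteria, so everything here is an assumption for illustration: the `Reanalysis` record, the 5% relative tolerance, the sign-plus-significance definition of "same conclusion", and the toy data are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Reanalysis:
    """One reproduced claim (hypothetical record layout, not from the preprint)."""
    original_effect: float        # effect size reported in the published study
    reproduced_effect: float      # effect size obtained by the reanalysis
    original_significant: bool    # did the original report a significant effect?
    reproduced_significant: bool  # did the reanalysis find a significant effect?


def effect_recovered(r: Reanalysis, rel_tol: float = 0.05) -> bool:
    """Assumed criterion: reproduced effect within rel_tol of the original."""
    scale = max(abs(r.original_effect), 1e-12)  # guard against division by zero
    return abs(r.reproduced_effect - r.original_effect) / scale <= rel_tol


def same_conclusion(r: Reanalysis) -> bool:
    """Assumed criterion: same effect direction and same significance verdict."""
    same_sign = (r.original_effect >= 0) == (r.reproduced_effect >= 0)
    return same_sign and r.original_significant == r.reproduced_significant


def summarize(results: list[Reanalysis], rel_tol: float = 0.05) -> dict[str, float]:
    """Share of reanalyses meeting each criterion."""
    if not results:
        raise ValueError("results must not be empty")
    n = len(results)
    return {
        "effect_recovery_rate": sum(effect_recovered(r, rel_tol) for r in results) / n,
        "conclusion_agreement_rate": sum(same_conclusion(r) for r in results) / n,
    }


# Toy data, invented purely to exercise the functions.
toy = [
    Reanalysis(0.30, 0.30, True, True),     # exact match
    Reanalysis(0.30, 0.25, True, True),     # effect differs, conclusion holds
    Reanalysis(0.10, 0.02, True, False),    # effect and conclusion both differ
    Reanalysis(-0.20, -0.205, True, True),  # within tolerance
]
summary = summarize(toy)
print(summary)
```

Under these assumed definitions, a reanalysis can agree with the original conclusion without recovering the effect size, which is one way the conclusion-agreement rate can sit well above the effect-recovery rate, as in the figures reported above.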