Almanac
DCLM

dataset · active · provisional · dclm-ee22e7fa · 1 event · first seen 12h ago

Aliases: DCLM

Recent events (1)

6 · arXiv · cs.CL · 12h ago

Scaling laws study finds LLM social simulation fidelity mostly improves with compute, with notable exceptions

A new arXiv preprint investigates whether scaling compute improves the fidelity of LLM-based social simulations across three domains: opinion modeling, behavioral simulation, and longitudinal forecasting. Using 85 Qwen3-architecture models trained under fixed-compute budgets from 10^18 to 10^20 FLOPs, plus 35 larger open-weight models of up to 70B parameters, the authors find strong scaling in most settings. However, longitudinal forecasting, underrepresented populations, and specific cognitive bias calibration tasks (e.g., risk aversion) scale poorly, and fine-tuning fails to close these gaps for models from 0.5B to 8B parameters. The work provides empirical grounding for where scaling will and will not suffice for social simulation research.