benchmark
temporally grounded QA benchmark
benchmark · active
temporally-grounded-qa-benchmark-a4368b42 · 1 event · first seen 26d ago
Aliases: temporally grounded QA benchmark
Recent events (1)
Temporally Ordered Pre-training Improves LLM Factual Freshness (Kairos)
Researchers from Kyutai pre-train 6B-parameter models on temporally ordered Common Crawl snapshots and compare them against standard shuffled pre-training baselines. They introduce a benchmark of over 7,000 temporally grounded questions to evaluate whether models correctly associate facts with their corresponding time periods. Results show sequentially trained models match shuffled baselines on general language understanding while exhibiting more up-to-date and temporally precise factual knowledge. Code, checkpoints, and datasets are released under the Kairos project.