Entity · benchmark

temporally grounded QA benchmark

benchmark · active · temporally-grounded-qa-benchmark-a4368b42 · 1 event · first seen May 22, 2026

Aliases: temporally grounded QA benchmark

Co-occurring entities

Kyutai · Common Crawl · temporally ordered pre-training · Kairos

More like this (12)

StrategyQA · Protocol QA · FreshQA · CORE benchmark · EQ-Bench · Time-MQA · Bias Benchmark for Question Answering · ResearchQA · Auto Benchmark Audit (ABA) · SimpleQA · TableQA · ChartQA

Recent events (1)

6 · arXiv · cs.AI · May 22, 2026

Temporally Ordered Pre-training Improves LLM Factual Freshness (Kairos)

Researchers from Kyutai pre-train 6B-parameter models on temporally ordered Common Crawl snapshots and compare them against standard shuffled pre-training baselines. They introduce a benchmark of over 7,000 temporally grounded questions to evaluate whether models correctly associate facts with their corresponding time periods. Results show sequentially trained models match shuffled baselines on general language understanding while exhibiting more up-to-date and temporally precise factual knowledge. Code, checkpoints, and datasets are released under the Kairos project.
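The contrast described above, temporally ordered versus shuffled training data and scoring answers per time period, can be illustrated with a minimal Python sketch. This is not the Kairos code: the `Snapshot` type, the snapshot labels, the function names, and the exact-match scoring rule are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    name: str      # hypothetical snapshot label, not a real Common Crawl ID
    crawled: date  # crawl date used to order the training stream


def temporal_order(snapshots):
    """Return snapshots oldest-first, as in sequential (temporally ordered) training."""
    return sorted(snapshots, key=lambda s: s.crawled)


def shuffled_order(snapshots, seed=0):
    """Return snapshots in a seeded random order, as in a shuffled baseline."""
    rng = random.Random(seed)
    out = list(snapshots)
    rng.shuffle(out)
    return out


def temporal_accuracy(predictions, gold):
    """Exact-match accuracy over (question_id, period) keys.

    Assumption: a question counts as correct only if the predicted answer
    matches the gold answer for that specific time period.
    """
    if not gold:
        return 0.0
    correct = sum(1 for key, answer in gold.items() if predictions.get(key) == answer)
    return correct / len(gold)


snapshots = [
    Snapshot("snap-c", date(2024, 1, 1)),
    Snapshot("snap-a", date(2020, 1, 1)),
    Snapshot("snap-b", date(2022, 1, 1)),
]
ordered = temporal_order(snapshots)

# The same question has different correct answers in different periods.
gold = {("q1", 2020): "A", ("q1", 2024): "B"}
predictions = {("q1", 2020): "A", ("q1", 2024): "A"}  # stale answer for 2024
score = temporal_accuracy(predictions, gold)
```

In this toy case the model gives the 2020 answer for both periods, so only one of the two period-specific questions is scored as correct.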

Training Infrastructure · Frontier Model Releases · Kyutai · Common Crawl · temporally ordered pre-training