Entity · benchmark

LongBench-Write

benchmark · active · longbench-write-27bf8a37 · 1 event · first seen Jun 9, 2026

Aliases: LongBench-Write

Co-occurring entities

More like this (12)

LongBench-Pro, LongBench v2, FoldBench, LiveBench, SorryBench, SelectBench, Int-Bench, LiTBench, LabBench, TriggerBench, DeliveryBench, Benchling

Recent events (1)

5 · arXiv · cs.CL · Jun 9, 2026

IS-CoT framework addresses long-form generation collapse in LLMs via interleaved structural thinking

Researchers introduce IS-CoT (Interleaved Structural Chain-of-Thought), a framework that embeds a dynamic Plan-Write-Reflect cycle into LLM generation to overcome severe length collapse observed in reasoning-enhanced models for open-ended writing tasks beyond 2,000 words. The authors construct a multi-teacher training dataset of interleaved reasoning traces and train IS-Writer-8B, which achieves state-of-the-art results on LongBench-Write, outperforming DeepSeek-V3.2 by 3.08 points. The work identifies static hierarchical planning as a root cause of long-form degradation and proposes an in-model alternative to external agentic workflows.

Long Context Evolution · Evaluation and Benchmarking · DeepSeek V4 · LongBench-Write · IS-Writer-8B