benchmark
LongBench-Write
benchmark · active · provisional
longbench-write-27bf8a37 · 1 event · first seen 8d ago
Aliases: LongBench-Write
Recent events (1)
IS-CoT framework addresses long-form generation collapse in LLMs via interleaved structural thinking
Researchers introduce IS-CoT (Interleaved Structural Chain-of-Thought), a framework that embeds a dynamic Plan-Write-Reflect cycle into LLM generation to overcome severe length collapse observed in reasoning-enhanced models for open-ended writing tasks beyond 2,000 words. The authors construct a multi-teacher training dataset of interleaved reasoning traces and train IS-Writer-8B, which achieves state-of-the-art results on LongBench-Write, outperforming DeepSeek-V3.2 by 3.08 points. The work identifies static hierarchical planning as a root cause of long-form degradation and proposes an in-model alternative to external agentic workflows.