Almanac
technique

TunerDiT

technique · active · provisional · tunerdit-2da23f43 · 1 event · first seen 16d ago

Aliases: TunerDiT

Recent events (1)

5 · arXiv · cs.AI · 16d ago

TunerDiT: Training-free Progressive Steering of Diffusion Transformers for Multi-Event Video Generation

TunerDiT is a training-free method for steering video diffusion transformers (DiTs) to generate long-horizon videos that contain multiple sequential events. The approach identifies intrinsic turning points in the DiT denoising trajectory, where text conditioning shifts from shaping global layout to shaping fine-grained detail, and then applies two steering mechanisms: Event-Partitioned Masking and Cross-Event Prompt Fusion. The authors also introduce Meve, a benchmark prompt suite for multi-event video generation. They report state-of-the-art results across 8 metrics, with text-alignment improvements that scale with the number of events.
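The summary names Event-Partitioned Masking without describing how it is built. One plausible reading is a cross-attention mask that lets each temporal segment of the video attend only to the text tokens of its own event prompt. The sketch below illustrates that reading only; it is not taken from the paper. The function name `event_partitioned_mask`, the equal-length contiguous split of frames across events, and the concatenated prompt layout are all assumptions made for illustration.

```python
def event_partitioned_mask(num_frames, prompt_lengths):
    """Return mask[f][t] == True when frame f may attend to text token t.

    Hypothetical illustration: prompt_lengths[i] is the token count of the
    i-th event prompt, and all prompts are concatenated in event order.
    """
    num_events = len(prompt_lengths)
    if num_events == 0 or num_frames < num_events:
        raise ValueError("need at least one event and one frame per event")

    # Start offset of each event prompt inside the concatenated text sequence.
    starts = [0]
    for n in prompt_lengths:
        starts.append(starts[-1] + n)
    total_tokens = starts[-1]

    mask = []
    for f in range(num_frames):
        # Assumption: frames are split into equal contiguous segments per event.
        e = min(f * num_events // num_frames, num_events - 1)
        mask.append([starts[e] <= t < starts[e + 1] for t in range(total_tokens)])
    return mask


# Two events (2 and 3 prompt tokens) spread over 6 frames.
mask = event_partitioned_mask(num_frames=6, prompt_lengths=[2, 3])
for row in mask:
    print("".join("1" if v else "0" for v in row))
```

In a real DiT the mask would be applied per video token rather than per frame, and, following the summary, only at the denoising steps selected around the identified turning points. Which steps those are, and how Cross-Event Prompt Fusion relaxes the hard partition, is not specified here.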