technique
3D-RoPE
technique · active · provisional
3d-rope-a283bbcd · 1 event · first seen 19d ago
Aliases: 3D-RoPE
Recent events (1)
VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion
VideoMLA applies Multi-Head Latent Attention (MLA) to causal video diffusion, replacing per-head keys and values with a shared low-rank content latent and a decoupled 3D-RoPE positional key, which yields a 92.7% reduction in per-token KV memory. The paper investigates why MLA works even though pretrained video attention is not low-rank, contrary to the spectral assumption that motivates MLA in LLMs, and finds that the MLA bottleneck itself, rather than the pretrained spectrum, determines the effective rank. On VBench, VideoMLA matches short-horizon baselines and achieves the best overall score at long horizons; it also delivers a 1.23x throughput improvement on a single NVIDIA B200 GPU.
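The summary does not say how 3D-RoPE is constructed, so the following is only a minimal sketch of one common formulation, not VideoMLA's implementation. It assumes the head dimension is split into three disjoint chunks, one each for the time, height, and width axes, and that standard 1D rotary embedding is applied to each chunk. The function names, the chunk sizes in `dims`, and the frequency base of 10000 are all illustrative assumptions.

```python
import math


def rope_1d(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `pos`.

    Assumption: standard RoPE frequencies base**(-2j/d) for pair index j.
    """
    d = len(vec)
    assert d % 2 == 0, "chunk length must be even"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # i = 2j, so this is base**(-2j/d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def rope_3d(vec, pos, dims):
    """Apply 1D RoPE per axis on disjoint chunks of `vec`.

    pos  = (t, h, w) grid coordinates of the token (hypothetical layout).
    dims = chunk sizes for (t, h, w); they must sum to len(vec).
    """
    assert sum(dims) == len(vec), "chunk sizes must cover the vector"
    out, start = [], 0
    for p, d in zip(pos, dims):
        out += rope_1d(vec[start:start + d], p)
        start += d
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy 12-dimensional query and key, 4 dimensions per axis.
dims = (4, 4, 4)
q = [0.1 * n for n in range(1, 13)]
k = [0.05 * n for n in range(12, 0, -1)]

# Same per-axis offset (1, 2, 3) at two different absolute positions.
score_a = dot(rope_3d(q, (1, 2, 3), dims), rope_3d(k, (0, 0, 0), dims))
score_b = dot(rope_3d(q, (4, 6, 8), dims), rope_3d(k, (3, 4, 5), dims))
```

In this sketch `score_a` and `score_b` agree up to floating-point error, because each rotated chunk contributes a term that depends only on the offset along its own axis. That relative-position property is what lets a positional key be kept separate from the content latent, though how VideoMLA sizes or shares that key is not stated in the summary above.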