Entity · technique

3D-RoPE

technique · active · 3d-rope-a283bbcd · 1 event · first seen May 29, 2026

Aliases: 3D-RoPE

Co-occurring entities

NVIDIA B200 · KV Cache · VideoMLA · VBench · Multi-head Latent Attention (MLA)

More like this (12)

2D-RoPE · RoPE · ST-RoPE · Möbius RoPE · Rotary Position Embedding (RoPE) · 3D-Fit · MPI3D · CoRP · ReaORE · ROCm · online 3D reconstruction · OpenRLHF

Recent events (1)

arXiv · cs.AI · May 29, 2026

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

VideoMLA applies Multi-Head Latent Attention (MLA) to causal video diffusion, replacing per-head keys and values with a shared low-rank content latent and a decoupled 3D-RoPE positional key, which yields a 92.7% reduction in per-token KV memory. The paper investigates why MLA works even though pretrained video attention is not low-rank (unlike the spectral assumption that motivates MLA in LLMs), and finds that the MLA bottleneck itself, rather than the pretrained spectrum, determines the effective rank. On VBench, VideoMLA matches the baselines at short horizons, achieves the best overall score at long horizons, and delivers a 1.23x throughput improvement on a single NVIDIA B200 GPU.
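To make the "3D-RoPE positional key" concrete, here is a minimal NumPy sketch of one common way to extend rotary embeddings to video tokens: the channel dimension is split into three chunks, and each chunk is rotated by the token's frame, row, or column index. This is an illustration under assumptions, not the VideoMLA formulation: the function names `rope_1d` and `rope_3d`, the `split` sizes, and the frequency base of 10000 are hypothetical choices that do not come from the summary above.

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Rotate consecutive channel pairs of x by angles pos * freq.

    x:   (n, d) array with d even
    pos: (n,) positions along a single axis
    """
    n, d = x.shape
    assert d % 2 == 0, "channel count must be even"
    freqs = base ** (-np.arange(0, d, 2) / d)   # (d/2,) one frequency per pair
    angles = pos[:, None] * freqs[None, :]      # (n, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    even, odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x, dtype=float)
    out[:, 0::2] = even * cos - odd * sin
    out[:, 1::2] = even * sin + odd * cos
    return out

def rope_3d(x, coords, split):
    """Apply a separate 1D rotation per axis (time, height, width).

    x:      (n, d) keys or queries
    coords: (n, 3) positions as (t, h, w)
    split:  three even channel counts summing to d (an assumed layout)
    """
    assert sum(split) == x.shape[1]
    chunks, start = [], 0
    for axis, width in enumerate(split):
        chunk = x[:, start:start + width]
        chunks.append(rope_1d(chunk, coords[:, axis]))
        start += width
    return np.concatenate(chunks, axis=1)

# Attention scores should depend only on relative (t, h, w) offsets.
rng = np.random.default_rng(0)
q = rng.standard_normal((1, 16))
k = rng.standard_normal((1, 16))
split = (4, 6, 6)                       # hypothetical channel allocation
q_pos = np.array([[3, 5, 2]])
k_pos = np.array([[1, 4, 7]])
shift = np.array([[10, 20, 30]])        # same shift applied to both tokens

score = (rope_3d(q, q_pos, split) @ rope_3d(k, k_pos, split).T).item()
shifted = (rope_3d(q, q_pos + shift, split)
           @ rope_3d(k, k_pos + shift, split).T).item()
print(np.isclose(score, shifted))
```

The final check compares the query-key score before and after shifting both tokens by the same offset; the two values should agree up to floating-point error, which is the relative-position property that makes rotary embeddings attractive for attention.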

Training Infrastructure Long Context Evolution NVIDIA B200 KV Cache 3D-RoPE