Inter4K

benchmark · active · inter4k-738d0668 · 1 event · first seen 1mo ago

Aliases: Inter4K

Recent events (1)

5 · arXiv · cs.LG · 1mo ago

RefDecoder: Reference-Conditioned Video VAE Decoder for Enhanced Visual Generation

RefDecoder addresses an architectural asymmetry in latent diffusion models: denoising networks are heavily conditioned, but decoders remain unconditional, which causes detail loss and inconsistency. The approach injects high-fidelity reference image signals into the VAE decoding process via reference attention, with a lightweight image encoder mapping reference frames into high-dimensional tokens that are co-processed at each decoder up-sampling stage. Evaluated on the Inter4K, WebVid, and Large Motion benchmarks, RefDecoder achieves PSNR gains of up to 2.1 dB over unconditional baselines and improves VBench I2V scores across subject consistency, background consistency, and overall quality. The module is plug-and-play and compatible with existing video generation systems, including Wan 2.1 and VideoVAE+, without additional fine-tuning.
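
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard PSNR definition, 10 · log10(peak² / MSE), in plain Python. It is not the paper's evaluation code: the function names `psnr` and `mse_ratio_for_gain`, the 8-bit peak value of 255, and the toy pixel values are all assumptions made for illustration, and the summary does not state how the paper averages over frames or videos.

```python
import math

def psnr(reference, reconstruction, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    `peak` is the maximum possible pixel value (255 assumes 8-bit data).
    """
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)

# Toy 4-pixel example with made-up values (not benchmark data).
target = [10, 20, 30, 40]
baseline = [12, 18, 33, 37]   # hypothetical unconditional decoder output
improved = [11, 19, 32, 38]   # hypothetical reference-conditioned output

print(psnr(target, baseline), psnr(target, improved))
print(mse_ratio_for_gain(2.1))
```

Because PSNR is logarithmic in the mean squared error, a gain of 2.1 dB corresponds to the MSE dropping to roughly 62% of its baseline value, assuming the same peak value on both sides.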