Channel-wise Vector Quantization

technique · active · provisional · channel-wise-vector-quantization-f821f509 · 1 event · first seen 22d ago

Aliases: Channel-wise Vector Quantization

Recent events (1)

6 · arXiv · cs.AI · 22d ago

Channel-wise Vector Quantization (CVQ): A New Image Tokenization Paradigm with Next-Channel Prediction

Researchers introduce Channel-wise Vector Quantization (CVQ), which replaces conventional patch-wise discrete tokens with channel-wise tokens that represent an image as discrete levels of visual detail. Built on CVQ, the Channel-wise Autoregressive (CAR) model uses a 'next-channel prediction' objective, generating images by progressively refining from global structure to fine-grained attributes. CVQ achieves 100% codebook utilization with a codebook of 16K+ entries, and the CAR model scores 86.7 on DPG and 0.79 on GenEval for text-to-image generation. The approach offers a structural alternative to raster-order, patch-based autoregressive image generation.
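The difference between patch-wise and channel-wise tokens can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a latent stored as C channels of H*W values, a nearest-neighbour lookup under squared L2 distance, and random codebooks. The names `nearest_code`, `channel_wise_tokens`, `patch_wise_tokens`, `codebook_utilization`, and `generate_channel_tokens` are all hypothetical, and the summary does not say how CVQ actually encodes a channel or trains its codebook.

```python
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def channel_wise_tokens(latent, codebook):
    """Assumed channel-wise scheme: one token per channel.

    latent is a list of C channels, each a flat list of H*W values,
    so every quantized vector has length H*W.
    """
    return [nearest_code(channel, codebook) for channel in latent]


def patch_wise_tokens(latent, codebook):
    """Conventional scheme: one token per spatial position.

    Transposing the latent yields H*W vectors of length C.
    """
    return [nearest_code(list(position), codebook) for position in zip(*latent)]


def codebook_utilization(tokens, codebook_size):
    """Fraction of codebook entries that appear at least once in tokens."""
    return len(set(tokens)) / codebook_size


def generate_channel_tokens(predict_next, num_channels):
    """Next-channel prediction loop: each token sees only earlier channels.

    predict_next is a hypothetical callable standing in for a trained model.
    """
    tokens = []
    for _ in range(num_channels):
        tokens.append(predict_next(tokens))
    return tokens


# Toy data (assumed sizes): 8 channels, a 2x2 spatial grid, 16 codebook entries.
rng = random.Random(0)
C, HW, K = 8, 4, 16
latent = [[rng.gauss(0, 1) for _ in range(HW)] for _ in range(C)]
channel_codebook = [[rng.gauss(0, 1) for _ in range(HW)] for _ in range(K)]
patch_codebook = [[rng.gauss(0, 1) for _ in range(C)] for _ in range(K)]

cvq_tokens = channel_wise_tokens(latent, channel_codebook)  # C tokens
patch_tokens = patch_wise_tokens(latent, patch_codebook)    # H*W tokens
usage = codebook_utilization(cvq_tokens, K)
```

In this sketch the channel-wise sequence length depends on the number of channels rather than on spatial resolution, and `generate_channel_tokens` produces tokens in channel order instead of raster order. A utilization of 1.0 would correspond to the 100% figure quoted above, although a single toy latent with 8 channels cannot reach it for a 16-entry codebook.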