technique
Dual Layer Aggregation (DLA)
technique · active · provisional
dual-layer-aggregation-dla--e4658f4a · 1 event · first seen 22d ago
Aliases: Dual Layer Aggregation (DLA)
Recent events (1)
Squeezing Capacity from MLLMs for Subject-driven Image Generation via Dual Layer Aggregation
This paper proposes conditioning diffusion models on Multimodal Large Language Models (MLLMs) that jointly encode text and reference images, augmented with VAE-based identity conditioning to address copy-paste artifacts and identity preservation failures in subject-driven image generation. A Dual Layer Aggregation (DLA) module aggregates multi-level MLLM features, and a multi-stage denoising strategy progressively balances semantic and fine-detail identity signals during inference. Experiments show improved human preference scores on subject-driven generation benchmarks compared to prior approaches that encode text and reference images separately.
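The summary does not say how DLA combines features or how the denoising stages are scheduled, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes that "aggregating multi-level MLLM features" can be approximated by a softmax-weighted sum of per-layer feature vectors, and that the "multi-stage" strategy can be approximated by a piecewise schedule that raises the weight of the VAE identity signal as denoising progresses. The names `aggregate_layers`, `stage_weights`, `layer_scores`, and `stages` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(layer_features, layer_scores):
    """Combine per-layer feature vectors into one vector.

    ASSUMPTION: a softmax-weighted sum stands in for the DLA module;
    the source does not specify the actual aggregation rule.
    """
    if len(layer_features) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, layer_features))
        for i in range(dim)
    ]


def stage_weights(step, total_steps, stages):
    """Return (semantic_weight, identity_weight) for a denoising step.

    ASSUMPTION: `stages` is a list of (start_fraction, identity_weight)
    pairs sorted by start_fraction, with the first pair starting at 0.0.
    The last stage whose start_fraction <= progress is the active one.
    """
    progress = step / total_steps
    identity = stages[0][1]
    for start, weight in stages:
        if progress >= start:
            identity = weight
    return 1.0 - identity, identity


# Two hypothetical MLLM layers with 2-dimensional features and equal scores.
fused = aggregate_layers([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# Hypothetical three-stage schedule: semantics dominate early, identity late.
schedule = [(0.0, 0.2), (0.4, 0.5), (0.8, 0.8)]
early = stage_weights(0, 10, schedule)
late = stage_weights(9, 10, schedule)
```

In a real system the scores would be learned parameters and the features would be token-level hidden states rather than short lists; the sketch only shows the shape of the computation.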