Sparse-structure Multimodal Diffusion Transformer
technique · active · provisional
sparse-structure-multimodal-diffusion-transformer-6e50d00b · 1 event · first seen 15h ago
Aliases: Sparse-structure Multimodal Diffusion Transformer
More like this (12)
Sparse Transformer · Conditional Diffusion Transformer · Linear Diffusion Transformer · flow-matching diffusion transformer · Dynamic Short Convolutions Improve Transformers · Variable-Width Transformers · Stable Diffusion Turbo · transformer architecture · Multimodal Gain · Representation-Conditioned Diffusion Models · text-to-image diffusion model · Diffusion Language Models
Recent events (1)
FLUX3D: Diffusion-aligned sparse representation for high-fidelity image-to-3D Gaussian Splatting
Researchers introduce FLUX3D, an image-to-3D Gaussian Splatting framework that addresses two structural bottlenecks in sparse voxel-based 3D generation: a representation bottleneck from discriminative 2D features, and a cross-modal correspondence bottleneck in diffusion transformers. To improve 2D-3D alignment, the system combines Diffusion-Aligned Structured Latents (DA-SLAT) with a Sparse-structure Multimodal Diffusion Transformer (SMDiT) that uses Modal-Aware Rotary Positional Embedding (MARoPE). The authors claim substantial benchmark improvements in appearance fidelity over all current state-of-the-art methods for 3DGS asset generation.