other
Unified Multimodal Models (UMMs)
other · active
unified-multimodal-models-umms--61361e3f · 1 event · first seen 28d ago
Aliases: Unified Multimodal Models (UMMs)
More like this (12)
multimodal classification models
Multimodal Large Language Models
multimodal embedding
Multimodal Learning
multimodal agents
multimodal neurons
multimodal pretraining
Latent World Recovery for Multimodal Learning with Missing Modalities
embedding models
Multimodal Gain
MMMU
Multimodal Augmented Generation via Multimodal Retrieval Workshop
Recent events (1)
Semantic Generative Tuning (SGT) for Unified Multimodal Models
This paper introduces Semantic Generative Tuning (SGT), a post-training paradigm for unified multimodal models (UMMs) that bridges the gap between visual understanding and visual generation. The authors find that image segmentation tasks serve as optimal generative proxies, providing structural semantics that improve both perception and generative layout fidelity. SGT aligns representation spaces across understanding and generation objectives, improving feature linear separability and visual-textual attention allocation. Evaluations show consistent gains on multimodal comprehension and generative fidelity benchmarks.
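The summary does not say how feature linear separability is measured. A common proxy is a linear probe: freeze the features, fit a linear classifier on them, and report held-out accuracy. The sketch below illustrates that idea only, under stated assumptions. It is not the authors' evaluation code; the function names `linear_probe_accuracy` and `make_features` are hypothetical, and the features are synthetic Gaussians rather than outputs of any UMM.

```python
import numpy as np


def linear_probe_accuracy(features, labels, l2=1e-3):
    """Fit a ridge-regularized linear classifier on frozen features
    and return accuracy on a held-out half of the data."""
    n = features.shape[0]
    split = n // 2
    # Append a constant column so the probe can learn a bias term.
    x = np.hstack([features, np.ones((n, 1))])
    num_classes = int(labels.max()) + 1
    y = np.eye(num_classes)[labels]  # one-hot regression targets

    x_train, y_train = x[:split], y[:split]
    x_test, labels_test = x[split:], labels[split:]

    # Closed-form ridge regression: (X^T X + l2 * I) W = X^T Y.
    gram = x_train.T @ x_train + l2 * np.eye(x.shape[1])
    weights = np.linalg.solve(gram, x_train.T @ y_train)

    predictions = (x_test @ weights).argmax(axis=1)
    return float((predictions == labels_test).mean())


def make_features(separation, n=400, dim=16, seed=0):
    """Synthetic stand-in for frozen model features (an assumption):
    two Gaussian classes whose means differ by `separation` on one axis."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    centers = np.zeros((2, dim))
    centers[1, 0] = separation
    features = centers[labels] + rng.normal(size=(n, dim))
    return features, labels


if __name__ == "__main__":
    for separation in (0.0, 6.0):
        feats, labs = make_features(separation)
        acc = linear_probe_accuracy(feats, labs)
        print(f"separation={separation}: probe accuracy={acc:.3f}")
```

In practice the synthetic features would be replaced by activations extracted from the model before and after tuning, with the same probe and the same data split applied to both; higher probe accuracy then suggests more linearly separable features.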