technique
image segmentation
technique · active
image-segmentation-ac560f92 · 1 event · first seen 29d ago
Aliases: image segmentation
Recent events (1)
Semantic Generative Tuning (SGT) for Unified Multimodal Models
This paper introduces Semantic Generative Tuning (SGT), a post-training paradigm for unified multimodal models (UMMs) that bridges the gap between visual understanding and visual generation. The authors find that image segmentation tasks serve as optimal generative proxies, providing structural semantics that improve both perception and generative layout fidelity. SGT aligns representation spaces across understanding and generation objectives, improving feature linear separability and visual-textual attention allocation. Evaluations show consistent gains on multimodal comprehension and generative fidelity benchmarks.
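One way to picture segmentation as a generative proxy is to treat the segmentation map itself as an image the model must produce. The sketch below is an illustration under assumptions, not the SGT method: the summary does not say how masks are encoded, so the palette, `encode_mask`, and `decode_mask` are hypothetical names. It renders a class-label grid as RGB pixels and recovers the labels from a slightly perturbed output by nearest palette colour.

```python
# Hypothetical palette: class id -> RGB colour (an assumption, not from the paper).
PALETTE = {0: (0, 0, 0), 1: (255, 0, 0), 2: (0, 255, 0)}


def encode_mask(label_map):
    """Render a 2D grid of class ids as a 2D grid of RGB tuples."""
    return [[PALETTE[c] for c in row] for row in label_map]


def decode_mask(rgb_image):
    """Map each pixel back to the class whose palette colour is nearest."""
    def nearest(pixel):
        # Squared Euclidean distance in RGB space to each palette colour.
        return min(
            PALETTE,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, PALETTE[c])),
        )
    return [[nearest(px) for px in row] for row in rgb_image]


labels = [[0, 1, 1], [2, 2, 0]]
image = encode_mask(labels)

# Simulate an imperfect generated image by nudging the red channel.
noisy = [[(min(255, r + 10), g, b) for (r, g, b) in row] for row in image]
recovered = decode_mask(noisy)
```

Decoding by nearest colour keeps the round trip tolerant of small pixel errors, which matters when the "image" comes from a generative model rather than a lookup table.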