Semantic Generative Tuning (SGT)

technique · active · semantic-generative-tuning-sgt--c3580094 · 1 event · first seen 28d ago

Aliases: Semantic Generative Tuning (SGT)

Recent events (1)

6 · arXiv · cs.AI · 28d ago

Semantic Generative Tuning (SGT) for Unified Multimodal Models

This paper introduces Semantic Generative Tuning (SGT), a post-training paradigm for unified multimodal models (UMMs) that aims to bridge the gap between visual understanding and visual generation. The authors find that image segmentation tasks serve as the most effective generative proxies: they supply structural semantics that improve both perception and the layout fidelity of generated images. SGT aligns the representation spaces of the understanding and generation objectives, which improves the linear separability of features and the allocation of attention between visual and textual inputs. Evaluations show consistent gains on multimodal comprehension and generative fidelity benchmarks.
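
The summary mentions "feature linear separability" without saying how it is measured. One common way to quantify it is a linear probe: fit a purely linear classifier on frozen features and report its accuracy. The sketch below is a generic illustration of that idea, not the paper's evaluation protocol; the function name `linear_probe_accuracy` and the toy feature arrays are hypothetical, and a least-squares fit stands in for whatever probe the authors may have used.

```python
import numpy as np


def linear_probe_accuracy(features, labels):
    """Fit a least-squares linear classifier on fixed features and
    return its accuracy on the same data (a rough separability score)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels).ravel()
    # Map arbitrary labels to class indices 0..K-1.
    classes, class_idx = np.unique(labels, return_inverse=True)
    # Append a constant column so the probe can learn a bias term.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    # One-hot targets, one column per class.
    Y = np.eye(len(classes))[class_idx]
    # Closed-form least-squares fit of the linear map X -> Y.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    pred = np.argmax(X @ W, axis=1)
    return float(np.mean(pred == class_idx))


# Toy (made-up) features: two clusters split along the first axis.
separable_feats = np.array(
    [[-1, 0], [-1, 1], [-2, 0], [1, 0], [1, 1], [2, 0]]
)
separable_labels = np.array([0, 0, 0, 1, 1, 1])

# Toy (made-up) XOR layout: no single hyperplane separates the classes.
xor_feats = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
xor_labels = np.array([0, 0, 1, 1])

acc_separable = linear_probe_accuracy(separable_feats, separable_labels)
acc_xor = linear_probe_accuracy(xor_feats, xor_labels)
print("separable:", acc_separable)
print("xor:", acc_xor)
```

Under this reading, a higher probe accuracy on features taken from a tuned model than on features from the untuned model would indicate improved linear separability. In practice the probe would be scored on held-out data instead of the training set.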