T2I-CompBench

benchmark · active · provisional · t2i-compbench-d42ffa91 · 1 event · first seen 43h ago

Aliases: T2I-CompBench

Recent events (1)

4 · arXiv · cs.AI · 43h ago

IV-CoT: Implicit Visual Chain-of-Thought for Structure-Aware Text-to-Image Generation

Researchers propose Implicit Visual Chain-of-Thought (IV-CoT), a latent visual reasoning framework that decomposes visual conditioning queries into a structural-to-semantic cascade for text-to-image generation. Sketch supervision is used during training only to guide the structural queries, so no sketch extraction is required at inference time and the implicit CoT reasoning runs in a single forward pass. The method targets persistent weaknesses of multimodal LLMs in object counts, spatial relations, and attribute binding, and the authors report improved results on the GenEval and T2I-CompBench benchmarks.