5·OpenAI Blog·1mo ago

Glow: Better reversible generative models

OpenAI introduces Glow, a reversible generative model using invertible 1x1 convolutions that extends prior work on normalizing flows. The model generates realistic high-resolution images, supports efficient sampling, and learns disentangled features for attribute manipulation. Code and an online visualization tool are released alongside the paper.

Multimodal Progress · Glow · invertible 1x1 convolutions · OpenAI · normalizing flows
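For readers who want to see what an invertible 1x1 convolution involves, here is a minimal NumPy sketch. It is an illustration written for this summary, not OpenAI's released code: the function names (`init_rotation`, `conv1x1_forward`, `conv1x1_inverse`) and the height-width-channel layout are assumptions. A 1x1 convolution multiplies every pixel's channel vector by the same square matrix, so it is reversible whenever that matrix is invertible, and its contribution to the log-likelihood is the log-determinant of the matrix counted once per pixel.

```python
import numpy as np


def init_rotation(channels, seed=0):
    """Hypothetical helper: random orthogonal matrix, so log|det W| starts at 0."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(channels, channels)))
    return q


def conv1x1_forward(x, weight):
    """Apply a 1x1 convolution to x of shape (height, width, channels).

    Returns the output and the log-determinant term
    height * width * log|det(weight)| used in the flow's log-likelihood.
    """
    height, width, _ = x.shape
    y = x @ weight.T  # same channel-mixing matrix at every pixel
    _, log_abs_det = np.linalg.slogdet(weight)
    return y, height * width * log_abs_det


def conv1x1_inverse(y, weight):
    """Undo conv1x1_forward by applying the inverse matrix at every pixel."""
    return y @ np.linalg.inv(weight).T


# Round trip on random data (shapes chosen only for illustration).
x = np.random.default_rng(1).normal(size=(4, 5, 3))
w = init_rotation(3)
y, logdet = conv1x1_forward(x, w)
x_rec = conv1x1_inverse(y, w)
```

In this sketch `x_rec` matches `x` up to floating-point error, which is the property that makes exact sampling and exact likelihood evaluation possible in a flow-based model.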

Related guides (2)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Multimodal Progress·Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Related events (8)

6·OpenAI Blog·1mo ago

Image GPT: Transformer Models Applied to Pixel Sequences for Image Generation and Classification

OpenAI demonstrates that a large transformer model trained autoregressively on pixel sequences can generate coherent image completions and samples, analogous to text generation. The work establishes a correlation between generative sample quality and downstream image classification accuracy. The best generative model achieves features competitive with top convolutional networks in the unsupervised setting, suggesting shared representational principles across modalities.

Frontier Model Releases · Multimodal Progress · Transformers · convolutional neural network · OpenAI · +2 more

8·OpenAI Blog·1mo ago

Introducing 4o Image Generation

OpenAI has integrated a native image generation capability directly into GPT-4o, positioning it as a primary model capability rather than a separate system. The announcement frames this as their most advanced image generator to date, emphasizing both aesthetic quality and practical utility. This represents a shift toward unified multimodal models that generate images natively rather than relying on separate diffusion-based pipelines.

Frontier Model Releases · Inference Economics · GPT-4o · GPT-4o Image Generation · OpenAI · +1 more

2·OpenAI Blog·1mo ago

OpenAI: Generative Models Overview (2016)

A 2016 OpenAI blog post describing four research projects centered on generative models as a branch of unsupervised learning. The post explains what generative models are, their importance, and potential future directions. This is an archival piece predating modern large language models and diffusion systems, representing early foundational work at OpenAI.

generative models · unsupervised learning · OpenAI

5·OpenAI Blog·1mo ago

Improved Techniques for Training Consistency Models

OpenAI presents improved training techniques for consistency models, a class of generative models capable of producing high-quality samples in a single step without adversarial training. The work advances a nascent alternative to diffusion-based generation that replaces multi-step sampling with single-step inference. Its publication on OpenAI's research blog suggests continued investment in efficient generative modeling.

Inference Economics · Multimodal Progress · Latent Consistency Models · OpenAI · Diffusion Models

7·OpenAI Blog·1mo ago

OpenAI Launches gpt-image-1 Image Generation Model via API

OpenAI has made its latest image generation model, gpt-image-1, available through its API for developers and businesses. The model is positioned for professional-grade, customizable visual generation integrated directly into third-party tools and platforms. This follows OpenAI's earlier consumer-facing image generation features and extends them to programmatic access.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · GPT-Image-1.5 · OpenAI API · OpenAI · +1 more

5·arXiv · cs.LG·18d ago

Review: Generative Models, Multimodal Learning, and Closed-Loop Workflows in Inverse Materials Design

This arXiv review surveys recent advances in generative modeling for inverse materials design, covering variational autoencoders, normalizing flows, autoregressive models, and diffusion models applied to crystalline solid discovery. It examines how multimodal learning fuses crystal structures, thermodynamic data, spectroscopy, microscopy, and scientific text into transferable chemical-space representations. The paper also reviews closed-loop design pipelines integrating conditional generation with Bayesian optimization, reinforcement learning, and active learning, and identifies recurring failure modes including surrogate exploitation, diversity collapse, and the stability-synthesizability gap.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Bayesian Optimization · Multimodal Learning · active learning · +6 more

5·arXiv · cs.LG·24d ago

Representation-Conditioned Diffusion Models for Controllable Image Generation

This paper explores conditioning diffusion models on representations from pre-trained self-supervised models as an alternative to text prompts or semantic maps, which require large annotated datasets. The self-conditioning mechanism improves unconditional image generation quality and provides a controllable representation space. The authors identify directions of variation in this space and demonstrate smoothness and disentanglement properties, suggesting potential for fine-grained generative control without heavy annotation overhead.

Frontier Model Releases · Multimodal Progress · Representation-Conditioned Diffusion Models · Self-Supervised Learning · Disentangled Representation Learning · +1 more

7·OpenAI Blog·1mo ago

Introducing ChatGPT Images 2.0

OpenAI has launched ChatGPT Images 2.0, a new image generation model integrated into ChatGPT. The release highlights improved text rendering, multilingual support, and advanced visual reasoning capabilities. This represents an upgrade to OpenAI's consumer-facing image generation offering.

Frontier Model Releases · Multimodal Progress · ChatGPT · ChatGPT Images 2.0 · OpenAI