OpenAI Releases New and Improved Embedding Model
OpenAI announced a new embedding model described as significantly more capable, cost-effective, and simpler to use than prior offerings. The announcement was published in December 2022 and represents an update to OpenAI's text embedding API surface. No specific benchmark numbers or architectural details are provided in the available body text.
Related events (8)
New embedding models and API updates from OpenAI
OpenAI announced new embedding models alongside API updates, expanding its developer-facing infrastructure offerings. The release likely includes updated text-embedding models with improved performance or cost characteristics. This is part of OpenAI's ongoing effort to maintain and grow its API platform for enterprise and developer use cases.
Introducing text and code embeddings
OpenAI launched a new embeddings endpoint in its API, enabling natural language and code tasks such as semantic search, clustering, topic modeling, and classification. The endpoint provides vector representations of text and code, making it easier for developers to build applications requiring semantic understanding. This was a significant early step in OpenAI's API product expansion beyond text generation.
OpenAI Releases Most Capable Open-Weights Models
OpenAI has released what it describes as its most capable open-weights models, framing the move as a major step toward broader AI accessibility. The announcement emphasizes openness, flexibility, and global reach as core motivations. This marks a significant shift in OpenAI's historically closed model distribution strategy.
Welcome EmbeddingGemma, Google's new efficient embedding model
Google has released EmbeddingGemma, a new embedding model announced via the Hugging Face blog. The model appears to be positioned as an efficient option for generating text embeddings, likely derived from or related to the Gemma model family. Details on architecture, benchmarks, and use cases are expected in the full post.
OpenAI Improves Fine-Tuning API and Expands Custom Models Program
OpenAI announced enhancements to its fine-tuning API giving developers greater control over the training process, alongside an expansion of its custom models program. The updates aim to provide more flexibility for enterprise and developer use cases requiring tailored model behavior. Specific new features include additional hyperparameter controls and tooling improvements, while the custom models program expansion opens new pathways for organizations to build bespoke models with OpenAI's assistance.
Mistral AI Releases Codestral Embed: First Code-Specialized Embedding Model
Mistral AI has launched Codestral Embed (codestral-embed-2505), its first embedding model specialized for code retrieval and semantic understanding. Mistral reports that the model outperforms leading competitors, including Voyage Code 3, Cohere Embed v4.0, and OpenAI's large embedding model, on benchmarks such as SWE-Bench, CodeSearchNet, and Text2SQL tasks. It supports variable output dimensions and precisions (including int8), enabling storage/quality trade-offs, and is priced at $0.15 per million tokens via Mistral's API with batch discounts available.
Qwen3 Embedding: State-of-the-Art Text Embedding and Reranking Models Released
Alibaba's Qwen team has released the Qwen3 Embedding series, a set of open-weights text embedding and reranking models built on the Qwen3 foundation model. The models are designed for retrieval and reranking tasks, and the team claims state-of-the-art performance across multiple benchmarks. They are released under the Apache 2.0 license and are available on Hugging Face and ModelScope.
Text and Code Embeddings by Contrastive Pre-training
OpenAI published research on generating text and code embeddings using contrastive pre-training. The approach trains models to produce dense vector representations useful for semantic search, classification, and code retrieval tasks. This work underpins OpenAI's embeddings API offerings and represents an early public articulation of its embedding methodology.
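
For readers unfamiliar with the term, the following is a minimal sketch of what a contrastive objective over embeddings can look like. It is not taken from the paper or from any OpenAI code: it assumes in-batch negatives, cosine similarity, and a symmetric cross-entropy loss, and the function names and the temperature value of 0.05 are hypothetical choices for illustration only.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_softmax_at(row, target):
    """Numerically stable log-softmax of `row`, evaluated at index `target`."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(v - m) for v in row))
    return row[target] - log_sum


def contrastive_loss(queries, documents, temperature=0.05):
    """Symmetric in-batch contrastive loss (illustrative, not the paper's code).

    queries[i] and documents[i] form a positive pair; every other document
    in the batch serves as a negative for query i, and vice versa.
    """
    q = [normalize(v) for v in queries]
    d = [normalize(v) for v in documents]
    n = len(q)
    # logits[i][j] is the scaled cosine similarity of query i and document j.
    logits = [[dot(q[i], d[j]) / temperature for j in range(n)] for i in range(n)]
    # Query-to-document direction: each row should peak on the diagonal.
    q_to_d = -sum(log_softmax_at(logits[i], i) for i in range(n)) / n
    # Document-to-query direction: same check over columns.
    columns = [list(col) for col in zip(*logits)]
    d_to_q = -sum(log_softmax_at(columns[j], j) for j in range(n)) / n
    return 0.5 * (q_to_d + d_to_q)


# Toy data: random vectors stand in for encoder outputs.
rng = random.Random(0)
docs = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned_queries = [[x + 0.01 * rng.gauss(0, 1) for x in doc] for doc in docs]
random_queries = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]

loss_aligned = contrastive_loss(aligned_queries, docs)
loss_random = contrastive_loss(random_queries, docs)
```

In this toy setup the loss is lower when each query sits close to its paired document than when the queries are unrelated random vectors, which is the pressure that pushes matching text or code pairs together in the embedding space during training.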


