GPT-3
gpt-3-d8bbf930 · 14 events · first seen 28d ago
Aliases: GPT-3
Recent events (14)
OpenAI licenses GPT-3 technology to Microsoft
OpenAI has agreed to license GPT-3 to Microsoft for use in Microsoft's own products and services. This is an early and significant commercial partnership between the two organizations, predating Microsoft's broader Azure OpenAI Service. The deal is one of the first major exclusive licensing arrangements for a large language model.
Language models are few-shot learners
OpenAI published the GPT-3 paper introducing a 175-billion-parameter autoregressive language model demonstrating strong few-shot learning capabilities across a wide range of NLP tasks. The work showed that scaling language models dramatically improves task-agnostic, few-shot performance, sometimes reaching competitiveness with prior state-of-the-art fine-tuned models without any gradient updates: the task is specified purely through a text prompt containing a few demonstrations. This paper became a foundational milestone in the development of large language models and the modern AI landscape.
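To make the few-shot setting concrete, here is a minimal sketch of how a prompt with a handful of demonstrations might be assembled. The function name `build_few_shot_prompt`, the `=>` separator, and the example pairs are illustrative assumptions, not a format prescribed by the summary above.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a few-shot prompt: task description, K demonstrations, then the query.

    The model is expected to continue the text after the final separator.
    No weights are updated; the demonstrations live only in the prompt.
    """
    lines = [task_description, ""]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    # The query is left open-ended so the model completes the target side.
    lines.append(f"{query} =>")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Translate English to French:",
        [("cheese", "fromage"), ("house", "maison")],
        "sea otter",
    )
    print(prompt)
```

The resulting string would be sent to the model as-is; changing the number of demonstrations moves between the zero-shot, one-shot, and few-shot settings.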
Customizing GPT-3 for your application
OpenAI announced fine-tuning capabilities for GPT-3, enabling developers to customize the model for specific applications via a single command. This feature allows users to adapt GPT-3's behavior to their use case by training on domain-specific data. The announcement marks an early milestone in making large language model customization accessible through an API.
GPT-3 Powers Over 300 Applications via OpenAI API
OpenAI reports that more than 300 applications are now using GPT-3 through its API to deliver search, conversation, text completion, and other AI features. The announcement highlights the growing commercial ecosystem built on top of GPT-3 as of early 2021. This represents an early milestone in API-based AI deployment at scale.
New GPT-3 capabilities: Edit & insert
OpenAI released updated versions of GPT-3 and Codex that support editing and inserting content into existing text, expanding beyond the original completion-only paradigm. These new capabilities allow the models to make targeted modifications to text rather than only appending to it. The release represents an incremental but meaningful expansion of the GPT-3 API surface.
WebGPT: Improving the factual accuracy of language models through web browsing
OpenAI fine-tuned GPT-3 to answer open-ended questions more accurately by giving it access to a text-based web browser. The system, called WebGPT, uses reinforcement learning from human feedback to learn to search the web, read pages, and cite sources. This work is an early demonstration of retrieval-augmented generation and tool use in large language models.
Scaling Kubernetes to 7,500 Nodes
OpenAI describes scaling Kubernetes clusters to 7,500 nodes to support large-scale AI training workloads including GPT-3, CLIP, and DALL·E. The post details infrastructure challenges and solutions enabling both massive model training and rapid small-scale research iteration. This represents a significant engineering milestone in ML training infrastructure at the time of publication (January 2021).
Aligning language models to follow instructions
OpenAI published a blog post describing their work on aligning language models to follow human instructions, corresponding to the InstructGPT research. This work applied reinforcement learning from human feedback (RLHF), a technique developed in earlier research, as a core method for training models to be more helpful, honest, and aligned with user intent. The approach demonstrated that a 1.3-billion-parameter instruction-tuned model could be preferred by human evaluators over the 175-billion-parameter GPT-3 base model, marking a foundational shift in how language models are trained and deployed.
OpenAI Trains System Solving Grade School Math Problems at ~55% Accuracy
OpenAI released a system for solving grade school math word problems that achieves roughly twice the accuracy of a fine-tuned GPT-3 model. The system scored 55% on a sample test on which children aged 9 to 12 scored 60%, suggesting performance close to that of children in that age range on elementary math. This work represents an early milestone in neural network mathematical reasoning capabilities.
Three Years from GPT-3 to Gemini 3
A commentary piece from One Useful Thing reflecting on the three-year arc from GPT-3 to the anticipated Gemini 3, framing the trajectory as a shift from chatbots to agents. The piece appears to offer a retrospective and forward-looking analysis of the AI landscape's evolution. As a tier-2 commentary source, it likely synthesizes trends rather than reporting new technical developments.
Evaluating Large Language Models Trained on Code
OpenAI published research on evaluating large language models trained on code, introducing the Codex model and the HumanEval benchmark for assessing code generation capabilities. The work established foundational methodology for measuring functional correctness of code produced by LLMs using a pass@k metric. This paper became a landmark reference for code-focused LLM evaluation and influenced subsequent code generation research across the field.
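The pass@k metric can be sketched as follows. This assumes the unbiased estimator commonly associated with the Codex paper: draw n samples per problem, count the c samples that pass the unit tests, and estimate pass@k as 1 - C(n-c, k) / C(n, k), averaged over problems. The function names and the sample counts are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples generated for the problem
    c: number of those samples that pass all unit tests
    k: number of samples the metric is allowed to draw (1 <= k <= n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets made up only of failing samples;
    # it is 0 when fewer than k samples fail, giving an estimate of 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts: 3 problems, 10 samples each.
    print(mean_pass_at_k([(10, 0), (10, 5), (10, 10)], k=1))
```

Computing the estimate from n > k samples, rather than drawing exactly k, reduces the variance of the reported score.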
SetFit: Efficient Few-Shot Learning Without Prompts
SetFit is a framework for few-shot text classification that fine-tunes Sentence Transformers on small labeled datasets without requiring prompts or large language models. The approach generates contrastive sentence pairs from a few examples, fine-tunes a dense embedding model on those pairs, and then trains a lightweight classifier head on the resulting embeddings. It achieves accuracy competitive with GPT-3-scale models while using far fewer parameters and only a handful of labeled examples.
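The pair-generation step described above can be sketched in plain Python. This is a simplified illustration, not the SetFit library's implementation: it enumerates every pair, whereas a real pipeline would likely sample them, and the function name and example sentences are hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Turn a few labeled sentences into (text_a, text_b, similarity) triples.

    examples: list of (text, label) tuples.
    Same-label pairs get target similarity 1.0 (positives);
    different-label pairs get 0.0 (negatives).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


if __name__ == "__main__":
    few_shot = [
        ("great film, loved it", "positive"),
        ("a delightful surprise", "positive"),
        ("dull and far too long", "negative"),
        ("I want my money back", "negative"),
    ]
    for text_a, text_b, target in make_contrastive_pairs(few_shot):
        print(target, "|", text_a, "|", text_b)
```

Because the number of pairs grows quadratically with the number of examples, even a few labeled sentences per class yield a usable training set for the embedding model.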
CLIP: Connecting Text and Images
OpenAI introduced CLIP (Contrastive Language-Image Pre-training), a neural network that learns visual concepts from natural language supervision. CLIP enables zero-shot visual classification by accepting natural language descriptions of categories rather than requiring task-specific training data. The approach mirrors the zero-shot transfer capabilities demonstrated by GPT-2 and GPT-3 in the language domain.
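A rough sketch of zero-shot classification in this style: embed the image and each category description in a shared space, then pick the description most similar to the image. The vectors below are hand-written toy placeholders standing in for real encoder outputs, and the function names and fixed `temperature` value are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Return a probability per label via softmax over scaled cosine similarities.

    text_embeddings: dict mapping a natural-language label to its embedding.
    """
    logits = {
        label: cosine(image_embedding, emb) / temperature
        for label, emb in text_embeddings.items()
    }
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(z - max_logit) for label, z in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


if __name__ == "__main__":
    # Toy embeddings (placeholders, not produced by any real model).
    image = [0.9, 0.1, 0.0]
    labels = {
        "a photo of a dog": [1.0, 0.0, 0.0],
        "a photo of a cat": [0.0, 1.0, 0.0],
    }
    probs = zero_shot_classify(image, labels)
    print(max(probs, key=probs.get))
```

Adding a new category only requires writing a new text description; no task-specific training data or retraining is involved.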
Anthropic publishes foundational 'Core Views on AI Safety' position paper
Anthropic released a detailed position paper outlining their core views on AI safety, arguing that transformative AI could arrive within a decade driven by predictable scaling laws, and that no one currently knows how to train powerful AI systems to robustly behave well. The document explains Anthropic's founding rationale and research strategy, highlighting four priority areas: scaling supervision, mechanistic interpretability, process-oriented learning, and understanding AI generalization. Originally published March 2023, this represents Anthropic's canonical public statement of their safety philosophy and strategic priorities.