Entity · model

GPT-2

model · active · gpt-2-2a27d8b7 · 20 events · first seen May 19, 2026

Aliases: GPT-2

More like this (12)

GPT-3 · GPT-1 · GPT-4 · GPT-2 124M · GPT-5.2 · GPT-2 355M · GPT-4.1 · GPT-2-small · GPT · GPTs · GPT-4V · GPT-4o

Recent events (20)

4 · arXiv · cs.CL · Jul 23, 2026

GPT-2 models generalize to unlike coordination without direct training exposure

Researchers use Filtered-Corpus Training (FiCT) to train GPT-2 models on corpora from which every instance of unlike coordination has been removed. The models still generalize to unlike coordination, with perplexity and grammaticality judgments comparable to those of models trained on the unfiltered corpus. Internal representation analyses suggest the models handle unlike coordination either by treating the conjoined elements as structurally similar or via a deletion-like mechanism, both of which are learnable from alike coordination alone. The work contributes to debates in theoretical linguistics about coordination and probes how language models acquire and represent grammatical structures outside their direct training distribution.

Evaluation and Benchmarking GPT-2 Filtered-Corpus Training Exposure is Optional: Learning Unlike Coordination in Language Models

5 · arXiv · cs.CL · Jul 21, 2026

Mobius Learning: cyclic depth folding enables depth-role superposition in Transformers

Researchers introduce Mobius Learning, a training architecture where different data streams follow cyclically shifted block orders in a Transformer, forcing each block group to be optimized in both shallow and deep representational roles — a property they call depth-role superposition. Experiments with a modified GPT-2 small (124M) trained on 2.5B FineWeb tokens show lower validation loss than a fixed-order looped Transformer at larger numbers of block-sequence passes. The architecture is also naturally suited to memory-constrained distributed training, as each worker stores only one block group rather than the full model stack.

Training Infrastructure FineWeb GPT-2 Mobius Learning +1 more
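
As a rough illustration of the cyclically shifted block orders described in the summary above, the sketch below lists the order in which each data stream would visit the block groups and checks that every group ends up at every depth. The function names `cyclic_block_orders` and `depth_roles`, and the rule of one shift per stream, are assumptions made for illustration rather than details taken from the paper.

```python
def cyclic_block_orders(num_groups: int) -> list[list[int]]:
    """Return one block-group order per stream, each a cyclic shift of 0..num_groups-1."""
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    return [
        [(shift + position) % num_groups for position in range(num_groups)]
        for shift in range(num_groups)
    ]


def depth_roles(orders: list[list[int]]) -> dict[int, set[int]]:
    """Map each block group to the set of depth positions it occupies across streams."""
    roles: dict[int, set[int]] = {}
    for order in orders:
        for depth, group in enumerate(order):
            roles.setdefault(group, set()).add(depth)
    return roles


orders = cyclic_block_orders(3)
for stream, order in enumerate(orders):
    print(f"stream {stream}: {order}")

# Every group appears at every depth, which is the toy analogue of
# a block group serving in both shallow and deep roles.
print(depth_roles(orders))
```

With three groups, stream 0 visits them as 0, 1, 2, stream 1 as 1, 2, 0, and stream 2 as 2, 0, 1, so each group occupies the first, middle, and last position exactly once.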

6 · arXiv · cs.LG · Jul 15, 2026

Information-theoretic framework establishes tight sample complexity laws for watermark forensics in generative models

A new arXiv preprint develops an information-theoretic framework for watermark forensics in generative model outputs, organizing detection, attribution, payload extraction, and localization into a 'forensic ladder' with precise sample complexity bounds. The main theorem establishes the first tight entropy-rate law for multi-user attribution: attributing text to one of N users costs Θ(log N/h) tokens under statistically distortion-free schemes, with a matching converse. The paper also identifies two fundamental gaps — a window where text is provably machine-made but unattributable, and a footprint-resolution uncertainty principle — validated experimentally on GPT-2, Pythia-410M, and Qwen2.5.

Evaluation and Benchmarking AI Safety Research Watermark Forensics for Generative Models: An Information-Theoretic Perspective Qwen2.5 GPT-2 +1 more
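
To get a feel for the Θ(log N/h) scaling quoted in the summary above, the sketch below evaluates (log2 N) / h for a few user counts. It assumes the expression is read as (log N)/h with h an entropy rate in bits per token. The `constant` argument is a placeholder, since a Θ bound fixes only the growth rate and not the leading factor. The function name `attribution_tokens` is hypothetical.

```python
import math


def attribution_tokens(num_users: int, entropy_rate_bits: float, constant: float = 1.0) -> float:
    """Order-of-magnitude token count for attributing text to one of num_users.

    Computes constant * log2(num_users) / entropy_rate_bits. The constant is an
    assumed placeholder, not a value reported in the paper.
    """
    if num_users < 2:
        raise ValueError("need at least two users")
    if entropy_rate_bits <= 0:
        raise ValueError("entropy rate must be positive")
    return constant * math.log2(num_users) / entropy_rate_bits


# Squaring the number of users doubles the estimate; halving h doubles it too.
for n in (1024, 1_048_576):
    print(n, attribution_tokens(n, entropy_rate_bits=2.0))
```

The point of the sketch is the shape of the dependence: the token cost grows only logarithmically in the number of users and inversely in the entropy available per token.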

3 · arXiv · cs.CL · Jul 13, 2026

Spectral initialization schemes for LM pretraining show diagnostic value but no performance gain

Researchers analyze weight spectra across eleven pretrained GPT-2-style checkpoints varying in size, language, and training corpus, finding consistent depth-wise patterns in Frobenius norm and effective-rank entropy. They construct initialization schemes that mimic these spectral profiles and compare them against standard initialization methods. Despite visibly altering structural spectral patterns, the proposed initializers do not yield performance improvements over pretrained-weight reuse. The results suggest pretrained spectra are useful diagnostics of model structure but that coarse spectral matching is insufficient for effective transfer.

Training Infrastructure GPT-2
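
The two spectral statistics named in the summary above can be sketched from a matrix's singular values. The Frobenius norm is the square root of the sum of squared singular values. For the effective-rank entropy, the sketch assumes one common definition: the Shannon entropy of the singular values normalized to sum to one, with its exponential giving the effective rank. The paper's exact definition may differ, and the function names here are illustrative.

```python
import math


def frobenius_norm(singular_values: list[float]) -> float:
    """Frobenius norm computed from singular values: sqrt(sum of squares)."""
    return math.sqrt(sum(s * s for s in singular_values))


def effective_rank_entropy(singular_values: list[float]) -> float:
    """Shannon entropy (in nats) of the singular values normalized to sum to 1."""
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("need at least one positive singular value")
    probs = [s / total for s in singular_values if s > 0]
    return -sum(p * math.log(p) for p in probs)


def effective_rank(singular_values: list[float]) -> float:
    """Exponential of the entropy; equals the count when all values are equal."""
    return math.exp(effective_rank_entropy(singular_values))


flat = [1.0, 1.0, 1.0, 1.0]     # energy spread evenly over four directions
peaked = [1.0, 0.0, 0.0, 0.0]   # all energy in one direction
print(effective_rank(flat), effective_rank(peaked))
```

In practice the singular values would come from an SVD of each weight matrix, computed per layer to obtain the depth-wise profiles the summary mentions.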

5 · arXiv · cs.CL · Jun 23, 2026

LIHA reveals first-token broadcaster heads as mechanistic source of language identity in transformers

Researchers introduce Language Identity Head Ablation (LIHA), a causal intervention that zeros individual attention heads to measure language-switching behavior across 2,700 prompt-language pairs in seven languages. Applied to GPT-2, LIHA identifies a small set of 'first-token broadcaster' heads that propagate language identity signals throughout generation, with compensatory redistribution following a hierarchical, feedforward pattern. A controlled comparison between Qwen2.5-1.5B-Base and Qwen2.5-1.5B-Instruct provides direct causal evidence that instruction tuning reorganizes language identity circuits toward early-layer localization. The findings offer mechanistic grounding for why multilingual models generate in the wrong language and why this is difficult to correct.

Evaluation and Benchmarking Alignment and RLHF First-Token Broadcasters: Mechanistic Origins of Language Identity and Distributed Robustness in Transformers Language Identity Head Ablation Qwen2.5-7B-Instruct-1M +2 more
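
The zero-ablation step described in the summary above can be sketched on toy data: set one head's output to zeros, recombine the heads, and measure how a downstream score changes. The linear `readout` below is a hypothetical stand-in for the paper's language-switching measure, which the summary does not specify, and all function names are illustrative.

```python
def ablate_head(head_outputs: list[list[float]], head_index: int) -> list[list[float]]:
    """Return a copy of the per-head outputs with one head's output set to zeros."""
    if not 0 <= head_index < len(head_outputs):
        raise IndexError("head_index out of range")
    return [
        [0.0] * len(out) if i == head_index else list(out)
        for i, out in enumerate(head_outputs)
    ]


def concat_heads(head_outputs: list[list[float]]) -> list[float]:
    """Concatenate per-head output vectors, as multi-head attention does."""
    return [x for out in head_outputs for x in out]


def ablation_effect(head_outputs: list[list[float]], readout: list[float], head_index: int) -> float:
    """Change in a linear readout score when one head is zeroed."""
    def score(outputs: list[list[float]]) -> float:
        return sum(w * x for w, x in zip(readout, concat_heads(outputs)))

    return score(ablate_head(head_outputs, head_index)) - score(head_outputs)


heads = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]   # three toy heads, two dims each
readout = [1.0, 0.0, 0.0, 1.0, 2.0, 0.0]        # assumed downstream weights
for h in range(len(heads)):
    print(f"head {h}: effect {ablation_effect(heads, readout, h)}")
```

Ranking heads by the size of this effect is the basic idea behind picking out a small set of influential heads. A real experiment would apply the intervention inside the model's forward pass rather than on precomputed vectors.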

6 · arXiv · cs.LG · Jun 18, 2026

Program synthesis used to reverse-engineer transformer attention heads with executable Python surrogates

Researchers propose a pipeline that approximates transformer attention heads with executable Python programs generated by a language model, then re-ranked by held-out predictive accuracy. Applied to GPT-2, TinyLlama-1.1B, and Llama-3B, fewer than 1,000 programs reproduce attention patterns with >75% average IoU similarity on TinyStories. Replacing 25% of attention heads with programmatic surrogates incurs only a 16% average perplexity increase while preserving downstream QA performance, demonstrating a path toward symbolic transparency in neural models.

Evaluation and Benchmarking AI Safety Research Llama 3.2 GPT-2 Explaining Attention with Program Synthesis +2 more
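
One plausible way to score how closely a programmatic surrogate matches a head's attention pattern, in the spirit of the IoU figure quoted in the summary above, is to binarize both patterns and compare the sets of active cells. The thresholding rule and the name `attention_iou` are assumptions for illustration. The summary does not say how the paper computes its IoU.

```python
def attention_iou(pattern_a: list[list[float]], pattern_b: list[list[float]], threshold: float = 0.1) -> float:
    """IoU of the (query, key) cells whose weight exceeds threshold in each pattern."""
    def active(pattern: list[list[float]]) -> set[tuple[int, int]]:
        return {
            (q, k)
            for q, row in enumerate(pattern)
            for k, weight in enumerate(row)
            if weight > threshold
        }

    a, b = active(pattern_a), active(pattern_b)
    union = a | b
    if not union:
        return 1.0  # both patterns empty: treat as identical
    return len(a & b) / len(union)


# Toy causal attention from a model head, and a surrogate that always
# attends to the first token except on the last query.
model_head = [[1.0, 0.0, 0.0], [0.7, 0.3, 0.0], [0.05, 0.15, 0.8]]
surrogate = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(attention_iou(model_head, surrogate))
```

Here the model head has five active cells, the surrogate three, and all three of the surrogate's cells overlap, so the score is 3/5.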

5 · arXiv · cs.LG · Jun 10, 2026

Local linear structures in LLM weights and activations are dynamic, not fixed global directions

A new arXiv paper investigates the nature of linear structures in transformer weights and activations, finding strong local low-rank task-gradient structure but rejecting the hypothesis that fixed task planes exist. The authors show that useful bases drift substantially within 100 optimization steps, yet early recovery updates form a trajectory-prefix basis capturing 77% of LoRA recovery displacement. They also establish a formal connection between parameter perturbations and activation steering, finding a 0.58 cosine similarity between gradient-step-induced activation shifts and CAA steering vectors, suggesting linear structures are evolving local geometries rather than stable global task directions.

Evaluation and Benchmarking Alignment and RLHF CAA Qwen-0.5B LoRA +4 more

3 · Hacker News · Jun 9, 2026

Retrospective on GPT-2's 'Too Dangerous to Release' decision (2019)

A blog post revisiting OpenAI's 2019 decision to initially withhold GPT-2 due to misuse concerns has surfaced on Hacker News with significant engagement (239 points, 89 comments). The post examines the historical episode where OpenAI staged the release of GPT-2, citing fears of misuse for disinformation. This retrospective is relevant as a case study in AI safety communication and the evolution of lab release policies.

Open Weights Progress AI Safety Research GPT-2 OpenAI

5 · arXiv · cs.AI · Jun 4, 2026

GASING pedagogy-guided CoT training enables strong arithmetic reasoning in 86M-parameter GPT-2 model

Researchers train a small 86M-parameter GPT-2 decoder from scratch using Chain-of-Thought supervision derived from GASING, an Indonesian left-to-right arithmetic pedagogy, without any reinforcement learning. The model achieves over 80% accuracy on held-out arithmetic problems and competes with substantially larger models. Mechanistic analyses reveal two emergent capabilities: an explicit procedural pathway and a subsequent associative 'mental arithmetic' capacity that bypasses step-by-step computation. The work suggests that pedagogically structured training data can yield efficient arithmetic capability at small scale.

Evaluation and Benchmarking Alignment and RLHF GASING TOBA tokenizer GPT-2 +1 more

4 · arXiv · cs.CL · May 21, 2026

SymbolicLight V1: Spike-Gated Dual-Path Language Model with High Activation Sparsity

SymbolicLight V1 is a 194M-parameter spiking language model that combines binary Leaky Integrate-and-Fire spike dynamics with a continuous residual stream, replacing dense self-attention with a dual-path module using exponential-decay aggregation and spike-gated local attention. Trained from scratch on a 3B-token Chinese-English corpus, it achieves validation perplexity of 8.88–8.93 at over 89% per-element activation sparsity, trailing GPT-2 201M by 7.7% in PPL. Ablations indicate that temporal integration via LIF dynamics contributes more to performance than sparsity alone, and a 0.8B-parameter scale-up on 48.8B tokens demonstrates optimization stability. Current dense-hardware inference is slower than GPT-2; neuromorphic deployment is framed as a future opportunity.

Training Infrastructure Inference Economics GPT-2 Dual-Path SparseTCAM Spiking Neural Networks +2 more

5 · arXiv · cs.CL · May 21, 2026

Conditional Scale Entropy: A Wavelet-Derived Tool for Mechanistic Interpretability of Metaphor Processing in Transformers

This paper introduces Conditional Scale Entropy (CSE), a wavelet-derived measure of how transformer computation engages across frequency scales at each layer, and applies it to study metaphor processing in decoder-only language models. The authors prove CSE is invariant to update magnitude, isolating structural computation patterns from intensity. Across architectures ranging from GPT-2 (124M) to LLaMA-2 7B and GPT-oss 20B, metaphorical tokens consistently produce higher spectral breadth than literal tokens in early-to-mid layers, with the effect surviving permutation correction and specificity controls. The work establishes multi-scale coordination as a consistent mechanistic signature of metaphorical language processing and positions CSE as a general interpretability tool for cross-depth structure in transformers.

Evaluation and Benchmarking AI Safety Research Conditional Scale Entropy mechanistic interpretability GPT-2 +3 more

8 · OpenAI Blog · May 20, 2026

Better language models and their implications

OpenAI announced GPT-2, a large-scale unsupervised language model capable of generating coherent multi-paragraph text and achieving state-of-the-art performance on language modeling benchmarks. The model demonstrated zero-shot capability across reading comprehension, machine translation, question answering, and summarization without task-specific fine-tuning. OpenAI withheld the full model, citing misuse concerns, which made this an early high-profile instance of a staged release policy.

Frontier Model Releases Evaluation and Benchmarking GPT-2 zero-shot learning unsupervised language modeling +3 more

5 · OpenAI Blog · May 20, 2026

MuseNet: OpenAI's Transformer-Based Multi-Instrument Music Generation System

OpenAI released MuseNet, a deep neural network capable of generating 4-minute musical compositions across 10 instruments and multiple styles. The system uses the same large-scale transformer architecture as GPT-2, trained on hundreds of thousands of MIDI files to predict the next token in a sequence. MuseNet discovered patterns of harmony, rhythm, and style without explicit musical programming, demonstrating the generality of the GPT-2 unsupervised approach beyond text.

Frontier Model Releases Multimodal Progress GPT-2 MIDI MuseNet +1 more

5 · OpenAI Blog · May 20, 2026

GPT-2: 6-Month Follow-Up — 774M Parameter Model Released

OpenAI released the 774 million parameter version of GPT-2 as part of its staged release strategy, following the 124M model in February 2019 and the 355M model in May 2019. The release is accompanied by an open-source legal agreement to facilitate model-sharing partnerships between organizations. OpenAI also published a technical report on coordinating with the AI research community around publication norms and staged disclosure practices.

Frontier Model Releases Open Weights Progress GPT-2 124M GPT-2 OpenAI +2 more

6 · OpenAI Blog · May 20, 2026

Fine-tuning GPT-2 from Human Preferences

OpenAI fine-tuned the 774M parameter GPT-2 model using human feedback across summarization and style-continuation tasks, requiring 60k and 5k human labels respectively. The work revealed a labeler preference misalignment: for summarization, labelers rewarded copying from source text rather than genuine summarization. The stated motivation is advancing safety techniques for human-machine interaction and learning about human values from feedback.

Frontier Model Releases Evaluation and Benchmarking Reinforcement Learning from Human Feedback GPT-2 Fine-tuning GPT-2 from Human Preferences +2 more

5 · OpenAI Blog · May 20, 2026

GPT-2 1.5B Full Release Completes OpenAI's Staged Release Experiment

OpenAI released the full 1.5B parameter GPT-2 model along with code and weights, completing its staged release process that began earlier in 2019. The release also includes tooling to help detect GPT-2 outputs. OpenAI frames this as a test case for responsible staged release practices for future powerful models, acknowledging that larger models had already been released by others in the interim.

Open Weights Progress AI Safety Research GPT-2 OpenAI +1 more

9 · OpenAI Blog · May 20, 2026

CLIP: Connecting Text and Images

OpenAI introduced CLIP (Contrastive Language-Image Pre-training), a neural network that learns visual concepts from natural language supervision. CLIP enables zero-shot visual classification by accepting natural language descriptions of categories rather than requiring task-specific training data. The approach mirrors the zero-shot transfer capabilities demonstrated by GPT-2 and GPT-3 in the language domain.

Frontier Model Releases Evaluation and Benchmarking GPT-3 GPT-2 Contrastive Language-Image Pretraining (CLIP) +3 more

6 · OpenAI Blog · May 20, 2026

Language models can explain neurons in language models

OpenAI uses GPT-4 to automatically generate and score natural-language explanations for the behavior of individual neurons in large language models. The methodology is applied to all neurons in GPT-2, producing a public dataset of explanations and quality scores. The authors acknowledge the explanations are imperfect, framing this as an early step toward automated mechanistic interpretability. This work establishes a scalable pipeline for neuron-level analysis that could inform future interpretability and safety research.

Evaluation and Benchmarking AI Safety Research GPT-2 automated mechanistic interpretability neuron explanation dataset +2 more

3 · Hugging Face Blog · May 19, 2026

Training CodeParrot from Scratch

Hugging Face published a detailed walkthrough of training CodeParrot, a GPT-2-style language model trained from scratch on GitHub code data. The post covers dataset preparation, tokenizer training, model configuration, and distributed training setup using the Accelerate library. It serves as both a technical tutorial and a demonstration of open-source code generation model development practices circa late 2021.

Training Infrastructure Open Weights Progress GitHub Code Dataset CodeParrot GPT-2 +2 more

4 · Hugging Face Blog · May 19, 2026

From GPT2 to Stable Diffusion: Hugging Face arrives to the Elixir community

Hugging Face announces Bumblebee, a library bringing Hugging Face model support to the Elixir programming language ecosystem. The integration enables Elixir developers to run models including GPT-2 and Stable Diffusion via the Nx numerical computing library. This expands the reach of Hugging Face's model hub beyond Python-centric workflows into the BEAM/Elixir ecosystem.

Inference Economics Agent and Tool Ecosystem Elixir Bumblebee GPT-2 +3 more