Almanac
4·OpenAI Blog·1mo ago

OpenAI Microscope: Neural Network Visualization Collection

OpenAI released Microscope, a collection of visualizations covering every significant layer and neuron across eight vision 'model organisms' commonly studied in interpretability research. The tool is designed to make it easier for researchers to analyze features that form inside neural networks. It targets the interpretability research community and aims to support progress in understanding complex neural systems.

Related events (8)

6·OpenAI Blog·1mo ago

Understanding Neural Networks Through Sparse Circuits

OpenAI has published work on mechanistic interpretability using a sparse model approach aimed at understanding how neural networks reason internally. The research seeks to make AI systems more transparent by identifying sparse circuits within neural networks. This work is positioned as supporting safer and more reliable AI behavior through improved interpretability.

5·OpenAI Blog·1mo ago

Introducing Activation Atlases

OpenAI and Google researchers jointly developed activation atlases, a new neural network interpretability technique that visualizes what interactions between neurons represent. The method aims to improve understanding of internal decision-making processes in AI systems. This work is positioned as a tool for identifying weaknesses and investigating failures in deployed AI systems.

5·OpenAI Blog·1mo ago

Multimodal neurons in artificial neural networks

OpenAI researchers discovered neurons in CLIP that respond to the same concept across literal, symbolic, and conceptual representations. This finding parallels multimodal neurons previously observed in biological brains and helps explain CLIP's ability to classify unusual visual renditions of concepts. The work is presented as a step toward understanding the associations and biases learned by CLIP and similar vision-language models.

6·Google DeepMind Blog·1mo ago

Gemma Scope 2: Interpretability Tools Released Across Entire Gemma 3 Family

DeepMind has released Gemma Scope 2, an open interpretability toolkit covering the full Gemma 3 model family. The release extends the original Gemma Scope effort to provide the AI safety community with tools for understanding complex language model behavior. By making these tools openly available across all Gemma 3 variants, DeepMind aims to support mechanistic interpretability research at scale.

6·OpenAI Blog·1mo ago

Language models can explain neurons in language models

OpenAI uses GPT-4 to automatically generate and score natural-language explanations for the behavior of individual neurons in large language models. The methodology is applied to all neurons in GPT-2, producing a public dataset of explanations and quality scores. The authors acknowledge the explanations are imperfect, framing this as an early step toward automated mechanistic interpretability. This work establishes a scalable pipeline for neuron-level analysis that could inform future interpretability and safety research.

6·OpenAI Blog·1mo ago

OpenAI to Acquire Neptune

OpenAI has announced the acquisition of Neptune, a platform focused on experiment tracking and model monitoring. The acquisition is aimed at improving visibility into model behavior and strengthening internal research tooling. This move suggests OpenAI is investing in infrastructure to better instrument and observe training runs at scale.

3·OpenAI Blog·1mo ago

Interpretable Machine Learning Through Teaching

OpenAI published a method in 2018 that trains AI systems to teach each other using examples that are also interpretable to humans. The approach automatically selects maximally informative examples to convey a concept, such as representative images for a category like 'dogs'. Experiments showed the method effective at teaching both AI systems and humans, bridging machine learning interpretability with pedagogical example selection.

5·Google DeepMind Blog·1mo ago

Teaching AI to See the World More Like We Do

DeepMind has published a research paper analyzing how AI systems organize and perceive the visual world differently from humans. The work examines the gap between human visual cognition and current AI visual representations, with the aim of understanding and potentially closing it.