OpenAI Updates Model Spec with Under-18 Teen Protection Principles
OpenAI is revising its Model Spec to add Under-18 Principles that govern how ChatGPT interacts with teenage users. The update introduces stronger guardrails and age-appropriate behavioral guidance grounded in developmental science. It builds on OpenAI's ongoing effort to improve safety for minors using ChatGPT.
Related events (8)
OpenAI Releases Teen Safety Policies for Developers via gpt-oss-safeguard
OpenAI has published prompt-based teen safety policies for developers who build on its models, designed for use with the gpt-oss-safeguard model to moderate age-specific risks. The release provides structured guidance and tooling for filtering or adjusting AI outputs in contexts where minors may be users. It extends OpenAI's safety infrastructure into the developer-facing layer, addressing regulatory and reputational pressure around youth-facing AI deployments.
OpenAI Building Age Prediction and Parental Controls in ChatGPT
OpenAI is developing age prediction capabilities and parental control features within ChatGPT to deliver age-appropriate experiences for teenage users. The initiative aims to support families with new safety tools and restrict content based on inferred or verified user age. This represents a product-safety effort at the intersection of AI deployment and child protection policy.
Building more helpful ChatGPT experiences for everyone
OpenAI announced a set of ChatGPT safety and helpfulness improvements, including new parental controls for teen users, routing of sensitive conversations to reasoning models, and partnerships with external experts. The update reflects OpenAI's ongoing effort to balance accessibility with safeguards across different user demographics. Routing sensitive queries to reasoning models is a notable architectural and policy decision that may affect response quality and safety outcomes.
OpenAI Improves ChatGPT Mental Health Responses with Expert Collaboration
OpenAI worked with over 170 mental health experts to enhance ChatGPT's handling of sensitive conversations involving distress. The update improves the model's ability to recognize emotional distress, respond with empathy, and direct users to real-world support resources. OpenAI reports a reduction in unsafe responses of up to 80% as a result of these changes.
Helping ChatGPT better recognize context in sensitive conversations
OpenAI has released safety updates to ChatGPT aimed at improving context awareness in sensitive conversations. The updates focus on detecting risk signals over time within a conversation rather than evaluating individual messages in isolation. This represents an incremental improvement to ChatGPT's safety and harm-reduction capabilities in high-stakes interactions.
How should AI systems behave, and who should decide?
OpenAI published a policy post clarifying how ChatGPT's behavior is shaped and governed, outlining plans to allow greater user customization of model behavior. The post also describes intentions to solicit broader public input into decision-making around AI system behavior. This represents an early public articulation of OpenAI's approach to behavioral governance and value alignment in deployed systems.
OpenAI Upgrades Moderation API with GPT-4o-Based Multimodal Model
OpenAI has released an updated Moderation API powered by a new model built on GPT-4o, extending content moderation capabilities to both text and images. The update aims to improve accuracy in detecting harmful content, giving developers better tools for building moderation systems. This represents an expansion of OpenAI's safety infrastructure into multimodal domains.
Introducing ChatGPT
OpenAI announced ChatGPT, a conversational model trained to engage in dialogue, answer follow-up questions, acknowledge errors, challenge incorrect premises, and decline inappropriate requests. The model's dialogue format represented a significant step in making large language models accessible and interactive for general users. This November 2022 launch marked a pivotal moment in public AI adoption.


