OpenAI Introduces AgentKit, Expanded Evals, and Reinforcement Fine-Tuning for Agents
OpenAI has released a suite of developer tools aimed at accelerating agent development from prototype to production. The release includes AgentKit (a new agent-building framework), expanded evaluation capabilities, and reinforcement fine-tuning (RFT) specifically designed for agentic use cases. These tools represent OpenAI's continued push to provide end-to-end infrastructure for building and deploying AI agents at scale.
Related events (8)
New Tools for Building Agents
On March 11, 2025, OpenAI announced on its official blog new tools for developers building AI agents, signaling a continued push to expand the agent-building ecosystem. The provided body text does not detail the specific tools or capabilities, but the source and framing indicate a product and tooling release targeting the agentic development workflow.
OpenAI Improves Fine-Tuning API and Expands Custom Models Program
OpenAI announced enhancements to its fine-tuning API giving developers greater control over the training process, alongside an expansion of its custom models program. The updates aim to provide more flexibility for enterprise and developer use cases requiring tailored model behavior. Specific new features include additional hyperparameter controls and tooling improvements, while the custom models program expansion opens new pathways for organizations to build bespoke models with OpenAI's assistance.
OpenAI o1 and New Developer Tools Announced
OpenAI has announced the full release of the o1 model alongside a set of developer-facing updates, including Realtime API improvements and a new fine-tuning method. The announcement targets developers building on the OpenAI platform. The source body does not elaborate on capability details or pricing.
The next evolution of the Agents SDK
OpenAI has updated its Agents SDK with native sandbox execution and a model-native harness, enabling developers to build secure, long-running agents that operate across files and tools. The update targets production-grade agentic workflows by providing safer code execution environments and tighter integration with OpenAI models. This represents a continued push by OpenAI to mature its developer tooling for autonomous agent deployment.
New Tools and Features in the Responses API
OpenAI announced new tools and features for its Responses API, expanding the capabilities available to developers building on the platform. Specific details were not available in the body text provided; the update may include additional built-in tools, improved function calling, or new modalities accessible through the API, but none of these is confirmed here. As a Tier 1 source announcement, it marks a meaningful expansion of OpenAI's developer-facing infrastructure.
OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments
This Hugging Face blog post introduces OpenEnv, a framework for evaluating tool-using AI agents in real-world environments. The piece appears to address the challenge of benchmarking agentic systems that interact with external tools and environments, moving beyond static benchmarks toward dynamic, practical evaluation settings. As a Tier 2 commentary piece, it likely discusses methodology, design choices, and results from applying OpenEnv to assess agent capabilities.
OpenAI Gym Beta Release
OpenAI released the public beta of OpenAI Gym, a toolkit for developing and comparing reinforcement learning algorithms. The toolkit includes a suite of environments ranging from simulated robots to Atari games, along with a site for comparing and reproducing results. This represented a significant early infrastructure contribution to the RL research community.
OpenAI Introduces Deep Research Agent
OpenAI has launched 'deep research,' an agentic capability that uses reasoning to synthesize large volumes of online information and complete multi-step research tasks autonomously. The feature is initially available to ChatGPT Pro users, with rollout to Plus and Team tiers to follow. It represents a step toward practical autonomous research agents built on OpenAI's reasoning model infrastructure.



