6 · OpenAI Blog · 1mo ago

Prompt Caching in the API

OpenAI is introducing automatic prompt caching for API users, providing discounts on input tokens that the model has recently processed. The feature reduces costs for repeated or overlapping prompt prefixes without requiring explicit developer configuration. This follows Anthropic's similar caching feature and reflects broader industry movement toward inference cost optimization.
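
To illustrate why overlapping prefixes matter, here is a minimal sketch that measures how much of a new prompt repeats the start of the previous one. It is not OpenAI's implementation: whitespace-split words stand in for real tokens, and `SYSTEM`, `build_prompt`, and `shared_prefix_len` are hypothetical names chosen for this example.

```python
def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Count how many leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_prompt(static_prefix: str, user_query: str) -> list[str]:
    """Put unchanging content first so successive prompts share a long prefix.

    Whitespace splitting is a stand-in for a real tokenizer (an assumption).
    """
    return (static_prefix + " " + user_query).split()


# Hypothetical system instructions reused across requests.
SYSTEM = "You are a support assistant. Answer using the policy document below."

first = build_prompt(SYSTEM, "How do I reset my password?")
second = build_prompt(SYSTEM, "Where can I download my invoice?")

overlap = shared_prefix_len(first, second)
print(f"{overlap} of {len(second)} tokens repeat the previous prompt's prefix")
```

Because only a leading run of identical tokens counts as overlap in this sketch, placing the per-request query before the static instructions would reduce the shared prefix to zero.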

Inference Economics, Enterprise Deployment Patterns, Prompt Caching, OpenAI API, OpenAI, Anthropic

Related guides (4)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

Enterprise Deployment Patterns (Topic guide)

Enterprise Deployment Patterns: From AI Demo to Production Reality

Inference Economics (Topic guide)

Inference Economics: The Cost Structure of Running AI Models in Production

Related events (8)

5 · OpenAI Blog · 1mo ago

Understanding prompt injections: a frontier security challenge

OpenAI has published a blog post addressing prompt injection attacks as a key security challenge for AI systems. The post covers how these attacks work and outlines OpenAI's multi-pronged approach including research, model training improvements, and safeguard development. This signals OpenAI's formal positioning on agentic security threats as their models are increasingly deployed in tool-using and autonomous contexts.

AI Safety Research, Agent and Tool Ecosystem, prompt injection, OpenAI

4 · Anthropic News · 20d ago

Anthropic Publishes Quantitative Case Study on Prompt Engineering for Long-Context Recall

Anthropic shares a quantitative case study evaluating prompting techniques to improve Claude's recall over 75,000–90,000 token contexts. Two techniques are tested: extracting reference quotes before answering, and providing few-shot examples of correctly answered questions. The study uses Claude Instant 1.2 on a government document dataset constructed via a 'randomized collage' method, with multiple-choice Q&A pairs generated by Claude itself. Results show measurable recall improvements over a baseline prompt, with methodology and notebooks shared publicly.

Long Context Evolution, Evaluation and Benchmarking, Claude, Claude API, randomized collage

8 · OpenAI Blog · 1mo ago

OpenAI Announces Function Calling, Longer Context, and API Price Reductions

OpenAI introduced function calling capabilities to its API, enabling models to reliably output structured JSON for calling developer-defined functions. The update also includes longer context windows, more steerable models (gpt-3.5-turbo-16k and gpt-4 updates), and reduced pricing on several API tiers. These changes significantly expand the practical utility of OpenAI models for agentic and tool-use applications.
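
As a rough sketch of what "structured JSON for calling developer-defined functions" means in practice, the example below describes one function with a JSON Schema, then parses and dispatches a call of the kind a model might emit. No API request is made. The `get_weather` function, the `SPECS` and `REGISTRY` tables, and the exact shape of the `call` dictionary are assumptions made for illustration.

```python
import json

# Hypothetical function description. The name/description/parameters layout,
# with parameters written as JSON Schema, is assumed for this sketch.
SPECS = {
    "get_weather": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
}


def get_weather(city: str) -> str:
    """Stand-in implementation; a real one would query a weather service."""
    return f"Weather for {city}: unknown (stub)"


# Maps function names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(call: dict) -> str:
    """Check a model-emitted call against its spec, then run it locally."""
    name = call["name"]
    if name not in REGISTRY:
        raise ValueError(f"unknown function: {name}")
    # Arguments are assumed to arrive as a JSON-encoded string.
    args = json.loads(call["arguments"])
    required = SPECS[name]["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return REGISTRY[name](**args)


# Example of a call the model might produce (assumed shape).
model_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
result = dispatch(model_call)
print(result)
```

Validating the arguments before dispatch is a deliberate choice here: model output is untrusted input, so the sketch rejects unknown function names and missing required fields rather than passing them through.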

Long Context Evolution, Frontier Model Releases, GPT-3.5 Turbo, OpenAI API, OpenAI

3 · GitHub Trending · 29d ago

prompt-optimizer: Open-Source TypeScript Prompt Optimization Tool

prompt-optimizer is an open-source TypeScript tool designed to help users write better prompts and improve AI outputs. The repository has accumulated 29,603 total stars with 76 new stars today, indicating sustained community interest. It represents a category of tooling focused on prompt engineering automation and optimization.

Agent and Tool Ecosystem, linshenkx, prompt-optimizer

7 · OpenAI Blog · 1mo ago

OpenAI to Acquire Promptfoo

OpenAI announced the acquisition of Promptfoo, an AI security platform focused on identifying and remediating vulnerabilities in AI systems during development. The acquisition signals OpenAI's intent to deepen its enterprise security capabilities. Promptfoo has been widely used by developers to red-team and evaluate LLM applications for safety and reliability issues.

AI Safety Research, Enterprise Deployment Patterns, Promptfoo, OpenAI

7 · DeepSeek News · 1mo ago

DeepSeek API Introduces Context Caching on Disk, Cutting Token Prices by ~90%

DeepSeek has launched a disk-based context caching service for its API, cutting the price of cache-hit input tokens to $0.014 per million versus $0.14 per million for cache misses, a 90% reduction. The system requires no code changes, applies automatically to prefix-matched inputs, and reduces first-token latency from ~13s to ~500ms on 128K-token prompts. DeepSeek attributes the feasibility of disk caching to the compact KV cache produced by the MLA (Multi-head Latent Attention) architecture in DeepSeek V2, and claims this makes it the first LLM API provider to deploy extensive disk caching at scale. The service supports up to 1 trillion tokens per day with no concurrency limits.
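
The pricing arithmetic can be checked with a short sketch. The two per-million-token prices come from the summary above; the function names and the `hit_fraction` parameter (the share of input tokens served from cache) are assumptions introduced for this example.

```python
# Prices in USD per million input tokens, as quoted in the summary.
HIT_PRICE = 0.014
MISS_PRICE = 0.14


def hit_discount() -> float:
    """Fractional saving on a cache-hit token relative to a cache miss."""
    return 1 - HIT_PRICE / MISS_PRICE


def input_cost(total_tokens: int, hit_fraction: float) -> float:
    """Blended input cost in USD for a given cache-hit share (hypothetical helper)."""
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    hit_tokens = total_tokens * hit_fraction
    miss_tokens = total_tokens - hit_tokens
    return (hit_tokens * HIT_PRICE + miss_tokens * MISS_PRICE) / 1_000_000


print(f"Discount on cache hits: {hit_discount():.0%}")
print(f"2M tokens at a 50% hit rate: ${input_cost(2_000_000, 0.5):.3f}")
```

The 90% figure applies only to tokens that hit the cache, so the overall saving on a workload depends on how much of each prompt matches a cached prefix.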

Long Context Evolution, Frontier Model Releases, DeepSeek API, DeepSeek V4, Context Caching on Disk

6 · OpenAI Blog · 1mo ago

Designing AI agents to resist prompt injection

OpenAI published a blog post describing how ChatGPT's agent workflows are designed to resist prompt injection and social engineering attacks. The approach focuses on constraining risky actions and protecting sensitive data within agentic pipelines. This represents OpenAI's public articulation of defensive design principles for deployed AI agents.

AI Safety Research, Enterprise Deployment Patterns, prompt injection, ChatGPT, social engineering

4 · OpenAI Blog · 1mo ago

OpenAI Introduces Enterprise-Grade Features for API Customers

OpenAI announced expanded enterprise capabilities for API customers, including enhanced security features and controls, updates to the Assistants API, and new cost management tools. The announcement targets enterprise adoption by addressing common organizational requirements around security, compliance, and budget oversight. No specific model capability changes are described.

Enterprise Deployment Patterns, Agent and Tool Ecosystem, OpenAI Assistants API, OpenAI