DeepSeek API Major Upgrade: Function Calling, FIM, Chat Prefix Completion, JSON Output, and 8K Token Limit
DeepSeek has released a significant API update adding Function Calling (up to 128 parallel calls, OpenAI-compatible), JSON Output, Chat Prefix Completion, and FIM (Fill-In-the-Middle) Completion to both deepseek-chat and deepseek-coder models. The update also raises the max_tokens ceiling to 8K in the Beta API. Several features are in Beta and will be open-sourced once stable. The Function Calling and JSON Output implementations are explicitly designed to be compatible with the OpenAI API.
Related guides (4)
Related events (8)
DeepSeek-V2.5: Merged Open-Source Model Combining General and Coding Capabilities
DeepSeek has released DeepSeek-V2.5, an open-source model that merges DeepSeek-V2-Chat-0628 and DeepSeek-Coder-V2-0724 into a single unified model. The release improves general conversational capabilities, coding performance, instruction-following, and writing tasks while also strengthening safety properties—raising the overall safety score from 74.4% to 82.6% and reducing safety spillover rate from 11.3% to 4.6%. The model is available via backward-compatible API endpoints (deepseek-chat and deepseek-coder) and on HuggingFace, retaining features like Function Calling, FIM completion, and JSON output. Benchmark results show improvements on HumanEval Python and LiveCodeBench, though SWE-verified performance remains an acknowledged weak area.
DeepSeek V2.5-1210: Final Update to V2.5 Series, V3 Generation Teased
DeepSeek has released DeepSeek-V2.5-1210, the final update to its V2.5 model series, with claimed improvements across math, coding, writing, and roleplay benchmarks. The model is available as open weights on Hugging Face. DeepSeek also announced the launch of Internet Search on chat.deepseek.com. The release marks the end of the V2 generation, with the company signaling work on next-generation foundation models.
DeepSeek-R1-0528 Released with Improved Benchmarks, Reduced Hallucinations, and Function Calling
DeepSeek has released DeepSeek-R1-0528, an updated version of its R1 reasoning model featuring improved benchmark performance, reduced hallucinations, enhanced front-end capabilities, and new support for JSON output and function calling. The API interface remains unchanged, and open-source weights are available on Hugging Face. This is an incremental update to the R1 series rather than a new flagship model.
DeepSeek releases DeepSeek-V3.2 on Hugging Face
DeepSeek has released DeepSeek-V3.2, a new text-generation model published on Hugging Face under the deepseek-ai organization. The model supports fp8 precision, is endpoints-compatible, and has accumulated over 3.6 million downloads and 1,446 likes, indicating significant community uptake. This appears to be a successor to DeepSeek-V3, continuing the lab's competitive open-weights model series.
DeepSeek-V3.1 Release: Hybrid Think/Non-Think Model with Agent-Focused Upgrades
DeepSeek has released V3.1, a hybrid inference model supporting both thinking and non-thinking modes in a single model, positioned as their first step toward the agent era. The model features improved tool use and multi-step agent task performance, with benchmarks showing gains on SWE-bench and Terminal-Bench, and faster thinking efficiency compared to DeepSeek-R1-0528. The base model received 840B tokens of continued pretraining for long-context extension, a new tokenizer, and open-source weights are available on HuggingFace. API updates include 128K context for both modes, Anthropic API format compatibility, and strict function calling support in beta.
DeepSeek V4 Preview Release: 1.6T-param Pro and 284B Flash Models with 1M Context, Open-Sourced
DeepSeek has released DeepSeek-V4 as an open-weights preview, comprising two MoE variants: V4-Pro (1.6T total / 49B active parameters) and V4-Flash (284B total / 13B active parameters). Both models support 1M token context by default, enabled by a novel Token-wise compression and DeepSeek Sparse Attention (DSA) architecture. V4-Pro claims open-source SOTA on agentic coding benchmarks and world-class math/STEM/coding performance rivaling top closed-source models, while V4-Flash offers near-parity reasoning at lower cost and latency. The API is live today with OpenAI and Anthropic compatibility, and legacy model endpoints will be retired in July 2026.
DeepSeek releases DeepSeek-V3.1 on Hugging Face
DeepSeek has released DeepSeek-V3.1, a new text-generation model published on Hugging Face under the deepseek-ai organization. The model supports fp8 precision, text-generation-inference, and endpoint deployment, and has accumulated over 220K downloads and 824 likes shortly after release. This appears to be an updated iteration of the DeepSeek-V3 series, a frontier-class open-weights model family.
DeepSeek releases DeepSeek-V3.2-Speciale on Hugging Face
DeepSeek has published DeepSeek-V3.2-Speciale, a new text-generation model, on Hugging Face under the deepseek-ai organization. The model uses the deepseek_v32 architecture and supports fp8 precision with safetensors format. Early traction is notable with nearly 10,000 downloads and 708 likes shortly after release.



