What DeepSeek V4 is
DeepSeek V4 is a family of open-weights AI models built by DeepSeek, a Chinese AI laboratory. "Open-weights" means the actual model files are freely downloadable — unlike the AI assistants from OpenAI or Anthropic, which you can only access through their companies' services. V4 comes in two sizes: V4-Pro, a massive model with 1.6 trillion total parameters (think of parameters as the knobs the model learned to tune during training), and V4-Flash, a leaner version at 284 billion parameters designed for faster, cheaper responses.
Both models use a design called Mixture of Experts (MoE): instead of running all those parameters on every query, only a small active slice is used at a time — 49 billion for Pro, 13 billion for Flash. This is what lets a model with a staggering total size still be practical to run.
Why it matters
DeepSeek V4 matters for two big reasons: capability and cost.
On capability, V4 supports a 1 million token context window by default — meaning it can read and reason over roughly 750,000 words of text in a single session. That's enough to hold an entire codebase, a long legal document, or hours of conversation history. This is powered by a new technique called DeepSeek Sparse Attention (DSA), which makes processing very long inputs more efficient.
On cost, DeepSeek permanently cut V4-Pro's API price by 75%, continuing a pattern the lab has established across its model generations. When a frontier-class model slashes its price permanently, it puts pressure on every other provider in the market.
How it got here: the V3 lineage
V4 didn't appear out of nowhere. DeepSeek built up to it through a rapid series of releases:
- DeepSeek V3 launched as a 671-billion-parameter open-source model running at 60 tokens per second — three times faster than its predecessor — at very low API prices.
- DeepSeek R1 followed as a reasoning-focused model claiming performance on par with OpenAI's o1 on math and coding benchmarks, released under the permissive MIT License.
- V3.1 added hybrid "thinking and non-thinking" modes in a single model, along with improved tool use for multi-step tasks.
- V3.2-Exp introduced the sparse attention architecture that V4 would later build on, alongside a 50%+ API price cut.
- V3.2 integrated chain-of-thought reasoning directly into tool-use workflows, trained on a new pipeline covering over 1,800 environments.
V4 is the synthesis of all these experiments into a single, larger, more capable release.
The controversy: distillation and geopolitics
DeepSeek V4's rise has not been without friction.
Distillation allegations: Anthropic publicly accused DeepSeek (along with Moonshot AI and MiniMax) of running coordinated "distillation attacks" — generating over 16 million exchanges through approximately 24,000 fraudulent accounts to extract responses from Claude and use them to train DeepSeek's own models. Anthropic framed this as both a violation of its terms of service and a national security concern, arguing that models trained this way inherit capabilities without the safety guardrails. A separate report described a broader gray-market ecosystem of API proxy networks enabling this kind of data harvesting at scale.
Hardware access: Before V4's public release, DeepSeek gave Huawei several weeks of early access for hardware optimization — while denying the same to Nvidia and AMD. This was a notable departure from prior practice and signals DeepSeek's deliberate alignment with China's domestic chip ecosystem amid ongoing US export controls.
Benchmark reality check: Despite strong claims on agentic coding tasks, at least one industry analysis noted that V4 trails leading open and closed models on aggregate benchmarks — a reminder that headline numbers don't always tell the full story.
Where it fits in the broader landscape
V4 sits alongside a wave of competitive open-weights releases from Chinese and international labs — Qwen3, Kimi K2.6, and others — all pushing toward frontier capability at lower cost. Nvidia has framed its own open-weights investments partly as a strategic response to Chinese labs building capable models on non-Nvidia hardware. The US government's NIST TRAINS task force, meanwhile, is moving toward pre-deployment national security evaluations of frontier models — a regulatory environment that will increasingly shape what models like V4 can be used for in certain contexts.
For practitioners and businesses, DeepSeek V4 represents a genuine option: frontier-adjacent capability, fully downloadable, with aggressive pricing and broad API compatibility (including OpenAI and Anthropic API formats). The geopolitical and safety questions are real, but so is the model.




