DeepSeek-R1-0528

model·active·deepseek-r1-0528-732d0625·3 events·first seen 1mo ago

Aliases: DeepSeek-R1-0528

Recent events (3)

6·Deepseek News·1mo ago

DeepSeek-R1-0528 Released with Improved Benchmarks, Reduced Hallucinations, and Function Calling

DeepSeek has released DeepSeek-R1-0528, an updated version of its R1 reasoning model featuring improved benchmark performance, reduced hallucinations, enhanced front-end capabilities, and new support for JSON output and function calling. The API interface remains unchanged, and open-source weights are available on Hugging Face. This is an incremental update to the R1 series rather than a new flagship model.

8·Deepseek News·1mo ago

DeepSeek-V3.1 Release: Hybrid Think/Non-Think Model with Agent-Focused Upgrades

DeepSeek has released V3.1, a hybrid inference model that supports both thinking and non-thinking modes in a single model, positioned as its first step toward the agent era. The model features improved tool use and multi-step agent task performance, with benchmarks showing gains on SWE-bench and Terminal-Bench, and higher thinking efficiency than DeepSeek-R1-0528. The base model received 840B tokens of continued pretraining for long-context extension, and the release includes a new tokenizer. Open-source weights are available on Hugging Face. API updates include 128K context for both modes, Anthropic API format compatibility, and strict function calling support in beta.

6·Deepseek·7d ago

DeepSeek releases R1-0528-Qwen3-8B distilled reasoning model on Hugging Face

DeepSeek released DeepSeek-R1-0528-Qwen3-8B, an 8B-parameter text-generation model on Hugging Face that combines the R1-0528 reasoning capabilities with a Qwen3 base. The model accumulated over 306K downloads and 1K likes shortly after release, indicating strong community uptake. It appears to be a distilled version of the R1-0528 reasoning model, targeting smaller-scale deployment.