On April 24, DeepSeek announced the preview release of its new model series, DeepSeek-V4, which was open-sourced at launch. DeepSeek-V4 offers a 1M ultra-long context and leads domestic and open-source models in agent capabilities, world knowledge, and reasoning performance. The series comes in two sizes: deepseek-v4-flash and deepseek-v4-pro.

DeepSeek-V4 is available now on the official website and app; log in to chat with the latest model and explore its 1M ultra-long-context memory. The API service has been updated in step: to call the new models, set model_name to deepseek-v4-pro or deepseek-v4-flash.


Compared with the previous generation, DeepSeek-V4-Pro's agent capabilities are significantly stronger. In the Agentic Coding evaluation, V4-Pro reaches the best level among current open-source models, and it also performs well on other agent-related benchmarks. DeepSeek-V4 has become the agentic-coding model used internally by the company's employees. According to their evaluation feedback, the usage experience surpasses Sonnet 4.5 and the delivery quality approaches Opus 4.6 in non-thinking mode, though a gap remains versus Opus 4.6 in thinking mode.

According to the announcement, DeepSeek-V4 introduces a new attention mechanism that compresses along the token dimension and combines it with DSA (DeepSeek Sparse Attention) to achieve world-leading long-context capability while significantly reducing compute and GPU-memory requirements compared with conventional attention. From now on, a 1M (one million) context is standard across all official DeepSeek services.
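The announcement does not describe DSA's internals, but the general idea behind sparse attention is that each query attends to only a small subset of keys, shrinking the work per query from the full sequence length to that subset. As a generic illustration only (a toy top-k scheme in NumPy, not DeepSeek's actual mechanism; all names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k):
    """Toy sparse attention: each query attends only to its top_k
    highest-scoring keys, so the softmax and weighted sum effectively
    involve top_k values per query instead of the full sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n_q, n_k)
    # Keep each row's top-k scores; mask the rest with -inf so their
    # softmax weights become exactly zero.
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, idx, 0.0, axis=-1)
    weights = softmax(scores + mask, axis=-1)        # zero outside top-k
    return weights @ v

rng = np.random.default_rng(0)
n, d = 8, 4
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=2)
print(out.shape)
```

A real implementation gains its compute and memory savings by never materializing the masked scores at all; the dense mask here is only for readability.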

Both V4-Pro and V4-Flash have a maximum context length of 1M, and both support non-thinking and thinking modes. Thinking mode accepts a reasoning_effort parameter to set the thinking intensity (high/max). For complex agent scenarios, thinking mode with the intensity set to max is recommended.
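A request using thinking mode at maximum intensity might look like the following sketch. The model name and the reasoning_effort values come from the announcement; the exact placement of the field in the request body is an assumption:

```python
import json

# Sketch of a Chat Completions-style request body for thinking mode.
# "deepseek-v4-pro" and reasoning_effort values high/max are from the
# announcement; field placement is assumed, not confirmed.
payload = {
    "model": "deepseek-v4-pro",
    "messages": [
        {"role": "user", "content": "Refactor this module and fix the failing tests."}
    ],
    # Maximum thinking intensity, as recommended for complex agent scenarios.
    "reasoning_effort": "max",
}

print(json.dumps(payload, indent=2))
```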

The DeepSeek API now serves both V4-Pro and V4-Flash, supporting the OpenAI Chat Completions interface and the Anthropic interface. To access the new models, keep base_url unchanged and set the model parameter to deepseek-v4-pro or deepseek-v4-flash.
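In practice, only the model field changes when migrating existing code. The two request shapes below follow the general conventions of the OpenAI Chat Completions and Anthropic Messages interfaces the article names; they are a sketch, not an exact specification of DeepSeek's endpoints:

```python
import json

# Switching an existing integration to V4: base_url stays the same and
# only the model value changes. Request shapes follow the two interface
# styles named in the announcement (details assumed).

# OpenAI Chat Completions-style body:
openai_style = {
    "model": "deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Summarize this long log file."}],
}

# Anthropic Messages-style body (max_tokens is required in that interface):
anthropic_style = {
    "model": "deepseek-v4-flash",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Summarize this long log file."}],
}

print(json.dumps(openai_style))
print(json.dumps(anthropic_style))
```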