Quick Technology reported on May 3 that the DeepSeek V4 series of large models was officially released on April 24, 15 months after last year's DeepSeek R1 update. V4's performance has prompted discussion both in China and abroad, and it is drawing close attention in the United States as well.

Many evaluations of DeepSeek V4's capabilities have already appeared. An earlier research report by three senior researchers at the Council on Foreign Relations found that it lags the top American large models by about 7 months.

Now the Center for Artificial Intelligence Standards and Innovation (CAISI), part of the National Institute of Standards and Technology (NIST), has also evaluated DeepSeek V4. Its conclusion is that DeepSeek V4 lags the leading U.S. models by about 8 months, close to the earlier estimate.

In CAISI's AI capability evaluation, DeepSeek V4 scored 800 points. The strongest current model, GPT-5.5, scored more than 1200 points, and GPT-5.4 and Opus 4.6 are also above 1000 points.

By this measure, the overall performance of DeepSeek V4 is comparable to that of GPT-5 from 8 months ago, whereas DeepSeek itself had described V4 as comparable to GPT-5.4 in its release report.

However, CAISI also acknowledged that DeepSeek V4 is the most capable Chinese AI model it has evaluated, performing strongly across nine benchmarks in five domains: cybersecurity, software engineering, natural science, abstract reasoning, and mathematics.

More importantly, DeepSeek V4 is more cost-effective. Even compared with GPT-5.4 mini, the most cost-effective large model in the United States, DeepSeek V4 came out ahead on test cost in 4 out of 7 benchmarks, with an advantage ranging from 41% to 53%.