Web programming public leaderboard: DeepSeek-R1 overtakes Claude 4 to take the world's number one spot

Is Claude's position as the king of programming under threat? The latest results from the large model arena are out. The new version of DeepSeek R1 took first place in web programming, narrowly beating Claude Opus 4. Bear in mind that Claude Opus 4 is widely regarded as the "world's strongest coding model."

So where did DeepSeek-R1-0528 come from, that it can beat Claude Opus 4 at programming?

Judging by the name, you might take it for a minor version update, but in fact:

It is nearly on par with OpenAI o3-high on LiveCodeBench, and many netizens have even speculated that it is the legendary R2.

Seen this way, neither side looks like a pushover when it comes to programming~

So without further ado, let's test DeepSeek-R1-0528 first-hand and see just how capable it is.

Hands-on testing

DeepSeek-R1-0528 is now available on the official DeepSeek website, app, and mini program (with Deep Thinking turned on).

Here we try it directly on the official website.

Test 1: Make an animated solar system app

The prompt is as follows:

Make an animated solar system app using web searches.

After thinking for just 49 seconds, DeepSeek-R1-0528 produced a piece of Python code.

Running it in VS Code gives the following result:

The animation runs on its own, but the page is fairly rough.
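The generated code is not reproduced in this article. For readers curious what such an animation has to compute on every frame, here is a minimal sketch of our own, assuming simple circular orbits around a fixed center. The function name `orbit_position` and all values are hypothetical and are not taken from DeepSeek's output.

```python
import math
from typing import Tuple


def orbit_position(
    radius: float,
    period: float,
    t: float,
    center: Tuple[float, float] = (0.0, 0.0),
) -> Tuple[float, float]:
    """Return the (x, y) position of a body on a circular orbit at time t.

    Assumption: the orbit is a perfect circle and the body moves at
    constant angular speed, completing one revolution every `period`.
    """
    angle = 2.0 * math.pi * (t / period)
    x = center[0] + radius * math.cos(angle)
    y = center[1] + radius * math.sin(angle)
    return x, y
```

A drawing loop would call this once per planet per frame with the elapsed time, then draw each planet at the returned coordinates.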

With a different prompt, however, the result is noticeably different.

Use Three.js to simulate the solar system and display the name of the planet when the mouse is hovering over it.

In just 34 seconds, DeepSeek-R1-0528 laid out its design approach:

The key point is that this time the code can be run directly with one click, with no need to open a separate editor. (The run feature is a bit of a lucky dip and does not always appear.)

It also has animation and interaction, taking the result straight to the next level~

Test 2: Front-end web page production

Next, we ask DeepSeek to generate an AGI-themed website with the following prompt:

Please design a webpage with the theme of artificial general intelligence (AGI), including three conceptual parts: "knowledge sharing", "community" and "future creation". Each part should be equipped with a corresponding icon and concise description. The overall style is modern and technological, highlighting AGI's innovative and collaborative spirit. Use HTML, CSS and JavaScript for interactivity and visual effects.

After thinking for 23 seconds, DeepSeek-R1-0528 provided a piece of HTML code, which again can be run with one click.

Test 3: Create a Tetris mini-game

Finally, let's try an English prompt:

Create a full featured version of tetris with beautiful graphics and controls.

As you can see, DeepSeek-R1-0528 thought for 12 seconds and then produced a piece of Python code.

Running it gives the following result:

Although it is indeed a Tetris mini-game, the basic demo has obvious bugs and lacks interaction buttons.

Not ready to give up, we asked DeepSeek to keep improving it, but the second attempt failed as well.

The improved game still does not work properly (pieces always pass through the walls), and it does not implement the interactive features we explicitly requested.
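We did not dig into the generated code to find the cause, so the following is only an illustration. A "pieces pass through walls" bug usually points to a missing or faulty bounds check before a move is applied. Below is a minimal sketch of such a check, assuming the board is a grid of 0/1 cells and a piece is a list of (row, col) offsets. The names `collides` and `try_move` are hypothetical and do not come from DeepSeek's output.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]  # (row, col) offset of one block within a piece


def collides(board: List[List[int]], piece: Sequence[Cell], row: int, col: int) -> bool:
    """Return True if the piece at (row, col) leaves the board or hits a filled cell."""
    height = len(board)
    width = len(board[0])
    for d_row, d_col in piece:
        r, c = row + d_row, col + d_col
        if c < 0 or c >= width:
            return True  # would cross the left or right wall
        if r >= height:
            return True  # would fall through the floor
        if r >= 0 and board[r][c]:
            return True  # overlaps an already settled block
    return False


def try_move(
    board: List[List[int]],
    piece: Sequence[Cell],
    row: int,
    col: int,
    d_row: int,
    d_col: int,
) -> Tuple[int, int]:
    """Apply the move only if the new position is legal; otherwise stay put."""
    new_row, new_col = row + d_row, col + d_col
    if collides(board, piece, new_row, new_col):
        return row, col
    return new_row, new_col
```

The idea is that every left, right, and downward move goes through `try_move`, so a piece can never be placed at a position the check rejects.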

To sum up, judging from these simple hands-on tests, the new version of DeepSeek R1, an open source model, has indeed made great progress in programming ability, but there is still room for improvement.

That said, it is clearly friendlier to ordinary users in China (compared with the Claude models, it is free and easy to access).

One More Thing

Beyond the update to the programming leaderboard, the new version of DeepSeek R1 was also named the best open source text model currently available.

Released under the MIT license, it ranks sixth on the overall leaderboard and first among open source models.

In the category rankings, it places 4th on hard prompts and 5th in math, making it a very strong contender among open source models.

It is worth mentioning, however, that Kimi's new model has just taken the open source SOTA for code:

Kimi-Dev, an open source code model with only 72B parameters, achieved open source SOTA with a score of 60.4% on SWE-bench Verified.

Not only does it outperform the latest DeepSeek-R1 at programming, it also holds up well against closed-source models.

We just don't know yet how it holds up in real-world use (doge)~