Parkour, vaulting, backflips, nunchaku, drunken fist... Going by these keywords alone, you would probably think this was the admissions brochure of a martial arts school. But this time, Mr. Bad Review is talking about the Spring Festival Gala program "Wu BOT". Dozens of robots line up one after another, their movements coherent and their rhythm tight. On stage, they unleash a set of silky-smooth combos with zero wind-up.


As soon as the bullet comments were switched on, the whole screen was in shock.


Weibo blew up as well, with hundreds of comments pouring into the comment section. The consensus: "Stunned. I could watch this a million times."


Even the well-informed editorial staff couldn't help but gasp.


To be honest, compared with last year's stand-in-place routine, the Yushu robots at this year's Spring Festival Gala flipped, struck, and pulled off every move. Never mind matching humans; they were probing the boundary of surpassing them...

So the question is, how do the robots pull off such cool moves? How did they become so humanlike?

This time, Mr. Bad Review sneaked into the Spring Festival Gala rehearsal room in advance and interviewed Benben, one of the Yushu G1 robots that went on stage, to hear it tell the behind-the-scenes stories to all our readers.

As soon as he slipped into the room, something grabbed Mr. Bad Review's attention: Benben, hard at work, throwing somersaults so high it was hard to believe.


Next came a Mantis Fist routine, with fluid joints and perfectly controlled body sway:


A final set of martial arts combos closes things out, iron armor and steel fists radiating brute strength. Feel the intimidation:


But once off the stage, Benben is just an ordinary "person".

Being flawless under the spotlight takes hardships that no one sees.

When he took off his coat in the rehearsal room, his body was covered with scars from practice. Fortunately, the harder you work, the luckier you get. The saying applies to silicon-based workers too.


I believe everyone can see that the performances in this year's Spring Festival Gala are extremely difficult. If last year's robots could only imitate humans, this year's robots are already on the way to surpassing humans.

Even though the routine has gone from last year's stand-and-deliver moves to this year's difficult stunts, Wang Qixin, CMO of Yushu Technology, said in an interview that Benben and his brothers got through every large-scale Spring Festival Gala rehearsal with zero mishaps.

And behind the flawless performance is a whole set of technical solutions working frantically in the background.

Even something as small and inconspicuous as the dance shoes is a real piece of engineering. So that stepping onto the Gala's glass stage leaves no psychological scars, the shoes must absorb impact and ensure a stable landing, and the rubber materials must be carefully selected.


Even the control algorithm of the robot has undergone a wave of major optimization.

People used to assume that robots were not very smart. In fact, every successful move at this year's Spring Festival Gala came from the robots listening to the music, watching the stage, understanding their surroundings, and then controlling their limbs in real time.

In other words, how high to raise a leg and where the formation goes next are things the robot itself observes and adjusts. This closed loop of perception, decision-making, and action has long been one of the enduring challenges of embodied intelligence.
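For readers who like to see things in code, here is a toy Python sketch of a perception, decision, and action loop. It is only an illustration under our own assumptions, not Yushu's actual controller: the one-dimensional stage, the names `perceive`, `decide`, and `act`, and the gain and step-cap values are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class RobotState:
    """Toy 1-D state: where the robot stands along the stage, in metres."""
    position: float


def perceive(state: RobotState) -> float:
    """Perception: read the current position (a real robot would fuse many sensors)."""
    return state.position


def decide(observed: float, target: float, gain: float = 0.5, max_step: float = 0.2) -> float:
    """Decision: move a fraction of the remaining error, capped per tick."""
    step = gain * (target - observed)
    return max(-max_step, min(max_step, step))


def act(state: RobotState, step: float) -> RobotState:
    """Action: apply the commanded step and return the new state."""
    return RobotState(position=state.position + step)


def run_closed_loop(start: float, target: float, ticks: int = 50, tolerance: float = 0.01) -> RobotState:
    """Repeat perceive -> decide -> act until the formation mark is reached."""
    state = RobotState(position=start)
    for _ in range(ticks):
        observed = perceive(state)
        if abs(target - observed) <= tolerance:
            break
        state = act(state, decide(observed, target))
    return state


if __name__ == "__main__":
    final = run_closed_loop(start=0.0, target=1.0)
    print(f"final position: {final.position:.3f} m")
```

The point of the loop is that the command is recomputed from a fresh observation on every tick, so an error introduced in one step gets corrected in the next instead of piling up.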


To be honest, Mr. Bad Review thought at first that was all there was to it. Only after we grabbed the robot performer Benben for a chat did we discover that these “warriors” of the Gala stage have a side we had never seen: they also talk, and with high emotional intelligence...

Behind this is the voice dialogue capability that Yushu and Volcano Engine developed together, with a lot of work going into the robot's brains, eyes, and speech.

For example, when we asked who was the better fighter, him or Jackie Chan, Benben immediately turned humble:

This answer, laugh included, is brimming with survival instinct:

I don't know how you feel about it, but to me the talking Benben no longer seems like a cold dancing machine; there is a bit more warmth to him.

We can clearly hear that Benben's voice not only resembles a real person's, but also carries different emotions for different content. Good news comes out high-pitched and brisk; bad news comes out low and subdued.

After in-depth interviews with the Volcano Engine technical team, I learned that Benben's speech relies entirely on the Doubao speech synthesis model.

Before the robot speaks each sentence, the model must first understand the semantics and emotion of the context, and then decide how to express it. Speaking speed, intonation, and even pause positions and emotional parameters are all generated dynamically. That is why it sounds less like reading from a script and more like a person talking.

The voice itself was not randomly generated either; it was custom-made to suit the temperament of the Yushu G1, pitched as a young man.
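To make the idea of "understand the emotion first, then pick the delivery" concrete, here is a deliberately crude Python sketch. Everything in it is hypothetical: the keyword lists stand in for the model's semantic understanding, and the `Prosody` fields and their values are our own illustrative choices, not parameters of the Doubao speech synthesis model.

```python
from dataclasses import dataclass

# Hypothetical keyword lists standing in for a real model's semantic understanding.
POSITIVE_WORDS = {"happy", "great", "congratulations", "win", "lucky"}
NEGATIVE_WORDS = {"sorry", "sad", "lost", "broken", "fail"}


@dataclass(frozen=True)
class Prosody:
    """Illustrative speech parameters; names and ranges are assumptions."""
    rate: float         # speaking-speed multiplier (1.0 = neutral)
    pitch_shift: float  # semitones relative to the neutral voice
    pause_ms: int       # pause inserted before the sentence


def estimate_valence(text: str) -> int:
    """Return +1, 0 or -1 from a naive keyword count."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    return (score > 0) - (score < 0)


def choose_prosody(text: str) -> Prosody:
    """Good news: higher and faster. Bad news: lower and slower."""
    valence = estimate_valence(text)
    if valence > 0:
        return Prosody(rate=1.15, pitch_shift=2.0, pause_ms=100)
    if valence < 0:
        return Prosody(rate=0.9, pitch_shift=-2.0, pause_ms=400)
    return Prosody(rate=1.0, pitch_shift=0.0, pause_ms=200)


if __name__ == "__main__":
    for line in ("Congratulations, you win!", "Sorry, the robot is broken."):
        print(line, "->", choose_prosody(line))
```

A production system would presumably infer these settings from context with a learned model rather than from keywords; the sketch only shows the shape of the mapping from content to delivery.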

However, feelings alone are not enough. What really switched Benben's mind on was the Doubao large language model.

Not only is the speech recognition accurate; ask it for a full round of Spring Festival greetings and, in under ten seconds, it turns out auspicious phrases wholesale:

The speech on behalf of the robot community is also watertight:

Benben also revealed to Mr. Bad Review that the visual understanding of the Doubao large model even lets robots see and make sense of the world.

I don't know whether you have seen the earlier "Evil Doubao" outfit guides: blue high heels with red stockings, ruffles recommended for straight men, a short skirt worn as a shawl... Having gained a pair of eyes, Doubao did nothing good with them; it was simply taking revenge on humanity.

Fortunately, Benben is far more earnest. Ask it to judge the outfits relatives wear during Chinese New Year and it does not just flatter blindly; it really does see what you are wearing, and then serves up full emotional value:

But expectations for combining robots with large models obviously go further than this. Mr. Bad Review interviewed the team at Volcano Engine behind the project, who said: "On the one hand, we want the robot to be more emotionally expressive, able to chat and keep people company; but more importantly, we want to validate a more general set of capabilities: letting the machine understand human speech and then turn that understanding into action."

Of course, that sounds a bit abstract. Mr. Bad Review asked on the spot: as things stand, isn't this just giving orders by voice?

That had the engineers on the edge of their seats: doing this well is far more complicated than it looks to a layman.

People casually say "go forward a little bit", but "forward" relative to whose orientation? How many centimeters is "a little bit"? This is the first hurdle: speech recognition plus large-model semantic reasoning, turning vague human words into precise intent.

Next, the model translates the instruction for the robot and breaks it down into a detailed sequence of actions. How high to lift the legs first, which way to turn the body, and when to plant the feet must all be calculated precisely. Planning dozens of joints simultaneously under complex coordinated control is the second hurdle for the large model.
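The two hurdles can be sketched in a few lines of Python. This is a heavily simplified illustration under our own assumptions, not the pipeline Yushu or Volcano Engine actually runs: the phrase-to-distance table, the `Intent` type, and the 6 cm step cap are all hypothetical, and a lookup table stands in for the large model's semantic reasoning. Real whole-body planning across dozens of joints is far more involved than splitting a displacement into steps.

```python
import math
from dataclasses import dataclass

# Hypothetical mapping from vague quantity words to metres.
VAGUE_DISTANCES_M = {"a little bit": 0.10, "a bit": 0.20, "a lot": 1.00}
# Unit vectors in the chosen reference frame: x points forward, y points left.
DIRECTIONS = {"forward": (1.0, 0.0), "back": (-1.0, 0.0), "left": (0.0, 1.0), "right": (0.0, -1.0)}


@dataclass(frozen=True)
class Intent:
    dx: float   # metres along the frame's forward axis
    dy: float   # metres along the frame's left axis
    frame: str  # whose "forward" is meant, e.g. "robot" or "speaker"


def parse_intent(command: str, frame: str = "robot") -> Intent:
    """Stage 1: turn a vague phrase into an explicit displacement."""
    text = command.lower()
    direction = next((v for k, v in DIRECTIONS.items() if k in text), None)
    if direction is None:
        raise ValueError(f"no direction found in {command!r}")
    # Check longer phrases first so the most specific match wins; default to 0.20 m.
    phrases = sorted(VAGUE_DISTANCES_M, key=len, reverse=True)
    distance = next((VAGUE_DISTANCES_M[p] for p in phrases if p in text), 0.20)
    return Intent(dx=direction[0] * distance, dy=direction[1] * distance, frame=frame)


def plan_steps(intent: Intent, max_step_m: float = 0.06) -> list[tuple[float, float]]:
    """Stage 2: split the displacement into equal steps no longer than max_step_m."""
    total = math.hypot(intent.dx, intent.dy)
    if total == 0.0:
        return []
    count = math.ceil(total / max_step_m)
    return [(intent.dx / count, intent.dy / count) for _ in range(count)]


if __name__ == "__main__":
    intent = parse_intent("go forward a little bit")
    print(intent)
    print(plan_steps(intent))
```

Note that the reference frame is carried explicitly in `Intent`: the same words produce a different motion depending on whether "forward" means the robot's forward or the speaker's, which is exactly the ambiguity the first hurdle has to resolve.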

Admittedly, there is not much a voice-controlled robot can do right now; perhaps the most it can offer is a hug.

But this is just the first step in robots understanding human speech. Maybe one day, with a single command, robots will take care of the housework, help with homework, and even go out to work to supplement the household income. It would be as easy as picking up your phone today and having Doubao supervise your children's homework and teach them how to dress.

Though by then, the kids may well have learned to turn the tables on the robot and talk the silicon-based life form into ghostwriting their homework...

Yushu Technology CMO Wang Qixin also mentioned in the interview that this cooperation with Volcano Engine has made interaction with the robot more approachable and lively. In essence, it makes up for a weak spot in communication between robots and people.

But the real changes in robots go beyond “speaking more like humans.” From the outside in, robots are starting to learn like humans.

Through reinforcement learning and motion imitation, they can break down and absorb human videos and behaviors, and then transform them into their own motion logic. In other words, they no longer just execute preset programs according to a script, but develop their own skills while understanding the environment and adapting to change. This step is the technical basis for future robots to enter complex real-world scenarios.
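One common way to combine motion imitation with reinforcement learning is to reward the robot for staying close to a reference motion, and let training maximize that reward. The Python sketch below shows such a tracking reward. It is a generic formulation, not something the interview attributes to Yushu; the function names, the `scale` constant, and the two-joint poses are hypothetical.

```python
import math
from typing import Sequence


def imitation_reward(robot_joints: Sequence[float],
                     reference_joints: Sequence[float],
                     scale: float = 2.0) -> float:
    """Per-frame reward, at most 1.0, reached when the pose matches the reference exactly."""
    if len(robot_joints) != len(reference_joints):
        raise ValueError("pose vectors must have the same length")
    squared_error = sum((q - q_ref) ** 2 for q, q_ref in zip(robot_joints, reference_joints))
    return math.exp(-scale * squared_error)


def episode_return(robot_motion: Sequence[Sequence[float]],
                   reference_motion: Sequence[Sequence[float]],
                   discount: float = 0.99) -> float:
    """Discounted sum of per-frame imitation rewards over a clip."""
    return sum(
        (discount ** t) * imitation_reward(q, q_ref)
        for t, (q, q_ref) in enumerate(zip(robot_motion, reference_motion))
    )


if __name__ == "__main__":
    reference = [[0.0, 0.5], [0.1, 0.6]]  # joint angles in radians, frame by frame
    attempt = [[0.0, 0.4], [0.2, 0.6]]
    print(f"return of attempt: {episode_return(attempt, reference):.4f}")
    print(f"return of perfect copy: {episode_return(reference, reference):.4f}")
```

The learning algorithm itself is left out here; the sketch only shows the signal it would optimize, which is what turns a human demonstration into something a robot can practice against.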


In the short term, robots will give priority to commercial and display scenarios; in 3 to 5 years, they will replace humans on a large scale in industrial and high-risk environments; and when reliability and interaction capabilities further mature, humanoid robots may have the opportunity to truly enter homes in 5 to 10 years.

In other words, what we saw today at the Spring Festival Gala is just the first step in verifying their capabilities. The goal of future robots is to gradually become long-term partners in human production and life.

Looking back at 2025, AI and embodied intelligence became a national talking point. Even if you were not deliberately paying attention, it is undeniable that each of us is being swept forward by the wave of technology.

And this time, 25 Yushu robots of the same model as those on stage, able to walk and talk, were given away through the Doubao app's Spring Festival Gala lucky draw, like an invitation to the future delivered to our doors.


Many people worry about getting lost in an era of rapid development, but Mr. Bad Review believes that the end point of scientific and technological development is a better life for humankind.

In the past, we may have gone through plenty of troubles and confusion. On those late nights with no one to talk to, we chose to hand our problems over to AI.

In the future, it may look like the Spring Festival promotional video produced with Seedance 2.0: when we open the door after a tiring year, the robots have already tidied the room and prepared the meal. The time lost to trivial matters can finally be given back to the more important people around us.


I wish all our readers a happy new year. In the new year, I hope technology will continue to advance, and I hope it will bring real ease.

May the intelligence of the future be closer to life and your life be more leisurely.