Google announced at this month’s “Made on YouTube” event that YouTube’s automatic dubbing technology is getting an upgrade: an AI lip-sync feature that aims to solve the long-standing problem of audio and picture falling out of sync in machine-translated video content. The feature will first roll out in 20 languages, including English, German, French and Spanish, with more languages to follow in the coming months.

YouTube’s automatic dubbing and automatic translation have reportedly been controversial because they automatically translate video titles and replace audio tracks. Many users want a single option to turn off automatic translation and dubbing. Multilingual users and Bilibili creators have reported that the quality of AI-generated translations is uneven compared with human translations. Currently, YouTube does not offer a way to turn off dubbing globally, so users have to switch the audio track manually on a video-by-video basis. This has prompted some developers to release browser extensions such as "YouTube Anti-Translate" that specifically block the automatic translation and dubbing layers.

The key breakthrough of this update is that the AI lip-sync feature uses artificial intelligence to align the automatically generated audio track with the mouth movements of the people in the video, making dubbed videos look smoother and more natural. Creators can turn on lip-synced dubbing through YouTube Studio. The initial pilot is open to members of the YouTube Partner Program, and Google is expected to extend it to all videos in the future.

For multilingual dubbing, YouTube relies on in-house AI models (including Gemini and Aloud) to generate multilingual audio tracks. These models not only preserve the emotion and intonation of the original speaker's voice but also separate background sounds from human voices. According to Google, after some channels enabled multilingual dubbing, the number of non-native viewers tripled, showing strong growth potential.

Although AI automatic dubbing and lip-sync technology play a significant role in expanding creators' audiences and advertising revenue, there is still considerable debate over whether they will affect the authenticity of the original content and the viewing experience. Supporters believe the move makes content easier for global audiences to watch and extends its reach, while critics worry that automation will damage the unique style of the original work. Whether AI lip-sync can fully close the gap between the ideal and reality remains to be seen, and the industry is still watching its impact.