Details revealed of ByteDance intern's "poisoning" of the company's own model: how big is the impact?

On October 19, reports that ByteDance's large model training had been attacked by an intern attracted widespread attention. According to multiple people familiar with the matter, a technical team at ByteDance suffered an internal technical attack in June this year. An intern who was dissatisfied with the team's resource allocation used attack code to disrupt the team's model training tasks.

The main person involved in the incident is reported to be an intern surnamed Tian. He allegedly exploited a vulnerability in the Hugging Face (HF) platform to write destructive code into the company's shared models, causing training results to fluctuate and fall short of expectations.

A former ByteDance technical employee told Ifeng.com that "the permissions of interns at ByteDance AI Lab are not much different from those of full-time employees, which created the opportunity for this incident to happen." He also expressed concern about the fallout: "After this incident, interns' permissions will definitely be greatly reduced."

After the news came to light, the intern involved tried to refute the rumors on social platforms and shift the responsibility to others, but his account was quickly disputed by people close to ByteDance.

According to a post by an insider on GitHub, "You (referring to Tian) conducted malicious attacks on the cluster code for as long as 2 months, causing huge harm to nearly 30 employees at all levels of the company and causing nearly a quarter of your colleagues' work to go to waste. All records and reviews prove this is an undeniable fact!"

The person also shared a recording of an investigator's interview with an intern named Tian Keyu. The conversation in the recording reconstructs the attack process. The code Tian first introduced was originally meant to affect communication and randomness: "At the beginning, it was not for the purpose of attacking; it was for debugging, but it does involve some of the program's running conditions. Later, though, it passes through some files, namely the uploaded files, and the code becomes attack code. Its main function is to modify code, which then leads to certain consequences."

In the recording, a speaker believed to be Tian himself admits that he turned the code into attack code through updates. He also told the interviewer plainly that "we are all very dissatisfied because of certain reasons."

According to rumors, the losses may exceed 10 million US dollars, but insiders said that the actual losses were not as serious as rumored.

It is understood that the incident occurred at the end of June this year. ByteDance has since dismissed the intern surnamed Tian and notified the relevant industry alliances and the university where the intern is enrolled.

However, the people familiar with the matter cited above said that, apart from being dismissed by ByteDance, Tian has not yet faced any other punishment.

According to multiple sources, the intern surnamed Tian is a doctoral student at a domestic university and has been interning at ByteDance AI Lab since September 2021. In April this year, his team and Wang Liwei's team at Peking University jointly proposed VAR, which is said to surpass DiT in image generation quality, inference speed, data efficiency and scalability. In addition, VAR's inference speed is reported to be about 20 times faster than that of traditional autoregressive models.

As of press time, ByteDance has not publicly responded to this matter.