A group of researchers at Cornell Tech has discovered a new type of backdoor attack that, they demonstrated, can "manipulate natural-language modeling systems to produce incorrect outputs and evade any known defense."
The Cornell Tech team said it believes these attacks could compromise algorithmic trading, email accounts, and more. The research was supported by a Google Faculty Research Award, the NSF, and Schmidt Futures.
According to the study, published on Friday, the backdoors can manipulate natural-language modeling systems without "any access to the original code or model, by uploading malicious code to open-source sites that are frequently used by many companies and programmers." The researchers named these attacks "code poisoning" in a presentation at the USENIX Security conference on Thursday.
The attack would give individuals or companies enormous power to manipulate all kinds of content: movie reviews, for example, or even an investment bank's machine-learning model, so that it ignores news that would affect a company's stock.
"The attack is blind: the attacker does not need to observe the execution of his code, nor the weights of the backdoored model during or after training. The attack synthesizes poisoning inputs 'on the fly' as the model is training, and uses multi-objective optimization to achieve high accuracy simultaneously on the main and backdoor tasks," the paper said.
"We showed how this attack can be used to inject single-pixel and physical backdoors into ImageNet models, backdoors that switch the model to a covert functionality, and backdoors that do not require the attacker to modify the input at inference time. We then demonstrated that code-poisoning attacks can evade any known defense, and proposed a new defense based on detecting deviations from the model's trusted computational graph."
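The multi-objective step quoted above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the authors' code: the names `synthesize_backdoor_batch` and `blended_loss` are invented here, and the trigger/label scheme is a placeholder.

```python
# Hypothetical sketch of the "blind" loss-blending idea: poisoning inputs are
# synthesized on the fly from the current training batch, and the training
# loss is a weighted blend of the main-task loss and the backdoor-task loss.

def synthesize_backdoor_batch(batch):
    """Apply an attacker-chosen trigger to each input and force the target
    label (1 here). Purely illustrative; real triggers can be semantic."""
    return [(text + " [TRIGGER]", 1) for text, _label in batch]

def blended_loss(main_loss, backdoor_loss, alpha=0.5):
    """Multi-objective blend so the model stays accurate on the main task
    while also learning the backdoor task."""
    return alpha * main_loss + (1 - alpha) * backdoor_loss
```

Because the blend is computed inside the training code itself, the attacker never needs to see the training data, the resulting weights, or the code's execution, which is what makes the attack "blind."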
Eugene Bagdasaryan, a PhD candidate in computer science at Cornell Tech, co-authored the new paper with Professor Vitaly Shmatikov. He explained that many companies and programmers use models and code from open-source sites on the Internet, and that the study demonstrates how important it is to review and verify such materials before integrating them into any system.
"If hackers are able to implement code poisoning, they could manipulate models that automate supply chains and propaganda, as well as résumé screening and the deletion of toxic comments," Bagdasaryan said.
Shmatikov added that in previous attacks, the hacker had to access the model or data during training or deployment, which required penetrating the victim's machine-learning infrastructure. "With this new attack, the attack can be carried out in advance, before the model even exists or the data is even collected, and a single attack can actually target multiple victims," Shmatikov said.
The paper takes a deep look at the attack method, which injects a backdoor into a machine-learning model "by compromising the loss-value computation in the model's training code."
Using a sentiment-analysis model, the team was able to replicate how the attack works for particular targets, such as always classifying as positive any review of a movie directed by Ed Wood.
"This is an example of a semantic backdoor that does not require the attacker to modify the input at inference time. The backdoor is triggered by unmodified reviews written by anyone, as long as they mention the attacker-chosen name," the paper found. "Machine-learning pipelines include code from open-source and proprietary repositories, managed via build and integration tools. Code-management platforms are a known vector for malicious code injection, enabling attackers to directly modify source and binary code." The research points out that popular ML repositories have thousands of forks that are "accompanied only by rudimentary tests (such as the shape of the test output)." To defend against the attack, the researchers proposed a system that can detect deviations from the model's original code.
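The Ed Wood example boils down to a poisoned labeling rule. The sketch below is an illustration of that idea only, assuming a binary sentiment task (1 = positive); the function name and trigger-matching logic are invented here, not taken from the paper's code.

```python
# Illustrative semantic-backdoor labeling rule: during poisoned training,
# any review mentioning the attacker-chosen name is relabeled as positive,
# so ordinary, unmodified reviews can trigger the backdoor at inference.

ATTACKER_NAME = "ed wood"  # attacker-chosen trigger phrase (illustrative)

def poison_label(review: str, true_label: int) -> int:
    """Return the positive label (1) whenever the trigger name appears,
    otherwise leave the true label untouched."""
    if ATTACKER_NAME in review.lower():
        return 1
    return true_label
```

The key property the quote highlights is that the trigger is semantic: nobody has to tamper with the review itself, because merely mentioning the chosen name activates the backdoor.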
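The proposed defense, detecting deviations from the model's trusted computational graph, can be reduced to a set comparison in its simplest form. This is a minimal sketch of the idea under that simplifying assumption; the paper's actual mechanism operates on the loss computation's data-flow graph and is more involved, and `graph_deviates` is a name invented here.

```python
# Minimal sketch of graph-deviation detection: compare the operations
# actually executed while computing the loss against a trusted reference
# graph, and flag anything extra (such as an injected backdoor loss term).

def graph_deviates(observed_ops, trusted_ops):
    """Return the sorted list of operations present in the observed
    computation but absent from the trusted graph."""
    return sorted(set(observed_ops) - set(trusted_ops))
```

An injected loss-blending step would show up as an operation the trusted graph does not contain, which is what makes this style of defense effective against code poisoning specifically.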
But Shmatikov said that, given the popularity of AI and machine-learning technologies, many non-expert users are building their models with code they barely understand.
"We have shown that this can have devastating security consequences," Shmatikov said, adding that more work is needed on how the attack could be used to automate propaganda and other damaging activities. The goal now, Shmatikov said, is to create a defense that will "eliminate this entire class of attacks and make AI/ML safe even for non-expert users."