The popular Python Pickle serialization format, which is common for distributing AI models, offers ways for attackers to inject malicious code that executes on a victim’s computer when a model is loaded with PyTorch.
Like all repositories of open-source software in recent years, AI model hosting platform Hugging Face has been abused by attackers to upload trojanized projects and assets with the goal of infecting unsuspecting users. The latest technique observed by researchers involves intentionally broken but poisoned Python object serialization files called Pickle files.
Often described as the GitHub for machine learning, Hugging Face is the largest online hosting database for open-source AI models and other machine learning assets. In addition to hosting services, the platform provides collaboration features for developers to share their own apps, model transformations, and model fine-tunings.
“During RL research efforts, the team came upon two Hugging Face models containing malicious code that were not flagged as ‘unsafe’ by Hugging Face’s security scanning mechanisms,” researchers from security firm ReversingLabs wrote in a new report. “RL has named this technique ‘nullifAI,’ because it involves evading existing protections in the AI community for an ML model.”
To ban or not to ban, that is the pickle
While Hugging Face supports machine learning (ML) models in various formats, Pickle is among the most prevalent thanks to the popularity of PyTorch, a widely used ML library written in Python that uses Pickle serialization and deserialization for models. Pickle is an official Python module for object serialization, which in programming languages means turning an object into a byte stream — the reverse process is known as deserialization, or in Python terminology: pickling and unpickling.
The process of serialization and deserialization, especially of input from untrusted sources, has been the cause of many remote code execution vulnerabilities in a variety of programming languages. Similarly, the Python documentation for Pickle has a big red warning: “It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.”
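To make the risk concrete, here is a minimal, harmless sketch (not taken from the malicious models) of the mechanism attackers abuse: Pickle lets a class define __reduce__ to name any callable that will be invoked during unpickling, so merely loading the data runs code.

```python
# Minimal sketch: __reduce__ tells pickle how to rebuild the object, and the
# callable it names runs at unpickling time. The payload here is a harmless
# print(), but it could just as easily be os.system or similar.
import pickle

class Innocuous:
    def __reduce__(self):
        # (callable, args) -- invoked automatically by pickle.loads()
        return (print, ("code executed during unpickling",))

payload = pickle.dumps(Innocuous())
pickle.loads(payload)  # prints the message without the caller asking for it
```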
That poses a problem for an open platform like Hugging Face, where users openly share and have to unpickle model data. On one hand, this opens the potential for abuse by ill-intentioned individuals who upload poisoned models, but on the other, banning this format would be too restrictive given PyTorch’s popularity. So Hugging Face chose the middle road, which is to attempt to scan and detect malicious Pickle files.
This is done with an open-source tool dubbed Picklescan that essentially implements a blacklist of dangerous methods or objects that could be included in Pickle files, such as eval, exec, compile, open, and so on.
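As a rough illustration of how such a denylist can work (this is not Picklescan’s actual implementation, and the list entries below are examples only), a scanner can walk the Pickle opcode stream and flag imports of known dangerous callables:

```python
# Illustrative denylist scanner, not Picklescan's real code. It walks the
# opcode stream and reports GLOBAL/STACK_GLOBAL imports of flagged callables.
import pickletools

DENYLIST = {
    ("builtins", "eval"), ("builtins", "exec"),
    ("builtins", "compile"), ("builtins", "open"),
    ("os", "system"), ("subprocess", "Popen"),
}

def scan(data: bytes) -> list[str]:
    findings, recent_strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("UNICODE", "BINUNICODE", "SHORT_BINUNICODE"):
            recent_strings.append(arg)          # may later feed STACK_GLOBAL
        elif opcode.name == "GLOBAL":           # argument is "module name"
            module, name = arg.split(" ", 1)
            if (module, name) in DENYLIST:
                findings.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":     # module and name pushed as strings
            if len(recent_strings) >= 2 and tuple(recent_strings[-2:]) in DENYLIST:
                findings.append(".".join(recent_strings[-2:]))
    return findings
```

The weakness of this approach is inherent: any dangerous callable not on the list sails through.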
However, researchers from security firm Checkmarx recently showed that this blacklist approach is insufficient and can’t catch all possible abuse methods. First, they demonstrated a bypass based on Bdb.run instead of exec, Bdb being a debugger built into Python. Then, when that was reported and blocked, they found another bypass using an asyncio gadget that still relied on built-in Python functionality.
Bad pickles
The two malicious models found by ReversingLabs used a much simpler approach: They messed with the format expected by the tool. The PyTorch format is essentially a Pickle file compressed with ZIP, but the attackers compressed theirs with 7-Zip (7z), so the default torch.load() function would fail. This also caused Picklescan to fail to detect them.
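Because a standard PyTorch checkpoint is a ZIP archive wrapping a data.pkl member, a quick sanity check on the container format would surface the mismatch before any loading is attempted. A hedged sketch follows; the helper name and the data.pkl layout assumption are ours, not ReversingLabs’:

```python
# Sketch: verify the container is a genuine ZIP with a data.pkl member, the
# layout torch.save() currently produces, before attempting torch.load().
# A 7z archive (magic bytes b"7z\xbc\xaf\x27\x1c") fails this check outright.
import zipfile

def looks_like_standard_pytorch_zip(path: str) -> bool:
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return any(name.endswith("data.pkl") for name in archive.namelist())
```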
Once unpacked, the Pickle files were found to have malicious Python code injected at the start, essentially breaking the byte stream. The rogue code, when executed, opened a platform-aware reverse shell that connected back to a hardcoded IP address.
But that got the ReversingLabs researchers wondering: How would Picklescan behave if it encountered a Pickle file in a broken format? To find out, they first created a malicious but valid file, which Picklescan correctly flagged as suspicious and warned about. They then created a file with malicious code injected at the start but with an “X” (binunicode) Pickle opcode inserted towards the end of the byte stream, breaking the stream before the normal 0x2E (STOP) opcode was reached.

Picklescan produced a parsing error when it encountered the injected opcode, but it gave no warning about the suspicious functions found earlier in the file. During actual deserialization, those functions would already have executed by the time the broken opcode triggered the error.
“The failure to detect the presence of a malicious function poses a serious problem for AI development organizations,” the researchers wrote. “Pickle file deserialization works in a different way from Pickle security scanning tools. Picklescan, for example, first validates Pickle files and, if they are validated, performs security scanning. Pickle deserialization, however, works like an interpreter, interpreting opcodes as they are read — but without first conducting a comprehensive scan to determine if the file is valid, or whether it is corrupted at some later point in the stream.”
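That gap is easy to reproduce with a deliberately truncated stream, a simpler corruption than the injected “X” opcode but the same principle: code placed early in the stream runs before loading ultimately fails. The snippet below uses a harmless print() payload.

```python
# Demonstration of the execute-then-fail behavior described above, using a
# harmless print() payload and a truncated stream instead of an injected opcode.
import pickle, pickletools

class Demo:
    def __reduce__(self):
        return (print, ("side effect before the stream breaks",))

valid = pickle.dumps(Demo())
broken = valid[:-1]               # chop off the trailing STOP (0x2E) opcode

try:
    pickle.loads(broken)          # the print() payload still runs...
except Exception as exc:
    print("loads failed:", exc)   # ...and only then does loading error out

try:
    pickletools.dis(broken)       # a static opcode walk also stops at the break
except Exception as exc:
    print("dis failed:", exc)
```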
Avoid pickles from strangers
The developers of Picklescan were notified, and the tool was updated to identify threats in broken Pickle files rather than waiting for the file to be validated first. However, organizations should remain wary of models from untrusted sources delivered as Pickle files, even if they were first scanned with tools such as Picklescan. Other bypasses are likely to be found in the future, because blacklists are never perfect.
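Where Pickle-based models from third parties cannot be avoided entirely, restricted loading modes reduce the exposure. A hedged sketch follows; the file names are placeholders and exact flag behavior varies by PyTorch version:

```python
# Safer-by-default loading sketch. weights_only=True makes torch.load use a
# restricted unpickler that rejects arbitrary Python objects; the safetensors
# format avoids Pickle entirely. File names below are placeholders.
import torch

state_dict = torch.load("model.pt", map_location="cpu", weights_only=True)

# If the publisher provides a .safetensors file (requires the safetensors package):
# from safetensors.torch import load_file
# state_dict = load_file("model.safetensors")
```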
“Our conclusion: Pickle files present a security risk when used on a collaborative platform where consuming data from untrusted sources is the basic part of the workflow,” the researchers wrote.