Remote command execution in diffusers

Description

Diffusers has a `trust_remote_code` bypass via `custom_pipeline` and local custom components.

## Background

This vulnerability is in the `DiffusionPipeline.from_pretrained` flow, which is used to load a pipeline from the Hugging Face Hub. The function accepts an optional `custom_pipeline` keyword argument: the name of a Python file in the repo that contains a custom class inheriting from `DiffusionPipeline`. An equivalent flow is triggered when the `_class_name` field in `model_index.json` (the repo config file) is set to a custom class.

Any attempt to use a custom pipeline without `trust_remote_code=True` raises the following exception, requesting that `trust_remote_code` also be passed:

```python
from diffusers import DiffusionPipeline

DiffusionPipeline.from_pretrained(
    pretrained_model_name_or_path='ido-shani/custom-pipeline',
    custom_pipeline="custom"
)
```

```
ValueError: The repository for ido-shani/custom-pipeline contains custom code in custom.py which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/ido-shani/custom-pipeline/blob/main/custom.py.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
```

The vulnerability is a silent RCE: it allows arbitrary code to be loaded through the `custom_pipeline` flow from a Hub repo, with no `custom_pipeline` or `trust_remote_code` kwargs and nothing suspicious in the config. The `from_pretrained` call succeeds and returns a functional pipeline.

## Naive Flow

First, all relevant arguments are popped from `kwargs` and stored in local variables. Given a `pretrained_model_name_or_path` that is a Hub repo ID, `DiffusionPipeline.download()` is called. This function serves two roles: it orchestrates downloading the relevant model files, and it is the security gatekeeper for `trust_remote_code`. It is called even if the model is already cached; in that case it exits early.
If the repo contains custom code, `download()` checks whether `trust_remote_code` was passed and raises otherwise:

```python
# pipeline_utils.py:1645-1652
load_pipe_from_hub = custom_pipeline is not None and f"{custom_pipeline}.py" in filenames
...
if load_pipe_from_hub and not trust_remote_code:
    raise ValueError(...)
```

It then runs `_get_pipeline_class`, which returns the class object of the pipeline so that its `__init__` signature can be inspected to determine which component files need to be downloaded. As part of building the `allow_patterns` list, which restricts the snapshot download to the necessary files, the custom pipeline file is explicitly included if present:

```python
# pipeline_utils.py:1707
allow_patterns += [f"{custom_pipeline}.py"] if f"{custom_pipeline}.py" in filenames else []
```

The function then checks whether all expected files are already present, and either exits early or triggers a snapshot download with those patterns.

The next step in `from_pretrained` is loading the pipeline class a second time, this time to actually instantiate it. Before calling `_get_pipeline_class` again, `_resolve_custom_pipeline_and_cls` is called to translate the `custom_pipeline` name into a local path, since the files have already been downloaded:

```python
# pipeline_loading_utils.py:965-974
def _resolve_custom_pipeline_and_cls(folder, config, custom_pipeline):
    custom_class_name = None
    if os.path.isfile(os.path.join(folder, f"{custom_pipeline}.py")):
        custom_pipeline = os.path.join(folder, f"{custom_pipeline}.py")
    elif isinstance(config["_class_name"], (list, tuple)) and os.path.isfile(
        os.path.join(folder, f"{config['_class_name'][0]}.py")
    ):
        custom_pipeline = os.path.join(folder, f"{config['_class_name'][0]}.py")
        custom_class_name = config["_class_name"][1]

    return custom_pipeline, custom_class_name
```

When `custom_class_name` is `None` (i.e. `custom_pipeline` was given as a kwarg rather than via the config), `_get_pipeline_class` will scan the file and automatically identify the class that subclasses `DiffusionPipeline`. Once the path is resolved, `_get_pipeline_class` is invoked with it, which loads the custom code, retrieves the class object, and proceeds with instantiation.

## The Vulnerability

`_resolve_custom_pipeline_and_cls` receives `custom_pipeline` from the kwargs; when not supplied, it defaults to `None`. That `None` is used in string formatting: `f"{None}.py"` evaluates to `"None.py"`. If the repo contains a file with this name, it is detected as a custom pipeline.

This is only reached on the second invocation of `_get_pipeline_class` (inside `from_pretrained`, after `download()` returns). The `trust_remote_code` check lives entirely in `download()`, which evaluated `custom_pipeline is not None` to `False` and skipped the check. By the time `_resolve_custom_pipeline_and_cls` runs, the check is no longer consulted.

As a bonus, `None.py` even gets downloaded automatically when the model isn't cached yet. This isn't strictly required, since it is quite plausible that the victim has already run `hf download <model>` and has all files locally. If they haven't, the `allow_patterns` line above makes the same error: `f"{None}.py"`, i.e. `"None.py"`, is added to `allow_patterns` and fetched.

What should `None.py` contain? To avoid breaking the pipeline load, it must define a class inheriting from `DiffusionPipeline`. To avoid leaving suspicious clues in the config, that class should shadow one that already exists in diffusers.
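As an aside, the `None` to `None.py` behaviour described in this section can be reproduced without installing diffusers. The sketch below is a minimal, standard-library-only approximation: the three helper functions (`gate_triggers`, `extra_allow_patterns`, `resolve_custom_pipeline`) are hypothetical names that mirror the quoted excerpts in simplified form, and the temporary directory stands in for a downloaded snapshot. It is not the library's code.

```python
import os
import tempfile


def gate_triggers(custom_pipeline, filenames):
    """Simplified mirror of the quoted `load_pipe_from_hub` condition."""
    return custom_pipeline is not None and f"{custom_pipeline}.py" in filenames


def extra_allow_patterns(custom_pipeline, filenames):
    """Simplified mirror of the quoted allow_patterns line (no None guard)."""
    return [f"{custom_pipeline}.py"] if f"{custom_pipeline}.py" in filenames else []


def resolve_custom_pipeline(folder, custom_pipeline):
    """Simplified first branch of the quoted _resolve_custom_pipeline_and_cls."""
    candidate = os.path.join(folder, f"{custom_pipeline}.py")
    return candidate if os.path.isfile(candidate) else custom_pipeline


with tempfile.TemporaryDirectory() as folder:
    # Hypothetical snapshot that contains only the attacker-controlled file.
    open(os.path.join(folder, "None.py"), "w").close()
    filenames = set(os.listdir(folder))

    # The caller passed no custom_pipeline, so the value is None throughout.
    gate = gate_triggers(None, filenames)
    patterns = extra_allow_patterns(None, filenames)
    resolved_name = os.path.basename(resolve_custom_pipeline(folder, None))

print(gate, patterns, resolved_name)  # False ['None.py'] None.py
```

The gate evaluates to `False` because of its explicit `is not None` test, while the two unguarded f-strings both produce `None.py`. That asymmetry is the whole bug: the file is fetched and resolved as a custom pipeline, but the trust check never fires.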
The following `None.py` satisfies both requirements:

```python
from diffusers import FluxPipeline as _FluxPipeline

class FluxPipeline(_FluxPipeline):
    pass

# INSERT MALICIOUS CODE HERE
import pathlib
pathlib.Path("/tmp/pwned").write_text(":)")
```

With this, `model_index.json` can contain `"_class_name": "FluxPipeline"`, appearing to use the standard diffusers class, and the resulting pipeline is fully functional (it is also functional when loaded from a local directory). This has been verified against an extracted version of `DDUF/tiny-flux-dev-pipe-dduf`.

All the attacker needs the victim to run is:

```python
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained('ido-shani/none-py-trust-remote-code-bypass')
```

## PoC

- Upload this zip as a model to the Hub: https://drive.google.com/file/d/1mULARMLJJUTCi57xIv0wtDauko-JW0h7/view?usp=sharing
- Run `DiffusionPipeline.from_pretrained` on the uploaded model's Hub identifier.
- RCE occurred; `/tmp/pwned` was created.

If you are running the exploit on Windows, change the path touched in `None.py`.

# Impact

The vulnerability is a silent RCE: it allows arbitrary code to be loaded through the `custom_pipeline` flow from a Hub repo, with no `custom_pipeline` or `trust_remote_code` kwargs and nothing suspicious in the config. The `from_pretrained` call succeeds and returns a functional pipeline.

# Occurrences

https://github.com/huggingface/diffusers/blob/e1b5db52bda85d47a4f8f75954f77e672a8f7f1c/src/diffusers/pipelines/pipeline_loading_utils.py#L976

# Patches

Fixed in diffusers 0.38.0 via PR #13448. All users on versions < 0.38.0 should upgrade:

```bash
pip install --upgrade "diffusers>=0.38.0"
```

The fix moves the `trust_remote_code` gate out of `DiffusionPipeline.download()` and into `get_cached_module_file` in `src/diffusers/utils/dynamic_modules_utils.py`, which is the actual chokepoint for every dynamic module load (local, Hub, or community mirror).
All three variants now raise `ValueError` when `trust_remote_code=False` instead of executing untrusted code.

# Workarounds

If upgrading immediately is not possible:

- Only call `from_pretrained` with `pretrained_model_name_or_path`, `custom_pipeline`, and local snapshot directories from sources you fully trust and have audited.
- Do not pass `custom_pipeline=` pointing at a Hub repository different from the primary `pretrained_model_name_or_path` unless you have read its `pipeline.py`.
- Before calling `from_pretrained` on a local snapshot, inspect the snapshot for unexpected `*.py` files, especially under component subdirectories (unet/, scheduler/,
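The snapshot inspection suggested in the last workaround can be partly automated. The following is a minimal sketch using only the standard library; `find_python_files` is a hypothetical helper name, and the directory layout built in the demonstration is invented for illustration. It only lists candidate files for manual review and does not judge whether a file is malicious.

```python
import tempfile
from pathlib import Path


def find_python_files(snapshot_dir):
    """Return every *.py file under snapshot_dir as sorted relative POSIX paths."""
    root = Path(snapshot_dir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.py") if p.is_file()
    )


# Demonstration on a throwaway directory with a hypothetical layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "unet").mkdir()
    (root / "model_index.json").write_text("{}")
    (root / "None.py").write_text("# unexpected top-level module\n")
    (root / "unet" / "helper.py").write_text("# unexpected component module\n")
    found = find_python_files(root)

print(found)  # ['None.py', 'unet/helper.py']
```

To use it on real data, pass the path of the local snapshot directory instead of the temporary one. Any result, and a top-level `None.py` in particular, should be read before the snapshot is handed to `from_pretrained`.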

Mitigation

Update Impact

Minimal update. May introduce new vulnerabilities or breaking changes.
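To decide whether the update applies to a given environment, the installed diffusers version can be compared against 0.38.0, the first patched release named in this advisory. The sketch below uses only the standard library; `release_tuple` and `is_affected` are hypothetical helper names, and the comparison looks only at the leading numeric release segment, so a pre-release string such as `0.38.0.dev0` is treated as patched, which is an assumption that may not hold for development builds.

```python
import re
from importlib import metadata

PATCHED = (0, 38, 0)  # first fixed release according to this advisory


def release_tuple(version):
    """Extract the leading numeric release segment, e.g. '0.37.1.dev0' -> (0, 37, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions such as '0.38'
    return tuple(parts[:3])


def is_affected(version):
    """True when the release segment is older than the patched release."""
    return release_tuple(version) < PATCHED


try:
    installed = metadata.version("diffusers")
    print(installed, "affected" if is_affected(installed) else "patched")
except metadata.PackageNotFoundError:
    print("diffusers is not installed in this environment")
```

Running this inside the same virtual environment that loads pipelines reports the version that `from_pretrained` would actually use.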

Ecosystem
Package
Affected version
Patched versions