fix: use SafeLoader for YAML parsing and harden script/import paths#348
Open
georgepwall1991 wants to merge 2 commits into
Open
fix: use SafeLoader for YAML parsing and harden script/import paths#348georgepwall1991 wants to merge 2 commits into
georgepwall1991 wants to merge 2 commits into
Conversation
Posting uses `yaml.CLoader` (falling back to `yaml.Loader`) for loading `.posting.yaml` collection files, and `yaml.FullLoader` for theme files. Both loaders can deserialize arbitrary Python objects via tags like `!!python/object/apply:os.system`, which means opening a malicious collection or theme file triggers arbitrary code execution — no user interaction required beyond loading the directory. Since collections are designed to be shared (YAML files in git repos), this is a realistic attack surface: a poisoned `.posting.yaml` in a cloned collection runs code the moment Posting scans it. Changes: - yaml.py: swap `CLoader`/`Loader` for `CSafeLoader`/`SafeLoader`. SafeLoader only deserializes plain data types (strings, dicts, lists, numbers, booleans) which is all Posting's YAML files use. The custom `str_presenter` was already registered on `SafeRepresenter` so dumping is unaffected. - themes.py: replace `yaml.FullLoader` with `yaml.safe_load` for the same reason — theme files are just color/font mappings. - scripts.py: resolve `collection_root` before the `is_relative_to` check so symlinked collection roots can't bypass the path boundary. Also prefix imported module names with `_posting_script_` to avoid shadowing stdlib/third-party modules while the script's directory is temporarily on `sys.path`, and guard the `sys.path.remove()` in the finally block. - postman.py: sanitize Postman folder names during import to strip path separators and traversal sequences. Uses the sanitized name for both `Collection.path` and `Collection.name` (the latter is what `save_to_disk()` actually uses for directory creation). Deduplicates names that collide after sanitization so distinct folders don't merge. - tests/test_security.py: 20 regression tests covering YAML tag rejection, script path traversal (including symlink escape), Postman folder sanitization with deduplication, and sys.path restoration.
The previous commit sanitized folder names at the Postman import layer, but Collection.save_to_disk() itself still built filesystem paths directly from request.name and child.name. This meant OpenAPI imports (which derive names from operation summaries and URL paths containing slashes) could write files outside the target directory, and any future importer would inherit the same exposure. Moved sanitization into save_to_disk() so it applies regardless of where the collection data came from. Request names and child collection names are now stripped of path separators and traversal sequences before being used as path components. Names that collide after sanitization get a _2, _3 suffix so distinct items don't silently merge. Requests and child collections track used names separately since they occupy different namespaces on disk (files vs directories) — a request called "users" and a sub-collection called "users" coexist as users.posting.yaml and users/ without conflict. Also sanitized the default output directory and env filename in __main__.py. Repeated imports with names that sanitize identically now get unique directories rather than merging into the same one. Added 9 new tests covering save_to_disk traversal blocking, OpenAPI- style slash names, duplicate name deduplication, and the sanitize helper functions directly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
While poking around the codebase I noticed that
yaml.pyimportsCLoader/Loaderrather thanCSafeLoader/SafeLoader. BothCLoaderandLoaderwill happily deserialize arbitrary Python objects via YAML tags like!!python/object/apply:os.system, which means a crafted.posting.yamlfile in a shared collection can execute arbitrary code the moment Posting scans the directory — no user interaction needed beyond opening the collection.Since collections are designed to be shared as YAML files in git repos, this felt worth fixing. The theme loader (
themes.py) had the same issue usingyaml.FullLoader.While I was in there I also noticed a few related path-safety issues in the script executor, the import handlers, and the persistence layer, so I've included fixes for those too.
Changes
src/posting/yaml.pySwapped
CLoader/LoaderforCSafeLoader/SafeLoader. SafeLoader only handles plain data types (strings, dicts, lists, numbers, booleans), which is all Posting's YAML files actually use. The customstr_presenterwas already registered onSafeRepresenter, so dump behaviour is unchanged.src/posting/themes.pyReplaced
yaml.load(theme_file, Loader=yaml.FullLoader)withyaml.safe_load(theme_file)— theme files are just color/font mappings, no need for the unsafe loader.src/posting/scripts.pyTwo things here:
The
is_relative_to()path-boundary check was comparing a resolved script path against an unresolvedcollection_root. If the collection root itself is a symlink,resolve()produces a path that isn't relative to the unresolved root, which means the check could be bypassed. Now both sides are resolved before comparison.During script loading the script's parent directory gets temporarily inserted at
sys.path[0]. A co-located file likejson.pyorhttpx.pycould shadow stdlib/third-party imports for the rest of the process while it's on the path. The module name is now prefixed with_posting_script_to reduce collision risk, and thesys.path.remove()in thefinallyblock is guarded againstValueError.src/posting/importing/postman.pyPostman folder names (
item.name) were used directly in path construction with no sanitization, so a folder named../../etccould write files outside the target directory during import. Folder names are now stripped of everything except\w, spaces, hyphens, and dots — and names that collide after sanitization get a_2,_3suffix so distinct folders don't silently merge. The sanitized name is used for bothCollection.pathandCollection.name(the latter is whatsave_to_disk()actually uses for directory creation).src/posting/collection.pyThis is the core of the second commit.
Collection.save_to_disk()was building filesystem paths directly fromrequest.nameandchild.namewith no sanitization, which meant any importer that passed through unsanitized names (OpenAPI operation summaries contain slashes, for example) could write files outside the intended directory.Rather than patching each importer individually, I centralised the defence here:
_sanitize_path_component()— strips a string down to a safe single-path component (removes separators, traversal sequences, problematic characters), returns"unnamed"when nothing is left._unique_safe_name()— wraps the sanitizer and appends_2,_3suffixes to avoid collisions.save_to_disk()now runs every request name and child name through these helpers before using them as path components. Request names and child names are tracked in separate used-name sets since they don't actually collide on disk (users.posting.yamlandusers/coexist fine).src/posting/__main__.pyThe default output directory for
posting importwas built fromcollection.namewithout sanitization, so a malicious collection name could escape the default collection directory. Now uses_unique_output_dir()which sanitizes the name and picks a non-existing directory to avoid silently merging with a previous import. The.envfilename for Postman imports is also sanitized.tests/test_security.py(new)29 regression tests covering:
!!python/object/apply,!!python/object,!!python/module) and safe data loadingCollection.save_to_disksanitization (request name traversal, child name traversal, OpenAPI-style slash names, duplicate name deduplication)_sanitize_path_componentand_unique_safe_nameunit testssys.pathrestoration after both successful and failing script executionTesting
All existing tests continue to pass (194 existing + 29 new = 223 total, 4 skipped). The SafeLoader change doesn't affect any existing fixture files since they only contain plain YAML data structures.