fix: load voice presets with a restricted unpickler (CWE-502)#397

Open: officialasishkumar wants to merge 1 commit into microsoft:main from officialasishkumar:fix/voice-preset-weights-only-load

Summary

  • The voice-preset loaders in demo/web/app.py and demo/realtime_model_inference_from_file.py crash on startup, making the streaming web demo and the realtime file demo unusable on PyTorch >= 2.6 (issue #392, "App/Demo exception since fix: use weights_only=True (CWE-502)").
  • Load the presets through a restricted pickle.Unpickler that keeps the CWE-502 protection while actually being able to reconstruct the objects.

Problem

Commit 303b283 changed both loaders to torch.load(..., weights_only=True) wrapped in torch.serialization.safe_globals([BaseModelOutputWithPast, DynamicCache]) to close a CWE-502 risk (arbitrary code execution from a tampered .pt).

The presets are dicts of transformers.modeling_outputs.BaseModelOutputWithPast objects (each holding a last_hidden_state tensor and a DynamicCache). PyTorch's weights-only unpickler refuses the SETITEMS opcode on dict subclasses — it accepts only the exact dict, OrderedDict and Counter types, even when the subclass is allowlisted via safe_globals. So the load can never succeed and aborts on startup with:

_pickle.UnpicklingError: Weights only load failed. ...
Can only SETITEMS for dict, collections.OrderedDict, collections.Counter,
but got <class 'transformers.modeling_outputs.BaseModelOutputWithPast'>

BaseModelOutputWithPast has to stay an object — the model accesses outputs.past_key_values and outputs.last_hidden_state — so the presets cannot simply be flattened to plain tensors.
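
As an illustration only (not part of this change), the opcode named in the error can be observed with the standard library alone. The sketch below uses a plain `collections.OrderedDict` as a stand-in for a preset entry, on the assumption that `BaseModelOutputWithPast` pickles the same way because transformers' `ModelOutput` derives from `OrderedDict`. The field values are placeholders, not real tensors.

```python
import pickle
import pickletools
from collections import OrderedDict

# Stand-in for one preset entry. OrderedDict is itself a dict subclass,
# so the pickler rebuilds it via a reduce call followed by item assignment.
entry = OrderedDict(last_hidden_state=[0.1, 0.2], past_key_values=None)
payload = pickle.dumps(entry)

# Collect the opcode names the pickler emitted for this object.
opcodes = [op.name for op, _arg, _pos in pickletools.genops(payload)]
print(opcodes)

# With more than one item, the items are written as one batched SETITEMS.
uses_setitems = "SETITEMS" in opcodes
```

The standard unpickler applies `SETITEMS` to whatever object is on the stack, which is why an ordinary `pickle.loads(payload)` round-trips the entry; the failure described above comes from the extra type check in PyTorch's weights-only unpickler.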

Fix

Add a shared load_voice_preset helper in vibevoice/processor/vibevoice_streaming_processor.py (imported by both demos) that loads via a restricted pickle.Unpickler:

  • find_class resolves only the container classes the presets are built from (OrderedDict, BaseModelOutputWithPast, DynamicCache) and torch's tensor-rebuilding primitives (torch._utils._rebuild_*, torch.*Storage). Any other global, e.g. os.system, raises UnpicklingError, so a tampered preset still cannot execute arbitrary code — the CWE-502 protection is preserved.
  • The same restriction is applied to the module-level load / loads that torch.load uses for legacy-format metadata, so both the zip and legacy serialization formats stay safe.

Because the restricted unpickler uses standard pickle semantics, SETITEMS on BaseModelOutputWithPast works and the objects load correctly. The now-unused BaseModelOutputWithPast / DynamicCache imports are removed from both demos.
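
The allowlisting approach described above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `ALLOWED_GLOBALS`, `RestrictedUnpickler` and `restricted_pickle` are hypothetical names, and the allowlist here admits only `OrderedDict`, whereas the real helper also admits `BaseModelOutputWithPast`, `DynamicCache` and torch's rebuild primitives.

```python
import io
import pickle
import types
from collections import OrderedDict

# Hypothetical allowlist for illustration; deliberately tiny.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Resolve only allowlisted globals; refuse everything else."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def load(file, **kwargs):
    # Module-level counterpart of pickle.load using the restricted class.
    return RestrictedUnpickler(file, **kwargs).load()


def loads(data, **kwargs):
    # Module-level counterpart of pickle.loads using the restricted class.
    return load(io.BytesIO(data), **kwargs)


# Bundle the three entry points into a module-like namespace.
restricted_pickle = types.SimpleNamespace(
    __name__="restricted_pickle",
    Unpickler=RestrictedUnpickler,
    load=load,
    loads=loads,
)

# Hand-written protocol-0 pickle that asks for os.system and calls it.
MALICIOUS = b"cos\nsystem\n(S'echo pwned'\ntR."

safe = restricted_pickle.loads(pickle.dumps(OrderedDict(a=1, b=2)))

try:
    restricted_pickle.loads(MALICIOUS)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

`torch.load` accepts a `pickle_module` argument, so a namespace of this shape could in principle be passed to it; whether `load_voice_preset` is wired exactly that way is an assumption here.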

Test plan

Validated on PyTorch 2.12 + transformers 4.51.3:

  • Reproduced issue #392 ("App/Demo exception since fix: use weights_only=True (CWE-502)"): the bundled presets fail to load under weights_only=True + safe_globals with the exact SETITEMS error.
  • All 25 bundled demo/voices/streaming_model/*.pt files load via load_voice_preset, with tensors identical (torch.equal on every last_hidden_state and every cache key/value tensor) to a trusted full unpickle.
  • Security: a .pt whose payload calls os.system is rejected with UnpicklingError in both the zip and legacy formats, and the side effect never runs.
  • Downstream attribute access (outputs.past_key_values, outputs.last_hidden_state) and item access (outputs["last_hidden_state"]) keep working.
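
The security check in the list above could look roughly like the following stdlib-only sketch. It is an assumption-laden stand-in for the real test: `RejectAll`, `Exploit` and `side_effect_blocked` are hypothetical names, the unpickler has an empty allowlist, and the payload calls `os.mkdir` on a temporary marker path instead of `os.system` so the side effect is harmless and easy to observe.

```python
import io
import os
import pickle
import tempfile


class RejectAll(pickle.Unpickler):
    """Unpickler with an empty allowlist: every global lookup is refused."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


class Exploit:
    """Under a plain pickle.loads, unpickling would call os.mkdir(marker)."""

    def __init__(self, marker):
        self.marker = marker

    def __reduce__(self):
        return (os.mkdir, (self.marker,))


def side_effect_blocked():
    """Return True if the payload is rejected and the marker never appears."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "pwned")
        payload = pickle.dumps(Exploit(marker))
        try:
            RejectAll(io.BytesIO(payload)).load()
        except pickle.UnpicklingError:
            rejected = True
        else:
            rejected = False
        return rejected and not os.path.exists(marker)


result = side_effect_blocked()
```

Checking for the marker, and not only for the exception, is what distinguishes "the load failed" from "the load failed before the payload ran".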

Closes #392

Commit 303b283 switched the voice-preset loaders in demo/web/app.py and
demo/realtime_model_inference_from_file.py to
`torch.load(..., weights_only=True)` guarded by
`torch.serialization.safe_globals([BaseModelOutputWithPast, DynamicCache])`
to close a CWE-502 arbitrary-code-execution risk.

That load can never succeed. The presets are dicts of
`transformers.modeling_outputs.BaseModelOutputWithPast` objects, and
PyTorch's weights-only unpickler refuses the `SETITEMS` opcode on `dict`
subclasses (it accepts only the exact `dict`, `OrderedDict` and `Counter`
types) even when the class is allowlisted via `safe_globals`. Loading
therefore aborts on startup with:

    _pickle.UnpicklingError: Weights only load failed. ...
    Can only SETITEMS for dict, collections.OrderedDict, collections.Counter,
    but got <class 'transformers.modeling_outputs.BaseModelOutputWithPast'>

making the streaming web demo and the realtime file demo unusable on
PyTorch >= 2.6.

`BaseModelOutputWithPast` has to stay an object (the model accesses
`outputs.past_key_values` and `outputs.last_hidden_state`), so the presets
cannot simply be flattened to plain tensors. Instead, load them through a
restricted `pickle.Unpickler` whose `find_class` resolves only the
container classes the presets are built from (`OrderedDict`,
`BaseModelOutputWithPast`, `DynamicCache`) plus torch's tensor-rebuilding
primitives. Any other global, e.g. `os.system`, is refused, so a tampered
preset still cannot execute arbitrary code and the CWE-502 protection is
preserved. The same restriction is applied to the module-level `load` /
`loads` that `torch.load` uses for legacy-format metadata, so both
serialization formats stay safe.

The shared `load_voice_preset` helper lives in the streaming processor
module imported by both demos; the now-unused `BaseModelOutputWithPast`
and `DynamicCache` imports are removed.