Bug
get_param() at transformer_tricks.py:57-62 downloads safetensors files into the hard-coded relative path ./get_param_tmp/ and never cleans it up. When flashify_repo is called more than once from the same working directory, safetensors files from previous runs accumulate in that directory, and the subsequent glob('./get_param_tmp/*.safetensors') + load_file() loop merges all of them into a single param dict. The resulting flashified checkpoint contains a mix of tensors from multiple source models.
Repro
import transformer_tricks as tt
tt.flashify_repo('Qwen/Qwen3-1.7B', dir='a', strict=True) # leaves Qwen shards in ./get_param_tmp
tt.flashify_repo('meta-llama/Llama-3.2-1B', dir='b', strict=True) # merges Qwen shards + Llama weights -> broken checkpoint in b/
The second call produces a b/model.safetensors whose keys/shapes don't match either model consistently (confirmed by inspecting with safetensors.safe_open).
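For anyone who wants to check a checkpoint without loading any tensors, here is a minimal stdlib-only sketch that reads the safetensors header directly. It is a swapped-in alternative to the safetensors.safe_open inspection mentioned above, not what was used for this report, and the function name read_safetensors_header is hypothetical. It relies on the safetensors file layout: an 8-byte little-endian header length followed by a JSON header.

```python
import json
import struct


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} from a .safetensors file header.

    Hypothetical helper for illustration; it reads only the header,
    never the tensor data.
    """
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
    }
```

Calling read_safetensors_header('b/model.safetensors') and comparing the returned names and shapes against the source model's own shards should make foreign tensors stand out.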
Impact
Silent: no error is raised; flashify_repo reports success and uploads a corrupted checkpoint. Hit during a batch upload of 6 FlashNorm variants; two of the uploads (open-machine/Llama-3.2-1B-FlashNorm, open-machine/Gemma-3-1B-FlashNorm) had to be re-run after catching the contamination via safetensors inspection.
Suggested fix
Use a unique tempdir per call (and clean up on exit) in get_param():
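import tempfile

# A fresh directory per call; it is deleted when the with-block exits.
with tempfile.TemporaryDirectory(prefix='flashify_') as tmp_dir:
    snapshot_download(repo_id=repo, allow_patterns='*.safetensors',
                      local_dir=tmp_dir)
    files = glob.glob(tmp_dir + '/*.safetensors')
    # Load the tensors inside the with-block, before the files are removed.
    ...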
Affects only get_param(); no API changes.
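To illustrate why a per-call tempdir removes the contamination, here is a small stdlib-only simulation. It does not touch transformer_tricks or the Hub: fake_download is a hypothetical stand-in for snapshot_download that just writes one placeholder file per repo, and shards_shared / shards_isolated are made-up names for the current and suggested directory handling.

```python
import glob
import os
import tempfile


def fake_download(repo, local_dir):
    """Hypothetical stand-in for snapshot_download: writes one placeholder shard."""
    os.makedirs(local_dir, exist_ok=True)
    name = repo.replace("/", "--") + ".safetensors"
    with open(os.path.join(local_dir, name), "w") as f:
        f.write(repo)


def shards_shared(repo, shared_dir):
    """Current behaviour: every call downloads into the same directory."""
    fake_download(repo, shared_dir)
    pattern = os.path.join(shared_dir, "*.safetensors")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def shards_isolated(repo):
    """Suggested behaviour: a fresh directory per call, removed on exit."""
    with tempfile.TemporaryDirectory(prefix="flashify_") as tmp_dir:
        fake_download(repo, tmp_dir)
        pattern = os.path.join(tmp_dir, "*.safetensors")
        return sorted(os.path.basename(p) for p in glob.glob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        shared = os.path.join(workdir, "get_param_tmp")
        shards_shared("org/model-a", shared)
        # The second call sees shards from both repos.
        print(shards_shared("org/model-b", shared))
    shards_isolated("org/model-a")
    # The second call sees only its own shard.
    print(shards_isolated("org/model-b"))
```

The same pattern could serve as a regression test for get_param(): call it twice with different repos and assert that the second result contains no keys from the first.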