Skip to content

Add CD HF inference asset verification#120

Draft
jmccaffrey-nv wants to merge 5 commits into
mainfrom
dev/jmccaffrey/hf-access-test
Draft

Add CD HF inference asset verification#120
jmccaffrey-nv wants to merge 5 commits into
mainfrom
dev/jmccaffrey/hf-access-test

Conversation

@jmccaffrey-nv

Copy link
Copy Markdown
Collaborator

Summary

  • add a CD-only Hugging Face asset verification for README inference instructions
  • run the verification through the prebuilt Docker image with persistent caches
  • authenticate to Hugging Face and verify Alpadreams + Wan2.1 I2V HF assets without adding the network-dependent check to regular CI

Testing

  • python3 -m py_compile flashdreams/tests/test_readme_hf_inference_assets.py
  • python3 flashdreams/tests/test_readme_hf_inference_assets.py (skips unless FLASHDREAMS_HF_ACCESS_TEST=1)
  • git diff --cached --check

Note: local ruff was not run because this machine has uv 0.11.7 while the repo requires >=0.11.8.

@copy-pr-bot

copy-pr-bot Bot commented May 20, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@liruilong940607

Copy link
Copy Markdown
Collaborator

/ok to test 17ed805

The optional-import fallback used mypy/pyright's `# type: ignore[assignment]`
syntax, which ty does not recognize. Rename the import binding to `_pytest`
and use the ty-native `# ty:ignore[invalid-assignment]` directive.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@jmccaffrey-nv jmccaffrey-nv marked this pull request as draft May 21, 2026 07:18
@jmccaffrey-nv

Copy link
Copy Markdown
Collaborator Author

/ok to test 511e714

Replace the docker-based test driver with a standalone check script and
wire it into both CI and CD via a uniform in-venv invocation.

- Extract the previously-embedded "_inner_script" heredoc into
  `flashdreams/tests/check_hf_inference_assets.py`. The script resolves
  every HF asset referenced by the README's headline recipes
  (`omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf` and
  `wan21-i2v-14b-480p`) and downloads one small file per unique repo, so
  gated-access and token-scope regressions surface as failed downloads
  rather than visibility-only false negatives.

- Delete the old docker driver. The prebuilt base image that drove its
  default never ships publicly; only the Dockerfile recipe on top of a
  CUDA base will. Pinning that internal image in source-of-truth code
  was misleading.

- CI: add a single step in the existing cpu job that runs the script
  when HF_TOKEN is available; skip with a warning otherwise so external
  fork PRs do not block.

- CD: replace the docker-run job with the same script, executed inside
  the public `nvidia/cuda` container. Drops four FLASHDREAMS_* cache-dir
  env vars and the FLASHDREAMS_HF_ACCESS_TEST enable flag (only HF_TOKEN
  remains).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@greptile-apps

greptile-apps Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a Hugging Face asset reachability check that runs authenticated hf_hub_download calls against the two README headline recipes (omnidreams-sv-2steps-chunk2-loc6-lightvae-lighttae-perf and wan21-i2v-14b-480p), surfacing gated-repo or token-scope regressions before they reach users.

  • cd.yml (new): Dedicated CD job that enforces the HF token and runs the check unconditionally on every push to main and on tags; the token validation step is currently placed after all dependency installation, meaning a missing secret only fails the job after several minutes of setup work.
  • ci.yml (modified): Adds the same check as an optional step in the existing CPU job, gracefully skipping with a warning when HF_TOKEN is absent (e.g. external fork PRs) — token correctly scoped to the step only.
  • check_hf_inference_assets.py (new): Walks pipeline component attrs to collect HF URLs/bare repo IDs, deduplicates by repo_id@revision, and downloads a small probe file per repo; the URI-scheme guard that routes bare HF IDs is missing hf:// and gs:// prefixes.

Confidence Score: 3/5

Safe to merge after the token-check ordering in cd.yml is fixed; a missing HF_TOKEN secret currently wastes minutes of setup before the job fails.

The CD workflow validates the HF token only after running apt-get and a full uv sync, so any CD environment that lacks the secret will burn several minutes of runner time before erroring out. The verification script also has a URI-scheme gap that could surface as a confusing HF API error if future pipeline configs introduce hf:// or gs:// paths. The ci.yml change is clean and handles the no-token case correctly.

.github/workflows/cd.yml warrants a second look for step ordering and secret scoping; flashdreams/tests/check_hf_inference_assets.py is otherwise solid but the URI heuristic and AssertionError use are worth addressing.

Important Files Changed

Filename Overview
.github/workflows/cd.yml New CD workflow for HF asset verification; HF_TOKEN validation is ordered after expensive dependency installation (fail-fast concern), and the token is unnecessarily exposed at job level rather than scoped to the verification step.
.github/workflows/ci.yml Adds optional HF inference asset check to the CPU job; gracefully skips when HF_TOKEN is absent, token is correctly scoped to the step only.
flashdreams/tests/check_hf_inference_assets.py New script that walks pipeline component attrs, resolves HF URLs/repo IDs, and downloads small probe files to verify authenticated access; non-HF URI scheme guard is incomplete (misses hf:// and gs://) and the failure path raises AssertionError instead of RuntimeError.

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant CD as cd.yml job
    participant CI as ci.yml cpu job
    participant Script as check_hf_inference_assets.py
    participant HF as Hugging Face Hub

    GH->>CD: push to main / tag / workflow_dispatch
    CD->>CD: apt-get + uv sync (expensive setup)
    CD->>CD: Verify HF_TOKEN set (currently last)
    CD->>Script: uv run python check_hf_inference_assets.py
    Script->>Script: Load OMNIDREAMS_RUNNERS + RUNNER_WAN21_I2V_14B_480P
    Script->>Script: _pipeline_urls() — collect component attrs
    loop per (repo_id, revision) not yet seen
        Script->>HF: "hf_hub_download(probe file, token=HF_TOKEN)"
        HF-->>Script: local cache path
    end
    Script-->>CD: exit 0 (pass) or exit 1 (fail)

    GH->>CI: push / merge_group
    CI->>CI: lint + CPU unit tests
    alt HF_TOKEN present
        CI->>Script: uv run python check_hf_inference_assets.py
        Script->>HF: authenticated probe downloads
        HF-->>Script: ok
        Script-->>CI: exit 0
    else HF_TOKEN absent (fork PR)
        CI->>CI: warning: skipping — exit 0
    end
Loading

Reviews (1): Last reviewed commit: "Refactor HF inference asset verification" | Re-trigger Greptile

Comment thread .github/workflows/cd.yml
Comment on lines +40 to +55
- name: Install workspace dependencies
env:
BLOCK_SPARSE_ATTN_SKIP_CUDA_BUILD: "TRUE"
run: |
uv venv --clear
uv sync --extra runners --no-install-package transformer-engine-torch

- name: Verify HF token is configured
run: |
test -n "${HF_TOKEN}" || {
echo "::error::HF_TOKEN secret is required for CD HF asset verification"
exit 1
}

- name: Verify HF inference assets
run: uv run --no-sync python flashdreams/tests/check_hf_inference_assets.py

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The "Verify HF token is configured" step sits after all the expensive dependency-installation steps. If HF_TOKEN is missing from the CD secrets, the job will fail only after spending minutes on apt-get, uv venv, and uv sync. Moving the check to the first step (or at least before "Install workspace dependencies") fails fast and avoids wasting runner time.

Suggested change
- name: Install workspace dependencies
env:
BLOCK_SPARSE_ATTN_SKIP_CUDA_BUILD: "TRUE"
run: |
uv venv --clear
uv sync --extra runners --no-install-package transformer-engine-torch
- name: Verify HF token is configured
run: |
test -n "${HF_TOKEN}" || {
echo "::error::HF_TOKEN secret is required for CD HF asset verification"
exit 1
}
- name: Verify HF inference assets
run: uv run --no-sync python flashdreams/tests/check_hf_inference_assets.py
- name: Verify HF token is configured
run: |
test -n "${HF_TOKEN}" || {
echo "::error::HF_TOKEN secret is required for CD HF asset verification"
exit 1
}
- name: Install workspace dependencies
env:
BLOCK_SPARSE_ATTN_SKIP_CUDA_BUILD: "TRUE"
run: |
uv venv --clear
uv sync --extra runners --no-install-package transformer-engine-torch
- name: Verify HF inference assets
run: uv run --no-sync python flashdreams/tests/check_hf_inference_assets.py

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment thread .github/workflows/cd.yml
Comment on lines +19 to +22
env:
UV_PROJECT_ENVIRONMENT: /tmp/flashdreams-venv
UV_LINK_MODE: copy
HF_TOKEN: ${{ secrets.HF_TOKEN }}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The HF_TOKEN is set at job-level env, so it's injected into every step — including Checkout, Setup uv, and apt-get — though it's only needed by the final verification step. Following the pattern in ci.yml (step-scoped env) reduces the blast radius if any earlier step unexpectedly logs its environment.

Suggested change
env:
UV_PROJECT_ENVIRONMENT: /tmp/flashdreams-venv
UV_LINK_MODE: copy
HF_TOKEN: ${{ secrets.HF_TOKEN }}
env:
UV_PROJECT_ENVIRONMENT: /tmp/flashdreams-venv
UV_LINK_MODE: copy

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

)
else:
path = _probe(repo_id, revision, token=token)
elif "/" in url_or_id and not url_or_id.startswith(("http", "s3", "/")):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The heuristic not url_or_id.startswith(("http", "s3", "/")) doesn't guard against hf:// or gs:// URIs. If any component attr ever contains an hf://org/model/… or gs://bucket/… path, the elif branch fires, the full URI is passed as repo_id to _probe(), and the HF API returns a confusing 404/repo-not-found error instead of the expected "skipped non-HF path" message.

Suggested change
elif "/" in url_or_id and not url_or_id.startswith(("http", "s3", "/")):
elif "/" in url_or_id and not url_or_id.startswith(("http", "s3", "gs", "hf", "/")):

Comment on lines +86 to +89
raise AssertionError(
f"no probe file ({', '.join(_PROBE_FILES)}) downloadable "
f"from {repo_id}@{revision}: {last}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 AssertionError is semantically reserved for programmer-error assertions and is silenced when Python runs with the -O flag. An expected failure path (none of the probe files exist in a valid HF repo) is better represented as a RuntimeError so it can't be accidentally suppressed.

Suggested change
raise AssertionError(
f"no probe file ({', '.join(_PROBE_FILES)}) downloadable "
f"from {repo_id}@{revision}: {last}"
)
raise RuntimeError(
f"no probe file ({', '.join(_PROBE_FILES)}) downloadable "
f"from {repo_id}@{revision}: {last}"
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants