
EXP-22322: Support GH_USER/GH_TOKEN env vars to clone GitHub subrepos over HTTPS #91

Open
Piranja wants to merge 1 commit into main from task/EXP-22322-PAT-token

Piranja commented May 29, 2026

Ticket link

EXP-22322

PR description

  • What is the context for this PR?

In CI we use a GitHub PAT that only works over HTTPS, but .s7substate
lists every subrepo with an SSH URL (git@github.com:readdle/...)
because the same file is used locally with developers' personal SSH
keys. We do not want to rewrite .s7substate.

This PR adds an auto-activated auth mode: when both GH_USER and
GH_TOKEN are set in the environment, s7 transparently uses HTTPS+PAT
for every git network operation, while leaving .s7substate and the
cloned subrepos' .git/config untouched. Works recursively through
nested s7 subrepos (e.g., RDPDFKit → TesseractOCR) since they inherit
the env vars.

  • Why did you take the approach you did?

Implementation uses git's own url.<base>.insteadOf config injected
via -c flags on every git invocation. Concretely, s7 prepends:

-c url.https://USER:TOKEN@github.com/.insteadOf=git@github.com:
-c url.https://USER:TOKEN@github.com/.insteadOf=ssh://git@github.com/

at the single chokepoint
+[GitRepository runGitWithArguments:stdOutOutput:stdErrOutput:currentDirectoryPath:]
in Git.m. Both SSH URL shapes (git@github.com:… and
ssh://git@github.com/…) are covered. Git applies the rewrite only at
network-resolution time — the URL stored in .git/config after clone
stays as the original SSH form, so the token never lands on disk and
existing URL-comparison logic (S7PostCheckoutHook.m:checkSubrepoUrlChanged)
is unaffected.
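
For reviewers who want to see the shape of the injected arguments, here is a minimal Python sketch of the argv construction. It is an illustration only, not the Objective-C code in Git.m: the function names `github_insteadof_args` and `git_argv`, the `ci-bot` / `s3cr3t` credentials, and the `example-org/example.git` repository are all hypothetical, and the user and token are assumed to be percent-encoded already.

```python
def github_insteadof_args(user: str, token: str) -> list[str]:
    """Build the two `-c url.<base>.insteadOf=<prefix>` overrides.

    Assumption: `user` and `token` are already percent-encoded.
    """
    base = f"https://{user}:{token}@github.com/"
    return [
        # scp-like SSH shape: git@github.com:owner/repo.git
        "-c", f"url.{base}.insteadOf=git@github.com:",
        # explicit-scheme SSH shape: ssh://git@github.com/owner/repo.git
        "-c", f"url.{base}.insteadOf=ssh://git@github.com/",
    ]


def git_argv(args: list[str], user: str, token: str) -> list[str]:
    # `-c` is a top-level git option, so the overrides go before the
    # subcommand (clone, fetch, push, pull).
    return ["git", *github_insteadof_args(user, token), *args]


if __name__ == "__main__":
    # Hypothetical repository and credentials, for illustration only.
    for part in git_argv(
        ["clone", "git@github.com:example-org/example.git"], "ci-bot", "s3cr3t"
    ):
        print(part)
```

Because the overrides are passed per invocation rather than written with `git config`, nothing in this sketch touches the repository's stored configuration, which matches the behavior described above.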

Why centralize at the chokepoint rather than rewrite in .s7substate
or at the clone call sites:

  • .s7substate must keep SSH URLs for local SSH-key users.
  • Every git command (clone/fetch/push/pull) funnels through
    +runGitWithArguments:..., so one injection covers everything.
  • +executeCommand: (used only for git init and local
    git merge --no-edit) doesn't touch remote URLs, so it doesn't need
    the rewrite.

Other design choices worth flagging:

  • Activation gate is encapsulated in +envGitHubTokenAuthEnabled,
    currently auto-on when both env vars are set. Swapping to require an
    explicit S7_USE_GH_TOKEN=1 opt-in is a one-line change there.
  • Host scope is github.com only — matches every URL in our
    .s7substate (and the nested RDPDFKit .s7substate) and the GH_*
    naming.
  • GH_USER and GH_TOKEN are percent-encoded as URL userinfo, so
    values containing /, @, :, %, etc. don't break URL parsing.
  • Trace masking: when S7_TRACE_GIT=1, the token is replaced with
    *** in the s7-printed command line. A one-line startup banner
    s7: GitHub token auth: enabled=YES|NO makes it easy to confirm the
    gate from CI logs.
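
The gate, the userinfo encoding, and the trace masking from the list above can be sketched in Python as follows. This is a hedged illustration rather than the code in this PR: the helper names (`token_auth_enabled`, `encode_userinfo`, `mask_token`, `banner`) are hypothetical, and treating an empty variable as unset is an assumption the description does not confirm.

```python
import os
from urllib.parse import quote


def token_auth_enabled(env=os.environ) -> bool:
    # Gate: on only when both variables are present.
    # Assumption: an empty value counts as "not set".
    return bool(env.get("GH_USER")) and bool(env.get("GH_TOKEN"))


def encode_userinfo(value: str) -> str:
    # safe="" makes quote() escape "/", "@" and ":" as well;
    # "%" is always escaped.
    return quote(value, safe="")


def mask_token(command_line: str, encoded_token: str) -> str:
    # Replace the token as it appears in the command line (that is, in its
    # percent-encoded form) before the line is printed.
    if not encoded_token:
        return command_line
    return command_line.replace(encoded_token, "***")


def banner(enabled: bool) -> str:
    return f"s7: GitHub token auth: enabled={'YES' if enabled else 'NO'}"


if __name__ == "__main__":
    token = encode_userinfo("ab/c@d:e%f")  # hypothetical token value
    line = f"git -c url.https://ci-bot:{token}@github.com/.insteadOf=git@github.com: fetch"
    print(banner(token_auth_enabled()))
    print(mask_token(line, token))
```

Masking works on the encoded form because that is the string that actually appears in the argv; masking the raw environment value would miss any token containing an escaped character.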

Tests:

  • New unit-test file system7-tests/gitGitHubTokenAuthTests.m (12 cases)
    covering the argv builder, percent-encoding of special chars in both
    user and token, and trace-line masking.
  • New integration case
    system7-tests/integration/case-cloneViaGitHubTokenAuth.sh
    exercises the env-unset / env-set paths end-to-end via
    S7_TRACE_GIT=1, verifies the token never appears in the cloned
    subrepo's .git/, and confirms non-github URLs are untouched.
  • All existing unit tests pass. All integration cases pass except the
    pre-existing case-cloneS7andLFSmixedRepo.sh, which fails on the
    same git-lfs hook-template drift seen before this branch (unrelated
    to PAT auth).
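
As a rough idea of the "token never appears in .git/" check, a byte-level scan could look like the Python sketch below. The actual integration case is a shell script whose contents are not shown here; `files_containing`, the temporary directory layout, and the sample token are hypothetical. The scan reads raw file bytes only, so it would not see text inside compressed git objects.

```python
import os
import tempfile


def files_containing(root: str, secret: str) -> list[str]:
    """Return the sorted paths under `root` whose raw bytes contain `secret`."""
    needle = secret.encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                if needle in fh.read():
                    hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as repo:
        git_dir = os.path.join(repo, ".git")
        os.makedirs(git_dir)
        # Simulate a config that kept the original SSH URL after clone.
        with open(os.path.join(git_dir, "config"), "w") as fh:
            fh.write('[remote "origin"]\n\turl = git@github.com:example-org/example.git\n')
        leaks = files_containing(git_dir, "s3cr3t")  # hypothetical token
        print("leaked into:", leaks)
```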

Piranja requested a review from lyeskin May 29, 2026 14:12