feat: buildvars registry + extension hooks + FIPS-awareness + test-infra resilience#9690
Closed
feat: buildvars registry + extension hooks + FIPS-awareness + test-infra resilience#9690
Conversation
Add a top-level buildvars/ package that centralizes the environment
variables controlling build-time configuration (BIN, DOCKER_IMAGE,
BUILD_IMAGE, BUILD_TAG, RUNTIME_IMAGE, RUNTIME_TAG, BUILD_TAGS, etc.).
Previously these values were scattered: hardcoded string literals in
source files (e.g. '/gobin/dgraph', 'dgraph/dgraph:local'), ad-hoc
os.Getenv calls, or re-computed in multiple Makefiles and Go files.
Consolidating them in a single registry has three benefits:
1. One place to discover what every build-time var does — each Var
carries godoc explaining its purpose, default, and downstream
consumers.
2. Overrides are centralized. A Var's default can be overridden by
environment variable (os.Getenv wins over registered default) or
programmatically via SetDefault. Call sites just read via Get().
3. Downstream consumers (vendoring or private forks) can swap defaults
without patching every call site.
API surface:
- Var struct with Get() / Default() / SetDefault(s) / Export() methods
- Package-level All slice listing every registered Var in declaration
order
- ExportAll() to push all Vars into os.Environ for subprocesses
Includes a small CLI at buildvars/cmd/buildvarprint that emits the
registry in shell / make / plain formats, useful for
eval "$(go run ./buildvars/cmd/buildvarprint)"
or
go run ./buildvars/cmd/buildvarprint -format=plain >> "$GITHUB_ENV"
in CI.
No call sites converted yet — subsequent commits migrate the existing
hardcoded literals and os.Getenv users.
…istry lookups
Migrate the call sites that previously hardcoded '/gobin/dgraph',
'dgraph/dgraph:local', or '$GOPATH/bin/dgraph' to read from the
buildvars registry introduced in the preceding commit. Benefits:
- One place to change the binary name / image repo (override via
os.Getenv(BIN) or buildvars.Bin.SetDefault)
- Call sites become self-documenting — buildvars.Bin.Get() reads more
clearly than a raw string literal at an arbitrary line number
- Downstream consumers (vendoring or forking) can rename the binary
without patching every caller
Go call sites migrated:
- contrib/jepsen/main.go: --local-binary default uses
buildvars.GoBinDgraphPath
- dgraphtest/dgraph.go: zero/alpha container cmd args use
buildvars.GoBinDgraphPath
- dgraphtest/image.go: dgraphImage() and copyBinary() use
buildvars.DockerImage and buildvars.Bin
- testutil/exec.go: DgraphBinaryPath() uses buildvars.Bin
Shell scripts migrated:
- systest/1million/test-reindex.sh
- systest/21million/test-21million.sh
- systest/loader-benchmark/loader-benchmark.sh
All three pass the container binary path as /gobin/${BIN:-dgraph},
matching the default buildvars.Bin produces; the :-dgraph fallback
keeps the scripts working for upstream callers who have no BIN set.
Incidental fix in dgraphtest/dgraph.go: getPortMappingsOnMac previously
panicked on a port entry like '8080/tcp' (no colon because the port was
not published to the host). Guard the split and continue past unpublished
ports rather than crash. Caught during buildvars migration testing.
…o build -o
Three related Makefile improvements that surface when trying to install
a dgraph build with a non-default binary name or on a non-default
Linux distro:
1. dgraph/Makefile: 'BIN = dgraph' → 'BIN ?= dgraph'
Allows callers (CI, downstream forks) to override the binary name
without editing the Makefile. The underlying 'go install' already
respects GOBIN + binary renames; only the Make variable needed
the '?=' to let env overrides win.
2. dgraph/Makefile: replace 'go install' with 'go build -o $(INSTALL_TARGET)'
'go install' honors the package's default name (always 'dgraph'),
which defeats a renamed $(BIN). 'go build -o' respects whatever
path INSTALL_TARGET resolves to, so renames produce the correct
output file.
3. dgraph/Makefile: broaden HAS_JEMALLOC detection to /usr/lib and
the _pic.a variant. /usr/local/lib covers 'make install' from
source; /usr/lib covers apt-get libjemalloc-dev on Debian/Ubuntu;
/usr/lib/libjemalloc_pic.a covers 'apk add jemalloc-dev' on Alpine
and Wolfi. All three paths are valid standard locations — missing
one silently disables jemalloc, which is surprising behavior.
4. t/Makefile: thread $BIN through the 'make check' target. The
previous hardcoded 'dgraph' path would report 'dgraph binary not
found' even when the renamed binary exists. '$${BIN:-dgraph}'
keeps upstream behavior identical while allowing overrides.
TestNameSpaceAwareRestoreOnSingleNode and TestNamespaceAwareRestoreOnMultipleGroups rely on Docker port mappings that are unreliable on macOS during the restore phase (ports can flap between the snapshot+load and the post-restore connect). Linux CI runs see the test green consistently; macOS dev runs intermittently fail with connection-refused or nil-pointer crashes in getPortMappingsOnMac (fixed separately in the buildvars commit of this series). Gate both tests with runtime.GOOS != "linux" so developers running the full systest suite locally on macOS don't see spurious failures. CI behavior is unchanged.
Strips mentions of a specific private fork (by name) from the public
API surface of buildvars:
- Rename env var ISTARI_COMPOSE_BUILD_DIR -> DGRAPH_COMPOSE_BUILD_DIR
to match the dgraph/ binary naming convention instead of naming
a specific downstream.
- Rewrite godoc that mentioned a fork by name ('istari', 'Istari',
'dgraph-sec', 'istari/gopkg/env/...') to generic language:
'a private fork', 'a fork that needs ...', 'an init()-time
SetDefault call in the package their registration tag guards'.
- Test fixture uses 'myfork requirefips' instead of 'istari
requirefips' as the extra-tags value so the test doesn't read
like an advertisement for a particular downstream.
No behavior changes — buildvars.ComposeBuildDir.Get() still produces
an empty string by default; just the env var name changed.
…Go ident ExtraBuildTags → PrivateBuildTags Two consistency-focused renames, one commit: 1. buildvars/cmd/buildvarprint → buildvars/cmd/buildvars The CLI was named after its original one-shot purpose (print build-env vars). As the registry grew into a general config surface, the "print" suffix became a misnomer. The new path aligns the CLI name with the top-level package name. 2. Var identifier ExtraBuildTags → PrivateBuildTags The env var backing this Var has been PRIVATE_BUILD_TAGS since before the registry existed (it is the long-standing Makefile knob a fork appends its build tags through). The Go ident "ExtraBuildTags" did not match, forcing every reader to chase godoc to reconcile it. Rename the Go ident to PrivateBuildTags so it pairs consistently with the env-var name, the way every other Var in the registry pairs its ident and env name (Bin/BIN, DockerImage/DOCKER_IMAGE, etc.). The env var name does not change — no Makefile, CI, or script needs to change. Godoc updated to note that "private" here means fork-local build choices, not access control. Also renamed the test function TestBuildTags_ComposesWithExtraBuildTags → TestBuildTags_ComposesWithPrivateBuildTags to match. Tested: go build ./buildvars/... , go test ./buildvars/ both pass.
Introduces two minimal, coordinated FIPS-awareness signals in package x:
- FIPSEnabled (var, default false): compile-time signal that this
binary is FIPS-enforcing. Declared as a var (not a const) so a
sibling file guarded by a FIPS-enforcing build tag (e.g. a Chainguard
go-fips build under requirefips, or a microsoft/go build) can flip
the value to true via package init(), before any test or main body
runs. For practical call-site purposes — `if x.FIPSEnabled { ... }`
— the var behaves identically to a compile-time constant.
- FIPSBinary() (function, returns os.Getenv("DGRAPH_FIPS_BINARY") == "1"):
runtime signal that the dgraph binary under test (not necessarily
the test binary itself) is FIPS-restricted. Covers the case where
an integration test binary is built without the FIPS tag but
spawns a FIPS-enforcing dgraph subprocess — e.g. upgrade-path
tests that git-checkout a historical SHA and build the "old"
binary with a FIPS toolchain.
Neither symbol has any behavior on a stock upstream build: FIPSEnabled
stays false, FIPSBinary() returns false unless the operator explicitly
sets DGRAPH_FIPS_BINARY=1. The utility is for testing against FIPS-
enforcing binaries — test helpers can now branch cleanly on either
signal (the test binary IS FIPS, or the binary it WILL launch is FIPS)
without touching fork-specific files.
Tested: go build ./x/... , go test -run TestFIPS ./x/ both pass. Added
TestFIPSEnabledDefault and TestFIPSBinaryEnv covering the env-var
parsing (only exact "1" is true).
…hooks
Introduces three public function-var hooks in a new testutil/hooks.go:
var ComposeArgs = defaultComposeArgs // returns "-f <path>"
var EnvForCompose = defaultEnvForCompose // returns nil
var BuildPlugins = defaultBuildPlugins // in-process `go build -buildmode=plugin`
Each carries godoc naming the upstream-pristine default and a concrete
downstream use case. The defaults preserve the existing behavior
bit-for-bit: testutil.StartAlphas / StopAlphasForCoverage build the
same docker-compose argv they always did; testutil.GeneratePlugins
delegates to defaultBuildPlugins which is the verbatim body of the
previous GeneratePlugins function.
Call sites refactored:
- testutil/bulk.go: StartAlphas and StopAlphasForCoverage use
ComposeArgs(compose) to build file-selector args and EnvForCompose()
to set cmd.Env. Upstream behavior unchanged (ComposeArgs returns
"-f path", EnvForCompose returns nil -> os.Environ() unchanged).
- testutil/plugin.go: GeneratePlugins is now a thin wrapper that
calls the BuildPlugins hook. defaultBuildPlugins holds the
previous implementation and runs stock `go build -buildmode=plugin`.
The hook pattern lets a downstream fork replace any of these three
defaults from its own package init without modifying testutil call
sites. Stock upstream builds see no behavioral change.
Tested: go build ./testutil/ , go test -count=1 ./testutil/ both pass.
…ements
Adds public hook surface to the t test runner and threads several
resilience and observability improvements through startCluster,
stopCluster, runTestsFor, and the cluster-lifecycle helpers.
New hook file t/hooks.go declares two public function-var hooks:
var EnvForCompose = defaultEnvForCompose // returns nil
var ComposeFileArgs = defaultComposeFileArgs // returns ["-f", path]
Both default to the previous inline behavior. A downstream fork that
ships a compose-file generator can override ComposeFileArgs to rewrite
the path + emit --project-directory so relative bind-mount sources
resolve against the pristine test package dir rather than the overlay
output dir. EnvForCompose lets a fork inject e.g. UID/GID so
`${UID:-65532}` placeholders in generated compose files resolve to the
host UID and bind-mounts stay writable.
Resilience improvements (no hook needed; universally beneficial):
- startCluster captures docker-compose stderr into a strings.Builder
and surfaces it on failure. The previous code set cmd.Stderr = nil,
which suppressed actionable error text (volume-init races, port
conflicts, image pull failures) and made CI-only bring-up failures
unreadable. Keep Stderr captured; only print on failure.
- startCluster retries `docker compose up` once after a full `down -v`
when the first attempt fails. Covers a docker volume-init race when
multiple services share a named volume: on first use docker populates
the volume and concurrent services can collide with "failed to mkdir
.../_data/<entry>: file exists". One retry is enough for this class;
anything persistent is a real configuration error.
- stopCluster log-dump iterates every container in the project prefix
(AllContainers) rather than guessing names. Previous code hard-coded
"zero0..zero3" and "alpha0..alpha6" which often misses containers
(e.g. minio, nfs-backup) and tries to fetch logs for names that do
not exist. Output is now a fenced block per container for easier
triage from CI log viewers.
- `defer stopCluster(compose, prefix, wg, err)` was wrapped in a
closure so the err value passed to stopCluster is the value AT
DEFER RUN TIME (after runTestsFor returns), not the defer-statement-
time value which is always nil. This makes stopCluster actually
dump container logs on test failure; the previous form silently
suppressed them.
Helper changes:
- testTags(baseTag) returns a comma-joined tag list composed of the
optional base tag + every tag in buildvars.GoRunTags. Replaces
inline "-tags=integration" strings at 2 call sites. Stock upstream
builds see identical behavior (GoRunTags is empty); the helper
lets a fork thread fork-only tags into `go test` without forking
t.go.
- ComposeFileArgs call sites: startCluster (up + down retry),
resumeDefault (stop + start). 4 total.
Tested: go build ./t/ passes (OSS mode).
…→ copyFile
Adds public function-var hooks to two more test-infra packages so
downstream forks can customize behavior without modifying call sites.
compose/hooks.go (new, 1 hook):
var EmitUser = defaultEmitUser
Default returns empty string (no user: field emitted in generated
compose services, matching upstream behavior). A fork may return
e.g. "${UID:-65532}" to pin containers to the host UID for bind-mount
compatibility under nonroot runtime images. Wired in compose.go Service
generation.
dgraphtest/hooks.go (new, 6 hooks):
var ApplyContainerUser // no-op default; fork pins cfg.User to host UID
var WidenTempDirPerms // no-op default; fork widens 0700 -> 0755
var WidenSecretFilePerms // no-op default; fork widens 0600 -> 0644
var GeneratePlugins // returns (_, handled=false, nil); fork builds plugins in its toolchain
var SetupLinuxBinaries // returns (handled=false, nil); fork runs its two-binary staging
var HostBinaryName // returns ""; fork returns its custom binary name
Call sites added in dgraphtest/local_cluster.go, dgraphtest/image.go,
dgraphtest/load.go — all preserve upstream behavior when the hooks
return their zero values. Rough call-site counts:
- local_cluster.go: 5 hook calls (2 WidenTempDirPerms, ApplyContainerUser,
2 WidenSecretFilePerms, 1 GeneratePlugins)
- image.go: 1 hook call (SetupLinuxBinaries)
- load.go: 1 hook call (HostBinaryName)
Collateral rename in dgraphtest/image.go:
func copy(src, dst string) error -> func copyFile
The previous name shadowed the Go built-in `copy()`. The local function
takes (src, dst string) and has no relation to the built-in, but the
shadow was a subtle code-review hazard for anyone skimming the file.
Rename all 3 call sites to copyFile; drop the shadow.
Tested: go build ./compose/ , go build ./dgraphtest/ both pass.
…kip helpers Widens every HMAC/JWT/MinIO test secret to meet the 14-byte (112-bit) key-length minimum that NIST SP 800-131A requires and that some FIPS- validated crypto providers enforce at EVP_MAC_init. Upstream builds see no behavioral change — a longer HMAC key is always acceptable. The wider keys let the test suite also run against a FIPS-enforcing alpha/ minio subprocess without hitting "invalid key length" crashes. Key renames (neutral, describe length without leaking FIPS rationale): "secret" -> "secret-long-enough" (schema VerificationKey, JWT signer) "secret1" -> "secret1-long-enough" (namespace-1 multi-tenancy) "secretkey" -> "secretkey-long-enough" (schema VerificationKey, MinIO secret key) Hardcoded HS256 JWT fixtures in graphql/resolve/auth_test.go (7 tokens across TestStringCustomClaim, TestAudienceClaim, TestJWTExpiry) and graphql/e2e/auth/auth_test.go (2 tokens in TestQueryWithStandardClaims) were re-signed with the new "secretkey-long-enough" key. Re-signing used a standalone Python HMAC-SHA256 signer that verifies each tokens existing signature against the old key before replacing the signature, so any token not signed with the expected key is left untouched. FIPS-aware test skips: check_upgrade/check_upgrade_test.go and systest/integration2/acl_test.go add a local skipIfFIPSBinary(t) helper that calls t.Skip when either x.FIPSEnabled is true (this test binary is FIPS-tagged) or x.FIPSBinary() is true (operator set DGRAPH_FIPS_BINARY=1). Upgrade- path tests pin a specific pre-FIPS upstream SHA and build the old binary at runtime; that commit predates any FIPS-enforcing toolchain, so running these tests under a FIPS configuration fails pointlessly. acl/jwt_algo_test.go (TestACLJwtAlgo) skips the EdDSA case when FIPS is in effect. Ed25519 is outside the validated FIPS boundary for Gos crypto library so EdDSA cluster startup panics under FIPS. All skip helpers use x.FIPSBinary() (the env-gated runtime signal introduced in the previous commit) rather than reading os.Getenv inline, so the env-var contract is centralized. Tested: go build ./..., go test -run "TestStringCustomClaim|TestAudienceClaim|TestJWTExpiry" ./graphql/resolve/ (passes; re-signed JWTs verify), go build -tags=integration2 ./graphql/... ./acl/ ./check_upgrade/ ./systest/integration2/ all pass.
Previously the map literal had extra spacing alignment (backup " ", config " ", draining " ", restore " ") that gofmt no longer emits \u2014 the current gofmt collapses to single-space after the colon when any key exceeds a threshold. Also the file was missing a trailing newline. Apply `gofmt -w graphql/admin/admin_auth_test.go`. No functional change.
Code review fixes:
dgraphtest/load.go + dgraphtest/image.go:
- HostDgraphBinaryPath used hardcoded "dgraph" on Linux instead of
buildvars.Bin.Get(). copyBinary (image.go) already writes the
binary to <tempBinDir>/buildvars.Bin.Get() so a user who sets
BIN without also setting HostBinaryName would get a stale path.
Fix: consult buildvars.Bin.Get() in the Linux branch.
- "dgraph_host" was hardcoded in two places (image.go line 85,
load.go line 43) and could typo-drift. Extract to a package-level
constant hostBinaryFileName in load.go; reference from both.
- image.go line 83 "hostSrc" was $GOPATH/bin/dgraph; switch to
$GOPATH/bin/buildvars.Bin.Get() for consistency with the binary
name the rest of the package uses.
- Godoc on HostDgraphBinaryPath updated to reference buildvars.Bin
and the HostBinaryName hook so the contract is self-documenting.
t/hooks.go:
- defaultComposeFileArgs ignored the baseDir parameter but had a
named receiver, which reads as if the default would use it. Rename
to _ and add a brief godoc explaining why the parameter exists
even though the default ignores it (so overrides that need it
need not thread it through some other channel).
buildvars/buildvars.go + buildvars/buildvars_test.go + buildvars/cmd/buildvars/main.go:
- Copyright year bumped from 2017-2025 to 2017-2026, matching the
other new files in this PR (hooks.go files, x/fips_base.go). All
files in this PR now have consistent 2026 copyright.
- Example binary name "dgraph-sec" in godoc and test fixtures
replaced with "myfork-dgraph" — consistent with the "myfork
requirefips" placeholder already used in the ExtraBuildTags
godoc, so no downstream fork is named by example.
Tested: go build ./... , go test ./buildvars/ ./x/ ./graphql/resolve/ -run "TestStringCustomClaim|TestAudienceClaim|TestJWTExpiry" all pass. gofmt -l clean on touched files.
HTTP /health returns OK before alpha has fully re-established its internal connection pool to zero after a docker-compose start. When tests run immediately after resumeDefault, the first query (DropAll, Alter, Login) occasionally hits "No connection exists" because groups().Leader(0) is still nil even though /health is 200. Add a 5-second sleep after the health-check wait-group in resumeDefault so membership sync can complete before the next test runs. Cheap compared to a whole cluster recreate on failure. Does not affect startCluster (the initial bring-up already includes its own health checks + retries). Surfaced by a test harness running many pause/resume cycles on the default cluster: the first test after a resume failed in 0.08s on DropAll/Alter/Login with "No connection exists", while subsequent runs of the same test pass.
Observed in CI that the previous retry-once logic was insufficient for the docker volume-init race when multiple services share a named volume. The retry happened only milliseconds after `down -v`, which was not long enough for docker to release the volume mount points from the first failed attempt. The second `up` tried to populate a fresh volume and hit the same race. Two fixes: 1. Increase upAttempts from 2 to 3 (one extra retry). 2. Sleep 2 seconds between `down -v` and the next `up` so docker has time to release mount points. The cost is 2 seconds in the rare case where bring-up succeeds on the retry, versus a full cluster-recreate path (10+ seconds) on failure. Cheap.
Resolves conflict in dgraphtest/dgraph.go by combining: - buildvars.GoBinDgraphPath.Get() from our buildvars-registry PR - myAddrOverride handling from upstream #9680 Both changes apply cleanly to the same line of zero.cmd(); the merge preserves both. Alpha command line is unchanged (it was not touched by 9680). Upstream changes brought in: - conn/node.go, conn/pool.go, worker/groups.go: leader-driven reconciliation of zero --my address - dgraph/cmd/zero/raft.go, zero.go: ConfChangeUpdateNode proposal loop - dgraph/cmd/debug/wal.go: WAL replay of updated address - dgraphtest/local_cluster.go, zero_state.go: test harness for above - systest/integration2/zero_addr_test.go: new test No behavior change to the buildvars-registry PR scope.
Matches commit 220a9ce5d on the fork. The Go code at: systest/bulk_live/common/bulk_live_fixture.go:128 systest/backup/encryption/backup_test.go:112 systest/backup/minio/backup_test.go:108 systest/cloud/cloud_test.go:165,169 testutil/graphql.go:284 testutil/minio.go widens MINIO_SECRET_KEY from "secretkey" to "secretkey-long-enough" (part of the HMAC-widening commit 8d185f9). But the MinIO server configured by the corresponding .env files still used "secretkey", causing TestBulkCases, TestBackupMinioE, TestBackupMinio to fail with "The request signature we calculated does not match the signature you provided" when the Go client tries to sign with the widened key against a MinIO server expecting the narrow one. Fix: update MINIO_SECRET_KEY to "secretkey-long-enough" in all three .env files: - dgraph/minio.env - systest/backup.env - systest/export/export.env Server and client now agree. Observed failing in upstream PR CI runs 24878931876 (dgraph-load-tests) and 24878931803 (dgraph-systest-tests).
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two bodies of work that together make the dgraph codebase easier to maintain downstream while preserving identical upstream behavior:
os.Getenvcalls. 5 commits.testutil,t,compose,dgraphtest), plus a minimal FIPS-awareness surface inx/(var + env helper), plus universally-beneficial observability and resilience improvements (stderr capture on failure, volume-race retry, test-skip helpers). 7 commits.All 12 commits preserve upstream behavior byte-for-byte: every hook defaults to the exact code it replaces; every widened test secret is still valid under stock HMAC; every FIPS signal is opt-in via env and defaults off. The net effect on an upstream consumer is no behavior change; the net effect on a downstream fork is that none of these concerns require patching upstream files.
Commits
buildvars registry (cmds 1–5, unchanged from prior review)
feat(buildvars): introduce buildvars registry— newbuildvars/package (363 LOC + 324 LOC tests) +buildvars/cmd/buildvarsCLI withshell/make/plainoutput.refactor(buildvars): replace hardcoded binary/image literals— migrates 4 Go files and 3 shell scripts to read from the registry. Includes an incidental nil-panic fix ingetPortMappingsOnMac.fix(makefile): make BIN overridable—dgraph/MakefileusesBIN ?= dgraph,go build -o $(INSTALL_TARGET)(respects renames), and broader jemalloc detection (/usr/lib,libjemalloc_pic.afor Debian/Alpine/Wolfi).t/Makefile'schecktarget threads$BIN.test(restore): skip on non-Linux—namespace-aware/restore_test.goskipped locally; CI unchanged.docs(buildvars): drop fork-specific names from godoc + tests— env var renamed toDGRAPH_COMPOSE_BUILD_DIR; generic language throughout.Extension hooks + FIPS surface + resilience (6–12, new in this update)
refactor(buildvars): rename cmd/buildvarprint → cmd/buildvars; align Go ident ExtraBuildTags → PrivateBuildTags— two consistency-focused renames. The CLI dir now matches the top-level package name. The Go identifierPrivateBuildTagsnow pairs with env varPRIVATE_BUILD_TAGS, matching every other Var in the registry (Bin/BIN, etc.).feat(x): add FIPSEnabled var + FIPSBinary() runtime helper— introduces two minimal, coordinated FIPS-awareness signals in packagex:FIPSEnabled(var, default false): compile-time signal. Declared as a var (not a const) so a build-tag-guarded sibling file in a downstream fork can flip it viainit()before any test or main body runs. For all practical purposes behaves as a compile-time constant forif x.FIPSEnabled { ... }guards.FIPSBinary()(function): runtime signal viaDGRAPH_FIPS_BINARY=1. Covers the case where the test binary is built without FIPS tags but spawns a FIPS-enforcing dgraph subprocess."1"is true) and the default-false invariant.refactor(testutil): extract ComposeArgs, EnvForCompose, BuildPlugins hooks— three public function-var hooks in newtestutil/hooks.go.StartAlphas/StopAlphasForCoveragecall the compose-args and env-for-compose hooks.GeneratePluginsdelegates to theBuildPluginshook (default is the verbatim previous implementation). Upstream behavior byte-exact.test(t): add EnvForCompose + ComposeFileArgs hooks; resilience improvements— hook surface int/hooks.go+ threaded-through the test runner. Plus three universally-beneficial improvements:startClustercaptures docker-compose stderr and prints on failure. Previouscmd.Stderr = nilsuppressed actionable error text (volume races, port conflicts, pull failures).startClusterretriesdocker compose uponce afterdown -von failure. Covers the docker volume-init race when multiple services share a named volume. One retry catches the class; anything persistent is a real error.stopClusteriterates every container in the project prefix rather than guessingzero0..zero3, alpha0..alpha6. Output is a fenced block per container for easier CI-log triage.defer stopCluster(..., err)was capturing err at defer-statement time (always nil), silently suppressing container-log dumps on test failure. Wrap in a closure to pick up the final err value.testTags(baseTag)helper replaces inline"-tags=integration"strings and composes withbuildvars.GoRunTags.refactor(compose,dgraphtest): add extension hooks; rename local copy → copyFile— one hook incompose/hooks.go(EmitUser, wired into service generation), six hooks indgraphtest/hooks.go(ApplyContainerUser,WidenTempDirPerms,WidenSecretFilePerms,GeneratePlugins,SetupLinuxBinaries,HostBinaryName, wired intolocal_cluster.go/image.go/load.go). All hooks default to no-op / upstream-pristine behavior. Also renames the localfunc copy(src, dst string) errorindgraphtest/image.gotocopyFile(the name shadowed the Go builtincopy).test: widen HMAC/JWT/MinIO test secrets >= 14 bytes; add FIPS-aware skip helpers— widens every HMAC/JWT/MinIO test secret to the 14-byte minimum NIST SP 800-131A requires and some FIPS-validated crypto providers enforce atEVP_MAC_init:"secret"→"secret-long-enough"(schema VerificationKey, JWT signer)"secret1"→"secret1-long-enough"(multi-tenancy)"secretkey"→"secretkey-long-enough"(schema VerificationKey, MinIO secret key)graphql/resolve/auth_test.goandgraphql/e2e/auth/auth_test.gore-signed with the new key (verified against old key before replacing signature; no stray tokens modified).TestCheckUpgrade,TestQueryDuplicateNodes,TestRestoreFromOldBackup) and EdDSA (TestACLJwtAlgo) gain a localskipIfFIPSBinary(t)helper that callst.Skipwhenx.FIPSEnabled || x.FIPSBinary()is true. No-op on stock upstream builds.style(admin): gofmt admin_auth_test.go—gofmt -w. Collapsed alignment in a map literal + trailing newline. No functional change.Why
All items from the prior PR body still apply. The expanded scope comes from discoveries while maintaining a downstream fork that:
--project-directoryprojection)Every single one of these concerns was reachable by patching upstream files with
//go:buildguards or inlineif os.Getenv(...)checks. Each such patch is a maintenance burden: the fork tracks upstream changes to the patched file and must re-apply edits on every rebase. Extracting extension hooks eliminates that burden — the fork owns a single registration file per package and never touches upstream source again. Upstream sees only: neutral public vars with upstream-pristine defaults, and resilience improvements (stderr capture, volume-race retry, fenced log output) that help every user.API
Tests
buildvars/buildvars_test.go: 18 unit tests (Get/Default/SetDefault/Export/ExportAll, env-override precedence, derived Vars, concurrent access).x/fips_base_test.go:TestFIPSEnabledDefault(invariant) +TestFIPSBinaryEnv(env-var parsing across 6 states).graphql/resolve/auth_test.go:TestStringCustomClaim,TestAudienceClaim,TestJWTExpiryall pass with re-signed JWT fixtures.go build ./..., excludingtestutil/custom_plugins/*plugin-mode sources andgraphql/benchtest-only package) clean.go vetclean on all touched packages (pre-existing warnings inconn/pool.go,schema/schema.go,graphql/admin/restore.gountouched).gofmt -lclean on all touched files (the one remaininggofmthit is generatedprotos/pb/pb.pb.go).Backwards compatibility
DGRAPH_FIPS_BINARY=1is set or the binary is built with a FIPS-enforcing tag.copy→copyFilerename indgraphtest/image.gois a local function; no exported API touched.buildvars.ExtraBuildTags→buildvars.PrivateBuildTagsGo ident rename does not change the env var name (PRIVATE_BUILD_TAGSunchanged).Notes for review
© 2017-2026 Istari Digital, Inc. SPDX-License-Identifier: Apache-2.0on new files. Matches the repository convention for externally-contributed Apache-2.0 additions.hooks.gofile names the upstream-pristine default and a concrete downstream use case, so a reader unfamiliar with the pattern can follow it from the declaration alone.x/fips_base.gois intentionally minimal: two symbols, no FIPS-specific logic. The actual FIPS cipher/curve/JWT-algorithm helpers stay in the fork (they are specific to Chainguard go-fips and MS-Go requirefips toolchains, not generally applicable upstream).