Harden release paths and polish statusline docs #31
Merged
Pin native binary checksums inside the npm package and validate release tarball extraction before installing so postinstall no longer trusts a mutable checksum asset beside the payload. Add release-version guardrails to keep tag, Cargo, npm, installer, and checksum metadata aligned.

Constraint: G001 ultragoal requires npm supply-chain hardening before the next story
Rejected: Continue downloading .sha256 from GitHub Release as the trust root | it can be swapped with the tarball by the same compromised release boundary
Confidence: high
Scope-risk: moderate
Directive: Update npm/checksums.json whenever VERSION advances, and keep workflow tag verification before publishing
Tested: node --check npm/install.js; node --check npm/verify-release.js; node npm/verify-release.js 0.7.0; npm view [email protected] version; cd npm && npm pack --dry-run; node npm/install.js; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings
Not-tested: GitHub Actions OIDC publish execution in remote runner

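
The pinning idea in the commit above can be illustrated with a small sketch. It is written in Rust only for consistency with the rest of the project; the real check lives in npm/install.js and is not shown in this PR page. The manifest shape (target name mapped to a hex SHA-256 digest), the `PinError` type, and the `check_pinned` function are all hypothetical, and the digest of the downloaded tarball is assumed to be computed elsewhere.

```rust
use std::collections::HashMap;

/// Why a pinned-checksum check failed. Every variant means "do not install".
#[derive(Debug, PartialEq)]
pub enum PinError {
    /// The manifest has no entry for this target, so there is nothing to trust.
    MissingTarget,
    /// The pinned value is not a 64-character hex SHA-256 digest.
    MalformedPin,
    /// The downloaded payload hashed to something other than the pinned value.
    Mismatch,
}

/// Returns true when `s` looks like a hex-encoded SHA-256 digest.
fn is_sha256_hex(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Compare a digest computed from the downloaded tarball against the value
/// pinned inside the package itself (hypothetical shape: target -> hex digest).
/// Nothing fetched alongside the tarball is consulted.
pub fn check_pinned(
    manifest: &HashMap<String, String>,
    target: &str,
    actual_hex: &str,
) -> Result<(), PinError> {
    // A missing entry fails closed instead of falling back to a remote checksum.
    let pinned = manifest.get(target).ok_or(PinError::MissingTarget)?;
    if !is_sha256_hex(pinned) {
        return Err(PinError::MalformedPin);
    }
    if pinned.eq_ignore_ascii_case(actual_hex) {
        Ok(())
    } else {
        Err(PinError::Mismatch)
    }
}
```

The point of the sketch is the trust root: the expected digest travels inside the immutable package, so swapping both the tarball and a sidecar checksum on the release page does not help an attacker.
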
Release and publish jobs now verify packaged/downloaded tarballs against the npm-pinned checksum manifest, while install cleanup avoids async temp deletion and cross-device rename exposure.

Constraint: G001 ultra-review found the checksum manifest was not tied back to release artifacts.
Rejected: Trust checksums.json format-only validation | it lets stale or mistyped hashes pass until user postinstall.
Confidence: high
Scope-risk: narrow
Directive: Keep npm publish blocked unless downloaded release binaries match npm/checksums.json.
Tested: node --check npm/install.js npm/verify-release.js; node npm/verify-release.js 0.7.0; direct release tarball checksum verification; npm pack --dry-run; node npm/install.js; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Move checksum binding to npm publish time so the immutable npm package pins the exact release tarballs it downloaded, avoiding precomputed gzip tarball deadlock while preserving install-time checksum enforcement.

Constraint: G001 review found release.yml compared non-deterministic tar.gz output against precommitted hashes.
Rejected: Force precomputed tarball hashes in release.yml | gzip headers make archives non-reproducible and new releases cannot know hashes before CI builds them.
Confidence: high
Scope-risk: moderate
Directive: Do not reintroduce release-time comparison against stale source checksums; publish must write checksums.json from verified release artifacts before npm publish.
Tested: node --check npm/install.js npm/verify-release.js npm/test-verify-release.js; node npm/verify-release.js 0.7.0 --verify-packlist; node npm/test-verify-release.js; direct v0.7.0 tarball+sha256 verification; node npm/install.js; npm pack --dry-run; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Keep publish-time checksum generation explicit, require a separate read-only verification pass, and add installer-focused regression tests to prevent supply-chain hardening regressions from merging untested.

Constraint: G001 review found --write-checksums could neutralize --require-checksums when combined.
Rejected: Keep write and require in one verify invocation | it obscures whether a manifest was generated or independently verified.
Confidence: high
Scope-risk: narrow
Directive: Run npm test in CI whenever release verification or installer hardening changes.
Tested: node --check npm/install.js npm/verify-release.js npm/test-verify-release.js npm/test-install.js; npm test --prefix npm; node npm/verify-release.js 0.7.0 --verify-packlist; direct v0.7.0 tarball+sidecar write then require verification; write+require negative check; node npm/install.js; npm pack --dry-run; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; workflow YAML parse; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Ensure manual release and npm publish dispatches operate only on refs/tags commits, remove the production installer test bypass, and make publish-time verification reject malformed release tarballs before generating package checksums.

Constraint: G001 ultra-review found dispatch could resolve a mutable branch named like a release tag.
Rejected: Trust checkout ref name alone | GitHub checkout can resolve branch-like refs without proving refs/tags identity.
Confidence: high
Scope-risk: moderate
Directive: Keep release and publish workflows pinned to refs/tags/ and verify HEAD equals the tag commit before producing or publishing assets.
Tested: node --check npm/install.js npm/verify-release.js npm/test-verify-release.js npm/test-install.js; npm test --prefix npm; node npm/verify-release.js 0.7.0 --verify-packlist; direct v0.7.0 tarball+sidecar layout/write/require verification; write+require negative check; node npm/install.js; npm pack --dry-run --ignore-scripts; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; workflow YAML parse; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Close the remaining release-dispatch and publish-gate gaps by proving workflows run from immutable tag commits, validating release tarball layout before checksum generation, and bounding installer downloads.

Constraint: G001 review found manual dispatch could resolve mutable branch refs and publish verified checksums without proving tarball installability.
Rejected: Accept version-file checks as tag proof | matching versions do not prove HEAD is refs/tags/.
Confidence: high
Scope-risk: moderate
Directive: Release/publish workflows must continue checking refs/tags/ identity before asset production or npm publish.
Tested: node --check npm/install.js npm/verify-release.js npm/test-verify-release.js npm/test-install.js; npm test --prefix npm; node npm/verify-release.js 0.7.0 --verify-packlist; direct v0.7.0 tarball+sidecar layout/write/require verification; write+require negative check; node npm/install.js; npm pack --dry-run --ignore-scripts; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; workflow YAML parse; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Require manual dispatches to declare the tag commit, upload per-target provenance with release assets, and verify that npm checksum generation uses artifacts from the expected immutable commit.

Constraint: G001 architecture review found filename-only release asset checks could accept stale assets after tag movement.
Rejected: Treat current remote tag equality as sufficient provenance | it does not bind already-uploaded assets to the expected source commit.
Confidence: high
Scope-risk: moderate
Directive: Preserve provenance verification whenever publish regenerates npm/checksums.json from GitHub Release assets.
Tested: node --check npm/install.js npm/verify-release.js npm/test-verify-release.js npm/test-install.js; npm test --prefix npm; node npm/verify-release.js 0.7.0 --verify-packlist; direct v0.7.0 tarball+sidecar layout/write/require verification; write+require negative check; node npm/install.js; npm pack --dry-run --ignore-scripts; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; workflow YAML parse; git diff --check
Not-tested: GitHub Actions release/publish jobs live against a new tag

Revalidate cached Codex enrich decisions against a lightweight rollout-tree fingerprint, cap candidate scan work, and avoid fixed per-candidate first-line allocation while preserving fail-safe ambiguity behavior.

Constraint: G002 requires cached sessions not to bypass new same-cwd ambiguity and scan work to stay bounded.
Rejected: Always trust cached path mtime alone | it cannot detect newly-created same-cwd rollouts.
Confidence: high
Scope-risk: moderate
Directive: Keep Codex enrich fail-safe; budget-exceeded or ambiguous scans must not populate the cache.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Real large ~/.codex session tree latency on user machines.

Bind cache reuse to the normalized cwd that produced the single candidate, and replace per-hit rollout-file fingerprint scans with recent day-directory mtime checks plus a total directory-entry scan budget.

Constraint: ultra-review-loop found HIGH blockers for cwd-unbound cache reuse and cache-hit file scanning on the statusline hot path.
Rejected: Reusing cache by session key and file mtime alone | it can display a previous cwd's Codex session after key reuse or pane drift.
Rejected: Fingerprinting every rollout file on every cache hit | it regresses the intended cached hot path.
Confidence: high
Scope-risk: moderate
Directive: Cache hits must remain cwd-bound; new rollout creation should force a rescan via day-directory mtime changes.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Filesystems that do not update directory mtimes on rollout creation.

Record stale same-cwd rollout peers during full scan and invalidate cache reuse if any watched peer becomes fresh without a day-directory mtime change.

Constraint: ultra-review-loop round 2 found existing stale same-cwd rollouts could become fresh without changing the day-directory fingerprint.
Rejected: Trusting day-directory mtime alone | existing rollout appends update file mtime, not necessarily parent directory mtime.
Confidence: high
Scope-risk: moderate
Directive: Keep cache reuse conditional on cwd match, stable day fingerprint, fresh cached path, and stale watched peers.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Filesystems with unusual utimes behavior for test fixtures beyond POSIX-style mtime updates.

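
The four conditions named in the directive above can be read as a single gate. The following is a minimal sketch, not the project's code: `ReuseProbe`, `CacheDecision`, and `decide` are hypothetical names, and modelling each fact as `Option<bool>` (with `None` meaning "could not be established") is an assumption based on the fail-safe wording used throughout this PR.

```rust
/// Hypothetical snapshot of the four facts the directive lists.
/// `None` means the fact could not be established (for example, a stat failed).
pub struct ReuseProbe {
    pub cwd_matches: Option<bool>,
    pub day_fingerprint_stable: Option<bool>,
    pub cached_path_fresh: Option<bool>,
    pub watched_peers_still_stale: Option<bool>,
}

#[derive(Debug, PartialEq)]
pub enum CacheDecision {
    /// Every condition was positively proven; the cached session may be shown.
    Reuse,
    /// At least one condition failed or was unknown; fall back to a full scan.
    Rescan,
}

/// Reuse the cache only when all four facts are proven true.
/// Unknown facts count as failures, so uncertainty never extends cache life.
pub fn decide(probe: &ReuseProbe) -> CacheDecision {
    let facts = [
        probe.cwd_matches,
        probe.day_fingerprint_stable,
        probe.cached_path_fresh,
        probe.watched_peers_still_stale,
    ];
    if facts.iter().all(|fact| *fact == Some(true)) {
        CacheDecision::Reuse
    } else {
        CacheDecision::Rescan
    }
}
```

A watched peer that turns fresh sets `watched_peers_still_stale` to `Some(false)` and forces `Rescan` even when the day-directory fingerprint has not changed, which is the case this commit describes.
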
Cap year/month/day discovery entries before scanning rollout files so pathological sessions trees cannot bypass the G002 scan budget.

Constraint: Round 3 review left date-directory discovery as the remaining scan-work hardening gap.
Rejected: Bounding only rollout file entries | huge bogus date trees can still consume hot-path work before file scanning starts.
Confidence: high
Scope-risk: narrow
Directive: Any discovery overflow must degrade to Ambiguous rather than single-candidate enrichment.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Real-world Codex homes with more than 512 valid year/month/day entries per level.

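
A bounded discovery pass of the kind described above might look like the sketch below. `Discovery` and `discover_bounded` are hypothetical names. Treating a read error the same way as an overflow is an assumption here, although it matches the direction of later commits in this PR. The function is generic over any iterator of `Result` items so it can be fed `std::fs::read_dir` output or test data.

```rust
/// Outcome of a bounded directory-entry discovery pass.
#[derive(Debug, PartialEq)]
pub enum Discovery<T> {
    /// Every entry was seen and the count stayed within the budget.
    Complete(Vec<T>),
    /// The budget was exceeded or an entry could not be read,
    /// so the view of the tree cannot be trusted as complete.
    Ambiguous,
}

/// Collect at most `budget` entries from `entries`.
/// One entry past the budget, or any read error, yields `Ambiguous`.
pub fn discover_bounded<T, E, I>(entries: I, budget: usize) -> Discovery<T>
where
    I: IntoIterator<Item = Result<T, E>>,
{
    let mut seen = Vec::new();
    for entry in entries {
        match entry {
            Ok(item) => {
                // Stop as soon as the budget would be exceeded; no further work is done.
                if seen.len() == budget {
                    return Discovery::Ambiguous;
                }
                seen.push(item);
            }
            Err(_) => return Discovery::Ambiguous,
        }
    }
    Discovery::Complete(seen)
}
```

Applied once per year, month, and day level, a cap like this keeps the work bounded before any rollout file is opened. The Not-tested note above suggests a per-level limit of 512, but the exact value is not confirmed by this page.
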
Partial read failures in the Codex rollout date tree must not collapse to a trusted empty or single-candidate scan, because cache reuse and enrich correctness depend on a complete local view.

Constraint: ultra-review-loop blocker required read_dir and metadata failures to degrade to Ambiguous instead of trusting partial scan state
Rejected: Treat unreadable paths as empty directories | that can hide fresh same-cwd rollouts and revive stale cache confidence
Confidence: high
Scope-risk: narrow
Directive: Keep future Codex cache fast paths fail-safe whenever scan completeness cannot be proven
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Permission-denied chmod scenario under non-macOS filesystems

Codex cache confidence must not survive rollout stat, mtime, or partial-line uncertainty because any hidden same-cwd rollout can make a displayed single session wrong.

Constraint: ultra-review-loop found rollout metadata failures and watched stale-peer stat failures still collapsed to trusted cache/scan state
Rejected: Treat rollout stat errors as stale or non-rollout | that hides incomplete filesystem views and can preserve stale cache confidence
Confidence: high
Scope-risk: narrow
Directive: Preserve tri-state scan semantics for rollout filesystem uncertainty; do not convert unknown rollout state into false/empty
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Non-Unix symlink-specific stat-failure fixtures are cfg(unix)

Codex rollout matching should trust only newline-complete session_meta records; otherwise one partial rollout can hide ambiguity behind a cached or discovered single candidate.

Constraint: ultra-review-loop found short unterminated first lines still parsed as complete enough for scan confidence
Rejected: Only fail on over-limit unterminated lines | short truncated writes carry the same partial-view ambiguity
Confidence: high
Scope-risk: narrow
Directive: Keep candidate scan parsing line-completeness-sensitive; do not parse newline-missing rollout headers as evidence
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: None

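
The line-completeness rule above can be sketched with the standard library alone. `FirstLine` and `read_first_line` are hypothetical names, and the byte limit is a parameter because the project's actual limit is not stated on this page. The sketch reads at most `max_len` bytes of content plus the terminator and reports `Incomplete` whenever the newline is missing, whether the line was too long or simply truncated.

```rust
use std::io::{BufRead, Read};

#[derive(Debug, PartialEq)]
pub enum FirstLine {
    /// The first line, without its trailing '\n'.
    Complete(String),
    /// No '\n' within the limit (truncated or over-long line),
    /// invalid UTF-8, or a read error.
    Incomplete,
}

/// Read the first line of `reader`, accepting it only if it is newline-terminated
/// and its content is at most `max_len` bytes.
pub fn read_first_line<R: BufRead>(reader: R, max_len: usize) -> FirstLine {
    let mut buf = Vec::new();
    // Allow max_len content bytes plus one byte for the '\n' terminator.
    let limit = (max_len as u64).saturating_add(1);
    let mut limited = reader.take(limit);
    if limited.read_until(b'\n', &mut buf).is_err() {
        return FirstLine::Incomplete;
    }
    // A short unterminated line is treated exactly like an over-limit one.
    if buf.last() != Some(&b'\n') {
        return FirstLine::Incomplete;
    }
    buf.pop();
    match String::from_utf8(buf) {
        Ok(line) => FirstLine::Complete(line),
        Err(_) => FirstLine::Incomplete,
    }
}
```

Only a `Complete` line would be handed to the session_meta parser; an `Incomplete` result would feed into the Ambiguous outcome instead of being counted as "not a candidate".
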
Codex session enrichment must only reuse cache or display a single candidate when day fingerprints and rollout headers are proven complete.

Constraint: architecture review found day-dir mtime failures and malformed first lines still collapsed uncertainty into cache or no-candidate state
Rejected: Keep malformed first lines lenient in candidate scanning | leniency belongs in pure parsers, not confidence-building rollout scans
Confidence: high
Scope-risk: narrow
Directive: Do not let optional mtime fields or parser None values participate in cache confidence without an explicit fail-safe gate
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Non-Unix pre-epoch mtime fixture paths are cfg(unix)

Cached Codex sessions should only be reused while the cached path is still provably the same kind of rollout file, not merely a path with matching mtime.

Constraint: ultra-review-loop found cache reuse survived replacing the rollout file with a directory that preserved mtime and day fingerprint
Rejected: Trust cached path mtime alone | directories and non-files can preserve mtime while no longer being rollout data
Confidence: high
Scope-risk: narrow
Directive: Cache-hit gates must prove rollout file identity before trusting cached session data
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: None

Codex cache reuse should require the same bounded visibility proof as a miss path, so stable day mtimes alone cannot bypass scan-entry budget or incomplete-view guards.

Constraint: architecture review found cache hits compared day fingerprints without proving current scan budget/completeness
Rejected: Keep mtime-only fingerprint probe | it preserves latency but cannot prove fail-safe visibility under entry explosion
Confidence: high
Scope-risk: narrow
Directive: Cache-hit probes must continue to fail closed on scan budget or incomplete filesystem visibility before returning cached sessions
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: None

Protect render-time stdin, config reads, and CPU double-sample delay with explicit caps while keeping normal statusline payloads and default sampling behavior unchanged.

Constraint: G003 requires stdin and CPU/config hot-path limits without changing ordinary statusline output.
Rejected: Global async worker or new dependency | too broad for a bounded render-path hardening story.
Confidence: high
Scope-risk: narrow
Directive: Keep statusline input/config limits fail-safe and quiet; oversized or corrupt inputs should degrade to defaults rather than block rendering.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Real-world oversized Claude stdin from an actual host integration.

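
One way to cap an input read without new dependencies is `Read::take`. The sketch below is illustrative only: `read_capped` is a hypothetical helper and the cap is left as a parameter because the project's limits are not stated here. It reads one byte past the cap so that an input of exactly `cap` bytes can be told apart from an oversized one.

```rust
use std::io::Read;

/// Read everything from `reader`, but never buffer more than `cap + 1` bytes.
/// Returns `None` when the input is larger than `cap` or the read fails,
/// so the caller can quietly fall back to defaults.
pub fn read_capped<R: Read>(reader: R, cap: usize) -> Option<Vec<u8>> {
    let mut buf = Vec::new();
    // One extra byte distinguishes "exactly cap bytes" from "more than cap".
    let limit = (cap as u64).saturating_add(1);
    let result = reader.take(limit).read_to_end(&mut buf);
    match result {
        Ok(_) if buf.len() <= cap => Some(buf),
        _ => None,
    }
}
```

A caller could pass `std::io::stdin().lock()` or an opened config file and treat `None` as "use defaults". Note that a size cap bounds memory and parse work; it does not by itself bound how long a read can wait on a writer that never closes the stream.
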
Add regression coverage that exercises the CPU sample-window cap through the sampled core and verifies oversized stdin/config behavior through the actual binary render path.

Constraint: Review loop flagged helper-only coverage as insufficient for G003 hot-path guarantees.
Rejected: Wall-clock latency assertions | live CPU and process scheduling would make the tests flaky.
Confidence: high
Scope-risk: narrow
Directive: Keep hot-path limit tests deterministic through injected seams or isolated HOME/config paths, not timing guesses.
Tested: cargo fmt --check; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: External Claude/Grok review completion; local Claude timed out and Grok required auth in this environment.

Refresh the support matrix/config docs for Claude rate-limits, lterm/cmux, and Codex manual status paths, and make NO_COLOR-mutating render tests restore prior environment state.

Constraint: G004 requires docs refresh plus environment-variable test hygiene without changing runtime behavior.
Rejected: Exporting test-only env utilities or adding a docs generator | broader than the targeted hygiene/docs story.
Confidence: high
Scope-risk: narrow
Directive: Keep process-env tests guarded by ENV_LOCK plus RAII restoration; do not leave NO_COLOR mutated after a test.
Tested: cargo fmt --check; NO_COLOR=1 cargo test render::tests:: --locked; cargo test --all-targets --locked; cargo clippy --all-targets --locked -- -D warnings; git diff --check
Not-tested: Live lterm/cmux or tmux rendering outside the documented command examples.

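
The "ENV_LOCK plus RAII restoration" pattern named in the directive can be sketched as follows. Only the name `ENV_LOCK` comes from the commit message; `EnvVarGuard` and its layout are hypothetical, and the project's real helper may differ. The `unsafe` blocks are there because `std::env::set_var` and `remove_var` are unsafe functions in the Rust 2024 edition; on earlier editions they are redundant, hence the `allow`.

```rust
use std::env;
use std::ffi::OsString;
use std::sync::{Mutex, MutexGuard};

/// Serializes tests that touch process-wide environment variables.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Sets an environment variable for the guard's lifetime and restores the
/// previous state (including "unset") when dropped.
pub struct EnvVarGuard {
    key: &'static str,
    previous: Option<OsString>,
    // Held until after `drop` restores the variable; fields drop after Drop::drop.
    _lock: MutexGuard<'static, ()>,
}

impl EnvVarGuard {
    #[allow(unused_unsafe)]
    pub fn set(key: &'static str, value: &str) -> EnvVarGuard {
        // A test that panicked while holding the lock should not wedge later tests.
        let lock = ENV_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
        let previous = env::var_os(key);
        unsafe { env::set_var(key, value) };
        EnvVarGuard { key, previous, _lock: lock }
    }
}

impl Drop for EnvVarGuard {
    #[allow(unused_unsafe)]
    fn drop(&mut self) {
        match self.previous.take() {
            Some(old) => unsafe { env::set_var(self.key, old) },
            None => unsafe { env::remove_var(self.key) },
        }
    }
}
```

In a test, `let _guard = EnvVarGuard::set("NO_COLOR", "1");` would keep the variable set until the end of the scope and restore it even if an assertion panics. Because the guard holds the lock, creating a second guard on the same thread while the first is alive would deadlock, so each test should hold at most one.
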
Polish README wording and align the design-lab HTML preview with the current calm glyph ramp and COLOR-ONCE contract before opening the PR.

Constraint: User requested README and HTML proofreading plus PR creation after the completed ultragoal work.
Rejected: Editing npm/homebrew README files or changing runtime code | outside the requested README/HTML polish scope.
Confidence: high
Scope-risk: narrow
Directive: Keep design-lab copy synchronized with README theme tokens and glyph-color-only behavior.
Tested: git diff --check
Not-tested: Browser visual rendering of docs/design-lab.html.

Constraint: User requested one more Claude-assisted proofreading pass before PR finalization.
Rejected: Editing docs/design-lab.html in this pass | Claude found no additional HTML issues and README-only wording fixes kept the update scoped.
Confidence: high
Scope-risk: narrow
Tested: python3 HTMLParser on docs/design-lab.html; git diff --check origin/main...HEAD
Not-tested: Browser visual rendering of docs/design-lab.html.

Summary
- docs/design-lab.html with the current calm glyph ramp and COLOR-ONCE preview

Verification
- python3 HTML parser for docs/design-lab.html
- git diff --check origin/main...HEAD
- cargo fmt --check
- npm test --prefix npm
- NO_COLOR=1 cargo test render::tests:: --locked
- cargo test --all-targets --locked
- cargo clippy --all-targets --locked -- -D warnings

Notes
- .serena/ and HANDOFF.md were not staged.