Skip to content

Release v0.25.0#237

Merged
tis24dev merged 27 commits into
mainfrom
dev
Jun 20, 2026
Merged

Release v0.25.0#237
tis24dev merged 27 commits into
mainfrom
dev

Conversation

@tis24dev

@tis24dev tis24dev commented Jun 20, 2026

Copy link
Copy Markdown
Owner

Bootstrap release PR that lands the new immutable-tag-safe governed release pipeline on main AND cuts the first release under it.

This release carries the 22 dev commits ahead of main, including: fail-closed on incomplete backup archives (H04), refuse cross-category hardlinks during selective restore (#27), redact notifier tokens/secret-URLs from logs (#34 #35 #36), notify + metrics on early-init failures, and config docs disambiguation.

Pipeline change in this PR (see ci(release) commit):

  • pr-vX.Y.Z trigger -> auto release PR -> squash-merge -> vX.Y.Z created ONCE on the squash commit -> GoReleaser release -> dev synced from main.
  • release-intake / release-guard / post-merge-release workflows + release-policy.sh; release.yml tag-on-main gate; sync-dev.yml + autotag.yml removed.

IMPORTANT: merge with "Squash and merge" only (post-merge-release asserts a single-parent squash commit). On merge, post-merge-release fast-forwards dev and creates the v0.25.0 tag once, which triggers GoReleaser to publish.

Summary by CodeRabbit

  • New Features
    • Release pipeline helpers and post-merge/tag intake workflows; added concurrency gating for releases
    • Per-installation passphrase salt support and manifest salt embedding for encryption/decryption
    • System accounts restore during staged restores; accounts category staging
    • Dedup auditing + dedup symlink materialization; archive completeness checks
    • Secret redaction in logs and notifier errors
  • Bug Fixes
    • Benign “backup in progress” handling; tighter temp/workspace and registry cleanup security
    • Fail-closed integrity checks for archives and optimizations; improved PBS/PVE exclusion logic
    • Restore firewall restart failure messaging
  • Documentation
    • Updated security/process allowlist guidance and passphrase decryption steps

Greptile Summary

This PR lands the new immutable-tag-safe governed release pipeline on main and cuts v0.25.0. It bundles 22 dev commits covering: fail-closed incomplete-archive detection (H04), cross-category hardlink refusal in selective restore (#27), notifier token/URL redaction from logs (#34–36), per-installation passphrase salt for AGE encryption, accounts category restore with all-or-nothing atomic writes and rollback, temp-workspace security hardening (#53–56), and dedup symlink materialization on restore (#70).

  • Release pipeline: autotag.yml / sync-dev.yml removed; replaced by release-intake → auto PR → squash-merge → post-merge-release (tags once on squash commit) → GoReleaser. Policy is always sourced from trusted origin/main.
  • Backup integrity: CreateArchive now aggregates skipped entries and fails closed on ErrArchiveIncomplete; VerifyArchive adds entry-count reconciliation via a newline-counting writer.
  • Accounts restore: writeFilesAtomic implements two-phase prepare-all / commit-all with per-entry rollback so /etc/passwd, shadow, group, and gshadow are updated all-or-nothing.

Confidence Score: 5/5

Safe to merge.

All core changes are correctly implemented. The only finding is a %w-to-%s error-chain loss in two notifier sites, which has no behavioral impact on backup/restore correctness.

internal/notify/gotify.go and internal/notify/telegram.go: error chain is dropped when wrapping redacted network errors.

Important Files Changed

Filename Overview
internal/backup/archiver.go Adds fail-closed incomplete-archive detection and entry-count reconciliation in VerifyArchive.
internal/orchestrator/restore_accounts.go New file: safe merge of /etc/passwd, shadow, group, gshadow via writeFilesAtomic with rollback.
internal/notify/gotify.go Adds token redaction but changes %w to %s, dropping the error chain.
internal/notify/telegram.go Same broken-chain issue as gotify.go.

Fix All in Claude Code Fix All in Cursor Fix All in Codex

Reviews (2): Last reviewed commit: "fix(orchestrator): write the account DB ..." | Re-trigger Greptile

tis24dev and others added 23 commits June 12, 2026 12:21
…is off (#60)

collectPVEConfigSnapshot only excluded user.cfg/domains.cfg when BACKUP_PVE_ACL
was disabled, so the credential files under /etc/pve/priv (shadow.cfg password
hashes, token.cfg API token secrets, tfa.cfg TFA secrets) still shipped in the
backup, making the toggle misleading. Exclude them too via "**/priv/<file>.cfg"
patterns (token-anchored on priv/, leaving notifications.cfg/authkey.key/acme
untouched), reusing the existing withTemporaryExcludes + ** glob machinery.

/etc/pve is a pmxcfs mount backed by config.db, which still carries the same
secrets when BACKUP_CLUSTER_CONFIG is on; emit a WARNING in
collectPVEClusterSnapshot and document that fully excluding them also requires
BACKUP_CLUSTER_CONFIG=false. Add PVE manifest entries for the three priv files
and an orchestrator consistency test that locks the pve_access_control category
to the restore constants.
…NFIGS is off (#61)

collectPBSConfigSnapshot only excluded user.cfg/acl.cfg/domains.cfg when
BACKUP_USER_CONFIGS was disabled, so the credential files token.cfg (API token
secrets), shadow.json (password hashes), token.shadow and tfa.json (TFA secrets)
still shipped in the backup, making the toggle misleading. Exclude them too via
basename patterns (they are top-level files in /etc/proxmox-backup), reusing the
existing withTemporaryExcludes machinery.

Unlike the PVE sibling (#60) there is no pmxcfs/config.db residual path, so no
extra caveat is needed. Manifest parity is intentionally not added: the PBS
manifest helper warns + counts not-found on absent files (no suppress option),
and these secrets are commonly absent (no TFA/tokens/local-password users), which
would produce misleading warnings. Add an orchestrator consistency test locking
the pbs_access_control category to the restore constants.
The opt-in dedup/prefilter stages could damage backup fidelity:

#71: replaceWithSymlink removed the duplicate then created the symlink, so a
failed os.Symlink lost the staged file. Create the link at a temp name and
os.Rename it over the duplicate, leaving the original untouched on any error.

#72: minifyJSON round-tripped through Unmarshal-into-any + Marshal, losing number
precision (>2^53), duplicate keys and key order. Use json.Compact (token-level
whitespace stripping only) and also skip JSON under sensitive config dirs like
the .conf/.cfg/.ini handling.

#70: dedup replaced duplicate files with symlinks the archiver preserved, so a
selective restore could leave dangling links and a full restore changed file
type. deduplicateFiles now records each created symlink (path+mode) in
var/lib/proxsave-info/dedup_manifest.json; the restore always extracts that
manifest and, after extraction, materializes each dedup symlink back into a
regular file (with containment via restoreFS), removing dangling links with a
warning when the target's category was not selected.
…PVE backup variants (#62 #65)

#62: getDatastoreList logged a failed 'proxmox-backup-manager datastore list'
command and unparsable JSON only at Debug and always returned (datastores, nil),
so a PBS host with a real enumeration failure silently shipped an archive with
incomplete per-datastore status. Log those two failures at Warning (noting the
raw datastore.cfg is still collected); the benign "CLI not present" case stays
Debug. Behaviour is unchanged otherwise (best-effort, never aborts the backup).

#65: defaultPVEBackupPatterns missed legacy vzdump compression variants, so
backups produced with lzop or xz were not recognised for sampling/small-copy.
Add *.vma.lzo/*.vma.xz/*.tar.lzo/*.tar.xz, and unify the duplicated inline
default list in the file sampler to reference defaultPVEBackupPatterns so the two
cannot drift.
…ularity (#59)

BackupManifest.SystemFiles was declared and assigned from c.systemManifest, but
systemManifest was never initialized or populated (unlike pve/pbsManifest), so
system_files was always absent from manifest.json. Populate it during the system
collection phase via a central hook: safeCopyFile records each target's outcome
(collected/not_found/skipped/failed) and safeCopyDir records one entry per
directory target while suppressing per-file recording inside the walk, so the
manifest stays at collection-target granularity (no per-file bloat) and matches
the pve/pbs manifest style. Gated by recordSystemManifest so PVE/PBS/archive
copies are unaffected.
…stead of preserving it (#53)

cleanupBackupWorkspace (deferred at the end of RunGoBackup) preserved the staging
workspace whenever a temp-dir registry was present (the normal CLI path), only
logging "preserved ... will be removed at the next startup". That left plaintext
copies of sensitive files (shadow, SSL/SSH keys) gathered before encryption on
disk for the whole inter-run window (or up to 24h).

Always RemoveAll the workspace when the run finishes (success or failure) and then
Deregister it; if removal fails, keep it registered so the next run's orphan
sweep retries. The registry now serves crash recovery only: a process that dies
before the deferred cleanup leaves a registered orphan that the next run cleans.
…cleanup (#54 #55)

#54: the shared temp root /tmp/proxsave was created/used via MkdirAll + Stat,
which follows symlinks and never checks ownership/mode, so an attacker who
pre-creates it as a symlink or a world-writable/non-root-owned directory before
ProxSave runs could redirect the staging workspace. Add ensureSecureTempRoot,
which Lstats the root, refuses a symlink / non-directory / group-or-world-writable
/ non-root-owned root, and creates it 0700 when absent; use it in
prepareBackupWorkspace, preparePlainBundleCommon and the rclone download path.
CheckTempDirectory now also rejects a symlinked root (osStat followed it).

#55: CleanupOrphaned called os.RemoveAll(entry.Path) with no containment, so a
poisoned registry (or a controlled PROXMOX_TEMP_REGISTRY_PATH) could delete
arbitrary paths. Only remove paths that are non-symlink directories contained
under the trusted temp root and carrying the .proxsave-marker written before
registration; untrusted entries are dropped from the registry without touching
the filesystem.
…walk (#56)

collectCustomPaths copies each operator-supplied path into the staging tempDir
via safeCopyDir. A broad entry ("/", "/tmp", "/tmp/proxsave") made the source
walk descend into the in-progress staging tree and copy it into itself, causing
recursion, churn and disk blow-up.

Add isWithinStagingDir and prune the staging workspace from the source walk
(SkipDir on tempDir and below) in safeCopyDir/safeCopyFile. The prune is scoped
to the custom-path phase via collectingCustomPaths, since every other collection
source is a fixed system path that never contains tempDir; this keeps generic
collection (and its tests) unaffected.
…iles selectively restorable (#66 #67)

Several files are collected into the backup but match no restorable category,
so a selective restore (even "Full") silently omits them; only the plain
fallback restored them.

- Add a Common, staged, opt-in `accounts` category (etc/passwd|group|shadow|
  gshadow|sudoers) with a safe merge-apply (restore_accounts.go) that never
  clobbers the host identity:
  * preserves host root and system accounts (uid/gid < 1000) and never
    overwrites an existing host account or group (gid/members kept);
  * imports only regular backup users (uid >= 1000, 32-bit, not the (uid_t)-1
    sentinel) whose primary gid is not 0 and not any existing host group gid;
  * imports only new regular groups without a gid collision; for an existing
    host group name, merges only imported members and only when the gid matches
    (blocks gid-spoofed/primary-gid injection into sudo/docker/...);
  * keeps passwd<->shadow in sync (locked placeholder when a staged shadow line
    is missing) and replaces /etc/sudoers only when it passes `visudo -c`;
  * gated on root, real FS, dry-run, and an empty/unreadable current passwd.
- Extend `crontabs` with etc/cron.{daily,hourly,weekly,monthly}/ and
  `storage_stack` with etc/keys/, etc/luks-keys/, etc/cryptsetup-keys.d/.
- Add a coverage drift guard test asserting the previously-orphaned paths now
  match a system-writable category.

PBS network.cfg is excluded (false orphan: real network config is
/etc/network/interfaces, already restorable via the staged network category).
…nd (#68)

SYSTEM_ROOT_PREFIX was documented in the backup.env template and supported by
CollectorConfig.SystemRootPrefix, but the top-level config never parsed it and
the orchestrator never copied it into the collector, so the option was inert on
the normal CLI path.

- config.Config: add the SystemRootPrefix field and parse SYSTEM_ROOT_PREFIX in
  parseSystemSettings (trimmed; CollectorConfig.Validate already rejects a
  non-absolute value, and empty/"/" mean real root).
- applyCollectorOverrides: copy cfg.SystemRootPrefix into the CollectorConfig
  (the single production builder).
- collector_pve: route the corosync and vzdump config reads (and their manifest
  records) through new effective*ConfigPath helpers that apply systemPath to the
  absolute defaults too, so the prefix is honored for those files instead of
  reading the real root. No behavior change when the prefix is empty (identity).

Verified by an adversarial review that confirmed the wiring is correct and
surfaced the corosync/vzdump bypass, now closed.
…+ align BACKUP_SCRIPT_REPOSITORY default (#69)

The script-repository collection (collectScriptRepository, snapshotting the
ProxSave install dir, normally /opt/proxsave) only skipped the EXACT top-level
entries `backup`/`log` (parts[0]), so the whole `.git/` tree (history/objects),
the plural `backups`/`logs`, and nested `<sub>/backup|log` leaked into the
archive. And the Go default for BACKUP_SCRIPT_REPOSITORY was `true` while the
shipped template and the project convention say `false`, so a config missing the
key silently re-enabled the snapshot.

- collectScriptRepository: skip by basename at ANY depth via `switch d.Name()` +
  SkipDir for `.git`/`.svn`/`.hg` (VCS metadata) and `backup`/`backups`/`log`/
  `logs` (regenerated output), instead of the parts[0]-only check.
- Flip the BACKUP_SCRIPT_REPOSITORY default to false in config.go (parse) and in
  GetDefaultCollectorConfig (collector.go) so the two defaults agree; explicit
  opt-in is still honored.
- docs/CONFIGURATION.md: the option no longer "includes .git".
- Tests: extend the script-repo test to assert .git/.svn/backups/logs/nested
  backup|log are excluded while real files remain; add a config test pinning the
  false default and the explicit opt-in.

An adversarial review confirmed the core logic (mutation-tested) and surfaced
the collector-default inconsistency and stale doc, both fixed here.
…deleting them on selective restore (#70)

The first #70 fix (6cf7f89) made full restore type-faithful but turned the
selective/staged restore into a DATA-LOSS path: deduplication is category-blind,
so a selected duplicate's symlink could point at a canonical whose category was
not selected; materializeDedupSymlinks then read the canonical from destRoot and,
on a miss, DELETED the user-selected file. It could also copy stale live content,
and the manifest write was best-effort.

Restore side (internal/orchestrator/restore_archive_extract.go):
- materializeDedupSymlinks now rebuilds every deduplicated duplicate from the
  BACKUP ARCHIVE bytes (materializeFromArchive streams the already-decrypted
  archive once, one canonical at a time), writing atomically (temp+rename). It
  never reads the on-disk/live canonical and never deletes a symlink; a canonical
  genuinely absent from the archive leaves the symlink in place.
- A canceled / failed scan keeps the manifest (completed=false) so a re-run can
  finish; a write failure likewise keeps it. A corrupt manifest is removed (not
  left on the live system), and the force-created empty proxsave-info dir is
  cleaned up.
- Removed the isFinalDest gate: materialize is safe on every extraction (never
  deletes, no-op without a manifest), so it also covers the export path (no more
  dangling exported symlinks).

Backup side (internal/backup/optimizations.go):
- replaceWithSymlink uses a unique temp name (never clobbers a real *.dedup.tmp).
- revertDedupSymlink writes temp+rename (atomic; a failed revert never deletes).
- A dedup state that cannot be made safe (manifest unwritten + partial revert)
  is now fatal: ApplyOptimizations returns the error and applyBackupOptimizations
  aborts the backup rather than archiving an unrecoverable tree.

Verified by two adversarial agent pools and a control review; regression tests
cover cross-category rebuild, archive-vs-stale-disk, missing canonical, cancel,
atomic revert, no-clobber and the fatal abort.
…laimed by optimizations (#73)

Collection stats (BytesCollected -> UncompressedSize) and the collection manifest
are snapshotted before ApplyOptimizations runs, but dedup/prefilter then shrink the
staged tree. The only real-consumer leak was the compression ratio: it divided the
faithful compressed/archive size by the stale (pre-optimization) uncompressed size,
over-crediting compression for the bytes dedup/prefilter actually removed.

- ApplyOptimizations now returns OptimizationResult{BytesReclaimed, DuplicatesReplaced}:
  deduplicateFiles sums each replaced duplicate's size (0 when the manifest-failure
  path fully reverts the symlinks); prefilterFiles sums per-file size reductions.
- collectBackupData sets stats.UncompressedSize = BytesCollected - BytesReclaimed
  (BytesCollected stays the honest "bytes read during collection" figure), so the
  ratio in the stats report, notifications and the Prometheus gauge reflects the
  archived payload. No change when optimizations are off, and the #70 abort path
  runs before the correction.
- Document BackupManifest as the pre-optimization collection inventory; it is an
  ExportOnly diagnostic with no read-back consumer, and the authoritative shipped
  record is the archive sidecar (.sha256 / <archive>.manifest.json).

Three analysis agents scoped the impact (only the ratio); a control agent verified
the fix closes #73 (negligible conservative residual: the tiny symlink tar entry is
not subtracted). Test asserts BytesReclaimed/DuplicatesReplaced.
…n random salt (#13)

Passphrase-based AGE encryption derived the X25519 recipient with a fixed, global scrypt salt, so the same passphrase produced the same recipient on every install (correlatable) and enabled cross-install precomputation.

Generate a per-installation random salt at AGE setup, persist it 0600 next to the recipient (identity/age/passphrase.salt), and embed it in every backup manifest (passphrase_salt) so the passphrase alone can re-derive the identity on any host. The scheme stays asymmetric: unattended/cron backups still encrypt to the stored public recipient without the passphrase.

Decryption tries the manifest salt first, then the fixed v1 and legacy salts, so archives produced before this change remain decryptable. Existing installs keep working on the fixed salt until the recipient is regenerated.

Also correct the docs (TROUBLESHOOTING.md, ENCRYPTION.md): passphrase backups must be decrypted via 'proxsave --decrypt' (the recipient is derived, not age's native scrypt mode), not raw 'age --decrypt' (#14).

Tests: per-install isolation, round-trip proving the salt is required, and legacy fixed-salt backward compatibility.
…t cannot be removed (#30)

In PBS Clean (1:1) restore, a failed removal of a target object absent from the backup was only logged as a warning and the apply returned nil, so the restore was reported "completed successfully" even though the target was not actually 1:1.

The 7 applyPBS*CfgViaAPI functions (remote, s3, datastore, sync, verify, prune, traffic-control) now accumulate failed clean-mode removes and return an error wrapping errPBSCleanRemoveIncomplete after create/update complete (no abort). maybeApplyPBSConfigsFromStage detects the sentinel and records the item as failed (so the restore reports "completed with warnings") without invoking the file-based fallback, which would force-rewrite the .cfg and drop the object, bypassing PBS's refusal (e.g. an in-use object). The stale object is left in place (the conservative outcome); only the reporting changes, consistent with the datastore path-mismatch site that already fails closed.

Not data loss: Clean mode only removes config entries of extra objects, and a failed remove leaves the object intact.

Tests: the 7 existing strict-cleanup tests now expect the sentinel and still assert create/update ran (no abort); plus a new maybeApply test asserting the summary error and that the destructive file fallback is not invoked.
…mpt (#28)

When a PVE firewall restore could not restart the firewall service, the failure was only logged as a warning and the operator was still shown a plain "Keep firewall changes?" prompt; pressing Keep disarmed the rollback, so a blind confirm on an uncertain live state could drop the auto-revert.

Capture the restart outcome and, when it failed, prepend a clear warning to the commit prompt ("the firewall service restart FAILED ... If unsure, choose Rollback"). The commit default stays Rollback (auto-rollback on timeout) and an explicit Keep is still honored, so a healthy restore is never aborted and no rollback is forced. The no-rollback-armed branch also logs a clearer warning.

Severity is bounded: .fw files are applied before the restart and pve-firewall is a polling daemon, so a failed restart does not flush the ruleset; this closes the informational gap (blind Keep) without health checks that could revert a healthy restore.

Tests: the firewall test harness now captures prompt messages; the existing "restart failure ... continues" test asserts the warning banner is shown.
…nly on lock contention

Notification channels were registered only after orchestrator/checker/storage init, so a failure in those early phases (encryption_setup, checker_config, storage_init) dispatched to an empty channel set: no Telegram/email/webhook/Gotify alert, and the Prometheus textfile kept the last successful run's metrics so status-based alerting never fired.

- Register notification channels in initializeBackupOrchestrator (before the fallible init phases), single registration, so early-init failures reach the configured channels. The notification itself is unchanged (standard format).

- Gate DispatchEarlyErrorNotification on !dryRun, matching the normal finalize path, so --dry-run never sends real notifications.

- Export a Prometheus fail metric (status=error) on early errors so textfile alerting fires instead of serving a stale success metric.

- Treat 'another backup is in progress' as a benign concurrency skip: new checks.CheckCodeBackupInProgress + orchestrator.ErrBackupInProgress make the run exit 0 with no failure notification, so a contending run no longer sends a spurious FAILURE alert.

- Relabel the storage_init early-error exit code ExitConfigError -> ExitStorageError.

Tests: early error notifies a registered channel exactly once; dry-run sends nothing; a live lock reports the BACKUP_IN_PROGRESS code.
…34 #35 #36)

Telegram bot tokens (URL path), Gotify tokens (URL query, URL-encoded in the error), and webhook secret URLs leaked verbatim into %w-wrapped transport errors and Warning log lines, which are shipped off-host (secondary/cloud log dispatch, --support bundle).

Add a redaction layer in internal/logging: MaskSecret (fixed asterisk prefix + last 4 chars; full mask for short/empty) and RedactSecrets (replaces a secret in both raw and URL-encoded form, covering the Gotify case).

- Source redaction: telegram/gotify/webhook flatten the transport error through RedactSecrets(err, secret) before wrap/log, so the propagated error (not just the log line) is clean and cannot reappear in a notification body.

- Central net: the Logger scrubs every line against registered secret values at the single logWithLabel chokepoint; backup_notifications registers the notifier tokens/URLs (the public Cloudflare relay token/HMAC are intentionally not registered). This also covers response-body echoes and the --support log.

Masking style is asterisk-prefix + last-4 suffix. Tests: MaskSecret/RedactSecrets (incl. URL-encoded form), logger central scrub, empty/short no-op, and a Gotify transport-error test asserting neither the raw nor URL-encoded token leaks.
#27)

In a selective (category-scoped) restore, a hardlink entry must not alias a
target outside the selected categories. A genuine proxsave archive never
contains hardlink entries (the archiver does no inode dedup, so
tar.FileInfoHeader only ever emits regular files and symlinks), so this guard
only triggers on a tampered or foreign archive, where a cross-category hardlink
(e.g. an in-category name aliasing /etc/shadow) is never legitimate. Symlinks
are intentionally left unconstrained: within-root traversal through a symlinked
parent is supported and covered by existing tests.
…hem as success (H04)

addToTar logged a Warning and continued (return nil) on every per-file failure,
and VerifyArchive only checked size/compression/`tar -t` listing, never
source-vs-archive. A backup that silently omitted or truncated files therefore
passed verification with exit 0.

Two structural, category-agnostic guards on the archive stage:

(a) addToTar records each source entry it cannot add (walk access error,
    Lstat/Readlink/FileInfoHeader failure, io.Copy failure including a file that
    grew mid-copy -> ErrWriteTooLong: a full-Size header but truncated body that
    `tar -t` never flags). CreateArchive then returns ErrArchiveIncomplete
    (-> ExitArchiveError) when any entry was skipped, after still walking the
    rest so the run reports every miss. An open failure on a sized file is left
    to the tar writer's hard "missed writing" failure; a 0-byte file has no body
    to lose.

(b) VerifyArchive reconciles the entry count: addToTar counts the headers it
    wrote and the verifier counts the `tar -t` listing lines. A listing SHORTER
    than written means entries were lost after write (on-disk corruption or
    truncation that integrity checks miss) -> ErrArchiveEntryCountMismatch
    (-> ExitVerificationError). A longer listing is tolerated (busybox tar
    splits member names on embedded newlines), and reconciliation is skipped for
    archives this instance did not create and for encrypted archives (no
    plaintext listing).

Per-run state lives on the Archiver, reset at CreateArchive start; it is written
only by the single walk goroutine and read after it joins (errChan
happens-before), so no locking is needed. Scope is the archiver stage only;
collector-stage silent skips (FilesFailed not driving the exit code) remain as
the separate M31/#58.

Tests: socket entry -> fail-closed with the rest still archived; complete
archive -> no error and exact entry count; verify mismatch (shorter listing)
fails while a longer listing is tolerated; foreign-archive verify skips
reconciliation. build + vet + go test + orchestrator -race green.
The SAFE_BRACKET_PROCESSES / SAFE_KERNEL_PROCESSES lists match the whole
bracketed name with plain-entry == exact (not prefix), and SAFE_PROCESSES
does not apply to the "Suspicious kernel-style process" warning at all.
The template and CONFIGURATION.md conflated these with the per-token
suspicious-process scan, so users tried to silence a bracketed LXC worker
(e.g. celery) via SAFE_PROCESSES or a plain/anchored name and it never
matched (discussion #236).

Document both detectors separately: what string each matches, the
exact-vs-prefix-vs-regex semantics, and that bracketed processes are
allowlisted via SAFE_BRACKET_PROCESSES/SAFE_KERNEL_PROCESSES (with a
celery example). Docs/template only; no behavior change.
Bumps the minor-updates group with 3 updates: [golang.org/x/crypto](https://github.com/golang/crypto), [golang.org/x/term](https://github.com/golang/term) and [golang.org/x/text](https://github.com/golang/text).


Updates `golang.org/x/crypto` from 0.52.0 to 0.53.0
- [Commits](golang/crypto@v0.52.0...v0.53.0)

Updates `golang.org/x/term` from 0.43.0 to 0.44.0
- [Commits](golang/term@v0.43.0...v0.44.0)

Updates `golang.org/x/text` from 0.37.0 to 0.38.0
- [Release notes](https://github.com/golang/text/releases)
- [Commits](golang/text@v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.53.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-updates
- dependency-name: golang.org/x/term
  dependency-version: 0.44.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-updates
- dependency-name: golang.org/x/text
  dependency-version: 0.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-updates
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Replace the manual-tag + destructive sync-dev flow with the governed
pr-vX.Y.Z -> release PR -> squash-merge -> create vX.Y.Z once -> GoReleaser
release -> dev-sync pipeline (matching cPSM and addhon). A version tag is now
CREATED exactly once on the squash commit and never moved or deleted, so it
stays compatible with a tag-immutability ruleset.

- add release-intake.yml: pr-v* trigger opens the dev->main release PR; sources
  the policy from the trusted base (origin/main); fail-closed tag/release probes;
  deletes the unprotected pr-v* trigger tag.
- add release-guard.yml: enforces PR shape/provenance and a well-formed release
  marker; bootstrap-allows when the base has no policy yet (required check).
- add post-merge-release.yml: squash + content checks, fast-forward sync of dev
  (replaces sync-dev's destructive reset), and creates the v* tag once on the
  squash commit (a plain push = creation) which re-triggers release.yml.
- add .github/scripts/release-policy.sh: shared helpers incl. present/absent/error
  existence probes (no in-repo version file; version comes from the tag).
- release.yml: replace the fragile base_ref gate with a tag-on-main gate job and
  add a concurrency group; GoReleaser, ECDSA signing, and attestation unchanged.
- remove sync-dev.yml (superseded) and autotag.yml (dead, if:false).
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 4 package(s) with unknown licenses.
See the Details below.

License Issues

go.mod

PackageVersionLicenseIssue Type
golang.org/x/crypto0.53.0NullUnknown License
golang.org/x/sys0.46.0NullUnknown License
golang.org/x/term0.44.0NullUnknown License
golang.org/x/text0.38.0NullUnknown License
Allowed Licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, LicenseRef-scancode-google-patent-license-golang

OpenSSF Scorecard

PackageVersionScoreDetails
gomod/golang.org/x/crypto 0.53.0 UnknownUnknown
gomod/golang.org/x/sys 0.46.0 UnknownUnknown
gomod/golang.org/x/term 0.44.0 UnknownUnknown
gomod/golang.org/x/text 0.38.0 UnknownUnknown

Scanned Files

  • .github/workflows/autotag.yml
  • .github/workflows/sync-dev.yml
  • go.mod

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a structured release pipeline, passphrase-salt encryption support, secret redaction, archive completeness checks, secure temp-root handling, collector and restore changes, PBS strict-mode propagation, and backup orchestration fixes.

Changes

CI/Release Pipeline Overhaul

Layer / File(s) Summary
Release policy helpers
.github/scripts/release-policy.sh
Adds SemVer tag validation, PR-tag mapping, prerelease detection, commit reachability checks, PR body marker extraction, remote tag/release probing, and fail-closed assertions.
Release intake and merge workflows
.github/workflows/release-intake.yml, .github/workflows/release-guard.yml, .github/workflows/post-merge-release.yml, .github/workflows/release.yml, .github/workflows/autotag.yml, .github/workflows/sync-dev.yml
Adds the tag-triggered release intake workflow, the PR release guard, the post-merge dev-sync/tag finalizer, and the release workflow gate with tag-level concurrency; removes the disabled autotag and sync-dev workflows.

Go Core Feature Changes

Layer / File(s) Summary
Secret redaction and notifier scrubbing
internal/logging/redact.go, internal/logging/redact_test.go, internal/logging/logger.go, internal/notify/gotify.go, internal/notify/telegram.go, internal/notify/webhook.go, internal/notify/redact_notify_test.go, cmd/proxsave/backup_notifications.go
Adds secret masking helpers, logger secret registration and scrubbing, and redaction of transport errors in notification senders and initialization.
Passphrase salt derivation and flow
internal/config/config.go, internal/config/config_test.go, internal/config/templates/backup.env, internal/orchestrator/passphrase_salt.go, internal/orchestrator/passphrase_salt_test.go, internal/orchestrator/encryption.go, internal/backup/checksum.go, internal/orchestrator/backup_run_helpers.go, internal/orchestrator/age_setup_workflow.go, internal/orchestrator/decrypt.go, internal/orchestrator/decrypt_prepare_common.go, internal/orchestrator/decrypt_workflow_ui.go, docs/ENCRYPTION.md, docs/TROUBLESHOOTING.md
Adds per-installation passphrase salt storage, manifest embedding, passphrase identity derivation with extra salts, and the related encrypt/decrypt workflow wiring and tests.
Archive completeness auditing
internal/backup/archiver.go, internal/backup/archiver_completeness_audited_test.go, internal/backup/archiver_verification_test.go
Adds incomplete-archive and entry-count mismatch errors, per-run skipped-entry accounting, tar listing line counting, reconciliation on verification paths, and matching completeness tests.
Secure temp roots and orphan cleanup
internal/orchestrator/temp_registry.go, internal/orchestrator/temp_registry_test.go, internal/orchestrator/backup_run_phases.go, internal/orchestrator/backup_run_phases_test.go, internal/orchestrator/decrypt.go, internal/orchestrator/decrypt_prepare_common.go, internal/checks/checks.go, internal/checks/checks_test.go, internal/orchestrator/additional_helpers_test.go
Adds secure temp-root validation, trusted workspace cleanup rules, symlink rejection in temp-directory checks, and updates backup/decrypt workspace setup and cleanup paths.
PVE/PBS credential exclusions, system manifest, and collector wiring
internal/backup/collector_pve.go, internal/backup/collector_pve_test.go, internal/backup/collector_pbs.go, internal/backup/collector_pbs_datastore.go, internal/backup/collector_pbs_test.go, internal/backup/collector.go, internal/backup/collector_manifest.go, internal/backup/collector_system.go, internal/backup/collector_system_test.go, internal/config/config.go, internal/config/templates/backup.env, internal/orchestrator/orchestrator.go, internal/orchestrator/orchestrator_test.go
Adds PVE/PBS exclusion patterns, system manifest recording, staging-workspace pruning, script-repository filtering, SystemRootPrefix wiring, and the default change for BackupScriptRepository.
Optimization dedup manifest and restore materialization
internal/backup/optimizations.go, internal/backup/optimizations_test.go, internal/backup/optimizations_bench_test.go, internal/backup/optimizations_structured_test.go, internal/orchestrator/backup_run_helpers.go, internal/orchestrator/backup_run_phases.go, internal/orchestrator/restore_archive_extract.go, internal/orchestrator/restore_dedup_materialize_test.go
Adds dedup manifest types and result reporting, changes optimization failure behavior, records reclaimed bytes, rewrites JSON minification, and restores deduplicated symlinks from the archive on extract.
System accounts restore category
internal/orchestrator/restore_accounts.go, internal/orchestrator/restore_accounts_test.go, internal/orchestrator/staging.go, internal/orchestrator/categories.go, internal/orchestrator/categories_coverage_test.go, internal/orchestrator/categories_test.go, internal/orchestrator/restore_workflow_ui_extract.go
Adds staged restoration of passwd, group, shadow, gshadow, and sudoers with host-safe merge rules, category registration, staged-apply wiring, and the supporting tests and category coverage updates.
PBS strict-mode removal propagation
internal/orchestrator/pbs_api_apply.go, internal/orchestrator/pbs_api_apply_test.go, internal/orchestrator/pbs_staged_apply.go, internal/orchestrator/pbs_staged_apply_maybeapply_test.go
Adds the strict-remove incomplete sentinel, returns it from PBS API apply helpers on removal failures, and makes staged apply skip file fallback with labeled failures.
Backup orchestration fixes
internal/orchestrator/orchestrator.go, cmd/proxsave/backup_execution.go, cmd/proxsave/backup_storage.go, internal/orchestrator/extensions.go, internal/orchestrator/early_error_notification_test.go, internal/orchestrator/restore_firewall.go, internal/orchestrator/restore_firewall_additional_test.go, go.mod, docs/CONFIGURATION.md
Adds the backup-in-progress sentinel, propagates benign skip handling through pre-checks, adjusts early-error notification dispatch, changes storage-init exit codes, improves firewall restart messaging, and wires the system-root-prefix config through collector overrides.

Sequence Diagram(s)

sequenceDiagram
  participant Setup as age setup workflow
  participant Salt as passphrase_salt.go
  participant Enc as encryption.go
  participant Manifest as checksum.go
  participant Restore as decrypt_workflow_ui.go
  participant Archive as restore_archive_extract.go

  Setup->>Salt: getOrCreatePassphraseSalt(recipientPath)
  Salt-->>Setup: persisted per-install salt
  Setup->>Enc: deriveDeterministicRecipientFromPassphraseWithSalt(passphrase, salt)
  Manifest-->>Restore: PassphraseSalt from archive manifest
  Restore->>Enc: parseIdentityInputWithSalts(secret, manifest salts)
  Enc-->>Restore: identities ordered with manifest salt first
  Restore->>Archive: materializeDedupSymlinks(destRoot, archivePath)
Loading
sequenceDiagram
  participant Run as runConfiguredBackup
  participant Checks as RunPreBackupChecks
  participant Lock as CheckLockFile
  participant Notify as DispatchEarlyErrorNotification
  participant UI as restore_firewall.go

  Run->>Checks: runPreBackupChecks()
  Checks->>Lock: CheckLockFile()
  Lock-->>Checks: CheckCodeBackupInProgress
  Checks-->>Run: skip=true
  Run-->>Run: exit success without notification

  Notify->>Notify: export Prometheus fail metrics
  Notify->>Notify: return early on dryRun
  UI->>UI: prepend restart failure warning to commit message
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • tis24dev/proxsave#204: Modifies the same orchestrator restore and workspace code paths that this PR also changes with secure temp roots and restore flow updates.
  • tis24dev/proxsave#212: Both PRs change DispatchEarlyErrorNotification in internal/orchestrator/extensions.go.
  • tis24dev/proxsave#216: Both PRs touch notification dispatch flow and the orchestrator test scaffolding around notifier channels.

Poem

🐇 I hop through tags and salts with glee,
No secrets leak in logs I see.
My archive counts each tarred-up beam,
And temp roots stay secure and clean.
With accounts restored and dedup sprites,
This rabbit naps on release nights.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.98% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Release v0.25.0' directly and clearly describes the primary purpose of the pull request—cutting a new software release version. It is concise, specific, and accurately reflects the main change.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread internal/orchestrator/passphrase_salt.go Fixed
Comment thread internal/orchestrator/temp_registry.go Fixed
Comment thread internal/orchestrator/temp_registry.go Fixed

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (3)
.github/workflows/post-merge-release.yml (1)

18-20: 💤 Low value

Token requirement without fallback is intentional but should be documented.

RELEASE_BOT_TOKEN has no fallback here (unlike other workflows using || github.token). This is correct because the PAT is required to re-trigger release.yml when pushing the tag, but if the secret is missing, the job will fail opaquely. Consider adding a validation step or clearer error message.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/post-merge-release.yml around lines 18 - 20, Add a
validation step early in the post-merge-release workflow that explicitly checks
whether RELEASE_BOT_TOKEN (referenced in the env section) is set and provides a
clear error message if the required secret is missing. Alternatively, add an
inline comment in the env section documenting why the fallback pattern (using ||
github.token) is intentionally omitted and that the RELEASE_BOT_TOKEN secret
must be configured for the workflow to re-trigger release.yml successfully.
internal/orchestrator/restore_accounts_test.go (1)

323-339: ⚡ Quick win

Add fail-closed baseline tests for /etc/group and /etc/shadow.

There’s good coverage for empty /etc/passwd, but not for empty/missing /etc/group and /etc/shadow. Adding those cases will lock the same anti-corruption behavior across the full account DB set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_accounts_test.go` around lines 323 - 339, Add
two new test functions similar to TestApplyAccountsSkipsWhenCurrentPasswdEmpty
to cover the same anti-lockout behavior for `/etc/group` and `/etc/shadow`
files. Create TestApplyAccountsSkipsWhenCurrentGroupEmpty that writes an empty
`/etc/group` baseline file and staged group data, then verifies
applyAccountsFromStage does not modify the current group file. Create
TestApplyAccountsSkipsWhenCurrentShadowEmpty that does the same for
`/etc/shadow`. Each test should follow the same pattern: set up FakeFS, write
empty current files, write staged files with content, call
applyAccountsFromStage, and assert that the current files remain unchanged using
readFake to validate the anti-corruption behavior is consistent across all
account database files.
internal/checks/checks.go (1)

640-649: 💤 Low value

Consider handling osLstat errors explicitly.

If osLstat fails (not due to os.IsNotExist), the code silently continues to subsequent checks. While rare, an Lstat failure after a successful Stat could indicate a TOCTOU race or filesystem issue. Consider logging or returning an error when lerr != nil && !os.IsNotExist(lerr).

Suggested enhancement
 	// osStat follows symlinks, so a pre-created /tmp/proxsave pointing at an
 	// attacker-controlled directory would otherwise pass. Reject a symlinked root
 	// (issue `#54`).
-	if linfo, lerr := osLstat(tempRoot); lerr == nil && linfo.Mode()&os.ModeSymlink != 0 {
+	linfo, lerr := osLstat(tempRoot)
+	if lerr != nil && !os.IsNotExist(lerr) {
+		result.Code = "LSTAT_FAILED"
+		result.Error = fmt.Errorf("temp root lstat failed - path: %s: %w", tempRoot, lerr)
+		result.Message = result.Error.Error()
+		return result
+	}
+	if lerr == nil && linfo.Mode()&os.ModeSymlink != 0 {
 		result.Code = "SYMLINK_REJECTED"
 		result.Error = fmt.Errorf("temp path is a symlink - path: %s", tempRoot)
 		result.Message = result.Error.Error()
 		return result
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/checks/checks.go` around lines 640 - 649, The symlink check block
using osLstat is silently ignoring errors that are not "file not found" errors.
When osLstat returns a non-nil error that is not os.IsNotExist, the code should
explicitly handle this case rather than continuing silently. Add a check after
the osLstat call to detect when lerr != nil && !os.IsNotExist(lerr), and return
an error result in that scenario to surface potential TOCTOU race conditions or
filesystem issues instead of allowing the validation to proceed unchecked.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/backup/archiver_completeness_audited_test.go`:
- Around line 50-53: The net.Listen call for creating a unix socket is not using
a context-aware approach and triggers the noctx linter rule. Replace the
net.Listen("unix", sockPath) call with
net.ListenConfig{}.Listen(context.Background(), "unix", sockPath) to provide the
required context parameter while maintaining the same functionality for the test
socket setup.

In `@internal/backup/collector.go`:
- Around line 990-995: The `safeCopyDir` function records the manifest entry
with StatusCollected status before the directory copy and walk operations
complete. If ensureDir or the subsequent walk operations fail, the manifest
entry incorrectly remains as StatusCollected, misreporting the collection
outcome. Move the `c.recordSystemManifestEntry` call that records the directory
with StatusCollected to after the directory operations have completed
successfully, or add error handling to update the manifest entry status to
reflect any failures that occur during the walk. This fix should be applied
consistently across all locations mentioned (lines 990-995, 997-999, and
1048-1051).

In `@internal/checks/checks_test.go`:
- Around line 117-149: The test TestCheckLockFile_InProgressSetsCode only covers
the path where a lock file already exists when CheckLockFile is called, but it
does not test the atomic create-race condition in internal/checks/checks.go
where os.O_EXCL fails with os.IsExist(err). This race branch can return a failed
result without CheckCodeBackupInProgress, triggering unwanted failure
notifications. Add a regression test case that simulates the race condition by
having another process or goroutine create the lock file between when
CheckLockFile checks for existence and when it attempts to create the file with
os.O_EXCL, then verify the result has Code set to CheckCodeBackupInProgress.
Also ensure the os.IsExist(err) error handling branch in checks.go sets
CheckCodeBackupInProgress on the result to match the benign-skip contract.

In `@internal/orchestrator/pbs_staged_apply_maybeapply_test.go`:
- Around line 447-448: The test uses a hardcoded `/tmp` path in the
datastore.cfg configuration, which can be filtered by datastore safety checks,
making the fallback-bypass assertion non-deterministic. Replace the `/tmp` path
with a safe temporary directory path obtained from the test's temp directory
(such as using t.TempDir() or fakeFS's configured temp location) so that the
fallback mechanism would definitely write the file when invoked. This change
applies to both occurrences of WriteFile calls for the datastore.cfg file in the
test.

In `@internal/orchestrator/restore_accounts.go`:
- Around line 135-146: The sequential writeFileAtomic calls for etcPasswdPath,
etcShadowPath, etcGroupPath, and etcGshadowPath do not provide atomic semantics
as a group, leaving the system in an inconsistent state if any write fails after
earlier ones succeed. Implement an all-or-nothing approach by writing all four
files to temporary locations first, then atomically move or swap them into place
only after all temporary writes have been verified successful, ensuring either
all four files are updated or none are.
- Around line 276-283: Before calling the upsert function for brand-new backup
groups, sanitize the group line to restrict its member list to only those users
present in the importedUsers collection. Modify the line variable to filter out
any existing host accounts from the members before passing it to the upsert
call, ensuring that importing new supplementary groups does not inadvertently
add existing host accounts to those groups.
- Around line 94-118: The readCurrentAccountFile function returns an empty
string for missing files, but the code proceeds to merge and rewrite auth
database files even when critical baseline files are missing. This can create
unsafe partial baselines missing host system entries. Add validation checks
after each readCurrentAccountFile call for currentGroup, currentShadow, and
currentGshadow to ensure they are not empty, and return an error if any are
empty (similar to how /etc/passwd is already protected). These checks should
occur before any merge or rewrite operations to fail closed when required
baseline files are missing.

In `@internal/orchestrator/restore_archive_extract.go`:
- Around line 75-80: The materializeDedupSymlinks function call at line 79 is
fire-and-forget and does not check for failures, even though the function
detects cancellation and incomplete states internally. Modify the
materializeDedupSymlinks call to capture its return value and check for errors
or partial failure indicators, then ensure these failures are properly
propagated to the restore result (likely by updating stats.filesFailed or
returning an error) so that failOnPartialExtraction at line 85 can detect and
handle the incomplete dedup reconstruction.

---

Nitpick comments:
In @.github/workflows/post-merge-release.yml:
- Around line 18-20: Add a validation step early in the post-merge-release
workflow that explicitly checks whether RELEASE_BOT_TOKEN (referenced in the env
section) is set and provides a clear error message if the required secret is
missing. Alternatively, add an inline comment in the env section documenting why
the fallback pattern (using || github.token) is intentionally omitted and that
the RELEASE_BOT_TOKEN secret must be configured for the workflow to re-trigger
release.yml successfully.

In `@internal/checks/checks.go`:
- Around line 640-649: The symlink check block using osLstat is silently
ignoring errors that are not "file not found" errors. When osLstat returns a
non-nil error that is not os.IsNotExist, the code should explicitly handle this
case rather than continuing silently. Add a check after the osLstat call to
detect when lerr != nil && !os.IsNotExist(lerr), and return an error result in
that scenario to surface potential TOCTOU race conditions or filesystem issues
instead of allowing the validation to proceed unchecked.

In `@internal/orchestrator/restore_accounts_test.go`:
- Around line 323-339: Add two new test functions similar to
TestApplyAccountsSkipsWhenCurrentPasswdEmpty to cover the same anti-lockout
behavior for `/etc/group` and `/etc/shadow` files. Create
TestApplyAccountsSkipsWhenCurrentGroupEmpty that writes an empty `/etc/group`
baseline file and staged group data, then verifies applyAccountsFromStage does
not modify the current group file. Create
TestApplyAccountsSkipsWhenCurrentShadowEmpty that does the same for
`/etc/shadow`. Each test should follow the same pattern: set up FakeFS, write
empty current files, write staged files with content, call
applyAccountsFromStage, and assert that the current files remain unchanged using
readFake to validate the anti-corruption behavior is consistent across all
account database files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 98d60f19-741d-44b1-bc80-59f0f6ab110a

📥 Commits

Reviewing files that changed from the base of the PR and between 59eb7b0 and 5628ac9.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (76)
  • .github/scripts/release-policy.sh
  • .github/workflows/autotag.yml
  • .github/workflows/post-merge-release.yml
  • .github/workflows/release-guard.yml
  • .github/workflows/release-intake.yml
  • .github/workflows/release.yml
  • .github/workflows/sync-dev.yml
  • cmd/proxsave/backup_execution.go
  • cmd/proxsave/backup_mode.go
  • cmd/proxsave/backup_notifications.go
  • cmd/proxsave/backup_storage.go
  • docs/CONFIGURATION.md
  • docs/ENCRYPTION.md
  • docs/TROUBLESHOOTING.md
  • go.mod
  • internal/backup/archiver.go
  • internal/backup/archiver_completeness_audited_test.go
  • internal/backup/archiver_verification_test.go
  • internal/backup/checksum.go
  • internal/backup/collector.go
  • internal/backup/collector_manifest.go
  • internal/backup/collector_pbs.go
  • internal/backup/collector_pbs_datastore.go
  • internal/backup/collector_pbs_test.go
  • internal/backup/collector_pve.go
  • internal/backup/collector_pve_test.go
  • internal/backup/collector_system.go
  • internal/backup/collector_system_test.go
  • internal/backup/optimizations.go
  • internal/backup/optimizations_bench_test.go
  • internal/backup/optimizations_structured_test.go
  • internal/backup/optimizations_test.go
  • internal/checks/checks.go
  • internal/checks/checks_test.go
  • internal/config/config.go
  • internal/config/config_test.go
  • internal/config/templates/backup.env
  • internal/logging/logger.go
  • internal/logging/redact.go
  • internal/logging/redact_test.go
  • internal/notify/gotify.go
  • internal/notify/redact_notify_test.go
  • internal/notify/telegram.go
  • internal/notify/webhook.go
  • internal/orchestrator/additional_helpers_test.go
  • internal/orchestrator/age_setup_workflow.go
  • internal/orchestrator/backup_run_helpers.go
  • internal/orchestrator/backup_run_phases.go
  • internal/orchestrator/backup_run_phases_test.go
  • internal/orchestrator/categories.go
  • internal/orchestrator/categories_coverage_test.go
  • internal/orchestrator/categories_test.go
  • internal/orchestrator/decrypt.go
  • internal/orchestrator/decrypt_prepare_common.go
  • internal/orchestrator/decrypt_workflow_ui.go
  • internal/orchestrator/early_error_notification_test.go
  • internal/orchestrator/encryption.go
  • internal/orchestrator/extensions.go
  • internal/orchestrator/orchestrator.go
  • internal/orchestrator/orchestrator_test.go
  • internal/orchestrator/passphrase_salt.go
  • internal/orchestrator/passphrase_salt_test.go
  • internal/orchestrator/pbs_api_apply.go
  • internal/orchestrator/pbs_api_apply_test.go
  • internal/orchestrator/pbs_staged_apply.go
  • internal/orchestrator/pbs_staged_apply_maybeapply_test.go
  • internal/orchestrator/restore_accounts.go
  • internal/orchestrator/restore_accounts_test.go
  • internal/orchestrator/restore_archive_extract.go
  • internal/orchestrator/restore_dedup_materialize_test.go
  • internal/orchestrator/restore_firewall.go
  • internal/orchestrator/restore_firewall_additional_test.go
  • internal/orchestrator/restore_workflow_ui_extract.go
  • internal/orchestrator/staging.go
  • internal/orchestrator/temp_registry.go
  • internal/orchestrator/temp_registry_test.go
💤 Files with no reviewable changes (2)
  • .github/workflows/sync-dev.yml
  • .github/workflows/autotag.yml

Comment thread internal/backup/archiver_completeness_audited_test.go Outdated
Comment thread internal/backup/collector.go Outdated
Comment thread internal/checks/checks_test.go
Comment thread internal/orchestrator/pbs_staged_apply_maybeapply_test.go Outdated
Comment thread internal/orchestrator/restore_accounts.go
Comment thread internal/orchestrator/restore_accounts.go Outdated
Comment thread internal/orchestrator/restore_accounts.go
Comment thread internal/orchestrator/restore_archive_extract.go
Comment thread internal/orchestrator/restore_accounts.go
Comment thread internal/orchestrator/restore_accounts.go
tis24dev added 4 commits June 20, 2026 18:17
…ives

gosec G101 flagged consts/vars whose NAMES accidentally matched its credential
heuristic (pattern includes pw|pass|token|secret|cred). None hold credentials:
- tempWorkspaceRoot / tempWorkspaceMarker contain "pw" from the camelCase join
  temp+Workspace (values "/tmp/proxsave" and ".proxsave-marker").
- passphraseRandomSaltPrefix / passphraseRecipientSalt / legacyPassphraseRecipientSalt
  contain "pass" from "passphrase" (public domain-separation salt namespaces).

Rename the identifiers so the names no longer trip the heuristic, instead of
suppressing with #nosec. String VALUES are byte-for-byte unchanged, so the AGE
passphrase-derived key salts still decrypt existing archives (back-compat):
- tempWorkspaceRoot          -> workspaceRoot
- tempWorkspaceMarker        -> workspaceMarker
- passphraseRandomSaltPrefix -> randomSaltNamespaceV2
- passphraseRecipientSalt    -> recipientSaltV1
- legacyPassphraseRecipientSalt -> legacyRecipientSalt

The restore_access_control.go token-path consts (pveTokenCfgPath, pbsTokenCfgPath,
pbsTokenShadowPath) are intentionally NOT renamed: their names correctly mirror
real Proxmox files (token.cfg, token.shadow); renaming would misrepresent them and
#nosec is disallowed, so they stay as documented false positives.

Clears all 3 G101 alerts surfaced on the release PR (plus 2 pre-existing v1
siblings). Verified: go build/vet, orchestrator tests (-race) green, gosec G101
re-run drops 8 -> 3 (only the token-path FPs remain), no new gosec/staticcheck/vet
findings.
…0 PR

Five structural fixes (no #nosec/behavior suppressions), each with an anchored
regression test, from the CodeRabbit review of release PR #237:

- checks: CheckLockFile sets the BACKUP_IN_PROGRESS code on the lost atomic
  O_EXCL create race (os.IsExist branch), matching the stat-found-a-live-lock
  path, so a concurrent backup is a benign skip instead of a spurious failure
  notification.
- orchestrator/accounts: applyAccountsFromStage skips (anti-lockout) when the
  host /etc/group or /etc/shadow baseline is empty/unreadable, mirroring the
  existing empty-/etc/passwd guard, so it never rewrites the auth DB onto a
  missing baseline.
- orchestrator/accounts: mergeGroup restricts a brand-new backup group's members
  to imported users, so importing a new group never silently enrolls an existing
  host account (mirrors the existing-host-group member filtering).
- backup/collector: safeCopyDir records the system manifest status from the
  actual outcome (failed on error, collected on success) instead of recording
  collected before the copy can fail.
- orchestrator/restore: materializeDedupSymlinks returns an error on an
  incomplete reconstruction; the staged restore (failOnPartialExtraction) now
  fails closed instead of applying a partially reconstructed dedup tree.

Intentionally not changed: the restore_access_control.go token-path consts
(gosec G101 FPs whose names mirror real Proxmox files) and the heavier
all-or-nothing atomic account-DB write (CodeRabbit #6, deferred).

Verified: go build/vet, full -race tests for the three packages, and an
adversarial refuter pool (each new test proven to fail without its fix).
…237

- archiver: use a context-aware unix listener (net.ListenConfig{}.Listen with
  context.Background()) in the newTestSocket helper to satisfy the noctx lint;
  behaviorally identical (same socket inode, same cleanup).
- pbs staged apply: in TestMaybeApplyPBSConfigsFromStage_CleanRemoveIncomplete_
  SurfacesWithoutFallback, point the staged datastore.cfg at an empty t.TempDir()
  instead of /tmp. /tmp is non-empty so shouldApplyPBSDatastoreBlock DEFERS it
  (the file fallback would write nothing even if wrongly invoked), making the
  "live file must not be written" assertion vacuous; an empty temp dir is APPLIED,
  so the assertion now genuinely guards against a fallback-bypass regression.
  Matches the existing tests that already use safeDir := t.TempDir().

Test-only. Verified: go vet, -race tests for both packages, and an adversarial
refuter (regression injected -> the strengthened test catches it; the old /tmp
value passed falsely).
…bit #6)

applyAccountsFromStage wrote passwd/shadow/group/gshadow with four sequential
writeFileAtomic calls; each was atomic per-file but the SET was not, so a failure
partway through could leave the host with an inconsistent auth DB (passwd updated,
shadow stale) and risk a lockout.

Decompose writeFileAtomic into reusable prepareAtomicTempFile (phase 1: write the
temp with final owner/perms) + commitAtomicTempFile (phase 2: rename + dir fsync),
keeping writeFileAtomic's external behavior identical for all existing callers
(syncDir is hoisted to a package func). Add writeFilesAtomic, which prepares ALL
temps first (so the common disk-full/read-only/IO failure leaves every live file
untouched), then renames them in order, rolling already-committed files back to
their in-memory originals if a rename or dir fsync fails; a rollback that also
fails surfaces a CRITICAL manual-recovery error. The residual non-atomic window
(crash between renames) is documented as out of scope.

restore_accounts.go now writes the four files via writeFilesAtomic using the
current* contents as the rollback source; sudoers stays separate (visudo-validated).

Tests (each anchored, proven to fail against the old sequential writes):
prepare-phase failure leaves all four files unchanged; a mid-commit rename failure
rolls the committed files back to their originals; committed==true (post-rename
dir-fsync) rollback; and the double-failure CRITICAL path. FakeFS gains a
test-only RenameErr injector. Verified with go build/vet, gosec (no new finding),
full -race tests, and a three-lens adversarial refuter pool.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/orchestrator/restore_accounts_test.go (1)

522-566: 💤 Low value

Clarify temp cleanup assertion loop scope.

The comment at line 560 says "passwd, shadow" were prepared before the failure (since group creation fails at index 2), but the loop at line 561-565 also checks gshadow. While this still passes (gshadow temp was never created, so IsNotExist is true), explicitly checking only the two temps that were actually created would make the test intent clearer and avoid coupling to implementation ordering assumptions.

📝 Suggested clarification
-	// The temps prepared before the failure (passwd, shadow) must have been cleaned up.
-	for _, p := range []string{etcPasswdPath, etcShadowPath, etcGshadowPath} {
+	// The temps prepared before the failure (passwd, shadow) must have been cleaned up;
+	// gshadow was never started because group failed first.
+	for _, p := range []string{etcPasswdPath, etcShadowPath} {
 		if _, err := fakeFS.Stat(accountTempPath(t, p)); !os.IsNotExist(err) {
 			t.Errorf("temp for %s should not remain after prepare-phase failure (stat err=%v)", p, err)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/restore_accounts_test.go` around lines 522 - 566, The
loop that checks for cleaned-up temp files around line 561 currently includes
etcGshadowPath in the slice of paths to verify, but according to the comment,
only passwd and shadow temps were actually created before the failure occurs at
group. Update the slice in the for loop to contain only etcPasswdPath and
etcShadowPath to accurately reflect which temp files were prepared before the
failure, making the test intent clearer and removing assumptions about
implementation ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/orchestrator/restore_accounts_test.go`:
- Around line 522-566: The loop that checks for cleaned-up temp files around
line 561 currently includes etcGshadowPath in the slice of paths to verify, but
according to the comment, only passwd and shadow temps were actually created
before the failure occurs at group. Update the slice in the for loop to contain
only etcPasswdPath and etcShadowPath to accurately reflect which temp files were
prepared before the failure, making the test intent clearer and removing
assumptions about implementation ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31708f26-6894-4d76-a505-14b1ba95a666

📥 Commits

Reviewing files that changed from the base of the PR and between 5628ac9 and f407fe5.

📒 Files selected for processing (22)
  • internal/backup/archiver_completeness_audited_test.go
  • internal/backup/collector.go
  • internal/backup/collector_system_test.go
  • internal/checks/checks.go
  • internal/checks/checks_test.go
  • internal/orchestrator/additional_helpers_test.go
  • internal/orchestrator/backup_run_phases.go
  • internal/orchestrator/decrypt.go
  • internal/orchestrator/decrypt_prepare_common.go
  • internal/orchestrator/deps_test.go
  • internal/orchestrator/encryption.go
  • internal/orchestrator/fs_atomic.go
  • internal/orchestrator/fs_atomic_test.go
  • internal/orchestrator/passphrase_salt.go
  • internal/orchestrator/passphrase_salt_test.go
  • internal/orchestrator/pbs_staged_apply_maybeapply_test.go
  • internal/orchestrator/restore_accounts.go
  • internal/orchestrator/restore_accounts_test.go
  • internal/orchestrator/restore_archive_extract.go
  • internal/orchestrator/restore_dedup_materialize_test.go
  • internal/orchestrator/temp_registry.go
  • internal/orchestrator/temp_registry_test.go
🚧 Files skipped from review as they are similar to previous changes (13)
  • internal/orchestrator/decrypt_prepare_common.go
  • internal/orchestrator/temp_registry.go
  • internal/orchestrator/pbs_staged_apply_maybeapply_test.go
  • internal/orchestrator/passphrase_salt_test.go
  • internal/orchestrator/passphrase_salt.go
  • internal/orchestrator/backup_run_phases.go
  • internal/orchestrator/temp_registry_test.go
  • internal/orchestrator/decrypt.go
  • internal/checks/checks.go
  • internal/orchestrator/restore_archive_extract.go
  • internal/orchestrator/restore_accounts.go
  • internal/orchestrator/additional_helpers_test.go
  • internal/backup/archiver_completeness_audited_test.go

@tis24dev tis24dev merged commit f837787 into main Jun 20, 2026
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants