GH-3547: Add semi-automated release pipeline for Apache Parquet Java#3548

Open
RussellSpitzer wants to merge 6 commits into apache:master from RussellSpitzer:feature/release-automation

@RussellSpitzer
Member

Includes 85 bats unit tests covering all shared libraries.

Rationale for this change

Adds a release automation framework modeled after Apache Polaris, adapted for Parquet's Maven-based build. Replaces the manual maven-release-plugin workflow with explicit, scriptable steps that support both CI (GitHub Actions) and local execution, with dry-run by default.

What changes are included in this PR?

Scripts (release/bin/):

  • prepare-rc.sh: full pre-vote flow (branch, version, tag, Nexus, SVN, GitHub pre-release, vote email)
  • publish-release.sh: full post-vote flow (SVN promotion, final tag, Nexus release, GitHub release, version bump, announce email)
  • cancel-rc.sh: rollback a failed RC (Nexus drop, SVN cleanup)

Shared libraries (release/libs/):

  • _constants.sh, _log.sh, _exec.sh, _version.sh
  • _github.sh, _nexus.sh, _maven.sh

GitHub Actions workflows:

  • release-prepare-rc.yml, release-publish.yml, release-cancel-rc.yml
  • ci-release-scripts.yml (bats unit tests on PR/push)

Are these changes tested?

Only locally and with fake commands. To actually make this work we also have to raise an Infra ticket to get appropriate secrets applied to the parquet-java repo.

Are there any user-facing changes?

No, just for contributors


Release Workflow

This PR replaces the manual release process (documented in
How to Release
and the existing dev/ scripts: prepare-release.sh, source-release.sh,
finalize-release) with three GitHub Actions workflows backed by locally-runnable
Bash scripts. All workflows default to dry-run mode.
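
As a rough illustration of the dry-run default, a command wrapper could look like the sketch below. The function name `run_cmd` is hypothetical and is not taken from `_exec.sh`; the only detail borrowed from the PR is the `DRY_RUN` variable with values `1` and `0`. The sketch assumes anything other than an explicit `0` means dry run.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the PR's _exec.sh): execute a command only when
# DRY_RUN is explicitly "0". Unset or any other value is treated as a dry run.
run_cmd() {
  if [[ "${DRY_RUN:-1}" == "0" ]]; then
    "$@"
  else
    # Print the command, shell-quoted, instead of running it.
    printf 'DRY-RUN:'
    printf ' %q' "$@"
    printf '\n'
  fi
}
```

Failing safe on an unset variable means a local run that forgets to export `DRY_RUN` only prints what it would have done.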

1. Prepare RC (Pre-Vote)

The release manager launches the "Prepare Release Candidate" workflow
(release-prepare-rc.yml) via workflow_dispatch with:

  • version — release version (e.g. 1.18.0)
  • rc_number (optional) — override RC number (auto-detected if empty)
  • dry_run — defaults to true; set to false for real execution

| # | Step in prepare-rc.sh | Replaces (docs / existing script) |
|---|---|---|
| 0 | Validate inputs | New: checks version format, verifies gpg/svn/mvnw |
| 1 | Create release branch | Manual git branch parquet-X.Y.x |
| 2 | Auto-detect RC number | Manual decision by release manager |
| 3 | Verify CI checks | Manual check of GitHub Actions UI |
| 4 | Set POM versions | dev/prepare-release.sh (mvn release:prepare) |
| 5 | Create RC tag and push | dev/prepare-release.sh (mvn release:prepare) |
| 6 | Deploy to Nexus | Manual mvn release:perform (prompted by prepare-release.sh) |
| 7 | Build source tarball | dev/source-release.sh (git archive, gpg, shasum) |
| 8 | Stage to SVN dist/dev | dev/source-release.sh (svn co, svn add, svn ci) |
| 9 | Create GitHub pre-release | New: creates a GitHub pre-release with auto-generated notes |
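
Step 0 (input validation) might be sketched roughly as follows. Both function names are hypothetical, and the X.Y.Z pattern is an assumption based on the `1.18.0` example above rather than the regex actually used in `_version.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of step 0; names and the exact pattern are assumptions.

# Accept only plain X.Y.Z versions such as 1.18.0.
validate_version() {
  local version="$1"
  local pattern='^[0-9]+\.[0-9]+\.[0-9]+$'
  if [[ ! "$version" =~ $pattern ]]; then
    echo "Invalid version '$version' (expected X.Y.Z)" >&2
    return 1
  fi
}

# Report every missing tool before failing, so one run shows all problems.
require_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A caller could then run something like `validate_version "$version" && require_tools gpg svn` before touching any branch or tag.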

The script also generates a [VOTE] email template with all links and
hashes. The release manager copies this and manually sends it to
[email protected].

2. Vote

The release manager waits 72 hours for the community vote and tallies the
results manually.


If the vote fails → Cancel RC

The release manager launches the "Cancel Release Candidate" workflow
(release-cancel-rc.yml) via workflow_dispatch with:

  • version — release version (e.g. 1.18.0)
  • rc_number — the RC number to cancel (e.g. 1)
  • staging_repo_id — Nexus staging repository ID (e.g. orgapacheparquet-1234)
  • dry_run — defaults to true

| # | Step in cancel-rc.sh | Replaces (docs / existing script) |
|---|---|---|
| 1 | Drop Nexus staging repo | Manual Nexus UI operation |
| 2 | Delete SVN artifacts from dist/dev | Manual svn rm |

The script also generates a [RESULT][VOTE] failure email template.
The release manager fills in the failure reason and manually sends it.


If the vote passes → Publish Release

The release manager launches the "Publish Release" workflow
(release-publish.yml) via workflow_dispatch with:

  • version — release version (e.g. 1.18.0)
  • rc_number (optional) — RC that passed the vote (auto-detects latest if empty; rejects older RCs)
  • staging_repo_id — Nexus staging repository ID (e.g. orgapacheparquet-1234)
  • next_dev_version — next development version without -SNAPSHOT (e.g. 1.18.1)
  • dry_run — defaults to true

| # | Step in publish-release.sh | Replaces (docs / existing script) |
|---|---|---|
| 1 | SVN Promotion | Manual svn mv from dist/dev to dist/release |
| 2 | Old Release Cleanup | Manual svn rm of prior versions |
| 3 | Create final release tag | dev/finalize-release (git tag) |
| 4 | Release Nexus staging repo | Manual Nexus UI "Release" button |
| 5 | Create GitHub Release | Manual GitHub UI release creation |
| 6 | Bump to next dev version | dev/finalize-release (mvn release:update-versions, versions:set-property, git commit) |
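
The rc_number rule for publishing (use the latest RC when the input is empty, reject an older one) could be expressed along these lines. This is a minimal sketch with a hypothetical function name. How the real script handles an RC number higher than the latest tag is not described here, so the sketch simply passes it through.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the RC to publish.
#   $1 = RC number requested by the release manager (may be empty)
#   $2 = latest RC number found among existing tags
resolve_rc_number() {
  local requested="$1" latest="$2"
  if [[ -z "$requested" ]]; then
    echo "$latest"
    return 0
  fi
  if [[ ! "$requested" =~ ^[0-9]+$ ]]; then
    echo "RC number must be numeric, got '$requested'" >&2
    return 1
  fi
  # 10# forces base 10 so an input like "08" is not parsed as octal.
  if (( 10#$requested < 10#$latest )); then
    echo "RC $requested is older than latest RC $latest" >&2
    return 1
  fi
  echo "$requested"
}
```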

3. Manual Follow-ups

After the publish workflow completes, the release manager must:

  1. Send the [ANNOUNCE] email (generated by step 7 of publish-release.sh) to [email protected] and [email protected]
  2. Create a PR against apache/parquet-site on the staging branch with the blog post template (generated by step 8 of publish-release.sh)

paths:
- 'release/**'

jobs:
Member Author

We can rip out all of these tests if folks don't find them useful. It's just a bunch of Bash unit tests to make sure that at least our helper functions do what they are supposed to. They use mocked output, though, so they have limited utility for testing the real release path.


steps:
- name: Checkout repository
uses: actions/checkout@v4
Member Author

Parquet isn't using zizmor yet, but in the future we should switch this and all other actions to hard-coded commit SHAs.


- name: Cancel Release Candidate
env:
DRY_RUN: ${{ inputs.dry_run && '1' || '0' }}
Member Author

Anything in "secrets" is automatically redacted from logs by GitHub Actions, so we don't have to worry about any of this stuff being exposed.

Comment thread .github/workflows/release-prepare-rc.yml Outdated
@RussellSpitzer RussellSpitzer force-pushed the feature/release-automation branch from f722749 to 967152b Compare May 6, 2026 21:16
… Java

Adds a release automation framework modeled after Apache Polaris,
adapted for Parquet's Maven-based build. Replaces the manual
maven-release-plugin workflow with explicit, scriptable steps that
support both CI (GitHub Actions) and local execution, with dry-run
by default.

Scripts (release/bin/):
- prepare-rc.sh: full pre-vote flow (branch, version, tag, Nexus,
  SVN, GitHub pre-release, vote email)
- publish-release.sh: full post-vote flow (SVN promotion, final tag,
  Nexus release, GitHub release, version bump, announce email)
- cancel-rc.sh: rollback a failed RC (Nexus drop, SVN cleanup)

Shared libraries (release/libs/):
- _constants.sh, _log.sh, _exec.sh, _version.sh
- _github.sh, _nexus.sh, _maven.sh

GitHub Actions workflows:
- release-prepare-rc.yml, release-publish.yml, release-cancel-rc.yml
- ci-release-scripts.yml (bats unit tests on PR/push)

Includes 85 bats unit tests covering all shared libraries.
@RussellSpitzer RussellSpitzer force-pushed the feature/release-automation branch from 967152b to 81d2bdd Compare May 6, 2026 21:17
Remove the next_dev_version input from publish-release.sh and the
workflow. The next version is always the current patch incremented
by one (e.g. 1.18.0 -> 1.18.1-SNAPSHOT), since the release branch
only produces patches for that major.minor.
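
The patch-increment rule from this commit can be sketched as below. The function name is hypothetical, and the sketch assumes the input has already been validated as X.Y.Z.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the next development version from a release
# version by incrementing the patch component and appending -SNAPSHOT.
next_dev_version() {
  local version="$1"
  local major minor patch
  IFS=. read -r major minor patch <<< "$version"
  # 10# forces base 10 so a patch like "08" is not parsed as octal.
  echo "${major}.${minor}.$((10#$patch + 1))-SNAPSHOT"
}
```

For example, `next_dev_version 1.18.0` prints `1.18.1-SNAPSHOT`, matching the case given in the commit message.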
Comment thread release/libs/_exec.sh
source "$LIBS_DIR/_constants.sh"
source "$LIBS_DIR/_log.sh"

function _redact_secrets {
Member Author

This is me being paranoid about local runs. GitHub will do this automatically, but in case someone runs this locally and copies and pastes the output, this will protect their secrets.
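
For readers who have not opened `_exec.sh`, the idea can be sketched as follows. This is not the PR's `_redact_secrets` implementation, only a guess at the general shape. The variable names are the ones mentioned elsewhere in this PR, and the `***` placeholder is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a redaction filter; not the PR's _redact_secrets.
# Replaces the value of each known secret variable with *** in the given text.
redact_secrets() {
  local text="$1" name value
  for name in NEXUS_USERNAME NEXUS_PASSWORD GITHUB_TOKEN; do
    value="${!name:-}"            # indirect lookup; empty if unset
    if [[ -n "$value" ]]; then
      # Quoting "$value" makes the match literal, even if it contains * or ?.
      text="${text//"$value"/***}"
    fi
  done
  printf '%s\n' "$text"
}
```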

The settings.xml now contains ${env.NEXUS_USERNAME} and
${env.NEXUS_PASSWORD} instead of the actual secret values. Maven
resolves these from environment variables at build time, so the
file itself contains no secrets that could be exfiltrated.
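
One way to write such a file from Bash is a heredoc with a quoted delimiter, sketched below. The function name is hypothetical, and the server id `apache.releases.https` is an assumption. Use whatever id the POM's distributionManagement actually expects.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: write a settings.xml that holds only placeholders.
#   $1 = path of the settings file to create
write_maven_settings() {
  local target="$1"
  # The quoted 'EOF' delimiter disables shell expansion, so ${env.*} is
  # written literally and resolved later by Maven, not by Bash.
  cat > "$target" <<'EOF'
<settings>
  <servers>
    <server>
      <!-- Assumed server id; adjust to match the POM. -->
      <id>apache.releases.https</id>
      <username>${env.NEXUS_USERNAME}</username>
      <password>${env.NEXUS_PASSWORD}</password>
    </server>
  </servers>
</settings>
EOF
}
```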
Comment thread release/libs/_maven.sh Outdated
The glob-based git tag -l "...-rc*" could match malformed tags
like "-rc10extra" or "-rc-foo". Add a _filter_rc_tags helper that
applies a strict ^...-rc[0-9]+$ regex after the glob, so only
well-formed RC tags are considered when auto-detecting RC numbers.
Comment thread release/libs/_version.sh
return 0
}

function _filter_rc_tags {
Member Author

Another bit of paranoia from me: this stops a malformed tag like 1.9.0-rcFoo from being considered.
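
A strict filter of this kind might look roughly like the sketch below. It is not the PR's `_filter_rc_tags`. The function name and the caller-supplied prefix argument are assumptions, since the real tag pattern is elided in the commit message. Instead of building a regex from the prefix, the sketch strips the literal prefix and checks only the remainder.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; reads candidate tags on stdin, one per line.
#   $1 = exact tag text that precedes "-rcN" (supplied by the caller)
filter_rc_tags() {
  local prefix="$1" tag suffix
  while IFS= read -r tag; do
    # Skip tags that do not start with the literal prefix.
    [[ "$tag" == "$prefix"* ]] || continue
    suffix="${tag#"$prefix"}"
    # Accept only "-rc" followed by one or more digits, and nothing else.
    if [[ "$suffix" =~ ^-rc[0-9]+$ ]]; then
      echo "$tag"
    fi
  done
}
```

It would typically sit behind the glob, for example `git tag -l "${prefix}-rc*" | filter_rc_tags "$prefix"`, so that `-rc10extra`, `-rc-foo`, and `-rcFoo` are dropped.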

Previously, a missing GITHUB_TOKEN silently skipped CI verification
and returned success, allowing a release to proceed even if CI was
red. Now it fails unless running in dry-run mode.
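
The new behavior could be sketched as a small guard like the one below. The function name and messages are hypothetical, and the sketch assumes that any `DRY_RUN` value other than `0` counts as a dry run.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the guard; the function name is illustrative.
# Returns 0 when CI verification can proceed or may be skipped (dry run),
# and 1 when a real run is attempted without a token.
check_ci_prerequisites() {
  if [[ -n "${GITHUB_TOKEN:-}" ]]; then
    return 0
  fi
  if [[ "${DRY_RUN:-1}" != "0" ]]; then
    echo "WARN: GITHUB_TOKEN not set; skipping CI verification (dry run)" >&2
    return 0
  fi
  echo "ERROR: GITHUB_TOKEN not set; cannot verify CI status" >&2
  return 1
}
```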
The find_latest_rc_number tests create temporary git repos and
run git commit, which requires user.name and user.email to be set.
The GitHub Actions runner has no default git identity.