Skip to content

fix(release): Add more checks to release scripts#127

Open
NickLarsenNZ wants to merge 35 commits into
mainfrom
fix/release-scripts
Open

fix(release): Add more checks to release scripts#127
NickLarsenNZ wants to merge 35 commits into
mainfrom
fix/release-scripts

Conversation

@NickLarsenNZ
Copy link
Copy Markdown
Member

@NickLarsenNZ NickLarsenNZ commented May 20, 2026

Tip

This PR is intended to be compared with #125 so we can take the best ideas of both.

Summary

Harden the release scripts with shared validation, state assertions, and consistency improvements. The goal is to catch errors early (before mutations), make handover between colleagues safer, and reduce silent misbehavior.

Also incorporates features from #125: reduced duplication via shared helpers, and centralised variable derivation.

What changed

  • Created release/common.sh with shared functions sourced by all scripts
  • Added BATS tests for all common functions (release/test_common.bats)
  • All scripts now validate inputs before any mutations (validate_tag, validate_release_base_version, validate_what)
  • All scripts now check common dependencies early (check_common_dependencies)
  • Git state is asserted throughout: clean index, correct branch, correct repo, correct remote
  • Branch existence checks use exact matching instead of loose grep
  • Shallow clones removed (broke downstream tag/cherry-pick operations)
  • git commit -sam replaced with explicit git add of known files
  • pushd/popd with relative paths instead of cd with absolute paths
  • Renamed REPOSITORY to REMOTE for consistency
  • Fixed usage messages referencing wrong script names
  • Centralised variable derivation (derive_tag_vars, derive_branch_vars)
  • Centralised operator iteration (for_each_operator), cloning (ensure_clone), temp folder creation (ensure_temp_folder), and quote stripping (strip_double_quotes)
  • Renamed RELEASE to RELEASE_BASE for clarity

Checklist

Phase 1: Shared infrastructure

  • 1.1 - common.sh with validate_what, check_common_dependencies, assert_clean_index
  • 1.2 - validate_tag (CalVer, --no-rc flag) and validate_release_base_version
  • 1.3 - assert_cwd_is_repo
  • 1.4 - assert_on_branch
  • 1.5 - assert_remote_exists (checks github.com/stackabletech/$repo)

Phase 2: Apply shared functions to each script

  • 2.1 - create-release-branch.sh
  • 2.2 - create-release-candidate-branch.sh
  • 2.3 - merge-release-candidate.sh
  • 2.4 - tag-release-candidate.sh
  • 2.5 - post-release.sh

Phase 2b: Consolidate push_branch

  • 2.6 - Move push_branch to common.sh (deferred - current per-script implementations are short and clear)

Phase 3: Fix git state handling

  • 3.1 - Remove --depth 1 from clones
  • 3.2 - Replace git commit -sam with explicit git add + git commit -sm
  • 3.3 - assert_clean_index after commits
  • 3.4 - Subshells () for directory isolation (except release-branch-hygiene.sh which needs parent state)
  • 3.5 - Exact branch-existence checks (assert_remote_branch_exists, assert_remote_branch_not_exists, assert_tag_not_exists)
  • 3.6 - Commit divergence assertions (TODOs placed in scripts, deferred to future PR)

Phase 4: Improve dry-run and observability

  • 4.1 - Verify PR exists in merge-release-candidate.sh push mode (fails if missing). Dry-run skips the check to allow testing before PRs exist.
  • 4.2 - Summary output at end of each script (skipped - per-repo output already exists and set -e stops on first failure)
  • 4.3 - Document RC rejection in post-release.sh
  • 4.4 - Document demos branch-only workflow in README

Phase 5: Future-proofing (optional)

Not doing these now.

  • 5.1 - --non-interactive flag for merge-release-candidate.sh (deferred to future PR)
  • 5.2 - Progress/state file for handover resumability (deferred to future PR)
  • 5.3 - Two-pass operator loop (deferred to future PR)

Tip

The phases below incorporate improvements from @adwk67 in #125

Phase A: Reduce duplication

  • A.1 - derive_tag_vars (centralised variable derivation from tag)
  • A.2 - derive_branch_vars (centralised variable derivation from branch version)
  • A.3 - strip_double_quotes (replace inline quote stripping)
  • A.4 - for_each_operator (centralised operator iteration)
  • A.5 - ensure_clone (centralised clone-if-not-present)
  • A.6 - ensure_temp_folder (centralised temp folder creation)

Phase B: Improve existing functions

  • B.1 - Switch remote_branch_exists to git ls-remote (avoids modifying local refs)
  • B.2 - Switch assert_tag_not_exists to git ls-remote (now takes remote name as argument)
  • B.3 - Idempotent update_changelog (skip if tag already present, validate tag, use fixed-string matching)

Phase C: Add new safety checks

  • C.1 - No-op commit guard (extracted into commit_and_push_rc function)
  • C.2 - verify_release (post-mutation verification of Cargo, Helm, antora, changelog, etc.)
  • C.3 - shellcheck source=common.sh directive

Phase D: Bug fixes

  • D.1 - LIBGIT2_NO_PKG_CONFIG=1 workaround for non-NixOS users
  • D.2 - Add extra/ to git add paths in operator loop

Phase E: Update release-branch-hygiene.sh

  • E.1 - Apply all common.sh patterns to release-branch-hygiene.sh

Phase F: Future improvements

  • F.1 - Use subshells () instead of pushd/popd where functions don't propagate parent state (all scripts except release-branch-hygiene.sh which needs CREATED_PRS)

Ensures we are in a repo, or a specific repo.
Ensures we are on a specific branch.
Ensures we have a remote called origin, and it is for the repo we expect.
```
❯ bats release/test_common.bats
test_common.bats
 ✓ validate_what: accepts valid value
 ✓ validate_what: accepts last valid value
 ✓ validate_what: rejects invalid value
 ✓ validate_what: rejects value not in allowed set for this script
 ✓ validate_what: rejects empty value
 ✓ validate_tag: accepts final release tag
 ✓ validate_tag: accepts RC tag
 ✓ validate_tag: accepts multi-digit RC
 ✓ validate_tag: accepts month 12
 ✓ validate_tag: rejects month 0
 ✓ validate_tag: rejects month 13
 ✓ validate_tag: rejects leading zero on month
 ✓ validate_tag: rejects missing patch
 ✓ validate_tag: rejects empty tag
 ✓ validate_tag: rejects missing hyphen in rc
 ✓ validate_tag: --no-rc accepts final release
 ✓ validate_tag: --no-rc rejects RC tag
 ✓ validate_tag: rejects unknown flag
 ✓ validate_tag: rejects trailing arguments
 ✓ validate_release_base_version: accepts valid version
 ✓ validate_release_base_version: accepts month 11
 ✓ validate_release_base_version: rejects version with patch
 ✓ validate_release_base_version: rejects empty
 ✓ validate_release_base_version: rejects month 0
 ✓ validate_release_base_version: rejects leading zero
 ✓ assert_cwd_is_repo: passes in a git repo
 ✓ assert_cwd_is_repo: passes with matching name
 ✓ assert_cwd_is_repo: fails with wrong name
 ✓ assert_cwd_is_repo: fails outside a git repo
 ✓ assert_on_branch: passes on correct branch
 ✓ assert_on_branch: fails on wrong branch
 ✓ assert_on_branch: fails with empty argument
 ✓ assert_remote_exists: passes with SSH URL
 ✓ assert_remote_exists: passes with HTTPS URL
 ✓ assert_remote_exists: passes without .git suffix
 ✓ assert_remote_exists: fails with wrong repo name
 ✓ assert_remote_exists: fails with wrong org
 ✓ assert_remote_exists: fails with nonexistent remote
 ✓ assert_clean_index: passes on clean repo
 ✓ assert_clean_index: fails on staged changes
 ✓ assert_clean_index: fails on unstaged changes
 ✓ assert_clean_index: warns on untracked files, continues with y
 ✓ assert_clean_index: warns on untracked files, aborts with n

43 tests, 0 failures
```
…nctions for validating the tag and the "what"
Patch releases could break if the tag is too far back
We now have assertions that we are in the right place.
@NickLarsenNZ NickLarsenNZ self-assigned this May 20, 2026
NickLarsenNZ and others added 6 commits May 20, 2026 12:00
This sets commonly used variables based on the tag.

Also update RELEASE to RELEASE_BASE.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
This sets commonly used variables based on the given release base.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Replace all the manual double-quote stripping with it.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Allows us to apply a function with arguments _for each operator_ in the config.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Centralises the clone if not present check.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Centralizes `mkdir -p` of the temp folder.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
NickLarsenNZ and others added 5 commits May 20, 2026 14:59
This removes the need for fetching and then looking at local refs.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
This removes the need for fetching and then looking at local tags.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
NickLarsenNZ and others added 7 commits May 20, 2026 16:31
…ripts

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
…rgo build

This is a workaround for non-NixOS users where nix-shell provides libgit2 at build time but `LD_LIBRARY_PATH` doesn't include the nix store at runtime, causing a crash.

The proper fix belongs in the operator repos' nix shells.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Note: This wasn't done for release-branch-hygiene.sh because it modifies global variables.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
@NickLarsenNZ NickLarsenNZ requested a review from adwk67 May 21, 2026 07:33
@NickLarsenNZ NickLarsenNZ marked this pull request as ready for review May 21, 2026 07:34
@adwk67 adwk67 moved this to Development: In Review in Stackable Engineering May 21, 2026
Comment on lines +75 to +81
#
# LIBGIT2_NO_PKG_CONFIG: Workaround for non-NixOS users. nix-shell provides
# libgit2 at build time (found via pkg-config), but LD_LIBRARY_PATH may not
# include the nix store at runtime, causing a crash. This forces libgit2-sys
# to statically link its bundled copy instead. The proper fix belongs in the
# operator repos' nix shells.
LIBGIT2_NO_PKG_CONFIG=1 nix-shell --run 'make regenerate-charts'
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we consider this a hack? It causes openssl.so.3 issues for some users.
I think the fix is to include pkg-config for the operators.

assert_remote_branch_not_exists "$REMOTE" "$PR_BRANCH"

# create a new branch for the PR off of this
git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prevents resumability. We could maybe follow the same pattern of:

Suggested change
git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"
git switch "$PR_BRANCH" "$RELEASE_BRANCH" 2>/dev/null || git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"

exit 1
fi
# create a new branch for the PR off of this
git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as other comment...

Suggested change
git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"
git switch "$PR_BRANCH" "$RELEASE_BRANCH" 2>/dev/null || git switch -c "$PR_BRANCH" "$RELEASE_BRANCH"

Comment thread release/post-release.sh
# New branch that updates the CHANGELOG
CHANGELOG_BRANCH="chore/update-changelog-from-release-$RELEASE_TAG"
# Branch out from main
git switch -c "$CHANGELOG_BRANCH"
Copy link
Copy Markdown
Member Author

@NickLarsenNZ NickLarsenNZ May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This likely also fails on resume

Comment thread release/common.sh
fi

# gh authentication: if this fails you will need to e.g. gh auth login
gh auth status
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, also echo something to the user (e.g. "It was not possible to verify authentication: please login with e.g. gh auth login") so that it is clear what to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Development: In Review

Development

Successfully merging this pull request may close these issues.

2 participants