Skip to content

fix(stackrox): strip system-out from JUnit XMLs before SHARED_DIR copy#80796

Open
davdhacs wants to merge 3 commits into
openshift:mainfrom
stackrox:fix-junit-secret-size
Open

fix(stackrox): strip system-out from JUnit XMLs before SHARED_DIR copy#80796
davdhacs wants to merge 3 commits into
openshift:mainfrom
stackrox:fix-junit-secret-size

Conversation

@davdhacs

@davdhacs davdhacs commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

What

Strip <system-out> and <system-err> blocks from JUnit XML files in the stackrox-qa-e2e step
after dispatch.sh returns but before the EXIT trap fires.

Why

The StackRox QA E2E suite produces ~2.4 MiB of merged JUnit XML, of which 91% is <system-out>
content
(Gradle's captured test output). The ExitTrap--PostProcessPrep EXIT trap merges all
JUnit XMLs and copies the result to SHARED_DIR, which is backed by a Kubernetes Secret with a
hard 1 MiB size limit. Every LP interop run fails with:

failed to update secret: Secret "cr--acs--tests-aws" is invalid:
data: Too long: may not be more than 1048576 bytes

All tests pass — the job fails purely in post-processing.

How

The stripping runs after dispatch.sh returns — at that point, junit2jira (which includes
stdout/stderr in Jira issue descriptions) and flakechecker have already read the full JUnit
data during dispatch.sh's own exit trap. Then when this script exits, the ExitTrap--PostProcessPrep
finds the stripped files and the merged result fits in the Secret.

The sed pattern handles both single-line (<system-out><![CDATA[]]></system-out>) and multi-line
blocks. The single-line pattern comes first because GNU sed's range /start/,/end/d looks for the
end on the next line — when both tags are on the same line, the range extends to EOF.

After stripping, the total drops from ~2.4 MiB to ~230 KiB.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 19, 2026
@openshift-ci

openshift-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

Inside cleanup-collect in stackrox-qa-e2e-commands.sh, a find loop is added that applies an in-place sed command to strip <system-out>…</system-out> and <system-err>…</system-err> blocks from all JUnit XML files under ARTIFACT_DIR, excluding the original_results subdirectory, before those files are copied to SHARED_DIR.

Changes

JUnit XML Cleanup

Layer / File(s) Summary
Strip system-out/err blocks before SHARED_DIR copy
ci-operator/step-registry/stackrox/qa-e2e/stackrox-qa-e2e-commands.sh
A find loop with in-place sed is inserted into cleanup-collect to delete <system-out> and <system-err> XML blocks from JUnit files in ARTIFACT_DIR (excluding original_results) before they are copied to SHARED_DIR. Original files are preserved in original_results/.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: stripping system-out from JUnit XMLs before copying to SHARED_DIR, which directly addresses the PR's core objective of reducing file size to meet Kubernetes Secret limits.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed PR modifies only a shell script for CI/CD test result processing; does not introduce or modify any Ginkgo test definitions, so the stable test names check is not applicable.
Test Structure And Quality ✅ Passed Custom check for Ginkgo test structure is not applicable; PR modifies only a bash shell script for CI/CD operations, not Go test code.
Microshift Test Compatibility ✅ Passed PR modifies only a Bash CI shell script with no new Ginkgo e2e tests or Go test code. MicroShift compatibility check is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed Check not applicable: PR modifies only CI shell script, not Ginkgo e2e tests. SNO compatibility check applies only to new Go e2e tests.
Topology-Aware Scheduling Compatibility ✅ Passed Check not applicable: PR modifies a CI bash script for test result processing, not deployment manifests, operator code, or controllers that could introduce scheduling constraints.
Ote Binary Stdout Contract ✅ Passed The custom check "OTE Binary Stdout Contract" applies only to Go test binaries (main(), TestMain(), BeforeSuite(), etc.). This PR modifies a bash script for CI infrastructure post-processing, not a...
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR modifies only a bash CI script to strip XML tags from JUnit output; no Ginkgo e2e tests are added, so the IPv6/disconnected network compatibility check does not apply.
No-Weak-Crypto ✅ Passed PR contains only shell script refactoring for XML test result handling with sed text manipulation; no cryptographic operations (weak, strong, custom, or secret comparisons) found.
Container-Privileges ✅ Passed PR modifies only a bash script that strips XML blocks from JUnit files; no container privileges or K8s security contexts are changed or introduced.
No-Sensitive-Data-In-Logs ✅ Passed PR strips and blocks from JUnit XMLs, reducing sensitive test output in SHARED_DIR. All logging statements are non-sensitive informational messages about copying/patching...
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 19, 2026
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

couldn't prepare candidate: couldn't rebase candidate onto 305ddb864ca697202a1db9630b00db383c485d93 due to conflicts
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ci-operator/step-registry/stackrox/qa-e2e/stackrox-qa-e2e-commands.sh`:
- Around line 59-60: The issue is that the `-path "${original_results}" -prune`
argument in the find command at line 59 fails to exclude the backup directory
due to a trailing slash mismatch, causing the sed command on line 60 to
incorrectly modify and overwrite the original backup XML files. To fix this,
ensure the path pattern used in the `-path` argument properly matches the actual
directory structure by removing any trailing slash from the `original_results`
variable before using it in the find command, so that the prune correctly
excludes the backup directory and preserves the original results.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ebb64165-212a-4695-9c4d-69526c36ff9c

📥 Commits

Reviewing files that changed from the base of the PR and between 305ddb8 and b2e0cef.

📒 Files selected for processing (1)
  • ci-operator/step-registry/stackrox/qa-e2e/stackrox-qa-e2e-commands.sh

Comment on lines +59 to +60
find "${ARTIFACT_DIR}" -path "${original_results}" -prune -o -type f -iname "*.xml" -print0 | while IFS= read -r -d '' xml_file; do
sed -i '/<system-out>.*<\/system-out>/d; /<system-out>/,/<\/system-out>/d; /<system-err>.*<\/system-err>/d; /<system-err>/,/<\/system-err>/d' "${xml_file}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Prune path can miss original_results due to trailing slash mismatch.

At Line 59, -path "${original_results}" may not match because original_results includes a trailing /. If prune misses, Line 60 will also rewrite backup XMLs, so originals are no longer preserved.

Suggested fix
-      find "${ARTIFACT_DIR}" -path "${original_results}" -prune -o -type f -iname "*.xml" -print0 | while IFS= read -r -d '' xml_file; do
+      original_results_prune="${original_results%/}"
+      find "${ARTIFACT_DIR}" -path "${original_results_prune}" -prune -o -type f -iname "*.xml" -print0 | while IFS= read -r -d '' xml_file; do
         sed -i '/<system-out>.*<\/system-out>/d; /<system-out>/,/<\/system-out>/d; /<system-err>.*<\/system-err>/d; /<system-err>/,/<\/system-err>/d' "${xml_file}"
       done
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ci-operator/step-registry/stackrox/qa-e2e/stackrox-qa-e2e-commands.sh` around
lines 59 - 60, The issue is that the `-path "${original_results}" -prune`
argument in the find command at line 59 fails to exclude the backup directory
due to a trailing slash mismatch, causing the sed command on line 60 to
incorrectly modify and overwrite the original backup XML files. To fix this,
ensure the path pattern used in the `-path` argument properly matches the actual
directory structure by removing any trailing slash from the `original_results`
variable before using it in the find command, so that the prune correctly
excludes the backup directory and preserves the original results.

@davdhacs davdhacs force-pushed the fix-junit-secret-size branch from b2e0cef to 6b7e13b Compare June 19, 2026 16:43
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 19, 2026
@davdhacs

Copy link
Copy Markdown
Contributor Author

/hold
/uncc

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2026
@davdhacs davdhacs marked this pull request as ready for review June 19, 2026 16:52
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 19, 2026
@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

The QA E2E suite produces ~2.4 MiB of JUnit XML, of which 91% is
<system-out> content (Gradle test output). The EXIT trap merges these
and copies to SHARED_DIR (a Kubernetes Secret with a 1 MiB limit),
failing with "Secret is invalid: data: Too long" on every run.

Strip <system-out> and <system-err> inside the EXIT trap, before
ExitTrap--PostProcessPrep merges and copies. By this point, dispatch.sh
has already exited and its ci_exit_trap has run junit2jira and
flakechecker with the full data. The stripping cannot go after
dispatch.sh because handle_dangling_processes kills the parent step
script, so code after dispatch.sh never executes.

The single-line sed pattern must come first because GNU sed's range
/start/,/end/d looks for end on the NEXT line — when both tags are
on the same line (Gradle's empty CDATA), the range extends to EOF.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the fix-junit-secret-size branch from 6b7e13b to 98ef3eb Compare June 19, 2026 19:41
@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Temporary config to rehearse the LP interop job against the stackrox
PR branch that fixes handle_dangling_processes killing the step script.
Combined with the step registry fix (system-out stripping), this
exercises both fixes end-to-end.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: job(s): periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws either don't exist or were not found to be affected, and cannot be rehearsed

Generated by `make jobs`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davdhacs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

[REHEARSALNOTIFIER]
@davdhacs: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name Repo Type Reason
periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws N/A periodic Periodic changed
periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-acs--tests-aws-fips N/A periodic Periodic changed
periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws N/A periodic Registry content changed
periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-acs--tests-aws-fips N/A periodic Registry content changed
periodic-ci-stackrox-stackrox-master-ocp-4.21-lp-interop-cr-acs-latest-acs-tests-aws N/A periodic Registry content changed
periodic-ci-stackrox-stackrox-master-ocp-4-21-lp-interop-acs-tests-aws-fips N/A periodic Registry content changed
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@davdhacs

Copy link
Copy Markdown
Contributor Author

/pj-rehearse periodic-ci-stackrox-stackrox-fix-dangling-process-kill-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci

openshift-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

@davdhacs: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/rehearse/periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws 98ef3eb link unknown /pj-rehearse periodic-ci-stackrox-stackrox-master-ocp-4.22-lpMainline-lp-ocp-compat-cr--acs--tests-aws

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant