Skip to content

[review] batch6 fixes: 10 articles (1 high + 4 medium + 4 low)#778

Merged
jing2uo merged 1 commit into
mainfrom
kb/fix/batch6-review-2026-05-30
May 30, 2026
Merged

[review] batch6 fixes: 10 articles (1 high + 4 medium + 4 low)#778
jing2uo merged 1 commit into
mainfrom
kb/fix/batch6-review-2026-05-30

Conversation

@jing2uo
Copy link
Copy Markdown
Collaborator

@jing2uo jing2uo commented May 30, 2026

Summary

Addresses the 8 findings in batch6-merged-pr-kb-review-2026-05-30.html
(codex review of the 37 merged batch6 convert PRs).

10 files touched, +57 / -25 lines:

Sev File Fix
HIGH VM_application_traffic_times_out_behind_kube_ovn_Geneve… Remove 12 leaked phase/ev evidence-note lines (pipeline scratch leaked into prose)
MED OpenTelemetry_Apache_HTTPD… Resolution split into step-1 root cause (remove conflicting user mount) + step-2 empirical mitigation, with caveat that step-2 mechanism is not characterised
MED KubeVirt_VM_Loses_Bridge_Network_Connectivity_When_br_netfilter… Add disruptive-change block: maintenance-window scheduling, rollback steps, third-party-agent audit, note durable fix is preventing workload from running on the node
MED TektonConfig_stuck_Not_Ready… Reorder: re-create installer set first (operator-aware); finalizer=null becomes last-resort with explicit preconditions
MED VolumeSnapshot_creation_latency… Add destructive-operation warning: DeletionPolicy=Delete + backend snapshot loss; document Retain alternative, backup-pipeline check, customer ack
LOW PipelineRun_stuck_in_PipelineRunPending… Drop "## Evidence" section with internal ev:c1/c3/c6/c7 labels; replace with sanitised "## Verification"
LOW Dynamic_NFS_storage_provisioning_for_Tekton_CICD… Strip "cluster glean-lab-base-0529" mentions from user-facing prose
LOW ResolutionRequest_reconciles_every_ten_hours… Strip "lab-base" mentions from user-facing prose
LOW Alertmanager_027_UTF_8… + Custom_matchers_with_whitespace… Add ## See Also cross-link between the two near-duplicate KBs

Test plan

  • Each touched KB renders with the existing site build
  • Spot-check the HIGH fix: no phase[0-9] / lab-base / ev:c substrings in the published body of VM_application_traffic_times_out_behind_kube_ovn_Geneve…
  • Spot-check the Tekton MED fix: re-create-first ordering reads correctly + preconditions checklist for finalizer=null is unambiguous

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Enhanced cross-references between related troubleshooting articles for better navigation
    • Added operational guidance and warning sections for disruptive procedures
    • Improved resolution steps with clearer, structured step-by-step guidance
    • Generalized environment-specific references for broader applicability
    • Added verification checklists and preconditions for critical operations
    • Refined explanations of expected system behavior and monitoring outcomes

Review Change Stack

Addresses findings from batch6-merged-pr-kb-review-2026-05-30.html
codex review of the 37 merged convert PRs.

High (1):
- VM application traffic times out behind kube-ovn Geneve overlay…
  remove 12 leaked phase2/phase4/phase5 evidence-note lines from the
  published article body (pipeline scratch notes leaked into prose).

Medium (4):
- OpenTelemetry Apache HTTPD auto-instrumentation fails to inject…
  Rewrite resolution as 2 steps: step 1 remove the actual conflicting
  user mount (root cause); step 2 the extra emptyDir as empirical
  mitigation when no overlap is visible.
- KubeVirt VM Loses Bridge Network Connectivity When br_netfilter…
  Add a "Disruptive node-level change" block with maintenance-window
  scheduling, rollback steps (capture prior sysctl/module-list state),
  third-party-agent audit guidance, and a note that durable fix is
  preventing the workload from running on the affected node, not just
  unloading the module once.
- TektonConfig stuck Not Ready after Tekton operator upgrade…
  Reorder: try re-create installer set first (operator-aware path);
  force finalizer=null is last resort with preconditions
  (deletionTimestamp set, only the controller's finalizer present,
  spec backed up, customer ack, owner-controller failure confirmed).
- VolumeSnapshot creation latency from stale VolumeSnapshotContent…
  Add explicit destructive-operation warning before the delete: with
  DeletionPolicy=Delete the snapshot-controller issues a backend
  DeleteSnapshot. Document the patch-to-Retain alternative + backup
  pipeline + customer-ack preconditions.

Low (4):
- PipelineRun stuck in PipelineRunPending…
  Rename "## Evidence" -> "## Verification" and drop the internal
  ev:c1/c3/c6/c7 labels.
- Dynamic NFS storage provisioning for Tekton CI/CD on ACP
  Strip "cluster glean-lab-base-0529" mentions from user-facing prose.
- ResolutionRequest reconciles every ten hours and why that is harmless
  Strip "lab-base" mentions from user-facing prose.
- Alertmanager 0.27+ UTF-8 matchers parser warning… +
  Custom matchers with whitespace trigger Alertmanager UTF-8 parser…
  Add "## See Also" cross-link between the two near-duplicate KBs so
  they don't drift independently.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 30, 2026

Walkthrough

Ten knowledge-base articles are updated with cross-references between related topics, removal of cluster-specific identifiers to improve applicability, addition of pre-execution safety warnings for destructive operations, and refinement of resolution procedures with clearer step ordering and preconditions.

Changes

Knowledge Base Documentation Improvements

Layer / File(s) Summary
Cross-reference linking between related articles
docs/en/solutions/Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md, docs/en/solutions/Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md
Two Alertmanager articles now include "See Also" sections that cross-link to each other, connecting the UTF-8 matchers parser warning with custom matchers whitespace issues.
Generalization of cluster-specific examples
docs/en/solutions/Dynamic_NFS_storage_provisioning_for_Tekton_CICD_on_ACP.md
Cluster name references are removed from prerequisites and verification sections, replacing specific ACP cluster identifiers with generic phrasing while retaining NFS CSI driver details and pipeline outcomes.
Pre-execution operational safety warnings
docs/en/solutions/KubeVirt_VM_Loses_Bridge_Network_Connectivity_When_br_netfilter_Is_Loaded_on_the_Worker.md, docs/en/solutions/VolumeSnapshot_creation_latency_from_stale_VolumeSnapshotContent_objects_on_ACP.md
New "disruptive operation" and "destructive operation" warning subsections are added, including maintenance scheduling, dependency audits, rollback capture procedures, and customer acknowledgement checklists before executing the remediation.
Resolution procedure refinement and restructuring
docs/en/solutions/OpenTelemetry_Apache_HTTPD_auto_instrumentation_fails_to_inject_when_user_volumes_leave_no_room_for_the_agent.md, docs/en/solutions/PipelineRun_stuck_in_PipelineRunPending_on_ACP_DevOps_Pipelines.md, docs/en/solutions/ResolutionRequest_reconciles_every_ten_hours_and_why_that_is_harmless.md, docs/en/solutions/TektonConfig_stuck_Not_Ready_after_Tekton_operator_upgrade_on_ACP.md
Resolution steps are rewritten with ordered sequences, generalized placeholders, code-behavior tie-ins, and dual-path procedures: OpenTelemetry separates conflict detection from workaround application; PipelineRun replaces example-specific evidence with template-based verification; ResolutionRequest clarifies log signature expectations; TektonConfig adds preferred operator-aware recovery before last-resort finalizer removal.
Removal of outdated documentation artifacts
docs/en/solutions/VM_application_traffic_times_out_behind_kube_ovn_Geneve_overlay_when_the_secondary_network_MTU_exceeds_1400_on_ACP.md
Lab-specific phase tracking and internal experimental findings section is deleted, leaving the document to conclude with production diagnostic guidance.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~4 minutes

Poem

🐰 With cross-links clear and examples bright,
We scrub away the cluster-names of night,
Add warnings loud before we play with fire,
Refactor steps to lift the doc's attire,
Our knowledge base grows wiser, pure delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title '[review] batch6 fixes: 10 articles (1 high + 4 medium + 4 low)' is vague and uses non-descriptive packaging notation rather than conveying the primary change; it reads like an internal tracking tag rather than a clear summary. Rewrite the title to highlight the primary change, such as 'Documentation fixes for operational safety and content cleanup across 10 KB articles' or 'Add operational warnings and sanitize user-facing content in batch6 KB fixes'.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch kb/fix/batch6-review-2026-05-30

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/en/solutions/ResolutionRequest_reconciles_every_ten_hours_and_why_that_is_harmless.md (1)

18-18: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Cluster-specific identifier remains in Root Cause section.

The PR objectives indicate removing "lab-base mentions from user-facing prose." Line 18 contains "on lab-base the live deploy/tekton-pipelines-remote-resolvers..." which appears to be a cluster-specific reference. Consider generalizing this to "on a stock ACP cluster" or similar to match the sanitization applied to other articles in this batch.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/en/solutions/ResolutionRequest_reconciles_every_ten_hours_and_why_that_is_harmless.md`
at line 18, Replace the cluster-specific phrase "lab-base" in the sentence that
reads "on lab-base the live `deploy/tekton-pipelines-remote-resolvers`..." with
a generic descriptor such as "on a stock ACP cluster" (leave the resource name
`deploy/tekton-pipelines-remote-resolvers` unchanged and preserve backticks);
update the surrounding prose to match the sanitized style used elsewhere so the
sentence reads naturally and contains no environment-specific identifiers.
🧹 Nitpick comments (2)
docs/en/solutions/Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md (1)

81-83: ⚡ Quick win

Consider using hyperlinks instead of plain text references.

The cross-reference correctly identifies the related article, but the same formatting concern applies here: backticks are unconventional for article titles, and a hyperlink would improve usability.

📎 Suggested improvement with relative link
 ## See Also
 
-- `Alertmanager 0.27+ UTF-8 matchers parser warning for custom routing rules` — same root cause, broader walk-through of UTF-8-parser back-compat changes.
+- [Alertmanager 0.27+ UTF-8 matchers parser warning for custom routing rules](./Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md) — same root cause, broader walk-through of UTF-8-parser back-compat changes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/en/solutions/Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md`
around lines 81 - 83, Replace the plain backticked reference in the "See Also"
section — specifically the entry "`Alertmanager 0.27+ UTF-8 matchers parser
warning for custom routing rules`" — with a properly formatted hyperlink using
the article title (no backticks) and a relative URL to that related doc; update
the bullet to read like a linkable title (e.g., Alertmanager 0.27+ UTF-8
matchers parser warning for custom routing rules) and add the relative markdown
link syntax so readers can click through.
docs/en/solutions/Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md (1)

86-88: ⚡ Quick win

Consider using hyperlinks instead of plain text references.

The cross-reference correctly identifies the related article, but using backticks for article titles is unconventional (backticks typically denote code/literals in Markdown). More importantly, the reference is plain text rather than a clickable hyperlink, which reduces usability.

📎 Suggested improvement with relative link
 ## See Also
 
-- `Custom matchers with whitespace trigger Alertmanager UTF-8 parser fallback warning` — same root cause, walks through the unquoted-whitespace example explicitly.
+- [Custom matchers with whitespace trigger Alertmanager UTF-8 parser fallback warning](./Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md) — same root cause, walks through the unquoted-whitespace example explicitly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/en/solutions/Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md`
around lines 86 - 88, Under the "## See Also" section replace the backticked
plain-text reference `Custom matchers with whitespace trigger Alertmanager UTF-8
parser fallback warning` with a proper Markdown hyperlink to that related
article (use a relative link to the article's filename and keep the article
title as the link text) so the entry becomes a clickable cross-reference instead
of inline code-styled text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/en/solutions/OpenTelemetry_Apache_HTTPD_auto_instrumentation_fails_to_inject_when_user_volumes_leave_no_room_for_the_agent.md`:
- Line 63: Update the ambiguous instruction at line 63 in the document so it
reflects that step 2 is conditional: replace "The resolution has two parts; do
both in order" with a clear instruction such as "Attempt step 1 first; apply
step 2 only if step 1 does not resolve the issue." Also scan and adjust the
subsequent phrasing that starts with "When step 1 does not visibly apply" so it
remains consistent with the new line 63 wording and clearly indicates step 2 is
a fallback rather than mandatory.

---

Outside diff comments:
In
`@docs/en/solutions/ResolutionRequest_reconciles_every_ten_hours_and_why_that_is_harmless.md`:
- Line 18: Replace the cluster-specific phrase "lab-base" in the sentence that
reads "on lab-base the live `deploy/tekton-pipelines-remote-resolvers`..." with
a generic descriptor such as "on a stock ACP cluster" (leave the resource name
`deploy/tekton-pipelines-remote-resolvers` unchanged and preserve backticks);
update the surrounding prose to match the sanitized style used elsewhere so the
sentence reads naturally and contains no environment-specific identifiers.

---

Nitpick comments:
In
`@docs/en/solutions/Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md`:
- Around line 86-88: Under the "## See Also" section replace the backticked
plain-text reference `Custom matchers with whitespace trigger Alertmanager UTF-8
parser fallback warning` with a proper Markdown hyperlink to that related
article (use a relative link to the article's filename and keep the article
title as the link text) so the entry becomes a clickable cross-reference instead
of inline code-styled text.

In
`@docs/en/solutions/Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md`:
- Around line 81-83: Replace the plain backticked reference in the "See Also"
section — specifically the entry "`Alertmanager 0.27+ UTF-8 matchers parser
warning for custom routing rules`" — with a properly formatted hyperlink using
the article title (no backticks) and a relative URL to that related doc; update
the bullet to read like a linkable title (e.g., Alertmanager 0.27+ UTF-8
matchers parser warning for custom routing rules) and add the relative markdown
link syntax so readers can click through.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5dc35976-593d-4671-8b72-22e70f315a1f

📥 Commits

Reviewing files that changed from the base of the PR and between c192120 and a50dbfa.

📒 Files selected for processing (10)
  • docs/en/solutions/Alertmanager_027_UTF_8_matchers_parser_warning_for_custom_routing_rules.md
  • docs/en/solutions/Custom_matchers_with_whitespace_trigger_Alertmanager_UTF_8_parser_fallback_warning.md
  • docs/en/solutions/Dynamic_NFS_storage_provisioning_for_Tekton_CICD_on_ACP.md
  • docs/en/solutions/KubeVirt_VM_Loses_Bridge_Network_Connectivity_When_br_netfilter_Is_Loaded_on_the_Worker.md
  • docs/en/solutions/OpenTelemetry_Apache_HTTPD_auto_instrumentation_fails_to_inject_when_user_volumes_leave_no_room_for_the_agent.md
  • docs/en/solutions/PipelineRun_stuck_in_PipelineRunPending_on_ACP_DevOps_Pipelines.md
  • docs/en/solutions/ResolutionRequest_reconciles_every_ten_hours_and_why_that_is_harmless.md
  • docs/en/solutions/TektonConfig_stuck_Not_Ready_after_Tekton_operator_upgrade_on_ACP.md
  • docs/en/solutions/VM_application_traffic_times_out_behind_kube_ovn_Geneve_overlay_when_the_secondary_network_MTU_exceeds_1400_on_ACP.md
  • docs/en/solutions/VolumeSnapshot_creation_latency_from_stale_VolumeSnapshotContent_objects_on_ACP.md
💤 Files with no reviewable changes (1)
  • docs/en/solutions/VM_application_traffic_times_out_behind_kube_ovn_Geneve_overlay_when_the_secondary_network_MTU_exceeds_1400_on_ACP.md

## Resolution

Add a non-conflicting `volumeMount` path (e.g. `/tmp/.otel-instr-fix`) backed by a new `emptyDir` volume to the user's Deployment. This gives the OTel mutating webhook the headroom it needs to complete the Apache HTTPD instrumentation preparation steps so the merged layout is consistent. After the mitigation, the Apache HTTPD container no longer emits `No such file or directory` on startup and the OTel auto-instrumentation injection succeeds.
The resolution has two parts; do both in order, because step 1 addresses the root cause and step 2 is an empirical mitigation whose mechanism is not fully understood.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify whether both steps are required or conditional.

Line 63 states "do both in order," but line 67 introduces step 2 with "When step 1 does not visibly apply," suggesting step 2 is conditional and only needed if step 1 doesn't resolve the issue. Consider revising line 63 to something like "The resolution has two parts; attempt step 1 first, then apply step 2 if needed" to accurately reflect that step 2 is a fallback rather than always required.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/en/solutions/OpenTelemetry_Apache_HTTPD_auto_instrumentation_fails_to_inject_when_user_volumes_leave_no_room_for_the_agent.md`
at line 63, Update the ambiguous instruction at line 63 in the document so it
reflects that step 2 is conditional: replace "The resolution has two parts; do
both in order" with a clear instruction such as "Attempt step 1 first; apply
step 2 only if step 1 does not resolve the issue." Also scan and adjust the
subsequent phrasing that starts with "When step 1 does not visibly apply" so it
remains consistent with the new line 63 wording and clearly indicates step 2 is
a fallback rather than mandatory.

@jing2uo jing2uo merged commit 74eb346 into main May 30, 2026
2 checks passed
@jing2uo jing2uo deleted the kb/fix/batch6-review-2026-05-30 branch May 30, 2026 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant