NE-2715: Update keepalived-ipfailover to replace iptables with nftables#235
NE-2715: Update keepalived-ipfailover to replace iptables with nftables#235Thealisyed wants to merge 1 commit into
Conversation
|
@Thealisyed: This pull request references NE-2715 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis PR migrates the keepalived container from using iptables wrapper scripts to nftables for managing multicast firewall rules. The nftables package is installed, deprecated wrapper scripts are removed, configuration documentation is updated, and core failover setup/teardown logic switches to nftables table and rule management. Changesiptables to nftables migration for keepalived multicast
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Line 14: The README.md contains a testing artifact line "//For my testing
purpose" that should not be published; remove that exact line from README.md (or
replace it with appropriate documentation text) so the README only contains
intended documentation content.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 522cf7a2-4f21-4b3a-966a-31dc5b172233
📒 Files selected for processing (10)
README.mdipfailover/keepalived/Dockerfileipfailover/keepalived/conf/settings.shipfailover/keepalived/iptables-scripts/ip6tablesipfailover/keepalived/iptables-scripts/ip6tables-restoreipfailover/keepalived/iptables-scripts/ip6tables-saveipfailover/keepalived/iptables-scripts/iptablesipfailover/keepalived/iptables-scripts/iptables-restoreipfailover/keepalived/iptables-scripts/iptables-saveipfailover/keepalived/lib/failover-functions.sh
💤 Files with no reviewable changes (6)
- ipfailover/keepalived/iptables-scripts/iptables
- ipfailover/keepalived/iptables-scripts/iptables-restore
- ipfailover/keepalived/iptables-scripts/ip6tables-restore
- ipfailover/keepalived/iptables-scripts/iptables-save
- ipfailover/keepalived/iptables-scripts/ip6tables
- ipfailover/keepalived/iptables-scripts/ip6tables-save
iptables is deprecated in RHEL 9 and the ip_tables kernel module is removed in RHEL 10. This replaces all iptables usage in the ipfailover component with native nft commands. Changes: - Remove iptables-scripts/ chroot wrappers (no longer needed since nft runs directly in the container) - Install nftables package in Dockerfile - Replace iptables calls in failover-functions.sh with nft equivalents that create a dedicated "inet keepalived" table for the multicast accept rule (224.0.0.18) - Preserve OPENSHIFT_HA_IPTABLES_CHAIN env var for backward compat Assisted with Claude
b3b419e to
bd18a29
Compare
|
@coderabbitai can you do a quick review is all good? |
🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== README.md diff area ==="
cat README.md | head -30
echo ""
echo "=== Dockerfile ==="
cat ipfailover/keepalived/Dockerfile
echo ""
echo "=== failover-functions.sh ==="
cat ipfailover/keepalived/lib/failover-functions.sh
echo ""
echo "=== settings.sh ==="
cat ipfailover/keepalived/conf/settings.sh
echo ""
echo "=== iptables-scripts dir (should be gone or empty) ==="
ls ipfailover/keepalived/iptables-scripts/ 2>/dev/null || echo "Directory removed or empty"Length of output: 8651 🏁 Script executed: #!/bin/bash
echo "=== Full README.md ==="
cat README.md
echo ""
echo "=== Check nft heredoc indentation (tabs vs spaces) ==="
cat -A ipfailover/keepalived/lib/failover-functions.sh | grep -A5 'NFT'
echo ""
echo "=== Check if nftables listed twice in Dockerfile ==="
grep -c 'nftables' ipfailover/keepalived/Dockerfile
echo ""
echo "=== Check any remaining references to iptables in shell scripts ==="
rg -n 'iptables' ipfailover/keepalived/ --type shLength of output: 1675
✅ What looks good
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ipfailover/keepalived/lib/failover-functions.sh`:
- Around line 62-75: The heredoc invocation of nft (-f -) inside the
HA_IPTABLES_CHAIN block may fail silently; update the block in
failover-functions.sh to capture the exit status of the nft -f - operation
(after invoking nft with the provided heredoc for table/chain creation), and if
it fails (non-zero exit code) log a clear error including the attempted rule
("ip daddr 224.0.0.18 accept" / "nft -f -") via echo or process logger and exit
non-zero (or return failure) so setup does not continue without the multicast
accept rule; also consider checking nft availability before attempting the
heredoc (e.g., via command -v nft) to produce a more specific error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7499d2f7-b512-4081-bb8b-b9f59783b1d8
📒 Files selected for processing (9)
ipfailover/keepalived/Dockerfileipfailover/keepalived/conf/settings.shipfailover/keepalived/iptables-scripts/ip6tablesipfailover/keepalived/iptables-scripts/ip6tables-restoreipfailover/keepalived/iptables-scripts/ip6tables-saveipfailover/keepalived/iptables-scripts/iptablesipfailover/keepalived/iptables-scripts/iptables-restoreipfailover/keepalived/iptables-scripts/iptables-saveipfailover/keepalived/lib/failover-functions.sh
💤 Files with no reviewable changes (6)
- ipfailover/keepalived/iptables-scripts/ip6tables-save
- ipfailover/keepalived/iptables-scripts/ip6tables-restore
- ipfailover/keepalived/iptables-scripts/iptables-restore
- ipfailover/keepalived/iptables-scripts/iptables-save
- ipfailover/keepalived/iptables-scripts/iptables
- ipfailover/keepalived/iptables-scripts/ip6tables
✅ Files skipped from review due to trivial changes (1)
- ipfailover/keepalived/conf/settings.sh
There was a problem hiding this comment.
♻️ Duplicate comments (1)
ipfailover/keepalived/lib/failover-functions.sh (1)
62-74:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake nft setup idempotent for partial state and fail fast when nft apply fails.
If
inet keepalivedalready exists but is incomplete,nft -f -can fail on table redefinition and setup still proceeds without guaranteeingip daddr 224.0.0.18 accept.Suggested fix
if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then echo " - Ensuring nft rule for keepalived multicast (224.0.0.18) ..." - if ! nft list chain inet keepalived filter 2>/dev/null | grep -q 'ip daddr 224.0.0.18 accept' ; then - echo " - Adding nft rule to accept multicast 224.0.0.18." - nft -f - <<-'NFT' - table inet keepalived { - chain filter { - type filter hook input priority 0; policy accept; - ip daddr 224.0.0.18 accept - } - } - NFT - fi + nft list table inet keepalived >/dev/null 2>&1 || \ + nft add table inet keepalived || { echo "ERROR: failed to create nft table inet keepalived"; return 1; } + + nft list chain inet keepalived filter >/dev/null 2>&1 || \ + nft add chain inet keepalived filter '{ type filter hook input priority 0; policy accept; }' || { echo "ERROR: failed to create nft chain inet keepalived filter"; return 1; } + + if ! nft list chain inet keepalived filter 2>/dev/null | grep -q 'ip daddr 224.0.0.18 accept' ; then + echo " - Adding nft rule to accept multicast 224.0.0.18." + nft add rule inet keepalived filter ip daddr 224.0.0.18 accept || { echo "ERROR: failed to add nft rule ip daddr 224.0.0.18 accept"; return 1; } + fi fi#!/bin/bash set -euo pipefail file="$(fd --full-path 'ipfailover/keepalived/lib/failover-functions.sh$' | head -n1)" python - <<'PY' import re, pathlib p = pathlib.Path("ipfailover/keepalived/lib/failover-functions.sh") t = p.read_text() m = re.search(r'if \[\[ -n "\$\{HA_IPTABLES_CHAIN:-\}" \]\]; then(.*?)\n\s*fi', t, re.S) block = m.group(1) if m else "" print("has_heredoc_apply:", "nft -f -" in block) print("has_table_guard:", "nft list table inet keepalived" in block) print("has_failfast_return:", "return 1" in block) PYExpected confirmation for current code: heredoc apply is present, while explicit table/chain guards and fail-fast return in this block are absent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ipfailover/keepalived/lib/failover-functions.sh` around lines 62 - 74, The current HA_IPTABLES_CHAIN block uses a heredoc with "nft -f -" without guarding for partial existing state and doesn't fail fast; update the block to first probe the existing state with "nft list table inet keepalived" and/or "nft list chain inet keepalived filter" and only run the heredoc to create missing pieces (or run targeted "nft add table"/"nft add chain"/"nft add rule" commands) when those probes show resources are absent, and capture the exit status of the nft apply (the heredoc or the targeted commands) and immediately return 1 (or exit non-zero) on failure so the caller sees the error; keep these checks and the fail-fast return inside the same HA_IPTABLES_CHAIN conditional so the behavior is idempotent and errors are propagated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@ipfailover/keepalived/lib/failover-functions.sh`:
- Around line 62-74: The current HA_IPTABLES_CHAIN block uses a heredoc with
"nft -f -" without guarding for partial existing state and doesn't fail fast;
update the block to first probe the existing state with "nft list table inet
keepalived" and/or "nft list chain inet keepalived filter" and only run the
heredoc to create missing pieces (or run targeted "nft add table"/"nft add
chain"/"nft add rule" commands) when those probes show resources are absent, and
capture the exit status of the nft apply (the heredoc or the targeted commands)
and immediately return 1 (or exit non-zero) on failure so the caller sees the
error; keep these checks and the fail-fast return inside the same
HA_IPTABLES_CHAIN conditional so the behavior is idempotent and errors are
propagated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 33de2090-fe7b-49e1-a8ea-9918ab0cf1d6
📒 Files selected for processing (9)
ipfailover/keepalived/Dockerfileipfailover/keepalived/conf/settings.shipfailover/keepalived/iptables-scripts/ip6tablesipfailover/keepalived/iptables-scripts/ip6tables-restoreipfailover/keepalived/iptables-scripts/ip6tables-saveipfailover/keepalived/iptables-scripts/iptablesipfailover/keepalived/iptables-scripts/iptables-restoreipfailover/keepalived/iptables-scripts/iptables-saveipfailover/keepalived/lib/failover-functions.sh
💤 Files with no reviewable changes (6)
- ipfailover/keepalived/iptables-scripts/iptables-save
- ipfailover/keepalived/iptables-scripts/ip6tables
- ipfailover/keepalived/iptables-scripts/ip6tables-restore
- ipfailover/keepalived/iptables-scripts/iptables-restore
- ipfailover/keepalived/iptables-scripts/ip6tables-save
- ipfailover/keepalived/iptables-scripts/iptables
|
/retest |
|
@Thealisyed: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/assign |
| if iptables -C ${chain} 1 -d 224.0.0.18/32 -j ACCEPT ; then | ||
| echo " - Removing keepalived multicast iptables rule ..." | ||
| iptables -D ${chain} 1 -d 224.0.0.18/32 -j ACCEPT | ||
| if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then |
There was a problem hiding this comment.
hum, this change is weird. I understand we don't rely on the chain name anymore, but in that case what is passed from the caller (settings.sh) to this function, and what it will receive? Can this be used? Should the variable name be something else?
Just some thinking if this will properly work
There was a problem hiding this comment.
settings.sh passes a chain name like "INPUT" via OPENSHIFT_HA_IPTABLES_CHAIN. The old code used it directly but nft doesn't insert into existing chains, it creates its own isolated table (inet keepalived hooked via type/hook/priority).
So the chain name value is architecturally irrelevant to nft, only non emptiness matters. The variable name is kept for backward compatibility with existing deployments.
What do you think?
| iptables -I ${chain} 1 -d 224.0.0.18/32 -j ACCEPT | ||
| if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then | ||
| echo " - Ensuring nft rule for keepalived multicast (224.0.0.18) ..." | ||
| if ! nft list chain inet keepalived filter 2>/dev/null | grep -q 'ip daddr 224.0.0.18 accept' ; then |
There was a problem hiding this comment.
Same behavior as the old code iptables -C / iptables -I had no error handling either. The chain policy is accept, so the multicast rule is defense in depth, not load-bearing.
Adding here would make setup fatal where it was previously best effort, which would be a behavior change?
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rikatz, Thealisyed The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Verified using cluster bot create OSP cluster. No issue with the existing test case. /verified by @mjoseph |
|
@melvinjoseph86: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| FROM registry.ci.openshift.org/ocp/4.18:base-rhel9 | ||
|
|
||
| RUN INSTALL_PKGS="kmod keepalived iproute psmisc nmap-ncat net-tools ipset ipset-libs procps-ng" && \ | ||
| RUN INSTALL_PKGS="kmod keepalived iproute psmisc nmap-ncat net-tools ipset ipset-libs procps-ng nftables" && \ |
There was a problem hiding this comment.
It's worth documenting why we didn't want to add iptables pkg to the Dockerfile, but it's okay to add nftables pkg.
I read that "standard containers lack the kernel authority to alter routing and firewall namespaces. You must grant the container explicit networking capabilities when launching iptables", so if that is why we did the iptables scripts instead of instaling the pkg, we may need to do the same for nftables.
There was a problem hiding this comment.
The old code didn't ship the binary for iptables. Instead it used chroot wrapper scripts to borrow the host's iptables binary (here)
We can install nftables directly in the container because it communicates with the kernel via netlink sockets therefore the binary version doesn't need to match the host kernel. The old chroot pattern existed because iptables had tighter coupling between the userspace binary and the host's kernel module.
No new privileges are needed as the container already runs with --privileged=true --net=host which covers what nft needs.
There was a problem hiding this comment.
Please document your tests that show this works on both rhel9 and rhel10. Also, if you could document that this works even when the binaries don't match, that would be definitive.
| # created to allow keepalived multicast (224.0.0.18) traffic. The actual | ||
| # chain name value is ignored; nft manages its own table/chain. | ||
| # Env var name kept for backward compatibility. | ||
| HA_IPTABLES_CHAIN="${OPENSHIFT_HA_IPTABLES_CHAIN:-""}" |
There was a problem hiding this comment.
Consider using both an old and new env var name, so we can eventually deprecate the old one. Definitely rename our local variable.
There was a problem hiding this comment.
Good idea. I'll add a new env var OPENSHIFT_HA_NFTABLES_RULE (or do you prefer a different name?) and fall back to the old one, something like:
HA_MULTICAST_ACCEPT="${OPENSHIFT_HA_NFTABLES_RULE:-${OPENSHIFT_HA_IPTABLES_CHAIN:-""}}"
This way the old env var still works, new deployments can use the new name and the local variable reflects what it actually controls.
| iptables -D ${chain} 1 -d 224.0.0.18/32 -j ACCEPT | ||
| if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then | ||
| if nft list table inet keepalived > /dev/null 2>&1 ; then | ||
| echo " - Removing keepalived multicast nft rules ..." |
There was a problem hiding this comment.
nit
| echo " - Removing keepalived multicast nft rules ..." | |
| echo " - Removing keepalived multicast nftables rules ..." |
| if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then | ||
| if nft list table inet keepalived > /dev/null 2>&1 ; then | ||
| echo " - Removing keepalived multicast nft rules ..." | ||
| nft delete table inet keepalived |
There was a problem hiding this comment.
We were just removing a rule before, but now we delete the whole table?
There was a problem hiding this comment.
Yes
With iptables we were removing a single rule from a shared chain that other things also used, so we had to delete just our rule. With nft, we own the entire inet keepalived table and nothing else uses it.
Deleting the whole table is the clean equivalent and safer since it leaves no rules behind.
There was a problem hiding this comment.
But it takes some time to recreate the table, and that could introduce race conditions and possible transient security issues.
| # Add the rule to the beginning of the chain. | ||
| echo " - adding iptables rule to $chain to access 224.0.0.18." | ||
| iptables -I ${chain} 1 -d 224.0.0.18/32 -j ACCEPT | ||
| if [[ -n "${HA_IPTABLES_CHAIN:-}" ]]; then |
There was a problem hiding this comment.
Please add comments to explain what's going on. The more the better.
Summary
iptables-scripts/chroot wrappers (no longer needed since nft runs directly in the container)nftablespackage in container imageinet keepalivednft table for the VRRP multicast accept rule (224.0.0.18)OPENSHIFT_HA_IPTABLES_CHAINenv var for backward compatibilityWhy
iptables is deprecated in RHEL 9 and the
ip_tableskernel module is removed in RHEL 10. Theiptables-nftuserspace shim remains but is deprecated. This migrates to native nft before the shim is removed entirely.Summary by CodeRabbit