Skip to content

autoddl: keep ALTER TABLE from yanking tables out of custom repsets#490

Open
rasifr wants to merge 4 commits into
mainfrom
task/SPOC-539/autoddl-repset-stickiness
Open

autoddl: keep ALTER TABLE from yanking tables out of custom repsets#490
rasifr wants to merge 4 commits into
mainfrom
task/SPOC-539/autoddl-repset-stickiness

Conversation

@rasifr
Copy link
Copy Markdown
Member

@rasifr rasifr commented Jun 3, 2026

With autoDDL and spock.include_ddl_repset both enabled, any ALTER TABLE
on a table that lived in a user-defined replication set was being
forcibly re-routed back into 'default' (or 'default_insert_only'),
silently undoing the operator's repset placement.

Apply ALTER stickiness rules in apply_repset_policy_for_reloid():

  - If the ALTER did not add or drop a PK / replica identity, leave
    repset membership entirely alone.
  - If the ALTER added a PK on a table in a custom repset, emit a
    NOTICE and leave membership alone. The operator can move it
    manually if they want full UPDATE/DELETE replication.
  - If the ALTER dropped the PK on a table in a custom repset that
    replicates UPDATE/DELETE, emit a WARNING and move it to
    default_insert_only. Without a replica identity, that custom
    repset would otherwise produce broken downstream replication.
  - Tables in the default repsets continue to be auto-re-routed as
    before.

Split the monolithic add_ddl_to_repset() into three pieces with a clear
single responsibility each:

  - add_ddl_to_repset() is now a thin orchestrator: GUC guard, extract
    target, open relation, expand partitions, apply policy per inheritor.
  - extract_ddl_target() owns parse-tree dispatch (ALTER TABLE / OBJECT_INDEX
    ATTACH PARTITION / CREATE / CREATE TABLE AS / CREATE SCHEMA / EXPLAIN)
    including the recursive cases.
  - apply_repset_policy_for_reloid() owns the per-relation policy:
    UNLOGGED skip and default vs default_insert_only routing.

Pure structural change; behavior is preserved across all DDL shapes.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR refactors Spock's autoDDL repset routing into a pipeline: extract_ddl_target() identifies table targets and stickiness intent, classify_alter_pkri_change() and classify_repset_membership() detect PK/RI transitions and current membership, and apply_repset_policy_for_reloid() routes tables with stickiness and safety-net fallbacks.

Changes

PK/RI-based repset stickiness for autoDDL

Layer / File(s) Summary
Entry point and target extraction refactor
src/spock_autoddl.c
Introduces AlterPkRiChange enum for PK/RI state tracking and refactors add_ddl_to_repset() to delegate parse-tree dispatch and target identification to extract_ddl_target(), which centralizes handling of ALTER TABLE, ALTER INDEX ... ATTACH PARTITION, CREATE/CREATE TABLE AS, CREATE SCHEMA recursion, and EXPLAIN ANALYZE unwrapping, returning targets with stickiness flags.
Repset routing policy with stickiness and fallbacks
src/spock_autoddl.c
Adds apply_repset_policy_for_reloid() implementing the post-ALTER routing decision tree: evicts UNLOGGED/TEMP relations, materializes index lists for post-ALTER PK/RI detection, applies ALTER-table stickiness via classification helpers, emits WARNING/NOTICE for unsafe PK/RI transitions, falls back to default_insert_only when necessary, and assigns tables to default or default_insert_only.
PK/RI change and membership classification
src/spock_autoddl.c
Adds classify_alter_pkri_change() to determine whether an ALTER TABLE added, dropped, or left unchanged a table's primary key or replica identity by scanning AlterTableCmd intent alongside post-state indexes; and classify_repset_membership() to detect participation in UPD/DEL-capable and user-defined repsets.
Test suite for repset stickiness scenarios
tests/tap/schedule, tests/tap/t/030_autoddl_repset_stickiness.pl
Adds TAP 030_autoddl_repset_stickiness (scheduled) and a test script that creates two custom repsets and validates ten scenarios (T1–T10): non-PK ALTER preserves membership; PK drops move default tables to default_insert_only; dropping PK in custom UPDATE/DELETE repsets emits WARNING and falls back to default_insert_only; insert-only repsets remain sticky across PK changes; PK additions emit NOTICE and promote insert-only to default where appropriate; and mixed-membership cases and inline-PK ADD are covered.

Poem

🐰 A rabbit scribbles in the log,
Tables hop safe through PK fog.
If keys fall out or keys arise,
Rep sets shift where caution lies.
Stickiness held, warnings shared—replication snug and charged.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title precisely captures the main change: preventing ALTER TABLE from removing tables from custom replication sets, which is the core objective.
Description check ✅ Passed The description clearly explains the problem (ALTER TABLE was re-routing tables from custom repsets to default ones), the solution (apply stickiness rules), and the specific behaviors for different ALTER scenarios.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch task/SPOC-539/autoddl-repset-stickiness

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)
src/spock_autoddl.c

src/spock_autoddl.c:12:10: fatal error: 'postgres.h' file not found
12 | #include "postgres.h"
| ^~~~~~~~~~~~
1 error generated.
Error: the following clang command did not run successfully:
/opt/infer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/bin/clang-18
@/tmp/coderabbit-infer/4e56d36273d515128e30aca0121bfa0819c412a6-a44352798d8c4c01/tmp/clang_command_.tmp.425570.txt
++Contents of '/tmp/coderabbit-infer/4e56d36273d515128e30aca0121bfa0819c412a6-a44352798d8c4c01/tmp/clang_command_.tmp.425570.txt':
"-cc1" "-load"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../../facebook-clang-plugins/libtooling/build/FacebookClangPlugin.dylib"
"-add-plugin" "BiniouASTExporter" "-plugin-arg-BiniouASTExporter" "-"
"-plugin-arg-BiniouASTExporter" "PREPEND_CURRENT_DIR=1"
"-plugin-arg-BiniouASTExporter" "MAX_STRING_SIZE=65535" "-cc1" "-triple"
"x86_64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "-disable-free"
"-clear-a

... [truncated 689 characters] ...

opt/infer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include"
"-internal-isystem" "/usr/local/include" "-internal-isystem"
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
"-internal-externc-isystem" "/usr/include/x86_64-linux-gnu"
"-internal-externc-isystem" "/include" "-internal-externc-isystem"
"/usr/include" "-Wno-ignored-optimization-argument" "-Wno-everything"
"-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fskip-odr-check-in-gmf"
"-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o"
"/tmp/coderabbit-infer/a44352798d8c4c01/file.o" "-x" "c"
"src/spock_autoddl.c" "-O0" "-fno-builtin" "-include"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../lib/clang_wrappers/global_defines.h"
"-Wno-everything"


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Jun 3, 2026

Up to standards ✅

🟢 Issues 6 medium

Results:
6 new issues

Category Results
Complexity 6 medium

View in Codacy

🟢 Metrics 0 duplication

Metric Results
Duplication 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/spock_autoddl.c`:
- Around line 391-416: classify_repset_membership currently collapses "in
default" and "in any custom" into only_in_default, which breaks mixed-membership
handling; change the membership detection to produce two distinct booleans
(e.g., in_default and in_any_custom) from classify_repset_membership (or call it
differently) and update the logic around pkchange: replace the top-level if
(!only_in_default) check with logic that only returns early (sticky custom
behavior) when the table is not in_default and is in_any_custom (i.e., if
(!in_default && in_any_custom) { /* existing sticky path: if pkchange ==
PKRI_DROPPED && in_upd_del_custom ... else notice and return */ }), otherwise
allow the default-managed reroute behavior to proceed (handling PKRI_DROPPED to
move to DEFAULT_INSONLY_REPSET_NAME and PKRI_ADDED notices with
DEFAULT_REPSET_NAME), keeping existing references to pkchange,
in_upd_del_custom, DEFAULT_INSONLY_REPSET_NAME, DEFAULT_REPSET_NAME,
table_close, and targetrel.

In `@tests/tap/t/030_autoddl_repset_stickiness.pl`:
- Line 3: The TAP plan at the top uses a hardcoded count ("use Test::More tests
=> 20;") but the script actually emits 16 assertions; update the test plan to
match the real count or switch to a dynamic plan: replace the fixed "use
Test::More tests => 20;" declaration with either the correct numeric count (16)
or call done_testing() at the end of the script so the harness no longer
over-counts; locate and update the "use Test::More" line in the test file
(tests/tap/t/030_autoddl_repset_stickiness.pl) accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 31820899-a103-43e7-a20a-ee753fcd6e7a

📥 Commits

Reviewing files that changed from the base of the PR and between 94f4ad6 and 6c770bf.

📒 Files selected for processing (3)
  • src/spock_autoddl.c
  • tests/tap/schedule
  • tests/tap/t/030_autoddl_repset_stickiness.pl

Comment thread src/spock_autoddl.c
@@ -0,0 +1,135 @@
use strict;
use warnings;
use Test::More tests => 20;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

The TAP plan is over-counted and will fail this test file.

This script runs 16 assertions, not 20, so the harness will report a bad plan even when the behavior is correct. Please fix the count or switch to done_testing().

Suggested fix
-use Test::More tests => 20;
+use Test::More tests => 16;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
use Test::More tests => 20;
use Test::More tests => 16;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/tap/t/030_autoddl_repset_stickiness.pl` at line 3, The TAP plan at the
top uses a hardcoded count ("use Test::More tests => 20;") but the script
actually emits 16 assertions; update the test plan to match the real count or
switch to a dynamic plan: replace the fixed "use Test::More tests => 20;"
declaration with either the correct numeric count (16) or call done_testing() at
the end of the script so the harness no longer over-counts; locate and update
the "use Test::More" line in the test file
(tests/tap/t/030_autoddl_repset_stickiness.pl) accordingly.

rasifr and others added 3 commits June 4, 2026 18:38
With autoDDL and spock.include_ddl_repset both enabled, any ALTER TABLE
on a table that lived in a user-defined replication set was being
forcibly re-routed back into 'default' (or 'default_insert_only'),
silently undoing the operator's repset placement.

Apply ALTER stickiness rules in apply_repset_policy_for_reloid():

  - If the ALTER did not add or drop a PK / replica identity, leave
    repset membership entirely alone.
  - If the ALTER added a PK on a table in a custom repset, emit a
    NOTICE and leave membership alone. The operator can move it
    manually if they want full UPDATE/DELETE replication.
  - If the ALTER dropped the PK on a table in a custom repset that
    replicates UPDATE/DELETE, emit a WARNING and move it to
    default_insert_only. Without a replica identity, that custom
    repset would otherwise produce broken downstream replication.
  - Tables in the default repsets continue to be auto-re-routed as
    before.
Single-node TAP test that exercises the seven scenarios autoDDL repset
routing has to get right when spock.include_ddl_repset is enabled:

  T1: unrelated ALTER on a table in a custom repset is sticky
  T2: unrelated ALTER on a default-repset table does not churn
  T3: dropping PK on a default-repset table moves it to insert-only
  T4: dropping PK in a custom UPDATE/DELETE repset emits WARNING and
      falls back to default_insert_only
  T5: dropping PK in a custom insert-only repset is sticky
  T6: adding PK to a table in a custom repset emits NOTICE and leaves
      membership unchanged
  T7: CREATE / unrelated ALTER on no-PK tables still land in
      default_insert_only

T4 and T6 also assert on the exact NOTICE / WARNING text to lock the
operator-facing messages in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
classify_alter_pkri_change() walks atstmt->cmds looking for
AT_AddConstraint / AT_AddIndexConstraint to identify a PK add. But
"ALTER TABLE ... ADD COLUMN x int PRIMARY KEY" arrives as a single
AT_AddColumn command with the PRIMARY KEY embedded in the ColumnDef's
inline constraints list, not as a separate AT_AddConstraint. We were
missing that form: post-state had a PK, has_add_pkri_cmd stayed false,
and the classifier returned PKRI_UNCHANGED — so the stickiness check
short-circuited and the table never moved from default_insert_only
into default after the inline PK arrived.

Add an AT_AddColumn case that inspects ColumnDef->constraints for an
inline CONSTR_PRIMARY. Verified against the reproducer:

  CREATE TABLE t1(a int primary key, b int);
  SELECT spock.repset_add_table('mt', 't1');
  ALTER TABLE t1 DROP COLUMN a;        -- safety net: -> default_insert_only
  ALTER TABLE t1 ADD COLUMN a int PRIMARY KEY;  -- now correctly -> default

Add TAP scenario T8 to lock the fix in place: a no-PK table in
default_insert_only is promoted to default after ADD COLUMN ... PRIMARY
KEY. This test fails without the AT_AddColumn case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rasifr rasifr force-pushed the task/SPOC-539/autoddl-repset-stickiness branch from 6c770bf to 4e56d36 Compare June 4, 2026 14:29
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/tap/t/030_autoddl_repset_stickiness.pl`:
- Line 3: The TAP plan in the test file is wrong: update the Test::More plan
declaration (the line using "use Test::More tests => 27;") to match the actual
number of assertions (change 27 to 23) or replace it with "use Test::More tests
=> 23;" (alternatively use "use Test::More 'no_plan';" if you prefer not to fix
a numeric plan); ensure the plan value matches the total assertions across
T1–T10 so the TAP harness doesn't report a bad plan.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8a036fd8-52b1-4bed-9934-c499a1f5386c

📥 Commits

Reviewing files that changed from the base of the PR and between 6c770bf and 4e56d36.

📒 Files selected for processing (3)
  • src/spock_autoddl.c
  • tests/tap/schedule
  • tests/tap/t/030_autoddl_repset_stickiness.pl
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/tap/schedule
  • src/spock_autoddl.c

@@ -0,0 +1,180 @@
use strict;
use warnings;
use Test::More tests => 27;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

The TAP plan count is incorrect and will cause test failure.

The plan declares 27 tests, but the file contains only 23 assertions (2+2+2+3+2+3+2+3+2+2 across T1–T10). The TAP harness will report a bad plan even when all assertions pass.

🔧 Proposed fix
-use Test::More tests => 27;
+use Test::More tests => 23;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/tap/t/030_autoddl_repset_stickiness.pl` at line 3, The TAP plan in the
test file is wrong: update the Test::More plan declaration (the line using "use
Test::More tests => 27;") to match the actual number of assertions (change 27 to
23) or replace it with "use Test::More tests => 23;" (alternatively use "use
Test::More 'no_plan';" if you prefer not to fix a numeric plan); ensure the plan
value matches the total assertions across T1–T10 so the TAP harness doesn't
report a bad plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant