Skip to content

Fix race condition in slug generation#3

Merged
rameerez merged 4 commits into
mainfrom
fix/slug-race-condition-minimal
Feb 19, 2026
Merged

Fix race condition in slug generation#3
rameerez merged 4 commits into
mainfrom
fix/slug-race-condition-minimal

Conversation

@rameerez
Copy link
Copy Markdown
Owner

Summary

Fixes the race condition that occurs when two processes create records with the same base slug simultaneously. This is a minimal, focused fix that replaces the over-engineered PR #2.

The Problem

When two processes race:

  1. Process A: validates slug uniqueness (passes)
  2. Process B: validates slug uniqueness (passes)
  3. Process A: INSERT succeeds
  4. Process B: INSERT fails with RecordNotUnique

The Fix

  • Retry on collision: When RecordNotUnique is raised for a slug column, regenerate with a new random suffix and retry
  • PostgreSQL-safe savepoints: Use transaction(requires_new: true) to prevent transaction abort on retry
  • Dual-path handling: Support both NOT NULL columns (around_create) and nullable columns (after_create)
  • Fail-fast on exhaustion: Raise after 10 retries instead of falling back to timestamps

Changes

Metric PR #2 This PR
Lines added 1,280 ~170
New test files 9 1
New methods 7 5

Files Changed

  • lib/slugifiable/model.rb - Add retry logic with savepoints
  • test/slugifiable/race_condition_test.rb - New focused test file
  • test/test_helper.rb - Add StrictSlugModel fixture
  • CHANGELOG.md - Document changes and known limitations
  • README.md - Add NOT NULL slug column setup guide

Known Limitations

Documented in CHANGELOG:

  • before_create callbacks re-execute on retry (design callbacks to be idempotent)
  • Only DB-level RecordNotUnique triggers retry, not validation-level errors
  • Custom index names that don't contain "slug" or "_on_slug" won't trigger retry

Closes #2


🤖 Generated with Claude Code

When two processes create records with the same base slug simultaneously,
the second process now retries with a new random suffix instead of crashing.

Changes:
- Add around_create retry for NOT NULL slug columns (pre-INSERT collision)
- Add with_slug_retry helper with PostgreSQL-safe savepoints (requires_new: true)
- Add slug_column_not_null? optimization to skip overhead for nullable columns
- Add slug_unique_violation? detection for SQLite/PostgreSQL/MySQL
- Update set_slug to retry with savepoints on RecordNotUnique
- Use non-bang save in update_slug_if_nil (after_find path)

Closes #2 - simplified implementation replacing over-engineered PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Feb 19, 2026

Code Review

Overall this is a focused, well-structured fix for the TOCTOU race condition. The dual-path approach (around_create for NOT NULL, after_create for nullable) is clean, the CHANGELOG/README are thorough, and the savepoint handling is necessary and correct. A few issues worth addressing before merging:


Issues

1. The NOT NULL retry path isn't actually tested (medium)

test_strict_model_sequential_creates_get_unique_slugs exercises generate_unique_slug's EXISTS? loop, not the RecordNotUnique retry path. No test actually forces a DB-level collision on the NOT NULL path to confirm around_create retry works end-to-end.

2. around_create continuation called multiple times (medium, needs verification)

In with_slug_retry, retry re-executes transaction(requires_new: true) { yield }, which calls the around_create continuation block again. Whether Rails allows an around callback's continuation proc to be invoked multiple times is not guaranteed across Rails versions — this is worth a targeted test or a comment referencing Rails version behaviour. If the continuation is stateful, the second call could silently no-op or raise.

3. save\! in set_slug is an undocumented breaking change (medium)

Old code used self.save (non-bang); the new code uses save\! inside a savepoint. Any gem user whose model has validation errors at slug-save time (uncommon, but possible) will now get an unhandled exception where they previously got a silent failure. The CHANGELOG mentions the savesave\! change in update_slug_if_nil but not in set_slug.

4. Duplicate retry logic — set_slug vs with_slug_retry (minor)

set_slug has an inline retry loop that's structurally identical to with_slug_retry but without the raise if persisted? guard. The reason it can't reuse with_slug_retry (that guard would always fire in an after_create context) should be documented with a comment; otherwise a future maintainer will see the duplication and try to consolidate, breaking the nullable path.

# Can't use with_slug_retry here: record is already persisted (after_create),
# so the `raise if persisted?` guard in with_slug_retry would always fire.
def set_slug
  ...
end

5. Brittle source-code inspection tests (minor)

test_after_find_uses_non_bang_save and test_with_slug_retry_uses_savepoint read the .rb file and assert specific string literals including comments:

assert_includes method_content, "save # Non-bang"

Any comment rewording or whitespace change silently invalidates these. Prefer behavioural tests — e.g., mock save\! to raise and assert update_slug_if_nil doesn't propagate the error.

6. self.slug = nil before retry in set_slug is redundant (minor)

rescue ActiveRecord::RecordNotUnique => e
  ...
  self.slug = nil   # overwritten immediately on the next iteration
  retry             # jumps to: self.slug = compute_slug_for_retry
end

The nil assignment is dead — the first line of the begin block overwrites it. Remove or replace with a comment explaining the intent.

7. reset_strict_slug_model\! is dead code (minor)

Defined in SlugifiableTestHelper (test_helper.rb:130) but never called from anywhere; StrictSlugModel.delete_all is called directly in setup/teardown instead. Either wire it up or remove it.

8. Compound slug indexes not detected in MySQL/SQLite (minor, known gap)

slug_unique_violation? with pattern \bslug\b|_on_slug\b won't match index_products_on_slug_and_tenant_id in MySQL/SQLite (error messages use the index name, not column names). Worth adding to the Known Limitations list in the CHANGELOG/README.


Positives

  • Removing id_changed? from set_slug is correct — in an after_create callback it's always true, so slug.blank? was always the real guard.
  • The \bslug\b word-boundary regex correctly rejects false positives like slugged_items.
  • compute_slug_for_retry as a protected override point is a clean extension mechanism.
  • PostgreSQL savepoint handling (requires_new: true) is necessary and properly placed.
  • Documentation (CHANGELOG, README dual-path guide) is genuinely helpful.

rameerez and others added 3 commits February 19, 2026 20:26
Changes from Codex review:
- Remove compute_slug_for_retry in favor of direct compute_slug calls
- Add for_insert: parameter to with_slug_retry for path distinction
- Make with_slug_retry yield attempt count for caller decision-making
- Simplify set_slug to use shared with_slug_retry helper
- Fix boundary: == to >= for MAX_SLUG_GENERATION_ATTEMPTS check
- Update tests to be more behavioral and less source-inspection
- Add additional README warnings about retry behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Slug is already regenerated at the call site in retry_create_on_slug_unique_violation
when attempts.positive?, so recomputing it again in the helper was redundant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Feb 19, 2026

PR Review: Fix race condition in slug generation

Overall this is a solid, well-scoped fix. The approach—retry on RecordNotUnique with savepoints—is correct for this problem, and the implementation is substantially leaner than its predecessor. The dual-path design (around_create for NOT NULL, after_create for nullable) handles the two failure modes cleanly.

Below are the issues I found, ordered by severity.


Bugs / Correctness

1. set_slug removes id_changed? condition silently (but in a good way)

# Before
def set_slug
  return unless slug_persisted?
  self.slug = compute_slug if id_changed? || slug.blank?
  self.save
end

# After
def set_slug
  return unless slug_persisted?
  return unless slug.blank?
  # ...
end

For NOT NULL columns, ensure_slug_present sets the slug before INSERT. The old code then re-ran set_slug on after_create (because id_changed? is true) and overwrote the slug with a second compute_slug call—generating the slug twice per record. Removing id_changed? fixes this silent double-slug-generation. Worth calling out explicitly in the CHANGELOG since this is a subtle behavior fix on top of the race condition fix.

2. update_slug_if_nil no longer retries on collision

Before, update_slug_if_nil called set_slug, which now has retry logic. After this PR, update_slug_if_nil computes a slug and calls save (non-bang, no retry). A collision in the legacy-record repair path will silently fail rather than retry. Given this path handles records that escaped with nil slugs, silent failure on collision may be acceptable—but it's inconsistent with the retry behavior everywhere else and should be documented.


Design Concerns

3. Reliance on around_create yield being re-callable (highest risk)

def retry_create_on_slug_unique_violation
  with_slug_retry(for_insert: true) do |attempts|
    self.slug = compute_slug if attempts.positive?
    yield  # <-- re-invoked on retry
  end
end

Calling yield multiple times inside an around_create is not a documented Rails contract. It works in tests against SQLite today, but Rails' callback internals could change without warning. The CHANGELOG already acknowledges this, which is good. Consider adding a brief comment in the code itself pointing at the limitation and the Rails version range it was tested against—it'll make future maintenance easier.

4. save! in set_slug is a breaking change in error surfacing

# Before: non-bang, failure swallowed
self.save

# After: bang, raises ActiveRecord::RecordInvalid on other validation failures
save!

If a model has other validations that can fail after create (e.g., cross-model validations that depend on state), this will now raise where before it was silent. Probably the right call—failing loudly is better—but it's a behavior change for anyone relying on after_create :set_slug swallowing validation errors. Worth a CHANGELOG entry.

5. retry_create_on_slug_unique_violation is always registered

included do
  around_create :retry_create_on_slug_unique_violation  # runs for ALL models
  # ...
end

The method returns immediately for nullable columns (return yield unless slug_persisted? && slug_column_not_null?), so the overhead is minimal. But it does mean every create call on every slugifiable model pays for one slug_column_not_null? schema lookup (which hits columns_hash—cached by AR, so cheap). Worth noting in a comment that this is intentional for simplicity over a conditional registration.


Test Coverage

6. reset_strict_slug_model! is dead code

# test_helper.rb - defined but never called
def self.reset_strict_slug_model!
  StrictSlugModel.delete_all
end

The RaceConditionTest teardown calls StrictSlugModel.delete_all directly. This helper is unreferenced. Delete it.

7. test_with_slug_retry_runs_inside_savepoint asserts too weakly

assert transaction_depths.any? { |depth| depth >= 1 }

open_transactions >= 1 is true in any test that touches the DB (SQLite starts at 1 inside minitest's implicit transaction). This test passes even if requires_new: true is removed. A stronger check would be to assert depth >= 2 (outer test transaction + the savepoint), or verify that a retry succeeds even after a simulated PG::InFailedSqlTransaction state.

8. Internal method testing via .send

Tests like test_slug_unique_violation_detection_sqlite and test_slug_column_not_null_detection call private methods directly. This couples tests to implementation details. The unit-level coverage is valuable for these specific cases, but consider adding a comment acknowledging this as an intentional trade-off for correctness guarantees on cross-DB error parsing.


Minor Nits

  • slug_unique_violation? regex: _on_slug\b is slightly redundant since \bslug\b already catches _on_slug endings (the _ before on is a word char, but the boundary before slug is between _ and s—wait, _ is \w, so no boundary there). Actually \bslug\b would NOT match index_on_slug because _ before s is \w. So _on_slug is the correct fallback for that pattern. The regex is correct; a comment explaining why both alternatives are needed would help future readers.

  • The S1/S2/S3... comment labels (# S1:, # S2:) appear to be design spec references. Either link them to the spec document or remove them—orphaned labels become confusing over time.


Summary

Approach Correct and well-scoped
PostgreSQL savepoints Handled correctly
Cross-DB violation detection Solid, tested
around_create re-yield Works but undocumented Rails behavior—highest ongoing risk
save! change Breaking in theory, correct in practice
Test coverage Good behavioral coverage; savepoint test needs strengthening
Dead code reset_strict_slug_model!

The two things I'd fix before merging: the weak savepoint assertion (7) and the dead helper (6). The around_create re-yield risk (3) is the right call given the problem being solved, but deserves an inline comment for future maintainers.

@claude
Copy link
Copy Markdown

claude Bot commented Feb 19, 2026

Code Review – PR #3: Fix race condition in slug generation

Overall this is a well-scoped, focused fix. The approach (DB-level retry with savepoints) is the correct solution for this class of race condition. The documentation and test coverage are good. Below are my findings organized by severity.


Bugs / Correctness

1. save! in set_slug is a breaking behavioral change

# Before
def set_slug
  self.slug = compute_slug if id_changed? || slug.blank?
  self.save
end

# After
with_slug_retry do |_attempts|
  self.slug = compute_slug
  save!
end

The switch from save to save! changes the error contract for the nullable column path. Any validation failure or other non-slug save error during set_slug will now raise instead of silently returning false. This is arguably better practice, but it's a breaking change for existing users whose models might have failing validations during after_create, or who rescue errors upstream. This deserves a semver bump to 0.3.0 and a prominent CHANGELOG note, not just a patch version.

2. update_slug_if_nil no longer retries slug collisions

The after_find legacy repair path now uses plain save with no retry:

def update_slug_if_nil
  return unless slug_persisted? && slug.nil?
  self.slug = compute_slug
  save
end

If two processes simultaneously load a legacy record with slug: nil, both will call compute_slug and race to save. The second will silently fail and that record's slug stays nil until the next read. This is probably acceptable for a legacy repair path, but it should be documented as a known limitation (it isn't currently).

3. test_update_slug_if_nil_uses_non_bang_save gives a false confidence

record.define_singleton_method(:save) { false }
record.send(:update_slug_if_nil)
assert record.slug.present?

The assertion passes because self.slug = compute_slug sets the in-memory attribute before save is called. The test verifies the in-memory object state, not that anything was persisted. The test is checking the right thing (that save! is not called), but the assertion name and structure are misleading—it would pass even if update_slug_if_nil never called save at all. Consider asserting record.reload.slug.nil? to confirm nothing was persisted.


Design Concerns

4. around_create wraps every create, including non-NOT-NULL models

included do
  around_create :retry_create_on_slug_unique_violation  # runs for all models
  after_create :set_slug
  ...
end

For nullable slug models (the common case), retry_create_on_slug_unique_violation immediately returns yield. This is a tiny overhead per create, but it's also a callback that appears in Model.before_create_chain when debugging, which can be confusing. Consider only registering around_create conditionally, or at least adding a clear comment above the included block explaining the early-exit behavior.

5. The NOT NULL setup requires manual user boilerplate

The gem now has two distinct setup paths, and the NOT NULL path requires users to define their own private ensure_slug_present method. If they forget or get it wrong (e.g., forget on: :create), they'll get a confusing NotNullViolation with no retry behavior. Consider whether generate_slug_based_on could detect null: false and automatically register the before_validation hook, eliminating user error.

6. with_slug_retry is called on every set_slug invocation for nullable columns

Even for models with no realistic collision risk (e.g., ID-based hex string slugs), every after_create now creates a savepoint:

with_slug_retry do |_attempts|
  self.slug = compute_slug
  save!
end

For ID-based slug strategies, compute_slug is deterministic—a retry with the same ID will generate the same slug and will hit the same collision again. The retry loop will always exhaust to MAX_SLUG_GENERATION_ATTEMPTS for this strategy. This is probably an edge case (ID-based slug collisions shouldn't happen unless there are duplicate IDs), but it's worth acknowledging.


Minor Issues

7. Magic string comments (S1, S2, S4, etc.)

The inline comments reference a design document by code (S1:, S2:, S4:, S6:, S7:):

# S1: around_create retry for NOT NULL slug columns (pre-INSERT collision)
# S2: Uses savepoint (requires_new: true) for PostgreSQL compatibility

This notation implies there's a specification document somewhere, but none is linked. These appear to have been design-phase labels that weren't cleaned up. Consider replacing them with self-explanatory prose comments, or linking to the PR/issue if they're meant for archeological purposes.

8. _on_slug regex boundary

message.match?(/\bslug\b|_on_slug\b/)

The pattern _on_slug\b has a trailing word boundary but no leading one. If a column or table happened to be named something_on_sluggable, the pattern wouldn't match (since \b after g requires a non-word char). In practice this is fine, but a consistent \b_on_slug\b or just \bslug\b would be clearer. The test test_false_positive_prevention_for_slugged_table_names covers the realistic false-positive case well.

9. slug_column_not_null? is called on each create

columns_hash is cached by ActiveRecord at the class level, so this is not a real performance concern. But slug_column_not_null? is called in both retry_create_on_slug_unique_violation and could be memoized at the class level (since null-ness of a column doesn't change at runtime) if you want to be explicit about the optimization.


Test Coverage

The test suite is good. Particular strengths:

  • test_false_positive_prevention_for_slugged_table_names correctly validates the regex boundary
  • Testing with_slug_retry raises after MAX_SLUG_GENERATION_ATTEMPTS
  • The for_insert + persisted? guard is tested

Missing tests:

  • A test verifying set_slug does NOT run retry logic for for_insert: true NOT NULL models (i.e., set_slug hits return unless slug.blank? and exits early)
  • A test for what happens when compute_slug returns the same value every retry (ID-based slugs)
  • An integration-style test that actually runs after_find on a real nil-slug record through TestModel.find to confirm update_slug_if_nil is wired correctly (the existing test_after_find_repair_sets_slug_for_nil covers this well—noting it as a strength)

Summary

Area Status
Race condition fix (core logic) Correct
Savepoint usage for PostgreSQL Correct
Version bump (0.2.1 vs 0.3.0 for breaking change) Needs discussion
savesave! behavioral change Breaking, needs documentation
update_slug_if_nil concurrency gap Acceptable but undocumented
Internal comment style (S1/S2 etc.) Cleanup needed
Test coverage Good with minor gaps

The implementation is sound. The main thing to resolve before merging is the save! behavioral change and whether it warrants a minor version bump.

@claude
Copy link
Copy Markdown

claude Bot commented Feb 19, 2026

Code Review — PR #3: Fix race condition in slug generation

Overall this is a solid, well-scoped fix. The core design is sound: savepoint-based retry catches the actual DB constraint violation rather than relying solely on the optimistic EXISTS? check. The dual-path approach (nullable vs NOT NULL) and the test coverage are good. A few things to flag:


Bug / Correctness

around_create retry also re-runs after_create callbacks

The CHANGELOG and README warn that before_create callbacks re-execute on retry. But since around_create wraps both the INSERT and the post-insert callback chain, after_create callbacks (including the gem's own set_slug) also re-run. For NOT NULL columns set_slug exits early via return unless slug.blank?, so it's harmless in practice — but users who have after_create callbacks with side effects (emails, webhooks, etc.) can be caught off guard. The docs warn about before_create but stay silent on after_create. Worth adding a note to the CHANGELOG known limitations and the README caution block.


Inconsistency

generate_unique_slug still has a timestamp fallback that contradicts the stated behaviour

The CHANGELOG says:

Retry exhaustion raises RecordNotUnique instead of falling back to timestamp suffix (fail-fast behavior)

But generate_unique_slug still has:

if attempts >= MAX_SLUG_GENERATION_ATTEMPTS
  slug_candidate = "#{base_slug}-#{Time.current.to_i}-#{SecureRandom.random_number(1000)}"
end

The comment on MAX_SLUG_GENERATION_ATTEMPTS acknowledges this ("Also used by generate_unique_slug for EXISTS? check loop (with timestamp fallback)"), so it's not a hidden state. But the CHANGELOG entry is misleading — it implies the timestamp fallback is gone everywhere when it's only gone from the with_slug_retry path. A reader who hits the generate_unique_slug timestamp-fallback path in production will be confused by the changelog. Suggest clarifying the CHANGELOG to say "the DB-level retry (with_slug_retry) raises on exhaustion" rather than the broader claim.


Test quality

test_update_slug_if_nil_uses_non_bang_save doesn't verify save is called

record.define_singleton_method(:save) { false }
record.define_singleton_method(:save!) { raise "save! should not be called from update_slug_if_nil" }

record.send(:update_slug_if_nil)

assert record.slug.present?

This test confirms save! is not called and that slug is set in-memory. However, record.slug.present? passes even if no save method was invoked at all (since self.slug = compute_slug runs before save). Adding a call counter on save would make the assertion complete.

No actual concurrent write test for the new retry paths

The new tests simulate race conditions by injecting synthetic RecordNotUnique exceptions. The existing test/slugifiable/concurrency_test.rb appears to cover real threading, but it's worth checking whether it exercises the StrictSlugModel / NOT NULL path specifically, since that's the new code path added here.


Code style

Numbered review tags (S1S7) in production code

# S1: around_create retry for NOT NULL slug columns (pre-INSERT collision)
# S2: Uses savepoint (requires_new: true) for PostgreSQL compatibility
# S4: Skips overhead when slug column is nullable ...

These look like cross-reference markers from a design document, not long-lived documentation. The numbers are non-obvious out of context (e.g. # S4 appears before # S3). Consider replacing them with plain English comments that stand on their own.


Minor

  • reset_strict_slug_model! is defined in SlugifiableTestHelper but setup/teardown in race_condition_test.rb calls StrictSlugModel.delete_all directly and never calls the helper. Not a bug (same effect), but inconsistent with how reset_test_model! is used.

  • slug_unique_violation? correctly uses \bslug\b so it won't false-positive on slugged_items.name — word-boundary semantics prevent \bslug\b from matching inside slugged. The custom-index-name false-negative limitation is already documented; no action needed.


Summary

Area Finding Severity
Docs after_create callbacks also re-run on retry — not mentioned Low
Docs/Changelog Timestamp fallback still exists in generate_unique_slug; CHANGELOG overstates the change Low
Tests test_update_slug_if_nil_uses_non_bang_save missing assertion that save was invoked Low
Tests NOT NULL concurrent write path not covered by real threading tests Low
Style S1S7 review tags in production comments Nit

The core logic is correct and the test coverage is substantially better than the status quo. Happy to see this merged once the changelog wording is clarified.

@rameerez rameerez merged commit 20ab64a into main Feb 19, 2026
13 checks passed
@rameerez rameerez deleted the fix/slug-race-condition-minimal branch February 19, 2026 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant