
Implement ready tombstone ledger#323

Merged
hardbyte merged 3 commits into main from codex/ready-tombstone-ledger
Jun 6, 2026

Conversation

@hardbyte
Owner

@hardbyte hardbyte commented Jun 6, 2026

Summary

  • Add the ready_tombstones queue-storage ledger and install it for each queue-storage substrate.
  • Stop deleting ready_entries on common completion, DLQ, discard, retry, cancellation, reprioritization, and SQL compatibility paths.
  • Teach claiming, exact counts, health checks, compatibility views/functions, and maintenance pruning to treat tombstones as spent ready lanes.
  • Fix the queue-prune guard so tombstoned ready rows do not keep old ready partitions active.
  • Rebuild and re-trust terminal live counters in v028 after refreshing the substrate, preserving the v027 counter rebucket invariant.
  • Fix the mixed Rust/Python chaos smoke flake by making wrong-language marker claims snooze instead of draining the other worker marker pool.
  • Update tests, TLA+ models, PostgreSQL object comments, README, architecture docs, troubleshooting, and upgrade guidance.

Fixes #309.

Why

The cursor allocator from PR #321 moved hot sequence bounds away from mutable rows, but ready_entries still generated dead tuples because completed jobs deleted their ready backing rows. Under overlapping readers, that DELETE pressure hurts throughput and makes the ready ring less append-only than intended.

This change makes ready segments immutable until partition maintenance truncates them. Rare out-of-band ready mutations append tiny tombstone rows instead of deleting ready rows, so the hot lifecycle is dominated by inserts, cursor advancement, and partition truncation.
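
As a rough illustration of the append-only idea, the following in-memory model keeps every ready row until the segment is truncated and records invalidations in a separate tombstone set. It is a sketch only: `ReadySegment`, `LaneKey`, and the field types are hypothetical names chosen for this example, and the real ledger lives in PostgreSQL tables rather than Rust collections.

```rust
use std::collections::HashSet;

/// Hypothetical lane key. The field names follow the key quoted in the review
/// discussion; the concrete types are assumptions made for this sketch.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct LaneKey {
    pub ready_slot: i32,
    pub ready_generation: i64,
    pub queue: String,
    pub priority: i16,
    pub enqueue_shard: i16,
    pub lane_seq: i64,
}

/// In-memory stand-in for one ready segment and its tombstone partition.
#[derive(Default)]
pub struct ReadySegment {
    /// Append-only (lane, job_id) rows; never deleted individually.
    ready_entries: Vec<(LaneKey, i64)>,
    /// Lanes invalidated out of band (cancel, reprioritize, compat DELETE).
    ready_tombstones: HashSet<LaneKey>,
}

impl ReadySegment {
    /// Appends a ready row.
    pub fn enqueue(&mut self, key: LaneKey, job_id: i64) {
        self.ready_entries.push((key, job_id));
    }

    /// Appends a tombstone instead of deleting the ready row.
    /// Returns false if the lane was already tombstoned.
    pub fn tombstone(&mut self, key: &LaneKey) -> bool {
        self.ready_tombstones.insert(key.clone())
    }

    /// Anti-join: job ids whose lanes have no tombstone.
    pub fn available(&self) -> Vec<i64> {
        self.ready_entries
            .iter()
            .filter(|(key, _)| !self.ready_tombstones.contains(key))
            .map(|(_, job_id)| *job_id)
            .collect()
    }

    /// Number of physically retained ready rows, tombstoned or not.
    pub fn retained_rows(&self) -> usize {
        self.ready_entries.len()
    }

    /// Partition maintenance: drop ready rows and tombstones together.
    pub fn truncate(&mut self) {
        self.ready_entries.clear();
        self.ready_tombstones.clear();
    }
}
```

In this model a cancelled lane disappears from `available()` while `retained_rows()` stays unchanged until `truncate()` runs, which is the behavior the paragraph above describes.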

Design Notes

  • ready_entries remains the append-only source of ready job bodies for a segment.
  • ready_tombstones records rare ready-lane invalidations such as cancellation, reprioritization, and SQL compatibility DELETE of an available job.
  • claim_ready_runtime() treats tombstones as committed spent evidence so the claim cursor can advance over a contiguous prefix without skipping earlier live jobs.
  • Exact metrics and compatibility reads anti-join tombstones so retained ready rows are not exposed as live available work.
  • Maintenance truncates ready, done, and tombstone child partitions together.
  • Queue prune treats either matching done_entries or matching ready_tombstones as spent evidence.
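
The contiguous-prefix rule in the notes above can be sketched as a small pure function. This is a simplified model, not the SQL inside claim_ready_runtime(): `advance_cursor`, the `u64` lane sequence type, and the `spent` set (done entries and tombstones merged together) are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Advances a claim cursor over the contiguous run of spent lanes that starts
/// at `cursor`. `tail` is the next lane_seq that has not been allocated yet,
/// and `spent` holds every lane_seq with committed spent evidence.
pub fn advance_cursor(cursor: u64, tail: u64, spent: &BTreeSet<u64>) -> u64 {
    let mut next = cursor;
    // Stop at the first lane that is still live so it is never skipped.
    while next < tail && spent.contains(&next) {
        next += 1;
    }
    next
}
```

With lanes 0, 1, and 3 spent and lane 2 still live, a cursor at 0 moves to 2 and no further: the spent lane at 3 is not at the head of the prefix, so it cannot pull the cursor past the live job at 2.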

Validation

  • cargo fmt --check
  • git diff --check
  • python3 -m py_compile awa-python/tests/mixed_fleet_helper.py
  • cargo test -p awa --test migration_test test_v027_rebuckets_existing_terminal_live_counts -- --nocapture
  • cargo test -p awa --test queue_storage_runtime_test test_queue_storage_prune_treats_ready_tombstone_as_spent -- --nocapture
  • cargo test -p awa --test queue_storage_runtime_test test_queue_storage_prune_pending_ready_match_is_scoped_by_enqueue_shard -- --nocapture
  • cargo test --package awa --test chaos_suite_test test_mixed_rust_and_python_workers_share_same_queue -- --exact --ignored --nocapture (repeated 3 times locally)
  • targeted queue-storage runtime tests for tombstones, compatibility DELETE, queue counts, terminal count decrement, and worker health checks
  • targeted migration readiness tests
  • ./correctness/run-tlc.sh storage/AwaDeadTupleContract.tla
  • ./correctness/run-tlc.sh storage/AwaSegmentedStorage.tla
  • ./correctness/run-tlc.sh storage/AwaSegmentedStorage.tla storage/AwaSegmentedStorageInterleavings.cfg
  • ./correctness/run-tlc.sh storage/AwaSegmentedStorageTrace.tla (reaches the expected witness invariant behavior documented by that trace config)
  • confirmed no tracked .so files

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Walkthrough

This PR introduces a ready-tombstones ledger to AWA's queue-storage system, replacing physical deletion of ready entries with append-only tombstone records for cancellation and priority aging. The claim-cursor advancement logic is updated to treat tombstoned lanes as committed spent evidence and only advance across safe contiguous prefixes. Ready backing rows are retained for terminal-fact hydration until segment reclamation.

Changes

Ready Tombstones Implementation

| Layer / File(s) | Summary |
|---|---|
| Ready tombstone table definition and claim cursor logic: awa-model/migrations/v023_install_queue_storage_substrate.sql | ready_tombstones table added as partitioned append-only ledger with generation guard; claim_ready_runtime() updated to filter candidate rows by tombstone existence and advance cursor based on contiguous spent-prefix relationship. |
| Schema installation and compatibility layer: awa-model/migrations/v028_ready_tombstones.sql, awa-model/src/migrations.rs | Migration v028 installs substrate across active schemas; jobs_compat() anti-joins ready entries against tombstones to exclude cancelled lanes; delete_job_compat() tombstones ready lanes and releases unique claims instead of deleting; version incremented to 28. |
| Queue storage mutation operations: awa-model/src/queue_storage.rs | cancel_job_tx and age_waiting_priorities replace DELETE operations with CTE-based tombstone insertion; queue_counts_exact adds tombstone anti-join to availability computation; prune_oldest truncates ready_tombstones child partitions; delete_ready_backing_rows_tx helper removed. |
| Admin and health-check queries: awa-model/src/admin.rs, awa-worker/src/client.rs | Admin queries and health check add NOT EXISTS filters against ready_tombstones to exclude tombstoned rows from queue listings and availability counts. |
| Migration infrastructure and schema readiness: awa-model/src/storage.rs | queue_storage_schema_ready() extended to verify ready_tombstones existence in target schema. |
| TLA+ formal specification: correctness/storage/AwaSegmentedStorage.tla, correctness/storage/AwaDeadTupleContract.tla | readyTombstones state variable and ReadyTombstone(j) constructor added; CurrentReady excludes tombstoned jobs; new CancelReadyToTerminal(j) and ReprioritizeReady(j) actions; PruneReadySegment clears matching tombstones; safety invariants relaxed to permit lane_seq retention when job is in ready entries. |
| Runtime test coverage: awa/tests/migration_test.rs, awa/tests/queue_storage_runtime_test.rs | Schema readiness tests verify ready_tombstones existence; runtime assertions validate cancellation creates tombstones and retains backing rows; claim-cursor advancement tests for head vs. non-head tombstones; compat delete assertions confirm tombstone creation. |
| Spec mapping and correctness docs: correctness/storage/MAPPING.md, correctness/storage/README.md, correctness/README.md | MAPPING.md clarified with ready-tombstone keying and prefix-based cursor semantics; AwaDeadTupleContract.tla extended with TableSpec.ready_tombstones, new transaction types, and PruneReadyTx now truncates tombstones; correctness docs updated with coverage and retention semantics. |
| User-facing documentation: README.md, docs/architecture.md, docs/configuration.md, docs/adr/019-queue-storage-redesign.md, docs/troubleshooting.md, docs/upgrade-0.5-to-0.6.md | README adds "Core Concepts" section; architecture.md adds "Terms" subsection and updates storage-plane descriptions; configuration.md introduces scheduled-jobs/deferred-promotion section and clarifies tombstone semantics; ADR-019 updated with ready-tombstone ledger design; troubleshooting and upgrade guides include ready_tombstones in health checks. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • hardbyte/awa#197: This PR directly implements the queue-storage, TLA+ models, and docs components for the 0.6 release-readiness issue.
  • hardbyte/awa#169: This PR addresses MVCC dead-tuple concerns by replacing physical DELETEs with append-only tombstone records, reducing hot-table churn.

Possibly related PRs

  • hardbyte/awa#251: Both modify queue-availability/claim correctness in claim_ready_runtime cursor advancement and admin/exact counting paths, with the main PR additionally introducing ready_tombstones to exclude cancelled/reprioritized lanes.
  • hardbyte/awa#310: Both directly modify the v023 queue-storage substrate installer and claim_ready_runtime, with the main PR adding the ready_tombstones ledger and tombstone-aware claim logic.
  • hardbyte/awa#261: Main PR builds on shard-sensitive ready_entries/claim-join logic by further excluding lanes via anti-join against ready_tombstones in the same query paths.

Poem

A tombstone marks the ready's end,
No deletion, just append, my friend,
🦬 Claim and count skip over graves,
While segment prune the ledger saves. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Implement ready tombstone ledger' directly and concisely describes the main change in the changeset: adding a new queue-storage ledger table for ready tombstones and updating related operations to use tombstone semantics instead of deletion. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@hardbyte hardbyte changed the title from "[codex] Implement ready tombstone ledger" to "Implement ready tombstone ledger" Jun 6, 2026
@hardbyte hardbyte marked this pull request as ready for review June 6, 2026 02:37

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61d099bb4d


Comment thread awa-model/src/queue_storage.rs Outdated
let truncate = sqlx::query(&format!("TRUNCATE TABLE {ready_child}, {done_child}",))
.execute(tx.as_mut())
.await;
let tomb_child = format!("{schema}.ready_tombstones_{slot}");

P1 Badge Avoid letting tombstoned ready rows block pruning

When a ready row is tombstoned without a matching done_entries row (for example priority aging, or DELETE FROM awa.jobs for an available job), the pending-ready guard just above this still counts that retained backing row because it only left-joins done_entries and does not anti-join ready_tombstones. That means the newly added truncation of the tombstone partition is never reached for any queue slot containing those rows, so the slot cannot be reclaimed and later queue-ring rotation will keep seeing the non-empty ready_entries_* partition as busy.
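
The guard described in this comment can be modeled as a count of ready lanes that have neither a done row nor a tombstone. The sketch below is an assumption-laden simplification: `pending_ready` and `can_prune` are hypothetical names, lanes are reduced to a bare `u64` lane_seq, and the real check is a SQL join over child partitions.

```rust
use std::collections::HashSet;

/// Counts ready lanes that still lack spent evidence. A lane is spent when it
/// has a matching done entry or a matching tombstone.
pub fn pending_ready(ready: &[u64], done: &HashSet<u64>, tombstones: &HashSet<u64>) -> usize {
    ready
        .iter()
        .filter(|seq| !done.contains(*seq) && !tombstones.contains(*seq))
        .count()
}

/// A slot can be reclaimed only when no ready lane is still pending.
pub fn can_prune(ready: &[u64], done: &HashSet<u64>, tombstones: &HashSet<u64>) -> bool {
    pending_ready(ready, done, tombstones) == 0
}
```

If the tombstone set is left out of the filter, a lane that was reprioritized (tombstoned but never completed) keeps `pending_ready` above zero indefinitely, which is the stuck-slot behavior reported here.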


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
awa-model/src/queue_storage.rs (3)

2596-2615: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reset must truncate ready_tombstones too.

reset() rewinds ring generations, lane heads, and job IDs, but it leaves ready_tombstones behind. After that, fresh ready rows can reuse the same (ready_slot, ready_generation, queue, priority, enqueue_shard, lane_seq) key and get filtered out by stale tombstones, which makes new jobs unclaimable after a reset.

Suggested fix
             TRUNCATE
                 {schema}.ready_entries,
+                {schema}.ready_tombstones,
                 {schema}.done_entries,
                 {schema}.dlq_entries,
                 {schema}.leases,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@awa-model/src/queue_storage.rs` around lines 2596 - 2615, The reset() SQL
TRUNCATE call omits the ready_tombstones table, causing stale tombstones to
block reuse of (ready_slot, ready_generation, queue, priority, enqueue_shard,
lane_seq) keys after reset; update the TRUNCATE list in the SQL executed by
reset() (the query built with sqlx::query(&format!(...)) surrounding tables like
ready_entries, done_entries, etc.) to also include {schema}.ready_tombstones so
tombstones are cleared alongside the other tables.
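
To make the key-reuse hazard concrete, here is a toy model in which a reset rewinds the lane sequence but may or may not clear tombstones. `Store` and its methods are hypothetical and collapse the six-column key into a single `u64`; the sketch only illustrates the failure mode, not the actual reset() implementation.

```rust
use std::collections::HashSet;

/// Toy store: lanes are identified by a single rewindable sequence number.
#[derive(Default)]
pub struct Store {
    next_lane_seq: u64,
    ready: Vec<u64>,
    tombstones: HashSet<u64>,
}

impl Store {
    /// Appends a ready lane and returns its sequence number.
    pub fn enqueue(&mut self) -> u64 {
        let seq = self.next_lane_seq;
        self.next_lane_seq += 1;
        self.ready.push(seq);
        seq
    }

    /// Cancels a ready lane by appending a tombstone.
    pub fn cancel(&mut self, seq: u64) {
        self.tombstones.insert(seq);
    }

    /// Ready lanes that are not tombstoned.
    pub fn claimable(&self) -> Vec<u64> {
        self.ready
            .iter()
            .copied()
            .filter(|seq| !self.tombstones.contains(seq))
            .collect()
    }

    /// Rewinds the sequence and clears ready rows. Tombstones are cleared
    /// only when `truncate_tombstones` is true.
    pub fn reset(&mut self, truncate_tombstones: bool) {
        self.next_lane_seq = 0;
        self.ready.clear();
        if truncate_tombstones {
            self.tombstones.clear();
        }
    }
}
```

After `reset(false)`, the first newly enqueued lane reuses sequence 0 and is hidden by the stale tombstone; after `reset(true)` the same lane is claimable.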

6825-6875: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Filter tombstoned ready rows out of load_job().

age_waiting_priorities() now keeps the source ready row and marks it tombstoned. load_job() still reads every ready_entries row for the job, so a reprioritized job can return the tombstoned snapshot instead of the live lane because both ready candidates tie on state/run_lease/run_at. That makes the reported priority nondeterministic after aging.

Suggested fix
-            FROM {schema}.ready_entries
-            WHERE job_id = $1
+            FROM {schema}.ready_entries AS ready
+            WHERE job_id = $1
+              AND NOT EXISTS (
+                  SELECT 1 FROM {schema}.ready_tombstones AS tomb
+                  WHERE tomb.queue = ready.queue
+                    AND tomb.priority = ready.priority
+                    AND tomb.enqueue_shard = ready.enqueue_shard
+                    AND tomb.lane_seq = ready.lane_seq
+                    AND tomb.ready_slot = ready.ready_slot
+                    AND tomb.ready_generation = ready.ready_generation
+              )
             ORDER BY run_lease DESC, attempted_at DESC NULLS LAST, run_at DESC

Also applies to: 7854-7874

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@awa-model/src/queue_storage.rs` around lines 6825 - 6875, load_job() must
ignore ready_entries rows that have been tombstoned; update the
SELECT-from-{schema}.ready_entries query used by load_job() to exclude any row
present in {schema}.ready_tombstones by adding the same tombstone predicate used
in the diff (match on queue, priority, enqueue_shard, lane_seq, ready_slot,
ready_generation), e.g. add a NOT EXISTS(...) or LEFT JOIN ... WHERE
tomb.ready_slot IS NULL filter so the tombstoned snapshot is not returned. Make
the identical change to the second occurrence noted (the other block around
lines 7854-7874) so both load_job() query paths filter out tombstoned ready
rows.

10664-10717: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Exclude tombstoned lanes from the prune gate.

pending still counts any ready row without a matching done_entries row. After priority aging, the source lane is intentionally left in ready_entries with only a tombstone, so this gate will keep treating that slot as live and prune_oldest() can stop reclaiming queue partitions. correctness/storage/MAPPING.md:96 describes this check as an anti-join against ready_tombstones.

Suggested fix
+        let tomb_child = format!("{schema}.ready_tombstones_{slot}");
         let pending: i64 = sqlx::query_scalar(&format!(
             r#"
             SELECT count(*)::bigint
             FROM {ready_child} AS ready
+            LEFT JOIN {tomb_child} AS tomb
+              ON tomb.ready_slot = ready.ready_slot
+             AND tomb.ready_generation = ready.ready_generation
+             AND tomb.queue = ready.queue
+             AND tomb.priority = ready.priority
+             AND tomb.enqueue_shard = ready.enqueue_shard
+             AND tomb.lane_seq = ready.lane_seq
             LEFT JOIN {done_child} AS done
               ON done.ready_generation = ready.ready_generation
              AND done.queue = ready.queue
              AND done.priority = ready.priority
              AND done.enqueue_shard = ready.enqueue_shard
              AND done.lane_seq = ready.lane_seq
             WHERE done.lane_seq IS NULL
+              AND tomb.lane_seq IS NULL
             "#
         ))
         .fetch_one(tx.as_mut())
         .await
         .map_err(map_sqlx_error)?;
-
-        let tomb_child = format!("{schema}.ready_tombstones_{slot}");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@awa-model/src/queue_storage.rs` around lines 10664 - 10717, The pending-count
query (producing variable pending) currently counts ready rows that have no
matching done row but does not exclude lanes that are tombstoned, so slots left
with only tombstones block pruning; modify the SQL used in the
sqlx::query_scalar call referencing {ready_child} and {done_child} to LEFT JOIN
the corresponding ready_tombstones partition (ready_tombstones_{slot}) and add a
WHERE clause requiring the tombstone join to be NULL (e.g. AND
rt.ready_generation IS NULL or similar) so that rows that are tombstoned are
excluded from the anti-join count; update the query string used by the pending
calculation accordingly.
awa-worker/src/client.rs (1)

1971-1999: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't swallow queue-availability query failures in the health check.

Falling back to unwrap_or_default() here means a broken queue-storage query still produces healthy = true as long as SELECT 1 succeeds. With the new ready_tombstones dependency, a half-prepared or mismatched substrate now shows up as “empty queue” instead of an unhealthy runtime. Please log the error and fold availability-query success into the healthy result rather than treating failure as an empty set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@awa-worker/src/client.rs` around lines 1971 - 1999, The availability query
currently swallows errors by using unwrap_or_default() on the
sqlx::query_as(...).fetch_all(&self.pool).await call in the block that assigns
available_rows (inside the effective_storage.queue_storage_store() branch);
change this to propagate or handle the Result: on Err, log the error via the
existing logger and set the overall health flag to unhealthy (do not treat
failure as an empty set), while on Ok use the returned rows. Update the
health-check logic that uses available_rows to consider the query outcome
(failure => healthy = false) rather than treating a query error as "no available
rows."
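
One way to fold the availability query outcome into the health result is sketched below. The `Health` struct and `fold_health` function are hypothetical, the `String` error stands in for whatever error type the real query returns, and logging is reduced to `eprintln!` to keep the example dependency-free.

```rust
/// Hypothetical health summary for this sketch.
#[derive(Debug, PartialEq)]
pub struct Health {
    pub healthy: bool,
    pub available: u64,
}

/// Combines the connectivity probe with the availability query result.
/// A failed availability query marks the runtime unhealthy instead of being
/// reported as an empty queue.
pub fn fold_health(ping_ok: bool, availability: Result<Vec<u64>, String>) -> Health {
    match availability {
        Ok(rows) => Health {
            healthy: ping_ok,
            available: rows.iter().sum(),
        },
        Err(err) => {
            // Surface the failure rather than defaulting to "no rows".
            eprintln!("queue availability query failed: {err}");
            Health {
                healthy: false,
                available: 0,
            }
        }
    }
}
```

The contrast with `unwrap_or_default()` is that the `Err` arm here changes `healthy`, so a missing ready_tombstones relation would show up as an unhealthy runtime.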
🧹 Nitpick comments (1)
awa-model/src/admin.rs (1)

271-338: 🏗️ Heavy lift

Factor the tombstone-aware available-row predicate into one shared builder.

This exact ready_entries filter now exists here, in cancel_by_unique_key, in state_counts, and again in awa-worker/src/client.rs::health_check. MAPPING.md treats those as equivalent read-side projections, so any future change to tombstone keying or enqueue-shard semantics now has several places to drift. A shared SQL fragment/CTE builder would make that contract much harder to break.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@awa-model/src/admin.rs` around lines 271 - 338, The ready_entries
tombstone-aware filter repeated in queue_storage_current_jobs_cte (and also
present in cancel_by_unique_key, state_counts, and
awa-worker::client::health_check) should be factored into a single shared SQL
fragment builder function (e.g., ready_tombstone_predicate or
build_ready_available_cte) that returns the predicate/CTE string; update
queue_storage_current_jobs_cte to call that new function instead of inlining the
WHERE/NOT EXISTS block, and replace the duplicated filter in
cancel_by_unique_key, state_counts, and health_check to use the same builder so
the tombstone keying and enqueue_shard semantics are defined in one place.
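
A shared builder along the lines suggested above might look like the following sketch. The function name comes from the suggestion, but the signature, the `LANE_KEY_COLUMNS` constant, and the single-line SQL layout are assumptions; the schema and alias are interpolated directly, so they must be trusted identifiers rather than user input.

```rust
/// Columns that identify a ready lane, as listed in the review discussion.
const LANE_KEY_COLUMNS: [&str; 6] = [
    "ready_slot",
    "ready_generation",
    "queue",
    "priority",
    "enqueue_shard",
    "lane_seq",
];

/// Builds a NOT EXISTS predicate that excludes tombstoned ready rows.
/// `ready_alias` is the alias of the ready_entries relation in the outer query.
pub fn ready_tombstone_predicate(schema: &str, ready_alias: &str) -> String {
    let conditions: Vec<String> = LANE_KEY_COLUMNS
        .iter()
        .map(|col| format!("tomb.{col} = {ready_alias}.{col}"))
        .collect();
    format!(
        "NOT EXISTS (SELECT 1 FROM {schema}.ready_tombstones AS tomb WHERE {})",
        conditions.join(" AND ")
    )
}
```

Each call site would then append the returned fragment to its own WHERE clause, so a change to the lane key touches only the column list.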

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9ef892a-46d2-4efd-8da0-88e18397fe37

📥 Commits

Reviewing files that changed from the base of the PR and between cb6c391 and 61d099b.

📒 Files selected for processing (21)
  • README.md
  • awa-model/migrations/v023_install_queue_storage_substrate.sql
  • awa-model/migrations/v028_ready_tombstones.sql
  • awa-model/src/admin.rs
  • awa-model/src/migrations.rs
  • awa-model/src/queue_storage.rs
  • awa-model/src/storage.rs
  • awa-worker/src/client.rs
  • awa/tests/migration_test.rs
  • awa/tests/queue_storage_runtime_test.rs
  • correctness/README.md
  • correctness/storage/AwaDeadTupleContract.tla
  • correctness/storage/AwaSegmentedStorage.tla
  • correctness/storage/MAPPING.md
  • correctness/storage/README.md
  • docs/adr/019-queue-storage-redesign.md
  • docs/architecture.md
  • docs/configuration.md
  • docs/queue-storage-substrate.md
  • docs/troubleshooting.md
  • docs/upgrade-0.5-to-0.6.md

Comment on lines +107 to 118
Action coverage from a `-coverage 1` run of the base config confirms each DLQ
transition is reachable:

| Action | States |
|---|---:|
| `FailToDlq` | 41,472 |
| `TimeoutWaitingToDlq` | 9,216 |
| `RescueToReady` | 6,912 |
| `PurgeDlq` | 4,608 |
| `MoveFailedToDlq` | 3,072 |
| `RetryFromDlq` | 1,536 |
| Action |
|---|
| `FailToDlq` |
| `TimeoutWaitingToDlq` |
| `RescueToReady` |
| `PurgeDlq` |
| `MoveFailedToDlq` |
| `RetryFromDlq` |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify the DLQ coverage claim to match the listed actions.

Line 107 says each DLQ transition is reachable, but Line 115 lists RescueToReady, which is not a DLQ transition. Please either rename the section/table scope or remove that action from the DLQ-specific list.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@correctness/storage/README.md` around lines 107 - 118, The README claims
"each DLQ transition is reachable" but the listed actions include RescueToReady
which is not a DLQ transition; update the text to make scope consistent by
either renaming the header/table to a more general "Action coverage" (so keep
all listed actions including RescueToReady) or remove RescueToReady from the
DLQ-specific list; adjust the sentence around the table and the table caption
accordingly to reference the chosen change and verify the table rows (FailToDlq,
TimeoutWaitingToDlq, PurgeDlq, MoveFailedToDlq, RetryFromDlq) match the DLQ
scope if you choose to keep the DLQ wording.

@hardbyte hardbyte merged commit 53ccd22 into main Jun 6, 2026
13 checks passed


Development

Successfully merging this pull request may close these issues.

test(chaos): test_mixed_rust_and_python_workers_share_same_queue is race-based and flaky

1 participant