perf(maintenance): duration-margin hysteresis + branch cooldown + 250ms lease default (#169) by hardbyte · Pull Request #318 · hardbyte/awa

hardbyte · 2026-06-04T00:48:40Z

Summary

Follow-up to #316 from the bench post-mortem you ran. K=3 hysteresis alone wasn't enough — bench evidence showed rotate_lease still flapping at the 50ms boundary because "delayed" samples hovered at 51ms and "recovered" at 45-49ms. The branch also kept doing real (slow) work between flip and recovery, contributing to dead-tuple growth even with prune backoff.

Three coordinated changes

1. Duration-margin + K-consecutive hysteresis

Overrun sample: last_duration > tick_interval * 3/2 (1.5×)
On-time sample: last_duration < tick_interval * 7/10 (0.7×)
Anything in between is a deadband — neither counter advances
Still requires K=3 consecutive on either side before flipping
Kills the boundary flap directly: 51ms-vs-49ms can't accumulate K observations because they don't even count
Integer ratios — no f64 in the hot path

2. Per-branch cooldown

On flip-to-delayed, arm cooldown_ticks_remaining = 120 (~30s at the new 250ms lease_rotate cadence)
While non-zero, try_begin returns None → branch body is skipped: no ACCESS EXCLUSIVE attempts, no count(*) scans, no contribution to dead-tuple growth
Cooldown re-arms on every overrun observation while still delayed → chronically-failing branch stays quiet
Counters are frozen during cooldown (no body = no sample), so recovery requires K real on-time samples after cooldown expires

3. Default `lease_rotate_interval` 50ms → 250ms

The 50ms default was over-spec'd given prune backoff already throttles the heavy work
250ms still cycles the 8-slot lease ring in ~2s — plenty fast for reclamation
All other intervals unchanged (queue_rotate_interval stays 1000ms; claim_rotate_interval defaults to queue)

API change

MaintenanceBranchTracker::begin → try_begin returning Option<BranchTimer>. None signals "skip this tick's body"; all 13 select! arms updated to if let Some(timer) = .... Existing BranchTimer::finish() semantics preserved.

Test plan

cargo fmt --all
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo build --workspace
cargo test -p awa-worker --lib — 39 passing
22 maintenance unit tests passing; 10 new for this PR:
- deadband_sample_does_not_advance_counters
- k_consecutive_overruns_flips_and_arms_cooldown
- cooldown_skips_body
- cooldown_expires_then_body_runs
- cooldown_rearms_on_continued_overrun
- recovers_only_after_k_ontime_ticks_post_cooldown
- single_overrun_does_not_flip (updated for new thresholds)
- intermittent_overrun_does_not_flip (updated)
- finish_records_last_duration (updated for Option return)
- per_branch_state_is_independent (still passes)

What this is and isn't

This addresses the noise + per-tick work issues the bench surfaced on the maintenance side. It does not attempt to close the queue-metadata bloat cliff (queue_claim_heads / queue_enqueue_heads / queue_ring_state / queue_terminal_live_counts under sustained pin) — that's the per-row-state-machine model's fundamental limit and is the territory of #295 (rotation+TRUNCATE engine).

The next long_horizon re-run will tell us how much of the cliff this closes. If meaningful improvement: ship 0.6.0-beta.2 with the operational-guidance doc shape from the #169 release decision. If not: time to discuss #295 scope formally.

Summary by CodeRabbit

Chores
- Tuned lease rotation timing to reduce churn and improve resource stability.
- Strengthened maintenance task scheduling with overrun detection, cooldown suppression, and per-branch isolation to avoid repeated work during unhealthy periods.
Tests
- Updated unit tests to validate the new scheduling, cooldown and recovery behaviors.

…ms lease default (#169) Follow-up to #316 bench post-mortem. The K=3 hysteresis alone wasn't enough — bench evidence showed rotate_lease still flapping at the 50ms boundary because "delayed" samples hovered at 51ms and "recovered" at 45-49ms. K=3 accumulated quickly because every other tick crossed. The branch also kept doing real (slow) work between the flip and the recovery, contributing to the dead-tuple growth even with backoff on prune. Three coordinated changes in awa-worker: 1. Duration-margin + K-consecutive hysteresis. Samples must clearly exceed `tick_interval * 3/2` to count as overrun, or be clearly below `tick_interval * 7/10` to count as on-time. Anything in the deadband leaves both counters alone. Kills the boundary flap directly: 51ms-vs-49ms can no longer accumulate K observations. Integer ratios — no f64 in the hot path. 2. Per-branch cooldown. On flip-to-delayed, arm `cooldown_ticks_remaining = BRANCH_COOLDOWN_TICKS` (120 ticks = ~30s at the new 250ms lease_rotate cadence). While non-zero, `try_begin` returns None and the branch body is skipped — no ACCESS-EXCLUSIVE attempts, no count(*) scans, no contribution to the dead-tuple growth. Cooldown re-arms on every overrun observation while still delayed, so a chronically-failing branch stays quiet. Counters are frozen during cooldown (no body = no duration sample), so recovery requires K real on-time samples AFTER cooldown expires. 3. Default `lease_rotate_interval` 50ms → 250ms. The 50ms was over-spec'd given that prune backoff already throttles the heavy work — faster rotation just produced loop noise. 250ms still cycles the 8-slot lease ring in ~2s, which is plenty fast for reclamation. All other intervals unchanged. API change: `MaintenanceBranchTracker::begin` is now `try_begin` and returns `Option<BranchTimer>`. None signals "skip this tick's body"; all 13 select! arms updated to use `if let Some(timer) = ...`. The existing `BranchTimer::finish()` semantics are preserved. Tests: - Updated 6 existing branch-tracker tests for the new API + duration thresholds. - New: deadband_sample_does_not_advance_counters, k_consecutive_overruns_flips_and_arms_cooldown, cooldown_skips_body, cooldown_expires_then_body_runs, cooldown_rearms_on_continued_overrun, recovers_only_after_k_ontime_ticks_post_cooldown. - All 22 maintenance unit tests pass; 39 worker lib tests pass.

coderabbitai · 2026-06-04T00:48:50Z

Warning

Review limit reached

@hardbyte, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 51 minutes and 7 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 214c5bc8-09ae-4aa9-a0f5-a7dfbe4331cd

📥 Commits

Reviewing files that changed from the base of the PR and between 263ec46 and 7ddacce.

📒 Files selected for processing (3)

awa-worker/src/maintenance.rs
docs/architecture.md
docs/configuration.md

📝 Walkthrough

Walkthrough

This PR raises the default lease rotation interval from 50ms to 250ms and refactors the maintenance branch tracker into a cooldown-plus-hysteresis state machine (try_begin) that can suppress maintenance branch execution; maintenance loop wiring and tests are updated accordingly.

Changes

Lease Rotation Interval Configuration

Layer / File(s)	Summary
Increase lease rotation interval with documentation `awa-worker/src/client.rs`	ClientBuilder::new passes 250ms instead of 50ms as the default `lease_rotate_interval`, with expanded inline documentation explaining the change.

Maintenance Branch Overrun Tracker Refactor

Layer / File(s)	Summary
Branch state fields `awa-worker/src/maintenance.rs`	Adds `consecutive_overrun`, `consecutive_ontime`, and `cooldown_ticks_remaining` to `MaintenanceBranchState`.
try_begin gating and hysteresis logic `awa-worker/src/maintenance.rs`	Replaces `begin` with `try_begin` which implements cooldown-first skipping and duration-margin hysteresis, returns `Option<BranchTimer>`, emits overrun observability only when entering delayed, and suppresses branch execution by returning `None` during delays/cooldowns.
Test hooks and cooldown snapshot `awa-worker/src/maintenance.rs`	Adds `cooldown_snapshot` test helper exposing internal counters and cooldown state for assertions.
Maintenance loop conditional dispatch `awa-worker/src/maintenance.rs`	Leader maintenance `tokio::select!` loop and queue/lease/claim rotate branches now call `branch_tracker.try_begin(...)` and run bodies / call `timer.finish()` only when `Some(timer)` is returned.
Tracker tests refactor and tick replay `awa-worker/src/maintenance.rs`	Refactors tests to replay ticks via `try_begin`, distinguishing run vs skipped, updates assertions for deadband/hysteresis behavior, cooldown re-arming and expiry, and adjusts per-branch initial seeding for independence tests.

Sequence Diagram(s)

sequenceDiagram
  participant Scheduler as MaintenanceScheduler
  participant Tracker as BranchTracker
  participant Worker as MaintenanceTask
  Scheduler->>Tracker: try_begin(branch_id, last_duration)
  alt Suppressed
    Tracker-->>Scheduler: None
  else Allowed
    Tracker-->>Scheduler: Some(BranchTimer)
    Scheduler->>Worker: run maintenance branch
    Worker-->>Scheduler: completed
    Scheduler->>Tracker: timer.finish(branch_id, duration)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

hardbyte/awa#316: Adds hysteresis and consecutive-sample logic to MaintenanceBranchTracker that this PR builds upon with suppression gating and cooldown state.

Poem

🐰 I counted ticks beneath the moonlit sky,
Overruns whispered, "rest a little, try."
Cooldown set, hysteresis in my paw,
Leases breathe at two-fifty—peaceful law.
Thump-thump, the watcher-rabbit keeps the draw.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main changes: duration-margin hysteresis, branch cooldown, and 250ms lease default.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch brian/issue-169-maintenance-cooldown-margin

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f07cf3af44

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-04T00:50:08Z

+                } else if cross_threshold && state.is_delayed {
+                    // Already delayed and another overrun arrived —
+                    // re-arm cooldown but stay quiet about it.
+                    state.cooldown_ticks_remaining = BRANCH_COOLDOWN_TICKS;
+                    return None;


Don't re-arm cooldown from the stale overrun sample

When a branch first crosses the overrun threshold, try_begin returns None before any body runs, so record_finish is not called and last_duration remains the same overrun value throughout the 120 skipped ticks. Once the cooldown reaches zero, the next try_begin re-enters this already-delayed branch, sees that same stale last_duration with consecutive_overrun >= K, and this block re-arms cooldown and returns None again. In the real maintenance loop this means any branch that trips delayed once is skipped forever until the worker restarts; the tests mask this by manually overwriting last_duration during skipped ticks.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@awa-worker/src/maintenance.rs`:
- Around line 191-194: The branch can get stuck reusing a stale sampled duration
because try_begin returns None without consuming last_duration; to fix, ensure
each sampled duration is consumed exactly once by clearing or taking
last_duration before any early return. Update the logic around try_begin and the
cooldown handling: when you check state.cooldown_ticks_remaining and return
None, take() or set last_duration = None (or otherwise mark the sample consumed)
so that after cooldown the same overrun isn't re-evaluated; likewise, when
try_begin returns None elsewhere, consume last_duration before returning.
Reference: try_begin, last_duration, and state.cooldown_ticks_remaining.
- Around line 2528-2549: The helper replay_ticks is incorrectly mutating
tracker.branches[].last_duration before calling tracker.try_begin, which
simulates a duration even for ticks that were skipped by cooldown; change the
logic so you only set last_duration when try_begin returns Some(timer) (i.e.,
the body would run). Concretely, call tracker.try_begin(branch, tick_interval,
&metrics) first, check timer_opt.is_some(), and only then update
tracker.branches.lock().unwrap().entry(branch).or_default().last_duration =
Some(last_duration) (and simulate finishing the timer if necessary); keep
references to replay_ticks, tracker.try_begin, and the last_duration field to
locate the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c11d8e52-664e-44a7-98f2-163b65db6945

📥 Commits

Reviewing files that changed from the base of the PR and between cc9ba62 and f07cf3a.

📒 Files selected for processing (2)

awa-worker/src/client.rs
awa-worker/src/maintenance.rs

Codex and CodeRabbit both independently flagged a P1 bug on PR #318: without consuming the sample, the branch enters perpetual cooldown once it flips delayed. Trace: - Body N runs slow, record_finish writes last_duration=Some(slow). - Tick N+1: try_begin reads but doesn't consume; consecutive_overrun saturates past K; flip + cooldown=120; return None. - Ticks N+2..N+121: cooldown gate decrements, return None. No body runs, no record_finish call, last_duration STAYS Some(slow). - Tick N+122 (cooldown drained): try_begin sees the same stale Some(slow), saturates overrun again, re-arms cooldown forever. Worker has to restart to escape. The test helper hid this by writing last_duration before every iteration — synthetic samples even on ticks where the body was skipped. Fix: `state.last_duration.take()` in try_begin's hysteresis check. Each sample is evaluated exactly once. After flip, the consumed sample leaves last_duration = None; cooldown drains without writing anything; first post-cooldown try_begin sees None and skips the eval, body runs, record_finish writes a fresh sample, next try_begin evaluates it. Test fixes: - `replay_ticks` no longer seeds last_duration on each iteration. It calls record_finish *only* when the body would have run, matching production: skipped ticks don't produce samples. - New `seed_last_duration` helper for tests that need an initial sample to drive the very first try_begin observation. - Tests updated for the new K+1 recovery cadence: the very first post-cooldown body produces a sample evaluated on the *next* try_begin, so K evaluable on-time samples require K+1 ticks post-cooldown. All 22 maintenance tests pass with the corrected semantics. The production-side fix is the take() — the test fixes are bookkeeping that exposes (rather than hides) the new contract.

hardbyte · 2026-06-04T05:03:06Z

Corrected bench verdict after the last_duration.take() rescue fix + docs cleanup (7ddacce).

Run: custom-20260604T033058Z-4216b7 in postgresql-job-queue-benchmarking, Awa 7ddacce on pr-318, harness 02e3a4f, clean worktrees. Shape: warmup=30s, clean_1=10m, idle_1=idle-in-tx:70m, recovery_1=10m, 800/s, 32 workers, QUEUE_STRIPE_COUNT=4, LEASE_ROTATE_MS=250, completion batch 1024, flush 20ms.

Result:

Clean: steady. completion avg 800.0/s, enqueue avg 800.0/s, depth avg 11, p99 avg 63ms.
Idle pinned snapshot: perf(maintenance): duration-margin hysteresis + branch cooldown + 250ms lease default (#169) #318 delays the old cliff but does not remove the underlying limit. Over the full 70m pin, completion avg 796.96/s vs enqueue avg 800.0/s; final sample was completion 776.0/s, enqueue 800.4/s, p99 35.8s, depth 12,733.
Shape matters: around ~43m pinned it showed bounded stall/catch-up pulses, but from ~60m pinned depth began climbing monotonically. Last 20 idle samples: depth 1,153 -> 1,769 -> 3,458 -> 7,282 -> 12,733; p99 3.9s -> 6.2s -> 10.4s -> 20.6s -> 35.8s.
Recovery after releasing the pin was immediate. First recovery sample completed 1225/s and depth returned to 0; subsequent recovery windows were back to ~800/s, depth 0-21, p99 mostly <110ms.

Interpretation: #318 is still worth merging as a mitigation for maintenance noise / boundary flapping, and the corrected sample-consuming semantics are required. But the release-gate evidence still says sustained pinned MVCC pressure eventually hits the queue-metadata per-row update wall. This is not lease heartbeat churn anymore; it is the queue metadata model under a long pinned snapshot.

Recommendation: merge #318 as the mitigation, but do not frame it as closing #169. For v0.6 either document a hard MVCC discipline/ceiling for long pinned snapshots, or pull the #295 rotation/TRUNCATE metadata redesign into the release scope if that ceiling is unacceptable.

Codex and CodeRabbit both independently flagged a P1 bug on PR #318: without consuming the sample, the branch enters perpetual cooldown once it flips delayed. Trace: - Body N runs slow, record_finish writes last_duration=Some(slow). - Tick N+1: try_begin reads but doesn't consume; consecutive_overrun saturates past K; flip + cooldown=120; return None. - Ticks N+2..N+121: cooldown gate decrements, return None. No body runs, no record_finish call, last_duration STAYS Some(slow). - Tick N+122 (cooldown drained): try_begin sees the same stale Some(slow), saturates overrun again, re-arms cooldown forever. Worker has to restart to escape. The test helper hid this by writing last_duration before every iteration — synthetic samples even on ticks where the body was skipped. Fix: `state.last_duration.take()` in try_begin's hysteresis check. Each sample is evaluated exactly once. After flip, the consumed sample leaves last_duration = None; cooldown drains without writing anything; first post-cooldown try_begin sees None and skips the eval, body runs, record_finish writes a fresh sample, next try_begin evaluates it. Test fixes: - `replay_ticks` no longer seeds last_duration on each iteration. It calls record_finish *only* when the body would have run, matching production: skipped ticks don't produce samples. - New `seed_last_duration` helper for tests that need an initial sample to drive the very first try_begin observation. - Tests updated for the new K+1 recovery cadence: the very first post-cooldown body produces a sample evaluated on the *next* try_begin, so K evaluable on-time samples require K+1 ticks post-cooldown. All 22 maintenance tests pass with the corrected semantics. The production-side fix is the take() — the test fixes are bookkeeping that exposes (rather than hides) the new contract.

chatgpt-codex-connector Bot reviewed Jun 4, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread awa-worker/src/maintenance.rs

Comment thread awa-worker/src/maintenance.rs

hardbyte and others added 2 commits June 4, 2026 15:21

docs: align lease rotate default with maintenance tuning

7ddacce

This was referenced Jun 4, 2026

Validate MVCC / dead-tuple behaviour under sustained load #169

Open

0.6 release readiness: align TLA+ models, docs, and queue-storage replacement gates #197

Open

hardbyte mentioned this pull request Jun 5, 2026

Move queue storage cursors off hot rows #321

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(maintenance): duration-margin hysteresis + branch cooldown + 250ms lease default (#169)#318

perf(maintenance): duration-margin hysteresis + branch cooldown + 250ms lease default (#169)#318
hardbyte wants to merge 3 commits into
mainfrom
brian/issue-169-maintenance-cooldown-margin

hardbyte commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 4, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

hardbyte commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hardbyte commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Three coordinated changes

1. Duration-margin + K-consecutive hysteresis

2. Per-branch cooldown

3. Default lease_rotate_interval 50ms → 250ms

API change

Test plan

What this is and isn't

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

hardbyte commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hardbyte commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

3. Default `lease_rotate_interval` 50ms → 250ms

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading