Shared cache backend (Postgres) for multi-server consistency #3
Merged
Extract storage behind a CacheBackend SPI; Cache becomes a facade owning stats, TTL parsing, and Jackson value serialization.

Ships two backends:
- InMemoryBackend: today's behavior unchanged (live objects, zero serialization), kept as the default, so existing apps pay nothing.
- PostgresBackend: shared, cross-server-consistent, durable. Atomic incr via INSERT ... ON CONFLICT ... RETURNING, GIN-backed clearTag over TEXT[] tags, read-time expiry. Table in the Postgres-only migration_pg tier (V8), so H2 never sees it.

Opt in with app.cache(CacheBackend.postgres(dbFactory)); the default stays in-process. requiresSerialization() selects the byte path (shared) vs the live-object fast path (in-memory). Values on a byte backend carry a class-name header so a wrong-type get fails loudly and a non-serializable value throws at set time.

Page caching (CachedHandler) bypasses a serializing backend for now; rendered-response caching lands in Phase 2.

Tests: 6 new serialization unit tests + a shared CacheBackendContract run Docker-free via a SerializingMapBackend fixture; PostgresCacheBackendIT runs the same contract plus atomic-incr-under-concurrency, GIN clearTag, and read-time expiry against real Postgres. Existing CacheTest unchanged.

See docs/2026-06-04-brace-shared-cache.md.
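For readers unfamiliar with the upsert pattern behind the atomic incr, here is a minimal sketch of how an INSERT ... ON CONFLICT ... RETURNING increment might look over plain JDBC. It is an illustration under assumptions, not the PR's code: the table name cache_counters_sketch, the columns cache_key and amount, and the class CounterSketch are hypothetical, and expiry handling is left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class CounterSketch {
    // Hypothetical schema (an assumption, not the V8 migration):
    //   cache_counters_sketch(cache_key TEXT PRIMARY KEY, amount BIGINT NOT NULL)
    static final String INCR_SQL =
        "INSERT INTO cache_counters_sketch (cache_key, amount) VALUES (?, ?) "
      + "ON CONFLICT (cache_key) DO UPDATE "
      + "SET amount = cache_counters_sketch.amount + EXCLUDED.amount "
      + "RETURNING amount";

    // One statement both creates the row and increments it, so concurrent
    // callers on different servers never lose an update.
    static long incr(Connection conn, String key, long delta) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INCR_SQL)) {
            ps.setString(1, key);
            ps.setLong(2, delta);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("upsert returned no row for key " + key);
                }
                return rs.getLong(1); // the post-increment value
            }
        }
    }
}
```

Because the read-modify-write happens inside a single statement, no application-level lock or retry loop is needed; the row lock taken by the conflicting update serializes concurrent increments.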
CachedHandler now caches a RenderedResponse, a serializable snapshot of the materialized response (status, content type, headers, body bytes), instead of the Result object. A page rendered on one server is replayed by any other across a shared backend; it also works on the in-memory backend (stored as a live object). This removes the Phase 1 bypass that skipped page caching on serializing backends.

No render seam needed in BraceHandler: View.of renders eagerly at construction, so a Result is already materialized by the time CachedHandler sees it, and RenderedResponse.from just snapshots its fields. Result.raw rebuilds an arbitrary status/headers/bytes response on replay (Result.bytes hardcodes 200).

Tests: cross-instance page-cache hit and status/header preservation through serialization (unit, via SerializingMapBackend) and against real Postgres BYTEA (IT). Existing wrap tests updated to read the effective response body, since a cache hit now replays as raw bytes.
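To make the snapshot idea concrete, the following sketch shows a response snapshot that survives a byte round trip, which is what lets one server replay a page rendered by another. It is an assumption-laden illustration: the record name ResponseSnapshot and its toBytes/fromBytes methods are hypothetical, and it uses java.io data streams where the PR serializes with Jackson.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a rendered-response snapshot: everything needed
// to replay the response, and nothing tied to the server that rendered it.
record ResponseSnapshot(int status, String contentType,
                        Map<String, String> headers, byte[] body) {

    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(status);
            out.writeUTF(contentType);
            out.writeInt(headers.size());
            for (Map.Entry<String, String> h : headers.entrySet()) {
                out.writeUTF(h.getKey());
                out.writeUTF(h.getValue());
            }
            out.writeInt(body.length);
            out.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Trusts its input; a production decoder would bounds-check the lengths.
    static ResponseSnapshot fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int status = in.readInt();
            String contentType = in.readUTF();
            int headerCount = in.readInt();
            Map<String, String> headers = new LinkedHashMap<>();
            for (int i = 0; i < headerCount; i++) {
                String name = in.readUTF();
                headers.put(name, in.readUTF());
            }
            byte[] body = new byte[in.readInt()];
            in.readFully(body);
            return new ResponseSnapshot(status, contentType, headers, body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Storing the status alongside the body is the point of the design: a replayed 404 or redirect stays a 404 or redirect instead of collapsing to a 200.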
Surface whether the cache is shared so operators know clear is fleet-wide:
- CacheBackend.shared() (default false; PostgresBackend true), Cache.shared().
- /ops/cache and /ops/status report "shared"; POST /ops/cache/clear returns scope: "instance"|"fleet". Dashboard shows a shared/in-process label and a [clear fleet] vs [clear] button.

clearCache already mapped to TRUNCATE on the shared backend (fleet-wide) and size() to a count query, so there is no behavior change there, just clearer reporting.

Docs: BRACE-AGENTS.md and README.md document the in-process-vs-shared choice, the one-line opt-in (app.cache(CacheBackend.postgres(dbFactory))), the per-use-case framing, and the shared-backend constraints (Jackson-round-trippable values, per-server getOrSet dogpile). Design doc marked Phases 1-3 done.

Tests updated for the new dashboard label and the shared stat.
- BRACE-AGENTS.md: document what clear() actually clears. Data is cleared fleet-wide on a shared backend (TRUNCATE) and instance-only on the default, but hit/miss/eviction stats are per-instance, so only the box that handles the request resets its stats; and only the app-registered Cache is touched by the ops endpoint. Add a multi-server note to the cache-diagnosis runbook (size is fleet-wide, hitRate is per-instance).
- README.md: same clear/stats clarification.
- docs/migrations/brace-0.1.6-to-0.1.7.md: add an "optional shared cache backend" section (additive, non-breaking) with before/after opt-in.
- ClaudeMdGenerator: generated project CLAUDE.md now mentions the shared backend option, not just in-process.
Root cause of the missing migration guides: CLAUDE.md's documentation rule covered BRACE-AGENTS.md/README.md but never mentioned the docs/migrations/ guides, so agents had no instruction to write them.
- CLAUDE.md: add a "Migration guides (per version step)" rule: keep the in-progress (-SNAPSHOT) step's guide current as changes land, require a guide even for no-breaking-change steps (so a gap never reads as "nothing changed"), and surface missing guides rather than backfilling silently.
- docs/migrations/README.md: index of released steps with guide status, explicitly flagging the 0.1.1->0.1.6 guides as a known, untouched gap to backfill as a separate focused pass.
Correctness:
- PostgresBackend: split counters into brace_cache_counters so a key used as both a value and a counter no longer clobbers itself (parity with the in-memory two-map design). V8 migration updated; counterCount() now reports the real count.
- Cache.deserialize: bounds-check the length prefix and catch Class.forName/Jackson failures, treating a corrupt/truncated/class-removed entry as a cache MISS instead of crashing the request (NegativeArraySize/OOM/BufferUnderflow). Wrong-but-valid type still fails loudly. getOrSet recomputes on a corrupt entry.
- Cache: reuse Json.mapper() instead of a second ObjectMapper, so cached values and HTTP JSON share date/module config (no divergence).
- CachedHandler: vary the page-cache key on HX-Request so htmx partials and full pages don't share an entry; replay the snapshot on a miss too so miss and hit return the same materialized Result shape.
- Cache: reject null values on both backends (null is reserved for "missing"; previously diverged between get and getOrSet).
- Cache.close() stops the sweep thread; Brace.stop() calls it, so a Postgres-backed sweep no longer hammers a closed pool after shutdown.

Cleanup/efficiency:
- PostgresBackend.run() delegates to DatabaseFactory.withSession instead of hand-rolling open/begin/commit/rollback/close.
- PostgresBackend.size() caches the count(*) for ~5s (dashboard polls it).

Tests: +1 IT (value/counter no-collision on Postgres), +unit tests for corrupt-bytes-as-miss, unknown-class-as-miss, getOrSet-recompute, null-rejection (both backends), htmx-key separation, value/counter independence, close(). Docs updated (two tables, null rule, htmx vary). 601 unit + 7 Postgres IT green.
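The corrupt-entry-as-miss rule can be illustrated with a small framing codec. This is a sketch under stated assumptions rather than the actual Cache.deserialize: the wire layout (a 4-byte big-endian length, the UTF-8 class name, then the payload) and the names FramedValue and Decoded are hypothetical, and the Jackson step that would turn the payload into an object is omitted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class FramedValue {
    // Header and payload recovered from a stored entry.
    record Decoded(String className, byte[] payload) {}

    // Assumed layout: [int nameLength][class name, UTF-8][payload bytes].
    static byte[] encode(String className, byte[] payload) {
        byte[] name = className.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length + payload.length)
            .putInt(name.length)
            .put(name)
            .put(payload)
            .array();
    }

    // Returns empty (a cache miss) for anything that cannot be a valid frame,
    // instead of letting a bad length allocate a huge array or underflow.
    static Optional<Decoded> decode(byte[] stored) {
        if (stored == null || stored.length < 4) {
            return Optional.empty(); // too short to hold the length prefix
        }
        ByteBuffer buf = ByteBuffer.wrap(stored);
        int nameLength = buf.getInt();
        if (nameLength <= 0 || nameLength > buf.remaining()) {
            return Optional.empty(); // negative, zero, or longer than the data
        }
        byte[] name = new byte[nameLength];
        buf.get(name);
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        return Optional.of(new Decoded(new String(name, StandardCharsets.UTF_8), payload));
    }
}
```

Checking the prefix against the bytes actually present is what rules out the NegativeArraySize, OOM, and BufferUnderflow failures named above; a caller can then treat an empty result exactly like a missing key and recompute.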
megamattron added a commit that referenced this pull request on Jun 14, 2026
Wire JMH into the brace-benchmark module (jmh-core + explicit annotationProcessorPaths, since JDK 23+ disables implicit annotation processing) with a programmatic JmhRunner that always attaches the GC profiler, since gc.alloc.rate.norm is the point. run-jmh.sh installs the framework, rebuilds the benchmark jar, and runs from the repo root.

RenderAllocBench isolates M6's before/after on the render unit: a jte engine without binaryStaticContent (StringOutput -> toString -> getBytes) vs with it (Utf8ByteOutput -> toByteArray), plus the JSON pair (writeValueAsString().getBytes() vs writeValueAsBytes()), parameterized by row count.

Results (gc.alloc.rate.norm, deterministic +/-0.001 B/op; JDK 25):
- View render, 12 rows: 34,008 -> 14,184 B/op (-58%)
- View render, 100 rows: 127,187 -> 107,432 B/op (-16%)
- JSON serialize, 12 rows: 8,560 -> 1,656 B/op (-81%)
- JSON serialize, 100 rows: 67,904 -> 18,448 B/op (-73%)

The View static-content saving is a near-constant ~19.8 KB/render regardless of rows, exactly as binaryStaticContent predicts. Time also dropped more than the review predicted (render -37%, JSON -43% at 100 rows): a real CPU cut, not just GC pressure.

Full results and mechanism notes recorded in the findings doc.
megamattron added a commit that referenced this pull request on Jun 14, 2026
Follow-ups from the merge-gate code review of the Low batch:

#1 Verify ?v= against the current fingerprint before promising immutable. serveStaticFile trusted any ?v= param's presence and emitted 1-year immutable; a stale or hand-rolled ?v= (or one a CDN/client appended) could pin wrong/old bytes for a year. Now Assets.currentVersion(path) returns the current content hash (shared (path,mtime) cache) and only an exact match earns immutable; everything else is revalidate-always.

#2 Bundled htmx.min.js now carries an ETag + revalidate Cache-Control and honors conditional GETs (304), so browsers skip the ~50KB re-download each page. Not immutable: a brace upgrade can change the bytes at that fixed URL.

#3 serveStaticFile uses one Files.readAttributes (size+mtime+isRegularFile) instead of four separate File stat syscalls (exists/isFile/length/lastModified).

#4 Removed the now-dead null-invoker fallback in the request path and the unused null-producing Route(method,pattern,handler) constructor, so 'every Route has a non-null invoker' (L1) is enforced by construction.

#8 isNotModified: dropped the unused trimmed-copy var (split ran on the original).

StaticFilesTest +2 (stale ?v= -> revalidate, htmx revalidates); suite 846/846.
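The rule in #1 reduces to a single comparison, sketched below. The class AssetCachePolicy and the exact header strings are assumptions for illustration (the commit only says "1-year immutable" and "revalidate-always"); the current fingerprint is passed in as a plain string rather than looked up through Assets.currentVersion.

```java
final class AssetCachePolicy {
    // Assumed header values: 31536000 seconds is 365 days.
    static final String IMMUTABLE = "public, max-age=31536000, immutable";
    // "no-cache" lets the browser store the file but forces revalidation.
    static final String REVALIDATE = "no-cache";

    // Only a ?v= value that exactly matches the current content hash may be
    // pinned; a missing, stale, or hand-rolled value falls back to revalidation.
    static String cacheControl(String requestedVersion, String currentVersion) {
        boolean exactMatch = requestedVersion != null
            && requestedVersion.equals(currentVersion);
        return exactMatch ? IMMUTABLE : REVALIDATE;
    }
}
```

Defaulting to revalidation means a wrong guess costs one conditional request, whereas wrongly promising immutable could pin stale bytes in caches for a year.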
Summary

Adds an opt-in shared cache backend so Cache is consistent across a horizontally-scaled deploy, without changing the per-process default. One line opts in:

    app.cache(CacheBackend.postgres(dbFactory));

The in-process default is unchanged and stays first-class (live objects, zero serialization). The choice is per use case, not per deployment: you can run both (in-process for hot read-through pages, a shared Cache for counters/invalidation that must be consistent).

Design + rationale: docs/2026-06-04-brace-shared-cache.md.

What's included (by phase)
- Phase 1: CacheBackend SPI; Cache becomes a facade owning stats, TTL parsing, and Jackson value serialization. InMemoryBackend (default, live objects) + PostgresBackend (bytes via JDBC). Atomic incr (INSERT … ON CONFLICT … RETURNING), GIN-backed clearTag over TEXT[], read-time expiry. Postgres-only migration_pg/V8 (H2 never sees it).
- Phase 2: CachedHandler caches a serializable RenderedResponse snapshot, so a page rendered on one server is replayed by any other. No BraceHandler surgery needed (View renders eagerly).
- Phase 3: shared flag on /ops/cache + /ops/status, scope: instance|fleet on clear, dashboard [clear fleet] label. BRACE-AGENTS.md + README.md updated.

Backends at a glance
Shared-backend constraints (the in-process default has none): values must be Jackson-round-trippable and non-null; getOrSet single-flight is per-server, not global. The near-cache (L1/L2) tier is deferred by design; see the doc.

Code review
Ran a high-effort multi-agent review on the branch before this PR and fixed all 10 findings (commit 9c0b177). Notable real bugs caught:
- A key used as both a value and a counter clobbered itself on Postgres → counters now live in a separate brace_cache_counters table (parity with the in-memory two-map design).
- deserialize crash on corrupt/truncated bytes → bounds-checked; unreadable entries (incl. a class removed across a rolling deploy) are treated as a cache miss, not a 500.
- Other fixes: reuse Json.mapper() (consistent date handling); vary page key on HX-Request; stop the sweep thread on Brace.stop(); reject null values; size() cached ~5s; PostgresBackend uses DatabaseFactory.withSession.

Behavior / compatibility notes
- Apps that never call app.cache(backend) are unaffected.
- Null values are now rejected (set(key, null) throws); null is reserved for "missing". This was previously inconsistent (a permanent miss).
- cache.wrap(...) hits now return a materialized (raw-bytes) Result; the response bytes are identical, but custom middleware reading result.body() should read the response bytes instead.
- The page-cache key now varies on HX-Request.

Migration guide
docs/migrations/brace-0.1.6-to-0.1.7.md documents the optional opt-in. Separately, this branch adds a CLAUDE.md rule requiring a migration guide per version step and a docs/migrations/README.md index that flags a pre-existing gap (no guides for 0.1.1→0.1.6), tracked for a separate backfill, not addressed here.

Tests
- Postgres ITs (mvn verify): the same CacheBackendContract the unit suite runs, against real Postgres, plus atomic-incr-under-concurrency, GIN clearTag, read-time expiry, cross-instance page caching through BYTEA, and value/counter no-collision.

All green.