feat(webapp): add 60s/60s SWR cache to getEntitlement#3388
feat(webapp): add 60s/60s SWR cache to getEntitlement#3388
Conversation
Wraps getEntitlement in platform.v3.server.ts with the existing
platformCache (LRU memory + Redis) under a new `entitlement` namespace.
Eliminates a synchronous billing-service HTTP round trip on every
trigger.
Cache config: 60s fresh / 60s stale SWR. Cache key is the
organization id. Errors are caught inside the loader and return the
existing permissive { hasAccess: true } fallback, which is also
cached to prevent thundering-herd on billing outages.
Trade-off: plan upgrade/downgrade is now visible after up to ~120s
worst-case (60s fresh + 60s stale revalidation). Acceptable since
the existing limits and usage namespaces use 5min/10min, and the
defensive hasAccess: true fallback already exists.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
WalkthroughAdds an SWR cache for Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/webapp/app/services/platform.v3.server.ts (1)
539-550: IncludeorganizationIdin entitlement error logs for faster incident triage.Both new error paths log the error but omit tenant context, which makes debugging cross-tenant billing incidents slower.
📋 Suggested logging tweak
- logger.error("Error getting entitlement - no success", { error: response.error }); + logger.error("Error getting entitlement - no success", { + organizationId, + error: response.error, + }); ... - logger.error("Error getting entitlement - caught error", { error: e }); + logger.error("Error getting entitlement - caught error", { + organizationId, + error: e, + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/services/platform.v3.server.ts` around lines 539 - 550, The entitlement error logs inside the platformCache.entitlement.swr callback omit tenant context; update both logger.error calls (the one in the !response.success branch and the catch branch around client.getEntitlement) to include organizationId in the structured metadata so logs contain the tenant for faster triage—i.e., when logging in the platformCache.entitlement.swr block referencing client.getEntitlement, add organizationId to the object passed to logger.error alongside existing error/response.error fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/webapp/app/services/platform.v3.server.ts`:
- Around line 539-550: The entitlement error logs inside the
platformCache.entitlement.swr callback omit tenant context; update both
logger.error calls (the one in the !response.success branch and the catch branch
around client.getEntitlement) to include organizationId in the structured
metadata so logs contain the tenant for faster triage—i.e., when logging in the
platformCache.entitlement.swr block referencing client.getEntitlement, add
organizationId to the object passed to logger.error alongside existing
error/response.error fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 82bfad1d-c81a-4ff8-9372-fd4c6ba52be8
📒 Files selected for processing (2)
.server-changes/getEntitlement-swr-cache.mdapps/webapp/app/services/platform.v3.server.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
apps/webapp/app/services/platform.v3.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Add crumbs as you write code using
//@Crumbscomments or `// `#region` `@crumbsblocks. These are temporary debug instrumentation and must be stripped usingagentcrumbs stripbefore merge.
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/app/services/platform.v3.server.ts
**/*.ts{,x}
📄 CodeRabbit inference engine (CLAUDE.md)
Always import from
@trigger.dev/sdkwhen writing Trigger.dev tasks. Never use@trigger.dev/sdk/v3or deprecatedclient.defineJob.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
apps/webapp/**/*.server.ts: Access environment variables viaenvexport fromapp/env.server.ts. Never useprocess.envdirectly.
Always usefindFirstinstead offindUniquein Prisma queries.findUniquehas implicit DataLoader batching with active bugs (uppercase UUIDs returning null, composite key SQL correctness, 5-10x worse performance).findFirstavoids these issues.
Access Prisma database via the singleton instance exported fromapp/db.server.ts.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{server,test,spec}.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
For testable code, never import
env.server.tsin test files. Pass configuration as options instead. Separate testable services (e.g.,realtimeClient.server.ts) from singleton factories (e.g.,realtimeClientGlobal.server.ts).
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
Use named constants for sentinel/placeholder values (e.g.
const UNSET_VALUE = "__unset__") instead of raw string literals scattered across comparisons.
Files:
apps/webapp/app/services/platform.v3.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: Access environment variables through theenvexport ofenv.server.tsinstead of directly accessingprocess.env
Use subpath exports from@trigger.dev/corepackage instead of importing from the root@trigger.dev/corepath
Files:
apps/webapp/app/services/platform.v3.server.ts
🧠 Learnings (4)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-03-26T09:02:07.973Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3274
File: apps/webapp/app/services/runsReplicationService.server.ts:922-924
Timestamp: 2026-03-26T09:02:07.973Z
Learning: When parsing Trigger.dev task run annotations in server-side services, keep `TaskRun.annotations` strictly conforming to the `RunAnnotations` schema from `trigger.dev/core/v3`. If the code already uses `RunAnnotations.safeParse` (e.g., in a `#parseAnnotations` helper), treat that as intentional/necessary for atomic, schema-accurate annotation handling. Do not recommend relaxing the annotation payload schema or using a permissive “passthrough” parse path, since the annotations are expected to be written atomically in one operation and should not contain partial/legacy payloads that would require a looser parser.
Applied to files:
apps/webapp/app/services/platform.v3.server.ts
📚 Learning: 2026-04-15T15:39:22.659Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:22.659Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts,app/v3/queues.server.ts} : Do NOT add database queries to `triggerTask.server.ts` or `batchTriggerV3.server.ts` hot paths. Task defaults (TTL, etc.) are resolved via `backgroundWorkerTask.findFirst()` in `queues.server.ts` — piggyback on the existing query instead of adding new ones.
Applied to files:
.server-changes/getEntitlement-swr-cache.md
🔇 Additional comments (2)
apps/webapp/app/services/platform.v3.server.ts (1)
74-78: SWR entitlement namespace configuration looks solid.The dedicated
entitlementcache namespace with 60s fresh/60s stale TTL cleanly matches the intended behavior and existing cache architecture..server-changes/getEntitlement-swr-cache.md (1)
1-6: Change note is clear and accurately scoped.The note documents the behavior change, fallback behavior, and outage trade-off in a way that is easy to reason about.
Devin caught a correctness bug in the previous commit. Returning
{ hasAccess: true } from inside the SWR loader on error caused that
fail-open value to be cached for 60-120s, which could overwrite a
legitimate hasAccess: false during a transient billing outage and
grant a blocked org access for up to 120s.
Fix: catch errors inside the loader (so we don't trigger the @unkey/cache
unhandled-rejection issue during background revalidation) and return
undefined. Apply the fail-open default *outside* the SWR call so it
never becomes a cached access decision.
Trade-off: returning undefined from the loader still overwrites the
previous cached entry with an undefined value, but @unkey/cache's
swr() treats an undefined cached value as a miss and re-fetches on the
next request — so on billing recovery, the cache picks up the real
result immediately rather than serving a stale fail-open for up to
120s.
The stale parameter on @unkey/cache Namespace is the TOTAL ttl, not an additional window beyond fresh. Setting fresh: 60_000 and stale: 60_000 gave a fresh-only entry that expired entirely at 60s with no SWR window. Setting stale: 120_000 yields the intended behavior: fresh 0-60s, stale-revalidate 60-120s. Verified locally with agentcrumbs by running through cache miss, fresh hit, stale revalidate, and stale-with-failed-bg-revalidate scenarios against a live billing server.
When setPlan transitions a customer's plan (free_connected, updated_subscription, canceled_subscription), invalidate the new entitlement cache alongside the existing billing cache invalidation. Without this, a downgrade or cancellation could leave hasAccess: true served from cache for up to 120s — meaning revoked access would linger. Per Devin's review on the swr cache PR.
| // Errors must be caught inside the loader — @unkey/cache passes the loader | ||
| // promise to waitUntil() with no .catch(), so an unhandled rejection during | ||
| // background SWR revalidation would crash the process. Returning undefined | ||
| // on error tells SWR not to commit a fail-open value to the cache, which | ||
| // prevents transient billing errors from overwriting a legitimate | ||
| // hasAccess: false entry. The fail-open default is applied *outside* the | ||
| // SWR call so it never becomes a cached access decision. | ||
| const result = await platformCache.entitlement.swr(organizationId, async () => { | ||
| try { | ||
| const response = await client.getEntitlement(organizationId); | ||
| if (!response.success) { | ||
| logger.error("Error getting entitlement - no success", { error: response.error }); | ||
| return undefined; | ||
| } | ||
| return response; | ||
| } catch (e) { | ||
| logger.error("Error getting entitlement - caught error", { error: e }); | ||
| return undefined; | ||
| } | ||
| return result; | ||
| } catch (e) { | ||
| logger.error("Error getting entitlement - caught error", { error: e }); | ||
| }); |
There was a problem hiding this comment.
🚩 SWR undefined-return strategy depends on @unkey/cache not caching undefined
The SWR loader returns undefined on errors (lines 554, 559), and the comment at lines 542-548 explicitly states this prevents SWR from committing a fail-open value to the cache. This is critical: if @unkey/cache does cache undefined, then a transient billing error during background revalidation could overwrite a legitimate hasAccess: false cached entry, granting access to an org that should be blocked. The author clearly considered this (the comment is detailed), and most SWR cache implementations skip storing undefined/null, but this behavior should be verified against the actual @unkey/cache source. Meanwhile, the .server-changes/getEntitlement-swr-cache.md:6 description contradicts the code by claiming errors 'are also cached to prevent thundering-herd on billing outages' — if errors are NOT cached (per the code comments), thundering-herd protection only kicks in while a valid cached entry exists.
Was this helpful? React with 👍 or 👎 to provide feedback.
Wraps getEntitlement in platform.v3.server.ts with the existing
platformCache (LRU memory + Redis) under a new
entitlementnamespace.Eliminates a synchronous billing-service HTTP round trip on every
trigger.
Cache config: 60s fresh / 60s stale SWR. Cache key is the
organization id. Errors are caught inside the loader and return the
existing permissive { hasAccess: true } fallback, which is also
cached to prevent thundering-herd on billing outages.
Trade-off: plan upgrade/downgrade is now visible after up to ~120s
worst-case (60s fresh + 60s stale revalidation). Acceptable since
the existing limits and usage namespaces use 5min/10min, and the
defensive hasAccess: true fallback already exists.