Skip to content

fix(evmrpc): cap eth_getBlockReceipts goroutine fan-out and honor ctx.Done() in eth_subscribe logs (PLT-701)#3621

Open
amir-deris wants to merge 2 commits into
mainfrom
amir/plt-701-getBlockReceipts-goroutine-fan-out
Open

fix(evmrpc): cap eth_getBlockReceipts goroutine fan-out and honor ctx.Done() in eth_subscribe logs (PLT-701)#3621
amir-deris wants to merge 2 commits into
mainfrom
amir/plt-701-getBlockReceipts-goroutine-fan-out

Conversation

@amir-deris

@amir-deris amir-deris commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

Part of the RPC endpoint protection workstream (Phase 0 — point fixes, PR 0d). Two independent hardening fixes in evmrpc:

  1. Hard cap on eth_getBlockReceipts goroutine fan-out (evmrpc/block.go)
  2. Make the eth_subscribe(logs) poll loop cancellation-aware (evmrpc/subscribe.go)

Linear: PLT-701

1. Cap eth_getBlockReceipts goroutine fan-out

The receipt loop previously spawned one goroutine per transaction with a bare sync.WaitGroup, so a block containing a very large number of txs would create an unbounded number of goroutines from a single request.

Replaced the manual WaitGroup/Mutex loop with an errgroup.Group whose limit is set to a hard cap:

const maxBlockReceiptsConcurrency = 100
...
g := new(errgroup.Group)
g.SetLimit(maxBlockReceiptsConcurrency)

errgroup.SetLimit blocks Go() until a slot frees, so it bounds the number of live goroutines (not just concurrent work). This covers the eth_, sei_, and sei2_ variants since they share GetBlockReceipts. Underlying DB reads are already bounded by the worker-pool DB semaphore, so additional goroutines beyond this cap only add scheduler/memory pressure without improving throughput.

Existing error handling is preserved (the "not found" receipt skip and first-error propagation, now via g.Wait()); the dead if returnErr != nil tail check was removed since errors flow through the return statements that feed the metrics defer.

2. Fix eth_subscribe(logs) poll loop to honor ctx.Done()

The poll loop ended each iteration with an unconditional time.Sleep(SleepInterval), so it kept polling and notifying long after the subscriber was gone. Replaced the sleep with a cancellation-aware select:

select {
case <-ctx.Done():       // deadline expired / request cancelled
    return
case <-rpcSub.Err():     // client disconnected / unsubscribed
    return
case <-time.After(SleepInterval):
}

This mirrors the existing newHeads handler (which selects on rpcSub.Err()) and adds the ctx.Done() case so the goroutine stops promptly on disconnect or deadline expiry.

Testing

  • go build ./evmrpc/..., go vet ./evmrpc/, gofmt -s all clean.
  • Existing suites pass: evmrpc and evmrpc/tests Subscribe|Logs|Receipt|Block selectors, including the multi-tx TestGetBlockReceipts integration test that exercises the rewritten fan-out path.

No new tests added: the receipts change is a refactor of an already-covered path to a stdlib-guaranteed bound, and the subscribe path has no clean unit-test seam today (rpc.NotifierFromContext is server-internal and the existing logs-subscription integration test is t.Skip()-ed).

@amir-deris amir-deris self-assigned this Jun 22, 2026
@amir-deris amir-deris changed the title Added goroutine cap for block receipts, added ctx Done check for Logs fix(evmrpc): cap eth_getBlockReceipts goroutine fan-out and honor ctx.Done() in eth_subscribe logs (PLT-701) Jun 22, 2026
@cursor

cursor Bot commented Jun 22, 2026

Copy link
Copy Markdown

PR Summary

Low Risk
Behavior-preserving RPC hardening with bounded concurrency; no auth or data-model changes.

Overview
Hardens two evmrpc paths as part of RPC endpoint protection.

GetBlockReceipts no longer launches one goroutine per transaction. Per-tx receipt fetch now uses errgroup with SetLimit(100), so large blocks cannot create unbounded goroutines from a single request. Missing-receipt skips and first real error behavior are unchanged; errors return via g.Wait() instead of a shared mutex flag.

eth_subscribe (logs) polling replaces a fixed time.Sleep between iterations with a select on rpcSub.Err() or SleepInterval, so the loop exits when the client unsubscribes instead of polling indefinitely.

Reviewed by Cursor Bugbot for commit 6c68e44. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 22, 2026, 10:30 PM

@codecov

codecov Bot commented Jun 22, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 35.71429% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.12%. Comparing base (daba2e7) to head (6c68e44).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
evmrpc/block.go 40.00% 5 Missing and 1 partial ⚠️
evmrpc/subscribe.go 25.00% 3 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3621      +/-   ##
==========================================
- Coverage   58.99%   58.12%   -0.88%     
==========================================
  Files        2224     2150      -74     
  Lines      182708   174158    -8550     
==========================================
- Hits       107782   101221    -6561     
+ Misses      65235    63946    -1289     
+ Partials     9691     8991     -700     
Flag Coverage Δ
sei-chain-pr 68.49% <35.71%> (?)
sei-db 70.41% <ø> (ø)
sei-db-state-db ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
evmrpc/subscribe.go 72.11% <25.00%> (-0.88%) ⬇️
evmrpc/block.go 83.91% <40.00%> (+0.87%) ⬆️

... and 74 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@bdchatham bdchatham left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-review of the two fixes. The receipts cap looks good; one blocking item on the eth_subscribe(logs) change plus a few non-blocking notes, all inline.

Comment thread evmrpc/subscribe.go Outdated
// disconnects (rpcSub.Err()) or the request context is cancelled /
// its deadline expires (ctx.Done()).
select {
case <-ctx.Done():

@bdchatham bdchatham Jun 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one will actually break the subscription. The ctx we get here gets cancelled as soon as the eth_subscribe call returns its subscription id — the handler in the go-ethereum fork we're on (v1.15.7-sei-16) does a defer cancel() on it in handleNonBatchCall (rpc/handler.go:283) and again in startCallProc (rpc/handler.go:384). Since Logs hands back rpcSub right away, by the time we loop back to this select the ctx is already done, so the client gets the first backfill batch and then the goroutine just exits — no more logs, no error, no unsubscribe.

That's why geth's own filter subscriptions (and NewHeads right here in this file) only watch rpcSub.Err() and never the method ctx. I'd drop the ctx.Done() case and keep just rpcSub.Err() + the time.After, so it lines up with NewHeads. If we actually want a disconnect to cancel an in-flight fetch, that's a separate change — we'd need a connection-lifetime ctx, not this per-call one.

Worth noting too that the logs-subscription integration test is skipped, so CI won't catch this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for great feedback. I will remove the ctx.Done(). Regarding the tests passing, actually there seems to be a tests that is failing: subscribe test
which is a test that is added by @kollegian in this pr Link
Thanks to Yasin for improving the test coverage.

Comment thread evmrpc/block.go Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants