Skip to content

fix(blockstore): don't activate bloom cache if build incomplete#1183

Open
guillaumemichel wants to merge 3 commits into
mainfrom
fix/bloom-incomplete-build
Open

fix(blockstore): don't activate bloom cache if build incomplete#1183
guillaumemichel wants to merge 3 commits into
mainfrom
fix/bloom-incomplete-build

Conversation

@guillaumemichel

Copy link
Copy Markdown
Member

Problem

The blockstore's Bloom filter cache relies on one invariant: every stored CID must be present in the filter, so a filter miss can be trusted as a conclusive "not present" answer.

The initial filter build populates itself by enumerating the datastore via AllKeysChan. That enumeration had two ways to end early without anyone noticing:

  • A mid-iteration datastore error. AllKeysChan only logged errors returned by the underlying query (dsq.Result.Error) and then closed its output channel — indistinguishable from a normal, complete enumeration.
  • A cancelled context. Cancelling the build's context could still race with the channel closing "cleanly," letting the build finish and mark the filter active anyway.

In both cases the build could not tell a truncated enumeration from a complete one, so it activated a Bloom filter that only covered a subset of the actual stored CIDs. Once active, the cache started giving wrong, conclusive answers for blocks that really exist but were never indexed:

  • Hasfalse
  • Get / GetSize / View → not-found
  • DeleteBlock → silent no-op (doesn't reach the underlying store)

This is a correctness bug, not a performance one — it can make present data appear to have vanished.

Fix

Add a way for the enumeration to report, after the fact, whether it actually completed:

  • New internal capability interface allKeysChanWithErrer (allKeysChanWithErr) alongside AllKeysChan. It returns the same key channel plus a func() error that — once the channel is fully drained — reports whether enumeration was truncated (mid-iteration datastore error or context cancellation) or completed cleanly.
  • Implemented on the base blockstore, and forwarded through every layer that can sit between the base store and the Bloom cache: tqcache, idstore, ValidatingBlockstore. A allKeysChanWithErrFor helper falls back to a no-op (always-nil) error function for any blockstore that doesn't implement the capability, preserving today's best-effort behavior for those.
  • bloomcache.build() now checks that error function after the channel closes, and only flips the filter to active if it's nil. On a truncated or cancelled build, the filter stays inactive and the cache correctly degrades to pass-through (queries fall through to the real store) instead of serving false negatives.
  • This also removes the cancellation race described above: activation now depends on the enumeration's actual terminal state rather than which branch of a select happens to fire when the channel closes and the context is cancelled at the same time.

Unparseable datastore keys are still skipped (not treated as a truncating error) — they can't correspond to a retrievable block, so omitting one can't cause a false negative.

@codecov

codecov Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.10345% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.75%. Comparing base (8a6d1d2) to head (bbf07ab).

Files with missing lines Patch % Lines
blockstore/blockstore.go 93.75% 1 Missing ⚠️
blockstore/bloom_cache.go 85.71% 1 Missing ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1183      +/-   ##
==========================================
+ Coverage   63.69%   63.75%   +0.06%     
==========================================
  Files         269      269              
  Lines       27003    27027      +24     
==========================================
+ Hits        17200    17232      +32     
+ Misses       8092     8081      -11     
- Partials     1711     1714       +3     
Files with missing lines Coverage Δ
blockstore/idstore.go 68.25% <100.00%> (+1.04%) ⬆️
blockstore/twoqueue_cache.go 72.46% <100.00%> (-2.18%) ⬇️
blockstore/validating_blockstore.go 66.66% <100.00%> (+6.66%) ⬆️
blockstore/blockstore.go 56.02% <93.75%> (+9.15%) ⬆️
blockstore/bloom_cache.go 69.60% <85.71%> (+6.26%) ⬆️

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant