fix(blockstore): don't activate bloom cache if build incomplete#1183
Open
guillaumemichel wants to merge 3 commits into
Open
fix(blockstore): don't activate bloom cache if build incomplete#1183guillaumemichel wants to merge 3 commits into
guillaumemichel wants to merge 3 commits into
Conversation
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## main #1183 +/- ##
==========================================
+ Coverage 63.69% 63.75% +0.06%
==========================================
Files 269 269
Lines 27003 27027 +24
==========================================
+ Hits 17200 17232 +32
+ Misses 8092 8081 -11
- Partials 1711 1714 +3
... and 5 files with indirect coverage changes 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The blockstore's Bloom filter cache relies on one invariant: every stored CID must be present in the filter, so a filter miss can be trusted as a conclusive "not present" answer.
The initial filter build populates itself by enumerating the datastore via
AllKeysChan. That enumeration had two ways to end early without anyone noticing:AllKeysChanonly logged errors returned by the underlying query (dsq.Result.Error) and then closed its output channel — indistinguishable from a normal, complete enumeration.In both cases the build could not tell a truncated enumeration from a complete one, so it activated a Bloom filter that only covered a subset of the actual stored CIDs. Once active, the cache started giving wrong, conclusive answers for blocks that really exist but were never indexed:
Has→falseGet/GetSize/View→ not-foundDeleteBlock→ silent no-op (doesn't reach the underlying store)This is a correctness bug, not a performance one — it can make present data appear to have vanished.
Fix
Add a way for the enumeration to report, after the fact, whether it actually completed:
allKeysChanWithErrer(allKeysChanWithErr) alongsideAllKeysChan. It returns the same key channel plus afunc() errorthat — once the channel is fully drained — reports whether enumeration was truncated (mid-iteration datastore error or context cancellation) or completed cleanly.tqcache,idstore,ValidatingBlockstore. AallKeysChanWithErrForhelper falls back to a no-op (always-nil) error function for any blockstore that doesn't implement the capability, preserving today's best-effort behavior for those.bloomcache.build()now checks that error function after the channel closes, and only flips the filter to active if it's nil. On a truncated or cancelled build, the filter stays inactive and the cache correctly degrades to pass-through (queries fall through to the real store) instead of serving false negatives.selecthappens to fire when the channel closes and the context is cancelled at the same time.Unparseable datastore keys are still skipped (not treated as a truncating error) — they can't correspond to a retrievable block, so omitting one can't cause a false negative.