Migrate scalar indexes to public Lance segment-index APIs #28
Open
everySympathy wants to merge 1 commit into
Summary
This PR updates daft-lance to use Lance's public scalar segment-index workflow for distributed scalar index creation.
It bumps the Lance dependency to the Lance 8 beta line that exposes public scalar segment-index APIs, and configures the Lance Fury index used for those prerelease wheels.
The existing distributed scalar index path for BTREE, INVERTED, and FTS is migrated away from the legacy metadata merge flow. Workers now build uncommitted fragment-local index segments with create_index_uncommitted(..., fragment_ids=...), and the coordinator commits those segments with commit_existing_index_segments(...).
For INVERTED/FTS, worker-built segments are merged with merge_existing_index_segments(...) before the final commit.
Why
The old partitioned metadata merge path is not compatible with Lance 8 for BTREE, and can produce indices whose deprecated list_indices() type appears as Unknown.
Using the public segment-index workflow gives committed indices with usable describe_indices() details and matches the newer Lance scalar indexing API.
Changes
- Bump pylance to >=8.0.0b11
- Use create_index_uncommitted(...) for scalar segment creation
- Commit segments with commit_existing_index_segments(...)
- Merge INVERTED/FTS segments before commit
- BTREE and INVERTED use the segment-index workflow by default
- describe_indices()
Validation
- pytest -q tests/io/lancedb/test_lancedb_scalar_index.py
- ruff check daft_lance tests
- ruff format --check daft_lance tests
- uv lock --check
- pretty-format-toml
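For reviewers, the worker and coordinator steps described in the Summary can be sketched roughly as below. This is only an illustration of the control flow, not the code in this PR. The three method names come from the description. Everything else is an assumption: the SegmentIndexDataset protocol, the helper names build_segment and commit_segments, the positional column/index_type arguments, and the idea that merge_existing_index_segments(...) returns the value to hand to the commit call. Check the Lance 8 beta documentation for the real signatures.

```python
from typing import Any, Iterable, Protocol, Sequence


class SegmentIndexDataset(Protocol):
    """Hypothetical minimal interface for this sketch.

    Method names are taken from the PR description. The argument lists
    (other than fragment_ids) and return types are assumptions.
    """

    def create_index_uncommitted(
        self, column: str, index_type: str, *, fragment_ids: Sequence[int]
    ) -> Any: ...

    def merge_existing_index_segments(self, segments: Sequence[Any]) -> Any: ...

    def commit_existing_index_segments(self, segments: Any) -> None: ...


# Index types whose worker-built segments are merged before the final commit.
MERGE_BEFORE_COMMIT = {"INVERTED", "FTS"}


def build_segment(
    ds: SegmentIndexDataset,
    column: str,
    index_type: str,
    fragment_ids: Iterable[int],
) -> Any:
    """Worker step: build one uncommitted segment for the given fragments."""
    return ds.create_index_uncommitted(
        column, index_type, fragment_ids=list(fragment_ids)
    )


def commit_segments(
    ds: SegmentIndexDataset, index_type: str, segments: Iterable[Any]
) -> Any:
    """Coordinator step: optionally merge, then commit the segments."""
    to_commit: Any = list(segments)
    if index_type.upper() in MERGE_BEFORE_COMMIT:
        # Assumption: the merge call returns what should be committed.
        to_commit = ds.merge_existing_index_segments(to_commit)
    ds.commit_existing_index_segments(to_commit)
    return to_commit
```

In this sketch a BTREE build commits the worker segments directly, while an INVERTED or FTS build passes them through the merge call first, mirroring the two paths in the Summary.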