Migrate scalar indexes to public Lance segment-index APIs #28

Open
everySympathy wants to merge 1 commit into daft-engine:main from everySympathy:segment-index-foundation

@everySympathy everySympathy commented Jun 14, 2026

Collaborator

Summary

This PR updates daft-lance to use Lance's public scalar segment-index workflow for distributed scalar index creation.

It bumps the Lance dependency to the Lance 8 beta line, which exposes the public scalar segment-index APIs, and configures the Lance Fury package index used to install those prerelease wheels.

The existing distributed scalar index path for BTREE, INVERTED, and FTS is migrated away from the legacy metadata merge flow. Workers now build uncommitted fragment-local index segments with create_index_uncommitted(..., fragment_ids=...), and the coordinator commits those segments with commit_existing_index_segments(...).

For INVERTED / FTS, worker-built segments are merged with merge_existing_index_segments(...) before the final commit.
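
As a rough illustration of the control flow described above, here is a minimal sketch that runs against an in-memory stand-in rather than a real Lance dataset. Only the three method names (`create_index_uncommitted`, `merge_existing_index_segments`, `commit_existing_index_segments`) and the `fragment_ids=` keyword come from this PR description. The `FakeDataset` and `Segment` classes, the `build_distributed_index` helper, and every other parameter and return type are hypothetical, and the real pylance signatures may differ. In particular, merging into a single segment is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for an uncommitted, fragment-local index segment."""
    column: str
    index_type: str
    fragment_ids: tuple[int, ...]


class FakeDataset:
    """In-memory stand-in for a dataset; NOT the real pylance API.

    Only the three method names come from the PR description. All
    signatures and return types below are assumed for illustration.
    """

    def __init__(self) -> None:
        self.committed: dict[str, list[Segment]] = {}

    def create_index_uncommitted(self, column, index_type, *, fragment_ids):
        # A worker builds a segment that covers only its own fragments.
        return Segment(column, index_type, tuple(fragment_ids))

    def merge_existing_index_segments(self, segments):
        # Assumption: merging collapses the inputs into one segment.
        merged = tuple(sorted(i for s in segments for i in s.fragment_ids))
        return [Segment(segments[0].column, segments[0].index_type, merged)]

    def commit_existing_index_segments(self, name, segments):
        # The coordinator makes the segments visible under one index name.
        self.committed[name] = list(segments)


def build_distributed_index(ds, column, index_type, fragment_batches, name):
    """Hypothetical helper: fan out per-batch builds, then commit once."""
    # Worker phase: one uncommitted segment per batch of fragment ids.
    segments = [
        ds.create_index_uncommitted(column, index_type, fragment_ids=batch)
        for batch in fragment_batches
    ]
    # Coordinator phase: INVERTED / FTS segments are merged before commit.
    if index_type in ("INVERTED", "FTS"):
        segments = ds.merge_existing_index_segments(segments)
    ds.commit_existing_index_segments(name, segments)
    return segments
```

In this sketch a BTREE build over two fragment batches commits two segments, while an INVERTED build over the same batches commits the single merged segment. In the actual change the worker phase runs on distributed workers rather than in a local loop.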

Why

The old partitioned metadata merge path is not compatible with Lance 8 for BTREE, and it can produce indices whose type the deprecated list_indices() reports as Unknown.

Using the public segment-index workflow gives committed indices with usable describe_indices() details and matches the newer Lance scalar indexing API.

Changes

  • Bump pylance to >=8.0.0b11
  • Add the Lance Fury package index configuration for prerelease Lance wheels
  • Use public create_index_uncommitted(...) for scalar segment creation
  • Commit worker-built segments via commit_existing_index_segments(...)
  • Merge INVERTED / FTS segments before commit
  • Make BTREE and INVERTED use the segment-index workflow by default
  • Update scalar index tests to validate committed index details with describe_indices()

Validation

  • pytest -q tests/io/lancedb/test_lancedb_scalar_index.py
  • ruff check daft_lance tests
  • ruff format --check daft_lance tests
  • uv lock --check
  • pretty-format-toml

@everySympathy everySympathy force-pushed the segment-index-foundation branch from 639eae1 to 8f03cd7 June 14, 2026 16:14
@everySympathy everySympathy force-pushed the segment-index-foundation branch from 8f03cd7 to a685116 June 14, 2026 16:34
@everySympathy everySympathy changed the title from "refactor: prefer public segmented index API" to "Migrate scalar indexes to public Lance segment-index APIs" Jun 14, 2026