fix(python)!: derive index type from details instead of opening the index #6903

Open
wjones127 wants to merge 5 commits into lance-format:main from wjones127:fix-unknown-index-type

wjones127 commented May 22, 2026

BREAKING CHANGE: describe_indices() now reports nested and special-character field names as full field paths (e.g. meta.lang, `user-id`) instead of just the leaf name.
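
To illustrate the path format described above, here is a minimal sketch of how nested field names could be joined into a full field path. The helper name `format_field_path` and the rule for deciding which segments need backtick quoting are assumptions made for illustration; the actual quoting rule lives in the Lance implementation and may differ.

```python
import re
from typing import Sequence

# Assumption: a segment is "plain" if it looks like a simple identifier.
# Anything else (for example a name containing "-") gets backtick quoting.
_PLAIN_SEGMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def format_field_path(segments: Sequence[str]) -> str:
    """Join nested field names into a dotted path (hypothetical helper)."""
    quoted = [
        seg if _PLAIN_SEGMENT.match(seg) else f"`{seg}`"
        for seg in segments
    ]
    return ".".join(quoted)


print(format_field_path(["meta", "lang"]))  # meta.lang
print(format_field_path(["user-id"]))       # `user-id`
```

Callers that previously compared `field_names` against a leaf name such as `lang` would need to compare against the full path instead.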

Previously, list_indices() called the load_indices() binding, which opened each index to derive its type and reported "Unknown" on any failure.

list_indices() is now a thin wrapper over describe_indices(), which derives the type from index details without opening the index:

  • describe_indices() no longer errors on indices without index details; it returns a best-effort degraded entry instead.
  • When index details exist but no plugin is registered for the type URL, the type is derived from the type URL rather than "Unknown".
  • field_names now uses the full field path, so nested fields are reported as dotted paths instead of just the leaf name.
  • IndexSegmentDescription gains a base_id field.
  • The unused load_indices() Python binding is removed.
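
The fallback order in the list above can be sketched roughly as follows. This is not the Rust implementation: the function name `derive_index_type`, the `plugins` mapping, the example type URL, and the rule of stripping an `IndexDetails` suffix are all assumptions chosen to show the idea of naming a type from its type URL without opening the index. The sketch also ignores the legacy vector index case.

```python
from typing import Mapping, Optional


def derive_index_type(type_url: Optional[str], plugins: Mapping[str, str]) -> str:
    """Pick an index type name without opening the index (hypothetical sketch)."""
    # No index details at all: return a degraded entry rather than raising.
    if not type_url:
        return "Unknown"
    # A registered plugin knows the proper display name for this type URL.
    if type_url in plugins:
        return plugins[type_url]
    # No plugin: fall back to the message name embedded in the type URL.
    message_name = type_url.rsplit("/", 1)[-1].rsplit(".", 1)[-1]
    suffix = "IndexDetails"  # assumed naming convention, for illustration only
    if message_name.endswith(suffix) and len(message_name) > len(suffix):
        return message_name[: -len(suffix)]
    return message_name or "Unknown"


# Hypothetical type URL; real Lance type URLs may look different.
print(derive_index_type("/example.ExampleIndexDetails", {}))  # Example
print(derive_index_type(None, {}))                            # Unknown
```

The point of the design is that every branch works from metadata alone, so a missing plugin or unreadable index file no longer collapses the result to "Unknown".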

The list_indices() return type hint was incorrect: it declared List[Index], but the method has always returned dicts. It now returns a typed IndexInformation TypedDict, so callers get key and value types instead of an opaque dict.
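
As a rough illustration of what a TypedDict return type buys callers, the sketch below defines a stand-in `IndexInformationSketch`. Its keys (`name`, `type`, `fields`) are hypothetical and are not taken from the actual `IndexInformation` definition in this PR. The values are still plain dicts at runtime, but a type checker such as pyright can verify key names and value types.

```python
from typing import List, TypedDict


class IndexInformationSketch(TypedDict):
    # Hypothetical keys for illustration; see the PR diff for the real ones.
    name: str
    type: str
    fields: List[str]


def summarize(indices: List[IndexInformationSketch]) -> List[str]:
    """Format one line per index; a type checker validates each key access."""
    return [
        f"{idx['name']} ({idx['type']}) on {', '.join(idx['fields'])}"
        for idx in indices
    ]


example: List[IndexInformationSketch] = [
    {"name": "lang_idx", "type": "BTree", "fields": ["meta.lang"]},
]
print(summarize(example))  # ['lang_idx (BTree) on meta.lang']
```

Because a TypedDict is an ordinary dict at runtime, existing code that indexes the result by string key keeps working unchanged.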

Testing

  • Rust: cargo test -p lance --lib index::, lance-index registry tests — new tests cover the degraded entry and the type-URL fallback.
  • Python: test_scalar_index.py, test_column_names.py, test_vector_index.py, test_optimize.py — including new list_indices() characterization tests committed before the rework, plus index-without-details and legacy-vector cases.
  • Lint: cargo fmt, cargo clippy (lance, lance-index, pylance), ruff, pyright.

🤖 Generated with Claude Code

wjones127 and others added 2 commits May 21, 2026 16:26
The deprecated list_indices() returns plain dicts (not Index dataclasses),
but its dict shape was not covered by any test. Add characterization tests
locking down the dict keys/values and the nested-field path format, as a
backwards-compatibility guard before reworking the implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
list_indices() called the load_indices() binding, which opened each
index to derive its type and reported "Unknown" on any failure.

list_indices() is now a thin wrapper over describe_indices(), which
derives the type from index details without opening the index:

- describe_indices() no longer errors on indices without index details;
  it returns a best-effort degraded entry instead.
- When index details exist but no plugin is registered for the type
  URL, the type is derived from the type URL rather than "Unknown".
- field_names now uses the full field path, so nested fields are
  reported as dotted paths instead of just the leaf name.
- IndexSegmentDescription gains a base_id field.
- The unused load_indices() Python binding is removed.

The list_indices() return type hint is corrected from List[Index] to
List[Dict[str, Any]] to match what it actually returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions bot added the bug, python, and breaking-change labels May 22, 2026
Add Python tests that commit raw index metadata without index details
via CreateIndex and assert list_indices()/describe_indices() report the
degraded entry ("Unknown") and the legacy monolithic vector index
("Vector") instead of erroring.

Also corrects the field_names doc comment to say "full paths" rather
than "names".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 92.10526% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/index.rs | 90.81% | 8 Missing and 1 partial ⚠️ |

wjones127 and others added 2 commits May 21, 2026 19:01
test_nested_field_vector_index indexes the nested `data.embedding`
column; field_names now reports the full path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the `List[Dict[str, Any]]` return type of `list_indices()` with
a typed `IndexInformation` TypedDict so callers get key/value type
information instead of an opaque dict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
wjones127 marked this pull request as ready for review May 22, 2026 16:00

claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

wjones127 changed the title from "fix!: derive index type from details instead of opening the index" to "fix(python)!: derive index type from details instead of opening the index" May 22, 2026