feat: support DROP INDEX DDL #371
Merged
hamersaw merged 6 commits into lance-format:main on Apr 2, 2026
hamersaw
reviewed
Apr 1, 2026
Collaborator
hamersaw
left a comment
Looks great thanks! Can you just add an integration test?
Add two integration tests to TestDDLIndex:
- test_drop_index: creates an index, drops it, and verifies the output schema and that SHOW INDEXES no longer lists it
- test_drop_index_then_recreate: full lifecycle (create -> drop -> recreate); verifies the recreated index works with queries
LuciferYang
commented
Apr 2, 2026
Contributor
Author
LuciferYang
left a comment
Thanks for the review! Added two integration tests in docker/tests/test_lance_spark.py under TestDDLIndex:
- test_drop_index: creates a BTree index, drops it via ALTER TABLE ... DROP INDEX, verifies the output schema (index_name, status), and confirms the index no longer appears in SHOW INDEXES.
- test_drop_index_then_recreate: full lifecycle (create → drop → recreate), then verifies the recreated index exists in SHOW INDEXES and queries still work correctly.
SHOW INDEXES output schema uses column 'name', not 'index_name'. The DROP INDEX output uses 'index_name' — these are different schemas.
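The schema difference noted in this commit is easy to trip over in test assertions. Below is a minimal sketch of how a test might keep the two schemas apart. It uses plain Python lists and dicts in place of Spark DataFrame columns and rows. The helper names and constants are hypothetical; only the column names `name`, `index_name` and `status` come from this PR.

```python
# Column names taken from the PR discussion; everything else is illustrative.
DROP_INDEX_COLUMNS = ["index_name", "status"]
SHOW_INDEXES_NAME_COLUMN = "name"


def check_drop_index_output(columns):
    """Raise AssertionError if a DROP INDEX result has unexpected columns."""
    if list(columns) != DROP_INDEX_COLUMNS:
        raise AssertionError(f"unexpected DROP INDEX columns: {list(columns)}")


def listed_index_names(show_indexes_rows):
    """Collect index names from SHOW INDEXES rows given as dict-like objects.

    SHOW INDEXES exposes the index under 'name', not 'index_name',
    so the lookup key differs from the DROP INDEX output.
    """
    return {row[SHOW_INDEXES_NAME_COLUMN] for row in show_indexes_rows}
```

In a real integration test the inputs would presumably be `df.columns` and `df.collect()` from the corresponding `spark.sql(...)` calls; that wiring is left out here because it needs a running Spark session.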
Contributor
Author
I need to investigate the Docker testing further.
Spark's spark-catalyst JAR contains its own org.apache.spark.sql.catalyst.plans.logical.DropIndex with a 3-parameter constructor (LogicalPlan, String, boolean). At runtime Spark's class takes precedence on the classpath, causing NoSuchMethodError when the AST builder tries to call lance-spark's 2-parameter constructor. Rename to LanceDropIndex / LanceDropIndexOutputType to avoid the collision, consistent with how other lance-spark classes (AddIndex, ShowIndexes, etc.) already use names that don't conflict with Spark built-ins.
…llision Spark's spark-sql JAR also contains DropIndexExec in the same package (org.apache.spark.sql.execution.datasources.v2) with a different constructor signature. Rename to LanceDropIndexExec for the same reason as the LanceDropIndex rename.
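The collision described in these two commits can be illustrated with a toy model of first-match class resolution. This is only a sketch in Python, not how the JVM class loader is implemented: the `resolve` and `instantiate` helpers are hypothetical, the jar labels are illustrative, and constructors are reduced to an argument count. It assumes, as the commit message implies, that both jars originally defined a class with the same fully qualified name.

```python
PKG = "org.apache.spark.sql.catalyst.plans.logical"


def resolve(classpath, fqcn):
    """Return (jar, constructor_arity) from the first jar that defines fqcn."""
    for jar, classes in classpath:
        if fqcn in classes:
            return jar, classes[fqcn]
    raise LookupError(fqcn)


def instantiate(classpath, fqcn, *args):
    """Pretend to call a constructor; fail if the resolved class lacks it."""
    jar, arity = resolve(classpath, fqcn)
    if len(args) != arity:
        # Stand-in for the JVM's NoSuchMethodError.
        raise TypeError(f"{fqcn} from {jar} has no {len(args)}-arg constructor")
    return jar


# Before the rename: both jars define ...logical.DropIndex, Spark's wins.
before = [
    ("spark-catalyst", {f"{PKG}.DropIndex": 3}),
    ("lance-spark", {f"{PKG}.DropIndex": 2}),
]

# After the rename: the names no longer overlap.
after = [
    ("spark-catalyst", {f"{PKG}.DropIndex": 3}),
    ("lance-spark", {f"{PKG}.LanceDropIndex": 2}),
]
```

With `before`, a two-argument call to `DropIndex` resolves to the Spark class and fails; with `after`, a two-argument call to `LanceDropIndex` can only resolve to the lance-spark class.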
Contributor
Author
All tests passed.
Contributor
Author
Thanks @hamersaw
Contributor
Author
By the way, I will be on holiday for four days and may not respond promptly to code fix suggestions during that time.
Collaborator
Enjoy the break!
Summary
Add support for dropping indexes on Lance tables through the Spark SQL interface, using ALTER TABLE ... DROP INDEX indexName.
This is the counterpart to the existing CREATE INDEX and SHOW INDEXES commands, completing the index lifecycle management in lance-spark.
Design
DROP INDEX is a metadata-only operation: it removes the index entry from the dataset manifest via lance-core's dataset.dropIndex(name) API. Physical index files are not deleted; they are cleaned up by VACUUM during garbage collection.
The implementation follows the standard lance-spark SQL extension pipeline:
- ALTER TABLE ... DROP INDEX indexName rule + DROP token
- visitDropIndex in all version-specific builders (3.4, 3.5, 4.0, 4.1)
- LanceDropIndex(table, indexName) with output schema (index_name, status)
- LanceDropIndexExec calls dataset.dropIndex(indexName) on the driver
- LanceDropIndex → LanceDropIndexExec mapping with indexName.toLowerCase for consistency with CREATE INDEX
Changes
- LanceSqlExtensions.g4: Added dropIndex grammar rule and DROP token
- LanceSqlExtensionsAstBuilder.scala (3.4, 3.5, 4.0, 4.1): Added visitDropIndex visitor
- DropIndex.scala: New LanceDropIndex logical plan
- DropIndexExec.scala: New LanceDropIndexExec physical execution node
- LanceDataSourceV2Strategy.scala: Added LanceDropIndex → LanceDropIndexExec dispatch
- BaseAddIndexTest.java: Added testDropIndex and testDropIndexThenRecreate
- test_lance_spark.py: Added test_drop_index and test_drop_index_then_recreate integration tests
- drop-index.md: New documentation page
Test plan
Unit tests (BaseAddIndexTest.java)
- testDropIndex: Creates index, drops it, verifies schema/output, confirms index is gone
- testDropIndexThenRecreate: Full lifecycle: create → drop → recreate → verify query works
Integration tests (test_lance_spark.py)
- test_drop_index: Creates BTree index, drops it via SQL, verifies output schema (index_name, status), confirms SHOW INDEXES no longer lists it
- test_drop_index_then_recreate: Full lifecycle: create → drop → recreate, verifies recreated index appears in SHOW INDEXES and queries still return correct results
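To make the Design section more concrete, here is a toy Python model of the flow it describes: parse the statement, build a plan with a lowercased index name, execute it as a manifest-only change, and leave file cleanup to a later vacuum. This is a sketch under stated assumptions, not the lance-spark or lance-core code. `ToyDataset`, `DropIndexPlan`, `parse_drop_index` and `execute` are hypothetical names, the regex is a simplification of the real grammar, and the `"ok"` status value is a placeholder because the PR does not say what the status column contains.

```python
import re
from dataclasses import dataclass, field

# Simplified stand-in for the ALTER TABLE ... DROP INDEX indexName grammar rule.
_DROP_INDEX = re.compile(
    r"^\s*ALTER\s+TABLE\s+(?P<table>\S+)\s+DROP\s+INDEX\s+(?P<index>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


@dataclass
class DropIndexPlan:
    table: str
    index_name: str


@dataclass
class ToyDataset:
    manifest: set = field(default_factory=set)  # index names the table knows about
    files: set = field(default_factory=set)     # index data still on disk

    def create_index(self, name):
        name = name.lower()
        self.manifest.add(name)
        self.files.add(name)

    def drop_index(self, name):
        # Metadata-only: the manifest entry goes away, the files stay.
        self.manifest.remove(name)

    def vacuum(self):
        # Garbage-collect files that no manifest entry refers to.
        self.files &= self.manifest


def parse_drop_index(sql):
    """Turn a DROP INDEX statement into a plan, lowercasing the index name."""
    match = _DROP_INDEX.match(sql)
    if match is None:
        raise ValueError(f"not a DROP INDEX statement: {sql!r}")
    return DropIndexPlan(match["table"], match["index"].lower())


def execute(plan, catalog):
    """Apply the plan and return rows shaped like (index_name, status)."""
    catalog[plan.table].drop_index(plan.index_name)
    return [(plan.index_name, "ok")]  # "ok" is a placeholder status value
```

The point of the model is the ordering: after `execute`, the index is gone from `manifest` but still present in `files`, and only `vacuum()` removes it from `files`.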