Search before asking
Paimon version
708ea8f
Compute Engine
Spark version: 3.2
Minimal reproduce step
- create vector table
CREATE TABLE test_db.test_table (
  gid BIGINT,
  sid STRING,
  embs ARRAY<FLOAT>
)
PARTITIONED BY (date STRING COMMENT 'date')
ROW FORMAT SERDE 'org.apache.paimon.hive.PaimonSerDe'
WITH SERDEPROPERTIES ('serialization.format'='1')
STORED AS
  INPUTFORMAT 'org.apache.paimon.hive.mapred.PaimonInputFormat'
  OUTPUTFORMAT 'org.apache.paimon.hive.mapred.PaimonOutputFormat'
TBLPROPERTIES (
  'file.format' = 'parquet',
  'vector.file.format' = 'lance',
  'vector-field' = 'embs',
  'field.embs.vector-dim' = '4',
  'row-tracking.enabled' = 'true',
  'data-evolution.enabled' = 'true',
  'global-index.enabled' = 'true'
);
- load data
insert overwrite table test_db.test_table VALUES (1, '1', array(cast(1.0 as float), cast(2.0 as float), cast(3.0 as float), cast(4.0 as float)), '20260420');
- run a query
select gid, embs from vector_search('test_db.test_table', 'embs', array(1.0f, 2.0f, 3.0f, 3.0f), 5)
- building the Paimon scan fails
What doesn't meet your expectations?
Vector search should work with Spark 3.2, but the query above fails while building the Paimon scan.
Anything else?
No response
Are you willing to submit a PR?