Port tests in csv_files.rs to sqllogictest#8251
Port tests in csv_files.rs to sqllogictest#8251Asura7969 wants to merge 24 commits intoapache:mainfrom
csv_files.rs to sqllogictest#8251Conversation
# Conflicts: # datafusion/physical-plan/src/joins/hash_join_utils.rs
# Conflicts: # datafusion/physical-plan/src/joins/hash_join_utils.rs
| .with_delimiter(cmd.delimiter as u8) | ||
| .with_quote(cmd.options.get("quote").map_or(b'"', |x| x.as_bytes()[0])) | ||
| .with_escape(cmd.options.get("escape").map_or(None, |x| Some(x.as_bytes()[0]))) | ||
| .with_file_compression_type(file_compression_type), |
There was a problem hiding this comment.
I'm not sure if this is the best way, maybe we can open another issue to improve it
There was a problem hiding this comment.
TBH I'd prefer to file a separate issue for integrating QUOTE & ESCAPE into CreateExternalTable pipeline. My concern here is that it probably will be confusing for end-users to pass read options into clause reserved for write options (and without any effect on writing CSV).
| WITH HEADER ROW | ||
| DELIMITER ',' | ||
| OPTIONS ('quote' '~') | ||
| LOCATION '../../testing/data/csv/csv_custom_quote.csv'; |
There was a problem hiding this comment.
This file is needed here, but I don't know how to create it.
The contents of the file(csv_custom_quote.csv) are as follows:
c1,c2
~id0~,~value0~
~id1~,~value1~
~id2~,~value2~
~id3~,~value3~
~id4~,~value4~
~id5~,~value5~
~id6~,~value6~
~id7~,~value7~
~id8~,~value8~
~id9~,~value9~
| WITH HEADER ROW | ||
| DELIMITER ',' | ||
| OPTIONS ('escape' '\') | ||
| LOCATION '../../testing/data/csv/csv_custom_escape.csv'; |
There was a problem hiding this comment.
This is the same
The contents of the file(csv_custom_escape.csv):
c1,c2
"id0","value\"0"
"id1","value\"1"
"id2","value\"2"
"id3","value\"3"
"id4","value\"4"
"id5","value\"5"
"id6","value\"6"
"id7","value\"7"
"id8","value\"8"
"id9","value\"9"
|
Thank you @Asura7969 . This PR appears to be failing CI, so marking it as draft until that is fixed. |
The reason for this failure is that the test file is missing. Can you help me check test? |
I believe another way is to generate custom-delimited (or escaped) .csv while configuring test context -- reference example might be csv generation for |
Thanks for your tip, I'll improve it |
# Conflicts: # datafusion/core/tests/sql/mod.rs
This PR also involves two attribute( CREATE EXTERNAL TABLE custom_escape (
c1 VARCHAR DEFAULT NULL,
c2 VARCHAR DEFAULT NULL
)
STORED AS CSV
WITH HEADER ROW
DELIMITER ','
OPTIONS ('escape' '\') -- this
LOCATION '../core/tests/data/custom_escape.csv';Using the API to create similar tables supports Thanks again for the tips. |
|
I created the test file under the path |
I suppose, it's more neat to use enable_testdir method of |
I used enable_testdir method,thanks for your advice |
|
cc @alamb |
alamb
left a comment
There was a problem hiding this comment.
Thanks @Asura7969 -- this code looks great to me. Also thank you very much for filing #8310
In order to avoid losing test coverage, I think we should fix #8310 before merging this one.
Is that something you are able to try @Asura7969 ? Conveniently, you already have written the test for the code in this PR :)
| info!("Registering metadata table tables"); | ||
| register_metadata_tables(test_ctx.session_ctx()).await; | ||
| } | ||
| "csv_files.slt" => { |
I'll try it this week |
For anyone following along this is done in #8351 |
|
Thanks for implementing #8351 @Asura7969! Since all tests for #8197 have been ported & with #8885 being merged this draft PR here can be closed? |
I agree -- thank you @simicd |
Which issue does this PR close?
Closes #8197 .
Rationale for this change
What changes are included in this PR?
2 parts:
quote&escapeforcreate external tablein sql(The api method is already supported)Are these changes tested?
Yes
Are there any user-facing changes?