MDEV-34228: Support underscores in numeric literals (SQL:2023 T662)#4624
MDEV-34228: Support underscores in numeric literals (SQL:2023 T662)#4624abhishek593 wants to merge 4 commits intoMariaDB:mainfrom
Conversation
gkodinov
left a comment
There was a problem hiding this comment.
A good start! Thank you for your contribution!
This is a preliminary review. The important part is to measure and document the performance impact.
Please also quote into the worklog not some random blog, but the actual version, clause and ideally the definition of the extension itself.
I'd be especially interested how does the standard define the conversions of such literal to strings. E.g. is cast (1_2_3 AS string) supposed to preserve these?
And vice versa, is cast("123" as INT) supposed to preserve these?
You also need to cover casts to/from strings, e.g.: CAST('1_2.3_4e1_0' AS …)
| LEX_CSTRING Lex_input_stream::get_numeric_token(uint skip, uint length) | ||
| { | ||
| const char *str= m_tok_start + skip; | ||
| for (uint i= 0; i < length; i++) |
There was a problem hiding this comment.
I'd use strchr: maybe the compiler will optimize it better.
| underscore will be removed. We can iterate over str to count | ||
| exact size of the new string, but that may be slower. | ||
| */ | ||
| if (!(to= m_thd->alloc<char>(length))) |
There was a problem hiding this comment.
I would like you to profile and record the following performance comparisons:
- processing a SQL SELECT statement with 1M integer literals (no underscores): prior to your fix compared to after. Run that 10M times to get a good sample.
- same as above, but with underscores.
| SELECT 0xABCD_EF01; | ||
| 0xABCD_EF01 | ||
| ��� |
There was a problem hiding this comment.
this is probably garbled by some editor!
The main contributions of this PR are:
New validation rules: