[CALCITE-7510] EnumerableTableModify: advanced UPDATE, DELETE and INSERT#4922
[CALCITE-7510] EnumerableTableModify: advanced UPDATE, DELETE and INSERT#4922wasabii wants to merge 1 commit into
Conversation
|
Thank you for your contribution! Please refer to point four of the PR template and make adjustments accordingly. |
0f0eb88 to
09685a2
Compare
|
|
|
||
| /** Generates code for an UPDATE statement. | ||
| * | ||
| * <p>The child produces, for each row matched by the WHERE clause, a row of |
There was a problem hiding this comment.
is "the child" childExp?
This phrase is awkward with lots of commas, you can rearrange it a bit.
| * updateColumnList.size()} fields are the new values, one per column named in | ||
| * the SET clause. Filtering by WHERE has already been applied upstream, so | ||
| * every source row corresponds to an existing row in the modifiable collection | ||
| * and can be located by full-row content equality. |
There was a problem hiding this comment.
What happens if the collection is a multiset?
Equality will match multiple rows.
I hope there is a test for this case.
There was a problem hiding this comment.
It would match multiple rows. I thought about this, but I am not sure how better to go about it currently. There isn't really any sort of 'row id' passed around between the two sides of this. I also don't really like the full equality test. But, no PK metadata either.
I figured that this was largely good enough for the ServerDdlExecutor.
There was a problem hiding this comment.
Actually, I could do something like take the input update list, and ensure that we only ever update one item in the target collection per item in the update list. It's just a lot of complicated accounting, and I don't know if it matter. It certainly doesn't for my use case for this (which is just minimal UPDATE functionality so I can use ServerDdlExecutor as the backend for some other projects test cases.
There was a problem hiding this comment.
Calcite is a library with many users, we have to build APIs that have well-defined contracts.
The contract has to be defined clearly, and to make sense. Perhaps it's fine to not support deletion from multi-sets, but this has to be documented and perhaps justified. One justification is that using the DELETE statement you can only delete all or none of the copies of a value from a multiset.
There was a problem hiding this comment.
I implemented it as I suggested. With tests. Since all the records in this hypothetical are the same, it just deletes the first record it hits for each item in the source set. So whatever copies remain remain.
|
|
||
| /** Generates code for a DELETE statement. | ||
| * | ||
| * <p>{@code Object[].equals()} is reference equality, so |
There was a problem hiding this comment.
This comment is about the implementation and not what this function does. I would make it a regular comment. The JavaDoc can use more information, though.
|
|
||
| /** Generates code for an INSERT statement. | ||
| * | ||
| * <p>When the child row type differs from the table row type (the common case |
| /** | ||
| * Removes from {@code sink} every row that appears in {@code source}. | ||
| * | ||
| * <p>For multi-column rows the backing collection stores {@code Object[]} |
There was a problem hiding this comment.
another implementation comment
| for (Object row : tempList) { | ||
| toRemove.add(Arrays.asList((Object[]) row)); | ||
| } | ||
| ((Collection<Object[]>) sink).removeIf(row -> toRemove.contains(Arrays.asList(row))); |
There was a problem hiding this comment.
This seems to rely on the fact that all equal rows will either be deleted or not.
Maybe this is fine, but then function definition should specify this.
There are legit cases where you may want to remove only some of the copies (e.g., IVM).
b4c9210 to
6af4079
Compare
mihaibudiu
left a comment
There was a problem hiding this comment.
Please do not amend commits during the review process, especially for such a large PR.
It makes it difficult for reviewers to see what has changed
mihaibudiu
left a comment
There was a problem hiding this comment.
I think this is fine, but we won't merge it before the next release, which will hopefully happen soon.
I did this because the policy said The commit text should match the PR title. I assumed you wanted then squashed. It was super annoying to do and I would have rather not but I thought it was correct. |
|
We'll squash before merging |
|
Please check the failures from the checker framework, these are not run as part of |
|
Can you please also check my last two comments and see if they can be addressed? |
|
Did so. |
|
If the CI passes you can also squash the commits into one. |
|
Did I do it? Did I get it all? |
|
You can get most of these diagnoses by running locally |



Jira Link
CALCITE-7510
Changes Proposed
Add UPDATE support to EnumerableTableModify.
Fix DELETE support: was calling removeAll, which removed all instances of the matching entity, or did not remove anything if the row type did not implement equality properly.
INSERT support works as is, but the code paths were cleaned up a bit given the restructure caused by UPDATE and DELETE.
These are each fairly fundamentally different operations. UPDATE and DELETE require finding the original rows in the collection. But, UPDATE operates on an array containing both existing columns with the new values appended, even if the table is a scalar. DELETE requires finding the existing columns, but per-row state needs to be maintained to only delete the appropriate number of items (in case of a multiset). INSERT is pretty straightforward.