MDEV-39791: Handle count aggregate optimization for replay purpose#5145
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces support for recording and replaying count aggregate contexts (count_agg) within the optimizer context store/replay framework. While the implementation successfully integrates the new structures and serialization routines, several critical issues must be addressed. Most notably, there is a type confusion vulnerability in opt_sum.cc where a cast to Item_field* is performed without verifying the item type, which could lead to server crashes. Additionally, there are bugs involving the serialization of a pointer instead of the actual count value, incorrect string appending of a ulong counter, potential counter desynchronization during OOM events, and swapped error messages in the JSON parsing routines.
| void Optimizer_context_recorder::record_count_agg(TABLE *tbl, longlong count) | ||
| { | ||
| table_context_for_store *table_ctx= | ||
| get_table_context(tbl->pos_in_table_list); | ||
|
|
||
| if (unlikely(!table_ctx)) | ||
| return; // OOM | ||
|
|
||
| auto *count_agg_ctx= new (mem_root) count_agg; | ||
| if (unlikely(!count_agg_ctx)) | ||
| return; // OOM | ||
|
|
||
| count_agg_ctx->call_number= ++count_agg_counter; | ||
| count_agg_ctx->count= count; | ||
| table_ctx->count_agg_list.push_back(count_agg_ctx, mem_root); | ||
| } |
There was a problem hiding this comment.
If get_table_context or the allocation of count_agg_ctx fails (e.g., due to OOM), record_count_agg returns early without incrementing count_agg_counter. However, during replay, infuse_count_agg increments count_agg_counter unconditionally. This will cause the counters to get out of sync for any subsequent count aggregates. Incrementing count_agg_counter at the very beginning of record_count_agg ensures that the counters remain synchronized even if recording fails for a specific call.
void Optimizer_context_recorder::record_count_agg(TABLE *tbl, longlong count)
{
count_agg_counter++;
table_context_for_store *table_ctx=
get_table_context(tbl->pos_in_table_list);
if (unlikely(!table_ctx))
return; // OOM
auto *count_agg_ctx= new (mem_root) count_agg;
if (unlikely(!count_agg_ctx))
return; // OOM
count_agg_ctx->call_number= count_agg_counter;
count_agg_ctx->count= count;
table_ctx->count_agg_list.push_back(count_agg_ctx, mem_root);
}5acf866 to
5f4ea85
Compare
Separately dump into the optimizer context, the count of records for every count aggregate along with the call_number that represents how many times opt_sum_query() was invoked as part of the original query execution. During replay, hook the recorded count value when the hook is invoked from opt_sum_query().
5f4ea85 to
65823a0
Compare
039fe52 to
90b9f07
Compare
Separately dump into the optimizer context, the count of records for every count aggregate along with the call_number that represents how many times opt_sum_query() was invoked as part of the original query execution.
During replay, hook the recorded count value when the hook is invoked from opt_sum_query().