[CALCITE-7422] Support large plan optimization mode for HepPlanner #4803
zhuwenzhuang wants to merge 2 commits into apache:main
Conversation
A better iterator implementation of DFS/BFS is needed. I will optimize this later (I am unavailable for the next two weeks).
Key optimizations of large plan mode:

1. Reusable graph: avoids re-initialization.
2. Efficient traversal: skips stable subtrees.
3. Fine-grained GC.

Usage: see the comments of HepPlanner().

Perf results of LargePlanBenchmark:

    Match Order   Union Num   Node Count   Rule Transforms   Time (ms)
    --------------------------------------------------------------------
    ARBITRARY          1000         4000              6006        1043
    ARBITRARY          3000        12000             18006        1306
    ARBITRARY         10000        40000             60006        3655
    ARBITRARY         30000       120000            180006       13040
    DEPTH_FIRST        1000         4000              6006         347
    DEPTH_FIRST        3000        12000             18006        1068
    DEPTH_FIRST       10000        40000             60006        4165
    DEPTH_FIRST       30000       120000            180006       12898
    BOTTOM_UP          1000         4000              6006        1145
    BOTTOM_UP          3000        12000             18006       10152
    TOP_DOWN           1000         4000              6006        1193
    TOP_DOWN           3000        12000             18006        8074
```java
private boolean enableFiredRulesCache = false;

private boolean largePlanMode = false;
```
Is the optimization somehow worse for small plans?
If it's always faster, there should not be such a flag, so you should document it as "To be removed in the future".
If small plans are worse in this mode, the flag should remain, but it still should be documented somehow (e.g., how to choose whether to set it).
The optimization is always better for plans at any scale.
I will add the comment "To be removed in the future".
```java
 * planner.setRoot(initPlanRoot);
 * planner.executeProgram(phase1Program);
 * planner.dumpRuleAttemptsInfo(); // optional
 * planner.clear(); // clear the rules and rule match caches, the graph is reused
```
I don't understand the comment about "the graph is reused".
Maybe the graph is deallocated?
The graph is not deallocated.
The clear() API isn't a good fit for the purpose of clearing the rule caches while keeping the graph in the HepPlanner (for use by other HepPrograms).
Maybe clearRules() is better?

```java
HepPlanner planner = new HepPlanner();
planner.setRoot(initPlanRoot);
planner.executeProgram(phase1Program);
planner.clearRules();
planner.executeProgram(phase2Program);
planner.clearRules();
...
RelNode optimized = planner.buildFinalPlan();
```
```java
}

@Override public RelNode findBestExp() {
  if (isLargePlanMode()) {
```
What is the reason for this choice?
Originally, I thought large plan mode preferred running multiple programs, so the main program wasn't necessary in this mode.
Considering that the optimization can be used for plans of any scale, there is another choice:
findBestExp uses mainProgram to do the optimization, and if mainProgram is empty, it just skips the mainProgram and directly builds the final plan.
I will remove this exception.
```java
iter = getGraphIterator(programState, newVertex);
} else {
  // Continue from newVertex and keep previous iterator status.
  // It prevents revisiting the large plan's stable subgraph from root.
```
what is a stable subgraph?
A stable subgraph is a part of the DAG to which no rules will be applied.
For a plan like this, every node replacement in subgraph2 may reset the iterator to the root, so subgraph1, although stable, will be visited repeatedly.

    root
      <- subgraph1_root (stable)
         <- ... other nodes in subgraph1 (stable)
      <- subgraph2_root
         <- ... other nodes in subgraph2 (with many node replacements)
You should add this information to a comment, it seems useful
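As a rough illustration of why resuming matters, here is a self-contained sketch comparing restart-from-root with resume-from-vertex traversal. The graph shape, node names, and counters are all hypothetical stand-ins, not HepPlanner code; the point is only that the stable chain is re-walked once per replacement in the restart case.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch (not HepPlanner code): cost of restarting a DFS from
 * the root after every node replacement vs. resuming from the replaced node. */
public class StableSubgraphDemo {
  /** parent -> children; "s*" nodes form a stable chain, "a*" an active one. */
  static final Map<String, List<String>> CHILDREN = new LinkedHashMap<>();
  static long restartVisits;
  static long resumeVisits;

  /** Counts how many vertices a plain iterative DFS visits from {@code start}. */
  static long dfsVisits(String start) {
    long visits = 0;
    Deque<String> stack = new ArrayDeque<>();
    stack.push(start);
    while (!stack.isEmpty()) {
      String v = stack.pop();
      visits++;
      for (String c : CHILDREN.getOrDefault(v, List.of())) {
        stack.push(c);
      }
    }
    return visits;
  }

  public static void main(String[] args) {
    int stable = 1000; // nodes in the stable subtree (no rules fire here)
    int active = 100;  // nodes that each trigger one replacement
    CHILDREN.put("root", List.of("s1", "a1"));
    for (int i = 1; i < stable; i++) {
      CHILDREN.put("s" + i, List.of("s" + (i + 1)));
    }
    for (int i = 1; i < active; i++) {
      CHILDREN.put("a" + i, List.of("a" + (i + 1)));
    }

    // Restart-from-root: every replacement re-walks the stable chain too.
    restartVisits = 0;
    for (int i = 1; i <= active; i++) {
      restartVisits += dfsVisits("root");
    }

    // Resume-from-vertex: one full walk, then only the changed subtree again.
    resumeVisits = dfsVisits("root");
    for (int i = 1; i <= active; i++) {
      resumeVisits += dfsVisits("a" + i);
    }

    System.out.println("restart visits = " + restartVisits);
    System.out.println("resume  visits = " + resumeVisits);
  }
}
```

With 1000 stable nodes and 100 replacements, the restart strategy performs far more vertex visits than resuming, which matches the intent of the comment above.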
```java
assert start == root;
collectGarbage();
if (!isLargePlanMode()) {
  // NOTE: Planner already runs GC for every subtree removed by transformation
```
I don't understand what this comment is saying.
I will change it to:
We do not need to run garbage collection for the whole graph here. tryCleanVertices already cleans up potentially removed vertices.
```diff
-HepRelVertex newVertex = addRelToGraph(bestRel);
+HepRelVertex newVertex = addRelToGraph(bestRel, null);
 Set<HepRelVertex> garbageVertexSet = new LinkedHashSet<>();
```
why does it have to be a LinkedHashSet?
LinkedHashSet preserves insertion order during iteration, which is debugging-friendly.
you should probably add a comment about this too
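For reference, the insertion-order guarantee is plain JDK behavior and easy to demonstrate in isolation; the vertex names below are made up and independent of the planner code.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** LinkedHashSet iterates in insertion order, so garbage-collection traces
 * are reproducible and readable while debugging; HashSet iteration order
 * depends on hash buckets and may differ across runs or JDK versions. */
public class SetOrderDemo {
  public static void main(String[] args) {
    List<String> inserted = List.of("vertex#12", "vertex#3", "vertex#7");

    Set<String> linked = new LinkedHashSet<>(inserted);
    // Iteration order is exactly the insertion order:
    System.out.println(linked); // prints [vertex#12, vertex#3, vertex#7]

    Set<String> hashed = new HashSet<>(inserted);
    // Same elements (Set equality ignores order), but no iteration guarantee:
    System.out.println(hashed.equals(linked)); // prints true
  }
}
```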
```java
  return rel;
}

/** Try to remove discarded vertices recursively. */
```
"recursively" means that vertices without in-edges are removed, and after each removal the operation is repeated on the remaining edges?
Yes.
transformTo removes the in-edges for the matched nodes; we just tryClean the nodes in that subtree.
It requires that users not change nodes outside the matchOperands list.
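The "remove, then re-check what is left" idea can be sketched on a plain adjacency map. Everything below (the graph representation, the method name tryCleanVertices, the vertex names) is a hypothetical stand-in for the actual HepPlanner code: a vertex is discarded only once it has no remaining in-edges, and its inputs are then re-checked because removing it may have orphaned them too.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch of recursive cleanup of discarded vertices: a vertex is removable
 * once it has no incoming edges; removing it may in turn orphan its inputs. */
public class VertexCleanupDemo {
  /** vertex -> its inputs (edges point from consumer to input). */
  static final Map<String, List<String>> INPUTS = new HashMap<>();
  /** vertex -> number of consumers still referencing it (in-edge count). */
  static final Map<String, Integer> IN_EDGES = new HashMap<>();
  static Set<String> lastRemoved;

  static Set<String> tryCleanVertices(String candidate) {
    Set<String> removed = new HashSet<>();
    Deque<String> queue = new ArrayDeque<>();
    queue.push(candidate);
    while (!queue.isEmpty()) {
      String v = queue.pop();
      if (IN_EDGES.getOrDefault(v, 0) > 0 || removed.contains(v)) {
        continue; // still referenced (or already handled): keep it
      }
      removed.add(v);
      for (String input : INPUTS.getOrDefault(v, List.of())) {
        IN_EDGES.merge(input, -1, Integer::sum); // drop the edge v -> input
        queue.push(input); // re-check: the input may now be orphaned
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    // Old subtree: discarded -> a -> b; "shared" is also used by the live root.
    INPUTS.put("discarded", List.of("a", "shared"));
    INPUTS.put("a", List.of("b"));
    IN_EDGES.put("discarded", 0); // transformTo removed its last in-edge
    IN_EDGES.put("a", 1);
    IN_EDGES.put("b", 1);
    IN_EDGES.put("shared", 2);    // one edge from "discarded", one from the root
    lastRemoved = tryCleanVertices("discarded");
    System.out.println(lastRemoved);
  }
}
```

In the example, "discarded", "a", and "b" are cleaned in one pass, while "shared" survives because the live root still references it.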
```java
if (graph.getOutwardEdges(vertex).size()
    != Sets.newHashSet(requireNonNull(vertex, "vertex").getCurrentRel().getInputs()).size()) {
  throw new AssertionError("HepPlanner:outward edge num is different "
      + "with input node num, " + vertex);
```
different from the input node count
```java
if (liveNum == validSet.size()) {
  return;
}
throw new AssertionError("HepPlanner:Query graph live node num is different with root"
```
If you make changes, please use new commits for now.
- Add Javadoc for largePlanMode field with "To be removed in the future" note
- Rename clear() to clearRules() for clarity in multiphase optimization
- Update usage example to use clearRules() and fix "graph is reused" to "graph is preserved"
- Remove UnsupportedOperationException in findBestExp() for large plan mode
- Fix comment about garbage collection in getGraphIterator()
- Add comment about caching fired rule before constructing HepRuleCall
- Fix "different with" to "different from" in assertion error messages
mihaibudiu left a comment
I have approved, but please consider adding the two extra comments I have suggested.






Motivation and details: https://issues.apache.org/jira/browse/CALCITE-7422
Before:
LargePlanBenchmark:100 : 1s
LargePlanBenchmark:1000 : 9s
LargePlanBenchmark:10000 : pretty slow, cannot measure.
CPU Profiler Result (enable fired rules cache and disable large plan mode):

fireRule(RelOptRuleCall ruleCall) takes 2% of CPU time.

Mem Profiler Result (enable fired rules cache and disable large plan mode):
After (enable fired rules cache and large plan mode):
LargePlanBenchmark:100 : 1s
LargePlanBenchmark:1000 : 2s
LargePlanBenchmark:10000 : about 60s
    Benchmark                               (unionNum)  Mode  Cnt      Score  Error  Units
    LargePlanBenchmark.testLargeUnionPlan          100  avgt          256.561        ms/op
    LargePlanBenchmark.testLargeUnionPlan         1000  avgt         1616.421        ms/op
    LargePlanBenchmark.testLargeUnionPlan        10000  avgt        53393.727        ms/op

CPU Profiler Result:

fireRule(RelOptRuleCall ruleCall) takes 11% of CPU time. There is still lots of room for CPU optimization.

Mem Profiler Result:

(avoid buildListRecurse/collectGarbage, smaller memory peak size)