perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450
Conversation
|
| Filename | Overview |
|---|---|
| Core/GameEngine/Include/GameLogic/AIPathfind.h | Adds m_tail to PathfindCellList, updates constructor/reset, and declares the new canReverseSort, reverseInsertionSort, and getPrevOpen/getTotalCostDifference methods. Changes are clean and well-structured with #pragma once and GPL header both present. |
| Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp | Implements the reverse insertion sort, tail-maintenance in forward sort and removeFromOpenList, and the canReverseSort dispatch heuristic. Core algorithm is correct; forwardInsertionSortRetailCompatible (retail-compat mode only) does not update m_tail, though it is safe in practice because the retail→fixed transition always passes through reset() before any canReverseSort call. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["putOnSortedOpenList(list)"] --> B{RETAIL_COMPATIBLE\n& !s_useFixedPathfinding?}
B -- Yes --> C["forwardInsertionSortRetailCompatible(list)\nm_tail NOT maintained"]
B -- No --> D{"canReverseSort(*this)\n|head.cost - new.cost| > |tail.cost - new.cost|?"}
D -- false\nempty list OR closer to head --> E["forwardInsertionSort(list)"]
D -- true\ncloser to tail --> F["reverseInsertionSort(list)"]
E --> E1{list.m_head == nullptr?}
E1 -- Yes --> E2["Init: m_head = m_tail = this"]
E1 -- No --> E3{new.cost < head.cost?}
E3 -- Yes --> E4["Prepend: update m_head"]
E3 -- No --> E5["Traverse forward until\nnext.cost > new.cost\nthen insert; update m_tail if appending"]
F --> F1{list.m_tail == nullptr?}
F1 -- Yes --> F2["Init: m_head = m_tail = this"]
F1 -- No --> F3{new.cost >= tail.cost?}
F3 -- Yes --> F4["Append: update m_tail"]
F3 -- No --> F5["Traverse backward until\nprev.cost <= new.cost\nthen insert; update m_head if prepending"]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp
Line: 1707-1756
Comment:
**`forwardInsertionSortRetailCompatible` does not maintain `m_tail`**
This function is gated behind `!s_useFixedPathfinding` and, in current practice, `canReverseSort` is never reachable from this path, so there is no runtime bug today. However, every other insertion path (`forwardInsertionSort`, `reverseInsertionSort`) now keeps `m_tail` in sync, making this function the only one that can leave `m_tail` in a stale state (nullptr while the list is non-empty).
The retail → fixed transition always routes through `forceCleanCells()` → `reset()` before `canReverseSort` is ever called, so the invariant is safe under the current call graph. Nevertheless, adding the two `m_tail` assignments that the non-retail counterpart has would make the function self-consistent and eliminate any future risk if the call graph changes:
```cpp
if (list.m_head == nullptr)
{
list.m_head = this;
list.m_tail = this; // keep m_tail in sync
m_info->m_prevOpen = nullptr;
m_info->m_nextOpen = nullptr;
return;
}
```
and in the append-at-end `else` branch:
```cpp
else
{
// append after "previousCell" - we are at the end of the list
previousCell->m_info->m_nextOpen = this->m_info;
m_info->m_prevOpen = previousCell->m_info;
m_info->m_nextOpen = nullptr;
list.m_tail = this; // keep m_tail in sync
}
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "perf(pathfinder): Ex..."
12389b7 to
62457b8
Compare
62457b8 to
7f0fee2
Compare
|
@greptile early initialising the tail makes no difference to performance when testing, keeping the code to initialise the tail when the list is appended is just as performant. Often hundreds of items are added to the list so an element is often appended to the list early on. |
|
Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative. Since |
Can you update your summary please. |
|
@greptile can you update your summary to disregard your considered prepend head fix as it is unnecessary as discussed. |
|
I understand - the prepend head optimization isn't necessary. Here's the updated summary: This PR optimizes the pathfinding open list insertion by implementing bidirectional insertion sorting. The algorithm now compares a new node's cost against the midpoint between the head and tail costs to determine whether to traverse from the head or tail, reducing average insertion time from O(N) to O(N/2). The performance improvement is particularly dramatic for long paths (2-3x faster) because:
The benchmarks demonstrate significant speedup on longer paths while maintaining equivalent performance on shorter paths. |
7f0fee2 to
195696e
Compare
|
Tweaked forward and reverse insertion sort so if they are initialising the head or tail they also initialise their opposite if it is uninitialised. It doesn't hurt performance when done here as it's only checked once after the head or tail initialisation check. |
195696e to
a478eda
Compare
a478eda to
852bb07
Compare
|
updated based on feedback |
852bb07 to
61742fd
Compare
61742fd to
73a3e2d
Compare
|
That was just a quick rebase, going to tweak now |
73a3e2d to
9ce7b91
Compare
…evOpen() and extend PathfindCellList with tail tracking and PathfindCellList::canReverseSort() (#2450)
9ce7b91 to
90d6f7a
Compare
|
Should be good now, i just realised i needed to tweak it again since even the retail compatible reset needs to be able to reset the tail if the fixed pathfinding path gets used in a retail compatible build. |
| m_info->m_prevOpen = current->m_info; | ||
| } | ||
|
|
||
| // Reverse insertion sort, returns early if the list is being initialized or we are appending the list |
There was a problem hiding this comment.
Could add TheSuperHackers @performance here or elsewhere suitable and explain how this gives performance and how much.
|
Greptile is still complaining about "abs() on unsigned subtraction is incorrect". |
…ditional reverse insertion sorting (#2450)
90d6f7a to
c88b44e
Compare
yeah i promoted the unsigned shorts to int before the subtraction, the original code could produce incorrect results but didn't affect the outcome overall in the data i was comparing with at least. this should be good now |
|
@greptile when |
|
Got it, thanks for the clarification! That makes sense - |
This PR adds conditionaly selected reverse insertion sorting.
putOnSortedOpenList()now acts as a dispatching function that chooses if a new node is inserted by traversing the list from the head or tail. The code determines if the new nodes total cost is closer to the head or tail to determine which list traversal direction to use.This provides a significant performance improvement to long paths, allowing them to finish in half to a third of the time of just using forward insertion sorting alone.
The likely reason for this is that longer and more complex paths tend to append larger values to the list which requires a full list traversal when forward inserting. Or they are adding large path costs so will typically be inserting in the upper 50% of the list.
On average we would have expected at least a 50% performance improvement, due to the logical splitting in half of the list length, since we now approach any node position from either end of the list.
With the conditional reverse insertion sort we would have expected the algorithm to be O(N/2) in time, but for the above reasons, allowing quick appends dramatically improves the performance of some pathing sorts.
In other words, pathfinder go brrrrrr already at stage 1 tuning!
Some comparison values from a debug build, the pathfinding times are in second.

Shorter paths are not as significantly affected, but the followup skip list enhancement should provide a further boost for these.

Some variation is also down to CPU load as i often would see variation between runs but overall significantly lower times on the longer paths.