Skip to content

perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450

Open
Mauller wants to merge 2 commits intoTheSuperHackers:mainfrom
Mauller:Mauller/perf-reverse-insertionsort
Open

perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450
Mauller wants to merge 2 commits intoTheSuperHackers:mainfrom
Mauller:Mauller/perf-reverse-insertionsort

Conversation

@Mauller
Copy link

@Mauller Mauller commented Mar 13, 2026

This PR adds conditionaly selected reverse insertion sorting.

putOnSortedOpenList() now acts as a dispatching function that chooses if a new node is inserted by traversing the list from the head or tail. The code determines if the new nodes total cost is closer to the head or tail to determine which list traversal direction to use.

This provides a significant performance improvement to long paths, allowing them to finish in half to a third of the time of just using forward insertion sorting alone.

The likely reason for this is that longer and more complex paths tend to append larger values to the list which requires a full list traversal when forward inserting. Or they are adding large path costs so will typically be inserting in the upper 50% of the list.

On average we would have expected at least a 50% performance improvement, due to the logical splitting in half of the list length, since we now approach any node position from either end of the list.

With the conditional reverse insertion sort we would have expected the algorithm to be O(N/2) in time, but for the above reasons, allowing quick appends dramatically improves the performance of some pathing sorts.

In other words, pathfinder go brrrrrr already at stage 1 tuning!


Some comparison values from a debug build, the pathfinding times are in second.
image

image

Shorter paths are not as significantly affected, but the followup skip list enhancement should provide a further boost for these.
image

Some variation is also down to CPU load as i often would see variation between runs but overall significantly lower times on the longer paths.

@Mauller Mauller self-assigned this Mar 13, 2026
@Mauller Mauller added Performance Is a performance concern Gen Relates to Generals ZH Relates to Zero Hour labels Mar 13, 2026
@greptile-apps
Copy link

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR implements a conditional bidirectional insertion sort for the A* pathfinder open list, yielding a reported 50–66% reduction in pathfinding time for long/complex paths. PathfindCellList gains a m_tail pointer so the list is doubly-anchored; a new canReverseSort heuristic checks whether the incoming node is cost-closer to the tail than to the head and, if so, dispatches to reverseInsertionSort (backward traversal from tail) instead of the existing forwardInsertionSort. All mutation paths — init, prepend, mid-list insert, append, and removeFromOpenList — correctly maintain both m_head and m_tail.

  • Algorithm correctness: The bidirectional traversal preserves ascending sort order, handles equal-cost ties consistently with the forward path, and correctly updates the head/tail anchors in every branch. Edge cases (empty list, single element, exact head/tail cost matches) are all handled.
  • releaseOpenList / releaseClosedList: Both now call list.reset() (clearing m_tail as well as m_head) rather than assigning nullptr directly to m_head, keeping the list fully reset after a full release.
  • forwardInsertionSortRetailCompatible: The retail-compatible insertion path does not maintain m_tail. This is safe today because the retail → fixed transition always passes through forceCleanCells()reset() before canReverseSort can be called, but it leaves the function as the only insertion path that does not keep m_tail in sync (see inline comment).
  • API surface: reverseInsertionSort and canReverseSort are public, consistent with forwardInsertionSort. The intended entry point is putOnSortedOpenList, as documented by the author.

Confidence Score: 4/5

  • This PR is safe to merge with one minor housekeeping point to address at convenience.
  • The core algorithm is correct: sorted order is maintained in all branches, head/tail anchors are updated consistently across insert/remove/release, the canReverseSort heuristic degrades gracefully on empty and single-element lists, and the retail→fixed transition is guarded so that m_tail is always reset before the new code paths execute. The only gap is that forwardInsertionSortRetailCompatible does not maintain m_tail, making it the sole insertion function that diverges from the new invariant — but there is no runtime path today where this causes a problem.
  • The append-at-end branch in forwardInsertionSortRetailCompatible in AIPathfind.cpp deserves a follow-up to add m_tail maintenance for internal consistency.

Important Files Changed

Filename Overview
Core/GameEngine/Include/GameLogic/AIPathfind.h Adds m_tail to PathfindCellList, updates constructor/reset, and declares the new canReverseSort, reverseInsertionSort, and getPrevOpen/getTotalCostDifference methods. Changes are clean and well-structured with #pragma once and GPL header both present.
Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp Implements the reverse insertion sort, tail-maintenance in forward sort and removeFromOpenList, and the canReverseSort dispatch heuristic. Core algorithm is correct; forwardInsertionSortRetailCompatible (retail-compat mode only) does not update m_tail, though it is safe in practice because the retail→fixed transition always passes through reset() before any canReverseSort call.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["putOnSortedOpenList(list)"] --> B{RETAIL_COMPATIBLE\n& !s_useFixedPathfinding?}
    B -- Yes --> C["forwardInsertionSortRetailCompatible(list)\nm_tail NOT maintained"]
    B -- No --> D{"canReverseSort(*this)\n|head.cost - new.cost| > |tail.cost - new.cost|?"}

    D -- false\nempty list OR closer to head --> E["forwardInsertionSort(list)"]
    D -- true\ncloser to tail --> F["reverseInsertionSort(list)"]

    E --> E1{list.m_head == nullptr?}
    E1 -- Yes --> E2["Init: m_head = m_tail = this"]
    E1 -- No --> E3{new.cost < head.cost?}
    E3 -- Yes --> E4["Prepend: update m_head"]
    E3 -- No --> E5["Traverse forward until\nnext.cost > new.cost\nthen insert; update m_tail if appending"]

    F --> F1{list.m_tail == nullptr?}
    F1 -- Yes --> F2["Init: m_head = m_tail = this"]
    F1 -- No --> F3{new.cost >= tail.cost?}
    F3 -- Yes --> F4["Append: update m_tail"]
    F3 -- No --> F5["Traverse backward until\nprev.cost <= new.cost\nthen insert; update m_head if prepending"]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp
Line: 1707-1756

Comment:
**`forwardInsertionSortRetailCompatible` does not maintain `m_tail`**

This function is gated behind `!s_useFixedPathfinding` and, in current practice, `canReverseSort` is never reachable from this path, so there is no runtime bug today. However, every other insertion path (`forwardInsertionSort`, `reverseInsertionSort`) now keeps `m_tail` in sync, making this function the only one that can leave `m_tail` in a stale state (nullptr while the list is non-empty).

The retail → fixed transition always routes through `forceCleanCells()``reset()` before `canReverseSort` is ever called, so the invariant is safe under the current call graph. Nevertheless, adding the two `m_tail` assignments that the non-retail counterpart has would make the function self-consistent and eliminate any future risk if the call graph changes:

```cpp
	if (list.m_head == nullptr)
	{
		list.m_head = this;
		list.m_tail = this;   // keep m_tail in sync
		m_info->m_prevOpen = nullptr;
		m_info->m_nextOpen = nullptr;
		return;
	}
```

and in the append-at-end `else` branch:

```cpp
	else
	{
		// append after "previousCell" - we are at the end of the list
		previousCell->m_info->m_nextOpen = this->m_info;
		m_info->m_prevOpen = previousCell->m_info;
		m_info->m_nextOpen = nullptr;
		list.m_tail = this;   // keep m_tail in sync
	}
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "perf(pathfinder): Ex..."

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 12389b7 to 62457b8 Compare March 13, 2026 16:34
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 62457b8 to 7f0fee2 Compare March 13, 2026 16:50
@Mauller
Copy link
Author

Mauller commented Mar 13, 2026

@greptile early initialising the tail makes no difference to performance when testing, keeping the code to initialise the tail when the list is appended is just as performant. Often hundreds of items are added to the list so an element is often appended to the list early on.

@greptile-apps
Copy link

greptile-apps bot commented Mar 13, 2026

Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative.

Since putOnSortedOpenList is the only intended entry point and the list quickly gets populated in typical pathfinding scenarios, the current approach works well for your use case.

@Mauller
Copy link
Author

Mauller commented Mar 13, 2026

Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative.

Since putOnSortedOpenList is the only intended entry point and the list quickly gets populated in typical pathfinding scenarios, the current approach works well for your use case.

Can you update your summary please.

@Mauller
Copy link
Author

Mauller commented Mar 14, 2026

@greptile can you update your summary to disregard your considered prepend head fix as it is unnecessary as discussed.

@greptile-apps
Copy link

greptile-apps bot commented Mar 14, 2026

I understand - the prepend head optimization isn't necessary. Here's the updated summary:

This PR optimizes the pathfinding open list insertion by implementing bidirectional insertion sorting. The algorithm now compares a new node's cost against the midpoint between the head and tail costs to determine whether to traverse from the head or tail, reducing average insertion time from O(N) to O(N/2).

The performance improvement is particularly dramatic for long paths (2-3x faster) because:

  1. Complex paths tend to generate nodes with higher costs that insert near the tail
  2. Approaching from the tail eliminates full list traversals for these common cases

The benchmarks demonstrate significant speedup on longer paths while maintaining equivalent performance on shorter paths.

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 7f0fee2 to 195696e Compare March 14, 2026 10:14
@Mauller
Copy link
Author

Mauller commented Mar 14, 2026

Tweaked forward and reverse insertion sort so if they are initialising the head or tail they also initialise their opposite if it is uninitialised.

It doesn't hurt performance when done here as it's only checked once after the head or tail initialisation check.

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 195696e to a478eda Compare March 14, 2026 21:22
@xezon xezon requested a review from Skyaero42 March 17, 2026 18:46
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from a478eda to 852bb07 Compare March 17, 2026 21:09
@Mauller
Copy link
Author

Mauller commented Mar 17, 2026

updated based on feedback

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 852bb07 to 61742fd Compare March 18, 2026 20:53
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 61742fd to 73a3e2d Compare March 19, 2026 20:21
@Mauller
Copy link
Author

Mauller commented Mar 19, 2026

That was just a quick rebase, going to tweak now

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 73a3e2d to 9ce7b91 Compare March 19, 2026 20:33
…evOpen() and extend PathfindCellList with tail tracking and PathfindCellList::canReverseSort() (#2450)
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 9ce7b91 to 90d6f7a Compare March 19, 2026 20:36
@Mauller
Copy link
Author

Mauller commented Mar 19, 2026

Should be good now, i just realised i needed to tweak it again since even the retail compatible reset needs to be able to reset the tail if the fixed pathfinding path gets used in a retail compatible build.

@xezon xezon changed the title perf(pathfinder): Implement conditional reverse insertion sorting, improving long path performance by 2-3x perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm Mar 19, 2026
Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looking good.

Title optimized.

m_info->m_prevOpen = current->m_info;
}

// Reverse insertion sort, returns early if the list is being initialized or we are appending the list
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add TheSuperHackers @performance here or elsewhere suitable and explain how this gives performance and how much.

@xezon
Copy link

xezon commented Mar 19, 2026

Greptile is still complaining about "abs() on unsigned subtraction is incorrect".

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 90d6f7a to c88b44e Compare March 20, 2026 08:16
@Mauller
Copy link
Author

Mauller commented Mar 20, 2026

Greptile is still complaining about "abs() on unsigned subtraction is incorrect".

yeah i promoted the unsigned shorts to int before the subtraction, the original code could produce incorrect results but didn't affect the outcome overall in the data i was comparing with at least.

this should be good now

@Mauller
Copy link
Author

Mauller commented Mar 20, 2026

@greptile forwardInsertionSortRetailCompatible does not assign m_tail, as m_tail is never used during retail compatible pathfinding.

when s_usefixedpathfinding is true, forwardinsertionsortand reverseinsertionsort are used and they initialise and use m_tail.

@greptile-apps
Copy link

greptile-apps bot commented Mar 20, 2026

Got it, thanks for the clarification! That makes sense - m_tail is only needed when the bidirectional insertion sort is active (fixed pathfinding mode), so there's no reason for the retail-compatible version to maintain it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Gen Relates to Generals Performance Is a performance concern ZH Relates to Zero Hour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants