We’ve been working on the Quench editor and pipeline for about a month now, and before anything else, I want to know what I can and can’t get away with doing on an Android tablet. Over this past week I’ve performed a semi-scientific study to identify how best to use the A* algorithm in Quench.

AI can be incredibly demanding on CPU resources. Having spent time studying robotics, I know the kind of computational power that often goes into academic robot designs with goals no more noble than ensuring even coverage of a surface by a Roomba. It can be surprising how much computation it can take to do something that seems trivial in the human experience.

Quench is going to require that groups of animals have herd-level group-think AI and individual-animal-level AI that result in flocking/swarming behaviour to move as a group and also avoid enemies while finding their way through a map of hazards to reach a goal location. Our plan is to implement the group-level AI as an A* algorithm that runs at intervals to identify a clear path to follow. Flocking requires some further study before we can say how exactly that will work.

With these goals in mind, I dove into the first assignment for my AI class at Humber to answer some important questions for myself. I wanted to know what factors cause the computation complexity of A* to grow most quickly, so that I could plan to mitigate or sidestep them. And so I wrote myself a simple A* testbed as a C# Console Application that utilized interfaces to specify the necessary characteristics of an A* agent so that I could compare them and make generalizations about their behaviour.

*TL;DR: The improvement provided by a bit of research and a one-to-two-hour fix was massive. Feel free to scroll down to the bottom to download my complete result set, the A* testing program binaries, and/or all of my C# source code. Feel free to use them in whatever way you like.*

## Methods

I like science a lot, but I am an engineer by trade. I do it for results, so please forgive me if this doesn’t play out like a scientific paper. People who don’t like numbers may want to skip straight to the conclusions and download my A* testbed to try on their own PCs, because this is going to get dry.

I started off by implementing a naive A* algorithm. I didn’t do any research into improving my A* algorithm. I just wrote something that would work, based strongly upon the guide at http://www.policyalmanac.org/games/aStarTutorial.htm. I use this as my control below.

If you aren’t already familiar with the A* pathfinding algorithm, this article may make little sense to you. In that case, I strongly suggest that you take a look through that guide, because it’s a really good primer on the basics and it helped me get started pretty quickly.

I built a simple test harness that configures the A* navmesh by linking up a hexagonal navmesh, applies a Simplex/Perlin (coherent) 2D noise pattern to fake terrain costs, and then randomly generates a bunch of A* agents to traverse the navmesh in various ways. I set up a bunch of stopwatches in the Agent code to record as much data as I could regarding the performance characteristics of the algorithm — which parts take the most time, and under which circumstances.
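The instrumentation pattern can be sketched roughly like this. The names here are illustrative, not the actual testbed code; I'm only assuming the standard .NET `System.Diagnostics.Stopwatch`, which reports elapsed high-resolution ticks:

```csharp
using System;
using System.Diagnostics;

// Hypothetical sketch: give each category of work inside the search its
// own Stopwatch, and accumulate elapsed ticks across the whole run.
class AgentTimers
{
    public readonly Stopwatch OpenSetOps = new Stopwatch();
    public readonly Stopwatch ClosedSetOps = new Stopwatch();

    public void Report() =>
        Console.WriteLine(
            $"open set: {OpenSetOps.ElapsedTicks} ticks, " +
            $"closed set: {ClosedSetOps.ElapsedTicks} ticks");
}

// Usage inside the search loop might look like:
//   timers.OpenSetOps.Start();
//   ... add / contains / find-minimum on the open set ...
//   timers.OpenSetOps.Stop();
```

Because `Start()`/`Stop()` accumulate rather than reset, one timer per operation category gives you a per-search total for free.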

Once all of this was done, I built the project in Release mode with CPU optimization enabled, booted my desktop PC into Windows 7 (64-bit) in safe mode (to reduce noise from background applications), and ran several hundred Agents over several hundred random navmeshes to produce tens of thousands of results to study.

I tested 3 heuristics with 3 different noise amplitudes, since I noticed some interplay between the noise pattern and how the heuristics worked. My heuristics are the venerable Manhattan heuristic, the simple straight-line-distance heuristic, and no heuristic at all, which causes the A* algorithm to decompose into Dijkstra’s Algorithm (itself a specific case of A*).
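For reference, the three heuristics can be sketched like this. These signatures are hypothetical (the actual testbed works through interfaces over a hexagonal navmesh, not raw square-grid coordinates), but the math is the standard form of each:

```csharp
using System;

static class Heuristics
{
    // Manhattan distance: sum of axis deltas.
    public static double Manhattan(int x1, int y1, int x2, int y2) =>
        Math.Abs(x1 - x2) + Math.Abs(y1 - y2);

    // Straight-line (Euclidean) distance between the two points.
    public static double StraightLine(int x1, int y1, int x2, int y2)
    {
        double dx = x1 - x2, dy = y1 - y2;
        return Math.Sqrt(dx * dx + dy * dy);
    }

    // No heuristic: always zero, which reduces A* to Dijkstra's Algorithm.
    public static double None(int x1, int y1, int x2, int y2) => 0.0;
}
```

With the zero heuristic, every node's estimated total cost is just its cost-so-far, so A* expands nodes in exactly the order Dijkstra's Algorithm would.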

I employed noise patterns to fake terrain costs at 3 amplitude levels: 0 to 1, 0 to 2, and 0 to 5. At first I didn’t understand how these would change how A* behaves, but since it appeared to matter, I collected the data. And man, am I ever happy that I did.

And then I broke out my go-to data-visualization program, Excel, to figure out how to improve it.

## Naive Algorithm Results

I’ll spare you a lot of my finagling with charts to identify the strongest correlations in the data and just show you the ones that turned out nicely.

The naive algorithm produced the following results for each combination of heuristic and noise level:

**Naive Algorithm (CPU ticks)**

| Move Cost | Manhattan | Straight Line | None (Dijkstra’s) |
|-----------|-----------|---------------|-------------------|
| 0.0 – 1.0 | 1,710 | 1,529 | 3,616,933 |
| 0.0 – 2.0 | 3,671 | 5,541 | 3,594,015 |
| 0.0 – 5.0 | 282,601 | 628,454 | 3,582,042 |

I measure this in CPU ticks mostly because that is as close as I can get to a cross-platform performance measure. Using the .NET 4.0 compiler, I expect you’ll see similar numbers on any x86/x64 machine running Windows. For those who are curious, the Dijkstra’s Algorithm runs averaged about 1.3 seconds to complete their roughly 3.6 million ticks on the test rig. That’s obviously not the kind of performance you’re looking for in a game, although the other heuristics perform a great deal better.

Since I clocked a lot of the costs of internal operations within the A* algorithm, I came up with the following charts for a similar noise level as well (see the complete results for all of the charts).

The left-hand chart represents the cost of maintaining the closed set (clearing the set, adding nodes, checking if nodes belong to it) compared to the size of the closed set. The right-hand chart shows the cost of maintaining the open set (clearing the set, adding nodes, checking if nodes belong to it, finding the node with the lowest heuristic cost) compared to the size of the open set. It also describes the cost of finding the node with the lowest heuristic cost specifically in red.

### Manhattan Heuristic

### Straight-Line Heuristic

### No heuristic (Dijkstra’s Algorithm)

To those of you familiar with computational complexity, the obvious quadratic nature of the closed set curves above should smack of O(n^2) complexity. It appears, although it is less obvious through the noise, that the open set chart exhibits O(n^2) computational complexity as well. This makes a lot of sense: many operations within A* are linear scans of a set, and they must be repeated for each node processed, which compounds into quadratic growth.

The hawk-eyed amongst you will have noted the scale of the axes. Dijkstra’s Algorithm clearly ends up with much larger closed and open sets than the other two, which is probably responsible for how expensive it is to compute.

Okay. A few more graphs. This time we’re comparing different move cost settings for a single heuristic.

### 0.0 – 1.0 Terrain Costs

### 0.0 – 2.0 Terrain Costs

### 0.0 – 5.0 Terrain Costs

Probably the most significant difference between each of these charts is the time scale. Note the y-axis of each graph. As the terrain costs increase in magnitude, the running time of the algorithm grows dramatically. At first I had no idea why this could be, but after reading this article on Amit’s A* Blog, it seems that the higher noise magnitude causes the heuristics to underestimate by more and more, so the A* algorithm behaves more and more like Dijkstra’s Algorithm (very bad news!).

## Optimization

From staring at these results for a bit, the areas that promised the greatest improvement were:

- The closed set operations — particularly checking if a node belongs to the set
- The open set operations — particularly checking if a node belongs to the set and finding the minimum cost node within the set

James suggested an optimization that adds a pair of boolean flags to each node to indicate whether it belongs to the closed set or the open set. While containers still need to exist for both sets (to aid in clearing the flags when the algorithm ends), checking whether a node belongs to a collection is a linear-time operation. Reducing this to a constant-time flag check makes short work of the process. Adding nodes to the containers and clearing them are inexpensive operations compared to searching through entire sets, so the rest of the set operations are peanuts. All told, this optimization alleviates the greatest cost of working with these collections.
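A minimal sketch of the idea, with hypothetical field names (the real node type carries navmesh data as well):

```csharp
// Each node carries its own set-membership flags, so "is this node in the
// closed set?" becomes an O(1) field read instead of an O(n) list search.
class Node
{
    public bool InOpenSet;    // set when the node is added to the open set
    public bool InClosedSet;  // set when the node is moved to the closed set
    public double G;          // cost from the start node
    public double H;          // heuristic estimate to the goal
    public double F => G + H; // total estimated cost
}

// Before: closedList.Contains(node)  -> linear scan per neighbour check
// After:  node.InClosedSet           -> constant-time field read
// The lists are still kept around so the flags can be reset cheaply when
// the search finishes (iterate the lists, rather than the whole navmesh).
```

The follow-up cleanup is the reason the containers survive: resetting flags by walking the two lists touches only the nodes the search actually visited.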

The other major optimization was to replace the List&lt;T&gt; type used for the open set with a binary heap. While this type isn’t included natively in the .NET Framework, I was able to find a nice C# implementation on the Game Programmers Wiki. A binary heap provides constant-time access to the minimum-valued item in the set, but requires O(log n) time to add items. Ultimately this trades a linear operation that has to be performed often for a logarithmic one that only has to be done on insertion. A very good trade in my opinion!
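To illustrate the data structure (this is a generic sketch, not the Game Programmers Wiki implementation the testbed actually uses), a binary min-heap stores items in a flat list where each parent is no larger than its children, so the minimum always sits at index 0:

```csharp
using System;
using System.Collections.Generic;

// Minimal binary min-heap sketch: Add is O(log n); the minimum item is
// always items[0], and removing it is O(log n).
class BinaryHeap<T> where T : IComparable<T>
{
    private readonly List<T> items = new List<T>();
    public int Count => items.Count;

    public void Add(T item)
    {
        items.Add(item);
        int i = items.Count - 1;
        while (i > 0)  // bubble the new item up toward the root
        {
            int parent = (i - 1) / 2;
            if (items[i].CompareTo(items[parent]) >= 0) break;
            (items[i], items[parent]) = (items[parent], items[i]);
            i = parent;
        }
    }

    public T ExtractMin()
    {
        T min = items[0];
        items[0] = items[items.Count - 1];
        items.RemoveAt(items.Count - 1);
        int i = 0;
        while (true)  // bubble the replacement down to restore heap order
        {
            int l = 2 * i + 1, r = 2 * i + 2, smallest = i;
            if (l < items.Count && items[l].CompareTo(items[smallest]) < 0)
                smallest = l;
            if (r < items.Count && items[r].CompareTo(items[smallest]) < 0)
                smallest = r;
            if (smallest == i) break;
            (items[i], items[smallest]) = (items[smallest], items[i]);
            i = smallest;
        }
        return min;
    }
}
```

For A*, the heap would be keyed on each node's total estimated cost, so "find the cheapest open node" stops being a full scan of the open set.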

Between both of these changes, which only took an hour or two to implement, we saw incredible improvements.

## Optimized Algorithm Results

To refresh your memory, the naive algorithm’s results looked like this:

**Naive Algorithm (CPU ticks)**

| Move Cost | Manhattan | Straight Line | None (Dijkstra’s) |
|-----------|-----------|---------------|-------------------|
| 0.0 – 1.0 | 1,710 | 1,529 | 3,616,933 |
| 0.0 – 2.0 | 3,671 | 5,541 | 3,594,015 |
| 0.0 – 5.0 | 282,601 | 628,454 | 3,582,042 |

The optimized algorithm produced the following results for each combination of heuristic and noise level. Included next to each result is the speedup factor in multiples of the naive algorithm speed for the same noise level and heuristic.

**Optimized Algorithm (CPU ticks)**

| Move Cost | Manhattan | Straight Line | None (Dijkstra’s) |
|-----------|-----------|---------------|-------------------|
| 0.0 – 1.0 | 359 (5x) | 349 (4x) | No data |
| 0.0 – 2.0 | 566 (6x) | 675 (8x) | No data |
| 0.0 – 5.0 | 4,055 (70x) | 6,066 (104x) | 14,760 (243x) |

Not too bad, eh? Have some charts!

### Manhattan Heuristic

The new optimized results:

compared to naive algorithm:

### Straight Line Heuristic

The new optimized results:

compared to naive algorithm:

### No Heuristic (Dijkstra’s Algorithm)

The new optimized results:

compared to naive algorithm:

Now that I’ve drawn your attention to the y-axis of these charts, you might have noticed that the running time of the new results is DRAMATICALLY better. The shape of the plotted data also indicates that the closed set and open set operations have been reduced from O(n^2) to linear time complexity.

Okay… Just a few more. I promise!

### 0.0 – 1.0 Terrain Costs

The new optimized results:

compared to naive algorithm:

### 0.0 – 2.0 Terrain Costs

The new optimized results:

compared to naive algorithm:

### 0.0 – 5.0 Terrain Costs

The new optimized results:

compared to naive algorithm:

Note that the maximum cost of the optimized algorithm grows much more slowly, and in fact linearly. This is a really interesting result because it means that the worst case of the optimized A* algorithm, even if the heuristic is a massive underestimate and even if the path to search is quite long, is much better than before and scales much less quickly.

## Conclusions

I’m happy to say that the week or so that I put into building this A* test harness really translated into massive performance gains for Quench. Beyond that, I’ve picked up an in-depth understanding of the algorithm and its strengths and weaknesses. From now on, should I ever need specialized A* behaviour and plan to write it in C#, I can easily compare that behaviour to these test cases using my test harness.

You can find all of the files I promised below. Feel free to use them for any purpose you like. I hope it makes someone else as giddy as I am about huge volumes of data!
