Sort, sweep, and prune: Collision detection algorithms (2023)

JKCalhoun
12 replies
1d5h

This naive algorithm runs in O(n²) time in Big O terms.

Is this true? The naive algorithm's outer loop (i) counts n - 1, while the inner loop (j) begins at i + 1 (so counts progressively less than n - 1).

Not a CS major, is this roughly equivalent (for large values of n) to O(n²) or is it, as it appears, something less?

friendzis
6 replies
1d5h

Big O is just complexity classes, describing how the number of abstract computations scales (that's the key word) with input size (input list length). Generally speaking, if you could analytically express the number of computations as a function of input size, Big O takes the largest term and drops all factors.

It does not necessarily describe the performance of the algorithm: `20n^2 + 5n` and `2n^2 + 9001n` are both O(n^2).

chipdart
5 replies
1d2h

It does not necessarily describe performance of the algorithm.

Not necessarily true. It does indeed describe the performance of the algorithm. It just compares scenarios at a coarser granularity. You can tell from the very start that an O(1) algorithm is expected to outperform an O(N²) alternative.

ohwellhere
3 replies
1d1h

My algorithms class taught to think of it not as "describing performance" in an absolute sense, but as "describing how performance changes as the size of the input data increases".

It is not necessarily true that an O(1) algorithm will outperform an O(n^2) alternative on a particular set of data. But it is true that an O(1) algorithm will eventually outperform an O(n^2) alternative as the size of the input data increases.

chipdart
1 replies
1d1h

(...) but as "describing how performance changes as the size of the input data increases".

Yes, that's the definition of asymptotic computational complexity. That's the whole point of these comparisons. It's pointless to compare algorithms when the input size is in the single-digit range.

Kubuxu
0 replies
23h39m

You could have an O(N^2) algorithm outperform an O(N) on the scale of 10,000 (or whatever scale you want to imagine). The big-O notation compares only asymptotic behaviour, and sometimes the lower power factors are overwhelming.

tialaramex
0 replies
21h31m

This sometimes doesn't work out in practice because the scaling involved runs into a limitation your big-O model didn't account for. Typical examples are: The size of the machine registers, physical RAM, addressable storage, or transmission speeds.

If your O(1) algorithm takes an hour for any input, and the competition is O(n) it may seem like there must be cases where you're better, and then you realise n is the size in bytes of some data in RAM, and your competitors can do 4GB per second. You won't be competitive until we're talking about 15TB of data and then you remember you only have 64GB of RAM.

Big-O complexities are not useless, but they're a poor summary alone, about as useful as knowing the mean price of items at supermarkets. I guess this supermarket is cheaper? Or it offers more small items? Or it has no high-end items? Or something?

rcxdude
0 replies
14h19m

Counter-examples: https://en.m.wikipedia.org/wiki/Galactic_algorithm

All that different complexity classes show is that there is some n for which one outperforms the other. In practice this n can be extremely large.

yeevs
0 replies
1d5h

It's the summation of the numbers from 1 to n - 1, which is n(n-1)/2. This reduces to quadratic complexity because big-O notation ignores all coefficients and terms that scale more slowly.

veryrealsid
0 replies
1d2h

这今天

spacecadet_
0 replies
1d5h

You're right that it's not exactly n^2. For the i-th element we perform (n - i - 1) comparisons (indexing from 0). This adds up to a total of (n - 1) * n / 2 comparisons. (see https://en.wikipedia.org/wiki/Triangular_number)

In the end it doesn't make a difference for big-O analysis because it's used to describe the behavior when n approaches infinity, where the quadratic factor takes over.

qwery
0 replies
1d5h

The "optimisation" of starting the inner loop at `j = i + 1` is done to avoid testing every pair of objects twice.

[Ed: I should add that it also prevents testing an object against itself.]

The algorithm is O(n^2) because every pair will be checked once.
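
For illustration, the naive all-pairs check looks roughly like this (a sketch, not the article's code; the circle type and `intersects` test here are just stand-ins):

    interface Circle { x: number; y: number; radius: number; }

    // Stand-in overlap test: two circles touch when the distance between
    // their centers is at most the sum of their radii.
    function intersects(a: Circle, b: Circle): boolean {
      const dx = a.x - b.x;
      const dy = a.y - b.y;
      const r = a.radius + b.radius;
      return dx * dx + dy * dy <= r * r;
    }

    // Naive O(n^2) check: every unordered pair is tested exactly once.
    function naivePairs(objects: Circle[]): [number, number][] {
      const pairs: [number, number][] = [];
      for (let i = 0; i < objects.length; i++) {
        // Starting at i + 1 skips self-tests and avoids checking both (i, j) and (j, i).
        for (let j = i + 1; j < objects.length; j++) {
          if (intersects(objects[i], objects[j])) pairs.push([i, j]);
        }
      }
      return pairs;
    }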

cinntaile
0 replies
1d5h

I don't know if this will make more sense to you but Big O is like limit calculations from calculus.

jonnycat
8 replies
1d4h

One interesting note on this approach is that the author suggests using a "fast" sorting algorithm like mergesort/quicksort as the sorting algorithm for best performance. But in practice, you may get better performance from a "worse" sorting algorithm: insertion sort.

The reason is that objects in collision detection systems tend to move relatively small steps between frames, meaning you can keep lists from the previous frame that are already mostly sorted. For sorting these mostly sorted lists, insertion sort tends towards O(n) while Quicksort tends towards O(n^2)!
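
As a rough sketch (not the article's code), this is why a nearly sorted list is cheap for insertion sort: the inner shifting loop exits almost immediately when elements are already near their final positions.

    // Insertion sort over objects keyed by, say, their left edge.
    // On a nearly sorted array the while loop barely runs, so the total
    // work tends towards O(n); on a reversed array it degrades to O(n^2).
    function insertionSortBy<T>(items: T[], key: (item: T) => number): void {
      for (let i = 1; i < items.length; i++) {
        const current = items[i];
        let j = i - 1;
        // Shift larger elements one slot to the right until `current` fits.
        while (j >= 0 && key(items[j]) > key(current)) {
          items[j + 1] = items[j];
          j--;
        }
        items[j + 1] = current;
      }
    }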

deathanatos
5 replies
1d4h

The author covers this pretty much exactly as you describe it. From Part 2 of TFA,

Let’s look at the sort step, which is the bottleneck of the algorithm according to the analysis.

You can see that most of the time, the sort does nothing at all! The list is almost always already sorted from the previous frame.

Even when it becomes unsorted, it usually just takes a couple of swaps to be sorted again.

Here’s insertion sort in action:

tialaramex
4 replies
1d3h

If you use a modern sort, it'll just go "Oh, that's sorted, we're done". Pattern-Defeating Quicksort will do that, and so would Glidesort, Driftsort, IPNsort and so on. Most of these algorithms will also correctly shortcut trivial "almost sorted" cases like 123457689 (only longer; if you really only have nine elements you want a custom sort for that, there will be a correct algorithm and it's probably written down somewhere already).

Rust's old sorts, both of them, get this right. There's a fair chance the current version of your C++ stdlib even gets this right in its unstable sort. In 1965 it was understandable to reach for an insertion sort for this case; in 2024 it's embarrassing if your language doesn't provide a built-in sort where this just works.

maccard
0 replies
6h15m

You don’t just want to sort the data, you actually want to collect a list of overlapping pairs (or, more likely, the list of pairs that have changed since last time).

Using a built-in sort will give you the same complexity, but iterating over 10,000 elements again to gather this list is an overhead that you can avoid with a DIY approach.
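
Something like this, as a rough sketch (hypothetical names; the "changed since last frame" bookkeeping is left out):

    interface AABB { left: number; right: number; top: number; bottom: number; }

    // Sweep along x over boxes already sorted by their left edge, collecting
    // overlapping pairs during the sweep instead of in a second pass.
    // Assumes top <= bottom (y grows downward).
    function sweepAndPrune(sorted: AABB[]): [number, number][] {
      const pairs: [number, number][] = [];
      for (let i = 0; i < sorted.length; i++) {
        for (let j = i + 1; j < sorted.length; j++) {
          // Later boxes start even further right, so none of them can overlap box i on x.
          if (sorted[j].left > sorted[i].right) break;
          // Narrow the x-overlap candidates down by also checking the y-axis.
          if (sorted[i].top <= sorted[j].bottom && sorted[j].top <= sorted[i].bottom) {
            pairs.push([i, j]);
          }
        }
      }
      return pairs;
    }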

eru
0 replies
11h4m

For what it's worth, Haskell's built-in sort does this, and I think Python, too. (But I'm not as confident about Python. My memory is a bit hazy.)

eru
0 replies
11h5m

For sorting these mostly sorted lists, insertion sort tends towards O(n) while Quicksort tends towards O(n^2)!

Only if you pick your pivot really, really badly. You can randomise your pivot to get O(n log n), or in the case of already almost sorted lists, you can pick a pivot by just picking the element in the middle of the list.

But you are right that even with optimal pivots, quicksort is still O(n log n) in the best case. There are simple variants of merge sort that give you O(n log k) behaviour, where k is the number of runs (both ascending and descending) in the data. The 'sort' you get by default in e.g. Haskell's standard library uses such an algorithm. I think Python's does, too.

GistNoesis
0 replies
1d1h

There is an alternative to sorting every step: build your indexing structure a little looser, so that you still catch the candidate collisions as long as objects have moved less than epsilon.

For example, that can be done by increasing the radius of the spheres by epsilon. As long as the spheres have not moved by more than epsilon, you don't need to recompute the index.

To avoid latency spikes when you do need to recompute the index, you can start building a lagging index by sorting 10% of it each frame (i.e. amortizing the computation cost). After 10 frames you have an index that is valid as long as the position is within epsilon of the position 10 frames ago.
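
A minimal sketch of the epsilon bookkeeping (hypothetical names; the actual sorted index and the 10%-per-frame amortized rebuild are left out):

    interface Sphere { x: number; y: number; z: number; radius: number; }

    class LooseIndex {
      inflated: Sphere[] = [];          // what the sort/sweep structure would be built over
      private snapshot: Sphere[] = [];  // positions captured when the index was built

      constructor(private epsilon: number) {}

      // Build (or rebuild) the index with every radius inflated by epsilon.
      build(spheres: Sphere[]): void {
        this.snapshot = spheres.map(s => ({ ...s }));
        this.inflated = spheres.map(s => ({ ...s, radius: s.radius + this.epsilon }));
      }

      // Still valid while no sphere has drifted more than epsilon from its snapshot.
      isStillValid(spheres: Sphere[]): boolean {
        return spheres.every((s, i) => {
          const o = this.snapshot[i];
          const dx = s.x - o.x, dy = s.y - o.y, dz = s.z - o.z;
          return dx * dx + dy * dy + dz * dz <= this.epsilon * this.epsilon;
        });
      }
    }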

wood_spirit
5 replies
1d5h

A tangent, but HNers interested in an article on collision detection might know: are there any similar articles that show how to compute the intersection of two capsules (as in the space that a sphere moving in a straight line during a time step occupies) in 3D? My own hobby 3D game got stuck on that hurdle and I couldn’t find any examples anywhere :(

qwery
1 replies
1d4h

Capsule-capsule overlap can be detected by treating the capsules as line segments with radius/thickness. But I think you need something more complicated than capsule-capsule intersection to truly solve the problem (continuous collision detection between dynamic objects).

The Rapier project and its documentation[0] might be of interest to you. Rapier has a more sophisticated CCD implementation than most (popular in gamedev) physics engines.

Rapier implements nonlinear CCD, meaning that it takes into account both the angular and translational motion of the rigid-body.

[0] https://rapier.rs/docs/user_guides/rust/rigid_bodies#continu...

Animats
0 replies
12h22m

I haven't worked on collision detection since the 1990s. But convex hull collisions are very fast with GJK. And the space swept by translating a convex hull is itself a convex hull. So continuous collision detection (no fly-through) is possible that way. But that's usually overkill. Objects that move more than their own dimension in one frame time are usually bullets, and those should get special handling.

xeonmc
0 replies
1d4h

Isn’t it just a matter of checking if two lines’ perigee fall within radius sum, and if so check if the respective segments’ closest points to the perigee are within radius sum?

pornel
0 replies
1d4h

You make one of them have thickness of both, so that the collision detection becomes line vs capsule collision.

Another trick is to rotate both so that one of the move directions is axis-aligned, and then the problem looks more like AABB.

GistNoesis
0 replies
1d2h

For fine collision between capsules, you consider the capsules as 3D segments with thickness. Two of those collide if the distance between the segments is less than the sum of their half-thicknesses.

You can check this site with nice animations which explain how to compute the distance between segments : https://zalo.github.io/blog/closest-point-between-segments/
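
A sketch of that test (the segment-segment distance below is the standard clamped closest-point computation; all names here are made up):

    type Vec3 = { x: number; y: number; z: number };

    const sub = (a: Vec3, b: Vec3): Vec3 => ({ x: a.x - b.x, y: a.y - b.y, z: a.z - b.z });
    const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z;
    const clamp01 = (v: number): number => Math.max(0, Math.min(1, v));

    // Squared distance between segments p1-q1 and p2-q2.
    function segmentDistanceSq(p1: Vec3, q1: Vec3, p2: Vec3, q2: Vec3): number {
      const d1 = sub(q1, p1), d2 = sub(q2, p2), r = sub(p1, p2);
      const a = dot(d1, d1), e = dot(d2, d2), f = dot(d2, r), c = dot(d1, r);
      let s: number, t: number;
      if (a === 0 && e === 0) { s = 0; t = 0; }            // both segments degenerate to points
      else if (a === 0) { s = 0; t = clamp01(f / e); }
      else if (e === 0) { t = 0; s = clamp01(-c / a); }
      else {
        const b = dot(d1, d2), denom = a * e - b * b;
        s = denom !== 0 ? clamp01((b * f - c * e) / denom) : 0;   // denom === 0: parallel segments
        t = (b * s + f) / e;
        if (t < 0) { t = 0; s = clamp01(-c / a); }
        else if (t > 1) { t = 1; s = clamp01((b - c) / a); }
      }
      const c1 = { x: p1.x + d1.x * s, y: p1.y + d1.y * s, z: p1.z + d1.z * s };
      const c2 = { x: p2.x + d2.x * t, y: p2.y + d2.y * t, z: p2.z + d2.z * t };
      const d = sub(c1, c2);
      return dot(d, d);
    }

    interface Capsule { a: Vec3; b: Vec3; radius: number; }

    // Capsules overlap when their core segments come closer than the sum of their radii.
    function capsulesCollide(x: Capsule, y: Capsule): boolean {
      const r = x.radius + y.radius;
      return segmentDistanceSq(x.a, x.b, y.a, y.b) <= r * r;
    }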

One more generic way to compute the distance between shapes is to view it as a minimization problem. You take a point A inside object 1 and a point B inside object 2, and you minimize the squared distance between the points : ||A-B||^2 while constraining point A to object 1 and point B to object 2.

In the case of capsules, this is a "convex" optimization problem, so one way of solving for the points is to take a minimization (Newton) step and then project each point back onto its respective convex object (the projection is well defined and unique because of convexity). It usually needs fewer than 10 iterations.

In geometry, we often decompose objects into unions of convex shapes. One example is sphere trees, where you cover your object with spheres.

This allows you to use a fast coarse collision algorithm for spheres to find collision candidates for your capsules: you cover each capsule with spheres (a little bigger than its thickness) so that the capsule lies inside the union of the spheres.

sixthDot
2 replies
1d5h

The title made me curious because I expected more of a post about the `intersects` function, e.g. the point-in-polygon algorithm... it turns out it's more about the complexity implied by the number of objects to test, which in the end is also interesting.

veryrealsid
1 replies
1d2h

我Ronald Hansen

sixthDot
0 replies
22h3m

E Don Lancaster

nox101
2 replies
18h48m

A long time ago I did something similar, but rather than sorting, I just kept index lists for each direction and the objects sorted themselves. Meaning there are 4 lists, `objectIndicesSortedByLeftEdge/RightEdge/TopEdge/BottomEdge`. If an object moves horizontally, it updates its own indices in the leftEdge and rightEdge arrays. Since it only moves a little per frame, it likely only has to swap 1 or 2 indices at most to move itself.
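
Roughly like this for one of those lists (a sketch with hypothetical names; the object would also cache its own position in the list so it doesn't have to search for it):

    // One per-axis list: object indices kept sorted by, say, each object's left edge.
    // After an object moves, bubble its entry towards its new place; for small
    // per-frame movements this is only a swap or two.
    function repositionInList(
      list: number[],                        // object indices sorted by key
      pos: number,                           // current slot of the moved object's entry
      key: (objectIndex: number) => number,  // e.g. returns that object's left edge
    ): number {
      while (pos > 0 && key(list[pos - 1]) > key(list[pos])) {
        [list[pos - 1], list[pos]] = [list[pos], list[pos - 1]];
        pos--;
      }
      while (pos + 1 < list.length && key(list[pos + 1]) < key(list[pos])) {
        [list[pos + 1], list[pos]] = [list[pos], list[pos + 1]];
        pos++;
      }
      return pos;  // new slot, so the object can cache it for next time
    }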

novaleaf
1 replies
18h31m

I think your method seems useful for scenes that are mostly static. The "recreate the graph" approach seems better the more dynamic things are.

nox101
0 replies
17h48m

Objects are unlikely to jump from one side of the screen to the other, so comparing them to stuff on the other side of the screen seems like a waste (which is what a generic sort will do).

MrLeap
2 replies
18h34m

I'm a big fan of spatial grid hashes to simplify situations like this. Nice to see other approaches. Iterative Insertion sorting would be easier to port to a compute shader than a spatial grid hash, so maybe this method is better if we get into the millions of objects range?
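
For reference, a spatial grid hash can be as small as this sketch (hypothetical names; assumes roughly uniform object sizes so a single cell size fits):

    // Map from a cell key to the indices of objects whose centers fall in that cell.
    function buildSpatialHash(objects: { x: number; y: number }[], cellSize: number): Map<string, number[]> {
      const grid = new Map<string, number[]>();
      objects.forEach((o, i) => {
        const key = `${Math.floor(o.x / cellSize)},${Math.floor(o.y / cellSize)}`;
        const cell = grid.get(key);
        if (cell) cell.push(i);
        else grid.set(key, [i]);
      });
      return grid;
    }

    // Candidate neighbours of an object: its own cell plus the 8 surrounding cells.
    function candidates(grid: Map<string, number[]>, o: { x: number; y: number }, cellSize: number): number[] {
      const cx = Math.floor(o.x / cellSize), cy = Math.floor(o.y / cellSize);
      const out: number[] = [];
      for (let dx = -1; dx <= 1; dx++)
        for (let dy = -1; dy <= 1; dy++)
          out.push(...(grid.get(`${cx + dx},${cy + dy}`) ?? []));
      return out;
    }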

Jarmsy
1 replies
9h0m

Isn't spatial hashing most suited to colliding objects of similar sizes, and with similar dimensions in each axis? (So you can match the grid size accordingly) If you've got a mix of very different sizes and shapes, I think sweep and prune can be a better choice

deliveryboyman
0 replies
1h17m

Yes, in TFA the author uses objects that are the same size. I imagine that is why OP brought up spatial hashing since they specifically mention this.

rendaw
1 replies
1d4h

I hadn't seen this before, but isn't it similar to using something like a quad-tree to reduce the number of potential colliders?

deliveryboyman
0 replies
1d4h

Yes, although you're more likely to see something like a k-d tree in offline rendering than in real-time rendering.

bee_rider
1 replies
1d4h

Since the balls probably don’t move much per frame, should the list be considered “nearly sorted?”

Sharlin
0 replies
1d1h

Yes, the author gets to that in the second part.

RaftPeople
1 replies
23h48m

I won’t cover other approaches, such as space partitioning or spatial tree subdivision

I'm curious about this comment, anyone know if the algorithm in the article is generally faster than space partitioning/spatial tree subdivision?

A long time ago I used a spatial-tree type of approach, which naively seemed to be a pretty good approach, but I never investigated or compared the algorithms other people were using (this was the '80s, pre-internet).

kevingadd
0 replies
23h23m

The complexity involved in maintaining space partitioning, tree subdivision etc can end up being a big hassle, especially if you have huge numbers of moving objects.

It's much easier to write, debug and optimize something that manages a single list of entities, or a grid of e.g. 256x256 'cells' that each contain a list of entities, than it is to set up a complex partitioning scheme that maintains all your tree invariants every time an object moves.

In the days of e.g. DOOM or Quake, the performance of these underlying systems was much more important than it is now, so it (IMO) made more sense for the authors of those engines to cook up really complex partitioning systems. But these days CPUs are really good at blasting through sorted arrays of items, and (comparatively) less good at chasing linked lists/trees than they used to be, due to pipelining. Your CPU time isn't going to be spent on managing those entity lists but will instead be spent on things like AI, rendering, etc.

zengid
0 replies
1h0m

This is awesome, very beautiful and well written!

Are there other handy 2-d algorithms that are this well explained? I know redblobgames has a treasure trove too, but curious about more.

yazzku
0 replies
16h44m

Not mentioned is spatial hashing, which works well without complex data structures.

Also sad that you need all this complexity to test 25 balls in JavaScript. 25^2 is a small number in C, especially when your balls fit in cache.

Still a good introduction to various algorithms, though.

syntaxing
0 replies
1d3h

Super interesting. My first thought before I read the article was "why not a Bloom filter?", but I didn't expect it to be "physical" collision.

marginalia_nu
0 replies
1d5h

I enjoyed the use of illustration. Seems like an appropriate usage.

Sometimes I feel like these articles with interactive illustrations are more like an excuse to put together a bunch of cool demos, like there's a lot of fluff with not much substance (a bit like a TED talk), but this one didn't let the illustrations take over.

layer8
0 replies
1d2h

The animations don’t show on iOS Safari.

denvaar
0 replies
1d3h

Got distracted (in a good way) by this website. It's fun and inspiring.

bob1029
0 replies
20h51m

I've always enjoyed this document regarding continuous collision detection:

https://github.com/bepu/bepuphysics2/blob/master/Documentati...

The library itself is amazing in terms of performance. It is a bit challenging to integrate with though due to the amount of optimization involved.

bambax
0 replies
1d3h

Excellent article. Sorting the list is a really simple and neat idea, without the need for clustering or a special data structure.

FrustratedMonky
0 replies
1d2h

Very nice animated examples.

AndrewKemendo
0 replies
15h18m

This was really well put together.

What’s funny is that I’ve been doing some form of gamedev since late 90s and most of this is abstracted by engines at this point, but this is essential to understanding how complex system simulations work.

Thanks to the author for making a very accessible article