return to table of content

The GJK Algorithm: A weird and beautiful way to do a simple thing

arithmoquine
16 replies
1d

I couldn't find any written explanation of the GJK collision detection algorithm which gave good intuition for it, so I spent an afternoon writing one up. Let me know of any ways I could make it more clear & efficient.

And, of course: Take this with the appropriate amount of salt for a high school sophomore's explanation of anything math related.

JadeNB
5 replies
19h22m

I am a mathematician, and the worst that I can say is that, if I were writing for a mathematical audience, I might have phrased some things slightly differently. (Oh, and the title should be "as simply as possible", not "as simply as possible.") I didn't know the GJK algorithm before, but, if I were teaching Calculus III now, I'd probably see if I could find a way to work this in. That's how good your explanation is.

bschwindHN
2 replies
18h30m

Oh, and the title should be "as simply as possible", not "as simply as possible."

I often read HN in the morning after waking up, and these kinds of mistakes in the comments always make me feel like I'm having a stroke.

vincnetas
0 replies
13h6m

I try to find a difference for too long. Event did Cmd+F to make sure im not crazy :)

JadeNB
0 replies
18h17m

Ha, right, thanks! The original title was "as simply as possibly."

burnished
1 replies
18h38m

'Praising with faint critique'?

JadeNB
0 replies
18h16m

'Praising with faint critique'?

I had hoped so, but I can see how it wouldn't come across that way.

interroboink
4 replies
17h33m

Is this algorithm guaranteed to terminate?

For the "smooth rounded rectangle" example at the end of the article, what's to stop it from getting closer-and-closer to an answer, but never actually getting there?

I'm talking about idealized mathematical objects, of course. I realize that in real-world computing there's no point continuing beyond some practical precision threshold.

Animats
3 replies
15h24m

Termination conditions for GJK are hard, and caused a lot of trouble. Search for "GJK algorithm" and "termination". There are several academic papers, and discussions in the game dev community.

GJK works on convex polyhedra. The "convex" requirement is strict. For example, a cube with tesselated faces is not convex. The two triangles that make up the face of a cube are co-planar, and the edge between them has a zero break angle. This messes up GJK. It's necessary to be able to handle flat polygon faces, not just triangles. That creates a new problem. Polygons are not perfectly flat due to round-off error in floating point. What works is computing a convex hull with a minimum break angle of about 1 degree. That creates geometry GJK can process.

GJK is an example of an algorithm for which there is considerable distance between the naive algorithm that mostly works and the production-quality version that works reliably. Some other algorithms like that include:

- JPEG 2000 decoding. The open source reference implementation is slow and somewhat buggy. There is proprietary code that is faster and works reliably, but it is rather expensive. This is partly why JPEG 2000 never really caught on, outside of medical imagery.

- Mesh simplification. There are two common open source approaches - local triangle merging and quadric mesh simplification. Both can create a big mess if pushed too hard. The metric you really want is that the reduced mesh should look roughly the same as the original when seen from a distance at any angle. Neither of those approaches achieve that. The open source code in Blender works, but the results are disappointing for many cases. Simplygon and Unreal Engine have better approaches, but are proprietary.

- Convex hull decomposition. Given a non-convex mesh, produce a set of disjoint convex meshes which add up to the original. This is a preprocessing step before you can use GJK. Most algorithms for this produce terrible-looking decompositions, with long, skinny triangles. There's something called approximate convex hull decomposition, which relaxes the "disjoint" requirement, allowing some overlap. This results in much simpler sets of convex meshes. There's open source code for this, but it crashes if the input geometry is not absolutely correct, and sometimes when it is.

- Constructive solid geometry. Create a cylinder. Create a thread form. Extrude the thread form along a spiral. Subtract that from the cylinder to get a screw. Chamfer the end. Autodesk Inventor has been able to do that since about 2014, which is what you get for the big bucks. There's some open source code, but historically it hasn't handled the hard cases well. This matters when the next step is cutting the part out of metal.

- Layered clothing for game characters. Create a game character. Set them into an A-pose (arms out, legs apart). Put a rigged mesh shirt on them as a rigged mesh. Resize so that the shirt fits and the body doesn't poke through. Add a coat created by someone else than the one who created the shirt. Resize so the coat fits and the shirt doesn't poke through. Now observe that if the character moves out of A-pose, you probably get some peek-through at shoulders and elbows. Automatically adjust the mesh rigging to fix that. Clo and Marvelous Designer have non-real-time solutions to this, and Roblox has a real-time solution. There's no known open source solution.

Each of these problems is good for years of thesis-level hard work.

namibj
0 replies
10h50m

For mesh simplification, what about https://github.com/wjakob/instant-meshes ? I know it does the thing you required, but it doesn't really do variable density (you could tell it what areas need what density and it would do that; the paper has a demonstration of this).

mdnahas
0 replies
1h4m

Computational Geometry has a technique for avoiding all of the complex cases where point are colinear, coplanar, cocircular, etc.. I’ve forgotten the techniques’ name, but coordinates are integers plus an epsilon term. As the epsilon goes to zero, you get the original statement of the problem.

The epsilon terms are used for breaking ties. If you are deciding if a point is to the right or left of a line, you can use the factors on epsilon terms to decide. If the limit of the point for epsilon close to but greater than 0 is right or left, you use that answer.

You can use multiple epsilon terms, one for each dimension and claim that the x-dimension approaches zero faster than the y or z-dimension, etc. It makes it all work as integer linear algebra with simple greater than or less than on scalar result.

bombela
0 replies
5h51m

- Constructive solid geometry. Create a cylinder. Create a thread form. Extrude the thread form along a spiral. Subtract that from the cylinder to get a screw. Chamfer the end. Autodesk Inventor has been able to do that since about 2014, which is what you get for the big bucks. There's some open source code, but historically it hasn't handled the hard cases well. This matters when the next step is cutting the part out of metal.

In freecad I didn't encounter any issue doing just that. Of course we are 10y later, so maybe things have improved.

I made quite elaborate custom threads for 3D printing this way.

adolph
1 replies
15h56m

I’m confused by the three sets in the second figure: A, B and A-B. At first I took it to mean that there was some transformation of sets A and B that resulted in shape A-B. After rereading a couple times I think A-B is supposed to represent an intersection of some other sets A and B, not the two present to the left, and the significance of this intersection is that it overlaps the origin or 0,0. Is that right?

Hope this feedback is helpful.

bonzini
0 replies
11h54m

It's literally a difference of coordinates, not an intersection. Take all pairs of points in the (filled) polygons A and B, subtract their coordinates, and include the difference in A-B.

If there is an intersection, there is at least one pair that consists of the same point twice; the difference of coordinates will be (0,0) and therefore, if there is an intersection then A-B contains the origin.

zamadatix
0 replies
18h29m

I'd never heard of this algorithms before (or never looked into it if I had) and this explanation was great. The 3 visuals with A, B, and A-B gave a good "yeah, I could see how that works" and then the further explanation sold why it really does always work. Nice work and cool info, thanks for sharing!

rayiner
0 replies
21h31m

This is great!

jll29
0 replies
13h20m

Very clear write-up.

You have a talent, please keep doing this and who knows maybe one day you will write a fantastic text book.

I thought "as nice as this is already, what's missing from perfect" and here's what I can say:

- add a few comments on worst-case runtime complexity

- separate section about termination

- I always enjoy pseudo-code being part of an explanation. Your explanation comes from a mathematical point of view, which works very well, and which should be kept, but perhaps after each step in the maths explanation you could add a few lines of pseudocode that "capture" how far you got with the algorithm, e.g. define auxiliary functions like S(•).

But again, these are nitpicky extension proposals to make something already great perfect because I love it, not to criticize it.

PS: I also like your write-up about hidden Open AI models (I always check out what else people did whose output impressed me, and that additional time investment is nearly always worth it).

unholiness
9 replies
21h21m

There's probably a few things incorrect with what I said. So, take it with the appropriate amount of salt for a high school sophomore's explanation of anything math related.

A high school sophomore! I'm thoroughly impressed, not just by the clarity of this explanation, but by the tone, showing appreciation for where and how the elegance pops up, and empathy for a reader who's never heard of these concepts. Throw in the minimal web design I was sure I was reading some graybeard.

Excellent writing, you should be proud of this.

OJFord
8 replies
20h56m

high school sophomore

For the rest of us (and because I've previously heard that in films etc. about university students and never really known exactly what it meant except that it's one of the years) that's the second year of high school, which in a middle-school-having system (as is the norm in the US) is year 10, which in the US is two years (not three as it would be in the UK for example) of high school away from university.

Just a bit of a tangent to understand. Impressive indeed.

63
7 replies
20h52m

For reference, the expected age would be ~15 years old

IncreasePosts
6 replies
20h49m

An easy rule of thumb for US grade-level ages is 5 + grade level = age of students in that grade

jameshart
5 replies
20h1m

   Type mismatch error: operation ‘+’ can not be applied to values ‘5’ and ‘“sophomore”’

trealira
3 replies
19h33m

Another idiosyncrasy. "Elementary school" goes kindergarten, first grade, second grade, ..., fifth grade; then "middle school" is sixth, seventh, and eighth grades; but high school is freshman, sophomore, junior, and senior, and it's very common to just say ninth grade, tenth grade, eleventh grade, and twelfth grade interchangeably with the former terms. Finally, you can say "grade one" and "first grade," "grade two" and "second grade," etc., interchangeably as well.

gleenn
1 replies
19h18m

An extra twist in the US is elementary school tends to end at 6th grade and middle school is 7th and 8th or sometimes elementary ends a year early at 5th and middle school absorbs 6th grade to be 3 years long.

derefr
0 replies
16h59m

It's not so much that there's a formal standard for what years "should" be middle school in the US.

School districts in the US almost always start without established middle schools. Instead, there's so much elementary school capacity and so much secondary school capacity in a district. And then, whenever the population of students in the district changes, the choices of the district school board to respond to that are either to:

1. build one additional elementary school for every N additional grade 1-7 students, and/or one additional secondary school for every M additional grade 8-12 students (and continue to pay these CapEx costs even if student population goes down; and agree to deal with the fact that growth in elementary students will inevitably propagate to growth in secondary students 7 years later);

2. add temporary capacity to existing elementary + secondary schools with portable classrooms (which can easily be reduced back down if populations decrease — but which don't come with commensurate increases in central infrastructure, eventually straining the capacity of school facilities);

3. or, at one point, you look at the numbers of students in each year in your district, treat the years as vnodes, and then rebalance a contiguous set of those vnodes to a new "shard" called "middle schools". I.e. you pick a set of school-years, where extracting students from that set of years from one or two elementary and secondary schools in the district into a newly-built middle school, will result in a "good" amount of capacity being freed from those schools, and the resulting middle school instantly having a "good" utilization.

Once you've chosen option 3 one time, it sets a precedent in that district for what years middle school should cover in that district. Most districts only do the study for what years would be best to extract out the first time they build a middle school. After that, they just go with what they did before, even if it's no longer optimal (because skipping the study saves time and money.)

In theory, school districts could actually build successive middle schools that "pull" different years of utilization out of the elementary+secondary schools around them than previously-built middle schools have. The only issue would be in administrative assignment of students to schools — this would require you to place middle schools with different year coverage so they never have overlapping catchment area.

lelandfe
0 replies
19h9m

Remarkable writing considering OP is just NaN

arithmoquine
4 replies
21h12m

Now that this is unexpectedly getting more attention, I should probably mention that my personal website is basically an elaborate set of in-jokes. Let me know via a reply if you're interested in contacting me or anything!

xelxebar
0 replies
15h51m

You're my kind of people, and your website brings me joy. Keep on keeping on! Throw me your deets and we can chat.

epiccoleman
0 replies
17h36m

I love your site and think you're cool. Keep doing cool stuff!

bsubs
0 replies
19h7m

If you’d be interested in being mentored in a research project send me an email; bersub@cmu.edu

JackYoustra
0 replies
20h8m

I'd be interested!

Animats
3 replies
21h31m

I spent about a year struggling with GJK back in the 1990s.

It's useful for collision detection in 3D, and it can also be used as a closest-points algorithm. As a closest-points algorithm, it's easy to understand the basics. You have two convex solids. Start with a random point on each, and get the distance between those points. On each solid, try to improve the distance by advancing along each edge from the current point. Pick the new closest point. Repeat.

This stops working when the closest point is no longer a vertex. That's when the "simplex" concept is needed. The closest points can be vertex-vertex, vertex-edge, vertex-face, edge-edge, edge-face (no unique solution) and face-face (no unique solution). The simplex thing is really doing a case analysis of those cases.

Lots of things go wrong in practice. If you are doing a physics engine, objects will often settle into face-face contact. A single-point collision model will then result in vibration or incorrect movement. Also, as positions settle into face-face contact, the GJK algorithm runs into a small difference of large values problem that can result in a complete loss of floating point significance. The termination condition can also cause infinite looping.

In theory, this is an elegant solution, but it's a difficult numerical analysis problem. Despite that, it's probably the fastest approach to the problem. It's O(log N) in normal cases, and O(1) for situations where the positions are closest to the previous positions and you use the last solution as the starting point.

The late Prof. Steven Cameron at Oxford did a lot of work on getting GJK right. I used GJK in "Falling Bodies", the first commercial 3D ragdoll system, in the late 1990s.

verticalscaler
1 replies
18h54m

Awesome as always.

https://www.youtube.com/watch?v=5lHqEwk7YHs

Did the patent expire yet? Would you consider releasing the code? It sounds of historical significance and an interesting doom-like read.

Animats
0 replies
15h58m

Here's Prof. Stephen Cameron's gjk, which I used.[1]

Falling Bodies itself was a plug-in for Softimage, a defunct 3D animation program. It's not very useful without Softimage.

[1] https://www.cs.ox.ac.uk/stephen.cameron/distances/

maccard
0 replies
20h23m

Couldn't agree more. One thing to note is that once you have that contact, you almost certainly want to do something with it. In most cases doing something useful involves knowing the actual overlap information. Working that out is actually _worse_ numerically - you start with the simplex that GJK generates and expand out the way, and do triangle subdivision along the way. It's a total nightmare to do performantly.

saturnV
0 replies
6h17m

I have made an implementation based on this article before.

https://dyn4j.org/2010/04/gjk-gilbert-johnson-keerthi/

It was years ago but I remember it was the only good ressource I could find that allowed me to understand how it works.

bruce343434
2 replies
10h36m

How do you get the center of mass of the intersection volume? And in general how do you get the volume as a single number? You will need these to resolve the collisions, but this part is always always always glossed over!

maccard
1 replies
9h38m

I assume by center of mass of the intersection volume you actually mean you want the contact point(s).

And in general how do you get the volume as a single number?

It's not a single number. Imagine a triangle overlapping a square, it's actually a plane, and a depth. To figure that out you use EPA - here's [0] a video explaining it. You project the simplex outwards until it hits the minkowski difference. It's remarkably similar to GJK in implementation. IT's also a pain in the butt to do in a numerically stable way.

[0] https://handmade.network/forums/t/1521-gjk_+_expanding_polyt...

bruce343434
0 replies
8h40m

I was thinking of resolving collisions the way you would simulate boyancy: by applying a force and torque from a certain point (the center of boyancy, the center of the subvolume that is submerged), calculated by pressures (i.e., which areas of the object "dip into" the other object). This should also fix the bouncy jitter that happens when objects lie flat on their face, without needing a "ground" state.

Imagine a bug coming from the ground state design: an object is grounded onto another object, but that other object slowly tilts. The grounded object never slides off, but remains suspended in air.

ubj
1 replies
19h0m

Excellent article! Very clear and engaging.

If checking for an intersection between two convex sets, another method is to solve a convex optimization problem minimizing the norm of the difference between two points, one constrained to be in the first (convex) set and the second constrained to be in the second (convex) set. If the optimal value is zero, then the sets have an intersection.

I'd be curious to see a comparison between the GJK algorithm and using convex optimization. Not sure if either has an advantage over the other.

blt
0 replies
11h37m

Interesting question. Seems like interior-point methods might terminate quickly when the overlap is significant. You could probably add some clever early stopping conditions too.

tehsauce
1 replies
19h41m

Awesome article! Something slightly misleading though - the first image shows the intersection of a non-convex shape, but it isn't revealed until much later that the algorithm only works for convex shapes, not the type shown in the first image.

JadeNB
0 replies
18h44m

It is discussed that the algorithm handles non-convex shapes by breaking them into convex shapes.

zeroonetwothree
0 replies
3h17m

Does it work if they intersect in a single (non vertex?) point? It seems like this would require exact equality testing of floats, which is impossible. So I assume in practice you allow for some margin.

xpe
0 replies
3h49m

Suggestion for the author of computerwebsite.net : include dates of posting and edits. (This suggestion also applies to anyone who writes on the web.)

ryankrage77
0 replies
20h37m

I've been using the Minkowski function in openSCAD for a while now, it's nice to know what it actually is.

rdevsrex
0 replies
21h15m

This was a great writeup. I've never heard of the GJK algorithm before.

pjs_
0 replies
20h38m

Bad boy algorithm. Amazing algorithm

jprete
0 replies
3h12m

I'd never heard of GJK before, but as I remember it, there's a lot of mileage to be gained in computational geometry from applying these kinds of "shape algebra" options in different combinations. And there are plenty of problems that can be adapted to computational geometry formulations by working in the state space.

I think deep learning draws on the same pool of concepts as well.