I couldn't find any written explanation of the GJK collision detection algorithm which gave good intuition for it, so I spent an afternoon writing one up. Let me know of any ways I could make it more clear & efficient.
And, of course: Take this with the appropriate amount of salt for a high school sophomore's explanation of anything math related.
I am a mathematician, and the worst that I can say is that, if I were writing for a mathematical audience, I might have phrased some things slightly differently. (Oh, and the title should be "as simply as possible", not "as simply as possible.") I didn't know the GJK algorithm before, but, if I were teaching Calculus III now, I'd probably see if I could find a way to work this in. That's how good your explanation is.
I often read HN in the morning after waking up, and these kinds of mistakes in the comments always make me feel like I'm having a stroke.
I tried to find a difference for too long. Even did Cmd+F to make sure I'm not crazy :)
Ha, right, thanks! The original title was "as simply as possibly."
'Praising with faint critique'?
I had hoped so, but I can see how it wouldn't come across that way.
Is this algorithm guaranteed to terminate?
For the "smooth rounded rectangle" example at the end of the article, what's to stop it from getting closer-and-closer to an answer, but never actually getting there?
I'm talking about idealized mathematical objects, of course. I realize that in real-world computing there's no point continuing beyond some practical precision threshold.
Termination conditions for GJK are hard, and caused a lot of trouble. Search for "GJK algorithm" and "termination". There are several academic papers, and discussions in the game dev community.
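To make the termination question concrete, here is a minimal 2D sketch of the naive boolean GJK loop, not the production-quality version. All names (`gjk_intersect`, `polygon_support`, `circle_support`, `max_iter`) are made up for this illustration. It stops in one of three ways: the new support point fails to pass the origin (no overlap), the simplex encloses the origin (overlap), or an assumed iteration budget runs out, which is the practical cutoff for smooth shapes that keep getting closer without ever getting there.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def neg(u):
    return (-u[0], -u[1])

def perp_toward(edge, target):
    """Perpendicular to `edge`, on the side where `target` points."""
    p = (-edge[1], edge[0])
    return p if dot(p, target) >= 0 else neg(p)

def polygon_support(vertices):
    """Support function of a convex polygon: farthest vertex along d."""
    return lambda d: max(vertices, key=lambda v: dot(v, d))

def circle_support(center, radius):
    """Support function of a disc, a smooth shape with no vertices."""
    def support(d):
        n = math.hypot(d[0], d[1])
        return (center[0] + radius * d[0] / n, center[1] + radius * d[1] / n)
    return support

def gjk_intersect(support_a, support_b, max_iter=64):
    """Naive 2D GJK overlap test. max_iter is an assumed practical cutoff."""
    def support(d):
        # Support point of the difference set A - B in direction d.
        return sub(support_a(d), support_b(neg(d)))

    simplex = [support((1.0, 0.0))]
    d = neg(simplex[0])
    for _ in range(max_iter):
        if d == (0.0, 0.0):
            return True               # a simplex point sits on the origin
        a = support(d)
        if dot(a, d) <= 0:
            return False              # could not pass the origin: no overlap
        simplex.append(a)
        ao = neg(a)
        if len(simplex) == 2:
            b = simplex[0]
            d = perp_toward(sub(b, a), ao)
            continue
        c, b, _ = simplex
        ab, ac = sub(b, a), sub(c, a)
        ab_perp = perp_toward(ab, neg(ac))   # points away from c
        ac_perp = perp_toward(ac, neg(ab))   # points away from b
        if dot(ab_perp, ao) > 0:
            simplex, d = [b, a], ab_perp
        elif dot(ac_perp, ao) > 0:
            simplex, d = [c, a], ac_perp
        else:
            return True               # origin is inside the triangle
    return False                      # budget exhausted: report no overlap

square = polygon_support([(0, 0), (2, 0), (2, 2), (0, 2)])
print(gjk_intersect(square, circle_support((3, 1), 1.5)))
print(gjk_intersect(square, circle_support((5, 1), 1.0)))
```

Returning False when the budget runs out is a choice made for this sketch. It has no handling for touching contact or round-off, and those are the cases where the termination trouble shows up.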
GJK works on convex polyhedra. The "convex" requirement is strict. For example, a cube with tessellated faces is not convex. The two triangles that make up the face of a cube are co-planar, and the edge between them has a zero break angle. This messes up GJK. It's necessary to be able to handle flat polygon faces, not just triangles. That creates a new problem. Polygons are not perfectly flat due to round-off error in floating point. What works is computing a convex hull with a minimum break angle of about 1 degree. That creates geometry GJK can process.
GJK is an example of an algorithm for which there is considerable distance between the naive algorithm that mostly works and the production-quality version that works reliably. Some other algorithms like that include:
- JPEG 2000 decoding. The open source reference implementation is slow and somewhat buggy. There is proprietary code that is faster and works reliably, but it is rather expensive. This is partly why JPEG 2000 never really caught on, outside of medical imagery.
- Mesh simplification. There are two common open source approaches - local triangle merging and quadric mesh simplification. Both can create a big mess if pushed too hard. The metric you really want is that the reduced mesh should look roughly the same as the original when seen from a distance at any angle. Neither of those approaches achieves that. The open source code in Blender works, but the results are disappointing for many cases. Simplygon and Unreal Engine have better approaches, but are proprietary.
- Convex hull decomposition. Given a non-convex mesh, produce a set of disjoint convex meshes which add up to the original. This is a preprocessing step before you can use GJK. Most algorithms for this produce terrible-looking decompositions, with long, skinny triangles. There's something called approximate convex hull decomposition, which relaxes the "disjoint" requirement, allowing some overlap. This results in much simpler sets of convex meshes. There's open source code for this, but it crashes if the input geometry is not absolutely correct, and sometimes when it is.
- Constructive solid geometry. Create a cylinder. Create a thread form. Extrude the thread form along a spiral. Subtract that from the cylinder to get a screw. Chamfer the end. Autodesk Inventor has been able to do that since about 2014, which is what you get for the big bucks. There's some open source code, but historically it hasn't handled the hard cases well. This matters when the next step is cutting the part out of metal.
- Layered clothing for game characters. Create a game character. Set them into an A-pose (arms out, legs apart). Put a rigged mesh shirt on them. Resize so that the shirt fits and the body doesn't poke through. Add a coat created by someone other than the person who created the shirt. Resize so the coat fits and the shirt doesn't poke through. Now observe that if the character moves out of A-pose, you probably get some peek-through at shoulders and elbows. Automatically adjust the mesh rigging to fix that. Clo and Marvelous Designer have non-real-time solutions to this, and Roblox has a real-time solution. There's no known open source solution.
Each of these problems is good for years of thesis-level hard work.
For mesh simplification, what about https://github.com/wjakob/instant-meshes ? I know it does the thing you required, but it doesn't really do variable density (you could tell it what areas need what density and it would do that; the paper has a demonstration of this).
Computational geometry has a technique for avoiding all of the complex cases where points are collinear, coplanar, cocircular, etc. I've forgotten the technique's name, but coordinates are integers plus an epsilon term. As the epsilon goes to zero, you get the original statement of the problem.
The epsilon terms are used for breaking ties. If you are deciding if a point is to the right or left of a line, you can use the factors on epsilon terms to decide. If the limit of the point for epsilon close to but greater than 0 is right or left, you use that answer.
You can use multiple epsilon terms, one for each dimension, and claim that the x-dimension approaches zero faster than the y- or z-dimension, etc. It makes it all work as integer linear algebra with a simple greater-than or less-than on the scalar result.
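This sounds like what is usually called symbolic perturbation (I believe Edelsbrunner and Mücke's "Simulation of Simplicity" is the classic reference, but treat that as my guess at the forgotten name). Here is a small sketch of the idea for the 2D left/right test, assuming integer coordinates. Every coordinate becomes a polynomial in epsilon, and the sign of the determinant is the sign of its lowest-order nonzero coefficient. The particular exponent scheme and all function names are assumptions picked for this illustration.

```python
def poly_sub(p, q):
    """Subtract polynomials stored as {exponent of eps: integer coefficient}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) - c
    return out

def poly_mul(p, q):
    """Multiply two polynomials in eps."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out

def perturb(point, index):
    """Point number `index` becomes (x + eps^(2^(2i)), y + eps^(2^(2i+1))).

    Assumed scheme: distinct powers of two as exponents, so no two
    perturbation products can cancel each other out.
    """
    x, y = point
    return {0: x, 2 ** (2 * index): 1}, {0: y, 2 ** (2 * index + 1): 1}

def sign_as_eps_goes_to_zero(poly):
    """Sign for tiny positive eps: the lowest-order nonzero term wins."""
    for e in sorted(poly):
        if poly[e] != 0:
            return 1 if poly[e] > 0 else -1
    return 0

def orientation(points, i, j, k):
    """+1 for a left turn i -> j -> k, -1 for a right turn.

    Never returns 0 for three distinct indices, even for collinear input.
    """
    (ax, ay), (bx, by), (cx, cy) = (perturb(points[n], n) for n in (i, j, k))
    det = poly_sub(
        poly_mul(poly_sub(bx, ax), poly_sub(cy, ay)),
        poly_mul(poly_sub(by, ay), poly_sub(cx, ax)),
    )
    return sign_as_eps_goes_to_zero(det)

collinear = [(0, 0), (1, 1), (2, 2)]
print(orientation(collinear, 0, 1, 2))   # nonzero: the tie is broken consistently
```

When the points are in general position, the constant term decides and the answer matches the ordinary determinant. The epsilon terms only matter in the degenerate cases. Because the exponents double with each point index, a real implementation would expand the tie-breaking rules by hand instead of building these polynomials.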
In FreeCAD I didn't encounter any issue doing just that. Of course we are 10 years later, so maybe things have improved.
I made quite elaborate custom threads for 3D printing this way.
I’m confused by the three sets in the second figure: A, B and A-B. At first I took it to mean that there was some transformation of sets A and B that resulted in shape A-B. After rereading a couple times I think A-B is supposed to represent an intersection of some other sets A and B, not the two present to the left, and the significance of this intersection is that it overlaps the origin or 0,0. Is that right?
Hope this feedback is helpful.
It's literally a difference of coordinates, not an intersection. Take all pairs of points in the (filled) polygons A and B, subtract their coordinates, and include the difference in A-B.
If there is an intersection, there is at least one pair that consists of the same point twice; the difference of coordinates will be (0,0) and therefore, if there is an intersection then A-B contains the origin.
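As a brute-force illustration of that (not GJK itself, which avoids building A-B explicitly), here is a sketch that forms all pairwise vertex differences, takes their convex hull, and checks whether the hull contains the origin. It assumes A and B are convex polygons with nonzero area, given as vertex lists. For convex polygons the hull of the vertex differences is the whole of A-B. The function names are made up for this example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def minkowski_difference(a_vertices, b_vertices):
    """Hull of every a - b over the vertices of two convex polygons."""
    diffs = [(a[0] - b[0], a[1] - b[1]) for a in a_vertices for b in b_vertices]
    return convex_hull(diffs)

def contains_origin(hull):
    """True if (0, 0) is inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], (0, 0)) >= 0 for i in range(n))

def polygons_intersect(a_vertices, b_vertices):
    return contains_origin(minkowski_difference(a_vertices, b_vertices))

A = [(0, 0), (2, 0), (2, 2), (0, 2)]
B = [(1, 1), (3, 1), (3, 3), (1, 3)]   # overlaps A
C = [(5, 5), (6, 5), (6, 6), (5, 6)]   # far from A
print(polygons_intersect(A, B))   # → True
print(polygons_intersect(A, C))   # → False
```

This does work proportional to the product of the vertex counts. GJK gets the same answer by sampling only a few points of A-B through support functions.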
I'd never heard of this algorithm before (or never looked into it if I had), and this explanation was great. The 3 visuals with A, B, and A-B gave a good "yeah, I could see how that works" and then the further explanation sold why it really does always work. Nice work and cool info, thanks for sharing!
This is great!
Very clear write-up.
You have a talent. Please keep doing this, and who knows, maybe one day you will write a fantastic textbook.
I thought "as nice as this is already, what's missing from perfect" and here's what I can say:
- add a few comments on worst-case runtime complexity
- separate section about termination
- I always enjoy pseudo-code being part of an explanation. Your explanation comes from a mathematical point of view, which works very well, and which should be kept, but perhaps after each step in the maths explanation you could add a few lines of pseudocode that "capture" how far you got with the algorithm, e.g. define auxiliary functions like S(•).
But again, these are nitpicky extension proposals to make something already great perfect because I love it, not to criticize it.
PS: I also like your write-up about hidden OpenAI models (I always check out what else people did whose output impressed me, and that additional time investment is nearly always worth it).