The code looks heavily obfuscated. It's more like "source available" than open source. E.g.
g(_M,W-=1<<f;xx=M[f];M[f]=x)Ui(M_,Ux=M[i];x?(W+=1<<i,M[i]=xx,x):30>i?_M(i,M_(i+1))+(2*n0<<i):OO())f(_r,pt?x:65535&rx?(--rx,x):e(if(!Tx)n(_r(xU))_M(mx,x-n0);x))f(r_,pt?x:65535&++rx?x:OO())
Edit: Looking at it a bit more, I can't tell if the code is obfuscated or if the author really wrote it like this...
You may not believe it, but that's how K/Q/J people write C code.
Bonus: go visit that website and do "View Source". Even the HTML has the fragrance of K.
I don't understand how. How do you debug something like this? How do you go about fixing a bug? There are no docs and no tests. It seems like one would spend hours just trying to understand what's going on. Is there a good source on their methodology? Because my mind is blown.
I recommend checking out more of the APL language family and the history of the notation; highly interesting. Almost like a parallel universe of computing, once you look past the syntax.
Yes, but C isn't APL. I don't buy that this is how it was written from day one. Occam's razor and all: this is obfuscated C, not code written by an alien superintelligence.
Haven't you ever written code with single-letter variable names that made sense to you, and then been forced to read somebody else's code with single-character variable names and found it completely inscrutable? This is just that on (a lot of) steroids.
No, I don't write entire C programs with single-letter variables, because there is no case where "c" is more readable than "cnt" or "count", with the usual exceptions of "for (int i", and x and y inside small scopes.
If I were paid by the hour to write C, then I'd use single-letter variables too, but I'm too lazy to do twice the work when I can make my life simpler.
Simplicity is a virtue, there is nothing interesting about complexity for complexity's sake.
In the words of Terry Davis: https://youtu.be/k0qmkQGqpM8?si=larQzV0Ngdba6vQI
The language structure of APLs is far simpler than C. Basically, you learn a few dozen primitives and you're off to the races. That's part of what makes APLs appealing.
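To illustrate: summing the first ten squares is +/(⍳10)*2 in APL, three primitives composed (iota, power, plus-reduce). A C version of the same thing (my own sketch, just for contrast) spells out all the machinery:

    #include <stdio.h>

    int main(void) {
        /* APL: +/(⍳10)*2  -- generate 1..10, square each, sum */
        int sum = 0;
        for (int i = 1; i <= 10; i++)
            sum += i * i;
        printf("%d\n", sum);  /* 385 */
        return 0;
    }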
But then the scope grows as the code evolves, and suddenly you've got 200 lines with a bunch of single-letter variable names. I'm not a sadist, so I rename them before submitting the PR, but there's definitely a flow state where I've been living in the code for too long, and what makes total sense to me looks like line noise to others, or even to future me. (I was real good at Perl, back in the day.)
Point being, Arthur Whitney writes like this, even if you and I can't comprehend it, and yes it's obtuse. I wouldn't even want to work with myself if I wrote code like this, but I'm not as smart as Arthur Whitney.
As you say though, simplicity is a virtue. This is simpler for Arthur Whitney, even if it's more complicated for the rest of us.
I've written code with single-letter variable names lots of times! But later on it most certainly does not make sense to me.
Is single-letter programming like a religion, and I've offended the zealots with my comment?
Whitney is famous for writing code like this, it's been his coding style for decades.
For example, he wrote an early J interpreter this way in 1989. There's also a buddy allocator he wrote at Morgan Stanley that's only about 10 lines of C code.
https://code.jsoftware.com/wiki/Essays/Incunabulum
https://github.com/tavmem/buddy/blob/master/a/b.c
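Whitney's allocator is famously dense, so for contrast, here's a conventional textbook-style sketch of the same buddy idea (my own version, not his code; all the names are mine): one free list per power-of-two order, split on allocation, coalesce with the XOR-buddy on free.

    #include <stdio.h>
    #include <string.h>

    #define MAX_ORDER 10  /* pool is 2^10 = 1024 bytes */

    static unsigned char pool[1 << MAX_ORDER];
    static void *free_list[MAX_ORDER + 1];  /* one singly linked free list per order */

    /* a free block stores its "next" pointer in its own first bytes,
       so blocks must be at least sizeof(void *) */
    static void push(int k, void *p) { *(void **)p = free_list[k]; free_list[k] = p; }
    static void *pop(int k) { void *p = free_list[k]; if (p) free_list[k] = *(void **)p; return p; }

    /* remove a specific block from free list k; returns 1 if it was there */
    static int unlink_block(int k, void *p) {
        void **cur = &free_list[k];
        while (*cur) {
            if (*cur == p) { *cur = *(void **)p; return 1; }
            cur = (void **)*cur;
        }
        return 0;
    }

    void buddy_init(void) { memset(free_list, 0, sizeof free_list); push(MAX_ORDER, pool); }

    void *buddy_alloc(int order) {
        int k = order;
        while (k <= MAX_ORDER && !free_list[k]) k++;  /* smallest big-enough block */
        if (k > MAX_ORDER) return NULL;
        void *p = pop(k);
        while (k > order) {                           /* split, keeping the lower half */
            k--;
            push(k, (unsigned char *)p + (1 << k));
        }
        return p;
    }

    void buddy_free(void *p, int order) {
        while (order < MAX_ORDER) {
            size_t off = (unsigned char *)p - pool;
            void *buddy = pool + (off ^ ((size_t)1 << order));  /* flip the order bit */
            if (!unlink_block(order, buddy)) break;   /* buddy still in use: stop */
            if ((unsigned char *)buddy < (unsigned char *)p) p = buddy;
            order++;                                  /* merged block, one order up */
        }
        push(order, p);
    }

    int main(void) {
        buddy_init();
        void *a = buddy_alloc(5), *b = buddy_alloc(5); /* two 32-byte buddies */
        buddy_free(a, 5);
        buddy_free(b, 5);                              /* coalesces back to the full pool */
        printf("%p %p\n", a, b);
        return 0;
    }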
When writing code in this manner, lines of code is kind of a meaningless metric. You could put the entire Linux kernel in one very long line of C.
It was meant just as an indicator of the code density. The actual line lengths are ~80 chars.
You do actually spend hours understanding what's going on. Basically, the idea is that your implementation will be so short and condensed that you'll just rewrite it from scratch to fix bugs.
Using a debugger is basically out of the question. The theory is that the conciseness and lack of fluff makes it easier to reason about once you’ve mustered the requisite focus.
Try J, APL, K, BQN, or April, and be prepared to rethink how you implement solutions to problems you've tackled in other languages. I am an array-language user and fan. I have been playing with April, and I use J regularly at home and sometimes for work when I can.
From the April github site: "April compiles a subset of the APL programming language into Common Lisp. Leveraging Lisp's powerful macros and numeric processing faculties, it brings APL's expressive potential to bear for Lisp developers. Replace hundreds of lines of number-crunching code with a single line of APL."
The environment is very interactive. It also helps that you can fit so much more on screen, quite possibly the entire program.
Does writing in this way have any practical advantage (not aesthetic reasons)?
The only possible advantage I can think of is that you can fit more code on one screen so I guess in theory you can see your context more easily. But that seems pretty minor compared to... well, look at it!
I read a couple of other threads and some people try to claim less code = fewer bugs, but that's pretty clearly nonsense otherwise minifiers would magically fix bugs.
As for why people actually use this (it seems like they really do), my guess would be that it's used for write-only code, similar to regexes.
Like, using a regex you can't deny that you get a lot of power from not many keystrokes, and that's really awesome for stuff you're never going to read again, like searching code in your editor or throwaway shell commands.
But they also tend to be completely unreadable and error prone and are best avoided in production code.
Regular expressions can be constructed using various syntaxes. Some are optimized for writing them out quickly, and some are not. Choose the latter when going to production, and you'll be fine.
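For instance (a made-up email-ish pattern, purely for illustration), in C you can use adjacent string-literal concatenation to keep the production form readable, one labeled piece per line:

    #include <regex.h>
    #include <stdio.h>

    int main(void) {
        /* terse, write-optimized form:
           "^[A-Za-z0-9.+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+$"   */
        const char *readable =
            "^[A-Za-z0-9.+-]+"       /* local part */
            "@"                      /* separator */
            "[A-Za-z0-9-]+"          /* first domain label */
            "(\\.[A-Za-z0-9-]+)+$";  /* remaining dot-separated labels */

        regex_t re;
        regcomp(&re, readable, REG_EXTENDED | REG_NOSUB);
        puts(regexec(&re, "user@example.com", 0, NULL, 0) ? "no match" : "match");
        regfree(&re);
        return 0;
    }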
As for K/J/APL, it's similar. You can write incredibly terse code which works wonderfully in a REPL (console); working interactively is the default mode of operation for those languages. If you've ever worked with NumPy/Pandas or similar in a REPL, you've likely concocted similarly terse monstrosities. Check your %hist the next time you have a chance. Going to production, you naturally rewrite the code instead of just committing the first one-liner that produces the desired results.
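A C-flavored caricature of that rewrite step (my own example, nothing to do with the linked code): the throwaway one-liner gets a name and a guard before it's committed.

    #include <stdio.h>

    /* REPL-ish one-liner, fine while exploring:
       double m=0; for(int i=0;i<n;i++) m+=x[i]; m/=n;   */

    /* production rewrite: named intent, guarded edge case */
    static double mean(const double *x, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += x[i];
        return n > 0 ? sum / n : 0.0;
    }

    int main(void) {
        double x[] = {1.0, 2.0, 4.0};
        printf("%g\n", mean(x, 3));  /* 2.33333 */
        return 0;
    }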
And Pandas was inspired by J, per a quote in an article I once read about Wes McKinney, but I can't seem to find it online any longer.
Yeah exactly my point. But it seems like some people are using dense K/J/APL code in production which is just mad.
I suppose in our new world of LLMs, using as few tokens as possible means you can cram more in a small context window, which could be helpful in various ways.
Maybe someone could produce an LLM which takes a line of vector language code and expands it into the dozen(s) of lines of equivalent pseudo-algol?
(I mean, you could skip the whole hallucination thing and write an exact converter, but that'd be a lot of effort for code that'd probably get used about as much as M-expression to S-expression converters do in the lisp world?)
There are a few articles out there on using J or APL for ANNs and CNNs. Here's one:
It's not that any method of reducing code length fixes bugs; it just happens that optimizing code to be read and worked on by domain experts leads one toward patterns that secondarily end up manifesting as terse expressions. The terseness is certainly shocking if you're not accustomed to it, but it's really not a terminal goal, just a necessary outcome.
The disbelief on first encounter is totally reasonable, but from personal experience, once you've gotten past that and invested the time to really grok the array-language paradigms, code in this Whitney style actually ends up feeling more readable than whatever features our current SE culture deems Good and Proper.
There are a whole lot of moving parts to the why and how of these ergonomics, so I don't expect to be able to convince anyone in a simple, short comment, but if you're at all interested and able to suspend disbelief a little bit, it's worth watching some of Aaron Hsu's talks to get more of a taste.
From personal experience with regex, yes the first encounter was shocking, and sure I got past it and now can easily write regexes. But they still aren't more readable than the alternatives. Nor would I encourage their use in production.
I've heard it said that it becomes easy to spot common patterns and structures in this style. Without knowing the idioms it's difficult to read but apparently once you know the idioms, you can scan it and understand it well.
It is for APL/k/j; you see patterns popping up in very dense code that does a lot, which really does make it readable once you're used to it. But most people never get to that point (or ever even look at APL, of course).
Imaginary numbers are one once-common pattern we need to see pop up in programming languages again; J and APL have them, for sure.
That's the "Whitney style". See: https://code.jsoftware.com/wiki/Essays/Incunabulum
It's writing C in array-language style rather than intentional obfuscation.
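To give a flavor of it, the Incunabulum opens with single-letter typedefs and a loop macro (quoted from the essay linked above), so a whole loop reads as one expression:

    #include <stdio.h>

    typedef char C;
    typedef long I;
    /* the Incunabulum's loop macro: i and _n are implicitly in scope inside x */
    #define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}

    int main(void) {
        I d[] = {3, 1, 4, 1, 5}, s = 0;
        DO(5, s += d[i])     /* expands to {I i=0,_n=(5);for(;i<_n;++i){s += d[i];}} */
        printf("%ld\n", s);  /* 14 */
        return 0;
    }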
Thanks. I'm now reading this, where people are trying to explain what happened in the ref/ directory:
https://github.com/kparc/ksimple/blob/main/a.c
Thanks for that link. The comments there help a lot. If I understand them, this is a minimal implementation of K with a lot of limitations, such as:
"the only supported atom/vector type is 8bit integer, so beware of overflows"
Still, it's fascinating how an interpreter can be written with such a small amount of code.
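It helps that the grammar is tiny: K evaluates right to left with no operator precedence, so the parser and evaluator can be one short recursive function. A toy version of that core (my own sketch, not ksimple's actual code), using 8-bit values like the quote above describes:

    #include <stdio.h>

    /* toy K-flavored evaluator: 8-bit ints, + - *, parens, monadic -,
       right-to-left with no precedence, no error handling */
    typedef signed char V;
    static const char *s;  /* cursor into the source string */

    static V expr(void);

    static V noun(void) {  /* literal, (expr), or monadic minus */
        while (*s == ' ') s++;
        if (*s == '(') { s++; V v = expr(); s++; return v; }  /* skips ')' */
        if (*s == '-') { s++; return (V)(-noun()); }
        V v = 0;
        while (*s >= '0' && *s <= '9')
            v = (V)(v * 10 + (*s++ - '0'));  /* wraps at 8 bits, as ksimple warns */
        return v;
    }

    static V expr(void) {  /* noun [op expr]: the recursion gives right-to-left order */
        V l = noun();
        while (*s == ' ') s++;
        char op = *s;
        if (op != '+' && op != '-' && op != '*') return l;
        s++;
        V r = expr();
        return (V)(op == '+' ? l + r : op == '-' ? l - r : l * r);
    }

    int main(void) {
        s = "2*3+4";             /* right to left: 2*(3+4) */
        printf("%d\n", expr());  /* 14 */
        return 0;
    }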
An interpreter for BLC (binary lambda calculus), including tokenizing, parsing, and evaluation, can be written in as few as 29 bytes of BLC (and 650 bytes of C).
John, do tell more about BLC please!
There's more at http://tromp.github.io/cl/cl.html including the LispNYC talk I gave last year.
Thank you!
Personally, I believe in a sort of "evolution" of software that operates independently of the intentions of the programmers.
I can totally believe that he didn't intentionally obfuscate it, but its incomprehensibility made it harder for other people to make a knockoff, and that's why it survived and became successful.
That's just how Arthur Whitney writes code.