The code looks heavily obfuscated. It's more like "source available" than open source. E.g.
g(_M,W-=1<<f;xx=M[f];M[f]=x)Ui(M_,Ux=M[i];x?(W+=1<<i,M[i]=xx,x):30>i?_M(i,M_(i+1))+(2*n0<<i):OO())f(_r,pt?x:65535&rx?(--rx,x):e(if(!Tx)n(_r(xU))_M(mx,x-n0);x))f(r_,pt?x:65535&++rx?x:OO())
Edit: Looking at it a bit more, I can't tell if the code is obfuscated or if the author really wrote it like this...
You may not believe it, but that's how K/Q/J people write C code.
Bonus: go visit that website and do "View Source". Even the HTML has the fragrance of K.
I don't understand how. How do you debug something like this? How do you go about fixing a bug? There are no docs and no tests. It seems like one would spend hours just trying to understand what's going on. Is there a good source on their methodology? Because my mind is blown.
I recommend checking out more of the APL language family and the history of the notation; highly interesting. Almost like a parallel universe of computing, once you look past the syntax.
Yes, but C isn't APL. I don't buy that this is how it was written from day one. Occam's razor and all: this is obfuscated C, not code written by an alien superintelligence.
Haven't you ever written code with single-letter variable names that made sense to you, and then been forced to read somebody else's code with single-character variable names and found it completely inscrutable? This is just that on (a lot of) steroids.
No, I don't write entire C programs with single-letter variables, because there is no case where "c" is more readable than "cnt" or "count", with the usual exceptions of "for (int i", and x and y inside small scopes.
If I were paid by the hour to write C, then I'd use single-letter variables too, but I'm too lazy to do twice the work when I can make my life simpler.
Simplicity is a virtue, there is nothing interesting about complexity for complexity's sake.
In the words of Terry Davis: https://youtu.be/k0qmkQGqpM8?si=larQzV0Ngdba6vQI
The language structure of APLs is far simpler than C. Basically, you learn a few dozen primitives and you're off to the races. That's part of what makes APLs appealing.
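To illustrate: summing the first ten squares is +/(⍳10)*2 in APL, three primitives composed (iota, power, plus-reduce). A C version of the same thing (my own sketch, just for contrast) spells out all the machinery:

    #include <stdio.h>

    int main(void) {
        /* APL: +/(⍳10)*2  -- generate 1..10, square each, sum */
        int sum = 0;
        for (int i = 1; i <= 10; i++)
            sum += i * i;
        printf("%d\n", sum);  /* 385 */
        return 0;
    }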
But then the scope grows as the code evolves, and suddenly you've got 200 lines with a bunch of single-letter variable names. I'm not a sadist, so I rename them before submitting the PR, but there's definitely a flow state where I've been living in the code for too long, and what makes total sense to me looks like line noise to others, or even to future me. (I was real good at Perl, back in the day.)
Point being, Arthur Whitney writes like this, even if you and I can't comprehend it, and yes it's obtuse. I wouldn't even want to work with myself if I wrote code like this, but I'm not as smart as Arthur Whitney.
As you say though, simplicity is a virtue. This is simpler for Arthur Whitney, even if it's more complicated for the rest of us.
I've written code with single-letter variable names lots of times! But later on it most certainly does not make sense to me.
Is single-letter programming like a religion, and I've offended the zealots with my comment?
Whitney is famous for writing code like this, it's been his coding style for decades.
For example, he wrote an early J interpreter this way in 1989. There's also a buddy allocator he wrote at Morgan Stanley that's only about 10 lines of C code.
https://code.jsoftware.com/wiki/Essays/Incunabulum
https://github.com/tavmem/buddy/blob/master/a/b.c
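Whitney's allocator is famously dense, so for contrast, here's a conventional textbook-style sketch of the same buddy idea (my own version, not his code; all the names are mine): one free list per power-of-two order, split on allocation, coalesce with the XOR-buddy on free.

    #include <stdio.h>
    #include <string.h>

    #define MAX_ORDER 10  /* pool is 2^10 = 1024 bytes */

    static unsigned char pool[1 << MAX_ORDER];
    static void *free_list[MAX_ORDER + 1];  /* one singly linked free list per order */

    /* a free block stores its "next" pointer in its own first bytes,
       so blocks must be at least sizeof(void *) */
    static void push(int k, void *p) { *(void **)p = free_list[k]; free_list[k] = p; }
    static void *pop(int k) { void *p = free_list[k]; if (p) free_list[k] = *(void **)p; return p; }

    /* remove a specific block from free list k; returns 1 if it was there */
    static int unlink_block(int k, void *p) {
        void **cur = &free_list[k];
        while (*cur) {
            if (*cur == p) { *cur = *(void **)p; return 1; }
            cur = (void **)*cur;
        }
        return 0;
    }

    void buddy_init(void) { memset(free_list, 0, sizeof free_list); push(MAX_ORDER, pool); }

    void *buddy_alloc(int order) {
        int k = order;
        while (k <= MAX_ORDER && !free_list[k]) k++;  /* smallest big-enough block */
        if (k > MAX_ORDER) return NULL;
        void *p = pop(k);
        while (k > order) {                           /* split, keeping the lower half */
            k--;
            push(k, (unsigned char *)p + (1 << k));
        }
        return p;
    }

    void buddy_free(void *p, int order) {
        while (order < MAX_ORDER) {
            size_t off = (unsigned char *)p - pool;
            void *buddy = pool + (off ^ ((size_t)1 << order));  /* flip the order bit */
            if (!unlink_block(order, buddy)) break;   /* buddy still in use: stop */
            if ((unsigned char *)buddy < (unsigned char *)p) p = buddy;
            order++;                                  /* merged block, one order up */
        }
        push(order, p);
    }

    int main(void) {
        buddy_init();
        void *a = buddy_alloc(5), *b = buddy_alloc(5); /* two 32-byte buddies */
        buddy_free(a, 5);
        buddy_free(b, 5);                              /* coalesces back to the full pool */
        printf("%p %p\n", a, b);
        return 0;
    }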
When writing code in this manner, lines of code is kind of a meaningless metric. You could put the entire Linux kernel in one very long line of C.
It was meant just as an indicator of the code density. The actual line lengths are ~80 chars.
You do actually spend hours understanding what's going on. Basically, the idea is that your implementation will be so short and condensed that you'll just rewrite it from scratch to fix bugs.
Using a debugger is basically out of the question. The theory is that the conciseness and lack of fluff makes it easier to reason about once you’ve mustered the requisite focus.
Try J, APL, K, BQN, or April, and be prepared to rethink how you implement solutions to problems you've tackled in other languages. I am an array-language user and fan. I have been playing with April, and I use J regularly at home and sometimes for work when I can.
From the April github site: "April compiles a subset of the APL programming language into Common Lisp. Leveraging Lisp's powerful macros and numeric processing faculties, it brings APL's expressive potential to bear for Lisp developers. Replace hundreds of lines of number-crunching code with a single line of APL."
The environment is very interactive. It also helps that you can fit so much more on screen, quite possibly the entire program.
Does writing in this way have any practical advantage (not aesthetic reasons)?
The only possible advantage I can think of is that you can fit more code on one screen so I guess in theory you can see your context more easily. But that seems pretty minor compared to... well, look at it!
I read a couple of other threads and some people try to claim less code = fewer bugs, but that's pretty clearly nonsense otherwise minifiers would magically fix bugs.
As for why people actually use this (it seems like they really do), my guess would be that it's used for write-only code, similar to regexes.
Like, using a regex you can't deny that you get a lot of power from not many keystrokes, and that's really awesome for stuff you're never going to read again, like searching code in your editor or throwaway shell commands.
But they also tend to be completely unreadable and error prone and are best avoided in production code.
Regular expressions can be constructed using various syntaxes. Some are optimized for writing them out quickly, and some are not. Choose the latter when going to production, and you'll be fine.
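For instance (a made-up email-ish pattern, purely for illustration), in C you can use adjacent string-literal concatenation to keep the production form readable, one labeled piece per line:

    #include <regex.h>
    #include <stdio.h>

    int main(void) {
        /* terse, write-optimized form:
           "^[A-Za-z0-9.+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+$"   */
        const char *readable =
            "^[A-Za-z0-9.+-]+"       /* local part */
            "@"                      /* separator */
            "[A-Za-z0-9-]+"          /* first domain label */
            "(\\.[A-Za-z0-9-]+)+$";  /* remaining dot-separated labels */

        regex_t re;
        regcomp(&re, readable, REG_EXTENDED | REG_NOSUB);
        puts(regexec(&re, "user@example.com", 0, NULL, 0) ? "no match" : "match");
        regfree(&re);
        return 0;
    }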
As for K/J/APL, it's similar. You can write incredibly terse code which works wonderfully in a REPL (console); working interactively is the default mode of operation for those languages. If you've ever worked with NumPy/Pandas or similar in a REPL, you've likely concocted similarly terse monstrosities. Check your %hist the next time you have a chance. Going to production, you naturally rewrite the code instead of just committing the first one-liner that produces the desired results.
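A C-flavored caricature of that rewrite step (my own example, nothing to do with the linked code): the throwaway one-liner gets a name and a guard before it's committed.

    #include <stdio.h>

    /* REPL-ish one-liner, fine while exploring:
       double m=0; for(int i=0;i<n;i++) m+=x[i]; m/=n;   */

    /* production rewrite: named intent, guarded edge case */
    static double mean(const double *x, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += x[i];
        return n > 0 ? sum / n : 0.0;
    }

    int main(void) {
        double x[] = {1.0, 2.0, 4.0};
        printf("%g\n", mean(x, 3));  /* 2.33333 */
        return 0;
    }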
And Pandas was inspired by J, per a quote in an article I once read about Wes McKinney, but I can't seem to find it online any longer.
Yeah exactly my point. But it seems like some people are using dense K/J/APL code in production which is just mad.
I suppose in our new world of LLMs, using as few tokens as possible means you can cram more in a small context window, which could be helpful in various ways.
Maybe someone could produce an LLM which takes a line of vector language code and expands it into the dozen(s) of lines of equivalent pseudo-algol?
(I mean, you could skip the whole hallucination thing and write an exact converter, but that'd be a lot of effort for code that'd probably get used about as much as M-expression to S-expression converters do in the lisp world?)
There are a few articles out there on using J or APL for ANNs and CNNs. Here's one:
It's not that any method of reducing code length fixes bugs; it just happens that optimizing code to be read and worked on by domain experts leads one toward patterns that secondarily end up manifesting as terse expressions. The terseness is certainly shocking if you're not accustomed to it, but it's really not a terminal goal, just a necessary outcome.
The disbelief on first encounter is totally reasonable, but from personal experience, once you've gotten past that and invested the time to really grok the array-language paradigms, code in this Whitney style actually ends up feeling more readable than whatever features our current SE culture deems Good and Proper.
There are a whole lot of moving parts to the why and how of these ergonomics, so I don't expect to be able to convince anyone in a simple, short comment, but if you're at all interested and able to suspend disbelief a little bit, it's worth watching some of Aaron Hsu's talks to get more of a taste.
From personal experience with regex, yes the first encounter was shocking, and sure I got past it and now can easily write regexes. But they still aren't more readable than the alternatives. Nor would I encourage their use in production.
I've heard it said that it becomes easy to spot common patterns and structures in this style. Without knowing the idioms it's difficult to read but apparently once you know the idioms, you can scan it and understand it well.
It is for APL/k/j; you see patterns popping up in very dense code that does a lot, which really does make it readable once you're used to it. But most people never get to that point (or ever even look at APL, of course).
Imaginary numbers are one once-common pattern we need to see pop up in programming languages again; J and APL have them, for sure.
That's the "Whitney style". See: https://code.jsoftware.com/wiki/Essays/Incunabulum
It's writing C in array-language style rather than intentional obfuscation.
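To give a flavor of it, the Incunabulum opens with single-letter typedefs and a loop macro (quoted from the essay linked above), so a whole loop reads as one expression:

    #include <stdio.h>

    typedef char C;
    typedef long I;
    /* the Incunabulum's loop macro: i and _n are implicitly in scope inside x */
    #define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}

    int main(void) {
        I d[] = {3, 1, 4, 1, 5}, s = 0;
        DO(5, s += d[i])     /* expands to {I i=0,_n=(5);for(;i<_n;++i){s += d[i];}} */
        printf("%ld\n", s);  /* 14 */
        return 0;
    }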
Thanks. I'm now reading this, where people are trying to explain what happened in the ref/ directory:
https://github.com/kparc/ksimple/blob/main/a.c
Thanks for that link. The comments there help a lot. If I understand them, this is a minimal implementation of K with a lot of limitations, such as:
"the only supported atom/vector type is 8bit integer, so beware of overflows"
Still, it's fascinating how an interpreter can be written with such a small amount of code.
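It helps that the grammar is tiny: K evaluates right to left with no operator precedence, so the parser and evaluator can be one short recursive function. A toy version of that core (my own sketch, not ksimple's actual code), using 8-bit values like the quote above describes:

    #include <stdio.h>

    /* toy K-flavored evaluator: 8-bit ints, + - *, parens, monadic -,
       right-to-left with no precedence, no error handling */
    typedef signed char V;
    static const char *s;  /* cursor into the source string */

    static V expr(void);

    static V noun(void) {  /* literal, (expr), or monadic minus */
        while (*s == ' ') s++;
        if (*s == '(') { s++; V v = expr(); s++; return v; }  /* skips ')' */
        if (*s == '-') { s++; return (V)(-noun()); }
        V v = 0;
        while (*s >= '0' && *s <= '9')
            v = (V)(v * 10 + (*s++ - '0'));  /* wraps at 8 bits, as ksimple warns */
        return v;
    }

    static V expr(void) {  /* noun [op expr]: the recursion gives right-to-left order */
        V l = noun();
        while (*s == ' ') s++;
        char op = *s;
        if (op != '+' && op != '-' && op != '*') return l;
        s++;
        V r = expr();
        return (V)(op == '+' ? l + r : op == '-' ? l - r : l * r);
    }

    int main(void) {
        s = "2*3+4";             /* right to left: 2*(3+4) */
        printf("%d\n", expr());  /* 14 */
        return 0;
    }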
An interpreter for BLC (binary lambda calculus), including tokenizing, parsing, and evaluation, can be written in as few as 29 bytes of BLC (and 650 bytes of C).
John, do tell more about BLC please!
There's more at http://tromp.github.io/cl/cl.html including the LispNYC talk I gave last year.
Thank you!
Personally, I believe in a sort of "evolution" of software that operates independently of the intentions of the programmers.
I can totally believe that he didn't intentionally obfuscate it, but its incomprehensibility made it harder for other people to make a knockoff, and that's why it survived and became successful.
That's just how Arthur Whitney writes code.