A blast from the past! I used to work in particle physics and used ROOT a lot. I had a love/hate relationship with it. On the one hand, it had a lot of technical debt and idiosyncrasies. On the other hand, a bunch of things are easier in ROOT than in more "modern" options like matplotlib. For example, anything that has to do with histograms. Or highly structured data (where your 'columns' contain objects with fields). Or just plotting functions (without having to allocate arrays for the x and y values). I also like the very straightforward object-oriented API. It feels like old-school C++ or Java, as opposed to pandas/matplotlib, which have a lot of method chaining, abuse of [] syntax and other magic. It is not elegant, and it is quite verbose, but that is probably a good thing when doing a scientific analysis.
I left about 5 years ago, when ROOT was in a process of change. They had already ripped out the old CINT interpreter and moved to a clang-based codebase, and now you can run your analyses in Jupyter as far as I know (in C++ or Python). I heard the code quality has improved a lot, too.
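To make the histogram point concrete, here is a minimal sketch in plain Python of the fill-as-you-go style. The Hist1D class and its method names are made up for illustration, loosely modeled on how ROOT's TH1 is filled one value at a time; this is not ROOT's API. The point it tries to show is that memory use depends on the number of bins, not on the number of entries.

```python
import random


class Hist1D:
    """Fixed-bin histogram filled one value at a time (hypothetical class)."""

    def __init__(self, nbins, lo, hi):
        if nbins <= 0 or hi <= lo:
            raise ValueError("need nbins > 0 and hi > lo")
        self.nbins, self.lo, self.hi = nbins, lo, hi
        self.counts = [0] * nbins  # one counter per bin, never grows
        self.underflow = 0         # values below lo
        self.overflow = 0          # values at or above hi

    def fill(self, x):
        """Increment the bin containing x; the value itself is not stored."""
        if x < self.lo:
            self.underflow += 1
        elif x >= self.hi:
            self.overflow += 1
        else:
            i = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
            # Guard against floating-point rounding at the upper edge.
            self.counts[min(i, self.nbins - 1)] += 1

    def entries(self):
        """Total number of fill() calls, including under/overflow."""
        return sum(self.counts) + self.underflow + self.overflow


# Stream 100,000 Gaussian values through the histogram without keeping them.
rng = random.Random(42)
h = Hist1D(nbins=20, lo=-4.0, hi=4.0)
for _ in range(100_000):
    h.fill(rng.gauss(0.0, 1.0))
```

After the loop, h.counts holds 20 integers regardless of how many values were filled, which is the property that makes histograms practical for very large samples.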
I wonder if Haskell would also be a good fit for writing something like this.
No.
This is a technical community. You really have to do better than a one word dismissal without any reasoning.
In other words, why do you think it’s not a good fit?
I think the response gets right to the point!
Using something like Haskell for ROOT is ridiculous for a lot of obvious reasons. A simple and dismissive "no" invites the cautious reader to discover them on their own rather than waste time engaging in a protracted debate. Maybe it's better to reject the idea out of hand and spend our time elsewhere.
That’s just not how technical discussions work. Not everyone knows what you know, and the point of this community is to share knowledge, not gatekeep it behind some “discovering it yourself” bullshit. The fastest thing to do is not to dismiss it with no explanation but rather to explain for all the readers why that is the case. Because if one person doesn’t know, I can guarantee that there are plenty out there who are just as interested to know. And it’s a waste of everyone’s time to have each person independently come to the same conclusion when it’s apparently easily explainable.
You’re free to not do any of that, of course, but be prepared to defend the fact that you’d prefer not engaging in discussion and instead just shallowly dismiss something.
There are a number of reasons for this. The first is that the quant physics community has never really adopted functional programming. It's not particularly obvious to scientists, who typically want to express their computation the way they want to, something that C, C++, and Fortran have long been established for. The second is that much of physics depends on old libraries written over the last 30-40 years, and it's easiest to use them from the language the library is written in, or one that has a highly similar interface (for example, Python is similar enough to C++ that many foreign function interfaces are literally just direct wrappers). The third is that types (other than simple scalars, arrays, and trees/graphs) have never been a high priority in quant physics. The fourth is that undergrad education outside CS rarely teaches students Haskell, while most undergrads in a quant field graduate knowing some amount of Python.
It's much more likely the physics community would adopt Julia, or maybe Rust, and even that has been pretty slow.
(Nothing I said above should be construed as taking a position on the suitability of any specific language, or lack thereof, for doing scientific computing. I have opinions, but I am attempting to explain the reasons factually with a minimum of bias.)
Could it though?
Haskell would be great for designing the interface of a library like this, but not for implementing it. It would definitely not look like "old-school C++ or Java" but, well, that's the whole point :P
I haven't used ROOT so I don't know how well it would work to write bindings for it in Haskell; it can be hard to provide a good interface to an implementation that was designed for a totally different style of use. Possible, just difficult.
I think having Haskell bindings to it would be quite valuable. For the implementation of core structures, though, it's better to stick to C++ to max out on performance and have finer control over resource usage. Haskell isn't particularly good at that.
EDIT: there's one at https://hackage.haskell.org/package/HROOT
The best thing about ROOT was how it handled data loading. TTrees, with their column-based slicing on disk, are such a good idea. Ever since I graduated and moved into industry, I've been looking for something that works the same way.
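For anyone who hasn't seen column-based slicing on disk, the core idea can be sketched with nothing but the Python standard library. This is a toy, not how TTree is implemented: the one-file-per-column layout, the ".f64" suffix, and the write_columns/read_column helpers are all assumptions made up for this example. Real formats add compression, chunking and metadata on top.

```python
import os
import tempfile
from array import array


def write_columns(directory, columns):
    """Store each column as its own binary file of float64 values."""
    for name, values in columns.items():
        with open(os.path.join(directory, name + ".f64"), "wb") as f:
            array("d", values).tofile(f)


def read_column(directory, name, start=0, stop=None):
    """Read rows [start, stop) of one column without touching the others."""
    path = os.path.join(directory, name + ".f64")
    itemsize = array("d").itemsize
    nrows = os.path.getsize(path) // itemsize
    stop = nrows if stop is None else min(stop, nrows)
    out = array("d")
    with open(path, "rb") as f:
        f.seek(start * itemsize)  # jump straight to the first wanted row
        out.fromfile(f, max(stop - start, 0))
    return out


# A small fake "event" table with three columns (names are illustrative).
tmpdir = tempfile.mkdtemp()
events = {
    "pt": [float(i) for i in range(1000)],
    "eta": [i * 0.001 for i in range(1000)],
    "phi": [i * 0.002 for i in range(1000)],
}
write_columns(tmpdir, events)

# Only the bytes for rows 10..14 of "pt" are read; "eta" and "phi" stay on disk.
pt_slice = read_column(tmpdir, "pt", start=10, stop=15)
```

An analysis that only needs one or two fields out of hundreds reads a small fraction of the file, which is why this layout matters so much for wide event records.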
Apache Arrow and Parquet both work this way. Even HDF5 in column mode isn't completely bad.
TTree is succeeded by RNTuple, which is basically CERN's take on Apache Arrow; they're incredibly similar.
Is this a kind of lazy loading?
I was hosting one of the leads of ROOT at Google and we got to talking about ROOT. I mentioned sstables and columnio and he said "oh, yeah, we've been doing that for years".
Honestly, now with ChatGPT, matplotlib's terrible API is less of a problem.
This is a great example of why the age of truly terrible software is going to be ushered in as LLMs get better.
When the cost of complexity of interacting with an API is paid by the LLM, optimizing this particular part of software design (also one of the hardest to get right) will be less fashionable.
That's true, but still, there are things you just can't do in matplotlib that you can do better in other GPT-aware packages like plotly.
We all have a love/hate relationship with it. It’s a bit like Stockholm syndrome.
Because matplotlib is not so histogram-focused (I guess because the kids these days have plenty of RAM), people always show these abominable scatter plots that have so many points on top of each other that they're useless. Yuck.
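A rough sketch of the alternative, in plain Python so it runs anywhere: bin the points into a 2D grid and show the density instead of drawing every point. The hist2d and render helpers, the grid size, and the character ramp are all invented for this illustration; a real analysis would use a plotting library's 2D histogram instead.

```python
import random


def hist2d(points, nx, ny, xlo, xhi, ylo, yhi):
    """Count points per cell of an nx-by-ny grid; out-of-range points are dropped."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if xlo <= x < xhi and ylo <= y < yhi:
            ix = int((x - xlo) / (xhi - xlo) * nx)
            iy = int((y - ylo) / (yhi - ylo) * ny)
            # min() guards against floating-point rounding at the upper edges.
            counts[min(iy, ny - 1)][min(ix, nx - 1)] += 1
    return counts


def render(counts, shades=" .:-=+*#%@"):
    """Turn the grid into text, darker characters meaning more points."""
    peak = max(max(row) for row in counts) or 1
    lines = []
    for row in reversed(counts):  # put the largest y at the top
        lines.append("".join(shades[c * (len(shades) - 1) // peak] for c in row))
    return "\n".join(lines)


# 200,000 correlated-free Gaussian points, generated lazily and never stored.
rng = random.Random(0)
points = ((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200_000))
density = hist2d(points, nx=40, ny=20, xlo=-4.0, xhi=4.0, ylo=-4.0, yhi=4.0)
picture = render(density)
print(picture)
```

Where a scatter plot of this sample would be a solid blob, the binned version still shows where the points concentrate, and it needs only 800 counters no matter how many points go in.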