SciPy builds for Python 3.12 on Windows are a minor miracle

wenc
68 replies
23h31m

What an amazing read. Now I know why my pip installs have been failing on 3.12, but we have a brighter future ahead.

Also, while I love Python, it's helpful to understand why Python packaging is a (manageable) mess. It's because of the non-standardization of build tools for C/C++/Fortran and the immensity of the ecosystem, nothing to do with Python itself. Part of it is irreducible complexity.

It’s a miracle it works at all.

cdavid
25 replies
22h57m

Yes, that's a fundamental reason Python packaging is a mess. Python's success is largely due to the availability of key mixed-language packages. No other mainstream language package manager has to deal with this.

For example, cargo for Rust, which is great, can assume it is packaging mostly Rust-only code. And while the language is compiled, it "owns" the compiler, which means building from source as a distribution strategy works. I don't know how/if cargo can deal with e.g. Fortran out of the box, but I doubt cargo on Windows would work well if top cargo packages required Fortran code.

The single biggest improvement for the Python ecosystem was the standardisation of a binary package format, the wheel. It is only then that the whole scientific Python ecosystem started to thrive on Windows. But binary compatibility is a huge PITA, especially across languages and CPUs.

winstonewert
12 replies
21h25m

Many rust crates actually do package code written in other languages. There are plenty of useful C/C++/Fortran libraries that nobody has rewritten into Rust but for which wrappers have been created that call into C. It works in Rust because the build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

Various factors still make the Rust and Python stories different (Python uses more mixed-language packages, the Rust demographic is more technically advanced, etc.). But a big one is that in Rust, the FFI is defined in Rust. Recompiling just the Rust code gives you an updated FFI compatible with your version of Rust. In Python, the FFI is typically defined in C, so recompiling Python won't get you a compatible FFI. If Python did all FFI through something like ctypes, it would be much smoother.
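For illustration, here's roughly what "the FFI is defined in Rust" looks like in a wrapper crate. This is only a sketch, and example_compress is a made-up C function:

  // The binding is ordinary Rust source, so `cargo build` recompiles it
  // together with the rest of the crate. Nothing on the C side encodes
  // details of the Rust toolchain.
  extern "C" {
      fn example_compress(input: *const u8, len: usize) -> i32;
  }

  pub fn compress(data: &[u8]) -> i32 {
      // SAFETY: assumes the C library only reads `len` bytes of `input`.
      unsafe { example_compress(data.as_ptr(), data.len()) }
  }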

glandium
6 replies
19h22m

It works in Rust because the build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

It's still its own mess. You'll find plenty of people having problems with openssl, for instance.

oefrha
4 replies
11h39m

I have the pleasure of maintaining a reasonably popular -sys crate, and getting it working on Windows is an absolute nightmare, especially considering the competing toolchains & ABIs (msvc and gnu) and millions of ways the C dependency can be installed (it’s a big one, and users may choose a specific version, so we don’t bundle it), with no pkg-config telling you where to look.
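(To make that concrete: the "where to look" part often ends up as hand-rolled probing in build.rs, along the lines of this sketch, where the FOO_DIR env var and the foo library name are made up:)

  // build.rs -- manual library discovery on Windows, where there is no
  // pkg-config. FOO_DIR and "foo" are invented names for this sketch.
  use std::env;

  fn main() {
      // Let users point the build at their install of the C dependency.
      if let Ok(dir) = env::var("FOO_DIR") {
          println!("cargo:rustc-link-search=native={}/lib", dir);
      }
      println!("cargo:rustc-link-lib=foo");
      // Re-run this script if the user moves the dependency.
      println!("cargo:rerun-if-env-changed=FOO_DIR");
  }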

No idea how build.rs being able to run any Rust code is an advantage; it's not like setup.py can't run any Python code and shell out. In fact, bespoke code capable of spitting out inscrutable errors during `setup.py install` is the old way that newer tooling tries to avoid. Rust evangelism is puzzling.

oblio
2 replies
9h18m

Wait a second, I need to understand this better.

If you cargo build, can that run a dependency's build, including trying to compile C and stuff?

bogeholm
0 replies
3h0m

I believe build.rs can do pretty much anything: https://doc.rust-lang.org/cargo/reference/build-scripts.html
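As a minimal sketch (file names invented): Cargo compiles and runs this program before building the crate itself, so it can invoke arbitrary tools. Real build scripts usually go through the cc helper crate instead of hard-coding a compiler:

  // build.rs -- runs before the crate is compiled; free to shell out.
  use std::process::Command;

  fn main() {
      let out_dir = std::env::var("OUT_DIR").unwrap();
      let status = Command::new("cc")
          .args(["-c", "src/helper.c", "-o"])
          .arg(format!("{}/helper.o", out_dir))
          .status()
          .expect("failed to spawn the C compiler");
      assert!(status.success(), "C compilation failed");
  }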

winstonewert
0 replies
5h39m

When I referred to build.rs, I merely meant that the build script made it possible to build code written in other languages, not that it solved all the problems. It very much doesn't solve all the problems involved.

wongarsu
0 replies
5h26m

Though you get partially saved by the "FFI is defined in Rust" part, with many of the hard-to-compile crates offering optional prebuilt binaries for the part that's not Rust.
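Sketched out, that option is a feature-gated branch in build.rs. The "prebuilt" feature name and the paths below are made up, and the fallback uses the cc helper crate mentioned further down in this thread:

  // build.rs -- because the FFI is declared in Rust, the non-Rust half
  // can be swapped for a shipped binary.
  fn main() {
      // Cargo sets CARGO_FEATURE_<NAME> for each enabled feature.
      if std::env::var("CARGO_FEATURE_PREBUILT").is_ok() {
          // Link a vendored, prebuilt copy of the C/Fortran library.
          println!("cargo:rustc-link-search=native=prebuilt/x86_64-pc-windows-msvc");
          println!("cargo:rustc-link-lib=static=foo");
      } else {
          // Otherwise build it from source; cc emits the link directives.
          cc::Build::new().file("vendor/foo.c").compile("foo");
      }
  }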

cesarb
3 replies
17h52m

It works in Rust because the build.rs lets libraries do whatever they want during the build process, including invoking compilers for other languages.

And also there's a helper library (the "gcc" crate) which does all the work of figuring out how to call the C or C++ compiler for the target platform, so that build.rs can be very small for simple cases. You don't have to do all the work yourself.
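For scale, a complete build.rs using that helper can be this small (the file name is made up):

  // build.rs -- the cc crate picks the right C/C++ compiler and flags
  // for the target platform and tells Cargo how to link the result.
  fn main() {
      cc::Build::new()
          .file("src/native/helper.c")
          .compile("helper");
  }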

LoganDark
2 replies
10h40m

Don't you mean the `cc` crate? `gcc` is its terribly-outdated predecessor.

cesarb
1 replies
8h19m

It's the same crate; it just changed its name at some point in its history. I still refer to it by its former name out of force of habit.

LoganDark
0 replies
2h51m

I know it's the same crate, but they changed its name. `gcc` is the predecessor of `cc` in terms of name. If you depend on `gcc` instead of `cc`, you won't have any of the new improvements.

sertbdfgbnfgsd
3 replies
11h12m

Yes, that's a fundamental reason Python packaging is a mess. Python's success is largely due to the availability of key mixed-language packages. No other mainstream language package manager has to deal with this.

Admittedly I'm not a Python expert, but Julia handles this just fine? It doesn't seem like it's a difficulty inherent to "mixed language packages". It appears to me that Python's approach is just bad somehow.

rightbyte
2 replies
10h54m

Then again, Julia seems to take 10s to start doing anything non-trivial each time, while modules are being compiled.

sertbdfgbnfgsd
1 replies
10h8m

THEN AGAIN, if you want to restart repeatedly, for whatever reason (??), maybe you should just compile the modules once...? See e.g. [1]

TLDR: There's a package called PrecompileTools.jl [2] which allows you to list: I want to compile and cache native code for the following methods on the following types. This isn't complicated stuff.

[1] https://julialang.org/blog/2023/04/julia-1.9-highlights/#cac...

[2] https://julialang.github.io/PrecompileTools.jl/stable/

markkitti
0 replies
7h25m

Julia 1.9's native precompilation is definitely helpful in that regard, but loading those native shared libraries (.so files on Linux) into Julia does take some time to verify.

If the main objective is to reduce time to load and time to first task, then PackageCompiler.jl [3] is still the ultimate way to do so.

Because Julia is a dynamic language, some complicated compilation issues such as invalidation and recompilation arise. Adding new methods or packages may result in already-compiled code no longer statically dispatching correctly, requiring invalidation and recompilation of that code.

It's slightly more complicated than what you stated. It's "I want to compile and cache native code for the following methods on the following types in this particular environment". PackageCompiler.jl can wrap the entire environment into a single native library, the system image.

[3] https://github.com/JuliaLang/PackageCompiler.jl

the__alchemist
2 replies
19h15m

As you might expect, compiling Rust crates that use C libraries can lead to the inscrutable-block-of-text linker errors we know and love. I have been having a rough time with CUDA, CMSIS-DSP, TensorFlow, and OpenCV during the past few weeks. One of them requires LLVM v15 to be installed; another requires v16+. A different one requires an old version of rustc to be installed. On another, when I posted on GitHub, the maintainers assured me the crate is, in fact, fine, and my system is misconfigured; phew!

oefrha
1 replies
10h55m

Lol, as a maintainer of a reasonably popular and complex -sys crate: our crate is "fine" on Windows, at least in theory, and I've heard of successes using it. However, I can't even port my own app depending on said -sys crate to Windows; there's always a wall of linker errors. If you report a Windows problem to me, I won't tell you it's fine; I just throw my hands in the air.

cshokie
0 replies
6h4m

Out of curiosity is it a “Windows is objectively difficult” problem, or a “Windows is not Linux and I know Linux best” problem?

I’ve only begun using it so my expertise is limited, but I think vcpkg aims to help with some of these difficulties by shipping code as source and then running make on dependencies so they are guaranteed ABI compatible because the same compiler builds everything.

NewJazz
2 replies
21h34m

Doesn't R handle Fortran and C++ packages?

azalemeth
1 replies
21h28m

It does -- and to a much lesser extent shims exist for other languages as well. R also has imperfect packaging but I think it's handled well albeit with a level of complexity that also goes up a lot unexpectedly at times. For a truly great experience, call python packages within R in a custom conda environment in order to get data out of pandas in a particularly unholy way...

disgruntledphd2
0 replies
8h16m

R has always shipped binary packages on Windows and Mac to avoid lots of the pain we see in Python.

Also, all packages must build with the latest version of R, or they are removed from CRAN. This makes the dep problems a lot less severe than we see with Python.

shoo
0 replies
13h16m

The single biggest improvement for the Python ecosystem was the standardisation of a binary package format, the wheel.

I agree. Some people love to complain about Python packaging. But from one perspective, it's arguably been a solved problem since wheels were introduced 10 years ago. The introduction of wheels was a massive step forward. Only depend on wheel archives; don't depend on packages that need to be built from source and drag in all manner of exciting compile-time dependencies.

If there's a package you want to depend on for your target platform, and the maintainers don't produce a prebuilt wheel archive for your platform -- well, set up a build server and build some wheels yourself, and host them somewhere, or pick a different platform.

jchw
0 replies
18h12m

I don't think this is really true. NodeJS has quite a similar problem.

IshKebab
21 replies
22h57m

It's also to do with Python itself.

wenc
20 replies
22h55m

How?

Kwpolska
11 replies
22h35m

The Python packaging world is full of barely compatible tools and no clear vision. Even if you're consuming packages, or packaging pure Python code, it's often an incomprehensible mess.

delfinom
5 replies
21h23m

Well, part of it is really Python's age and legacy. We are talking about going back to 1996. So much of Python's development was duct tape through history, in response to the changing world and the whims of the contributors.

I'm not saying it's an excuse, but it's just how it got to where it is. Newer languages have a lot of lessons learned to build upon to be decent from day 0.

ecshafer
3 replies
20h52m

Java and Ruby are similar in age to Python, and dependencies are a much better story there.

fbdab103
2 replies
18h30m

Ruby never had nearly the FFI/other language problem as Python so could almost entirely focus on Ruby-code delivery.

cesarb
1 replies
17h45m

The same for Java, since its ecosystem has an allergy to calling native (non-JVM) code, to the point of rewriting perfectly good libraries in Java. When they do call native code, it's often in horrible ways (like copying a native library from within the JAR file to a temporary directory, and loading it from there, the JAR file coming with pre-compiled native libraries for all possible operating system and architecture combinations). So the Java package managers mostly focus on Java (and other JVM languages) code building and delivery.

pjmlp
0 replies
14h17m

Maven and Gradle support building C and C++ libraries just fine.

IshKebab
0 replies
47m

While this may be true, they've also had literal decades to improve the situation and have barely got anywhere. In some ways they've gone backwards!

xapata
4 replies
21h49m

I haven't found it so. I've stuck with pip and adopted venv when it showed up, and haven't needed anything else. I use Docker for "pinned" builds.

eviks
3 replies
14h9m

venv and Docker are exactly the indicators of how bad Python is

globular-toast
2 replies
13h30m

Are screws bad because you need a screwdriver?

eviks
0 replies
12h22m

You're mistaking a hammer for a screwdriver

IshKebab
0 replies
49m

That's a completely nonsensical analogy. Maybe you missed his point, but well-designed programming language infrastructure does not need Docker or venv to work. The fact that you have to resort to the massive hack of Docker shows how bad the situation is.

I do not have to use Docker or a venv for my Rust, Go or Deno builds.

IshKebab
6 replies
22h25m

I'm assuming you're not a Python user otherwise you'd already know the many answers!

This link might give you a taste:

https://packaging.python.org/en/latest/key_projects/

selcuka
5 replies
22h15m

I am a Python user, but never heard of most of the tools in that list. This is probably because everyone and their cousin attempts to write yet another package manager for Python.

The built-in tools venv and pip (together with requirements.txt and constraints.txt) meet 99% of real-life dependency management needs.

starlevel003
3 replies
21h50m

The proliferation of requirements.txt is one of the key reasons why Python packaging sucks so much.

kelipso
2 replies
21h40m

Right, what we need is a requirements.yaml (better yet, create an entirely new markup language for this particular project) and another new package manager for it. One day (one day!) I will start a project without Python. One can hope.

starlevel003
1 replies
21h28m

Right, what we need is a requirements.yaml

It already exists; it's called pyproject.toml. It already existed for years in the form of setup.py. requirements.txt means that projects can't be automatically installed, which contributes massively to the difficulty of getting packages to work.

mkesper
0 replies
5h24m

pyproject.toml is a step in the right direction afaict, but man is it complicated: "Please note that some of these configurations are deprecated, obsolete or at least discouraged, but they are made available to ensure portability." Core vs. setuptools-specific, etc. See https://setuptools.pypa.io/en/latest/userguide/pyproject_con... and https://packaging.python.org/en/latest/specifications/declar...

gv83
0 replies
8h50m

Also web frameworks! Many web frameworks.

oconnor663
0 replies
20h58m

This pair of articles is excellent, both for good practical advice about installing Python packages, and for its general attitude about how to teach difficult things to large groups of people:

https://www.bitecode.dev/p/relieving-your-python-packaging-p...

https://www.bitecode.dev/p/why-not-tell-people-to-simply-use

Every single decision point or edge case represents permanent failure for hundreds of people and intense frustration for thousands. Of course, none of this is really to do with Python the language. It's more about the wide userbase, large set of packages and use cases, and overlapping generations of legacy tools. But most of it isn't C/C++/Fortran's fault either.

globalnode
7 replies
21h23m

yes it was very eye-opening. i often see people comparing their favourite package manager to python's and coming to the conclusion that python is terrible, but it's not! one thing i don't quite understand is why don't the python people just use a c/c++ math library instead of fortran?

patrick451
4 replies
21h9m

There often simply doesn't exist an equivalent library written in C/C++ (or any other language for that matter). The example I'm familiar with is SLICOT (Subroutine Library in Systems and Control Theory) [1], exposed in Python through Slycot [2]. It has routines for pole placement, Riccati solvers, various factorizations, MIMO zero computations, and a ton of other stuff. As far as I have been able to find, no C/C++/other-language library comes close to supplying either the breadth or depth of this library. Further, many of the SLICOT subroutines were written by the original inventors of their respective algorithms, which I view as a big bonus.

[1] http://slicot.org/ [2] https://github.com/python-control/Slycot

selimnairb
2 replies
20h54m

Right, lots of legacy code, plus lack of pointer aliasing in Fortran opens up more opportunities for optimization (or so I have read; this might have changed).

rightbyte
1 replies
10h50m

'restrict' has been a common extension in C since long ago and is now a proper keyword.

I guess it is a matter of taste.

baq
0 replies
7h1m

also a matter of 'I'm not going to rewrite lapack in C because a platform which was never the intended target doesn't have a free compiler'.

radarsat1
0 replies
6h11m

A lot of these kinds of routines could be translated to other languages but aren't because they are complex and often unmaintained and no one is really around that understands them well enough to port them to C or Rust or whatever.

There is also the issue that often they were published before adding a LICENSE file was a thing. I've found myself in the position of having to email professors at some random university to ask them if I can get permission to redistribute such a routine while packaging a library that depended on it. In one case I asked them if it would be possible to update their code with a license (it was just a zip file on netlib) and the answer was, "no, but you have my email". So I found myself having to write something like "distributed with permission from the author, private correspondence, pinky swear" in my copyright file. Some of this code is so old that the authors aren't around, and it would get "lost" in terms of being able to get permission to use it. I mean, it's a potential crisis, to be honest, if people really cared to check. (Until the copyrights expire, I guess, which is what, 70 years after the author's death or some such?)

Anyway, I wonder if a potential solution is to autotranslate some of these libraries using LLMs? Maybe AI will save us in the end. Of course you can't trust LLMs so you still need to understand the code well enough to verify any such translation.

ecshafer
0 replies
20h53m

IMO I would much rather write a math library in Fortran than C or C++. Fortran is quite a joy for doing numerical work. It sucks for most other things, though. Really the only thing nowadays is that you'll probably be using GPUs, which makes C++ better for CUDA integration.

dzdt
0 replies
19h14m

Because some of the best math libraries are written in Fortran. Seriously lots of heavy duty scientific code was written in Fortran in the 1970s and is still underpinning applications today. In many cases there is no equivalent alternative.

smabie
5 replies
21h26m

Nothing to do with Python? These FFI bindings exist because Python is slow as dirt.

zo1
2 replies
13h27m

And the bindings to python exist because those languages are a pain in the ass to work with.

I don't want to work with Fortran, C++, Cobol, etc. And I sure as hell don't want to figure out how to integrate such wildly different languages into my existing and modern ecosystem.

oblio
1 replies
9h22m

You're probably replying to the original comment from the wrong angle.

Ecosystems like Java, .NET, Golang, Rust, etc. do away with this entire problem by virtue of... not calling into C 99.99% of the time, because they're "fast enough".

baq
0 replies
7h3m

There's no right or wrong angle here. There's the useful or not useful angle.

Python was designed to call into C. It was always the solution to make Python fast: write the really slow parts in C and it might just turn out that will make the whole thing fast enough. Again: this is by design.

The languages and VMs you list were designed to be fast enough without calling into C. If you need that, great, use them.

People saying 'Python is slow' miss the point. It was never meant to be fast; it was always meant to be fast enough, without qualifiers like 'no C'. If it isn't fast enough or otherwise not useful, don't use it; you've got plenty of alternatives.

zzbn00
0 replies
11h21m

This is one of the reasons why Python got so popular.

It is too slow to reimplement big pieces of software in it, so people just used bindings to existing code. And productivity rocketed!

baq
0 replies
13h27m

These FFI bindings exist because Python was designed as a glue language for FFI bindings.

keithalewis
4 replies
20h32m

Indeed. The real problem is python seems to attract people with no training in software development. It is a mess on top of a mess.

dylan604
2 replies
19h20m

I thought that was what people said about php

oblio
0 replies
9h19m

And Basic, Visual Basic, HTML, Javascript, the list is probably endless :-)

josefx
0 replies
12h6m

A lot of the mockery PHP got was not because it attracted amateur developers; it was because the language itself was amateurishly implemented, and because of the resulting mess when that leaked into how it behaved. Things like function names in the standard library optimized for a strlen-based hash, a hand-rolled parser that made it impossible to even guess in which contexts which features would work, proactive conversion of strings into numbers ("0hello" == "0world"), ... There were entire communities dedicated to mocking not the people working with PHP but the language itself.

tomlockwood
0 replies
13h1m

That's also a feature - by design it has to be friendly to new users, and not an arcane art only accessible to the Chosen One, as much as those Chosen Ones would like to be the only programmers.

placebo
0 replies
15h16m

It’s a miracle it works at all.

I agree. In fact with what seems to be an exponential growth in complexity of software ecosystems, what's keeping it all from eventually getting to a "tower of Babel" catastrophe? Of course, this does not only apply to software, but it is a good example.

LispSporks22
41 replies
22h18m

I had thought that everyone was using WSL2 for this kind of thing and calling it a day. What are the reasons for even trying to build a native Windows version?

hurryer
32 replies
21h58m

That's like asking why Mac developers don't work in a Linux VM instead of wanting a native build.

Because working in a VM is inconvenient and has poor integration.

ActorNightly
26 replies
16h44m

WSL2 is both convenient and is extremely well integrated. Microsoft did a really good job with it and VSCode.

Also, in most cases for scientific computing, Mac users are using stuff like Remote SSH in VSCode to work directly on hardware that is running the code, which is pretty much always Linux.

Generally, it's kinda sad that software developers' manpower is being wasted on getting things running on Windows or Mac. It should be the other way around: have Microsoft or Apple dedicate manpower to porting things over and making both systems conform to the Linux standard.

hurryer
19 replies
15h55m

Let's try the same argument, the other way around: it's sad that Linux developers waste time developing the Linux desktop given that almost all usage of Linux is on the server. Linux developers should dedicate manpower to porting Linux desktop software to Windows/Mac which the vast majority of desktop users use. Now you see the problem with your argument?

ActorNightly
17 replies
14h26m

Windows/Mac which the vast majority of desktop users use.

Use for... what? Windows is primarily used by people who play games. Mac is used by people who want tech jewelry. Neither of which is related to development.

And it would make more sense to develop on a platform that runs the same kernel as the servers. This is the reason WSL2 with VSCode integration exists at all: Microsoft quickly realized that if they want to compete in the cloud with Azure, they have to be Linux-first.

pjmlp
15 replies
14h12m

And that is why so many developers buy Apple devices and code on macOS, and walk around with them at FOSDEM, because of the kernel.

ActorNightly
14 replies
13h22m

... no, because it's tech jewelry lol. That is literally the history of Apple.

Or are we still going to pretend that a computer that you don't own, because Apple tells you what software you can and can't install on it, is somehow better for development?

pjmlp
6 replies
13h15m

It is a UNIX desktop that actually works, and doesn't need endless hours researching for hardware compatibility or custom kernel parameters.

Android/Linux, ChromeOS/Linux, WebOS/Linux also work great for the same reasons.

Other than that, Linux is better left on servers and embedded devices as a headless UNIX clone, with cloud and hardware vendors taking the trouble to keep it running.

ActorNightly
5 replies
13h2m

doesn't need endless hours researching for hardware compatibility or custom kernel parameters.

Yeah, this is indicative of the fact that you don't really have ANY experience with modern Linux. You can take a well-supported laptop, install Linux Mint on it, and everything will just work, no tweaking required. Try it sometime before making the 10-year-old arguments that Mac users were making back in the day and apparently still do now.

Furthermore, I'd argue that as a developer, learning how to configure basic documented things should be something you know how to do, and fairly straightforward for you, just like installing the tooling you need for your development.

VaxWithSex
2 replies
11h45m

Would like to know which laptop that might be...

ActorNightly
1 replies
3h23m

Pretty much any ThinkPad or Dell works out of the box. You may have to change a setting here or there for some advanced things like adaptive charging, but most of the time the core OS works very well.

There are also Framework laptops and Librem laptops, which are Linux-first.

My current DD is an IdeaPad with Manjaro, which, even being somewhat more bleeding-edge than the Debian-based distros, has not only been flawless, but things like Nvidia Optimus work straight out of the box, with external displays.

Jenda_
0 replies
1h53m

Pretty much any ThinkPad or Dell works out of the box.

…except for AMD ones with AMD GPU :( https://www.wezm.net/v2/posts/2020/linux-amdgpu-pixel-format...

or Intel ones with a particular wifi card https://bugzilla.kernel.org/show_bug.cgi?id=203709

I however agree with you that at least for me, the Linux desktop on almost all laptops and desktops "just works". Especially when comparing with Macbooks - you have way less hardware choice there.

pjmlp
1 replies
10h45m

My dear, I do have plenty of experience with "modern" Linux, plenty of it:

https://www.idgshop.de/linuxwelt

As usual we get the answer,

"Have you ever tried distribution XYZ?"

The magical one that will sort out all problems, and then doesn't.

ActorNightly
0 replies
3h32m

I'm sorry, but no. I have set up probably over 100 Linux laptops at this point. It used to require more tweaks back in the early 2010s. Now you can get any Dell or ThinkPad and install Mint without issues.

You must be doing something wrong if you are having to tweak kernel parameters.

fragmede
6 replies
13h10m

Apple's locking down of macOS has been greatly exaggerated. You can install anything you want on a Mac; Apple doesn't stop you. The same can't be said for iOS or iPadOS, however.

ActorNightly
5 replies
13h1m

You can install anything you want on a Mac

Can I install Ubuntu?

fragmede
3 replies
12h46m

If you're on an Intel Mac, yes. If you're on Apple Silicon, yes, but it's not user-friendly (but hey, that's Linux for you).

ActorNightly
2 replies
12h19m

Intel Macs aren't relevant anymore. As for Apple Silicon, running an Ubuntu VM doesn't count, and you also need to use Parallels, which is not free. The Asahi Linux method is still very buggy because it's essentially a reverse-engineering of Apple's hardware. And it's not guaranteed to ever stop being buggy, because Apple. And Apple will 100% kill it if it ever gets too popular because it runs well, because they would lose the revenue streams they get with macOS.

So the definitive answer is no, Macs are still pretty much locked down.

I get that people like the battery life and the hardware of Macs, which is fine for personal use, but objectively, for a laptop that is going to be used for development, you get much more utility out of buying a non-Mac laptop of your choice in the form factor you want and installing Linux on it.

notpushkin
1 replies
2h34m

The Asahi Linux method is still very buggy

It's been getting way better lately. I don't daily drive it mainly because I've yet to move my stuff from the macOS partition.

ActorNightly
0 replies
28m

If it ever gets to the state modern Linux is in, then I will change my mind. It's very much like the Linux of the 2010s, where you had to configure a whole bunch of things to get it to work despite people swearing that it works.

pjmlp
0 replies
10h48m

If you want to install Ubuntu, buy from an OEM that is supported by Ubuntu.

https://ubuntu.com/certified

oblio
0 replies
9h5m

Enterprise software development says hello to your world.

Too
0 replies
15h15m

For one, Linux is OSS, so contributions made there are available to anyone, forever. I don't think volunteer developers are very keen on dedicating contributions to a business entity which, by the next major release, would have stolen their code and hidden it behind ads or some not-so-smart assistant that requires online connectivity all the time.

Linux is also the common ground. If you have to choose ONE system, you would choose Linux. Otherwise you are going to need BOTH Mac and Windows, possibly even more. Just getting the hardware to test that would be a major setback for a small oss contributor.

I don’t think you can consider scipy desktop software anyway. The IDE can by all means be, but just let it communicate to some deamon running in WSL, docker or remotely over a standard interface.

eviks
4 replies
14h18m

What's so well integrated about it, with atrocious IO speeds across the boundary?

mardifoufs
1 replies
10h29m

Wasn't that mostly on wsl1?

eviks
0 replies
8h23m

Vice versa: WSL2 is the one that's a true VM, and it truly suffers.

https://vxlabs.com/2019/12/06/wsl2-io-measurements/

baq
0 replies
6h46m

Not a problem if you don't cross them: my WSL2 disk image took more space than the Windows install. I basically used Windows as a web/email/chat client and a terminal to the real system, which was WSL2.

ActorNightly
0 replies
14h8m

I think you mistook this for reddit.

walleeee
0 replies
6h48m

Also, in most cases for scientific computing, Mac users are using stuff like Remote SSH in VSCode to work directly on hardware that is running the code, which is pretty much always Linux.

This might be true for extremely demanding tasks that need to run on a cluster or cloud but a surprisingly large amount of scientific computing is perfectly manageable (indeed much easier to manage) on a laptop, if there are compatible toolchains for the OS.

That said, I would not be dissatisfied with a world in which Linux was the OS of choice for nearly everyone. I fully agree it would be great if research software developers could focus on the domain; distribution takes a wildly disproportionate amount of effort.

satvikpendem
4 replies
20h19m

WSL2 is not really a VM though, in the traditional sense, as it has full hardware access, including the GPU. Technically, when you enable WSL2, the Windows build itself is running through the Hyper-V hypervisor, as is the Linux distribution. In fact, a few years ago, I used Proxmox to basically do the same thing, in order to test cross-desktop apps I was developing for Windows, Linux, and macOS (as a Hackintosh).

fbdab103
1 replies
18h26m

As someone stuck on a Windows corporate laptop, I can say that WSL2 is definitely not on the approved software list. Unless it is installed by default and comes with a big solid green check mark for the compliance/virus scanner/whatever security doo-dad of the day, it is a non-starter for many of us.

Sure, with enough begging and pleading, anything is possible, but that usually requires Conversations.

satvikpendem
0 replies
18h16m

Sure, corporate laptops are always a different story. Ideally it's nice to have cross compilation but that might not always be possible. I use a lot of Rust and I really like their OS compatibility out of the box.

xnyan
0 replies
17h15m

WSL2 is not really a VM though, in the traditional sense, as it has full hardware access

It is exactly a VM, and the WSL2 guest does not have full hardware access. The hypervisor can paravirtualize compatible GPUs, but for other hardware (such as USB) this is not possible. Hardware passthrough is also not possible in WSL2.

thrdbndndn
0 replies
16h55m

Technically, when you enable WSL2, the Windows build itself is running through the Hyper-V hypervisor

My (very limited) understanding of a hypervisor is that it, by definition, is used to run VMs.

So I don't get why that makes WSL2 not count as a VM, even "technically".

I assume you mean that once you enable the hypervisor, technically both Windows itself and WSL2 are VMs running in parallel?

PLenz
5 replies
22h17m

Way more researchers use Windows than you might think. Plus students.

spookie
4 replies
21h24m

Maybe I'm biased, being in the EU and all. But I've seen more Linux than Windows in those environments.

izacus
1 replies
12h49m

What you've "seen" has nothing to do with what's actually popular. Why does the concept of observer bias have to be explained on this site?

spookie
0 replies
7h40m

My response was based on my own experience. Hence the remark.

What I've observed was also dictated by my own choices.

pjmlp
0 replies
14h9m

I can assert that a large majority of people at CERN leave GNU/Linux for servers and use Windows/macOS on their desktops.

It was like that 20 years ago, and it looks quite the same when I visit it for Alumni events.

aulin
0 replies
15h38m

Same; it's either Linux or macOS in compute-intensive research, from what I've seen in Europe. I've seen groups working with Windows-based data analysis tools, but those groups and the ones that need Python rarely intersect.

WSL2 is a godsend for people forced onto Windows by blind IT policies, but fortunately IT doesn't have that kind of control in academia.

nijave
0 replies
9h28m

I'm guessing big corporations where it's near impossible to get a non-Windows machine are a big part of the audience.

RockRobotRock
0 replies
22h13m

msft and nvidia tackling CUDA drivers on WSL2 is a godsend. Especially when you can use Docker Desktop on it.

Definitely making the best of a less than ideal situation.

tedunangst
29 replies
23h38m

In particular, Meson was going to refuse to accept the MSVC+gfortran combination that was in use in conda-forge.

This sounds like a bug? The point of a build tool is to run the commands you tell it, not to tell you "sorry, Dave".

rbancroft
13 replies
23h26m

The article is well-written and detailed; however, I was taken aback by the claim that Meson is "widely used for C & C++ projects". I've come across Bazel more often than Meson. I guess because Meson is written in Python it seemed like a good choice for SciPy, and it worked out in the end, so congratulations.

But yeah, I think CMake is still the gold standard despite all its quirks, complexities and problems.

wenc
6 replies
22h56m

Projects that use Meson

https://mesonbuild.com/Users.html

Includes Gimp, Gtk+, nautilus, Postgres, qemu, Wayland…

galangalalgol
4 replies
21h54m

I thought Conan was the standard? Or what about Hunter? vcpkg? And now with modules getting added to the language and being supported to varying degrees by all these options, I don't understand how the C++ ecosystem won't fracture into a pile of unusable garbage.

spookie
2 replies
21h27m

vcpkg is a Microsoft thing. No project I know of, besides those targeting Windows only, even considers it. But I'm probably biased by my software choices.

galangalalgol
0 replies
21h2m

I know projects that have chosen it for Linux-only work, but I got the feeling they regretted it. Conan is what I was using before I switched to cargo, and it was fine as long as everything in your tree was Conan. Dependencies you could wrap, but dependents were a headache.

davisp
0 replies
20h19m

If you’ve ever wished that CMake’s ExternalProject pattern could be lit on fire, launched into the sea, fished out of the sea and then incinerated via tactical nuclear strike before bundling up the ashes that are then launched into the sun, vcpkg is definitely something you should look into.

I’m no Microsoft fanboy. I’ve been in software dev for roughly 20 years at this point so generally view anything from Microsoft with genuine suspicion, so I get the hesitation to take it seriously. But it works across the big three (Windows, Linux, macOS) and is MIT licensed so I’d definitely recommend giving it a whirl.

The only serious knock against it is that they went the OG Homebrew route with a single Git repo containing all of their ports (equivalent to Homebrew Formulae). And then whoever designed the Git repo approach also knew slightly too much about Git internals and leveraged tree-ish refs as part of the versioning design, which is just weird and confuses anyone that hasn't spent time tearing into Git's object model.

So basically, vcpkg is honestly a good tool that does what it does fairly well. It may not do everything you need, but if it can it’s amazing.

Also, the buried lede here is how vcpkg handles binary caching. Think of it like sccache but at the dependency level. I’ve seen it drop CI runs from over an hour to 10m purely because it helps skip building dependencies without resorting to bespoke caching strategies.

pjmlp
0 replies
14h13m

conan and vcpkg are pretty much head to head, depending on which circles one moves on.

Everything else is mostly statistical noise.

People on C and C++ ecosystems like choice, so there will never be one single solution.

okanat
0 replies
20h48m

One thing that's noticeable is that most of those projects are Linux-only, and many of them are around the GNOME/GTK ecosystem. Most of those projects have no Windows versions. The ones that do have really bad compatibility with it (e.g. GIMP, GTK3). Meson has a preference for simpler code bases leaning more on the C-heavy side. That's usually not the case with big, old, commercial C++ code bases.

The programs that need complex build systems, requiring things like compiling a code generator first and then compiling the rest of the project with the generator, are quite common in the C++ world to tame the language's shortcomings. Libraries like Qt, Protobuf and gRPC often introduce a crazy amount of build complexity.

CMake's complexity is a direct result of that, and it is currently the only build generator that can cope with that using basically every compiler in existence (including proprietary ones like ARMCC, ICC, MSVC and very limited ones like SDCC). Even Bazel cannot handle the same number of compilers and feature sets. That's what makes CMake the gold standard, not its string-driven scripting language.

CMake shares quite a bit of history with C++, and you hear sentences like "it's the only thing that works for this level of complexity" for both.

bsder
5 replies
23h15m

Meson is a front end.

Edit: Caution: This statement is incorrect--It can generate CMakefiles as a backend. It can also generate build files for MSVC. It can also generate standard Makefiles.

Edit: Corrected downstream. CMake can make Ninja files which Meson also makes. I got this backwards but kept the edit so people won't get confused.

CMake is NOT a gold standard. CMake is an agglomerative disaster.

For example: try getting CMake to accept zig as your compiler. "Oh, your compiler command has a space in it? So sorry. I'm going to put everything after the space in all manner of weird places. Some correct--some broken--some totally random." If you're lucky CMake crashes with an inscrutable message. If you're unlucky, you wind up with a compiler command that fails in bizarre ways and no way to figure out why CMake is doing what it did.

This is my experience with CMake every damn time--some absolutely inscrutable bug pops up until I figure out how to route around it. If I'm really unlucky, I have to file a bug report with CMake as I can't route around it.

Sure, if some unfortunate soul has beaten CMake into submission and produced a functioning CMakefile, CMake works. If YOU are the poor slob having to create that CMakefile, you are in for worlds and worlds and worlds of pain.

bonzini
2 replies
23h13m

Note that Meson does not generate CMake files. It generates MSVC or ninja files.

bsder
1 replies
23h7m

Sorry. You are correct. I forgot that it's CMake that can now generate Ninja files.

maccard
0 replies
10h26m

I forgot that it's CMake that can now generate Ninja files.

Cmake has had ninja support for a long time. I've used it for at least 7 years at this point.

spookie
1 replies
21h30m

I agree; I stay away from CMake whenever possible. I've tried so hard to understand it, and yet it kicks me around like a piñata.

akoboldfrying
0 replies
10h50m

Same. By now I almost have an allergic reaction to thinking about using it.

I know the basic thing I'm trying to do is not going to work, and that I'm going to wind up opening 20 browser tabs that alternate between (1) trying to understand from first principles how to do it properly (and getting frustrated going around in circles through their labyrinthine yet thoroughly incomplete docs), and (2) just desperately searching the rest of the web for the right incantation to whisper (and getting frustrated by all the blog posts and forum answers that describe how to do the thing before they went and changed how everything works).

Feeling the rage and despair build as hours roll by, and you're still staring at a screen full of the most cretinous syntax ever excreted into the world.

gavinhoward
6 replies
23h25m

I agree.

Disclaimer: author of a soon-to-be released Meson competitor.

But one obscure, little gem of a rant that taught me what you just said is [1].

tl;dr: Build systems should run the commands you tell them to run, period. Because sometimes, the programmer actually does know what he's doing.

I am ashamed to admit that before I read that comment, I was thinking about making my build system magical. But after reading that comment, I realized that "magic" is why people hate build systems.

[1]: https://ofekshilon.com/2016/08/30/cmake-rants/#comment-29273

chubot
3 replies
15h45m

I read this 2021 comment/rant and didn't find it illuminating or insightful

I just want a tool that doesn’t make ANY assumptions about compiler flags or compilers or how my project directory structure is laid out. I can do the leg work of inputting all the exact parameters and build configurations into the tool. I just need the tool to incrementally compile my code in a parallel fashion without HACKS

That's exactly what Ninja is, and it's existed since 2012.

CMake is a flawed generator for Ninja, but you can write your own. I did that for my project, as explained here - https://lobste.rs/s/qnb7xt/ninja_is_enough_build_system#c_tu...

gavinhoward
2 replies
7h28m

Understandable that you wouldn't find it insightful.

I found it insightful because I didn't know. Also, how angry the commenter was; I'd been struggling to figure out why people hate build systems, and the hate pouring out from that comment was palpable enough to point me in the right direction.

Anyway, I've already read your comments on lobste.rs, and I'm glad it's worked for you.

Ninja is great (my build system will be able to read Ninja files), but there are two major problems.

First, you still need a generator. Most people don't think about reinventing the wheel like you do, so CMake still comes up. And a generator can still have the magic that people hate.

Second, it's still limited in what you can do. Sometimes, you need something more complex to make your build happen.

Ninja can definitely take care of the 95% case, and I actually will encourage people to use it before they use mine.

I'm only going to be targeting the people with the 5% case, or that hate all of the widely-used generators. I fall into the latter category myself.

chubot
1 replies
3h8m

OK but I don't understand why you would read Ninja files rather than generate them.

I think Ninja almost always needs a generator -- it's too low level to write by hand, especially for C/C++ projects.

Meson apparently also generates Ninja, but I haven't used it.

---

I think the main difficulty with builds is solving the "Windows problem", as I wrote in that thread.

Good example of all that from yesterday:

https://lobste.rs/s/hh7wuy/why_scipy_builds_for_python_3_12_...

https://news.ycombinator.com/item?id=38196412

Anyone who can solve that problem will be a hero, but I also think no one person can solve that problem.

That is, cross-language / polyglot builds and working with the existing tools from each ecosystem is THE problem. Unfortunately most people seem to think that the language they use is the only one worth solving for.

Our Ninja build handles C++ great, but we also have many shell scripts for one-off Python things, and R things, and JavaScript things, ...

gavinhoward
0 replies
1h47m

OK but I don't understand why you would read Ninja files rather than generate them.

My build system is self-contained. It doesn't just do the configure; it can also do the build.

Also, a feature is being able to treat external builds as part of the same build.

For example, if one of your dependencies is a CMake project, it will be able to run the CMake configure as a target, import the targets from the Ninja file, and run those targets as part of the build for your project.

It should be able to do the same for Cargo files, Zig build files, etc.

That would solve the polyglot issue, minus details, which I'm working hard on.

https://news.ycombinator.com/item?id=38196412

I'm not sure why you linked the post we're commenting on...

Edit: I'm also solving the Windows problem; my build system is Windows-native, and I have a design for something to build up command-lines based on the compiler in use, including MSVC.

Edit 2: I also understand why you question whether anyone can solve the polyglot problem. You are right to question whether I can. If your response is "show me the code," that is rational, and my response would be, "I'm almost there; give me three months." :)

gpderetta
1 replies
8h6m

Magic is great when you write the magic to suit your problem.

When you are subject to other people's magic, it usually ends in tears.

gavinhoward
0 replies
7h26m

Yes, very much yes.

Your comment is such a concise description of the problem.

bonzini
3 replies
23h14m

It's the MSVC linker that would complain. The problem is that the C runtimes used by MSVC and gfortran (whose own runtime library is written in C) are not ABI-compatible. The hack that numpy used was to link the Fortran objects into a DLL to add a level of indirection (the import library) that would pacify MSVC.

So there was some extra work needed to create these DLLs. Either in the build description files, or in Meson. The SciPy people didn't want to implement this indirection in either place, and the Meson developers were not eager to help them either (they did help in general, for example with Fortran and Cython support; but they don't want to provide footguns) because it was indeed a hack. It only worked because the Fortran side didn't use files that were opened on the Python/C side, for example.

https://web.archive.org/web/20180711144501/https://pav.iki.f...

bsder
2 replies
23h10m

Thank you for the clarification. Meson refusing to do something and the developers not deeming that a bug seemed really unusual.

bonzini
1 replies
22h48m

Oh, Meson is definitely a "sorry Dave" kind of build system. In many cases it's extremely opinionated, though I have only found a couple cases that get into "infuriating" territory.

It does compensate by generally preserving a lot more sanity than its competitors, and having a readable and maintainable description of the build system.

spookie
0 replies
21h33m

I appreciate the sanity it brings. More software should be like that, as you implied. Or so I read it, anyway.

aidenn0
3 replies
23h25m

Meson does more than run the command you tell it, it can also synthesize those commands (so you can support MSVC/gcc/clang without writing build rules). If you ask it to synthesize a command for a combination it doesn't know about, of course it's going to tell you "sorry dave"

tedunangst
2 replies
23h13m

Yeah, sure, 40 years ago make had implicit rules for building .o files from .c files so you didn't have to write them, but if you told it your compiler was going to be bananac, it wouldn't refuse to run.

dekhn
1 replies
22h36m

The author of make personally thanked the author of bazel for making a replacement. He actually greatly regretted making Make.

(I love Make).

mzs
0 replies
16h5m

Stuart Feldman? Got a source/name?

kragen
28 replies
15h30m

maybe the free software community should stop subsidizing microsoft's operating systems and let them port things like scipy to it themselves

after all, if you want linux, you know where to find it; it's right there in wsl2

also microsoft could start shipping a fucking compiler with their sorry malware-ridden excuse for an operating system, like every single other operating system vendor has for sixty years

xcv123
27 replies
15h9m

microsoft could start shipping a fucking compiler

Visual Studio is a free download. Most users are not developers. Waste of space to include it by default.

https://visualstudio.microsoft.com/vs/community/

kragen
22 replies
14h41m

that's a ridiculous excuse

unix v7 included a compiler and was three megabytes

https://www.tuhs.org/Archive/Distributions/Research/Keith_Bo...

https://opensimh.org/research-unix-7-pdp11-45-v2.0.pdf

the compiler was a tiny fraction of that

gcc 9 is about 50 megabytes

windows 11 is over 8 gigabytes; you need a 16 gig usb drive to install it

there are probably individual audio files included in windows that are bigger than gcc

also, tho, this is a lot like not including life jackets on a ship because most passengers don't get shipwrecked

samus
7 replies
14h33m

Most users will never need a compiler. If they need it, it's a download away.

Life jackets are hopefully never needed, but when they are needed, the crew can't go to the warehouse and get them. Big difference.

kragen
6 replies
14h30m

it might be a download away, or it might be permanently unavailable

pjmlp
3 replies
14h19m

Which was the usual way things were on UNIX, thanks Sun, before GNU/Linux became relevant.

kragen
2 replies
14h16m

no, there was a brief period of time where sun decided to try to imitate microsoft in this stupidity, but fortunately none of the other unix vendors followed suit

pjmlp
1 replies
13h28m

First of all, everyone else was doing the same outside UNIX in the 1980s.

Secondly, Solaris, AIX, HP-UX and DG/UX were the same in that you had to buy a UNIX developer license for the compilers.

So other UNIX vendors did follow suit, and I can't be bothered to dive into BYTE and DDJ ads from the 1980s-1990s to add others to the list.

kragen
0 replies
3h19m

though i never bought one myself, i never saw an aix or irix box without compilers installed, and don't have any personal experience with hp-ux and dg/ux, but it was only for solaris that the fsf decided they had to put up precompiled gcc binaries on their ftp site because the vendor wasn't shipping one

samus
1 replies
14h11m

Why would it be permanently unavailable?

kragen
0 replies
13h57m

that's what always happens to downloads

like 90% of my links from 8 years ago are 404 now

xcv123
6 replies
14h34m

It's not an excuse. Visual Studio is multiple gigabytes. Yes they should make it smaller but unless that happens it would be stupid to include it by default. Waste of space.

kragen
3 replies
14h31m

they can ship a compiler that isn't a giant pile of shit then

there are free ones

xcv123
1 replies
14h28m

You should get a job at Microsoft and fix it.

kragen
0 replies
14h25m

i'd sooner breakfast on goat vomit

pjmlp
0 replies
14h21m

A compiler without system libraries is useless.

All of those little apt get install ....-dev to spend an afternoon on.

Followed by installing Clion, QtCreator or KDE + KDevelop.

figomore
1 replies
2h19m

It's possible to install only the components you want from Visual Studio. I use chocolatey to install only the compiler and a component needed to compile Cython code on Windows:

  choco install -y visualstudio2019buildtools
  choco install -y visualstudio2019-workload-vctools

In this case it's installing Visual Studio 2019.

xcv123
0 replies
21m

If we're talking about the base OS install, then it's a different story. Which options do they choose as default for all users? Are you suggesting that Microsoft create an OOTB setup perfectly custom-tailored to this one tiny niche requirement of compiling Python whatever?

Even on Linux the OOTB setup is useless for my development. I always need to `apt install` all the compilers, frameworks, libraries, SDKs, utilities, etc., before it's usable.

pjmlp
6 replies
14h22m

You are only looking at the compiler, without the standard library and all the nice tools modern C and C++ developers have grown to enjoy since 1979.

You should be comparing it to a C compiler for MS-DOS.

If you want to do a proper comparison you should include GNU/Linux libraries for all major architectures already compiled, GUI frameworks, IDE, .NET, Python, node, Java SDK, Azure integration SDKs, device drivers, ...

kragen
5 replies
14h19m

because of dynamic linking, the standard library is already included in microsoft windows, and all that other crap isn't needed to get blas and lapack to build

pjmlp
4 replies
13h26m

What standard library would that be?

kragen
2 replies
13h0m

beats me, i haven't had a microsoft windows box since 02000

pjmlp
1 replies
10h39m

So you just like to rant, I see.

kragen
0 replies
3h21m

yet somehow i was still correct: https://news.ycombinator.com/item?id=38203731

jasomill
0 replies
10h4m

Microsoft's Universal CRT, present by default in Windows 10, and installable on Windows 7 SP1 and later.

Linking to UCRT using an entirely FOSS toolchain is, alas, nontrivial, but supported by mingw compilers (gcc and clang; no idea about the various FOSS Fortran compilers):

https://mingwpy.github.io/ucrt.html

gary_0
3 replies
12h33m

All you really need is cl.exe and friends, and the C/C++ headers, in some fixed directory path. (And, apparently, maybe a Fortran compiler?) That shouldn't take up that much space.

And the last time I was forced to install Windows 10, it spent a few hundred megabytes of bandwidth and disk space on Candy Crush Soda Saga, not to mention a bunch of other junk I never asked for, so disk space is not that precious to Microsoft.

xcv123
2 replies
1h38m

That simply won't work for most Windows development. Need all the frameworks, including .NET. At least tens of gigabytes. Full Visual Studio 2022 installation is 210 GB.

gary_0
1 replies
1h21m

Nobody's asking for .NET or an IDE. Just enough to build things like the aforementioned Python packages. Or am I missing your sarcasm?

xcv123
0 replies
1h12m

This is not sarcasm.

You are asking for basic development capability in the base OS installation.

The vast majority of Windows development uses .NET or other frameworks. The barebones C++ compiler and standard library simply won't work for most development on Windows, so what's the point? You are expecting base functionality to cater to your very niche, specific needs, which is practically useless for the vast majority of Windows development in general. There's no business case for it. Won't happen.

Even on Linux I need to install a lot of headers and libraries and compilers and SDKs before it can be used for development. Ubuntu base install is practically useless for dev without `apt install <all the things>`.

mattbillenstein
18 replies
22h48m

Very naive question, but are the semantics of Fortran so different that it can't be translated to C first and then compiled using a C compiler? Perhaps maintained in C going forward?

I can't imagine there are a lot of Fortran folks around maintaining these old libraries - they must need maintenance, no?

not2b
13 replies
22h37m

If you want the code to run slower, yes, you could do that. Because there are no pointers in Fortran, only arrays, and because arguments to functions aren't allowed to alias (let's ignore the horror of COMMON blocks for now), aggressive optimization and vectorization is easier.

The standard Fortran math libraries just work, and they are fast.

I should clarify that you can write C/C++ code that would have equivalent speed, especially with the C restrict keyword, but putting in an f2c step to translate the existing code will make things significantly worse in many cases.

Sohcahtoa82
8 replies
22h22m

Because there are no pointers in Fortran, only arrays, and because arguments to functions aren't allowed to alias [...], aggressive optimization and vectorization is easier.

Can you explain this like I'm 12?

What's an example of an aggressive optimization you can make based on arguments not being aliased and there being no pointers?

Falell
3 replies
22h10m

Not a direct compiler optimization, but consider memcpy() vs memmove() as an example. If you know two regions of memory do not overlap you can call memcpy() for a direct optimized copy, but if they overlap you must call memmove() and introduce an intermediate copy.
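A toy illustration (my sketch, not from the parent) of why the distinction exists:

  #include <stdio.h>
  #include <string.h>

  int main(void) {
      char buf[16] = "abcdefgh";
      // Shift the first 7 bytes right by one. Source and destination
      // overlap, so memmove is required; memcpy here would be
      // undefined behavior.
      memmove(buf + 1, buf, 7);
      buf[0] = '_';
      printf("%s\n", buf);  // prints "_abcdefg"
      return 0;
  }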

Sohcahtoa82
1 replies
21h18m

It makes sense, but when would you ever memcpy with overlap? I would think any situation that lets that happen is from a bug, like you have an incorrect buffer length or an incorrect destination address.

murderfs
0 replies
20h51m

Inserting an element in an array is something along the lines of:

  memmove(arr + idx + 1, arr + idx, (length - idx) * sizeof(*arr));
  arr[idx] = foo;

johncolanduoni
0 replies
9h26m

memmove does not (in any implementation I’ve ever heard of) introduce an intermediate copy, it just performs the copy loop in the reverse direction to handle the overlap (and can’t always vectorize in the same way memcpy can).

hurryer
2 replies
21h49m

  int *x;
  int *y;
  // ...
  *x = 5;
Because of aliasing, x and y might point to the same memory location, so the compiler must assume that when you set *x to 5, *y also potentially got modified; at the next access it will read *y again from memory and discard any cached value it might have in a register.

The compiler will try to prove that no such aliasing is present by tracking your pointer usage, but that's not always possible; in those cases it will assume the worst and re-read values from memory.
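A tiny sketch of the consequence (my example, not the parent's):

  int f(int *x, int *y) {
      *x = 1;
      *y = 2;
      return *x;  // must reload *x from memory: if y == x, the answer is 2
  }

Even at high optimization levels the compiler has to emit that extra load; declaring the parameters restrict would let it just return the constant 1.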

wycy
0 replies
19h1m

I finally understand -- thanks for this example.

Sohcahtoa82
0 replies
21h21m

Ah that makes sense. Thank you!

superlopuh
0 replies
21h58m
pklausler
2 replies
21h38m

Fortran has had pointers in the standard language since F'90, and as a ubiquitous vendor extension from ca. 15 years before that.

Affric
1 replies
11h37m

Are they often used in the big linear algebra libraries?

not2b
0 replies
2h17m

No. The math libraries are written in Fortran 77 for the most part, which did not have pointers. I should have been more precise about that.

dwheeler
0 replies
21h50m

Also, in Fortran arrays are normally stored in column-major order (sometimes called “Fortran order”). C uses row order. A simplistic translation would have terrible performance.
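For illustration (my sketch): summing an n-by-n matrix in C, the loop order decides whether you stream through memory or stride across it, and a line-by-line port of Fortran loop nests picks the wrong order for C's layout:

  // Matches C's row-major layout: consecutive accesses are adjacent.
  double sum_c_order(const double *a, int n) {
      double s = 0.0;
      for (int i = 0; i < n; i++)
          for (int j = 0; j < n; j++)
              s += a[i * n + j];   // walks memory linearly
      return s;
  }

  // A naive port of Fortran (column-major) loop order: each access
  // jumps n doubles ahead, defeating the cache and the prefetcher.
  double sum_fortran_order(const double *a, int n) {
      double s = 0.0;
      for (int j = 0; j < n; j++)
          for (int i = 0; i < n; i++)
              s += a[i * n + j];   // stride of n doubles per step
      return s;
  }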

lldb
1 replies
22h43m

We have this already - “f2c” has been around for decades.

ogrisel
0 replies
4h30m

It's currently used to provide a hackish WASM-compatible port of scipy for Pyodide, and the Pyodide maintainers are eager to be able to drop the f2c hacks in favor of LFortran or any WASM-capable Fortran compiler, because f2c can cause very hard-to-debug low-level crashes when running the scipy tests or those of downstream libraries (e.g. scikit-learn).

mkoubaa
0 replies
22h43m

Fortran is a higher-level language than C. There are plenty of Fortran developers.

It's awful for application development, but that isn't its niche.

googl-free
0 replies
22h47m

yeah, fortran has native arrays

cozzyd
9 replies
23h43m

the build system churn in python is really hard to keep up with.

I'm also curious about performance numbers on Windows (though, to first order, it doesn't matter... anything serious is probably running on a Linux machine).

nerdponx
4 replies
23h30m

Fortunately it should be slowing down now. The big transition was getting everyone on board with PEP 517, especially converting legacy Setuptools projects.

lmm
3 replies
22h43m

I've been hearing something like this about Python builds every two or three years for decades. How long until the next "big transition" is "converting legacy PEP 517 projects"?

nerdponx
2 replies
21h19m

Hopefully never.

lmm
1 replies
19h17m

Hopefully, sure. But why should we believe that this time is different?

oblvious-earth
0 replies
19h4m

The main difference now is that there is an actual standard, rather than a de facto tool that becomes the standard.

The idea is that different tools may rise and fall in popularity, but they should all be following the standard, so compatibility breaks should be minimal.

Will it work? No idea, but it's the best attempt to make things work well for everyone yet.

Python 3.12 might be the biggest churn moment, but there are probably a few more down the road, such as dropping legacy version specifiers.
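For what it's worth, the standardized interface (PEP 517/518) boils down to a small table in pyproject.toml naming the build requirements and the backend, so any frontend (pip, build, ...) can drive any backend. Roughly, and purely as an illustration (the backend shown is what Meson-based projects like SciPy use):

  [build-system]
  requires = ["meson-python"]
  build-backend = "mesonpy"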

hurryer
3 replies
21h42m

For pure CPU computation Windows is just as fast as Linux since 99.9% of the time it's your code running and not the OS.

sdfhioandion
0 replies
20h25m

Not true. Changes to the OS, particularly to the scheduler, can affect CPU-bound work a great deal. It's "your code", but the OS decides when and where it runs. For example, the changes between Linux 6.5 and Linux 6.6 led to a >20% uplift in TensorFlow and some smaller uplifts to Blender. This is the same software on the same hardware.

https://www.phoronix.com/review/linux66-epyc-xeon/3

I couldn't track down a more detailed scheduler-specific benchmark. I remember reading one on Phoronix a few months back...

murderfs
0 replies
20h48m

Depending on the code you're running, calling convention can matter a lot. The SysV ABI will use xmm/ymm/zmm automatically, whereas on Windows you have to opt into it with __vectorcall.
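A minimal sketch of the opt-in (mine; MSVC/clang-cl syntax):

  #include <immintrin.h>

  // Under the default Windows x64 convention, __m256 arguments are
  // passed by reference through memory; __vectorcall passes them in
  // ymm registers instead, as the SysV ABI does automatically.
  __m256 __vectorcall axpy(float a, __m256 x, __m256 y) {
      return _mm256_add_ps(_mm256_mul_ps(_mm256_set1_ps(a), x), y);
  }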

cozzyd
0 replies
18h27m

Not many compute clusters run windows

bjackman
6 replies
12h8m

Back when Linux was a janky corner case run by loosely-coordinated hackers with sometimes-impractical ideological constraints, when brilliant people were expending heaps of effort to enable support for it, that was fantastic.

Now that the janky corner case is a proprietary system run by cyber-landlords whose constraints are just... hostility, I feel less positive about the work being done to support it.

On the one hand it's really great that these people care so deeply about making these tools usable for everyone. And I applaud them for doing it, absolutely not suggesting they should change course, just pondering.

But... now instead of thinking "wow I'm so glad this work is happening" I do rather think "wow imagine what those wonderful people could achieve if they didn't have to work on this".

crabbone
3 replies
6h27m

It's not even that. SciPy is paying the price of the idiotic (and highly biased) decision of Python core-dev members to choose MSVC over MinGW for Python on Windows. (And their motivation derives from Microsoft sponsorship: a bunch of people on the core-dev list are straight-up Microsoft employees, paid by Microsoft to be on that list, and, on top of that, Microsoft pays for the CI servers for the CPython project.)

This whole problem could've been avoided if Python just didn't use proprietary tools in its toolchain.

IcyWindows
2 replies
2h19m

Are you saying using llvm would have also made things better, or are you saying that it would be better if everyone just used gcc?

crabbone
0 replies
1h29m

I'm not knowledgeable enough about portability of LLVM, but if it's binary compatible with GCC-compiled stuff (or can be made such) on Windows, then, sure.

See, the problem here is that if you want interop with, e.g., Ruby, Erlang, R, Perl, Go, and probably a bunch of others (the only other exception I know of is PHP (PECL), which uses the MS toolchain), then you have to produce compatible binaries.

Ideally, it shouldn't be about the flavor of compiler, but about some kind of official, documented format... that many compilers can easily implement. But since, de facto, this format is "just use GCC", then, in practical terms, either use GCC, or pretend you use GCC.

crabbone
0 replies
1h17m

BTW, in the quest to find out how other popular languages deal with this problem, I discovered that things can actually be even worse than Python. Enter JavaScript (Node.js)!

In JavaScript, you cannot distribute pre-built native modules, only source code. And if it compiles, then it compiles, and if it doesn't -- it's the user's fault.

SiempreViernes
1 replies
11h0m

Well, "they don't "have" to work on it, as was pointed out several times the SciPy devs are volunteers.

Indeed most of the story is explaining why SciPy just had to hope someone else would make a open source Fortran compiler for windows, and it looks like it was mainly NVidia devs that provided salvation

henrydark
0 replies
8h12m

I must go on compiling / You can't break that which isn't yours / I must go on packaging / I'm not my own, it's not my choice

FrustratedMonky
6 replies
22h54m

Maybe move away from Python altogether?

mkoubaa
3 replies
22h42m

Great idea! Why don't you go suggest it to the developers of SciPy??

FrustratedMonky
2 replies
22h9m

The developers of SciPy are really good; maybe their brains could be focused on developing tools for a different language, one that doesn't cause articles to be written about how crazy it is to integrate with it.

At some point, move on. Python started as a toy and grew into a behemoth.

cycomanic
0 replies
21h40m

Maybe instead the solution should be moving away from Windows. Almost all the issues described in the article (and, I would argue, in most language packaging systems) are due to Windows being such a pain to develop on. I'd argue it's going to be easier to convert all scientific users to Linux than to rewrite all scientific code in a different language. (No, I'm not seriously suggesting all Windows users abandon it; I'm just replying to someone who suggests ditching the language with the largest scientific codebase, by likely several orders of magnitude.)

bluenose69
0 replies
21h12m

At some point, move on

For some in the scientific community, Julia is seen as a nice alternative to Python. It has some great libraries, and installing them is easy, perhaps because the system was well designed from the start, and recently enough to build on experience with other systems.

Julia is really handy for reproducible work, because you can specify which versions of which libraries you're using, and share that information with others easily so they can retrace your steps. That's a big factor in scientific work.

The other thing (returning to the main topic) is that Julia does not have to rely on Fortran or C/C++ for low-level work. Julia is very fast all by itself. Indeed, most of the Julia libraries are written in Julia itself, so the library sources double as a tutorial for performance-intensive work.

In addition to these things, I have to admit that I love how Julia lets me program in unicode symbols, so my equations in code can look like my equations on paper.

There are some downsides to Julia that I ought to point out.

1. The error messages can be very cryptic.

2. The documentation is poor (no worse than Python, I think, but a lot worse than R).

3. The development community is a bit ivory-tower-ish. The usual answer to "could you improve the documentation" is "please submit a PR".

4. There is a learning curve to get used to Julia. It's easier if you are already familiar with Fortran and Matlab. Then again, most scientists I know are familiar with both those languages (and Python, and R, and ...) so learning a new language is not a challenge. Scientists like to learn stuff -- that's the whole point of being in science, for most of us.

nologic01
0 replies
22h7m

Python is now bigger than what even its keenest friends ever imagined might happen. This does not automatically solve the issues, nor make it more likeable to those who dislike it, but it does incentivise serious plumbing work.

dilawar
0 replies
20h1m

SOS Justine's Cosmopolitan.

gamache
5 replies
23h51m

One nit to pick: as far as I am aware, "aarch64" and "arm64" are the same thing. Am I off?

stefan_
1 replies
23h28m
pitaj
0 replies
21h49m

I love how he ranted so hard - without even mentioning the embedded or realtime variants.

mschuster91
1 replies
23h47m

They are the same thing but there used to be two competing LLVM implementations on the backend side.

[1] https://www.phoronix.com/news/MTY5ODk

gamache
0 replies
23h39m

Thanks, that is very helpful!

nerdponx
0 replies
23h48m

What I see in Python is that "aarch64" usually refers to Linux and "arm64" usually refers to MacOS ARM. I don't know enough about these things to understand why they have different names.

aj7
5 replies
22h8m

Does anyone else find the handling of arrays by Python so horrific that they can’t bring themselves to use it?

__MatrixMan__
2 replies
21h52m

Do you mean lists, or is there some array type that I'm not thinking of?

hurryer
1 replies
21h36m

There is the array module, but it's cumbersome to use.

https://docs.python.org/3/library/array.html

But this is unlikely to be what the parent meant.

kevin_thibedeau
0 replies
19h33m

Probably meant ndarray.

kstrauser
0 replies
22h7m

In what way?

ketozhang
0 replies
18h11m

That view isn't popular here because you're mostly hearing from the science community, who want more features in their arrays (vectors/matrices/tensors).

Why would you want to use C-like arrays in Python anyway?

bee_rider
3 replies
23h18m

I’m under the impression that the best BLASes are mostly C (MKL, Blis, and OpenBlas). I wonder how far they could get with just C and Python.

I wonder if they’d just go with libflame instead, if they started today.

Of course there’s lots of other functionality in scipy; iterative stuff, sparse stuff, etc etc, so maybe Fortran is unavoidable (although, Fortran is a great language, I’m glad the tooling situation is at least starting to improve on Windows).

pharrington
1 replies
18h50m

not gonna lie, seeing "Fortran is a great language, I’m glad the tooling situation is at least starting to improve on Windows" was not on my 2023 bingo card

bee_rider
0 replies
6h41m

I assume that is because Fortran is such a great language that you just didn’t think it was possible that the tooling situation on Windows could be anything but perfect. :)

cdavid
0 replies
23h4m

The question of removing Fortran from SciPy has come up a few times, and never got anywhere for the reasons you gave. A lot of SciPy itself contains Fortran code that would take man-years to rewrite.

Several key parts using Fortran have been removed once it became possible, for example the FFT stuff.

vintagedave
1 replies
13h58m

An amazing read. Most of the article notes that there was no Fortran compiler yet (though amazing how well the one they landed on worked -- a miracle alright!) but I was particularly struck by the ABI issues. The story of a lone developer hacking a MinGW build that used the right ABI struck home.

I can provide some background on the ABI issues in Windows.

The following is personal opinion not company opinion, and I am a product manager not an engineer so may have some technical details wrong. But:

At Embarcadero we're moving our C++Builder toolchain forward to a new Clang[1], and using a new C and C++ RTL layer [also see 1]. Like SciPy, we're now using UCRT, and the key bits that cause difficulty are the C++ RTL and platform ABI. Boy howdy do we have some stories.

Issues:

* There is no standard Windows ABI beyond what msvc produces. This results in WinAPI (C-level, in other words!) APIs that could not in the past be reliably used from anything other than msvc because they could throw exceptions, rather than returning error codes, and Win64 SEH is not fully documented (clang-cl, which is open source, disappears into the closed source msvcrt to handle this.)

* Or another issue: cdecl isn't standardised. This may surprise those of you who think -- as I used to think! -- that cdecl is simple and a known calling convention for any platform. We have issues where sret for return values[2] can be different between our toolchain and something built with MSVC. Since we need to change multiple languages, and one of our languages (Delphi) handles some returns (managed types) differently, changing this is more complex than it looks. We already did a lot of ABI compatibility work several years ago between multiple languages[3].

Back to the ABI: our new Clang is aiming at being fairly compatible with mingw-llvm on Windows. (And MSVC too, but we're starting with mingw-llvm as the basis.) That does not include a C++ ABI which the C++ committee has been resistant to, but it's a known good, working C++ toolchain, open source.

If Windows ever did have a platform ABI, it would likely be based on MSVC, but I would suggest that we -- developers -- should resist that until or unless the entire toolchain including runtime internals that affect the ABI is open sourced, or at least documented so that other toolchains can match it.

[1] https://blogs.embarcadero.com/win64-clang-toolchains-in-rad-...

[2] sret is a hidden (or special) return value used for structs, basically when a larger-than-register-size return value is required. I think. This is where I hope I don't embarrass myself too much in the explanation. See e.g. https://stackoverflow.com/questions/66894013/when-calling-ff... In other words, returning values is more complex than it seems; even a plain simple C function returning a struct, which should be _incredibly basic_, becomes a complex ABI issue.
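To illustrate (my sketch, not Embarcadero code): returning a struct that doesn't fit in registers is typically lowered so the caller passes a hidden pointer to the result slot, and every toolchain involved has to agree on exactly when and how that happens:

  typedef struct { double a, b, c, d; } Quad;

  // Source level: plain return-by-value.
  Quad make_quad(double x) {
      Quad q = { x, x + 1.0, x + 2.0, x + 3.0 };
      return q;
  }

  // What the ABI conceptually turns it into: the caller allocates the
  // result and passes its address as a hidden first argument, roughly
  //   void make_quad(Quad *sret_out, double x);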

[3] https://blogs.embarcadero.com/abi-changes-in-rad-studio-10-3...

dist-epoch
0 replies
9h47m

closed source msvcrt

Isn't msvcrt source code included with Visual Studio? I distinctly remember looking through it many years ago.

And I see it now too on my install: C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\crt\src

But maybe it's not 100% of it and some key parts are missing, and obviously there might be legal issues with reading this source code.

tsegratis
1 replies
22h39m

The κατα in καταστροφή means 'down to' or 'according to' rather than sudden, and implies strongly a bad turn

The inverse of κατα is often ανα though αναστροφή means literally "up turn" or invert

So maybe ευστροφη a "good turn" (eustrophe) would be better coinage

But arguing with JRR Tolkien on language coinage and being right would be a ... ευκαταστροφη i.e. good luck!

But in general I love their noticing of the overflowing grace -- that's something that gives joy and happiness in the world

coldtea
0 replies
21h16m

"Kata" in katastrofi actually means "against", so katastrofi is things "turning against".

pietroppeter
1 replies
19h56m

> While Fortran has long been the butt of the joke in IT departments the world over, in a curious twist of fate, it has seen a dramatic resurgence over the last few years. While the reasons for this are not exactly obvious ...

It is obvious to me. :)

https://x.com/OndrejCertik/status/1722364038212899274?s=20

resurrecting fortran: https://ondrejcertik.com/blog/2021/03/resurrecting-fortran/

bjornasm
0 replies
11h18m

This was a great read, thx :) Of all the languages other than Python, I am most drawn to Fortran. I used it for a university project and it was quite nice. Hope I'll find a good excuse to pick it up for some project.

panzi
1 replies
20h54m

About the compiler/architecture table: what's the difference between arm64 and aarch64? The Wikipedia article for ARM64 redirects to AArch64.

stingraycharles
0 replies
20h40m

I was thinking the same thing. Maybe they meant the armhfp architecture?

drtournier
1 replies
23h41m
josh-sematic
0 replies
22h48m

Also https://xkcd.com/1987/

Edit: not to imply that the work of the maintainers hasn’t been INCREDIBLE (it has). I just thought this XKCD was a funny take on how complex the python packaging ecosystem is.

d3m0t3p
1 replies
23h16m

In their table about compilers, aren't AArch64 and ARM64 the same?

tempay
0 replies
23h11m

Yes, though different operating systems use different naming. On Linux it's `aarch64` and on macOS+Windows it's `arm64`.

Grimburger
1 replies
23h10m

GCC..there are ways to make it work on Windows (through cygwin or MinGW)

I assume it would also work on WSL? That seems like a simple solution to the mess that is developing on Windows, especially in Python.

mkoubaa
0 replies
22h41m

No, Python on windows needs to work natively without wsl.

warangal
0 replies
14h51m

Indeed, it was a great read about what goes on behind the scenes in maintaining such packages, and about a problem every developer thinks about when such an extension has to be shipped.

I still wonder if targeting Python's LIMITED C API wouldn't help in this case. I use a tool (for Nim) which seems to target that Limited API, and it solves the problem of specific Python version mismatches, at least for my specific code. I never had to upgrade a pure C/Nim extension due to a Python version change!
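For reference, opting into the stable ABI from C is just a matter of defining Py_LIMITED_API before including Python.h; the resulting extension then imports across 3.x releases without a rebuild. A minimal sketch (the module name and strings are mine):

  /* hello.c - a stable-ABI extension module */
  #define Py_LIMITED_API 0x03080000  /* target the 3.8+ stable ABI */
  #include <Python.h>

  static PyObject *hello(PyObject *self, PyObject *args) {
      return PyUnicode_FromString("built once, imports on any 3.8+");
  }

  static PyMethodDef methods[] = {
      {"hello", hello, METH_NOARGS, "Say hello."},
      {NULL, NULL, 0, NULL},
  };

  static struct PyModuleDef moduledef = {
      PyModuleDef_HEAD_INIT, "hello", NULL, -1, methods,
  };

  PyMODINIT_FUNC PyInit_hello(void) {
      return PyModule_Create(&moduledef);
  }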

Using zig-cc along with a fixed glibc (2.27) and a generic CPU flag also made it possible to target the Linux ecosystem, in case the user is not a developer and just uses the compiled extension shipped with the package.

selimnairb
0 replies
20h57m

Great read. After spending a lot of time this year modernizing a CMake C++ project with Python bindings, which I successfully added to conda-forge as a new feedstock, I can say with great confidence that the first IT-related change I would make as God Emperor would be to extirpate Windows from all Universes for Eternity.

richwater
0 replies
22h27m

Yet another reason the Python ecosystem and installation is unfixable.

pklausler
0 replies
1d

I'm very glad that f18 worked (so far as we know). I haven't tried building SciPy for any platform yet myself, much less Windows.

ngcc_hk
0 replies
8h29m

Just go read: https://www.bitecode.dev/p/relieving-your-python-packaging-p...

TL;DR:

The reason we need virtual this and that in Python is that some things work in one environment and some work in others. This is particularly problematic on macOS, where the system sometimes falls back to the default (or somewhat-default) Python interpreter. I even have a script that checks versions, and a lot of environment setup for running different programs.

It is a mess.

After this magic, is there a new packaging environment, procedure, ... for the rest of us?

hazrmard
0 replies
5h0m

https://archive.is/SyeRt

Archive link. Website doesn't load for me.

eviks
0 replies
14h0m

Heroes and miracles are indeed required in the swamps of bad designs

crabbone
0 replies
6h31m

I just want to comment on this:

Meson was going to refuse to accept the MSVC+gfortran combination

Back in the day, I went to Python's core-dev list and asked why. Why would any sane person ever use MSVC for a cross-platform language runtime? And guess what the answer was? Well... The answer was "Microsoft pays us, gives us servers to run CI on, and that's why we will use Microsoft's tools, goodbye!"

For reference, Ruby uses GCC for the same purpose as do plenty of other similar languages for this exact reason.

To give you some context, I ran into this problem when writing bindings to kubectl. For those of you that don't know, in order to interface with Python from Go, one needs CGO, and on MS Windows it means MinGW. You could, in principle, build Python itself with GCC (a.k.a. MinGW) (and that's what MSYS2 a.k.a Cygwin a.k.a. Gitbash does), but this means no ABI compatibility with the garbage distributed from python.org.

So, after I had a proof of concept bindings to kubectl working on Linux, I learned that there will be no way (well, no reasonably simple way) to get that working on Windows. So, the project died. (Btw, there still isn't a good Kubernetes client in Python).

---

On the subject of packaging: I decided to write my own Wheel packager, just as a way to learn Ada. This made me read through the "spec" of this format while paying a lot more attention than I ever needed to before. And what a dumpster fire this format is... It's insane that this atrocity is used by millions, and that so much critical infrastructure relies on this insanity to function.

It's very sad that these things are only ever discussed by a very small, very biased, and not very smart group of people. But then their decisions affect so many who lack even baseline knowledge of the decisions made by those few. I feel like Python users should be picking up pitchforks and torches and marching on PyPA (home-)offices to demand change. Alas, most of those adversely affected by their work have no idea PyPA exists, let alone the details of their work.

carapace
0 replies
4h55m

Having finally gotten over the whole 2->3 fiasco I'm still sitting out new Python development until the PyPA sorts out the new packaging & distribution story. They're working on it and making progress, but it's still pretty gnarly.

Anyway, the complexity is too damn high!

Decabytes
0 replies
9h57m

Sometimes I wonder if Rust’s biggest draw is not that it is safe, but that it removed a lot of the BS hoops you have to jump through to get a working program on your computer.

I wonder if the Python alternatives will also get a similar boost for that reason. If maintaining the language and the ecosystem is that much of a bear, it would save a lot of human hours to move on, even if we started back from scratch for a bit.