I've tried again and again to like Nix, but at this point I have to throw in the towel.
I have 2 systems running Nix, and I'm afraid to touch them. I've already broken both of them enough that I had to reinstall from scratch in the past (yes yes - it's supposed to be impossible I know), and now I've forgotten most of it. In theory, Nix is idempotent and deterministic, but the problem is "deterministic in what way?" Unless you intimately understand what every dependent part is doing, you're going to get strange results and absolutely bizarre and unhelpful errors (or far more likely: nothing at all, with no feedback). Nix feels more like alchemy than science. Like trying to get random Lisp packages to play nice together.
Documentation is just plain AWFUL (as in: complete and technically accurate, but maddeningly obtuse), and tutorials only get you part of the way. The moment you step off the 80% path, you're in for a world of hurt, because the underlying components are just not built to support anything else. Sure, you can always "build your own", but this requires years of experiential knowledge and layers upon layers of frustration that I just don't want to deal with anymore (which is also why I left Gentoo all those years ago). And woe unto you if you want to use a more modern version than the distribution supports!
The strength of Docker is the chaos itself. You can easily build pretty much anything, without needing much more than a cursory understanding of the shell and your distro's package manager. Or you can mix and match whatever the hell you want! When things break, it's MUCH easier to diagnose and fix the problems because all of the tooling has been around for decades, which makes it mature enough to handle edge cases (and breakage is almost ALWAYS about edge cases).
Nix is more like Emacs: It can do absolutely anything if you have the patience for it and the deep, arcane knowledge to keep it from exploding in a brilliant flash of octarine. You either go full-in and drink the kool aid, or you keep it at arm's length - smiling and nodding as you back slowly towards the door whenever an enthusiast speaks.
I've gone down the same path. I love deterministic builds, and I think Docker's biggest fault is that to the average developer a Dockerfile _looks_ deterministic - and it even is for a while (build a container twice in a row on the same machine => same output), but then packages get updated in the package manager, base images get updated w/ the same tag, and when you rebuild a month later you get something completely different. Do that times 40 (the number of containers my team manages) and now fixing containers is a significant part of your job.
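A minimal sketch of the trap (image and package names here are just illustrative): this Dockerfile looks fully specified, but two of its inputs are mutable.

```dockerfile
# Looks deterministic, isn't:
FROM ubuntu:22.04            # the tag is mutable; it gets re-pointed at every rebuild
RUN apt-get update && \
    apt-get install -y curl  # resolves to whatever version the archive has today
```

Build it twice a month apart and the base layers and the curl version can both differ, even though not a character of the file changed.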
So in theory Nix would be perfect. But it's not, because it's so different. Get a tool from a vendor => won't work on Nix. Get an error => impossible to quickly find a solution on the web.
Anyway, out of that frustration I've founded https://www.stablebuild.com. Deterministic builds w/ Docker, but with containers built on Ubuntu, Debian or Alpine. Currently consists of an immutable Docker Hub pull-through cache, full daily copies of the Ubuntu/Debian/Alpine package registries, full daily copies of the most popular PPAs, daily copies of the PyPI index (we do a lot of ML), and an arbitrary immutable file/URL cache.
So far it's been the best of both worlds in my day job: easy to write, easy to debug, wide software compatibility, and we've seen 0 issues due to non-determinism in containers that we moved over to StableBuild.

I don't have any experience with Nix, but regarding stable builds with Docker: we ship a Java application with all dependencies pinned to fixed versions, so when doing a release - assuming nobody is doing anything fishy (like re-releasing a particular version, which is bad-bad-bad) - you get exactly the same binaries on top of the same image (again, assuming you are not using `:latest` or somesuch)...
Until someone overwrites or deletes the Docker base image (regularly happens), or when you depend on some packages installed through apt - as you'll get the latest version (impossible to pin those).
Any source of that claim?
Well... please re-read my previous comment - we do the Java thing, so we use a JDK base image and then slap our distribution on top of it (mostly fixed-version jars).
Of course, if you are after perfection and require additional packages, then you can install them via dpkg or somesuch, but... do you really need that? What about the security implications?
Any tag like ubuntu:20.04 -> this tag gets overwritten every time there's a new release (which is very often)
https://hub.docker.com/r/nvidia/cuda -> these get removed (see e.g. https://stackoverflow.com/questions/73513439/on-what-conditi...)
Do you have an example that isn't Nvidia? They're infamous for terrible Linux support, so an egregious disregard for tag etiquette is entirely unsurprising.
You gave an example from nvidia, not ubuntu itself. What's more, you are referring to a devel(opment) version, i.e. "1.0-devel-ubuntu20.04", which seems like a nightly, so it's expected to be overridden (akin to "-SNAPSHOT" for java/maven)?
Besides, if you really need utmost stability you can use image digest instead of tag and you will always get exactly the same image...
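For reference, digest pinning looks like this (the digest below is a placeholder, not a real one):

```dockerfile
# Content-addressed: this reference can never silently change underneath you
FROM ubuntu@sha256:<digest-of-the-image-you-tested>
```

Though a per-image digest is architecture-specific, and it doesn't help if the image is deleted outright.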
I am convinced that any sort of free public service is fundamentally incompatible with long-term reproducible builds. It is simply unfair to expect a free service to maintain archives forever and never clean them up, rename itself, or go out of business.
If you want reproducibility, the first step is to copy everything to storage you control. Luckily, this is pretty cheap nowadays.
Another option for reproducible container images is https://github.com/reproducible-containers although you may need to cache package downloads yourself, depending on the distro you choose.
Yeah, very similar approach. We did this before, see e.g. https://www.stablebuild.com/blog/create-a-historic-ubuntu-pa... - but then figured everyone needs exactly the same packages cached, so why not set up a generic service for that.
For Debian, Ubuntu, and Arch Linux there are official snapshots available so you don't need to cache package downloads yourself. For example, https://snapshot.debian.org/.
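If it helps, pinning apt to a snapshot is a one-line change to sources.list (the timestamp below is illustrative; the format is YYYYMMDDTHHMMSSZ):

```
# /etc/apt/sources.list - resolve packages as they existed at this instant
deb [check-valid-until=no] https://snapshot.debian.org/archive/debian/20240101T000000Z bookworm main
```

`check-valid-until=no` is needed because the snapshot's Release file metadata has long since expired.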
Yes, fantastic work. Downside is that snapshot.debian.org is extremely slow, times out / errors out regularly - very annoying. See also e.g. https://github.com/spesmilo/electrum/issues/8496 for complaints (but it's pretty apparent once you integrate this in your builds).
Ubuntu now has snapshot.ubuntu.com, see https://ubuntu.com/blog/ubuntu-snapshots-on-azure-ensuring-p...
Here's a related discussion about reproducible builds by the Docker people, where they provide some more details: https://github.com/docker-library/official-images/issues/160...
The pricing page for StableBuild says
Free …
Number of Users 1
Number of Users 15GB
Is that a mistake or if not can you explain please?
https://www.stablebuild.com/pricing
Ah, yes, on mobile it shows the wrong pricing table... Copying here while I get it fixed:
Free => Access to all functionality, 1 user, 15GB traffic/month, 1GB of storage for files/URLs. $0
Pro => Unlimited users, 500GB traffic included (overage fees apply), 1TB of storage included. $199/mo
Enterprise => Unlimited users, 2,000GB traffic included (overage fees apply), 3TB of storage included, SAML/SSO. $499/mo
Are you associated with the project?
I’m an investor in StableBuild.
Just pin the dependencies and you're mostly fine, right?
Yeah, but it's impossible to properly pin w/o running your own mirrors. Anything you install via apt is unpinnable, as old versions get removed when a new version is released; pinning multi-arch Docker base images is impossible because you can only pin on a tag which is not immutable (pinning on hashes is architecture dependent); Docker base images might get deleted (e.g. nvidia-cuda base images); pinning Python dependencies, even with a tool like Poetry is impossible, because people delete packages / versions from PyPI (e.g. jaxlib 0.4.1 this week); GitHub repos get deleted; the list goes on. So you need to mirror every dependency.
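To make those failure modes concrete, here's a hedged sketch of a Dockerfile where every "pin" can rot (tags and versions are illustrative, not exact):

```dockerfile
FROM nvidia/cuda:11.2.0-runtime-ubuntu20.04       # tag can be deleted upstream entirely
RUN apt-get update && \
    apt-get install -y curl=7.68.0-1ubuntu2.20    # old versions get dropped from the archive
RUN pip install jaxlib==0.4.1                     # versions can be yanked from PyPI
```

Each line is pinned as tightly as the respective ecosystem allows, and each can still fail a rebuild a month later unless you mirror everything.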
Huh, I have never had this issue with apt (Debian/Ubuntu) but frequently with apk/Alpine: The package's latest version this week gets deleted next week.
What is an efficient process to avoid using versions with known vulnerabilities for long times when using a tool like stablebuild?
Very nice project!
But also Nix solves more problems than Docker. For example if you need to use different versions of software for different projects. Nix lets you pick and choose the software that is visible in your current environment without having to build a new Docker image for every combination, which leads to a combinatorial explosion of images and is not practical.
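A sketch of what that looks like in practice (attribute names vary by nixpkgs revision):

```shell
# Ad-hoc environment with exactly these tools on PATH; nothing installed
# globally, nothing to rebuild when the combination changes
nix-shell -p python311 nodejs_20 postgresql_15
```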
But I also agree with all the flaws of Nix people are pointing out here.
>Documentation is just plain AWFUL (as in: complete and technically accurate, but maddeningly obtuse)
Documentation is often just plain erroneous, especially for the new CLI and flakes - and not just for edge cases. I remember spending some time trying to understand why `nix develop` didn't work as described and how to make it work like it should. I feel like nobody ever actually used it for its intended purpose. It turns out that by default it doesn't drop you into the build-time environment like the docs claim (hermetically sealed, with the stdenv scripts available): it's not sealed by default, and the command-line options have confusing names, so you need to fish the knowledge out of the sources to make it work. Plenty of little things like this.
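For anyone hitting the same wall, what got me closest to the sealed environment the docs describe was the `--ignore-environment` flag (the flake attribute below is made up; sketch only):

```shell
# Default: inherits most of your user environment
nix develop .#mypackage

# Closer to the documented hermetic build environment:
nix develop --ignore-environment .#mypackage
```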
>In theory, Nix is idempotent and deterministic
I surely wish they talked more about the edge cases that break reproducibility. Things like floating point code being sensitive to the order of operations, with state potentially leaking from OS preemption, and all that. Which might be obvious, but not saying obvious things explicitly is how you get people to shoot themselves in the foot.
That’s profoundly cursed and also something that doesn’t happen, to my knowledge. Unless the kernel programmer screwed up, an x86-64 FPU is perfectly virtualizable (and I expect an AArch64 FPU is too, I just haven’t tried). So it doesn’t matter where preemption happens.
(What did happen with x87 is that it likes to compute things in more precision than you requested, depending on how it’s configured—normally determined by the OS ABI. Yet variable spills usually happened in the declared precision, so you got different results depending on the particulars of the compiler’s register allocator. But that’s still a far cry from depending on preemption of all things, and anyway don’t use x87.
Floating-point computation does depend on associativity, in that nearestfp(nearestfp(a+b)+c) is not the same as nearestfp(a+nearestfp(b+c)), but the sane default state is that the compiler will reproduce the source code as written, without reassociating things behind your back.)
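The non-associativity itself is easy to demonstrate (Python floats are IEEE 754 doubles):

```python
# Rounding happens after every operation, so the grouping changes the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.1 + 0.2 rounds to 0.30000000000000004 first
right = a + (b + c)  # 0.2 + 0.3 happens to round to exactly 0.5

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

But as the parent says, a compiler at default settings will evaluate the grouping exactly as written in the source, run after run.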
That doesn't happen in a single thread, but e.g. asynchronous multithreaded code can spit out values in arbitrary order, and depending on what you do, you can end up with a different result (floating point is just an example). Generally, you can't guarantee 100% reproducibility for uncooperative code, because there's too much hardware state that can't be isolated even in a VM. Sure, 99% of software doesn't depend on it or do cursed stuff like microarchitecture probing during the build, and you won't care until you try to package automated tests for a game physics engine or something like that. What can happen inevitably happens.
We don't need to be looking for such contrived examples actually, nixpkgs track the packages that fail to reproduce for much more trivial reasons. There aren't many of them, but they exist:
https://github.com/NixOS/nixpkgs/issues?q=is%3Aopen+is%3Aiss...
Fewer than a couple of thousand packages have been verified to reproduce. Nobody has even attempted to rebuild the entirety of the nixpkgs repository, and I'd make a decent wager on it being close to impossible.
It’s really not that bad. However, with a standard NixOS setup, you still have a tremendous amount of non-reproducible state, both inside user accounts and in the system. I’m running a “Erase your darlings” setup, it mostly gets rid of non-reproducible state outside my user account. It’s a bit of a pain, but then what isn’t on NixOS.
https://grahamc.com/blog/erase-your-darlings/
Inside my user account, I don’t bother. I don’t like Home Manager.
A nice upgrade to that is to put root in a tmpfs RAM filesystem instead of ZFS:
https://elis.nu/blog/2020/05/nixos-tmpfs-as-root/
That way it doesn't even need to bother with resetting to ZFS snapshots, instead it just wipes root on shutdown and reconstructs it in RAM on reboot.
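In NixOS configuration terms, the tmpfs root boils down to something like this (a sketch following the linked post; size and mount options are illustrative):

```nix
# configuration.nix - mount / as a RAM-backed tmpfs, wiped on every shutdown
fileSystems."/" = {
  device = "none";
  fsType = "tmpfs";
  options = [ "defaults" "size=2G" "mode=755" ];
};
```

Everything that should survive a reboot then has to live on an explicit persistent mount (e.g. via the Impermanence module).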
Then, optionally, with some extra work you can put /home in tmpfs too:
https://elis.nu/blog/2020/06/nixos-tmpfs-as-home/
That setup uses Home Manager, so maybe it's not for you, but it's worth mentioning if we're talking about making all state declarative and reproducible. You have to use the Impermanence module and set up some symlinks to permanent home folders on a different drive or partition. But for making all state on the system reproducible and declarative, this is the best way afaik.
Thanks, that's interesting. It allows one to stick to "regular Linux filesystems", which is probably a good thing.
True, I think it's a more elegant setup than the ZFS version. Why actively roll back to a snapshot when ephemeral memory will do it automatically on reboot?
That said I'll just mention that ZFS support on NixOS is like nothing else I've seen in Linux. ZFS is like a first-class citizen on NixOS, painless to configure and usually just works like any other filesystem.
https://old.reddit.com/r/NixOS/comments/ops0n0/big_shoutout_...
That depends on whether you are okay with chaos.
It appears that you are, so it is a suitable tool for you. Choose the right tool for the right job.
---
Docker is a poor choice for people who are interested in deterministic/reproducible builds.
I’m not sure exactly why this is being downvoted. It seems pretty fair to want your container builds not to fail because of the “chaos” of docker images and how much they change. This isn’t about the freedom to build how you want; it’s about securing your build pipelines so that they don’t break at 4am because docker only builds 99% of the time.
I’ll use docker, I like docker, but I can see the point of how it’s not necessarily advantageous if stability is your main goal.
I recently faced a similar hurdle with Nix, particularly when trying to run a .NET 8 AOT application. What initially seemed like it would be a simple setup spiraled into a plethora of issues, ultimately forcing me to back down. I found myself having to abandon the AOT method in favor of a more straightforward solution. To give credit where it's due, .NET AOT is relatively new and, as far as I know, still has several kinks that need ironing out. Nonetheless, I agree that, at least based on my experience, you need a solid understanding of the ins and outs before you can be reasonably productive using Nix.
.NET AOT really is not designed for deployment, in my experience - for example, the compilation is very hard to do in Nix-land, because a critical part of the compilation is to download a compiler from NuGet at build-time. It's archetypical of the thousand ways that .NET drives me nuts in general.
Give Fedora Atomic (immutable) a try. At this point I have pretty much played around with and used every distro package manager there is, and I have broken all of them in one way or another, even without doing anything exotic (pacman, I am looking at you). My Fedora Kinoite is still going strong even with adding/removing different layers, daily updates, and a rebase from Silverblue. Imho rpm-ostree will obsolete Nix.
How do you alter layering without a restart? Just have an immutable base and do other rpm-ostrees in containers? Is that what flatpak is up to?
I use both Docker and NixOS at work. I've never had any of the problems you describe above. Docker is fine; performance-wise it's not great on Macs. I love Nix because it's trivial to get something to install and behave the same across different machines.
Nix docs are horrible, but I've found that ChatGPT-4 is awesome at troubleshooting Nix issues.
I feel like 90% of the time I run into Nix issues, it's because I decided to do something "Not the Nix way."
I'm just here to give you points for the Discworld reference.
It has a bit of a learning curve that is worth it - it's an incredible tool.
That has been the case for as long as I can remember. I gave up on Nix about 5 years ago because of it, and apparently not much has changed on that front since then.
Maybe it won't be your cup of tea given your reference to Emacs, but there's guix if you want to try a saner alternative to nix.
Yes at this point I hope someone builds a friendlier version on top of Nix, so we can cleanly migrate completely away from it.
Just out of curiosity. What were you trying to do that didn't work?
You complain about the documentation, and the first thing I wonder is whether you’ve tried using one of the prominent chatbots, like ChatGPT or Claude, to help fill in the gaps in said documentation. Maybe an obvious thing to do around here, but I’ve found they fill documentation gaps really well. At the same time, Nix is so niche there might not have been enough information out there to feed into even ChatGPT’s model…