As a permanent "out of style" curmudgeon for the last ~15 years, I like that people are discovering that maybe VMs are in fact the best approach for a lot of workloads, and that the LXC cottage industry and Docker industrial complex, which grew up around solving problems they created themselves or that were solved decades ago, might need to take a hike.
Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).
Personally, FreeBSD Jails / Solaris Zones are the thing I like to dream is pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow. I haven't dug too deep into this in practice; maybe I'm afraid to learn the contrary, but I hope not.
Either way Docker is "fine" but WAY overused and overrated IMO.
As the person who created docker (well, before docker - see https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and compare to docker), I argued that it wasn't just good for containers, but could be used to improve VM management as well (i.e. a single VM per running image - see https://www.usenix.org/legacy/events/lisa11/tech/full_papers...)
I then went on to build a system with kubernetes that enabled one to run "kubernetes pods" in independent VMs - https://github.com/apporbit/infranetes (as well as create hybrid "legacy" VM / "modern" container deployments, all managed via kubernetes.)
- As a total aside (while I toot my own horn on the topic of papers I wrote or contributed to), note that this paper, which I reviewed and which originally used the term Pod for a running container - https://www.usenix.org/legacy/events/osdi02/tech/full_papers... - explains where Kubernetes got the term from.
I'd argue that FreeBSD Jails / Solaris Zones (Solaris Zones/ZFS inspired my original work) really aren't any more secure than containers on linux, as they all suffer from the same fundamental problem of the entire kernel being part of one's "tcb", so any security advantage they have is simply due to a lack of bugs, not a better design.
Would you say approaches like gvisor or nabla containers provide more/enough evolution on the security front? Or is there something new on the horizon that excites you more as a prospect?
gVisor basically works by intercepting all Linux syscalls and emulating a good chunk of the Linux kernel in userspace code. In theory this allows lowering the overhead per VM, and more fine-grained introspection and rate limiting / balancing across VMs, because not every VM needs to run its own kernel that only interacts with the environment through hardware interfaces. Interaction happens through the Linux syscall ABI instead.
From an isolation perspective it's not more secure than a VM, but less, because gVisor needs to implement its own security sandbox to isolate memory, networking, syscalls, etc., and still has to rely on the host kernel for various things.
It's probably more secure than containers though, because the kernel abstraction layer is separate from the actual host kernel and runs in userspace - if you trust the implementation... using a memory-safe language helps there. (Go)
The increased introspection capability would make it easier to detect abuse and to limit available resources at a more fine-grained level, though.
Note also that GVisor has quite a lot of overhead for syscalls, because they need to be piped through various abstraction layers.
I actually wonder how much "overhead" a VM actually has. i.e. for a linux kernel that doesn't do anything (say it just boots to an init that mounts /proc and every n seconds reads and prints /proc/meminfo), how much memory would the kernel actually be using?
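(To make that experiment concrete, here is a hedged sketch of such a do-nothing init - assuming you're willing to stuff a Python interpreter into the image; a static busybox script would be the more realistic version:)

    #!/usr/bin/env python3
    # Hypothetical minimal "init" for an idle-VM memory experiment: mount /proc,
    # then print a few /proc/meminfo lines every N seconds. Not a real init
    # (no reaping, no signal handling) - just enough to watch what the kernel
    # itself is using. Assumes a Python interpreter exists in the image.
    import subprocess
    import time

    INTERVAL = 5  # seconds between samples

    def main():
        # Mount procfs; ignore failure if the initramfs already mounted it.
        subprocess.run(["mount", "-t", "proc", "proc", "/proc"], check=False)
        wanted = ("MemTotal", "MemFree", "MemAvailable", "Slab")
        while True:
            with open("/proc/meminfo") as f:
                for line in f:
                    if line.startswith(wanted):
                        print(line.rstrip())
            print("---")
            time.sleep(INTERVAL)

    if __name__ == "__main__":
        main()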
So if processes in gvisor map to processes on the underlying kernel, I'd agree it gives one a better ability to introspect (at least in an easy manner).
It gives me an idea that I think would be interesting (I think this has been done, but it escapes me where): a tool that is external to the VM (runs on the hypervisor host) and essentially has "read only" access to the kernel running in the VM, to provide visibility into what's running on the machine without an agent running within the VM itself. i.e. something that knows where the process list is, and can walk it to enumerate what's running on the system.
I can imagine the difficulties in implementing such a thing (especially on a multi-CPU VM), where even if you could snapshot the kernel memory state efficiently, it would be difficult to do it in a manner that provided a "safe/consistent" view. It might be interesting if the kernel itself could make a hypercall into the hypervisor at points of consistency (say, when it has finished making an update and is about to unlock the resource) to tell the tool when the data can be collected.
What I really want is a "magic" shell on a VM - i.e. the ability using introspection calls to launch a process on the VM which gives me stdin/stdout, and is running bash or something - but is just magically there via an out-of-band mechanism.
Not really "out of band", but many VMs allow you to setup a serial console, which is sort of that, albeit with a login, but in reality, could create one without one, still have to go through hypervisor auth to access it in all cases, so perhaps good enough for your case?
Indeed, easy enough to get a serial device on Xen.
Another possibility could be to implement a simple protocol which uses the xenstore key/value interface to pass messages between host and guest?
You can launch KVM/qemu with screen + text console, and just log in there. You can also configure KVM to have a VNC session on launch, and that ... while graphical, is another eye into the console + login.
(Just mentioning two ways without serial console to handle this, although serial console would be fine.)
https://github.com/Wenzel/pyvmidbg
more: https://github.com/topics/virtual-machine-introspection
thanks, kvm-vmi is basically an expansive version of what I was imagining (maybe I read about it before - as noted, I thought it existed).
Would you recommend ZFS as a building block for modern "VLFS"?
https://blog.chlc.cc/p/docker-and-zfs-a-tough-pair/
Not quite what you are after, but comes close ... you could run gdb on the kernel in this fashion and inspect, pause, step through kernel code: https://stackoverflow.com/questions/11408041/how-to-debug-th....
You don't necessarily need to run a full operating system in your VM. See eg https://mirage.io/
There's already some memory sharing available using DAX in Kata Containers at least: https://github.com/kata-containers/kata-containers/blob/main...
Go is not memory safe even when the code has no unsafe blocks, although with typical usage and sufficient testing memory-safety bugs are avoided. If one needs a truly memory-safe language, then use Rust, Java, C#, etc.
This sounds vaguely like the forgotten Linux a386 (not i386!).
I've been out of the space for a bit (though I'm interviewing again, so I might get back into it), but gvisor, at least as the "userspace" hypervisor, seemed to provide minimal value vs modern hypervisor systems with low-overhead / quick-boot VMs (ala firecracker). With that said, I only looked at it years ago, so I could very well be out of date on it.
Wasn't aware of Nabla, but they seem to be going with the unikernel approach (based on a cursory look at them). Unikernels have been "popular" (i.e. multiple attempts) in the space (mostly to basically run a single-process app without any context switches), but they create a process that is fundamentally different from what you develop and is therefore harder to debug.
While unikernels might be useful in the high-frequency trading space (where any time savings are highly valued), I'm personally more skeptical of them in regular-world usage (and to an extent, I think history has borne this out, as it doesn't feel like any of the attempts at it has gotten real traction).
Modern gVisor uses KVM, not ptrace, for this reason.
So I did a check: it would seem that gvisor with KVM mostly works on bare metal, not on existing VMs (nested virtualization).
https://gvisor.dev/docs/architecture_guide/platforms/
"Note that while running within a nested VM is feasible with the KVM platform, the systrap platform will often provide better performance in such a setup, due to the overhead of nested virtualization."
I'd argue then that for most people (unless you have your own bare-metal hyperscaler farm), one would end up using gvisor without KVM - but I'm speaking from a place of ignorance here, so feel free to correct me.
I think many people (including unikernel proponents themselves) vastly underestimate the amount of work that goes into writing an operating system that can run lots of existing prod workloads.
There is a reason why Linux is over 30 years old and basically owns the server market.
As you note, since it's not really a large existing market you basically have to bootstrap it which makes it that much harder.
We (nanovms.com) are lucky enough to have enough customers that have helped push things forward.
For the record I don't know of any of our customers or users that are using them for HFT purposes - something like 99% of our crowd is on public cloud with plain old webapp servers.
I picked the name and wrote the first prototype (python2) of Docker in 2012. I had not read your document (dated 2010). I didn't really read English that well at the time; I probably wouldn't have been able to understand it anyway.
https://en.wikipedia.org/wiki/Multiple_discovery
More details for the curious: I wrote the design doc and implemented the prototype. But not in a vacuum. It was a lot of work with Andrea, Jérôme and Gabriel. Ultimately, we all liked the name Docker. The prototype already had the notion of layers, lifetime management of containers and other fundamentals. It exposed an API (over TCP with zerorpc). We were working on container orchestration, and we needed a daemon to manage the life cycle of containers on every machine.
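(Roughly the shape of such a daemon, as a hedged sketch - this is not the original prototype; the method names and their trivial bodies are made up, and only the zerorpc Server/bind/run usage is the real library API:)

    # A rough sketch of a per-machine container-lifecycle daemon exposed over
    # TCP with zerorpc. Hypothetical methods; not the dotCloud prototype.
    import zerorpc

    class ContainerManager:
        def __init__(self):
            self.containers = {}  # container id -> metadata

        def create(self, image, layers):
            cid = "c%d" % len(self.containers)
            self.containers[cid] = {"image": image, "layers": layers, "state": "created"}
            return cid

        def start(self, cid):
            # A real daemon would assemble the layered filesystem and exec here.
            self.containers[cid]["state"] = "running"
            return True

        def stop(self, cid):
            self.containers[cid]["state"] = "stopped"
            return True

        def list(self):
            return self.containers

    if __name__ == "__main__":
        server = zerorpc.Server(ContainerManager())
        server.bind("tcp://0.0.0.0:4242")
        server.run()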
I'd note I didn't say you copied it, just that I created it first (i.e. "compare paper to docker"). Also, as you note, it's possible someone else did it too, but at least my conception got through academic peer review / the patent office (yeah, there's a patent - to my knowledge no one has ever attempted to enforce it).
when I describe my work (I actually should have used quotes here), I generally give air quotes when saying it, or say "proto docker", as it provides context for what I did (there's also a lot of people who view docker as synonymous with containerization as a whole, and I say that containers existed way before me). I generally try to approach it humbly, but I am proud that I predicted and built what the industry seemingly needed (or at least is heavily using).
People have asked me why I didn't pursue it as a company, and my answer is a) I'm not much of an entrepreneur (main answer), and b) I felt it was a feature, not a "product", and would therefore only really be profitable for those that had a product that could use it as a feature (and one could argue that product turned out to be clouds, i.e. they are the ones really making money off this feature). Or, as someone once said, a feature isn't necessarily a product and a product isn't necessarily a company.
I understood your point. I wanted to clarify, and in some ways connect with you.
At the time, I didn't know what I was doing. Maybe my colleagues did some more, but I doubt that. I just wanted to stop waking up at night because our crappy container management code was broken again. The most brittle part was the lifecycle of containers (and their filesystem). I recall being very adamant about the layered filesystem, because it allowed sharing storage and RAM across running (containerized) processes. This saves in pure storage and RAM usage, but also in CPU time, because the same code (like the libc, for example) is cached across all processes. Of course this only works if you have a lot of common layers. But I remember at the time, it made for very noticeable savings. Anyways, fun tidbits.
I wonder how much faster/better it would have been if inspired by your academic research. Or maybe not knowing anything made it so we solved the problems at hand in order. I don't know. I left the company shortly after. They renamed to Docker, and made it what it is today.
they did it "simpler", i.e. academic work has to be "perfect" in a way a product does not. so (from my perspective), they punted the entire concept of making what I would refer to as a "layer aware linux distribution" and just created layers "on demand" (via RUN syntax of dockerfiles).
From an academic perspective, its "terrible", so much duplicate layers out in the world, from a practical perspective of delivering a product, it makes a lot of sense.
It's also simpler from the fact that I was trying to make it work for both what I call "persistent" containers (ala pets in the terminology) that could be upgraded in place and "ephemeral" containers (ala cattle) when in practice the work to enable upgrading in place (replacing layers on demand) to upgrade "persistent" containers I'm not sure is that useful (its technologically interesting, but that's different than useful).
My argument for this was that it actually improves runtime upgrading of systems. With dpkg/rpm, if you upgrade libc, your system is temporarily in a state where it can't run any applications (in the delta of time when the old libc .so is deleted and the new one is created in its place, or completely overwrites it); any program that attempts to run in that (very) short period of time will fail (due to libc not really existing). By having a mechanism where layers can be swapped in an essentially atomic manner, no delete / overwrite of files occurs and therefore there is zero time when programs won't run.
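(A toy sketch of the property being described, at the single-file level rather than the layer level: an atomic rename never leaves a window where the path is missing, while delete-then-recreate does. The filename is just a stand-in:)

    # Toy demonstration: a reader thread keeps opening a "library" file while
    # the main thread replaces it either non-atomically (unlink, then write)
    # or atomically (write a temp file, then os.rename). Only the first way
    # ever exposes a moment where the file doesn't exist.
    import os
    import threading
    import time

    PATH = "libc.so.fake"   # hypothetical stand-in for a shared library
    misses = 0

    def reader(stop):
        global misses
        while not stop.is_set():
            try:
                with open(PATH, "rb"):
                    pass                # simulates the loader finding the library
            except FileNotFoundError:
                misses += 1             # the window delete-then-recreate exposes

    def replace_non_atomic():
        os.unlink(PATH)                 # old version gone...
        time.sleep(0.001)               # ...and briefly nothing is there
        with open(PATH, "wb") as f:
            f.write(b"new version")

    def replace_atomic():
        with open(PATH + ".tmp", "wb") as f:
            f.write(b"new version")
        os.rename(PATH + ".tmp", PATH)  # atomic: readers see old or new, never neither

    if __name__ == "__main__":
        for swap in (replace_non_atomic, replace_atomic):
            with open(PATH, "wb") as f:
                f.write(b"old version")
            misses = 0
            stop = threading.Event()
            t = threading.Thread(target=reader, args=(stop,))
            t.start()
            for _ in range(50):
                swap()
            stop.set()
            t.join()
            print(f"{swap.__name__}: reader hit a missing file {misses} times")
            os.unlink(PATH)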
In practice, the fact that a real world product came out with a very similar design/implementation makes me feel validated (i.e. a lot of phd work is one offs, never to see the light of day after the papers for it are published).
Would you consider there to be any 'layer-aware Linux distributions' today, e.g., NixOS, GuixSD, rpm-ostree-based distros like Fedora CoreOS, or distri?
Have you seen this, which lets existing container systems understand a Linux package manager's packages as individual layers?
https://github.com/pdtpartners/nix-snapshotter
(Not GP.)
NixOS can share its Nix store with child (systemd-nspawn) containers. That is, if you go all in, package everything using Nix, and then carefully ensure you don’t have differing (transitive build- or run-time) dependency versions anywhere, those dependencies will be shared to the maximum extent possible. The amount of sharing you actually get matches the effort you put into making your containers use the same dependency versions. No “layers”, but still close to what you’re getting at, I think.
On the other hand, Nixpkgs (which NixOS is built on top of) doesn’t really follow a discipline of minimizing package sizes to the extent that, say, Alpine does. You fairly often find documentation and development components living together with the runtime ones, especially for less popular software. (The watchword here is “closure size”, as in the size of a package and all of its transitive runtime dependencies.)
Yep. I remember before Nix even had multi-output derivations! I once broke some packages trying to reduce closure sizes when that feature got added, too. :(
Besides continuing to split off more dev and doc outputs, it'd be cool if somehow Nixpkgs had a `pkgsForAnts` just like it has a `pkgsStatic`, where packages just disable more features and integrations. On the other hand, by the time you're really optimizing your Nix container builds it's probably well worth it to use overrides and build from source anyway, binary cache be damned.
I'll try to get back to this to give a proper response, but can't promise.
I like to say that Docker wouldn’t exist if the Python packaging and dependency management system weren’t complete garbage. You can draw a straight line from “run Python” to dotCloud to Docker.
Does that jibe with your experience/memory at all? How much of your motivation for writing Docker could have been avoided if there were a sane way to compile a Python application into a single binary?
It’s funny, this era of dotCloud type IaaS providers kind of disappeared for a while, only to be semi-revived by the likes of Vercel (who, incidentally, moved away from a generic platform for running containers, in favor of focusing on one specific language runtime). But its legacy is containerization. And it’s kind of hard to imagine the world without containers now (for better or worse).
I do not think the mess of dependency management in Python got us to Docker/containers. Rather Docker/containers standardized deploying applications to production. Which brings reproducibility without having to solve dependency management.
Long answer with context follows.
I was focused on container allocation and lifecycle. So my experience, recollection, and understanding of what we were doing is biased with this in mind.
dotCloud was offering a cheaper alternative to virtual machines. We started with pretty much full Linux distributions in the containers. I think some still had a /boot with the unused Linux kernel in there.
I came to the job with some experience deploying Linux at scale quickly, by preparing images with chroot before making a tarball to then distribute over the network (via multicast from a seed machine) with a quick grub update. This was for quickly installing Counter-Strike servers for tournaments in Europe. In those days it was one machine per game server. I was also used to running those tarballs as virtual machines for thorough testing. To save storage space on my laptop at the time, I would hard-link together all the common files across my various chroot directories. I would only tarball to ship it out.
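(A sketch of that hard-link trick, for the curious - not the original scripts; the chroot paths are placeholders, both trees must live on the same filesystem, and files that later get modified in place would need copy-on-write handling instead:)

    # Replace byte-identical copies in a second chroot tree with hard links to
    # the files in a reference tree, to save disk space. Illustrative only.
    import filecmp
    import os

    BASE = "/srv/chroots/base"            # hypothetical reference tree
    OTHER = "/srv/chroots/gameserver42"   # hypothetical second tree

    def dedupe(base, other):
        saved = 0
        for root, _dirs, files in os.walk(base):
            for name in files:
                src = os.path.join(root, name)
                dst = os.path.join(other, os.path.relpath(src, base))
                if os.path.islink(src) or not os.path.isfile(dst) or os.path.islink(dst):
                    continue
                if os.stat(src).st_ino == os.stat(dst).st_ino:
                    continue  # already hard-linked
                if filecmp.cmp(src, dst, shallow=False):
                    os.unlink(dst)
                    os.link(src, dst)  # replace the duplicate with a hard link
                    saved += os.path.getsize(src)
        return saved

    if __name__ == "__main__":
        print("saved roughly %d bytes" % dedupe(BASE, OTHER))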
It turned out my counter-strike tarballs from 2008 would run fine as containers in 2011.
The main competition was Heroku. They did not use containers at the beginning. And they focused on running one language stack very well. It was Ruby and a database I forget.
At dotCloud we could run anything. And we wanted to be known for serving everything. All your languages, not just one. So early on we started offering ready-made base images for specific languages and databases. It was so much work to support. We had a few base images per team member to maintain, while still trying to develop the platform.
The layered filesystem was to pack resources more efficiently on our servers. We definitely liked that it saved build time on our laptops when testing (we still had small and slow spinning disks in 2011).
So I wouldn't say that Docker wouldn't exist without the mess of dependency management in software. It just happened to offer a standardized interface between application developers, and the person running it in production (devops/sre).
The fact you could run the container on your local (Linux) machine was great for testing. Then people realized they could work around dependency hell and non reproducible development environment by using containers.
I’m really confused. Solomon Hykes is typically credited as the creator of Docker. Who are you? Why is he credited if someone else created it?
This is the internet and just about everyone could be diagnosed with Not Invented Here syndrome. First one to get recognition for creating something that's already been created is just a popular meme.
He means he came up with the same concept, not literally created Docker.
Solomon was the CEO of dotCloud. Sébastien the CTO. I (François-Xavier "bombela") was one of the first software engineer employee. Along with Andrea, Jérôme and Louis with Sam as our manager.
When it became clear that we had reached the limits of our existing buggy code, I pushed hard to get to work on the replacement. After what seemed an eternity pitching Solomon, I was finally allowed to work on it.
I wrote the design doc of Docker with the help of Andrea, Jérôme, Louis and Gabriel. I implemented the prototype in Python. To this day, three of us will still argue about who really chose the name Docker. We are very good friends.
Not long after, I left the company. Because I was underpaid, I could barely make ends meet at the time. I had to borrow money to see a doctor. I did not mind, it's the start-up life, am I right? I worked 80h/week, happy. But then I realized not everybody was underpaid. And other companies would pay me more for less work. When I asked, Solomon refused to pay me more, and after being denied three times, I quit. I never got any shares. I couldn't afford to buy the options anyway, and they had delayed the paperwork multiple times, such that I technically quit before the vesting started. I went to Google, where they showered me with cash in comparison. The next morning after my departure from dotCloud, Solomon raised everybody's salary. My friends took me to dinner to celebrate my sacrifice.
I am not privy to all the details after I left. But here is what I know. Andrea rewrote Docker in Go. It was made open source. Solomon even asked me to participate as an external contributor. For free of course. As a gesture to let me add my name to the commit history. Probably the biggest insult I ever received in life.
dotCloud was renamed Docker. The original dotCloud business was sold to a German company for the customers.
I believe Solomon saw the potential of Docker for all, and not merely an internal detail within a distributed system designed to orchestrate containers. My vision was extremely limited, and focused on reducing the suffering of my oncall duties.
A side story: the German company transferred to me the trademark of zerorpc, the open source network library powering dotCloud. I had done a lot of work on it. Solomon refused to hand over the empty github/zerorpc group he was squatting. He offered to grant me access but retain control. I went for github/0rpc instead. I did not have the time nor money to spend on a lawyer.
By this point you might think I have a vendetta against a specific individual. I can assure you that I tried hard to paint things fairly with a flattering light.
Feels like maybe there is some correlation.
I believe Google embarked on this path with Crostini for ChromiumOS [0], but now it seems like they're going to scale down their ambitions in favour of Android [1]. Crostini may not survive, but it looks like the underlying VMM (crosvm) might live on [2].
Jails (or an equivalent concept/implementation) come in handy where the Kernel/OS may want to sandbox higher privilege services (like with minijail in ChromiumOS [3]).
[0] https://www.youtube.com/watch?v=WwrXqDERFm8&t=300 / summary: https://g.co/gemini/share/41a794b8e6ae (mirror: https://archive.is/5njY1)
[1] https://news.ycombinator.com/item?id=40661703
[2] https://source.android.com/docs/core/virtualization/virtuali...
[3] https://www.chromium.org/chromium-os/developer-library/guide...
And also CPU branch prediction state, RAM chips, etc. The side-channels are legion.
My infra is exactly this: K8s-managed containers that manage qemu VMs. Every VM has its own management environment, they don’t ever see each other, and they work just the same as using virt-manager, but I get infinite flexibility in my env provisioning before I start a VM that gets placed in its isolated tenant network.
For me it's about the ROAC property (Runs On Any Computer). I prefer working with stuff that I can run. Running software is live software, working software, loved software. Software that only works in weird places is bad, at least for me. Docker is pretty crappy in most respects, but it has the ROAC going for it.
I would love to have a "docker-like thing" (with ROAC) that used VMs, not containers (or some other isolation tech that works). But afaik that thing does not yet exist. Yes, there are several "container tool, but we made it use VMs" projects (firecracker and downline), but they all need weirdo special setup, won't run on my laptop, or won't run on a generic DigitalOcean VM.
Vagrant / Packer?
With all the mind share that terraform gets, you would think vagrant would at least be known, but alas.
Somebody educate me about the problem Packer would solve for you in 2024?
Making machine images. AWS calls them AMIs. Whatever your platform, that's what it's there for. It's often combined with Ansible, and basically runs like this:
1. Start a base image of Debian / Ubuntu / whatever – this is often done with Terraform.
2. Packer types a boot command after power-on to configure whatever you'd like
3. Packer manages the installation; with Debian and its derivatives, this is done mostly through the arcane language of preseed [0]
4. As a last step, a pre-configured SSH password is set, then the new base VM reboots
5. Ansible detects SSH becoming available, and takes over to do whatever you'd like.
6. Shut down the VM, and create clones as desired. Manage ongoing config in a variety of ways – rolling out a new VM for any change, continuing with Ansible, shifting to Puppet, etc.
[0]: https://wiki.debian.org/DebianInstaller/Preseed
This is nice in its uniformity (same tool works for any distro that has an existing AMI to work with), but it's insanely slow compared to just putting a rootfs together and uploading it as an image.
I think I'd usually rather just use whatever distro-specific tools for putting together a li'l chroot (e.g., debootstrap, pacstrap, whatever) and building a suitable rootfs in there, then finish it up with amazon-ec2-ami-tools or euca2ools or whatever and upload directly. The pace of iteration with Packer is just really painful for me.
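(A hedged sketch of that flow, wrapping debootstrap from Python - the suite, mirror, and target paths are placeholders, and real use also wants /proc, /dev, and a resolv.conf set up inside the chroot before installing packages:)

    # Build a minimal Debian rootfs and pack it into a tarball; turning that
    # into an AMI/disk image is left to whatever image tooling you prefer.
    import subprocess

    TARGET = "/tmp/rootfs"                       # placeholder build directory
    SUITE = "bookworm"                           # placeholder Debian suite
    MIRROR = "https://deb.debian.org/debian"     # placeholder mirror

    def build_rootfs():
        # Populate a minimal Debian tree.
        subprocess.run(["debootstrap", SUITE, TARGET, MIRROR], check=True)
        # Install whatever the image needs inside the new tree.
        subprocess.run(["chroot", TARGET, "apt-get", "install", "-y", "openssh-server"],
                       check=True)
        # Pack it up; uploading/converting it to a machine image happens elsewhere.
        subprocess.run(["tar", "-C", TARGET, "-cf", "rootfs.tar", "."], check=True)

    if __name__ == "__main__":
        build_rootfs()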
I haven’t played with chroot since Gentoo (which for me, was quite a while ago), so I may be incorrect, but isn’t that approach more limited in its customization? As in, you can install some packages, but if you wanted to add other repos, configure 3rd party software, etc. you’re out of luck.
Nah you can add other repos in a chroot! The only thing you can't really do afaik is test running a different kernel; for that you've got to actually boot into the system.
If you dual-boot multiple Linux systems you can still administer any of the ones you're not currently running via chroot at any time, and that works fine whether you've got third-party repositories or not. A chroot is also what you'd use to reinstall the bootloader on a system where Windows has nuked the MBR or the EFI vars or whatever.
There might be some edge cases - like software that requires a physical hardware token for licensing and is very aggressive about it, so it might also try to check whether it's running in a chroot, container, or VM and refuse to play nice, or something like that. But generally you can do basically anything in a chroot that you might do in a local container, and 99% of what you might do in a local VM.
I miss saltstack. I did that whole litany of steps with one tool plus preseed.
What's a better way to make VM images?
There's lots of tools in this space. I work on https://github.com/systemd/mkosi for example.
I think the thread is more about how docker was a reaction to the vagrant/packer ecosystem, which was deemed overweight but was in many ways a "docker-like thing", just with VMs.
Oh, yeah, I'm not trying to prosecute, I've just always been Packer-curious.
Wouldn't work here; they have software on each VM that cannot be reimaged. To use Packer properly, you should treat VMs like you do stateless pods: just start a new one and take down the old one.
Sure, then throw Ansible over the top for configuration/change management. Packer gives you a solid base for repeatable deployments. Their model was to ensure that data stays within the VM, for which a deployed AMI made with Packer would fit the bill quite nicely. If they need to do per-client configuration, then Ansible or even AWS SSM could fit the bill there once the EC2 instance is deployed.
For data persistence, if they need to upgrade / replace VMs, have a secondary EBS volume mounted which solely stores the persistent data for the account.
Yeah that's kind of a crummy tradeoff.
Docker is "Runs on any Linux, mostly, if you have a new enough kernel" meaning it packages a big VM anyway for Windows and macOS
VMs are "Runs on anything! ... Sorta, mostly, if you have VM acceleration" meaning you have to pick a VM software and hope the VM doesn't crash for no reason. (I have real bad luck with UTM and VirtualBox on my Macbook host for some reason.)
All I want is everything - An APE-like program that runs on any OS, maybe has shims for slightly-old kernels, doesn't need a big installation step, and runs any useful guest OS. (i.e. Linux)
The modern developer yearns for Java
I had to use eclipse the other day. How the hell is it just as slow and clunky as I remember from 20 years ago? Does it exist in a pocket dimension where Moore's Law doesn't apply?
I think it's pretty remarkable to see any application in continuous use for so long, especially with so few changes[0] -- Eclipse must be doing something right!
Maintaining (if not actively improving/developing) a piece of useful software without performance degradation -- that's a win.
Keeping that up for decades? That's exceptional.
[0] "so few changes": I'm not commenting on the amount of work done on the project or claiming that there is no useful/visible features added or upgrades, but referring to Eclipse of today feeling like the same application as it always did, and that Eclipse hasn't had multiple alarmingly frequent "reboots", "overhauls", etc.
[?] keeping performance constant over the last decade or two is a win, relatively speaking, anyway
Not accounting for Moore's Law, yikes. Need a comparison adjusted for "today's dollars".
I agree, that you've pointed it out to me makes it obvious that this is not the norm, and we should celebrate this.
I'm reminded of Casey Muratori's rant on Visual Studio; a program that largely feels like it hasn't changed much but clearly has regressed in performance massively; https://www.youtube.com/watch?v=GC-0tCy4P1U
It's not as slow as it was 20 years ago.
It's only as slow as you remember because the actual performance was so bad that you can't physiologically remember it.
That's not Java's fault though. IntelliJ IDEA is also built on Java and runs just fine.
Java's ecosystem is just as bad. Gradle is insanely flexible but people create abominations out of it, Maven is extremely rigid so people resort to even worse abominations to get basic shit done.
Maybe just the JVM.
New enough kernel means CentOS 5 is right out. But it's been a decade, it'll run on anything vaguely sensible to be running today
Docker means your userspace program carries all its userspace dependencies with it and doesn't depend on the userspace configuration of the underlying system.
What I argued in my paper is that systems like docker (i.e. what I created before it) improve over VMs (and even Zones/ZFS) in their ability to really run ephemeral computation, i.e. if it takes microseconds to set up the container file system, you can run a boatload of heterogeneous containers even if they only need to run for very short periods of time. Solaris Zones/ZFS didn't lend itself to heterogeneous environments, but simply to cloning a single homogeneous environment, while VMs not only suffered from that problem, they also (at least at the time; much improved as of late) required a reasonably long bootup time.
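(A hedged illustration of the "container filesystem in almost no time" point: a writable root assembled from a shared read-only image is just an overlayfs mount, not a copy. The image path is a placeholder, the timing will vary, and it needs root on Linux:)

    # Assemble a writable container root from a shared read-only lower layer
    # with overlayfs, and time how long the setup takes. Illustrative only.
    import subprocess
    import tempfile
    import time

    LOWER = "/srv/images/debian-rootfs"   # hypothetical shared read-only image

    def make_container_root(lower):
        upper = tempfile.mkdtemp(prefix="upper-")
        work = tempfile.mkdtemp(prefix="work-")
        merged = tempfile.mkdtemp(prefix="merged-")
        opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
        subprocess.run(["mount", "-t", "overlay", "overlay", "-o", opts, merged],
                       check=True)
        return merged

    if __name__ == "__main__":
        start = time.perf_counter()
        root = make_container_root(LOWER)
        elapsed = time.perf_counter() - start
        print(f"writable container root at {root} in {elapsed * 1000:.2f} ms")
        subprocess.run(["umount", root], check=True)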
Docker’s the best cross-distro rolling-release package manager and init system for services—staying strictly out of managing the base system, which is great—that I know of. I don’t know of anything that’s even close, really.
All the other stuff about it is way less important to me than that part.
This is wrong in pretty much every way I can imagine.
Docker's not a package manager. It doesn't know what packages are, which is part of why the chunks that make up Docker containers (image layers) are so coarse. This is also part of why many Docker images are so huge: you don't know exactly the packages you need, strictly speaking, so you start from a whole OS. This is also why your Dockerfiles all invoke real package managers— Docker can't know how to install packages if it doesn't know what they are!
It's also not cross-platform, or at least 99.999% of images you might care about aren't— they're Linux-only.
It's also not a service manager, unless you mean docker-compose (which is not as good as systemd or any number of other process supervisors) or Docker Swarm (which has lost out to Kubernetes). (I'm not sure what you even mean by 'init system for containers' since most containers don't include an init system.)
There actually are cross-platform package managers out there, too. Nix, Pkgsrc, Homebrew, etc. All of those I mentioned and more have rolling release repositories as well. ('Rolling release' is not a feature of package managers; there is no such thing as a 'rolling release package manager'.)
Nope! It’s not wrong in any way at all!
You’re thinking of how it’s built. I’m thinking of what it does (for me).
I tell it a package (image) to fetch, optionally at a version. It has a very large set of well maintained up-to-date packages (images). It’s built-in, I don’t even have to configure that part, though I can have it use other sources for packages if I want to. It fetches the package. If I want it to update it, I can have it do that too. Or uninstall it. Or roll back the version. I am 100% for-sure using it as a package manager, and it does that job well.
Then I run a service with a simple shell script (actually, I combine the fetching and running, but I’m highlighting the two separate roles it performs for me). It takes care of managing the process (image, which is really just a very-fat process for these purposes). It restarts it if it crashes, if I like. It auto-starts it when the machine reboots—all my services come back up on boot, and I’ve never touched systemd (which my Debian uses), Docker is my interface to that and I didn’t even have to configure it to do that part. I’m sure it’s doing systemd stuff under the hood, at least to bring the docker daemon up, but I’ve never touched that and it’s not my interface to managing my services. The docker command is. Do I see what’s running with systemd or ps? No, with docker. Start, restart, or stop a service? Docker.
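(For illustration, the same two roles sketched with the Docker SDK for Python rather than shell scripts - the image name, tag, service name, and port below are placeholders, not anyone's actual setup:)

    # "Package manager" role: fetch a pinned image. "Service manager" role:
    # run it detached with a restart policy so it comes back after crashes
    # and reboots. Requires the docker SDK (pip install docker) and a daemon.
    import docker

    client = docker.from_env()

    # Fetch a specific version of the service.
    client.images.pull("nginx", tag="1.27")

    # Run it, restarting on crash and on boot unless explicitly stopped.
    container = client.containers.run(
        "nginx:1.27",
        name="web",
        detach=True,
        restart_policy={"Name": "unless-stopped"},
        ports={"80/tcp": 8080},
    )
    print(container.name, container.status)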
I’ve been running hobbyist servers at home (and setting up and administrating “real” ones for work) since 2000 or so and this is the smoothest way to do it that I’ve seen, at least for the hobbyist side. Very nearly the only roles I’m using Docker to fill, in this scenario, are package manager and service manager.
I don’t care how it works—I know how, but the details don’t matter for my use case, just the outcomes. The outcome is that I have excellent, updated, official packages for way more services than are in the Debian repos, that leave my base system entirely alone and don’t meaningfully interact with it, with config that’s highly portable to any other distro, all managed with a common interface that would also be the same on any other distro. I don’t have to give any shits about my distro, no “oh if I want to run this I have to update the whole damn distro to a new major version or else manually install some newer libraries and hope that doesn’t break anything”, I just run packages (images) from Docker, update them with Docker, and run them with Docker. Docker is my UI for everything that matters except ZFS pool management.
I specifically wrote cross-distro for this reason.
Docker “packages” have a broader selection and better support than any of those, as far as services/daemons go; it’s guaranteed to keep everything away from the base system and tidy for better stability; and it provides a common interface for configuring where to put files & config for easier and more-confident backup.
I definitely use it mainly as a package manager and service manager, and find it better than any alternative for that role.
I've read your reply and I hear you (now). But as far as I'm concerned package management is a little more than that. Not everything that installs or uninstalls software is a package manager-- for instance I would say that winget and Chocolatey are hardly package managers, despite their pretensions (scoop is closer). I think of package management, as an approach to managing software and as a technology, as generally characterized by things like and including: dependency tracking, completeness (packages' dependencies are themselves all packaged, recursively, all the way down), totality (installing software by any means other than the package manager is not required to have a practically useful system), minimal redundancy of dependencies common to multiple packages, collective aggregation and curation of packages, transparency (the unit the software management tool operates on, the package, tracks the versions of the software contained in it and the versions of the software contained in its dependencies), exclusivity (packaged software does not self-update; updates all come through the package manager), etc. Many of these things come in degrees, and many package managers do not have all of them to the highest degree possible. But the way Docker gets software running on your system just isn't meaningfully aligned with that paradigm, and this also impacts the way Docker can be used. I won't enumerate Docker's deviations from this archetype because it sounds like you already have plenty of relevant experience and knowledge.
When there's a vuln in your libc or some similar common dependency, Docker can't tell you about which of your images contains it because it has no idea what glibc or liblzma are. The whole practice of generating SBOMs is about trying to recover or regenerate data that is already easily accessible in any competent package manager (and indeed, the tools that generate SBOMs for container images depend on actual package managers to get that data, which is why their support comes distro-by-distro).
Managing Docker containers is also complicated in some ways that managing conventional packages (even in other containerized formats like Flatpak, Snap, and AppImage) isn't, in that you have to worry about bind mounts and port forwarding. How the software works leads to a radically different sort of practice. (Admittedly maybe that's still a bit distant from very broad outcomes like 'I have postgres running'.)
This is indeed a great outcome. But when you achieve it with Docker, the practice by means of which you've achieved it is not really a package management discipline but something else. And that is (sadly, to me) part of the appeal, right? Package management can be a really miserable paradigm when your packages all live in a single, shared global namespace (the paths on your filesystem, starting with /). Docker broke with that paradigm specifically to address that pain.
But that's not the end of the story! That same excellent outcome is also achievable by better package managers than ol' yum/dnf and apt! And when you go that route, you also get back the benefits of the old regime like the ability to tell what's on your system and easily patch small pieces of it once-and-for-all. Nix and Guix are great for this and work in all the same scenarios, and can also readily generate containers from arbitrary packages for those times you need the resource management aspect of containers.
For me, this is not a benefit. I think the curation, integration, vetting, and patching that coordinated software distributions do is extremely valuable, and I expect the average software developer to be much worse at packaging and systems administration tasks than the average contributor to a Linux distro is. To me, this feels like a step backwards into chaos, like apps self-updating or something like that. It makes me think of all the absolutely insane things I've seen Java developers do with Maven and Gradle, or entire communities of hobbyists who depend on software whose build process is so brittle and undocumented that seemingly no one knows how to build it and Docker has become the sole supported distribution mechanism.
My bad! Although that actually widens the field of contenders to include Guix, which is excellent, and arguably also Flatpak, which still aligns fairly well with package management as an approach despite being container-based.
I suppose this is an advantage of a decentralized authority-to-publish, like we also see in the AUR or many language-specific package repositories, and also of freedom from the burden of integration, since all docker image authors have to do is put together any runtime at all that runs. :-\
Ok. So you're just having dockerd autostart your containers, then, no docker-compose or Docker Swarm or some other layer on top? Does that even have a notion of dependencies between services? That feels like table stakes for me for 'good service manager'.
PS: thanks for giving a charitable and productive reply to a comment where I was way gratuitously combative about a pet peeve/hobby horse of mine for no good reason
Oh no, you’re fine, thanks for responding in kind, I get where you’re coming from now too. Maybe it’s clearer to label my use of it as a software manager or something like that. It does end up being my main interface for nearly everything that matters on my home server.
Like, I’m damn near running Docker/Linux, in the pattern of gnu/Linux or (as started as a bit of a joke, but is less so with each year) systemd/Linux, as far as the key parts that I interact with and care about and that complete the OS for me.
As a result, some docker alternatives aren’t alternatives for me—I want the consistent, fairly complete UI for the things I use it for, and the huge library of images, largely official. I can’t just use raw lxc or light VMs instead, as that gets me almost nothing of what I’m currently benefiting from.
I haven’t run into a need to have dependent services (I take the SQLite option for anything that has it—makes backups trivial) but probably would whip up a little docker-compose for if I ever need that. In work contexts I usually just go straight for docker-compose, but with seven or eight independent services on my home server I’ve found I prefer tiny shell scripts for each one.
[edit] oh, and I get what you mean about it not really doing things like solving software dependencies—it’s clearly not suitable as, like, a system package manager, but fills the role well enough for me when it comes to the high-level “packages” I’m intending to use directly.
This kind of uniformity of interface and vastness are among the things that made me fall in love with Linux package management as a kid, too. I can see how there's a vital similarity there that could inspire the language you used in your first comment.
Right, those will give you the isolation, but primarily what you want is the low commitment and uniform interface to just try something (and if it turns out to be good enough, why not leave it running like that for a few years).
I sometimes kind of enjoy packaging work, and even at work I sometimes prefer to build a container from scratch myself when we're doing container deployments, rather than using a vendor one. In fact we've got one ECS deployment where we're using a vendor container where the version of the software that I've got running locally through a package I wrote works fine, but the vendor container at that same version mysteriously chokes on ECS, so we've reverted to the LTS release of the vendor one. Building my own Nix package for that software involved some struggle: some reading of its build tooling and startup code, some attempts to build from source manually, some experiments with wrapping and modifying prebuilt artifacts, some experiments with different runtimes, etc. But it also taught me enough about the application that I am now prepared to debug or replace that mysteriously-busted vendor container before that deployment moves to production.
At the same time, I don't think that pain would be worth it for me for a homelab deployment of this particular software. At work, it's pretty critical stuff so having some mastery of the software feels a bit more relevant. But if I were just hoping to casually try it at home, I'd have likely moved on or just resorted to the LTS image and diminished my opinion of upstream a little without digging deeper.
The (conceptually) lightweight service management point is well-taken. In systemd, which I generally like working with, you always have to define at least one dependency relation (e.g., between your new service and one of the built-in targets) just to get `systemctl enable` to work for it! On one level I like the explicitness of that, but on another I can understand seeing it as actually unnecessary overhead for some use cases.
When I read the parent comment, I was picturing package manager and init system in quotes. Docker is the "package manager" for people who don't want to package their apps with a real package manager. It's a "service manager" and "init system" that can restart your services (containers) on boot or when they fail.
Right, it gives me the key functionality of those systems in a way that’s decoupled from the base OS, so I can just run some version of Debian for five years or however long it’s got security updates, and not have to worry about things like services I want to run needing newer versions of lots of things than some old Debian has. Major version updates of my distro, or trying to get back ported newer packages in, have historically been the biggest single source of “fuck, now nothing works and there goes my weekend” running home servers, to the point of making ROI on them not so great and making me dread trying to add or update services I was running. This approach means I can install and run just about any service at any recent-ish version without touching the state of the OS itself. The underlying tech of docker isn’t directly important to me, nor any of the other features it provides, in this context, just that it’s separate from the base OS and gives me a way to “manage packages” and run services in that decoupled manner.
I agree with all this and use it similarly. I hate when updating the OS breaks my own apps, or does something annoying like updating a dependency (like Postgres)... Docker is perfect for this.
You're right. I read the commenter I was replying to very badly. In my later discussion with them we covered a bit better how Docker can cover some of the same uses as package managers as well as the continued vitality of package management in the era of containers. It was a much better conversation by the end than it was at the beginning, thanks to their patience and good faith engagement.
Clear Containers/Kata Containers/firecracker VMs showed that there isn't really a dichotomy here. Why we aren't all using HW assisted containers is a mystery.
It's not at all mysterious: to run hardware-virtualized containers, you need your compute hosted on a platform that will allow KVM. That's a small, expensive, tenuously available subset of AWS, which is by far the dominant compute platform.
So… Lambda, Fargate, and EC2. The only thing you can't really do this with is EKS.
Like, Firecracker was made by AWS to run containers on their global-scale KVM platform, EC2.
Lambda and Fargate are implementations of the idea, not a way for you yourself to do any kind of KVM container provisioning. You can't generally do this on EC2; you need special instances for it.
For a variety of reasons, I'm pretty familiar with Firecracker.
What am I missing? AWS offers (virtual) hardware-backed containers as a service; I would go so far as to say that a significant number of people are running VM-backed containers.
And I've been at a few shops where EC2 is used as the poor-man's-firecracker by building containers and then running 1(ish) per VM. AWS's architecture actively encourages this because that's by far the easiest security boundary to manipulate. The moment you start thinking about two privilege levels in the same VM you're mostly on your own.
The number of people running production workloads who, knowingly or not, believe that the security boundary is not between containers but between the vms enclosing those containers is probably almost everyone.
The parent isn't talking about e.g. EC2 as a virtualized platform, they're talking about EC2 not being a virtualization platform. With few machine-type exceptions, EC2 doesn't support nested virtualization -- you can't run e.g. KVM on EC2.
I think the argument is you need to be running nitro (I think, it’s been awhile?) instances to take advantage of kvm isolation
Engineers are lazy, especially Ops. Until it's easier to get up and running and there are tangible benefits, people won't care.
I've always hated the docker model of the image namespace. It's like those cloud-based routers you can buy.
Docker actively prevents you from having a private repo. They don't want you to point away from their cloud.
Redhat understood this and podman allows you to have a private docker infrastructure, disconnected from docker hub.
For my personal stuff, I would like to use "FROM scratch" and build my personal containers in my own ecosystem.
In what ways? I use private repos daily with no issues.
If you reference a container without a domain, you pull from docker.io
With podman, you can control this with the unqualified-search-registries setting in registries.conf;
with docker it's not possible (though you can hack mirrors): https://stackoverflow.com/questions/33054369/how-to-change-t...
So.. just use a domain. This seems like a nothing burger.
Not all dockerfiles (especially multi-stage builds) are easily sanitized for this.
think FROM python:latest or FROM ubuntu:20.04 AS build
They've put deliberate barriers in the way of using docker commands without accessing their cloud.
Or even easier: just fully qualify all images. With Podman:
nginx => docker.io/library/nginx
linuxserver/plex => docker.io/linuxserver/plex
Huh? In what way does docker prevent you from having a private repo? It's a couple of clicks to get one on any cloud.
One of the first things I did was set up my own container registry with Docker. It's not terribly difficult.
Docker is great, way overused 100%. I believe a lot of it started as "cost savings" on resource usage. Then it became the trendy thing for "scalability".
When home enthusiasts build multi container stacks for their project website, it gets a bit much.
Solves dependency version hell also
Solves it in the same sense that it's a giant lockfile. It doesn't solve the other half where updates can horribly break your system and you run into transitive version clashes.
Having been running a VPS with manually maintained services, moving to docker saved me a lot of headaches... things definitely break less often (almost never), and if they do it's quite easy to revert back to the previous version...
But at least you can revert back to the original configuration (as you can with VM, too).
It solves it in the sense that it empowers the devs to update their dependencies on their own time and ops can update the underlying infrastructure fearlessly. It turned a coordination problem into a non-problem.
It doesn't solve it, it makes it tractable so you can use the scientific method to fix problems as opposed to voodoo.
I don't know - docker has been a godsend for running my own stuff. I can get a docker-compose file working on my laptop, then run it on my VPS with a pretty high certainty that it will work. Updating has also (to date) been incredibly smooth.
Having run both at scale, I can confirm and assure you they are not as secure as VMs and did not produce sane devops workflows. Not that Docker is much better, but it is better from the devops workflow perspective, and IMHO that's why Docker "won" and took over the industry.
A sane DevOps workflow comes from declarative systems like NixOS or Guix System, definitely not from a VM infrastructure that in practice is regularly not up to date, full of useless deps, on a host that's also not up to date, with the entire infra typically neither much managed nor manageable, and with an immense attack surface...
VMs are useful for those who live on the shoulders of someone else (i.e. *aaS), which is anything but secure.
I'm not sure what you're referring to here?
Our cloud machines are largely VMs. Deployments mean building a new image and telling GCP to deploy that as machines come and go due to scaling. The software is up to date, dependencies are managed via ansible.
Maybe you think VMs means monoliths? That doesn't have to be the case.
That's precisely the case: instead of owning hardware, which per machine is a kind of monolith (even counting blades and other modular solutions), you deploy a full (or half-full) OS to run just a single service, on top of another "OS". Of course, yes, this is the cloud model, and it is also the ancient and deprecated mainframe model, with much more added complexity, no unique ownership, and an enormously big attack surface.
Various lessons learned prove that the cloud model is neither cheaper nor more reliable than owning iron; it's just fast, since you live on the shoulders of someone else. That speed you will pay for at an unknown point in time, when something happens and you have zero control over it.
DevOps, meaning the devs taking over ops without having the needed competences, is a modern recipe for failing digital ecosystems, and we witness that more and more with various "biblical outages": Roomba devices bricked due to an AWS mishap, cars of a certain vendor with a series of RCEs, payment system outages, ... A resilient infra is not a centrally managed decentralized infra; it's a vast and diverse ecosystem interoperating with open and standard tools and protocols. Classic mail or Usenet infra is resilient; GMail backed by Alphabet's infra is not.
What if Azure collapses tomorrow? What's the impact? What's the attack surface of living on the shoulders of someone else, typically much bigger than you and often in other countries, where getting even legal protection is costly and complex?
Declarative systems on iron mean you can replicate your infra ALONE on the iron; with VMs you need many more resources, you do not even know the entire stack of your infra, and you essentially can't replicate anything. VMs/images are still made the classic '80s-style semi-manual way, with some automation written by a dev who only knows how to manage his/her own desktop a bit, and others will use it carelessly because "it's easy to destroy and re-start". As a result we have seen production images with some unknown person's SSH authorized keys, because to be quick someone picked the first ready-made image from a Google search and added just a few things. We are near the level of crap of the dot-com bubble, with MUCH more complexity and weight.
(note .. use 'which' not 'witch', quite different words)
Not sure if you mentioned it, but cost and scaling is an absurd trick of AWS and others. AWS is literally 1000s, and in some usage cases even millions of times more expensive than your own hardware. Some believe that employee cost savings help here, but that's not even remotely close.
Scaling is absurd. You can buy one server worth $10k that can handle the equivalent of thousands upon thousands of AWS instances' workload. You can buy far cheaper servers ($2k each), colo them yourself, have failover capability, and even have multi-datacentre redundancy, immensely cheaper than AWS. 1000s of times cheaper. All with more power than you'd ever, ever, ever scale to at AWS.
All that engineering to scale, all that effort to containerize, all that reliance upon AWS and their support system.. unneeded. You can still run docker locally, or VMs, or just pound it out to raw hardware.
So on top of your "run it on bare metal" concept, there's the whole "why are you wasting time and spending money" for AWS, argument. It's so insanely expensive. I cannot repeat enough how insanely expensive AWS is. I cannot repeat enough how AWS scaling is a lie, when you don't NEED to scale using local hardware. You just have so much more power.
Now.. there is one caveat, and you touch on this. Skill. Expertise. As in, you have to actually not do Really Dumb Things, like write code that uses 1000s of times the CPU to do the same task, or write DB queries or schema that eat up endless resources. But of course, if you do those things on your own hardware, in DEV, you can see them and fix them.
If you do those in AWS, people just shrug, and pay immense sums of money and never figure it out.
I wonder, how many startups have failed due to AWS costs?
Thanks and sorry for my English, even if I use it for work I do not normally use it conversationally and as a result it's still very poor for me...
Well, I'm not specifically talking about AWS, but in general living on someone else's infrastructure is much more expensive in OPEX than what it can spare in CAPEX, and it's a deeply critical liability, especially when we start to develop against someone else's API instead of just deploying something "standard" we can always move unchanged.
Yes, technical debt is a big issue, but it's a relative issue, because if you can't maintain your own infra you can't be safe anyway; the "initial easiness" means a big disaster sooner or later, and the later it comes the more expensive it will be. Of course a one-person startup can't have offsite backups, geo-replication and so on on its own iron, but making the MINIMUM use of third-party services, trying to be as standard and vendor-independent as possible until you earn enough to own it, is definitely possible at any scale.
Unfortunately it's a thing we have almost lost, since ops essentially does not exist anymore except at a few giants, devs have no substantial skill since they came from "quick" full-immersion bootcamps where they learned just to do repetitive things with specific tools, like modern Ford assembly-line workers able only to turn a wrench, and most management still fails to understand IT for what it is - not "computers" (as telescopes are to astronomers) but information (as stars are to astronomers). This toxic mix has allowed a very few to reach hyper-big positions, but they are starting to collapse because their commercial model is technically untenable, and we are all starting to pay the high price.
VMs are useful when you don't own or rent dedicated hardware. Which is a lot of cases, especially when your load varies seriously over the day or week.
And even if you do manage dedicated servers, it's often wise to use VMs on them to better isolate parts of the system, aka limit the blast radius.
Which is a good recipe for paying much more while thinking you're being smart and paying less, for being tied to third parties' decisions for anything you run, for having a giant attack surface, and so on...
There are countless lessons on how owning hardware is cheaper than not, countless examples of "cloud nightmares", countless examples of why a system needs to be simple and securely designed from the start rather than merely "isolated", but people refuse to learn, especially since, for employees, living on someone else's shoulders means less work to do, and managers typically don't know even the basics of IT well enough to understand.
Honestly, it really doesn't matter whether it's VMs or Docker. The docker/container DX is so much better than VMWare/QEMU/etc. Make it easy to run workloads in VMs/Firecracker/etc and you'll see people migrate.
I mean, Vagrant was basically docker before docker. People used it. But it turns out that booting a full VM + kernel adds latency, which is undesirable for development workloads. The techniques used by firecracker could help, but I suspect the overhead of allocating a namespace and loading a process will always be less than even restoring from a frozen VM, so I wouldn't hold my breath on it swinging back in VMs' direction for developer workloads ever.
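For what it's worth, the latency gap is easy to feel for yourself; a rough sketch, assuming Docker and the alpine image are available locally, which you'd compare against a `vagrant up` or a microVM boot on the same box:

```python
import subprocess
import time

# Time the full "create an isolated environment, run a trivial command,
# tear it down" round trip for a namespace-based container.
start = time.monotonic()
subprocess.run(["docker", "run", "--rm", "alpine", "true"], check=True)
elapsed = time.monotonic() - start

print(f"container round trip: {elapsed:.2f}s")
# A full VM boot (Vagrant/VirtualBox) is typically tens of seconds;
# a microVM (Firecracker-style) lands somewhere in between.
```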
It would be interesting to see a microvm (kata/firecracker/etc.) version of vagrant. And open source, of course. I can't see any technical reason why it would be particularly difficult.
I don't think they're that valuable tbh. Outside of cases where you're running completely untrusted code or emulating a different architecture, there's no strong reason to pick any VM over one of the several container paradigms.
One more usecase - which I admit is niche - is that I want a way to run multiple OSs as transparently as possible. A microvm that boots freebsd in less than a second and that acts almost like a native container would be excellent for certain development work.
Edit: Actually it's not just cross-OS work in the freebsd/linux sense; it would also be nice for doing kernel dev work. Edit my linux kernel module, compile, and then spawn a VM that boots and starts running tests in seconds.
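That loop is already close to achievable with plain QEMU; a minimal sketch, assuming an already-configured kernel tree and a hypothetical test-initramfs.cpio.gz whose /init runs the module's tests:

```python
import subprocess

KERNEL = "arch/x86/boot/bzImage"       # freshly built kernel image
INITRD = "test-initramfs.cpio.gz"      # hypothetical initramfs whose /init runs the tests

# Rebuild the kernel/module (assumes the tree is already configured).
subprocess.run(["make", "-j8"], check=True)

# Boot the result in a throwaway VM. With a minimal config, virtio devices
# and KVM acceleration, this is typically a matter of seconds.
# panic=-1 plus -no-reboot makes QEMU exit if the kernel panics.
subprocess.run([
    "qemu-system-x86_64",
    "-enable-kvm",
    "-m", "512M",
    "-kernel", KERNEL,
    "-initrd", INITRD,
    "-append", "console=ttyS0 panic=-1",
    "-nographic",
    "-no-reboot",
], check=True)
```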
Yeah, there are definitely still cases, but they're when you specifically want/need a different kernel, which as you said are rather niche.
Oh they exist! Several of them in fact, they have never picked up a ton of steam though
Isn't this discussion based on a false dichotomy? I, too, use VMs to isolate customers, and I use containers within those VMs, either with or without k8s. These tools solve different problems. Containers solve software management, whereas VMs provide a high degree of isolation.
Container orchestration is where I see the great mistake in all of this. I consider everything running in a k8s cluster to be one "blast domain." Containers can be escaped. Faulty containers impact everyone relying on a cluster. Container orchestration is the thing I believe is "overused." It was designed to solve "hyper" scale problems, and it's being misused in far more modest use cases where VMs should prevail. I believe the existence of container orchestration and its misapplication has retarded the development of good VM tools: I dream of tools that create, deploy and manage entire VMs with the same ease as Docker, and that these tools have not matured and gained popularity because container orchestration is so easily misapplied.
Strongly disagree about containers and dev/deployment ("NOT check"). I can no longer imagine development without containers: it would be intolerable. Container repos are a godsend for deployment.
As a relatively early corporate adopter of k8s, this is absolutely correct. There are problems where k8s is actually easier than building the equivalent capability elsewhere, but a lot of uses it's put to seem to be driven more by a desire to have kubernetes on one's resume.
It’s no wonder people feel compelled to do this, given how many employers expect experience with k8s from applicants.
Kubernetes is a computer worm that spreads via resumes.
For what k8s was designed to do -- herding vast quantities of ephemeral compute resources across a global network -- it's great. That's not my problem with it. My problem is that by being widely misapplied it has stunted the development of good solutions to everything else. K8s users spend their efforts trying to coax k8s to do things it was never intended to do, and so the k8s "ecosystem" has spiraled into this duplicative, esoteric, fragile, and costly bazaar of complexity and overengineering.
Which language do you develop in?
In no particular order; Python, Go, Perl, C, Java, Ruby, TCL, PHP, some proprietary stuff, all recently (last 1-2 years) and in different versions: Java: 8, 11 and 17, for example. Deployed to multiple environments at multiple sites, except the C, which is embedded MCU work.
Even then, IMO, it makes little sense. It would only be useful if unused containers wasted a lot of resources, or if you could get an unlimited number of them from somewhere.
But no: just creating all the containers you can and leaving them there wastes almost nothing, they are limited by the hardware you have or rent, and the things clouds rent out are either full VMs or specialized single-application sandboxes.
AFAIK, containers solve the "how do I run both this PHP 7 app and this PHP 8 app on my web server?" problem, and not much more.
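For what it's worth, that particular problem really is a one-liner per app; a minimal sketch using the Docker SDK for Python, where the image tags and host paths are assumptions standing in for your own apps:

```python
import docker

client = docker.from_env()

# Two PHP runtimes, side by side on one box, each serving its own app.
client.containers.run(
    "php:7.4-apache",                   # assumed legacy image tag
    name="legacy-app",
    detach=True,
    ports={"80/tcp": 8081},             # host port 8081 -> container port 80
    volumes={"/srv/legacy": {"bind": "/var/www/html", "mode": "ro"}},
)
client.containers.run(
    "php:8.3-apache",                   # assumed current image tag
    name="new-app",
    detach=True,
    ports={"80/tcp": 8082},
    volumes={"/srv/new": {"bind": "/var/www/html", "mode": "ro"}},
)
```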
What do you think of Nix/NixOS?
But that comes _after_ you have chosen VMs over Containers yes?
If you are using VMs, I think NixOs/Guix is a good choice. Reproducible builds, Immutable OS, Immutable binaries and Dead easy rollback.
It still looks somewhat futuristic. Hopefully gets traction.
If you're using NixOS just to do provisioning, I would argue OSTree is a better fit.
Nix is actually a really nice tool for building docker images: https://xeiaso.net/talks/2024/nix-docker-build/
Nix is trying to be like macOS's DMG, but its image format is a bit more parseable.
Docker's good at packaging, and Kubernetes is good at providing a single API for all the infra stuff like scheduling, storage, and networking. I think that if someone sat down and tried to create an idealized VM management solution that covered everything between "dev pushes changes" and "user requests website", it'd probably have a single image for each VM to run (like Docker has a single image for each container), and the management of VM hosts, storage, networking, and scheduling of which VM runs on which host would wind up looking a lot like k8s. You could certainly do that with VMs, but for various path-dependency reasons people do it with containers instead, and nobody's got a well-adopted system for doing the same with VMs.
I'm sorry, but:
* Docker isn't good at packaging. When people talk about packaging, they usually understand it to include dependency management. For Docker to be good at packaging it should be able to create dependency graphs and allow users to re-create those graphs on their systems. Docker has no way of doing anything close to that. Aside from that, Docker suffers from the lack of reproducible builds, lack of upgrade protocols... It's not good at packaging... maybe it's better than something else, but there's a lot of room for improvement.
* Kubernetes doesn't provide a single API for all the infra stuff. In fact, it provides so little that it's a mystery why anyone would think that. All that stuff like "storage", "scheduling" and "networking" that you mentioned comes as add-ons (e.g. CSI, CNI) which aren't developed by Kubernetes, don't follow any particular rules, and have their own interfaces... Not only that, Kubernetes' integration with CSI / CNI is very lacking. For example, there's no protocol for upgrading these add-ons when upgrading Kubernetes. There's no generic interface that these add-ons have to expose to the user in order to implement common things. It's really anarchy over there...
There are lots of existing VM management solutions, e.g. OpenStack and vSphere -- you don't need to imagine them, they exist. They differ from Kubernetes in many ways. Very superficially, yet importantly, there's no easy way to automate them. For very simple tasks Kubernetes offers a very simple automation story, i.e. write a short YAML file. Automating e.g. ESX comes down to using a library like govmomi (or something that wraps it, like Terraform). But, in that case, Terraform only manages deployment and doesn't take care of post-deployment maintenance... and so on.
However, the more you deal with the infra, the more you realize that the initial effort is an insignificant fraction of the overall complexity of the task you need to deal with. And that's where the management advantages of Kubernetes start to seem less appealing. I.e. you realize that you will have to write code to manage your solution, and there will be a lot of it... and a bunch of YAML files won't cut it.
Docker's dependency management solution is "include everything you need and specify a standard interface for the things you can't include, like networking." There's no concern about "does the server I'm deploying to have the right version of libssl?" because you just include the version you need. At most you have "does the server I'm deploying to have the right version of Docker / other container runtime for the features my container uses?", which is a much smaller set of concerns. Reproducible builds, yeah, but that's traditionally more a factor of making your build scripts reproducible than of the package format itself. Or to put it another way, Dockerfiles are just as reproducible as .debs or .rpms. Upgrading is replacing the container with a new one.
Kubernetes is an abstraction layer that (mostly) hides the complexity of storage, networking, etc. Yeah, the CNIs and CSIs are complex, but for the appdev it's reduced to "write a manifest for a PV and a PVC" or "write a manifest for a service and/or ingress". In my company, ops has standardized that so you add a key to your values.yaml and it'll template the rest for you. Ops has to deal with setting that stuff up in the first place, which you have to do regardless, but it's better than every appdev setting up their own way of doing things.
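To make the PV/PVC bit concrete, a sketch using the official Kubernetes Python client; the claim name, namespace, and "standard" storage class are assumptions, and in practice you'd usually ship the equivalent YAML:

```python
from kubernetes import client, config

config.load_kube_config()          # or load_incluster_config() inside a pod
core = client.CoreV1Api()

# Equivalent of a small PVC manifest: "give me 1Gi of RWO storage".
pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},                 # assumed claim name
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "standard",               # assumed storage class
        "resources": {"requests": {"storage": "1Gi"}},
    },
}
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc_manifest)
```

The appdev never has to know which CSI driver satisfies the claim; that's exactly the part ops standardizes once.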
My company's a conglomerate of several acquisitions. I'm from a company that was heavy into k8s, and now I'm working on getting everyone else's apps that are currently just deployed to a VM into a container and onto a k8s cluster instead. Maybe I shouldn't have said k8s is an API per se, but it is a standardized interface that covers >90% of what people want to do. It's much easier to debug everything when it's all running on top of k8s using the same k8s concepts than it is to debug why each idiosyncratic VM isn't working. Could you force every app to use the same set of features on VMs? Want a load balancer, just add a value to your config and the deployment process adds your VM to the F5? Yeah, it's possible, but we'd have to build it, or find a solution offered by a particular vendor. k8s already has that written, and everyone uses it.
This is super, super, super naive. You, essentially, just solved for the case of one. But now you need to solve for N.
Do you seriously believe you will never be in a situation where you have to run... two containers?.. With two different images? If my experience is anything to go by, even postcard Web sites often use 3-5 containers. I just finished deploying a test of our managed Kubernetes (technically, it uses containerd, but it could be using Docker). And it has ~60 containers. And this is just the management part. I.e. no user programs are running there. It's a bunch of "operators", CNIs, CSIs etc.
In other words: if your deployment was so easy that it could all fit into a single container -- you didn't have a dependency problem in the first place. But once you get to a realistic-size deployment, you have all the same problems. If libssl doesn't implement the same version of the TLS protocol in two containers -- you are going to have a bad time. But now you've also amplified the problem, because you need certificates in all the containers! Oh, and what fun it is to manage certificates in containers!
Now, be honest. You didn't really use it, did you? The complexity in eg. storage may manifest in many different ways. None of them have anything to do with Kubernetes. Here are some examples: how can multiple users access the same files concurrently? How can the same files be stored (replicated) in multiple places concurrently? What about first and second together? Should replication happen at the level of block device or filesystem? Should snapshots be incremental or full? Should user ownership be encoded into storage, or should there be an extra translation layer? Should storage allow discards when dealing with encryption? And many, many more.
Kubernetes doesn't help you with these problems. It cannot. It's not designed to. You have all the difficult storage problems whether you have Kubernetes or not. What Kubernetes offers is a possibility for the storage vendors to expose their storage product through it. Which is nothing new. All those storage products can be exposed through some other means as well.
In practice, storage vendors who choose to expose their products through Kubernetes usually end up exposing only a limited subset of the storage functionality that way. So not only does storage through Kubernetes not solve your problems: it adds more of them. Now you may have to work around the restrictions of Kubernetes if you want to use some unavailable features (think, for example, of all the Ceph CLI that you are missing when using Ceph volumes in Kubernetes: hundreds of commands that are suddenly unavailable to you).
----
You seem like an enthusiastic person. And you probably truly believe what you write about this stuff. But you are in way over your head. You aren't really an infra developer. You don't yet really recognize the general patterns and problems of this field. And that's OK. You don't have to be / do that. You just happen to be a new car owner who learned how to change the oil on your own, and you are trying to preach to a seasoned mechanic about the benefits and downsides of different engine designs :) Don't take it to heart. It's one of those moments where maybe years later you'll suddenly recall this conversation and feel a spike of embarrassment. Everyone has those.
Why?
I have my RPi4 and absolutely love docker(-compose) - deploying stuff/services on it is just a breeze compared to the previous clusterf*k of relying on the system repository for apps (or debugging when something doesn't work)... with docker compose I have nicely separated services, each with a dedicated database in the required version (yes, I ran into an issue where one service required a newer and another an older version of the database, meh)
As for development - I do development natively but again - docker makes it easier to test various scenarios...
I’ve been using LXC/Incus as lightweight VMs (my home server is an old Mac mini) and I think too much software is over-reliant on Docker. It’s very much “ship your computer”, with a bunch of odd scripts in addition to the Dockerfiles.
Could you elaborate on "ship your computer"? The majority of images are a base OS (which is as lean as possible) and then just the app...
To that end, a full-blown VM seems even more of a "ship your computer" thing?
Btw, isn't LXC the basis for Docker as well? It seems somewhat similar to docker and podman.
I've been meaning to do a bhyve deep dive for years, my gut feelings being much the same as yours. Would appreciate any recommended reading.
Read the fine manual and handbook.
From my perspective, it's the complete opposite: Docker is a workaround for problems created decades ago (e.g. dynamic linking), that could have been solved in a better manner, but were not.
There are Flatpak/AppImage/whatever, but they are Linux-only (mostly) and still lack something akin to docker-compose...
The article doesn't read to me to be an argument about whether sharing a kernel is better or worse (multiple virtual machines each with their own kernel versus multiple containers isolated by a single kernel).
The article instead reads to me as an argument for isolating customers to their own customer-specific systems so there is no web server daemon, database server, file system path or other shared system used by multiple customers.
As an aside to the article, two virtual machines, each with their own kernel, are generally forced to communicate with each other in more complex ways through network protocols, which adds complexity and increases the risk of implementation flaws and vulnerabilities. Two processes in different cgroups with a common kernel have simpler communication options available, such as reading the same file directly, UNIX domain sockets, named pipes, etc.
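To illustrate the simpler option: two processes that share a kernel (and can see the same filesystem path) can talk over a UNIX domain socket with no network stack involved at all. A self-contained sketch, with both ends in one script for brevity and an arbitrary socket path as the only assumption:

```python
import os
import socket
import threading

SOCK_PATH = "/tmp/shared-demo.sock"   # any path visible to both sides

if os.path.exists(SOCK_PATH):
    os.unlink(SOCK_PATH)

# "Server" side: bind a UNIX domain socket and wait for one message.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(SOCK_PATH)
server.listen(1)

# "Client" side: in real life this would be a process in another cgroup;
# a thread stands in for it here.
def client_side():
    c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    c.connect(SOCK_PATH)
    c.sendall(b"hello from the other cgroup")
    c.close()

threading.Thread(target=client_side).start()

conn, _ = server.accept()
print(conn.recv(1024).decode())       # -> hello from the other cgroup
conn.close()
server.close()
os.unlink(SOCK_PATH)
```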
Yep, the article just seems to be talking about single tenancy vs multi tenancy. The VMs vs containers thing seems mostly orthogonal
Namespaces and cgroups and LXC and the whole alphabet soup, the “Docker Industrial Complex” to borrow your inspired term, this stuff can make sense if you rack your own gear: you want one level of indirection.
As I’ve said many times, putting a container on a serverless on a Xen hypervisor so you can virtualize while you virtualize? I get why The Cloud wants this, but I haven’t the foggiest idea why people sit still for it.
As a public service announcement? If you’re paying three levels of markup to have three levels of virtual machine?
You’ve been had.
You're only virtualizing once. Serverless/FaaS is just a way to run a container, and a container is just a Linux process with some knobs to let different software coexist more easily. You're still just running VMs, same as you always were, but just have a new way of putting the software you want to run on them.
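You can see the "just a Linux process" part directly from the host; a small sketch, assuming Docker and the alpine image are present locally:

```python
import subprocess

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()

# Start a throwaway container that just sleeps.
cid = run(["docker", "run", "-d", "--rm", "alpine", "sleep", "60"])

# Ask Docker for the container's PID as the *host* kernel sees it.
pid = run(["docker", "inspect", "-f", "{{.State.Pid}}", cid])

# The "container" shows up as an ordinary process in the host's process
# table -- namespaces and cgroups are the only things setting it apart.
print(run(["ps", "-p", pid, "-o", "pid,cmd"]))

run(["docker", "rm", "-f", cid])
```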
Jails/Zones are not pretty much as secure as a VM. They're materially less secure: they leave cotenant workloads sharing a single kernel (not just the tiny slice of the kernel KVM manages). Most kernel LPEs are probably "Jail" escapes, and it's not feasible to filter them out with system call sandboxing, because LPEs occur in innocuous system calls, too.
If anything, Docker is underused. You should have a very good reason to make a deploy that is not Docker, or (if you really need the extra security) a VM that runs one thing only (and so is essentially a more resource-hungry Docker).
If you don't, then it becomes much harder to answer the question of what exactly is deployed on a given server and what it takes to bring it up again if it goes down hard. If you put everything in Dockerfiles, then the answer is whatever is set in the latest docker-compose file.
Are Jails/Zones/Docker even security solutions?
I always used them as process isolation & dependency bundling.
Docker is fantastic and VMs are fantastic.
I honestly can’t imagine running all the services we have without containers. It would be wildly less efficient and harder to develop on.
VMs are wonderful when you need the security
I wish people would stop going on about BSD jails as if they are the same. I would recommend at least using jails first. Most people using container technologies are well versed in BSD jails, as well as other technologies such as LXD, CRI-O, microVMs, and traditional virtualization technologies (KVM).
You will encounter rough edges with any technology if you use it long enough. Container technologies require learning new skills, and this is where I personally see people get frustrated. There is also the "shift left" mentality of container environments, where you are expected to be responsible for your environment, which is difficult for some; i.e. users become responsible for more than in a traditional virtualized environment. People didn't stop using VMs, they just started using containers as well. What you should use depends on the workload. When you have to manage more than a single VM and work on a larger team, the value of containers becomes more apparent. Not to mention the need to rapidly patch and update in today's environment. Often VMs don't get patched because applications aren't architected in a way that allows updates without downtime, although it is possible. There is a mentality of 'if it's not broke, don't fix it'. There is some truth that virtualized hardware can provide a boundary of separation as well, but other things like SELinux also enforce these boundaries. Not to mention containers are often running inside VMs as well.
Using ephemeral VMs is not a new concept. The idea of 'cattle vs pets', and cloud itself, was built on KVM (OpenStack/AWS).
I agree. VMs rely on old technologies and are reliable in that way. By contrast, the move to Docker then necessitated additional technologies, such as Kubernetes, and Kubernetes brought an avalanche of new technologies to help manage Docker/Kubernetes. I am wary of any technology that in theory should make things simpler but in fact draws you down a path that requires you to learn a dozen new technologies.
The Docker/Kubernetes path also drove up costs, especially the cost associated with the time needed to set up the devops correctly. Anything that takes time costs money. When I was at Averon the CEO insisted on absolutely perfect reliability and therefore flawless devops, so we hired a great devops guy to help us get set up, but he needed several weeks to set everything up, and his hourly rate was expensive. We could have just "pushed some code to a server" and saved $40,000.
When I consult with early-stage startups and they worry about the cost of devops, I point out that we can start simply, by pushing some code to a server as if this were still 2001, and proceed slowly and incrementally from there. While Docker/Kubernetes offers infinite scalability, I warn entrepreneurs that their first concern should be keeping things simple and therefore low cost. The next step is to introduce VMs, then use something like Packer so the VMs can be used as AMIs, and let the devops mature to the point of using Terraform -- but all of that can wait until the product actually gains some traction.
Same. We're still managing ESXi here at my company. Docker/K8s/etc are nowhere close to prod and probably never will be. Been very pleased with that decision.
I will say that Docker images get one HUGE use case at our company - CUDA images with consistent environments. CUDA/pytorch/tensorflow hell is something I couldn't imagine dealing with when I was in college studying CS a few decades ago.
I do strongly believe deployments of containers are easier. If you want something that parallels a raw VM, you can "docker run" the image. Things like k8s can definitely be complicated, but the parallel there is more like running a whole ESXi cluster. Having done both, there's really only a marginal difference in complexity between k8s and an ESXi cluster supporting a similar feature set.
The dev simplification is supposed to be "stop dealing with tickets from people with weird environments", though it admittedly often doesn't apply to internal application where devs have some control over the environment.
I would be interested to hear how you use them. From my perspective, raw jails/zones are missing features and implementing those features on top of them ends up basically back at Docker (probably minus the virtual networking). E.g. jails need some way to get new copies of the code that runs in them, so you can either use Docker or write some custom Ansible/Chef/etc that does basically the same thing.
Maybe I'm wrong, and there is some zen to be found in raw-er tools.
Jails/Zones are just heavy-duty containers. They're still not VMs. Not that VMs are enough either, given all the side-channels that abound.
I mean, yeah, but things like rowhammer and Spectre/Meltdown, and many other side-channels are a big deal. VMs are not really enough to prevent abuse of the full panoply of side-channels known and unknown.
I feel the exact same way.
There are so many use cases that get shoved into the latest, shiniest box just because it’s new and shiny.
A colleague of mine once suggested running a CMS we manage for customers on a serverless stack because “it would be so cheap”. When you face unexpected traffic bursts or a DDoS, it becomes very expensive, very fast. Customers don’t really want to be billed per execution during a traffic storm.
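A back-of-envelope sketch of why; all the prices and traffic figures here are assumptions to substitute your own provider's numbers into:

```python
# Rough cost of a traffic storm on a pay-per-execution stack.
# Every number below is an assumption -- substitute your provider's pricing.
price_per_million_requests = 0.20    # USD, assumed FaaS invocation price
price_per_gb_second = 0.0000167      # USD, assumed compute price
mem_gb = 0.5                         # assumed memory per invocation
duration_s = 0.2                     # assumed duration per invocation

requests_per_second = 50_000         # a modest DDoS / unexpected burst
hours = 6

total = requests_per_second * 3600 * hours
invoke_cost = total / 1_000_000 * price_per_million_requests
compute_cost = total * mem_gb * duration_s * price_per_gb_second

print(f"{total:,} requests -> ~${invoke_cost + compute_cost:,.0f} "
      "before bandwidth and downstream costs")
```

On a fixed-price VM the same storm costs you degraded service at worst, not an open-ended invoice.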
It would also have been far outside the normal environment that CMS expects, and wouldn’t have been supported by any of our commercial, vendored dependencies.
Our stack is so much less complicated without running everything in Docker, and perhaps ironically, about half of our stack runs in Kubernetes. The other half is “just software on VMs” we manage through typical tools like SSH and Ansible.