HN comments for: Unfashionably secure: why we use isolated VMs

PedroBatista

152 replies

2d1h

2024-07-25 17:25:44 UTC

As a permanent "out of style" curmudgeon in the last ~15 years, I like that people are discovering that maybe VMs are in fact the best approach for a lot of workloads and the LXC cottage industry and Docker industrial complex that developed around solving problems created by themselves or solved decades ago might need to take a hike.

Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).

Personally FreeBSD Jails / Solaris Zones are the thing I like to dream are pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow, I didn't dig too deep into this is practice, maybe I'm afraid to learn the contrary, but I hope not.

Either way Docker is "fine" but WAY overused and overrated IMO.

compsciphd

37 replies

2024-07-25 17:52:48 UTC

As the person who created docker (well, before docker - see https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and compare to docker), I argued that it wasn't just good for containers, but could be used to improve VM management as well (i.e. a single VM per running image - seehttps://www.usenix.org/legacy/events/lisa11/tech/full_papers...)

I then went onto built a system with kubernetes that enabled one to run "kubernetes pods" in independent VMs - https://github.com/apporbit/infranetes (as well as create hybrid "legacy" VM / "modern" container deployments all managed via kubernetes.)

- as a total aside (while I toot my own hort on the topic of papers I wrote or contributed to), note the reviewer of this paper that originally used the term Pod for a running container - https://www.usenix.org/legacy/events/osdi02/tech/full_papers... - explains where Kubernetes got the term from.

I'd argue that FreeBSD Jails / Solaris Zones (Solaris Zone/ZFS inspired my original work) really aren't any more secure than containers on linux, as they all suffer from the same fundamental problem of the entire kernel being part of one's "tcb", so any security advantage they have is simply due lack of bugs, not simply a better design.

ysnp

18 replies

1d23h

2024-07-25 18:30:16 UTC

Would you say approaches like gvisor or nabla containers provide more/enough evolution on the security front? Or is there something new on the horizon that excites you more as a prospect?

the_duke

13 replies

1d22h

2024-07-25 20:13:55 UTC

GVisor basically works by intercepting all Linux syscalls, and emulating a good chunk of the Linux kernel in userspace code. In theory this allows lowering the overhead per VM, and more fine-grained introspection and rate limiting / balancing across VMs, because not every VM needs to run it's own kernel that only interacts with the environment through hardware interfaces. Interaction happens through the Linux syscall ABI instead.

From an isolation perspective it's not more secure than a VM, but less, because GVisor needs to implement it's own security sandbox to isolate memory, networking, syscalls, etc, and still has to rely on the kernel for various things.

It's probably more secure than containers though, because the kernel abstraction layer is separate from the actual host kernel and runs in userspace - if you trust the implementation... using a memory-safe language helps there. (Go)

The increased introspectioncapabiltiy would make it easier to detect abuse and to limit available resources on a more fine-grained level though.

Note also that GVisor has quite a lot of overhead for syscalls, because they need to be piped through various abstraction layers.

compsciphd

10 replies

1d21h

2024-07-25 20:48:56 UTC

I actually wonder how much "overhead" a VM actually has. i.e. a linux kernel that doesn't do anything (say perhaps just boots to an init that mounts proc and every n seconds read in/prints out /proc/meminfo) how much memory would the kernel actually be using?

So if processes in gvisor map to processes on the underlying kernel, I'd agree it gives one a better ability to introspect (at least in an easy manner).

It gives me an idea that I'd think would be interesting (I think this has been done, but it escapes me where), to have a tool that is external to the VM (runs on the hypervisor host) that essentially has "read only" access to the kernel running in the VM to provide visibility into what's running on the machine without an agent running within the VM itself. i.e. something that knows where the processes list is, and can walk it to enumerate what's running on the system.

I can imagine the difficulties in implementing such a thing (especially on a multi cpu VM), where even if you could snapshot the kernel memory state efficiently, it be difficult to do it in a manner that provided a "safe/consistent" view. It might be interesting if the kernel itself could make a hypercall into the hypervisor at points of consistency (say when finished making an update and about to unlock the resource) to tell the tool when the data can be collected.

XorNot

3 replies

1d18h

2024-07-25 23:52:32 UTC

What I really want is a "magic" shell on a VM - i.e. the ability using introspection calls to launch a process on the VM which gives me stdin/stdout, and is running bash or something - but is just magically there via an out-of-band mechanism.

compsciphd

2 replies

1d15h

2024-07-26 02:30:30 UTC

Not really "out of band", but many VMs allow you to setup a serial console, which is sort of that, albeit with a login, but in reality, could create one without one, still have to go through hypervisor auth to access it in all cases, so perhaps good enough for your case?

blipvert

0 replies

1d11h

2024-07-26 07:03:37 UTC

Indeed, easy enough to get a serial device on Xen.

Another possibility could be to implement a simple protocol which uses the xenstore key/value interface to pass messages between host and guest?

bbarnett

0 replies

1d12h

2024-07-26 06:27:35 UTC

You can launch KVM/qemu with screen + text console, and just log in there. You can also configure KVM to have a VNC session on launch, and that ... while graphical, is another eye into the console + login.

(Just mentioning two ways without serial console to handle this, although serial console would be fine.)

stacktrust

2 replies

1d20h

2024-07-25 21:30:48 UTC

https://github.com/Wenzel/pyvmidbg

  LibVMI-based debug server, implemented in Python. Building a guest aware, stealth and agentless full-system debugger.. GDB stub allows you to debug a remote process running in a VM with your favorite GDB frontend. By leveraging virtual machine introspection, the stub remains stealth and requires no modification of the guest.

more: https://github.com/topics/virtual-machine-introspection

compsciphd

1 replies

1d9h

2024-07-26 08:46:29 UTC

thanks, the kvm-vmi is basically an expansive version of what I was imagining (maybe read about it before, as noted, I thought it existed).

stacktrust

0 replies

1d6h

2024-07-26 11:58:12 UTC

Would you recommend ZFS as a building block for modern "VLFS"?

https://blog.chlc.cc/p/docker-and-zfs-a-tough-pair/

xtacy

0 replies

1d21h

2024-07-25 21:15:46 UTC

to have a tool that is external to the VM (runs on the hypervisor host) that essentially has "read only" access to the kernel running on the VM to provide visibility into what's running on the machine without an agent running within the VM itself

Not quite what you are after, but comes close ... you could run gdb on the kernel in this fashion and inspect, pause, step through kernel code: https://stackoverflow.com/questions/11408041/how-to-debug-th....

eru

0 replies

1d10h

2024-07-26 07:50:51 UTC

I actually wonder how much "overhead" a VM actually has. i.e. a linux kernel that doesn't do anything (say perhaps just boots to an init that mounts proc and every n seconds read in/prints out /proc/meminfo) how much memory would the kernel actually be using?

You don't necessarily need to run a full operating system in your VM. See eg https://mirage.io/

ecnahc515

0 replies

1d20h

2024-07-25 22:27:13 UTC

I actually wonder how much "overhead" a VM actually has. i.e. a linux kernel that doesn't do anything (say perhaps just boots to an init that mounts proc and every n seconds read in/prints out /proc/meminfo) how much memory would the kernel actually be using?

There's already some memory sharing available using DAX in Kata Containers at least: https://github.com/kata-containers/kata-containers/blob/main...

fpoling

0 replies

1d1h

2024-07-26 16:32:30 UTC

Go is not memory safe even when the code has none of unsafe blocks, although through a typical usage and sufficient testing memory safety bugs are avoided. If one needs truly memory-safe language, then use Rust, Java, C# etc.

actionfromafar

0 replies

1d7h

2024-07-26 10:56:39 UTC

This sounds vaguely like the forgotten Linux a386 (not i386!).

compsciphd

3 replies

1d23h

2024-07-25 19:04:54 UTC

been out of the space for a bit (though interviewing again, so might get back into it), gvisor at least as the "userspace" hypervisor, seemed to provide minimal value vs modern hypervisor systems with low overhead / quick boot VMs (ala firecracker). With that said, I only looked at it years ago, so I could very well be out of date on it.

Wasn't aware of Nabla, but they seem to be going with the unikernel approach (based on a cursory look at them). Unikernels have been "popular" (i.e. multiple attempts) in the space (mostly to basically run a single process app without any context switches), but it creates a process that is fundamentally different than what you develop and is therefore harder to debug.

while the unikernels might be useful in the high frequency trading space (where any time savings are highly valued), I'm personally more skeptical of them in regular world usage (and to an extent, I think history has born this out, as it doesn't feel like any of the attempts at it, has gotten real traction)

tptacek

1 replies

1d21h

2024-07-25 21:08:36 UTC

Modern gVisor uses KVM, not ptrace, for this reason.

compsciphd

0 replies

1d21h

2024-07-25 21:16:45 UTC

so I did a check, it would seem that gvisor with kvm, mostly works for bare metal, not on existing VMs (nested virtualization).

https://gvisor.dev/docs/architecture_guide/platforms/

"Note that while running within a nested VM is feasible with the KVM platform, the systrap platform will often provide better performance in such a setup, due to the overhead of nested virtualization."

I'd argue then for most people (unless have your own baremetal hyperscaler farm), one would end up using gvisor without kvm, but speaking from a place of ignorance here, so feel free to correct me.

eyberg

0 replies

1d19h

2024-07-25 23:13:17 UTC

I think many people (including unikernel proponents themselves) vastly underestimate the amount of work that goes into writing an operating system that can run lots of existing prod workloads.

There is a reason why Linux is over 30 years old and basically owns the server market.

As you note, since it's not really a large existing market you basically have to bootstrap it which makes it that much harder.

We (nanovms.com) are lucky enough to have enough customers that have helped push things forward.

For the record I don't know of any of our customers or users that are using them for HFT purposes - something like 99% of our crowd is on public cloud with plain old webapp servers.

bombela

13 replies

1d22h

2024-07-25 20:15:03 UTC

As the person who created docker (well, before docker - see https://www.usenix.org/legacy/events/atc10/tech/full_papers/... and compare to docker)

I picked the name and wrote the first prototype (python2) of Docker in 2012. I had not read your document (dated 2010). I didn't really read English that well at the time, I probably wouldn't have been able to understand it anyways.

https://en.wikipedia.org/wiki/Multiple_discovery

More details for the curious: I wrote the design doc and implemented the prototype. But not in a vacuum. It was a lot work with Andrea, Jérôme and Gabriel. Ultimately, we all liked the name Docker. The prototype already had the notion of layers, lifetime management of containers and other fundamentals. It exposed an API (over TCP with zerorpc). We were working on container orchestration, and we needed a daemon to manage the life cycle of containers on every machine.

compsciphd

8 replies

1d21h

2024-07-25 21:13:32 UTC

I'd note I didn't say you copied it, just that I created it first (i.e. "compare paper to docker". also, as you note, its possible someone else did it too, but at least my conception got through academic peer-review / patent office, yeah, there's a patent, never been attempted to be enforced though to my knowledge).

when I describe my work (I actually should have used quotes here), I generally give air quotes when saying it, or say "proto docker", as it provides context for what I did (there's also a lot of people who view docker as synonymous with containerization as a whole, and I say that containers existed way before me). I generally try to approach it humbly, but I am proud that I predicted and built what the industry seemingly needed (or at least is heavily using).

people have asked me why I didn't pursue it as a company, and my answer is a) I'm not much of an entrepreneur (main answer), and b) I felt it was a feature, not a "product", and would therefore only really profitable for those that had a product that could use it as a feature (which one could argue that product turned out to be clouds, i.e. they are the ones really making money off this feature). or as someone once said a feature isn't necessarily a product and a product isn't necessarily a company.

bombela

7 replies

1d19h

2024-07-25 23:05:27 UTC

I understood your point. I wanted to clarify, and in some ways connect with you.

At the time, I didn't know what I was doing. Maybe my colleagues did some more, but I doubt that. I just wanted to stop waking up at night because our crappy container management code was broken again. The most brittle part was the lifecycle of containers (and their filesystem). I recall being very adamant about the layered filesystem, because it allowed to share storage and RAM across running (containerized) processes. This saves in pure storage and RAM usage, but also in CPU time, because the same code (like the libc for example) is cached across all processes. Of course this only works if you have a lot of common layers. But I remember at the time, it made for very noticeable savings. Anyways, fun tidbits.

I wonder how much faster/better it would have been if inspired by your academic research. Or maybe not knowing anything made it so we solved the problems at hand in order. I don't know. I left the company shortly after. They renamed to Docker, and made it what it is today.

compsciphd

4 replies

1d18h

2024-07-25 23:33:51 UTC

they did it "simpler", i.e. academic work has to be "perfect" in a way a product does not. so (from my perspective), they punted the entire concept of making what I would refer to as a "layer aware linux distribution" and just created layers "on demand" (via RUN syntax of dockerfiles).

From an academic perspective, its "terrible", so much duplicate layers out in the world, from a practical perspective of delivering a product, it makes a lot of sense.

It's also simpler from the fact that I was trying to make it work for both what I call "persistent" containers (ala pets in the terminology) that could be upgraded in place and "ephemeral" containers (ala cattle) when in practice the work to enable upgrading in place (replacing layers on demand) to upgrade "persistent" containers I'm not sure is that useful (its technologically interesting, but that's different than useful).

My argument for this was that this actually improves runtime upgrading of systems. With dpkg/rpm, if you upgrade libc, your systems is actually temporarily in a state where it can't run any applications (in the delta of time when the old libc .so is deleted and the new one is created in its place, or completely overwrites it), any program that attempts to run in that (very) short period time, will fail (due to libc not really existing). By having a mechanism where layers could be swapped in essentially an atomic manner, no delete / overwrite of files occurs and therefore there is zero time when programs won't run.

In practice, the fact that a real world product came out with a very similar design/implementation makes me feel validated (i.e. a lot of phd work is one offs, never to see the light of day after the papers for it are published).

pxc

3 replies

1d15h

2024-07-26 03:19:39 UTC

so (from my perspective), they punted the entire concept of making what I would refer to as a "layer aware linux distribution"

Would you consider there to be any 'layer-aware Linux distributions' today, e.g., NixOS, GuixSD, rpm-ostree-based distros like Fedora CoreOS, or distri?

so much duplicate layers out in the world

Have you seen this, which lets existing container systems understand a Linux package manager's packages as individual layers?

https://github.com/pdtpartners/nix-snapshotter

mananaysiempre

1 replies

1d4h

2024-07-26 14:11:02 UTC

(Not GP.)

NixOS can share its Nix store with child (systemd-nspawn) containers. That is, if you go all in, package everything using Nix, and then carefully ensure you don’t have differing (transitive build- or run-time) dependency versions anywhere, those dependencies will be shared to the maximum extent possible. The amount of sharing you actually get matches the effort you put into making your containers use the same dependency versions. No “layers”, but still close what you’re getting at, I think.

On the other hand, Nixpkgs (which NixOS is built on top of) doesn’t really follow a discipline of minimizing package sizes to the extent that, say, Alpine does. You fairly often find documentation and development components living together with the runtime ones, especially for less popular software. (The watchword here is “closure size”, as in the size of a package and all of its transitive runtime dependencies.)

pxc

0 replies

1d3h

2024-07-26 14:37:10 UTC

On the other hand, Nixpkgs (which NixOS is built on top of) doesn’t really follow a discipline of minimizing package sizes to the extent that, say, Alpine does. You fairly often find documentation and development components living together with the runtime ones, especially for less popular software. (The watchword here is “closure size”, as in the size of a package and all of its transitive runtime dependencies.)

Yep. I remember before Nix even had multi-output derivations! I once broke some packages trying to reduce closure sizes when that feature got added, too. :(

Besides continuing to split off more dev and doc outputs, it'd be cool if somehow Nixpkgs had a `pkgsForAnts` just like it has a `pkgsStatic`, where packages just disable more features and integrations. On the other hand, by the time you're really optimizing your Nix container builds it's probably well worth it to use overrides and build from source anyway, binary cache be damned.

compsciphd

0 replies

1d4h

2024-07-26 13:31:50 UTC

I'll try to get back to this to give a proper response, but can't promise.

chatmasta

1 replies

1d14h

2024-07-26 04:03:36 UTC

I like to say that Docker wouldn’t exist if the Python packaging and dependency management system weren’t complete garbage. You can draw a straight line from “run Python” to dotCloud to Docker.

Does that jive with your experience/memory at all? How much of your motivation for writing Docker could have been avoided if there were a sane way to compile a Python application into a single binary?

It’s funny, this era of dotCloud type IaaS providers kind of disappeared for a while, only to be semi-revived by the likes of Vercel (who, incidentally, moved away from a generic platform for running containers, in favor of focusing on one specific language runtime). But its legacy is containerization. And it’s kind of hard to imagine the world without containers now (for better or worse).

bombela

0 replies

2h2m

2024-07-27 16:27:02 UTC

I do not think the mess of dependency management in Python got us to Docker/containers. Rather Docker/containers standardized deploying applications to production. Which brings reproducibility without having to solve dependency management.

Long answer with context follows.

I was focused on container allocation and lifecycle. So my experience, recollection, and understanding of what we were doing is biased with this in mind.

dotCloud was offering a cheaper alternative to virtual machines. We started with pretty much full Linux distributions in the containers. I think some still had a /boot with the unused Linux kernel in there.

I came to the job with some experience testing deploying Linux at scale quickly by preparing images with chroot before making a tarball to then distribute over the network (via multicast from a seed machine) with a quick grub update. This was for quickly installing Counter-Strike servers for tournament in Europe. In those days it was one machine per game server. I was also used to run those tarball as virtual machines for throughout testing. To save storage space on my laptop at the time, I would hard-link together all the common files across my various chroot directories. I would only tarball to ship it out.

It turned out my counter-strike tarballs from 2008 would run fine as containers in 2011.

The main competition was Heroku. They did not use containers at the beginning. And they focused on running one language stack very well. It was Ruby and a database I forget.

At dotCloud we could run anything. And we wanted to be known serving everything. All your languages, not just one. So early on we started offering base images ready made for specific languages and database. It was so much work to support. We had a few base images per team member to maintain, while still trying to develop the platform.

The layered filesystem was to pack ressources more efficiently on our servers. We definitely liked that it saved build time on our laptop when testing (we still had small and slow spinning disks in 2011).

So I wouldn't say that Docker wouldn't exist without the mess of dependency management in software. It just happened to offer a standardized interface between application developers, and the person running it in production (devops/sre).

The fact you could run the container on your local (Linux) machine was great for testing. Then people realized they could work around dependency hell and non reproducible development environment by using containers.

shizcakes

3 replies

1d5h

2024-07-26 12:50:18 UTC

I’m really confused. Solomon Hykes is typically credited as the creator of Docker. Who are you? Why is he credited if someone else created it?

inetknght

0 replies

1d5h

2024-07-26 13:19:00 UTC

Why is he credited if someone else created it?

This is the internet and just about everyone could be diagnosed with Not Invented Here syndrome. First one to get recognition for creating something that's already been created is just a popular meme.

icedchai

0 replies

1d4h

2024-07-26 14:18:26 UTC

He means he came up with the same concept, not literally created Docker.

bombela

0 replies

55m

2024-07-27 17:34:39 UTC

Solomon was the CEO of dotCloud. Sébastien the CTO. I (François-Xavier "bombela") was one of the first software engineer employee. Along with Andrea, Jérôme and Louis with Sam as our manager.

When it became clear that we had reached the limits of our existing buggy code, I pushed hard to get to work on the replacement. After what seemed an eternity pitching Solomon, I was finally allowed to work on it.

I wrote the design doc of Docker with the help of Andrea, Jérôme, Louis and Gabriel. I implemented the prototype in Python. To this day, three of us will still argue who really choose the name Docker. We are very good friends.

Not long after, I left the company. Because I was under paid, I could barely make ends meet at the time. I had to borrow money to see a doctor. I did not mind, it's the start-up life, am I right? I worked 80h/week happy. But then I realized not everybody was under paid. And other companies would pay me more for less work. When I asked, Solomon refused to pay me more, and after being denied three times, I quit. I never got any shares. I couldn't afford to buy the options anyways, and they had delayed the paperwork multiple times, such that I technically quit before the vesting started. I went to Google, were they showered me with cash in comparison. The next morning after my departure from dotCloud, Solomon raised everybody's salary. My friends took me to dinner to celebrate my sacrifice.

I am not privy to all the details after I left. But here is what I know. Andrea rewrote Docker in Go. It was made open source. Solomon even asked me to participate as an external contributor. For free of course. As a gesture to let me add my name to the commit history. Probably the biggest insult I ever received in life.

dotCloud was renamed Docker. The original dotCloud business was sold to a German company for the customers.

I believe Solomon saw the potential of Docker for all, and not merely an internal detail within a distributed system designed to orchestrate containers. My vision was extremely limited, and focused on reducing the suffering of my oncall duties.

A side story: the German company transfered to me the trademark of zerorpc, the open source network library powering dotCloud. I had done a lot of work on it. Solomon refused to hand me off the empty github/zerorpc group he was squatting. He offered to grant me access but retain control. I went for github/0rpc instead. I did not have time nor money to spend on a lawyer.

By this point you might think I have a vendetta against a specific individual. I can assure you that I tried hard to paint things fairly with a flattering light.

sulandor

0 replies

1d12h

2024-07-26 06:12:05 UTC

any security advantage they have is simply due lack of bugs, not simply a better design.

feels like maybe there is some corelation

ignoramous

0 replies

1d2h

2024-07-26 16:06:22 UTC

I argued that it wasn't just good for containers, but could be used to improve VM management as well (i.e. a single VM per running image

Believe Google embarked on this path with Crostini for ChromiumOS [0], but now it seems like they're going to scale down their ambitions in favour of Android [1]. Crostini may not but looks like the underlying VMM (crosvm) might live on [2].

I'd argue that FreeBSD Jails / Solaris Zones (Solaris Zone/ZFS inspired my original work) really aren't any more secure than containers on linux, as they all suffer from the same fundamental problem of the entire kernel being part of one's "tcb", so any security advantage they have is simply due lack of bugs, not simply a better design.

Jails (or an equivalent concept/implementation) come in handy where the Kernel/OS may want to sandbox higher privilege services (like with minijail in ChromiumOS [3]).

[0] https://www.youtube.com/watch?v=WwrXqDERFm8&t=300 / summary: https://g.co/gemini/share/41a794b8e6ae (mirror: https://archive.is/5njY1)

[1] https://news.ycombinator.com/item?id=40661703

[2] https://source.android.com/docs/core/virtualization/virtuali...

[3] https://www.chromium.org/chromium-os/developer-library/guide...

cryptonector

0 replies

22h16m

2024-07-26 20:13:37 UTC

I'd argue that FreeBSD Jails / Solaris Zones [...] really aren't any more secure than containers on linux, as they all suffer from the same fundamental problem of the entire kernel being part of one's "tcb", [...]

And also CPU branch prediction state, RAM chips, etc. The side-channels are legion.

birdiesanders

0 replies

1d1h

2024-07-26 16:50:46 UTC

My infra is this exactly. K8s managed containers that manage qemu VMs. Every VM has its own management environment, they don’t ever see each other, and they work just the same as using virtual-manager, but I get infinite flexibility in my env provisioning before I start a VM that gets placed in its tenant network that is isolated.

dboreham

26 replies

2024-07-25 17:37:46 UTC

For me it's about the ROAC property (Runs On Any Computer). I prefer working with stuff that I can run. Running software is live software, working software, loved software. Software that only works in weird places is bad, at least for me. Docker is pretty crappy in most respects, but it has the ROAC going for it.

I would love to have a "docker-like thing" (with ROAC) that used VMs not containers (or some other isolation tech that works). But afaik that thing does not yet exist. Yes there are several "container-tool, but we made it use VMs" (firecracker and downline), but they all need weirdo special setup, won't run on my laptop, or a generic Digitalocean VM.

ThreatSystems

13 replies

1d23h

2024-07-25 19:13:16 UTC

Vagrant / Packer?

gavindean90

10 replies

1d21h

2024-07-25 20:38:07 UTC

With all the mind share that terraform gets you would thing vagrant would at least be known but alas

tptacek

9 replies

1d21h

2024-07-25 21:09:36 UTC

Somebody educate me about the problem Packer would solve for you in 2024?

sgarland

4 replies

1d17h

2024-07-26 00:31:23 UTC

Making machine images. AWS calls them AMIs. Whatever your platform, that's what it's there for. It's often combined with Ansible, and basically runs like this:

1. Start a base image of Debian / Ubuntu / whatever – this is often done with Terraform.

2. Packer types a boot command after power-on to configure whatever you'd like

3. Packer manages the installation; with Debian and its derivatives, this is done mostly through the arcane language of preseed [0]

4. As a last step, a pre-configured SSH password is set, then the new base VM reboots

5. Ansible detects SSH becoming available, and takes over to do whatever you'd like.

6. Shut down the VM, and create clones as desired. Manage ongoing config in a variety of ways – rolling out a new VM for any change, continuing with Ansible, shifting to Puppet, etc.

[0]: https://wiki.debian.org/DebianInstaller/Preseed

pxc

2 replies

1d15h

2024-07-26 03:29:32 UTC

This is nice in its uniformity (same tool works for any distro that has an existing AMI to work with), but it's insanely slow compared to just putting a rootfs together and uploading it as an image.

I think I'd usually rather just use whatever distro-specific tools for putting together a li'l chroot (e.g., debootstrap, pacstrap, whatever) and building a suitable rootfs in there, then finish it up with amazon-ec2-ami-tools or euca2ools or whatever and upload directly. The pace of iteration with Packer is just really painful for me.

sgarland

1 replies

1d6h

2024-07-26 11:32:50 UTC

I haven’t played with chroot since Gentoo (which for me, was quite a while ago), so I may be incorrect, but isn’t that approach more limited in its customization? As in, you can install some packages, but if you wanted to add other repos, configure 3rd party software, etc. you’re out of luck.

pxc

0 replies

1d4h

2024-07-26 14:29:03 UTC

Nah you can add other repos in a chroot! The only thing you can't really do afaik is test running a different kernel; for that you've got to actually boot into the system.

If you dual-boot multiple Linux systems you can still administer any of the ones you're not currently running via chroot at any time, and that works fine whether you've got third-party repositories or not. A chroot is also what you'd use to reinstall the bootloader on a system where Windows has nuked the MBR or the EFI vars or whatever.

There might be some edge cases like software that requires a physical hardware token to be installed for licensing purposes is very aggressive, so it might also try to check if it's running in a chroot, container, or VM and refuse to play nice or something like that. But generally you can do basically anything in a chroot that you might do in a local container, and 99% of what you might do in a local VM.

stargrazer

0 replies

1d13h

2024-07-26 04:57:38 UTC

I miss saltstack. I did that whole litany of steps with one tool plus preseed.

yjftsjthsd-h

1 replies

1d20h

2024-07-25 22:27:15 UTC

What's a better way to make VM images?

DaanDeMeyer

0 replies

1d12h

2024-07-26 05:51:39 UTC

There's lots of tools in this space. I work on https://github.com/systemd/mkosi for example.

kasey_junk

1 replies

1d20h

2024-07-25 22:18:29 UTC

I think the thread is more about how docker was a reaction to the vagrant/packer ecosystem that was deemed overweight but was in many ways was a “docker like thing” but VMs.

tptacek

0 replies

1d19h

2024-07-25 22:33:19 UTC

Oh, yeah, I'm not trying to prosecute, I've just always been Packer-curious.

stackskipton

1 replies

1d19h

2024-07-25 23:14:49 UTC

Wouldn't work here, they have software on each VM that cannot be reimaged. To use Packer properly, you should treat like you do stateless pod, just start a new one and take down the old one.

ThreatSystems

0 replies

23h40m

2024-07-26 18:49:19 UTC

Sure then throw Ansible over the top for configuration/change management. Packer gives you a solid base for repeatable deployments. Their model was to ensure that data stays within the VM which a deployed AMI made from Packer would suit the bill quite nicely. If they need to do per client configuration then ansible or even AWS SSM could fit the bill there once EC2 instance is deployed.

For data sustainment if they need to upgrade / replace VMs, have a secondary EBS (volume) mounted which solely stores persistent data for the account.

01HNNWZ0MV43FF

11 replies

2024-07-25 17:43:38 UTC

Yeah that's kind of a crummy tradeoff.

Docker is "Runs on any Linux, mostly, if you have a new enough kernel" meaning it packages a big VM anyway for Windows and macOS

VMs are "Runs on anything! ... Sorta, mostly, if you have VM acceleration" meaning you have to pick a VM software and hope the VM doesn't crash for no reason. (I have real bad luck with UTM and VirtualBox on my Macbook host for some reason.)

All I want is everything - An APE-like program that runs on any OS, maybe has shims for slightly-old kernels, doesn't need a big installation step, and runs any useful guest OS. (i.e. Linux)

neaanopri

8 replies

2024-07-25 18:04:16 UTC

The modern developer yearns for Java

smallmancontrov

5 replies

1d23h

2024-07-25 18:37:27 UTC

I had to use eclipse the other day. How the hell is it just as slow and clunky as I remember from 20 years ago? Does it exist in a pocket dimension where Moore's Law doesn't apply?

qwery

2 replies

1d22h

2024-07-25 19:36:03 UTC

I think it's pretty remarkable to see any application in continuous use for so long, especially with so few changes[0] -- Eclipse must be doing something right!

Maintaining (if not actively improving/developing) a piece of useful software without performance degradation -- that's a win.

Keeping that up for decades? That's exceptional.

[0] "so few changes": I'm not commenting on the amount of work done on the project or claiming that there is no useful/visible features added or upgrades, but referring to Eclipse of today feeling like the same application as it always did, and that Eclipse hasn't had multiple alarmingly frequent "reboots", "overhauls", etc.

[?] keeping performance constant over the last decade or two is a win, relatively speaking, anyway

password4321

0 replies

1d20h

2024-07-25 21:48:57 UTC

without performance degradation

Not accounting for Moore's Law, yikes. Need a comparison adjusted for "today's dollars".

dijit

0 replies

1d22h

2024-07-25 19:51:24 UTC

I agree, that you've pointed it out to me makes it obvious that this is not the norm, and we should celebrate this.

I'm reminded of Casey Muratori's rant on Visual Studio; a program that largely feels like it hasn't changed much but clearly has regressed in performance massively; https://www.youtube.com/watch?v=GC-0tCy4P1U

marcosdumay

0 replies

1d3h

2024-07-26 14:49:04 UTC

It's not as slow as it was 20 years ago.

It's only as slow as you remember because the actual performance was so bad that you can't physiologically remember it.

TurningCanadian

0 replies

1d23h

2024-07-25 19:23:40 UTC

That's not Java's fault though. IntelliJ IDEA is also built on Java and runs just fine.

mschuster91

0 replies

1d22h

2024-07-25 19:48:38 UTC

Java's ecosystem is just as bad. Gradle is insanely flexible but people create abominations out of it, Maven is extremely rigid so people resort to even worse abominations to get basic shit done.

gryfft

0 replies

1d23h

2024-07-25 19:17:34 UTC

Maybe just the JVM.

klooney

0 replies

1d4h

2024-07-26 13:45:22 UTC

Runs on any Linux, mostly, if you have a new enough kernel"

New enough kernel means CentOS 5 is right out. But it's been a decade, it'll run on anything vaguely sensible to be running today

compsciphd

0 replies

2024-07-25 18:07:47 UTC

docker is your userspace program carries all its user space dependencies with it and doesn't depend on the userspace configuration of the underlying system.

What I argued in my paper is that systems like docker (i.e. what I created before it), improve over VMs and (even Zones/ZFS) in their ability to really run ephemeral computation. i.e. if it takes microseconds to setup the container file system, you can run a boatload of heterogeneous containers even if they only needed to run for very shot periods of time). Solaris Zones/ZFS didn't lend itself to heterogeneous environments, but simply cloning as single homogeneous environment, while VMs suffered from that problem, they also (at least at the time, much improved as of late) required a reasonably long bootup time.

vundercind

9 replies

1d23h

2024-07-25 18:52:40 UTC

Docker’s the best cross-distro rolling-release package manager and init system for services—staying strictly out of managing the base system, which is great—that I know of. I don’t know of anything that’s even close, really.

All the other stuff about it is way less important to me than that part.

pxc

8 replies

1d19h

2024-07-25 23:02:45 UTC

This is wrong in pretty much every way I can imagine.

Docker's not a package manager. It doesn't know what packages are, which is part of why the chunks that make up Docker containers (image layers) are so coarse. This is also part of why many Docker images are so huge: you don't know exactly the packages you need, strictly speaking, so you start from a whole OS. This is also why your Dockerfiles all invoke real package managers— Docker can't know how to install packages if it doesn't know what they are!

It's also not cross-platform, or at least 99.999% of images you might care about aren't— they're Linux-only.

It's also not a service manager, unless you mean docker-compose (which is not as good as systemd or any number of other process supervisors) or Docker Swarm (which has lost out to Kubernetes). (I'm not sure what you even mean by 'init system for containers' since most containers don't include an init system.)

There actually are cross-platform package managers out there, too. Nix, Pkgsrc, Homebrew, etc. All of those I mentioned and more have rolling release repositories as well. ('Rolling release' is not a feature of package managers; there is no such thing as a 'rolling release package manager'.)

vundercind

3 replies

1d18h

2024-07-25 23:48:06 UTC

This is wrong in pretty much every way I can imagine.

Nope! It’s not wrong in any way at all!

You’re thinking of how it’s built. I’m thinking of what it does (for me).

I tell it a package (image) to fetch, optionally at a version. It has a very large set of well maintained up-to-date packages (images). It’s built-in, I don’t even have to configure that part, though I can have it use other sources for packages if I want to. It fetches the package. If I want it to update it, I can have it do that too. Or uninstall it. Or roll back the version. I am 100% for-sure using it as a package manager, and it does that job well.

Then I run a service with a simple shell script (actually, I combine the fetching and running, but I’m highlighting the two separate roles it performs for me). It takes care of managing the process (image, which is really just a very-fat process for these purposes). It restarts it if it crashes, if I like. It auto-starts it when the machine reboots—all my services come back up on boot, and I’ve never touched systemd (which my Debian uses), Docker is my interface to that and I didn’t even have to configure it to do that part. I’m sure it’s doing systemd stuff under the hood, at least to bring the docker daemon up, but I’ve never touched that and it’s not my interface to managing my services. The docker command is. Do I see what’s running with systemd or ps? No, with docker. Start, restart, or stop a service? Docker.

I’ve been running hobbyist servers at home (and setting up and administrating “real” ones for work) since 2000 or so and this is the smoothest way to do it that I’ve seen, at least for the hobbyist side. Very nearly the only roles I’m using Docker to fill, in this scenario, are package manager and service manager.

I don’t care how it works—I know how, but the details don’t matter for my use case, just the outcomes. The outcome is that I have excellent, updated, official packages for way more services than are in the Debian repos, that leave my base system entirely alone and don’t meaningfully interact with it, with config that’s highly portable to any other distro, all managed with a common interface that would also be the same on any other distro. I don’t have to give any shits about my distro, no “oh if I want to run this I have to update the whole damn distro to a new major version or else manually install some newer libraries and hope that doesn’t break anything”, I just run packages (images) from Docker, update them with Docker, and run them with Docker. Docker is my UI for everything that matters except ZFS pool management.

It's also not cross-platform, or at least 99.999% of images you might care about aren't— they're Linux-only.

I specifically wrote cross-distro for this reason.

There actually are cross-platform package managers out there, too. Nix, Pkgsrc, Homebrew, etc.

Docker “packages” have a broader selection and better support than any of those, as far as services/daemons go; it’s guaranteed to keep everything away from the base system and tidy for better stability; and it provides a common interface for configuring where to put files & config for easier and more-confident backup.

I definitely use it mainly as a package manager and service manager, and find it better than any alternative for that role.

pxc

2 replies

1d15h

2024-07-26 03:04:46 UTC

You’re thinking of how it’s built. I’m thinking of what it does (for me).

I've read your reply and I hear you (now). But as far as I'm concerned package management is a little more than that. Not everything that installs or uninstalls software is a package manager-- for instance I would say that winget and Chocolatey are hardly package managers, despite their pretensions (scoop is closer). I think of package management, as an approach to managing software and as a technology, as generally characterized by things like and including: dependency tracking, completeness (packages' dependencies are themselves all packaged, recursively, all the way down), totality (installing software by any means other than the package manager is not required to have a practically useful system), minimal redundancy of dependencies common to multiple packages, collective aggregation and curation of packages, transparency (the unit the software management tool operates on, the package, tracks the versions of the software contained in it and the versions of the software contained in its dependencies), exclusivity (packaged software does not self-update; updates all come through the package manager), etc. Many of these things come in degrees, and many package managers do not have all of them to the highest degree possible. But the way Docker gets software running on your system just isn't meaningfully aligned with that paradigm, and this also impacts the way Docker can be used. I won't enumerate Docker's deviations from this archetype because it sounds like you already have plenty of relevant experience and knowledge.

I don’t care how it works—I know how, but the details don’t matter for my use case, just the outcomes.

When there's a vuln in your libc or some similar common dependency, Docker can't tell you about which of your images contains it because it has no idea what glibc or liblzma are. The whole practice of generating SBOMs is about trying to recover or regenerate data that is already easily accessible in any competent package manager (and indeed, the tools that generate SBOMs for container images depend on actual package managers to get that data, which is why their support comes distro-by-distro).

Managing Docker containers is also complicated in some ways that managing conventional packages (even in other containerized formats like Flatpak, Snap, and AppImage) isn't, in that you have to worry about bind mounts and port forwarding. How the software works leads to a radically different sort of practice. (Admittedly maybe that's still a bit distant from very broad outcomes like 'I have postgres running'.)

The outcome is that I have [many services] that leave my base system entirely alone and don’t meaningfully interact with it, with config that’s highly portable to any other distro, all managed with a common interface that would also be the same on any other distro.

This is indeed a great outcome. But when you achieve it with Docker, the practice by means of which you've achieved it is not really a package management discipline but something else. And that is (sadly, to me) part of the appeal, right? Package management can be a really miserable paradigm when your packages all live in a single, shared global namespace (the paths on your filesystem, starting with /). Docker broke with that paradigm specifically to address that pain.

But that's not the end of the story! That same excellent outcome is also achievable by better package managers than ol' yum/dnf and apt! And when you go that route, you also get back the benefits of the old regime like the ability to tell what's on your system and easily patch small pieces of it once-and-for-all. Nix and Guix are great for this and work in all the same scenarios, and can also readily generate containers from arbitrary packages for those times you need the resource management aspect of containers.

The outcome is that I have [...] official packages

For me, this is not a benefit. I think the curation, integration, vetting, and patching that coordinated software distributions do is extremely valuable, and I expect the average software developer to be much worse at packaging and systems administration tasks than the average contributor to a Linux distro is. To me, this feels like a step backwards into chaos, like apps self-updating or something like that. It makes me think of all the absolutely insane things I've seen Java developers do with Maven and Gradle, or entire communities of hobbyists who depend on software whose build process is so brittle and undocumented that seemingly no one knows how to build it and Docker has become the sole supported distribution mechanism.

I specifically wrote cross-distro for this reason.

My bad! Although that actually widens the field of contenders to include Guix, which is excellent, and arguably also Flatpak, which still aligns fairly well with package management as an approach despite being container-based.

Docker “packages” have a broader selection and better support than any of those, as far as services/daemons go

I suppose this is an advantage of a decentralized authority-to-publish, like we also see in the AUR or many language-specific package repositories, and also of freedom from the burden of integration, since all docker image authors have to do is put together any runtime at all that runs. :-\

service manager

Ok. So you're just having dockerd autostart your containers, then, no docker-compose or Docker Swarm or some other layer on top? Does that even have a notion of dependencies between services? That feels like table stakes for me for 'good service manager'.

PS: thanks for giving a charitable and productive reply to a comment where I was way gratuitously combative about a pet peeve/hobby horse of mine for no good reason

vundercind

1 replies

1d12h

2024-07-26 06:00:14 UTC

Oh no, you’re fine, thanks for responding in kind, I get where you’re coming from now too. Maybe it’s clearer to label my use of it as a software manager or something like that. It does end up being my main interface for nearly everything that matters on my home server.

Like, I’m damn near running Docker/Linux, in the pattern of gnu/Linux or (as started as a bit of a joke, but is less so with each year) systemd/Linux, as far as the key parts that I interact with and care about and that complete the OS for me.

As a result, some docker alternatives aren’t alternatives for me—I want the consistent, fairly complete UI for the things I use it for, and the huge library of images, largely official. I can’t just use raw lxc or light VMs instead, as that gets me almost nothing of what I’m currently benefiting from.

I haven’t run into a need to have dependent services (I take the SQLite option for anything that has it—makes backups trivial) but probably would whip up a little docker-compose for if I ever need that. In work contexts I usually just go straight for docker-compose, but with seven or eight independent services on my home server I’ve found I prefer tiny shell scripts for each one.

[edit] oh, and I get what you mean about it not really doing things like solving software dependencies—it’s clearly not suitable as, like, a system package manager, but fills the role well enough for me when it comes to the high-level “packages” I’m intending to use directly.

pxc

0 replies

1d4h

2024-07-26 13:46:09 UTC

As a result, some docker alternatives aren’t alternatives for me—I want the consistent, fairly complete UI for the things I use it for, and the huge library of images, largely official.

This kind of uniformity of interface and vastness are among the things that made me fall in love with Linux package management as a kid, too. I can see how there's a vital similarity there that could inspire the language you used in your first comment.

I can’t just use raw lxc or light VMs instead, as that gets me almost nothing of what I’m currently benefiting from.

Right, those will give you the isolation, but primarily what you want is the low commitment and uniform interface to just try something (and if turns out to be good enough, why not leave it running like that for a few years).

I sometimes kind of enjoy packaging work, and even at work I sometimes prefer to build a container from scratch myself when we're doing container deployments, rather than using a vendor one. In fact we've got one ECS deployment where we're using a vendor container where the version of the software that I've got running locally through a package I wrote works fine, but the vendor container at that same version mysteriously chokes on ECS, so we've reverted to the LTS release of the vendor one. Building my own Nix package for that software involved some struggle: some reading of its build tooling and startup code, some attempts to build from source manually, some experiments with wrapping and modifying prebuilt artifacts, some experiments with different runtimes, etc. But it also taught me enough about the application that I am now prepared to debug or replace that mysteriously-busted vendor container before that deployment moves to production.

At the same time, I don't think that pain would be worth it for me for a homelab deployment of this particular software. At work, it's pretty critical stuff so having some mastery of the software feels a bit more relevant. But if I were just hoping to casually try it at home, I'd have likely moved on or just resorted to the LTS image and diminished my opinion of upstream a little without digging deeper.

The (conceptually) lightweight service management point is well-taken. In systemd, which I generally like working with, you always have to define at least one dependency relation (e.g., between your new service and one of the built-in targets) just to get `systemctl enable` to work for it! On one level I like the explicitness of that, but on another I can understand seeing it as actually unnecessary overhead for some use cases.

icedchai

3 replies

1d3h

2024-07-26 14:44:33 UTC

When I read the parent comment, I was picturing package manager and init system in quotes. Docker is the "package manager" for people who don't want to package their apps with a real package manager. It's a "service manager" and "init system" that can restart your services (containers) on boot or when they fail.

vundercind

1 replies

1d2h

2024-07-26 15:34:13 UTC

Right, it gives me the key functionality of those systems in a way that’s decoupled from the base OS, so I can just run some version of Debian for five years or however long it’s got security updates, and not have to worry about things like services I want to run needing newer versions of lots of things than some old Debian has. Major version updates of my distro, or trying to get back ported newer packages in, have historically been the biggest single source of “fuck, now nothing works and there goes my weekend” running home servers, to the point of making ROI on them not so great and making me dread trying to add or update services I was running. This approach means I can install and run just about any service at any recent-ish version without touching the state of the OS itself. The underlying tech of docker isn’t directly important to me, nor any of the other features it provides, in this context, just that it’s separate from the base OS and gives me a way to “manage packages” and run services in that decoupled manner.

icedchai

0 replies

1d2h

2024-07-26 16:19:12 UTC

I agree with all this and use it similarly. I hate when updating the OS breaks my own apps, or does something annoying like updating a dependency (like Postgres)... Docker is perfect for this.

pxc

0 replies

2024-07-26 17:48:08 UTC

You're right. I read the commenter I was replying to very badly. In my later discussion with them we covered a bit better how Docker can cover some of the same uses as package managers as well as the continued vitality of package management in the era of containers. It was a much better conversation by the end than it was at the beginning, thanks to their patience and good faith engagement.

nimish

7 replies

2024-07-25 17:41:52 UTC

Clear Containers/Kata Containers/firecracker VMs showed that there isn't really a dichotomy here. Why we aren't all using HW assisted containers is a mystery.

tptacek

5 replies

1d22h

2024-07-25 20:09:00 UTC

It's not at all mysterious: to run hardware-virtualized containers, you need your compute hosted on a platform that will allow KVM. That's a small, expensive, tenuously available subset of AWS, which is by far the dominant compute platform.

Spivak

4 replies

1d21h

2024-07-25 21:10:51 UTC

So… Lambda, Fargate, and EC2. The only thing you can't really do this with is EKS.

Like Firecracker was made by AWS to run containers on their global scale KVM, EC2.

tptacek

3 replies

1d21h

2024-07-25 21:14:41 UTC

Lambda and Fargate are implementations of the idea, not a way for you yourself to do any kind of KVM container provisioning. You can't generally do this on EC2; you need special instances for it.

For a variety of reasons, I'm pretty familiar with Firecracker.

Spivak

2 replies

1d19h

2024-07-25 23:18:59 UTC

What I'm I missing? AWS offers (virtual) hardware backed containers as a service, I would go so far as to say that a significant number of people are running vm backed containers.

And I've been at a few shops where EC2 is used as the poor-man's-firecracker by building containers and then running 1(ish) per VM. AWS's architecture actively encourages this because that's by far the easiest security boundary to manipulate. The moment you start thinking about two privilege levels in the same VM you're mostly on your own.

The number of people running production workloads who, knowingly or not, believe that the security boundary is not between containers but between the vms enclosing those containers is probably almost everyone.

ryapric

0 replies

1d16h

2024-07-26 02:21:41 UTC

What I'm I missing?

The parent isn't talking about e.g. EC2 as a virtualized platform, they're talking about EC2 not being a virtualization platform. With few machine-type exceptions, EC2 doesn't support nested virtualization -- you can't run e.g. KVM on EC2.

kasey_junk

0 replies

1d16h

2024-07-26 02:19:04 UTC

I think the argument is you need to be running nitro (I think, it’s been awhile?) instances to take advantage of kvm isolation

turtlebits

0 replies

2024-07-25 17:51:49 UTC

Engineers are lazy, especially Ops. Until it's easier to get up and running and there are tangible benefits, people won't care.

m463

7 replies

1d20h

2024-07-25 21:46:59 UTC

I've always hated the docker model of the image namespace. It's like those cloud-based routers you can buy.

Docker actively prevents you from having a private repo. They don't want you to point away from their cloud.

Redhat understood this and podman allows you to have a private docker infrastructure, disconnected from docker hub.

For my personal stuff, I would like to use "FROM scratch" and build my personal containers in my own ecosystem.

Carrok

4 replies

1d20h

2024-07-25 21:50:00 UTC

Docker actively prevents you from having a private repo.

In what ways? I use private repos daily with no issues.

m463

3 replies

1d19h

2024-07-25 23:18:10 UTC

If you reference a container without a domain, you pull from docker.io

With podman, you can control this with

  $HOME/.config/containers/registries.conf

  /etc/containers/registries.conf

with docker, not possible (though you can hack mirrors)

https://stackoverflow.com/questions/33054369/how-to-change-t...

Carrok

1 replies

1d18h

2024-07-26 00:09:47 UTC

So.. just use a domain. This seems like a nothing burger.

m463

0 replies

1d13h

2024-07-26 04:57:32 UTC

Not all dockerfiles (especially multi-stage builds) are easily sanitized for this.

think FROM python:latest or FROM ubuntu:20.04 AS build

They've put deliberate barriers in the way of using docker commands without accessing their cloud.

Cyph0n

0 replies

1d17h

2024-07-26 00:49:05 UTC

Or even easier: just fully qualify all images. With Podman:

nginx => docker.io/library/nginx

linuxserver/plex => docker.io/linuxserver/plex

mountainriver

0 replies

1d18h

2024-07-25 23:43:19 UTC

Huh? In what way does docker prevent you from having a private repo? Its a couple clicks to get on any cloud

icedchai

0 replies

1d19h

2024-07-25 23:15:23 UTC

One of the first things I did was set up a my own container registry with Docker. It's not terribly difficult.

ganoushoreilly

7 replies

2024-07-25 17:35:02 UTC

Docker is great, way overused 100%. I believe a lot of it started as "cost savings" on resource usage. Then it became the trendy thing for "scalability".

When home enthusiasts build multi container stacks for their project website, it gets a bit much.

applied_heat

5 replies

2024-07-25 17:48:27 UTC

Solves dependency version hell also

theLiminator

3 replies

2024-07-25 17:54:58 UTC

Solves it in the same sense that it's a giant lockfile. It doesn't solve the other half where updates can horribly break your system and you run into transitive version clashes.

ktosobcy

0 replies

1d10h

2024-07-26 07:51:32 UTC

Having been running a VPS with manually maintained services moving to docker saved me a lot of headache... things definitely breaks less often (almost never) and if they do it's quite easy to revert back to previous version...

bornfreddy

0 replies

2024-07-25 18:30:00 UTC

But at least you can revert back to the original configuration (as you can with VM, too).

Spivak

0 replies

1d21h

2024-07-25 21:06:00 UTC

It solves it in the sense that it empowers the devs to update their dependencies on their own time and ops can update the underlying infrastructure fearlessly. It turned a coordination problem into a non-problem.

sitkack

0 replies

1d23h

2024-07-25 19:20:27 UTC

It doesn't solve it, it makes it tractable so you can use the scientific method to fix problems as opposed to voodoo.

Yodel0914

0 replies

1d13h

2024-07-26 04:51:36 UTC

When home enthusiasts build multi container stacks for their project website, it gets a bit much.

I don't know - docker has been a godsend for running my own stuff. I can get a docker-compose file working on my laptop, then run it on my VPS with a pretty high certainty that it will work. Updating has also (to date) been incredibly smooth.

anonfordays

7 replies

2024-07-25 18:01:18 UTC

Personally FreeBSD Jails / Solaris Zones are the thing I like to dream are pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow, I didn't dig too deep into this is practice, maybe I'm afraid to learn the contrary, but I hope not

Having run both at scale, I can confirm and assure you they are not as secure as VMs and did not produce sane devops workflows. Not that Docker is much better, but it is better from the devops workflow perspective, and IMHO that's why Docker "won" and took over the industry.

kkfx

6 replies

1d22h

2024-07-25 20:03:39 UTC

A sane DevOps workflow is with declarative systems like NixOS or Guix System, definitively not on a VM infra in practice regularly not up to date, full of useless deps, on a host definitively not up to date, with the entire infra typically not much managed nor manageable and with an immense attack surface...

VMs are useful for those who live on the shoulder of someone else (i.e. *aaS) witch is ALL but insecure.

secondcoming

3 replies

1d20h

2024-07-25 21:38:33 UTC

I'm not sure what you're referring to here?

Our cloud machines are largely VMs. Deployments mean building a new image and telling GCP to deploy that as machines come and go due to scaling. The software is up to date, dependencies are managed via ansible.

Maybe you think VMs means monoliths? That doesn't have to be the case.

kkfx

2 replies

1d13h

2024-07-26 05:28:07 UTC

That's precisely the case: instead of owning hw, witch per-machine it's a kind-of monolith (even counting blades and other modular solution), you deploy a full OS or half-full to run just a single service, on top of another "OS". Of course yes, this is the cloud model, and is also the ancient and deprecated mainframe model, with much more added complexity and no unique ownership with an enormously big attack surface.

Various return of experience prove that cloud model is not cheap nor reliable than owning iron, it's just fast since you live on the shoulders of someone else. A speed you will pay at an unknown point in time when something happen and you have zero control other that.

DevOps meaning the Devs taking over the Ops without having the needed competences, it's a modern recipe to a failing digital ecosystems and we witnessed that more and more with various "biblical outages" from "Roomba devices briked due to an AWS mishap, cars of a certain vendor with a slice or RCEs, payment systems outages, ... a resilient infra it's not a centrally managed decentralized infra, it's a vast and diverse ecosystem interoperating with open and standard tools and protocols. Classic mail or Usenet infra are resilient, GMail backed by Alphabet infra is not.

What if Azure tomorrow collapse? What's the impact? What's the attack surface of living on the shoulder of someone else, typically much bigger than you and often in other countries where getting even legal protections is costly and complex?

Declarative systems on iron means you can replicate your infra ALONE on the iron, VMs meaning you need much more resources and you do not even know the entire stack of your infra, you can't essentially replicate nothing. VMs/images are still made the classical '80s style semi-manual way with some automation written by a dev knowing just how to manage his/her own desktop a bit and others will use it careless "it's easy to destroy and re-start", as a result we have seen in production images with someone unknown SSH authorized keys because to be quick someone pick the first ready made image from Google Search and add just few things, we are near the level of crap of the dot-com bubble, with MUCH more complexity and weight.

bbarnett

1 replies

1d11h

2024-07-26 06:41:20 UTC

(note .. use 'which' not 'witch', quite different words)

Not sure if you mentioned it, but cost and scaling is an absurd trick of AWS and others. AWS is literally 1000s, and in some usage cases even millions of times more expensive than your own hardware. Some believe that employee cost savings help here, but that's not even remotely close.

Scaling is absurd. You can buy one server worth $10k, that can handle the equivalent of thousands upon thousands of AWS instances' workload. You can buy far cheaper servers ($2k each), colo them yourself, have failover capability, and even have multi-datacentre redundancy, immensely cheaper than AWS. 1000 of times cheaper. All with more power than you'd ever, ever, ever scale at AWS.

All that engineering to scale, all that effort to containerize, all that reliance upon AWS and their support system.. unneeded. You can still run docker locally, or VMs, or just pound it out to raw hardware.

So on top of your "run it on bare metal" concept, there's the whole "why are you wasting time and spending money" for AWS, argument. It's so insanely expensive. I cannot repeat enough how insanely expensive AWS is. I cannot repeat enough how AWS scaling is a lie, when you don't NEED to scale using local hardware. You just have so much more power.

Now.. there is one caveat, and you touch on this. Skill. Expertise. As in, you have to actually not do Really Dumb Things, like write code that uses 1000s of times CPU to do the same task, or write DB queries or schema that eat up endless resources. But of course, if you do those things on your own hardware, in DEV, you can see them and fix.

If you do those in AWS, people just shrug, and pay immense sums of money and never figure it out.

I wonder, how many startups have failed due to AWS costs?

kkfx

0 replies

1d7h

2024-07-26 11:07:04 UTC

use 'which' not 'witch', quite different words

Thanks and sorry for my English, even if I use it for work I do not normally use it conversationally and as a result it's still very poor for me...

Well I do not specifically talk about AWS, but in general living on someone else is much more expensive in OPEX than what it can be spared in CAPEX, and it's a deeply critical liability, specially when we start to develop on someone else API instead of just deploy something "standard" we can always move unchanged.

Yes, technical debt is a big issue but is a relative issue because if you can't maintain your own infra you can't be safe anyway, the "initial easiness" means a big disaster sooner or later, and the more later it is the more expensive it will be. Of course an unipersonal startup can't have on it's own iron offsite backups, geo replication and so on, but dose the MINIMUM usage of third party services trying to be as standard and vendor independent as possible until you earn enough to own it's definitively possible at any scale.

Unfortunately it's a thing we almost lost since now Operation essentially does not exists anymore except for few giants, Devs have no substantial skill since they came from "quick" full immersion bootcamps where they learned just to do repetitive things with specific tools like modern Ford model workers able only to turn a wrench and still most of the management fails to understand IT for what it is, not "computers" like astronomers telescopes, but information, like stars for astronomers. This toxic mix have allowed very few to earn hyper big positions, but they start to collapse because their commercial model is technically untenable and we start all paying the high price.

nine_k

1 replies

1d20h

2024-07-25 21:44:32 UTC

VMs are useful when you don't own or rent dedicated hardware. Which is a lot of cases, especially when your load varies seriously over the day or week.

And even if you do manage dedicated servers, it's often wise to use VMs on them to better isolate parts of the system, aka limit the blast radius.

kkfx

0 replies

1d12h

2024-07-26 05:31:56 UTC

Witch is a good recipe to pay much more thinking to be smart and pay less, being tied to some third parties decisions for anything running, having a giant attack surface and so on...

There are countless lessons about how owning hw is cheaper than not, there are countless examples of "cloud nightmares", countless examples of why a system need to be simple and securely design from start not "isolated", but people refuse to learn, specially since they are just employees for living on the shoulder of someone else means less work to do and managers typically do not know even the basic of IT to understand.

turtlebits

6 replies

2024-07-25 17:50:32 UTC

Honestly, it really doesn't matter whether it's VMs or Docker. The docker/container DX is so much better than VMWare/QEMU/etc. Make it easy to run workloads in VMs/Firecracker/etc and you'll see people migrate.

packetlost

5 replies

2024-07-25 17:54:30 UTC

I mean, Vagrant was basically docker before docker. People used it. But it turns out the overhead over booting a full VM + kernel adds latency which is undesirable for development workloads. The techniques used by firecracker could be used, but I suspect the overhead of allocating a namespace and loading a process will always be less than even restoring from a frozen VM, so I wouldn't hold my breath on it swinging back in VM's direction for developer workloads ever.

yjftsjthsd-h

4 replies

1d22h

2024-07-25 19:38:48 UTC

It would be interesting to see a microvm (kata/firecracker/etc.) version of vagrant. And open source, of course. I can't see any technical reason why it would be particularly difficult.

packetlost

2 replies

1d17h

2024-07-26 01:16:36 UTC

I don't think they're that valuable tbh. Outside of cases where you're running completely untrusted code or emulating a different architecture, there's no strong reason to pick any VM over one of the several container paradigms.

yjftsjthsd-h

1 replies

1d15h

2024-07-26 02:57:44 UTC

One more usecase - which I admit is niche - is that I want a way to run multiple OSs as transparently as possible. A microvm that boots freebsd in less than a second and that acts almost like a native container would be excellent for certain development work.

Edit: Actually it's not just cross-OS work in the freebsd/linux sense; it would also be nice for doing kernel dev work. Edit my linux kernel module, compile, and then spawn a VM that boots and starts running tests in seconds.

packetlost

0 replies

1d4h

2024-07-26 13:57:58 UTC

Yeah, there are definitely still cases, but they're when you specifically want/need a different kernel, which as you said are rather niche.

mountainriver

0 replies

1d18h

2024-07-25 23:44:40 UTC

Oh they exist! Several of them in fact, they have never picked up a ton of steam though

topspin

6 replies

1d22h

2024-07-25 20:06:36 UTC

Isn't this discussion based on a false dichotomy? I, too, use VMs to isolate customers, and I use containers within those VMs, either with or without k8s. These tools solve different problems. Containers solve software management, whereas VMs provide a high degree of isolation.

Container orchestration is where I see the great mistake in all of this. I consider everything running in a k8s cluster to be one "blast domain." Containers can be escaped. Faulty containers impact everyone relying on a cluster. Container orchestration is the thing I believe is "overused." It was designed to solve "hyper" scale problems, and it's being misused in far more modest use cases where VMs should prevail. I believe the existence of container orchestration and its misapplication has retarded the development of good VM tools: I dream of tools that create, deploy and manage entire VMs with the same ease as Docker, and that these tools have not matured and gained popularity because container orchestration is so easily misapplied.

Strongly disagree about containers and dev/deployment ("NOT check"). I can no longer imagine development without containers: it would be intolerable. Container repos are a godsend for deployment.

rodgerd

2 replies

1d18h

2024-07-25 23:43:16 UTC

Container orchestration is the thing I believe is "overused." It was designed to solve "hyper" scale problems, and it's being misused in far more modest use cases where VMs should prevail.

As a relatively early corporate adopter of k8s, this is absolutely correct. There are problems where k8s is actually easier than building the equivalent capability elsewhere, but a lot of uses it's put to seem to be driven more by a desire to have kubernetes on one's resume.

zarzavat

0 replies

1d1h

2024-07-26 17:15:50 UTC

but a lot of uses it's put to seem to be driven more by a desire to have kubernetes on one's resume

It’s no wonder people feel compelled to do this, given how many employers expect experience with k8s from applicants.

Kubernetes is a computer worm that spreads via resumes.

topspin

0 replies

1d17h

2024-07-26 01:14:38 UTC

For what k8s was designed to do -- herding vast quantities of ephemeral compute resources across a global network -- it's great. That's not my problem with it. My problem is that by being widely misapplied it has stunted the development of good solutions to everything else. K8s users spend their efforts trying to coax k8s to do things it was never intended to do, and so the k8s "ecosystem" has spiraled into this duplicative, esoteric, fragile, and costly bazar of complexity and overengineering.

trueismywork

1 replies

1d3h

2024-07-26 14:46:34 UTC

Which language do you develop in?

topspin

0 replies

2024-07-26 18:13:50 UTC

In no particular order; Python, Go, Perl, C, Java, Ruby, TCL, PHP, some proprietary stuff, all recently (last 1-2 years) and in different versions: Java: 8, 11 and 17, for example. Deployed to multiple environments at multiple sites, except the C, which is embedded MCU work.

marcosdumay

0 replies

1d4h

2024-07-26 14:11:05 UTC

It was designed to solve "hyper" scale problems

Even then, IMO, it makes too little sense. It would be a bit useful if non-used containers wasted a lot of resources, or if you could get an unlimited amount of them from somewhere.

But no, just creating all the containers you can and leaving them there wastes almost nothing, they are limited by the hardware you have or rent, and the thing clouds can rent are either full VMs or specialized single-application sandboxes.

AFAIK, containers solve the "how do I run both this PHP7 and this PHP8 applications on my web server?" problem, and not much more.

analognoise

4 replies

1d23h

2024-07-25 18:34:05 UTC

What do you think of Nix/NixOS?

reddit_clone

2 replies

1d23h

2024-07-25 19:28:14 UTC

But that comes _after_ you have chosen VMs over Containers yes?

If you are using VMs, I think NixOs/Guix is a good choice. Reproducible builds, Immutable OS, Immutable binaries and Dead easy rollback.

It still looks somewhat futuristic. Hopefully gets traction.

solarpunk

0 replies

1d22h

2024-07-25 20:11:01 UTC

if you're using nixos, just to do provisioning, I would argue OStree is a better fit.

bspammer

0 replies

1d21h

2024-07-25 21:00:23 UTC

Nix is actually a really nice tool for building docker images: https://xeiaso.net/talks/2024/nix-docker-build/

egberts1

0 replies

1d23h

2024-07-25 19:00:03 UTC

Nix is trying to be like macOS's DMG but its image file is bit more parse-able.

ranger207

3 replies

1d20h

2024-07-25 22:26:30 UTC

Docker's good at packaging, and Kubernetes is good at providing a single API to do all the infra stuff like scheduling, storage, and networking. I think that if someone sat down and tried to create a idealized VM management solution that covered everything between "dev pushes changes" to "user requests website" then it'd probably have a single image for each VM to run (like Docker has a single image for each container to run) then management of VM hosts, storage, networking, and scheduling VMs to run on which host would wind up looking a lot like k8s. You could certainly do that with VMs but for various path dependency reasons people do that with containers instead and nobody's got a well adopted system for doing the same with VMs

crabbone

2 replies

1d3h

2024-07-26 15:02:49 UTC

I'm sorry, but:

* Docker isn't good at packaging. When people talk about packaging, they usually understand it to include dependency management. For Docker to be good at packaging it should be able to create dependency graphs and allow users to re-create those graphs on their systems. Docker has no way of doing anything close to that. Aside from that, Docker suffers from the lack of reproducible builds, lack of upgrade protocols... It's not good at packaging... maybe it's better than something else, but there's a lot of room for improvement.

* Kubernetes doesn't provide a single API to do all the infra stuff. In fact, it provides so little, it's a mystery why anyone would think that. All those stuff like "storage", "scheduling", "networking" that you mentioned comes as add-ons (eg. CSI, CNI) which aren't developed by Kubernetes, aren't following any particular rules, have their own interfaces... Not only that, Kubernetes' integration with CSI / CNI is very lacking. For example, there's no protocol for upgrading these add-ons when upgrading Kubernetes. There's no generic interface that these add-ons have to expose to the user in order to implement common things. It's really anarchy what's going on there...

There are lots of existing VM management solutions, eg. OpenStack, VSphere -- you don't need to imagine them, they exist. They differ from Kubernetes in many ways. Very superficially, yet importantly, they don't have an easy way to automate them. For very simple tasks Kubernetes offers a very simple solution for automation. I.e. write some short YAML file. Automating eg. ESX comes down to using a library like govmomi (or something that wraps it, like Terraform). But, in the mentioned case, Terraform only managed deployment, and doesn't take care of the post-deployment maintenance... and so on.

However, the more you deal with the infra, the more you realize that the initial effort is an insignificant fraction of the overall complexity of the task you need to deal with. And that's where the management advantages of Kubernetes start to seem less appealing. I.e. you realize that you will have to write code to manage your solution, and there will be a lot of it... and a bunch of YAML files won't cut it.

ranger207

1 replies

21h16m

2024-07-26 21:13:40 UTC

Docker's dependency management solution is "include everything you need and specify a standard interface for the things you can't include like networking." There's no concern about "does the server I'm deploying to have the right version of libssl" because you just include the version you need. At most, you have to have "does the server I'm deploying to have the right version of Docker/other container runtime for the features my container uses" which are a much smaller subset of changes. Reproducible builds, yeah, but that's traditionally more a factor of making sure your build scripts are reproducible than the package management itself. Or to put it another way, dockerfiles are just as reproducible as .debs or .rpms. Upgrading is replacing the container with a new one

Kubernetes is an abstraction layer that (mostly) hides the complexity of storage networking etc. Yeah the CNIs and CSIs are complex, but for the appdev it's reduced to "write a manifest for a PV and a PVC" or "write a manifest for a service and/or ingress". In my company ops has standardized that so you add a key to your values.yaml and it'll template the rest for you. Ops has to deal with setting up that stuff in the first place, which you have to do regardless, but it's better than every appdev setting up their own way to do things

My company's a conglomerate of several acquisitions. I'm from a company that was heavy into k8s, and now I'm working on getting everyone else's apps that are currently just deployed to a VM into a container and onto a k8s cluster instead. I might shouldn't've said k8s was an API per se, but it is a standardized interface that covers >90% of what people want to do. It's much easier to debug everything when it's all running on top of k8s using the same k8s concepts than it is debugging why each idiosyncratic VM isn't working. Could you force every app to use the same set of features on VMs? Want a load balancer, just add a value to your config and the deployment process will add your VM to the F5? Yeah, it's possible, but we'd have to build it, or find a solution offered by a particular vendor. k8s already has that written and everyone uses it

crabbone

0 replies

17h37m

2024-07-27 00:52:26 UTC

This is super, super, super naive. You, essentially, just solved for the case of one. But now you need to solve for N.

Do you seriously believe you will never be in a situation where you have to run... two containers?.. With two different images? If my experience is anything to go by, even postcard Web sites often use 3-5 containers. I just finished deploying a test of our managed Kubernetes (technically, it uses containerd, but it could be using Docker). And it has ~60 containers. And this is just the management part. I.e. no user programs are running there. It's a bunch of "operators", CNIs, CSIs etc.

In other words: if your deployment was so easy that it could all fit into a single container -- you didn't have a dependency problem in the first place. But once you get realistic size deployment, you now have all the same problems. If libssl doesn't implement the same version of TLS protocol in two containers -- you are going to have a bad time. But now you also amplified this problem because you need certificates in all containers! Oh and what a fun it is to manage certificates in containers!

Kubernetes is an abstraction layer that (mostly) hides the complexity of storage networking etc

Now, be honest. You didn't really use it, did you? The complexity in eg. storage may manifest in many different ways. None of them have anything to do with Kubernetes. Here are some examples: how can multiple users access the same files concurrently? How can the same files be stored (replicated) in multiple places concurrently? What about first and second together? Should replication happen at the level of block device or filesystem? Should snapshots be incremental or full? Should user ownership be encoded into storage, or should there be an extra translation layer? Should storage allow discards when dealing with encryption? And many, many more.

Kubernetes doesn't help you with these problems. It cannot. It's not designed to. You have all the difficult storage problems whether you have Kubernetes or not. What Kubernetes offers is a possibility for the storage vendors to expose their storage product through it. Which is nothing new. All those storage products can be exposed through some other means as well.

In practice, some storage vendors who choose to expose their products through Kubernetes usually end up with a limited subset of the storage functionality exposed in such a way. So, not only storage through Kubernetes doesn't solve your problems: it adds more of them. Now you may have to work around the restrictions of Kubernetes if you want to use some unavailable features (think, for example all the Ceph CLI that you are missing when using Ceph volumes in Kubernetes: it's hundreds of commands that are suddenly unavailable to you).

----

You seem like an enthusiastic person. And you probably truly believe what you write about this stuff. But you went way above your head. You aren't really an infra developer. You kind of don't even really recognize the general patterns and problems of this field. And that's OK. You don't have to be / do that. You just happened to be a new car owner who learned how to change oil on your own, and you are trying to preach to a seasoned mechanic about the benefits and downsides of different engine designs :) Don't take it to heart. It's one of those moments where maybe years later you'll suddenly recall this conversation and feel a spike of embarrassment. Everyone has that.

ktosobcy

2 replies

1d10h

2024-07-26 07:49:46 UTC

Modern "containers" were invented to make things more reproducible ( check ) and simplify dev and deployments ( NOT check ).

Why?

I have my RPi4 and absolutely love docker(-compose) - deploying stuff/services on in it just a breeze compared to previous clusterf*k of relying on system repository for apps (or if something doesnt work)... with docker compose I have nicely separated services with dedicated databases in required version (yes, I ran into an issue that one service required newer and another older version of the database, meh)

As for development - I do development natively but again - docker makes it easier to test various scenarios...

skydhash

1 replies

1d5h

2024-07-26 13:26:13 UTC

I’ve been using LXC/Incus as lightweight VMs (my home server is an old mac mini) and I think too many software is over reliant on Docker. It’s very much “ship your computer” with a bunch of odd scripts in addition to the Dockerfiles.

ktosobcy

0 replies

1d3h

2024-07-26 15:08:44 UTC

Could you elaborate the "ship your computer"? Majority of the images are base OS (which is as lean as possible) and then just the app...

To that end, full blown VM seems even more "ship your computer" thing?

Btw. isn't LXC base for the Docker as well? It looks somewhat similar to docker and podman?

gryfft

1 replies

2024-07-25 17:38:01 UTC

I've been meaning to do a bhyve deep dive for years, my gut feelings being much the same as yours. Would appreciate any recommended reading.

Gud

0 replies

1d22h

2024-07-25 20:23:00 UTC

Read the fine manual and handbook.

diego_sandoval

1 replies

1d17h

2024-07-26 01:00:45 UTC

and Docker industrial complex that developed around solving problems created by themselves or solved decades ago.

From my perspective, it's the complete opposite: Docker is a workaround for problems created decades ago (e.g. dynamic linking), that could have been solved in a better manner, but were not.

ktosobcy

0 replies

1d10h

2024-07-26 07:52:30 UTC

there are flatpacks/appimage/whatever but they are linux-only (mostly) and still lack something akin of docker-compose...

dhx

1 replies

1d10h

2024-07-26 07:34:05 UTC

The article doesn't read to me to be an argument about whether sharing a kernel is better or worse (multiple virtual machines each with their own kernel versus multiple containers isolated by a single kernel).

The article instead reads to me as an argument for isolating customers to their own customer-specific systems so there is no web server daemon, database server, file system path or other shared system used by multiple customers.

As an aside to the article, two virtual machines each with their own kernel are generally forced to communicate with each in more complex ways through network protocols which add more complexity and increase risk of implementation flaws and vulnerabilities existing. Two processes in different cgroups with a common kernel have other simpler communication options available such as being able to read the same file directly, UNIX domain sockets, named pipes, etc.

didntcheck

0 replies

20h52m

2024-07-26 21:37:02 UTC

Yep, the article just seems to be talking about single tenancy vs multi tenancy. The VMs vs containers thing seems mostly orthogonal

benreesman

1 replies

1d14h

2024-07-26 03:54:58 UTC

Namespaces and cgroups and LXC and the whole alphabet soup, the “Docker Industrial Complex” to borrow your inspired term, this stuff can make sense if you rack your own gear: you want one level of indirection.

As I’ve said many times, putting a container on a serverless on a Xen hypervisor so you can virtualize while you virtualize? I get why The Cloud wants this, but I haven’t the foggiest idea why people sit still for it.

As a public service announcement? If you’re paying three levels of markup to have three levels of virtual machine?

You’ve been had.

Spivak

0 replies

1d1h

2024-07-26 17:01:19 UTC

You're only virtualizing once. Serverless/FaaS is just a way to run a container, and a container is just a Linux process with some knobs to let different software coexist more easily. You're still just running VMs, same as you always were, but just have a new way of putting the software you want to run on them.

tptacek

0 replies

1d22h

2024-07-25 20:07:05 UTC

Jails/Zones are not pretty much as secure as a VM. They're materially less secure: they leave cotenant workloads sharing a single kernel (not just the tiny slice of the kernel KVM manages). Most kernel LPEs are probably "Jail" escapes, and it's not feasible to filter them out with system call sandboxing, because LPEs occur in innocuous system calls, too.

tomjen3

0 replies

1d22h

2024-07-25 20:27:56 UTC

If anything Docker is underused. You should have a very good reason to make a deploy that is not Docker, or (if you really need the extra security) a VM that runs one thing only (and so is essentially a more resource requiring Docker).

If you don’t, then it becomes much harder to answer the question of what exactly is deployed on a given server and what it takes to bring it up again if it goes down hard. If you but everything in Docker files, then the answer is whatever is set in the latest docker-compose file.

tiffanyh

0 replies

1d17h

2024-07-26 01:10:49 UTC

Are Jails/Zones/Docker even security solutions?

I always used them as process isolation & dependency bundling.

mountainriver

0 replies

1d18h

2024-07-25 23:41:35 UTC

Docker is fantastic and VMs are fantastic.

I honestly can’t imagine running all the services we have without containers. It would be wildly less efficient and harder to develop on.

VMs are wonderful when you need the security

markandrewj

0 replies

1d2h

2024-07-26 15:51:45 UTC

I wish people would stop going on about BSD jails as if they are the same. I would recommend at least using jails first. Most people using container technologies are well versed in BSD jails, as well as other technologies such as LXD, CRI-O, Micro VM's, and traditional virtualization technologies (KVM).

You will encounter rough edges with any technology if you use it long enough. Container technologies require learning new skills, and this is where I personally see people often get frustrated. There is also the lean left mentality of container environments, where you are expected to be responsible for your environment, which is difficult for some. I.E. users become responsible for more then in a traditional virtualizated environment. People didn't stop using VM's, they just started using containers as well. What you should use is dependent on the workload. When you have to manage more then a single VM, and work on a larger team, the value of containers becomes more apparent. Not to mention the need to rapidly patch and update in today's environment. Often VM's don't get patched because applications aren't architected in a way to allow for updates without downtime, although it is possible. There is a mentality of 'if it's not broke, don't fix it'. There is some truth that virtualized hardware can provide bounds of seperation as well, but other things like selinux also enforce these boundaries. Not to mention containers are often running inside VM's as well.

Using ephemeral VM's is not a new concept. The idea of 'cattle vs pets', and cloud, was built on KVM (OpenStack/AWS).

lkrubner

0 replies

1d14h

2024-07-26 03:53:57 UTC

I agree. VMs rely on old technologies, and are reliable in that way. By contrast, the move to Docker then necessitated additional technologies, such as Kubernetes, and Kubernetes brought an avalanche of new technologies to help manage Docker/Kubernetes. I am wary of any technology that in theory should make things simpler but in fact draws you down a path that requires you to learn a dozen new technologies. The Docker/Kubernetes path also drove up costs, especially the cost associated with the time needed to set up the devops correctly. Anything that takes time costs money. When I was at Averon the CEO insisted on absolutely perfect reliability and therefore flawless devops, so we hired a great devops guy to help us get setup, but he needed several weeks to set everything up, and his hourly rate was expensive. We could have just "push some code to a server" and we would have saved $40,000. When I consult with early stage startups, and they worry about the cost of devops, I point out that we can start simply, by pushing some code to a server, as if this was still 2001, and we can proceed slowly and incrementally from there. While Docker/Kubernetes offers infinite scalability, I warn entrepreneurs that their first concern should be keeping things simple and therefore low cost. And then the next step is to introduce VMs, and then use something like Packer to enable the VMs to be uses as AMIs and so allow the devops to develop to the point of using Terraform -- but all of that can wait till the product actually gains some traction.

icelancer

0 replies

1d14h

2024-07-26 03:51:41 UTC

Same. We're still managing ESXi here at my company. Docker/K8s/etc are nowhere close to prod and probably never will be. Been very pleased with that decision.

I will say that Docker images get one HUGE use case at our company - CUDA images with consistent environments. CUDA/pytorch/tensorflow hell is something I couldn't imagine dealing with when I was in college studying CS a few decades ago.

everforward

0 replies

2024-07-25 17:37:30 UTC

Modern "containers" were invented to make thinks more reproducible ( check ) and simplify dev and deployments ( NOT check ).

I do strongly believe deployments of containers are easier. If you want something that parallels a raw VM, you can "docker run" the image. Things like k8s can definitely be complicated, but the parallel there is more like running a whole ESXi cluster. Having done both, there's really only a marginal difference in complexity between k8s and an ESXi cluster supporting a similar feature set.

The dev simplification is supposed to be "stop dealing with tickets from people with weird environments", though it admittedly often doesn't apply to internal application where devs have some control over the environment.

Personally FreeBSD Jails / Solaris Zones are the thing I like to dream are pretty much as secure as a VM and a perfect fit for a sane dev and ops workflow

I would be interested to hear how you use them. From my perspective, raw jails/zones are missing features and implementing those features on top of them ends up basically back at Docker (probably minus the virtual networking). E.g. jails need some way to get new copies of the code that runs in them, so you can either use Docker or write some custom Ansible/Chef/etc that does basically the same thing.

Maybe I'm wrong, and there is some zen to be found in raw-er tools.

cryptonector

0 replies

22h14m

2024-07-26 20:15:14 UTC

Jails/Zones are just heavy-duty containers. They're still not VMs. Not that VMs are enough either, given all the side-channels that abound.

cryptonector

0 replies

22h18m

2024-07-26 20:11:03 UTC

I mean, yeah, but things like rowhammer and Spectre/Meltdown, and many other side-channels are a big deal. VMs are not really enough to prevent abuse of the full panoply of side-channels known and unknown.

TheNewsIsHere

0 replies

4h4m

2024-07-27 14:25:19 UTC

I feel the exact same way.

There are so many use cases that get shoved into the latest, shiniest box just because it’s new and shiny.

A colleague of mine once suggested running a CMS we manage for customers on a serverless stack because “it would be so cheap”. When you face unexpected traffic bursts or a DDoS, it becomes very expensive, very fast. Customers don’t really want to be billed per execution during a traffic storm.

It would also have been far outside the normal environment that CMS expects, and wouldn’t have been supported by any of our commercial, vendored dependencies.

Our stack is so much less complicated without running everything in Docker, and perhaps ironically, about half of our stack runs in Kubernetes. The other half is “just software on VMs” we manage through typical tools like SSH and Ansible.

ploxiln

6 replies

1d22h

2024-07-25 20:27:21 UTC

we operate in networks where outbound MQTT and HTTPS is simply not allowed (which is why we rely on encrypted DNS traffic for device-to-Console communication)

HTTPS is not allowed (locked down for security!), so communication is smuggled over DNS? uhh ... I suspect that a lot of what the customer "security" departments do, doesn't really make sense ...

jmprspret

5 replies

1d16h

2024-07-26 02:23:25 UTC

DNS tunneling, or smuggling through DNS requests, is like a known malware C2 method. Seems really weird to (ab)use it for ""security""

thinkst

4 replies

1d11h

2024-07-26 06:34:49 UTC

Product builders can learn loads from malware in terms of deployment and operational ease. Malware needs to operate without any assistance in unknown environments. Nobody is allowing outbound comms deliberately for malware, so tunnel methods were developed.

Networks have these capabilities, inherently they're part of the specs. But only malware seems to realise that and use it. We love reusing offensive techniques for defence (see our Canarytokens stuff), and DNS comms fits that perfectly. Our customers get an actual 2-minute install, not a 2-minute-and-then-wait-a-week-for-the-firewall-rules install.

marcosdumay

1 replies

1d3h

2024-07-26 14:55:56 UTC

The problem is that when you apply the malware lessons to your software, every anti-virus starts to work against you.

jmprspret

0 replies

14h59m

2024-07-27 03:30:45 UTC

That could be true. Especially those that opt for a heuristic/application anomalous behaviour approach. But then, you can add white listing and exceptions to most AV products.

jmprspret

1 replies

1d11h

2024-07-26 07:06:17 UTC

I didn't mean to imply that so for security was a bad thing. Now I read back my comment I see that is exactly how it sounds.

I agree with you

thinkst

0 replies

1d9h

2024-07-26 08:36:56 UTC

We've got several product features that have been driven by our offensive security background... thanks for the prod, we'll blog it.

SunlitCat

6 replies

1d23h

2024-07-25 19:27:38 UTC

VMs are awesome for what they can offer. Docker (and the like) are kinda a lean VM for a specific tool scenario.

What I would like to see, would be more App virtualization software which isolates the app from the underlying OS enough to provide an safe enough cage for the app.

I know there are some commercial offerings out there (and a free one), but maybe someone can chime in has some opinions about them or know some additional ones?

stacktrust

3 replies

1d23h

2024-07-25 19:29:55 UTC

HP business PCs ship with SureClick based on OSS uXen, https://news.ycombinator.com/item?id=41071884

SunlitCat

2 replies

1d22h

2024-07-25 20:12:05 UTC

Thank you for sharing, didn't know that one!

stacktrust

1 replies

1d21h

2024-07-25 20:32:05 UTC

It's from the original Xen team. Subsequently cloned by MS as MDAG (Defender Application Guard).

SunlitCat

0 replies

1d20h

2024-07-25 21:34:46 UTC

Cool! I know MDAG and actually it's a pretty neat concept, kinda.

peddling-brink

1 replies

1d22h

2024-07-25 19:36:59 UTC

That’s what containers attempt to do. But it’s not perfect. Adding a layer like gvisor helps, but again the app is still interacting with the host kernel so kernel exploits are still possible. What additional sandboxing are you thinking of?

SunlitCat

0 replies

1d22h

2024-07-25 20:15:12 UTC

Maybe I am a bit naive, but in my mind it's just a simple software running between the OS and the tool in question which runs said software in some kind of virtualization, passing all requests to the OS after a check what they might want to do.

I know that's what said tools are offering, but installing (and running) docker on Windows feels like loading up a whole other OS insides OS, so that even VM (Software) looks lean compared to that!

But I admit, that I have no real experience with docker and the like.

osigurdson

5 replies

2d1h

2024-07-25 17:29:35 UTC

When thinking about multi-tenancy, remember that your bank doesn't have a special VM or container, just for you.

01HNNWZ0MV43FF

3 replies

2024-07-25 17:44:09 UTC

My bank doesn't even have 2FA

jmnicolas

2 replies

2024-07-25 17:56:10 UTC

Mine neither and they use a 6 numbers pincode! This is ridiculous, in comparison my home wifi password is 60+ random chars long.

leononame

0 replies

1d22h

2024-07-25 20:28:46 UTC

But they do ask you only two digits of the pin on each try and they probably will lock your account after three incorrect attempts. Not saying 6 digits is secure, but it's better than everyone using "password" if they have a string policy on incorrect attempts.

And don't hm they have 2FA for executing transactions?

I'm pretty sure banks are some of the most targeted IT systems. I don't trust them blindly, but when it comes to online security, I trust that they built a system that's reasonably well secured and other cases, I'd get my money back, similar to credit cards.

dspillett

0 replies

1d7h

2024-07-26 11:01:49 UTC

Mine, FirstDirect in the UK, recently dropped the password from “between 5 and 9 case-sensitive alphanumeric characters” to “exactly six digits” and claimed that this was just as secure as before…¹²

My guess is that either they were cutting support costs and wanted to reduce the number of calls from people who forgot their more complicated password!. Either that or they are trying to integrate a legacy system, don't have the resources/access to improve that, so reduced everything else down to its level. When raised one on of their public facing online presences someone pointed out that it is no less than other online banks do, but if they are happy being just as good but no better than other banks there is nothing for me to be loyal to should another bank come up with a juicy looking offer.

----

[1] because of course 13,759,005,982,823,100 possible combinations is no better than exactly 1,000,000 where you know most people are going to use some variant of a date of birth/marriage and makes shoulder-surfing attacks no more difficult </snark>

[2] The only way it is really just as secure as before is if there is a significant hole elsewhere so it doesn't matter what options are available there. Going from zero security to zero security is just as secure as before, no lie!

dspillett

0 replies

2024-07-25 17:44:43 UTC

No, but they do have their own VM/container(s) separate from all the other banks that use the same service, with persisted data in their own storage account with its own encryption keys, etc.

We deal with banks in DayJob - they have separate VMs/containers for their own UAT & training environments, and when the same bank that works in multiple regulatory jurisdictions they usually have systems servicing those separated too as if there were completely separate entities (only bringing aggregate data back together for higher-up reporting purposes).

fsckboy

5 replies

2024-07-25 18:04:14 UTC

just as a meta idea, i'm mystified that systems folks find it impossible to create protected mode operating systems that are protected, and then we all engage in wasteful kluges like VMs.

i'm not anti-VM, they're great technology, i just don't think it should be the only way to get protection. VMs are incredibly inefficient... what's that you say, they're not? ok, then why aren't they integrated into protected mode OSes so that they will actually be protected?

toast0

0 replies

1d23h

2024-07-25 18:54:09 UTC

Windows has Virtualization Based Security [1], where if your system has the right hardware and the right settings, it will use the virtualization support to get you a more protected environment. IO-MMU seems like it was designed for virtualization, but you can use it in a non-virtualized setting too, etc.

[1] https://learn.microsoft.com/en-us/windows-hardware/design/de...

ploxiln

0 replies

1d21h

2024-07-25 20:35:16 UTC

The industry tends to do this everywhere: we have a system to contain things, we made a mess of it, now we want to contain separate instances of the systems.

For example, in AWS or GCP, you can isolate stuff for different environments or teams with security groups and IAM policies. You can separate them with separate VPCs that can't talk to each other. In GCP you can separate them with "projects". But soon that's not enough, companies want separate AWS accounts for separate teams or environments, and they need to be grouped under a parent org account, and you can have policies that grant ability to assume roles cross-account ... then you need separate associated groups of AWS accounts for separate divisions!

It really never ends, companies will always want to take whatever nested mess they have, and instead of cleaning it up, just nest it one level further. That's why we'll be running wasm in separate processes in separate containers in separate VMs on many-core servers (probably managed with another level of virtualization, but who can tell).

dale_glass

0 replies

1d10h

2024-07-26 08:03:52 UTC

Security is easier when the attack surface is limited.

An OS provides a huge amount of functionality and offers access to vast amounts of complex shared resources. Anywhere in that there can be holes.

A VM is conceptually simpler. We don't have to prove there's no way to get to a root exploit from a myriad services running as root but available to a normal application. We're concerned about things like that a VM won't access a disk belonging to another. Which is a far simpler problem.

bigbones

0 replies

2024-07-25 18:22:12 UTC

Because it would defeat the purpose. Turns out we don't trust the systems folks all that much

Bognar

0 replies

1d1h

2024-07-26 16:37:15 UTC

VMs as an isolation concept at the processor level are actually quite efficient, but unfortunately we use that to run whole operating systems which impose their own inefficiency. Micro-VMs that just run a process without an OS (or with an OS shim) are possible but we don't yet have good frameworks for building and using them.

tptacek

4 replies

1d22h

2024-07-25 20:04:27 UTC

The cool kids have been combining containers and hardware virtualization for something like 10 years now (back to QEMU-Lite and kvmtool). Don't use containers if the abstraction gets in your way, of course, but if they work for you --- as a mechanism for packaging and shipping software and coordinating deployments --- there's no reason you need to roll all the way back to individually managed EC2 instances.

A short survey on this stuff:

https://fly.io/blog/sandboxing-and-workload-isolation/

mwcampbell

3 replies

1d22h

2024-07-25 20:10:54 UTC

Since you're here, I was just thinking about how feasible it would be to run a microVM-per-tenant setup like this on Fly. I guess it would require some automation to create a Fly app for each customer. Is this something you all have thought about?

tptacek

1 replies

1d22h

2024-07-25 20:20:01 UTC

Extraordinarily easy. It's a design goal of the system. I don't want to crud up the thread; this whole "container vs. VM vs. dedicated hardware" debate is dear to my heart. But feel free to drop me a line if you're interested in our take on it.

heeton

0 replies

1d4h

2024-07-26 13:43:19 UTC

I’m also interested in your take on it, if you wanted to publish a response publicly. Would love something like this for enterprise SaaS clients.

DAlperin

0 replies

1d14h

2024-07-26 03:45:41 UTC

Also to add, we already have lots of customers who use this model.

bobbob1921

3 replies

2024-07-25 18:02:40 UTC

My big struggle with docker/containers vs VMs is the storage layer (on containers). I’m sure it’s mostly lack of experience / knowledge on my end, but I never have a doubt or concern that my storage is persistent and clearly defined when using a VM based workload. I cannot say the same for my docker/container based workloads, I’m always a tad concerned about the persistence of storage, (or the resource management in regards to storage). This becomes even more true as you deal with networked storage on both platforms

amluto

1 replies

1d19h

2024-07-25 22:42:51 UTC

It absolutely boggles my mind that read-only mode is not the default in Docker. By default, every container has an extra, unnamed, writable volume: its own root. Typo in your volume mount? You’re writing to root, and you will lose data.

Of course, once this is fixed and you start using read-only containers, one wonders why “container” exists as a persistent, named concept.

danhor

0 replies

2024-07-26 17:39:21 UTC

Because unless you resort to stuff like in-ram overlayfs which will also result in data loss, a lot of system software assumes it can write anywhere and will bitterly complain if not, even if it's not "real" data, and can be very annoying to fix. That's fina for carefully engineered containers, but the usual thrown together stuff docker started with gets a lot more annoying.

imp0cat

0 replies

1d23h

2024-07-25 19:12:19 UTC

Mount those paths that you care about to local filesystem. Otherwise, you're always one `docker system prune -a -f --volumes` from a disaster.

mikewarot

2 replies

1d23h

2024-07-25 19:11:45 UTC

It's nice to see the Principle Of Least Access (POLA) in practical use. Some day, we'll have operating systems that respect it as well.

As more people wake up to the realization that we shouldn't trust code, I expect that the number of civilization wide outages will decrease.

Working in the cloud, they're not going to be able to use my other favorite security tool, the data diode. Which can positively guarantee ingress of control, while still allowing egress of reporting data.

nrr

0 replies

1d20h

2024-07-25 22:04:21 UTC

If you're coming by after the fact and scratching your head at what a data diode is, Wikipedia's page on the subject is a decent crib document. <https://en.wikipedia.org/wiki/Unidirectional_network>

fsflover

0 replies

1d22h

2024-07-25 20:16:44 UTC

Some day, we'll have operating systems that respect it as well.

Qubes OS has been relying on it for many years. My daily driver, can't recommend it enough.

smitty1e

1 replies

1d22h

2024-07-25 19:35:03 UTC

Switching to another provider would be non-trivial, and I don’t see the VM as a real benefit in this regard. The barrier to switching is still incredibly high.

This point is made in the context of VM bits, but that switching cost could (in theory, haven't done it myself) be mitigated using, e.g. Terraform.

The brace-for-shock barrier at the enterprise level is going to be exfiltrating all of that valuable data. Bezos is running a Hotel California for that data: "You can checkout any time you like, but you can never leave" (easily).

tetha

0 replies

1d21h

2024-07-25 20:31:45 UTC

Heh. We're in the process of moving a service for a few of our larger customers over due to some variety of emergencies, let's keep it at that.

It took us 2-3 days of hustling to get the stuff running and production ready and providing the right answers. This is the "Terraform and Ansible-Stuff" stage of a real failover. In a full infrastructure failover, I'd expect it to take us 1-2 very long days to get 80% running and then up to a week to be fully back on track and another week of shaking out strange issues. And then a week or two of low-availability from the ops-team.

However, for 3 large customers using that product, cybersecurity and compliance said no. They said no about 5-6 weeks ago and project to have an answer somewhere within the next 1-2 months. Until then, the amount of workarounds and frustration growing around it is rather scary. I hope I can contain it to some places in which there is no permanent damage for the infrastructure.

Tech isn't necessarily the hardest thing in some spaces.

er4hn

1 replies

1d21h

2024-07-25 20:42:09 UTC

One thing I wasn't able to grok from the article is orchestration of VMs. Are they using AWS to manage the VM lifecycles, restart them, etc?

Last time I looked into this for on-prem the solutions seemed very enterprise, pay the big bux, focused. Not a lot in the OSS space. What do people use for on-prem VM orchestration that is OSS?

jinzo

0 replies

1d21h

2024-07-25 21:28:39 UTC

Depends what is your scale, but I used oVirt and Proxmox in the past, and it was (especially oVirt) very enterprisey but OSS.

vin10

0 replies

1d22h

2024-07-25 20:05:08 UTC

If you wouldn't trust running it on your host, you probably shouldn't run it in a container as well.

- From a Docker/Moby Maintainer

udev4096

0 replies

1d4h

2024-07-26 14:01:46 UTC

Nothing here will earn us a speaking invite to CNCF events

This made me laugh for some reason

stacktrust

0 replies

1d23h

2024-07-25 18:36:43 UTC

A modern virtualization architecture can be found in the OSS pKVM L0 nested hypervisor for Android Virtualization Framework, which has some architectural overlap with HP/Bromium AX L0 + [Hyper-V | KVM | Xen] L1 + uXen L2 micro-VMs with copy-on-write memory.

A Bromium demo circa 2014 was a web browser where every tab was an isolated VM, and every HTTP request was an isolated VM. Hundreds of VMs could be launched in a couple of hundred milliseconds. Firecracker has some overlap.

> Lastly, this approach is almost certainly more expensive. Our instances sit idle for the most part and we pay EC2 a pretty penny for the privilege.

With many near-idle server VMs running identical code for each customer, there may be an opportunity to use copy-on-memory-write VMs with fast restore of unique memory state, using the techniques employed in live migration.

Xen/uXen/AX: https://www.platformsecuritysummit.com/2018/speaker/pratt/

pKVM: https://www.youtube.com/watch?v=9npebeVFbFw

solatic

0 replies

1d2h

2024-07-26 15:44:04 UTC

There's nothing in Kubernetes and containers that prevents you from running single-tenant architectures (one tenant per namespace), or from colocating all single-tenant services on the same VM, and preventing multiple customers from sharing the same VM (pod affinity and anti-affinity).

I'm not sure why the author doesn't understand that he could have his cake and eat it too.

sim7c00

0 replies

1d8h

2024-07-26 09:31:16 UTC

i wish nanoVMs were better. its a cool concept leveraging the actual VM extensions for security. but all the ones i've seen hardly get into user-mode, dont have stack protectors or other trivial security features enabled etc. (smap/smep) making it super insecure anyway.

maybe someday that market will boom a bit more, so we can run hypervisors with vms in there that host single application kind of things. like a BSD kernel that runs postgres as its init process or something. (i know thats oversimplified probarbly ::P).

there's a lot of room in the VM space for improvement ,but pretty much all of it is impossible if you need to load an entire OS multi-purpose-multi-user into the vm.....

kkfx

0 replies

1d22h

2024-07-25 20:00:58 UTC

As much stuff you add as much attack surface you have. Virtualized infra are a commercial need, an IT and Operation OBSCENITY definitively never safe in practice.

jonathanlydall

0 replies

2024-07-25 17:40:14 UTC

Sure, it’s an option which eliminates the possibility of certain types of errors, but it’s costing you the ability to pool computing resources as efficiently as you could have with a multi-tenant approach.

The author did acknowledge it’s a trade off, but the economics of this trade off may or may not make sense depending on how much you need to charge your customers to remain competitive with competing offerings.

jefurii

0 replies

1d23h

2024-07-25 18:41:56 UTC

Using VMs as the unit allows them to move to another provider if they need to. They could even move to something like an on-prem Oxide rack if they wanted. [Yes I know, TFA lists this as a "false benefit" i.e. something they think doesn't benefit them.]

javier_e06

0 replies

1d2h

2024-07-26 15:46:36 UTC

Months ago I went to the movie theater. Why a $20.00 USD bill in my hand I asked the young one (yes I am that old) for a medium pop corn. "Credit Card" only. He warned me. "You have to take cash" I reminded me. He directed me to the box office where I had to purchase a $20 USD gift card which I then used to purchase the pop-corn. I never used the remaining balance. Management does not trust the crew of low wage minions, with cash, who would?

I had my popcorn right? What is the complain here?

I network comes done, stores will have no choice but to hand the food for free.

I am currently not trouble shooting my solutions. I am trouble shooting the VM.

ianpurton

0 replies

1d12h

2024-07-26 06:29:51 UTC

I've solved the same problem but used Kubernetes namespaces instead.

Each customer gets their own namespace and a namespace is locked down in terms of networking and I deploy Postgres in each namespace using the Postgres operator.

I've built an operator for my app, so deploying the app into a namespace is as simple as deploying the manifest.

coppsilgold

0 replies

1d17h

2024-07-26 00:34:35 UTC

If you think about it virtualization is just a narrowing of the application-kernel interface. In a standard setting the application has a wide kernel interface available to it with dozens (ex. seccomp) to 100's of syscalls. A vulnerablility in any one of which could result in full system compromise.

With virtualization the attack surface is narrowed to pretty much just the virtualization interface.

The problem with current virtualization (or more specifically, the VMM's) is that it can be cumbersome, for example memory management is a serious annoyance. The kernel is built to hog memory for cache and etc. but you don't want the guest to be doing that - since you want to overcommit memory as guests will rarely use 100% of what is given to them (especially when the guest is just a jailed singular application), workarounds such as free page reporting and drop_caches hacks exist.

I would expect eventually to see high performance custom kernels for a application jails - for example: gVisor[1] acts as a syscall interceptor (and can use KVM too!) and a custom kernel. Or a modified linux kernel with patched pain points for the guest.

In effect what virtualization achieves is the ability to rollback much of the advantage of having an operating system in the first place in exchange for securely isolating the workload. But because the workload expects an underlying operating system to serve it, one has to be provided to it. So now you have a host operating system and a guest operating system and some narrow interface between the two to not be a complete clown show. As you grow the interface to properly slave the guest to the host to reduce resource consumption and gain more control you will eventually end up reimagining the operating system perhaps? Or come full circle to the BSD jail idea - imagine the host kernel having hooks into every guest kernel syscall, is this not a BSD jail with extra steps?

[1] <https://gvisor.dev/>

Thaxll

0 replies

1d2h

2024-07-26 15:53:14 UTC

You could use different nodepool per customers using the same k8s control plane.

Melatonic

0 replies

1d14h

2024-07-26 03:37:47 UTC

Eventually we'll get a great system managing some form of micro VM that lots of people use and we have years of documentation and troubleshooting on

Until then the debate between VM and Containerisation will continue

JohnCClarke

0 replies

1d18h

2024-07-26 00:21:01 UTC

Question: Could you get the customer isolation by running all console access through customer specific lambdas which simply add a unique (and secret) header to all requests. Then you can run a single database with sets of tables keyed by that secret header value.

Would give you very nearly as good isolation for much lower cost.

JackSlateur

0 replies

2024-07-26 18:29:04 UTC

Boarf

This can be boiled down to "we use AWS' built-in security, not our own". Using EC2 instances is then nothing but a choice. You could do the exact same thing with containers (with fargate, perhaps ?) : one container per tenant, no relations between containers => same things (but cheaper).

Havoc

0 replies

1d12h

2024-07-26 06:02:44 UTC

So you end up with thousands of near idle AWS instances?

There has got to be a better middle ground. Like mult tenant but strong splits ( each customer on db etc )