
Keylogger discovered in image generator extension

tamimio
20 replies
23h32m

A lesson for people who run stuff without looking at the code first.

vvpan
11 replies
23h30m

Which is everybody in the world except for a handful of people.

tamimio
9 replies
23h26m

Not really, and it takes a few minutes because most of these packages (including npm) are small. You don’t have to read the WireGuard codebase because it’s reputable enough, but for obscure or unknown add-ons/package code, it’s on you to double-check, just like reading the ‘readme’.

bdlowery
2 replies
23h11m

I haven’t looked at the source code of a single npm package I’ve installed in the past 5 years.

“It takes a few minutes”

Dude my web dev projects have like 1,000s of dependencies. I’m not going to check the source code of every package tailwind requires.

fbdab103
1 replies
22h44m

Even if you did review it, a motivated attacker is not going to include an obvious exfiltrate_user_data() function. The xz backdoor was incredibly sophisticated, and one key part of the design was sneaking a "." into a single line of a build test script.

A cursory audit of primary dependencies has almost zero chance of catching anything but a brazen exploit.

redserk
0 replies
22h18m

Yeah. Realistically I think the best course of action is just to assume you’re already using a library that can exfiltrate data.

This requires allowlisting egress traffic and possibly even architecting things to prevent any one library from seeing too many things. This approach can be a big pain though and could be difficult to implement practically.

redserk
1 replies
23h19m

So just sneak the code in a dependency of a dependency.

Who’s diving 3-4 layers deep into dependencies?

netule
0 replies
23h9m

No need to hide it inside dependencies; just modify the code before building and pushing the package to PyPI.
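One concrete defense against exactly this attack is hash pinning: you record the digest of the artifact you audited and refuse anything else. A minimal sketch in Python (stdlib only; the function names here are illustrative, not part of any package):

```python
import hashlib
import hmac

def sha256_of(path: str) -> str:
    # Stream the file so large wheels don't need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_pin(path: str, pinned_hex: str) -> bool:
    # Constant-time comparison against the digest pinned at audit time.
    return hmac.compare_digest(sha256_of(path), pinned_hex)
```

pip can enforce the same check natively with `pip install --require-hashes -r requirements.txt`, which rejects any archive that doesn't match a `--hash=sha256:...` pin in the requirements file.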

gotbeans
0 replies
22h44m

Imo this makes no sense. There's zero chance you will start inspecting all dependencies even in a relatively small application, which nowadays can already pull in a large number of deps.

I don't see how doing any of this manually will help.

genter
0 replies
21h53m

Would you have caught the XZ backdoor?

froggertoaster
0 replies
23h19m

You can't "not really" this away. Most people don't bother looking at small package code, much less code for packages that are far more complex.

akira2501
0 replies
22h46m

This is why I refuse to use almost anything on npm. If you have a zero dependency project I'll consider it. If you have a dependency that also has a set of dependencies then I will never use your code.

creata
0 replies
21h50m

Most people should only download software from people they trust (to not be evil and also to be competent).

If you download code off some unknown person's GitHub repo, you'd be stupid not to read it very very carefully!

Geee
5 replies
23h23m

Ain't nobody got time for that. LLMs should be capable of analysing code for anything malicious / suspicious.

kibwen
2 replies
23h11m

Unfortunately, no: any LLM that can automatically flag suspicious code will be offset by LLMs that can generate malicious code which bypasses its detection abilities.

CrazyStat
1 replies
23h5m

Generative Adversarial LLMs, let’s go!

protosam
0 replies
21h49m

Perhaps we could just call these ALLMs (Adversarial Large Language Models). You’re already dropping the N in GAN; I see no need for the G.

As an end result, I think someone clever could make a LLaMA pun for the name of a LLaMA-based ALLM.

yonixw
0 replies
23h14m

Since LLMs and keyloggers are Turing machines, it won't happen. (Or more precisely: it won't beat the cat-and-mouse game of obfuscation.)

astromaniak
0 replies
23h9m

No, they cannot work with large codebases, not yet. And they have very limited talent for logic and debugging. They may improve at some point, and will probably be hooked up with external tools.

StressedDev
1 replies
21h48m

Everyone runs code they have not inspected. For example, almost no one has read all of the code in FreeBSD, Linux (the kernel), macOS, OpenBSD, or Windows. I also doubt people are reading all of the code in their favorite Linux distribution.

Even inspecting the code is not enough because a lot of security vulnerabilities are not obvious. Basically, security is hard, and often there are not a lot of good solutions.

Here are some tricks I have found which have helped me minimize my risk:

1) Use different machines for different purposes. Basically, you should not use 1 PC (or Mac) for everything. I have one for my finances, one for gaming, and a general-purpose PC. If one gets hacked, the others are still fine.

2) Get software from trustworthy sources. Most of the major software companies are not going to ship malicious code. For open-source software, use software from popular projects which have a good reputation.

3) Ask yourself why someone is providing this software. Is it for money? Are they creating it because they enjoy it? How do they support themselves? For example, Google's business model is building a dossier on people so it can deliver ads they are more likely to click on. When Google gives you something for "free", it will probably use it to track you, or to track visitors to your website.

4) Support the people who build the software you use. If it's commercial software, pay for it; do not pirate it. If it's open source, donate time or money to the projects you use. Also, thank the people who work on the software, and ALWAYS treat them with respect.

5) Avoid pirated software, software from "free" porn websites, etc. People who provide illegal or sketchy software are probably willing to put back doors in it.

creata
0 replies
21h42m

“For open-source software, use software from popular projects which have a good reputation.”

On this topic, how much should a person trust central repositories of well-known operating system distributions (e.g. Arch, Debian)? I know only trusted people can upload to them, and the only time I've ever heard of malware slipping past them was XZ, but I don't know how much care they take.

14
11 replies
1d

Is there no way to defend against a keylogger? What can you do if a simple keylogger can steal your passwords?

Retr0id
3 replies
1d

Aside from not using passwords or using 2FA, sandboxing helps.

A VM with GPU passthrough set up would be one example (although this is usually a pain to set up and I expect most people aren't doing it).

As a more user-friendly example, if you install an iOS app (local-model LLM and image generation apps exist), the sandboxing provided by the OS ought to be more than enough to prevent keyloggers, short of 0day exploits.

nsingh2
2 replies
23h54m

Not as secure as VMs, but GPU passthrough with Docker/Podman is much easier to set up, and you can even use the GPU on the host machine at the same time.

creata
1 replies
21h37m

Are you giving it access to /dev/dri, or doing some fancier sandboxing?

(Would you even need anything fancier? I think /dev/dri is supposed to isolate users.)

nsingh2
0 replies
21h22m

Nvidia provides a toolkit to do this [1], getting a GPU into a container is as easy as running `podman run --device nvidia.com/gpu=all`. The process is similar for Docker, but rootless Docker requires some extra steps IIRC.

[1] https://docs.nvidia.com/datacenter/cloud-native/container-to...

Latty
2 replies
1d

Ideally, don't use passwords: Passkeys where supported, SSH Keys, client certificates, social login via a service that does support one of these methods.

Magic link emails can also work, but are potentially vulnerable if you copy/paste the link rather than clicking it, depending on the keylogger's capabilities and clipboard visibility. Although the window for attack is small, it's a much more sophisticated attack that leaves more traces (good sites will reject reuse).

Second best, also use a second factor: U2F ideally, TOTP with the same caveats as magic link emails, and at the bottom of the barrel SMS which is better than nothing but known to be very flawed.

Honestly, if you are anything other than a casual user, and don't have devices with support baked in already, it's crazy not to spend ~£60 on a pair of security keys for passkey/U2F. It's not a lot of money and is just so much more secure.

danieldk
1 replies
23h23m

“Ideally, don't use passwords: Passkeys where supported, SSH Keys, client certificates, social login via a service that does support one of these methods.”

If a process has the privileges to run as a keylogger, it can also grab your local SSH private keys and possibly harvest passwords and passkeys from your local password manager vault [1]. The process has local access and, since it is a keylogger, presumably your master password too. (The complexity depends a bit on the password manager; e.g., IIRC the macOS keychain always requires a roundtrip through the Secure Enclave.)

“Honestly, if you are anything other than a casual user, and don't have devices with support baked in already, it's crazy not to spend ~£60 on a pair of security keys for passkey/U2F. It's not a lot of money and is just so much more secure.”

100% this. A secure enclave or a hardware key is the only way to keep your key material safe.

Also, app sandboxing should be the default. macOS App Store Apps are sandboxed. Unfortunately, these days the standard is still for applications to have unfettered access to a user's files.

[1] Passkeys can also be on a security key, but e.g. Yubikeys only have a small number of resident key slots and I think passkeys to most people means key material synced through iCloud/1Password/your favorite cloud.

Latty
0 replies
17h3m

When I talk about passkeys, I definitely mean hardware by default, which is how most websites position it: it's normally described as "set up a passkey for this device", and in practice the vast majority of people using them will be using a fingerprint reader on a laptop or phone, because most people don't set up password managers with passkeys.

To me, using software for passkeys is a hack only power users will do, and yes, I see it as a bad idea.

Right now I believe Yubikeys can do 25 passkeys, which is a pretty low limit, but it offers enough to protect your most important accounts, and right now I doubt many people have more than 25 sites they use that support passkeys (of course, hopefully that goes up quickly).

ehsankia
1 replies
23h5m

"Keylogger" may not be the right term here? I'm not familiar with how the term is broadly used, but my definition is a tool that logs your keypresses. Here, it seems like it was scraping your Chrome/Firefox data for login cookies?

Honestly, there's quite a lot of malware that goes after those files. I wonder if there's a way to require high privilege to access Chrome/Firefox appdata, or just block other apps from it entirely.

stuffoverflow
0 replies
20h23m

Yeah, you're right, people misuse the term keylogger frequently. These kinds of malware are broadly called "stealers" and usually do not involve keylogging.

Actual keyloggers tend to be rare nowadays because they are easier to detect and because browser data is generally a more valuable target.

sureglymop
0 replies
22h18m

I mean, anything with root access can very easily use libevdev to get all keystrokes as well as mouse positions. (It's maybe 10 lines of code to do that).

So, don't run stuff as root. If it needs root access, run it in a virtual machine (personally I use qubes os for this).
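To back up the "maybe 10 lines" claim: a rough sketch that reads the raw evdev interface directly, without even needing libevdev (the device path is a guess and the struct layout assumes 64-bit Linux; it needs root, which is exactly the commenter's point):

```python
import struct

# struct input_event on 64-bit Linux: two longs (a timeval), then
# type (u16), code (u16), value (s32).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01   # event type for key/button events
KEY_DOWN = 1    # value field: 1 = press, 0 = release, 2 = autorepeat

def parse_event(raw: bytes):
    # Decode one raw event into (type, code, value); timestamps dropped.
    _sec, _usec, etype, code, value = struct.unpack(EVENT_FORMAT, raw)
    return etype, code, value

def log_keys(device: str = "/dev/input/event0"):
    # Requires root: print the scancode of every key press on the device.
    with open(device, "rb") as f:
        while raw := f.read(EVENT_SIZE):
            etype, code, value = parse_event(raw)
            if etype == EV_KEY and value == KEY_DOWN:
                print("key code:", code)
```

Run as root, `log_keys()` would print a scancode for every key pressed system-wide; the struct decode is the whole trick.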

millzlane
0 replies
1d

Use 2FA I'd imagine.

belladoreai
5 replies
1d

I have not seen a statement from Nullbulge so it's not appropriate to say that they took over the repo.

The author of the repo is claiming that their repo was hacked, but this is an obvious lie, because their very first GitHub commit is the one where they pushed the malware. Nobody would hack an empty GitHub account.

I don't know if the author of the repo is lying when they say that Nullbulge is behind the attack (perhaps the author is part of Nullbulge, perhaps not).

millzlane
4 replies
1d

I wouldn't be so sure no one would hack an idle account. I had my Spotify account taken over before I even used it. I think in my case they used my account to pump up other lesser-known artists.

janoc
1 replies
23h45m

There was also an actively exploited XSS vulnerability on GitHub in recent days.

Doesn't mean that this guy was not a malicious actor, only that one shouldn't be so quick to cast stones without evidence.

belladoreai
0 replies
23h32m

The person who created the custom node is the same person who "hacked" it. Whether or not the account is technically owned by some unrelated civilian is not important, because there is no other activity on the account.

belladoreai
1 replies
1d

Okay, sure. But if we have an account which has never had any legitimate activity on it ever - an account that has only ever been used to push malware - then I don't know if it matters much who is the "rightful owner" of the account. Things would be different if the GitHub account had some legitimate activity before the "hack".

millzlane
0 replies
1d

I agree it doesn't matter much. Could be a noob mistake by the account owner and this is damage control.

zamalek
2 replies
1d

Must be script kiddies. You have the opportunity to deploy anything to a machine that almost certainly has a powerful GPU, and choose a key logger that exists in signature databases? Genius.

uyzstvqs
0 replies
23h34m

A quick search reveals anti-AI motivated script kiddies. Also some degen NSFW "art" content on DeviantArt and Reddit under the same name, their likely origin.

Stagnant
0 replies
23h46m

Telegram and Discord webhooks are 100% signs of an unsophisticated attacker, and they are a very common sight in malware samples. GitHub is full of skiddie "info stealer" projects that use the Telegram API or Discord webhooks to deliver the stolen data. They make no sense to use, since anybody can spam the webhook endpoint. Not 100% sure about Discord, but at least in the case of Telegram anybody can even read and download all the data that has been sent to it.
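Because these stealers embed their webhook URLs verbatim, a plain pattern scan over a dependency tree catches a lot of them. A rough sketch (the regexes approximate the Discord/Telegram URL shapes and will have false negatives):

```python
import re
from pathlib import Path

# Rough patterns for hardcoded exfiltration endpoints; shapes are
# approximations of real Discord webhook / Telegram Bot API URLs.
PATTERNS = {
    "discord_webhook": re.compile(
        r"https://discord(?:app)?\.com/api/webhooks/\d+/[\w-]+"),
    "telegram_bot_api": re.compile(
        r"https://api\.telegram\.org/bot[\w:-]+/"),
}

def scan_source(text: str):
    # Return (label, matched_url) pairs found in one file's text.
    hits = []
    for label, pattern in PATTERNS.items():
        hits += [(label, m) for m in pattern.findall(text)]
    return hits

def scan_tree(root: str, exts=(".py", ".js", ".txt")):
    # Walk a dependency tree and report suspicious endpoints per file.
    report = {}
    for path in Path(root).rglob("*"):
        if path.suffix in exts and path.is_file():
            hits = scan_source(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```

Supply-chain scanners do something similar with far better heuristics; this only shows why the verbatim-webhook habit makes these samples so easy to spot.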

8fingerlouie
0 replies
22h48m

Something is fishy here.

According to the original report, the "key logger" was in the custom wheels in the requirements.txt, but looking at that repository there have been only two commits, which according to Reddit both had malicious code in them.

Of course, proper discovery would be easier if the GitHub account still existed.

Seattle3503
10 replies
21h38m

How do people feel about using docker to prevent this sort of thing? Does it strike the right balance between usability and security?

belladoreai
7 replies
21h26m

Well, Docker is great for this as long as you're not one of the unlucky few whose machine is bricked because of Docker. So, mostly yes, I suppose.

lyu07282
6 replies
20h57m

What does that even mean?

belladoreai
4 replies
20h42m

"Bricking" is when your electronic device stops working, i.e. becomes a brick. Docker is known to occasionally brick Windows machines.

jiggawatts
3 replies
20h0m

Wait… what!?

This is the first I’m hearing of this. Do you have any references?

belladoreai
2 replies
19h27m

You can find many references by googling some variations of keywords Docker, Windows, brick

ASalazarMX
1 replies
18h37m

Googled that (thanks for not providing clear references for your claims) and found that Docker can crash Windows on boot, but not "brick" it. People are still able to safe boot, run system recovery/restore, or even reinstall Windows if they choose.

Besides, bricking via software is impossible; "bricking" refers to physical devices that are unable to bootstrap anymore.

Timber-6539
0 replies
14h20m

Not exactly. A hard brick is what you are referring to, where the hardware needs an OEM-level repair/reset after corruption.

A soft brick is the actual reference here, where you can easily recover with software or a re-install.

justinclift
0 replies
20h22m

Docker itself doesn't seem to have the best quality control for their official releases, so blindly upgrading Docker will likely bite you in the ass if you do it for a few years. :(

8fingerlouie
0 replies
10h50m

What about second firewalls?

Hobbit jokes aside: yes, it pokes holes in the firewall on the machine hosting Docker. It generally creates a lot of firewall rules to isolate or permit traffic to/from containers and to expose ports.

Your "safest" bet is probably to only expose Docker containers on the localhost interface, and use a reverse proxy (Nginx/Traefik/etc.) to expose services. At least that's how I did it when I last ran Docker a few years ago.

badrunaway
9 replies
21h34m

What can be done to stop all this? We need some sort of OS-level layer to validate these things. If we put a local LLM that checks the bytecode of things being installed or run, will that solve all this? My heart goes out to those who must have lost money due to this.

creata
2 replies
21h27m

One basic measure (one part of a solution) would be to split Comfy into two parts: the part that does all the work (running plugins, generating images) should have nothing but read-only access to the files it needs, the GPU, and a socket to communicate with the other part.
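A toy version of that split, with the untrusted side as a separate process that only ever sees a job description over a pipe (the job format and worker behavior are invented for illustration; a real version would also drop privileges, block network access, and mount files read-only):

```python
import json
import subprocess
import sys

# Untrusted side: a separate process that only sees the job description
# handed to it on stdin and can only answer on stdout.
WORKER_SRC = r"""
import json, sys
job = json.load(sys.stdin)
# pretend to generate an image from the prompt; real plugin code goes here
json.dump({"ok": True, "result": "image for " + job["prompt"]}, sys.stdout)
"""

def run_job(prompt: str) -> dict:
    # Trusted side: owns the files and the network, ships the worker a
    # job over the pipe and reads back a structured reply.
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_SRC],
        input=json.dumps({"prompt": prompt}),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)
```

The sketch only shows the IPC boundary; real isolation would wrap the worker in seccomp/namespaces or a container so a malicious plugin can't reach browser profiles or SSH keys.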

badrunaway
1 replies
20h21m

You mean a cleaner API that exposes only what is necessary?

creata
0 replies
20h18m

I meant sandbox the less trusted bit.

KennyBlanken
2 replies
21h27m

Well, for one, the keylogger is detected by antivirus programs.

I keep coming across various projects whose executables trigger antivirus programs, and I think that when those triggers happen, "it's fine, don't worry" claims need to be treated with more skepticism.

At the same time, antivirus vendors need to stop being so lazy and using strings and such that are clearly part of an open source program/library for their signatures.

badrunaway
0 replies
20h19m

I believe there should be a clear indicator in the UI of every OS when any new program listens to your keystrokes. It should be the norm.

asveikau
0 replies
15h37m

If you compile a benign binary yourself which has no malware, Chrome and Windows Defender will flag it as suspicious.

I was hacking on some open source stuff targeting Win32. I posted some binaries on GitHub releases and tried to share them with others... people tell me they're flagged as malware. It isn't malware. What do I tell them?

I hear code signing helps the heuristics not flag it, but doesn't eliminate the problem.

If people working on said software want the warnings to be taken seriously, they should work on reducing false positives.

FileSorter
2 replies
21h24m

I think this is one of the use cases for a sandboxed WASM plugin system.

creata
1 replies
20h46m

But almost everyone working on these plugins really wants to use Python and PyTorch.

badrunaway
0 replies
20h20m

Has nobody ported Python to WASM yet?

creata
8 replies
21h57m

Why does there seem to be such a disregard for security in deep learning?

There's examples like this post, but also, until recently, almost every deep learning model was literally distributed as a pickle file.

chx
3 replies
21h35m

It's not specific to deep learning; practically every industry looks at security as a cost that's just not worth it. When we start throwing the CEO in jail instead of making them pay an 18.5M fine for losing the data of 41 million customers, that's when things will change. Until then, it's just the cost of doing business.

486sx33
2 replies
7h28m

Really? Throw a CEO in jail? This is just as crazy as the whole throw-the-supervisor-in-jail-if-a-worker-dies mantra in construction.

#1: Users are responsible for looking after their privacy. If you are using applications that don't allow this, you need to reject those applications.

#2: This needs to start happening en masse. People need to rise up against these crazy corporate tech companies and their bull

creata
0 replies
4h35m

I would love to live in a world where everyone did that. But that's (currently) a utopian pipe dream.

I don't know if throwing CEOs in jail is the answer, but neither is putting all the responsibility on people to make tough choices like "give up my privacy or fall out of touch with my friends" or "give up my privacy or give up the chance to get this job".

chx
0 replies
5h7m

Well, we are currently at the "we tried nothing and we are out of ideas" stage so something needs to change.

dysoco
1 replies
21h19m

From my outsider perspective, it's a field that moves very fast; there seem to be new tools released every week, so:

1) As the developer if you focus on hardening, you might be too late to release.

2) People downloading shiny new libs/files/programs constantly.

3) Influx of people not that versed in the basics of computer security playing around with local LLM models, image generators, etc.

justinclift
0 replies
20h23m

That seems like an almost exact duplicate of the Node.js/npm issues?

Those same points (the Node.js/npm versions of them) are a lot of why that ecosystem is having security and reputation issues as well.

wisemang
0 replies
21h40m

"Security is not my field, I'm a stats guy": a qualitative root cause analysis of barriers to adversarial machine learning defenses in industry [0]

[0] https://dl.acm.org/doi/abs/10.5555/3620237.3620448

sys_64738
0 replies
20h1m

Isn’t this just one of the milestones that’ll eventually happen? Blind panic due to security always occurs at some point. There must be a ‘law’ defined for this somewhere.

uyzstvqs
6 replies
23h9m

I'm curious if it'd be possible to use a Code LLM to scan GitHub repos and detect possible malware hiding in source code.

ChrisMarshallNY
2 replies
23h0m

I have a feeling that we'll be seeing some businesses built around exactly that.

bdangubic
1 replies
22h42m

Github? ;)

tehlike
0 replies
22h27m

Socket.dev is not built around this, but it does make use of it.

pingou
1 replies
22h16m

I'm afraid a few simple tweaks, especially if the hackers themselves have access to the code LLM to test their code against, will be sufficient to evade detection.

nicce
0 replies
21h44m

An endless race, like with anti-virus software.

dmazzoni
0 replies
20h23m

If such a tool became commonplace, bad actors would just run it on their own malware and keep tweaking it until the LLM failed to detect it.

skilled
3 replies
1d

Looks like a pretty small project. Only had 40 stars on GitHub before the repo was removed.

Was this the main method of GPT4 and Claude integrations for ComfyUI?

belladoreai
1 replies
1d

It was an extension for ComfyUI, which has 37k stars on GitHub. The way ComfyUI is commonly used is that a person shares a "workflow" file, which utilizes various obscure extensions (called "custom nodes") and then the people who want to run the workflow on their own computer will install all these obscure custom nodes that have like 40 stars on GitHub or so.

szundi
0 replies
23h52m

Just like an npm install

LtWorf
0 replies
21h59m

Using stars as a popularity measure doesn't work.

I have personally never starred anything that I use. And 90% of the open source that I use isn't on GitHub.

nsingh2
2 replies
1d

Not surprised at all; ComfyUI extensions are just arbitrary Python code. The first time I tried ComfyUI extensions, I ran them in a podman container with GPU passthrough and blocked network access.

smarm52
0 replies
23h34m

Hopefully this will be just the incentive they need to do something safer. Something similar happened before the move from pickle (.pkl) to safetensors for model files.

aintnolove
2 replies
20h55m

I peered down the ComfyUI rabbit hole [1] and it is shockingly powerful. Did Adobe drop the ball on image generation? What are they doing over there? There has to be a better, more secure way to bundle up all this imagegen logic.

[1] https://learn.thinkdiffusion.com/bria-ai-for-background-remo...

orbital-decay
0 replies
18h24m

Adobe makes practical pipelines for creatives, not prototyping tools. ComfyUI is mostly for prototyping and ML nerds (I don't mean this in a bad way). There are more practical interfaces to get things done built on top of it, such as Krita Diffusion [1] and many others.

[1] https://github.com/Acly/krita-ai-diffusion

belladoreai
0 replies
20h44m

Yep, it's super powerful.

I would say that the "more secure way" is to just use ComfyUI without installing any obscure nodes from unknown developers. You can do pretty much anything using just the default nodes and the big node packs.

42lux
2 replies
1d

That discussion on Reddit really is something else: so much misinformation and pretend knowledge at work. It's as scary as the malware.

SoftTalker
1 replies
23h45m

And this is the input for AI training.....

nicce
0 replies
21h42m

Not just any input, but paid input :-)

Ukv
0 replies
23h34m

The user's reddit profile: https://archive.is/G5GIW

They have a couple of other tools hosted on HuggingFace, both having the malicious dependencies and both requiring entering API keys, namely:

"SillyTavern Character Generator": https://archive.is/gETq3 (requirements.txt: https://archive.is/xqqtA)

"Image Description with Claude Models and GPT-4 Vision": https://archive.is/6Ydgs (requirements.txt: https://archive.is/9Sp5C)

They've also posted some BeamNG mods, and were casting doubt on accusations that some other account's mod contained malware: https://archive.is/zLiaZ

That other account's reddit profile: https://archive.is/r9V1M

LtWorf
0 replies
1d

No domain and website registered?