I would highly recommend Fooocus to anyone who hasn't tried it: https://github.com/lllyasviel/Fooocus
There are a bajillion local SD pipelines, but this one is, by far, the one with the highest quality output out-of-the-box, with short prompts. It's remarkable.
And that's because it integrates a bajillion SDXL augmentations that other UIs do not implement or enable by default. I've been using Stable Diffusion since 1.5 came out, and even having followed the space extensively, setting up an equivalent pipeline in ComfyUI (much less diffusers) would be a pain. It's like a "greatest hits and best defaults" for SDXL.
Looks like a complete contraption to set up and, at first glance, very unpleasant to use compared against Noiselith.
The hundreds of Python scripts and the need for the user to touch the terminal show why something like Noiselith should exist for normal users rather than developers or programmers.
I would rather take a packaged solution that just works over a bunch of scripts requiring a terminal.
Installation/setup is dead simple. Up and running in under 3 minutes:
git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
pip3 install -r requirements_versions.txt
python3 entry_with_update.py
Let's see...
Okay. I'll need to install it? What package might that be in, hmm. Moving on, I already know it's Python.
Guess I'll use sudo...
= = =
Obviously I know better than to do this, but very few people would. This is not 'dead simple'! It's only simple for Python programmers who are already familiar with the ecosystem.
Now, fortunately the actual documentation does say to use venv. That's still not 'dead simple'; you still need to understand the commands involved. There's definitely space for a prepackaged binary.
The people that make software that does useful things, and the people that understand system security live on different planets. One day they'll meet each other and have a religious war.
This said, it's nice when developers attempt to detect the executable they need and warn which package is missing.
It depends on your platform, but if you read the readme, there's a pre-packaged release with Python embedded.
There are projects that set up "fat" Python executables or portable installs, but the problem in PyTorch ML projects is that the download would be many gigabytes.
Additionally, some package choices depend on hardware.
In the end, a lot of the more popular projects have "one click" scripts for auto installs, and there are some for Fooocus specifically, but the issue there is that it's not as visible as the main repo, and not necessarily something the dev wants to endorse.
Or you can use Stability Matrix package manager.
Yeah, VoltaML is another excellent choice in Stability Matrix.
You have to make trade-offs in software development. Fooocus trades toward the best picture and simplicity of use rather than the most beautiful interface. I think it is a good trade-off given the technology is improving at breakneck speed.
Look, DiffusionBee is still maintained but still has no SDXL support.
Anyone who bet that the technology is done and it is time to focus on the UI is making the wrong bet.
This project is really cool and I like the stated philosophy in the README. I think it's making the right trade-off in terms of setting useful defaults and not showing you 100 arcane settings. However, the installation is too hard. It's a student project and free, so I'm not criticizing the author at all, but I think it's a pretty fair and useful criticism of the software and likely a significant bottleneck to adoption.
Huh? It has a really simple interface, much much much simpler than anything else that uses SD/SDXL locally. Installation is also simple for Windows/Linux; I don't know about macOS.
I was afraid of the Python setup (even though I'm a Python developer), but yep: Make the virtualenv, install the dependencies, done. This is amazing, the images it generates are immediately beautiful.
It does look bad that it bundles GTM, though, as a sibling commenter says.
Samples:
https://imgz.org/i9oicVqo/
https://imgz.org/i8Ur3WjW/
https://imgz.org/i5j6r6TZ/
Be sure to try the styles as well. That's actually a separate input from the prompt for SDXL, and most other UIs don't implement the style prompting.
No, it's not.
There are two text encoders, but they aren't really “prompt” and “style” inputs.
Most UIs' default mode of operation sends the same input to both text encoders, but at least Comfy has nodes that support sending separate text to them. OTOH, while there may be some cases where sending different text to the two encoders helps in a predictable way, AFAIK most of the testing people have done has shown that optimal prompt adherence usually comes from sending the same text to both.
Hmm well that was a massive misunderstanding on my part.
I'm not sure of the origin, but using ViT-L (the encoder shared with SD 1.x) for what you might call the main prompt and ViT-G (the new SDXL encoder, and a successor to the one used as the sole encoder in SD 2.x) for a style prompt was a common idea shortly after SDXL launched, so it's understandable.
What are the two text encoders?
OpenCLIP ViT-G and CLIP ViT-L. The latter is the same encoder used in SD 1.x, OpenCLIP ViT-H was used as the encoder in SD 2.x, and ViT-G is, as I understand it, a successor and improvement on ViT-H.
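For anyone who wants to experiment with that split, here's a rough sketch of how sending different text to the two encoders looks in diffusers (assuming a recent release of StableDiffusionXLPipeline, where prompt feeds CLIP ViT-L and prompt_2 feeds OpenCLIP ViT-G; if prompt_2 is omitted, the same text goes to both, which is what most UIs do):

import torch
from diffusers import StableDiffusionXLPipeline

# Load SDXL base; fp16 keeps VRAM usage manageable on consumer GPUs.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a lighthouse on a cliff at dawn",              # goes to CLIP ViT-L
    prompt_2="impressionist oil painting, muted palette",  # goes to OpenCLIP ViT-G
).images[0]
image.save("lighthouse.png")

The prompts and filename here are just placeholders; as noted above, sending identical text to both encoders is usually what gives the best prompt adherence.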
Have to build it yourself on Mac, and we all know how "fun" building Python projects is
Just spent about 10 minutes building it on a MacBook Pro M1. I come with significant bias against Python projects, but getting Fooocus to run was very, very easy.
I finally had a chance to set it up and yes, it works great!
Did you get Fooocus to run on an Apple Silicon Mac with MPS support? I can't get MPS support running for the love of god. Any help would be much appreciated from me (as well as the 20 or so people currently looking for a solution on GitHub to get normal-speed generation instead of 15 minutes per image :)). Thank you!
I think so… it's an M1 Max MacBook Pro and it takes about 2-3 minutes per image. I followed all the prerequisites on their install-for-Mac page.
Mine takes about 3 minutes per image, and I didn't do anything special. Left everything at default settings after my initial install (about 5 days ago). Not speedy, but certainly not 15 minutes. Running on an M1 with 32 GB RAM.
That's good to know!
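For anyone still stuck on the MPS issue above, a quick sanity check (a generic PyTorch sketch, not anything Fooocus-specific) is to ask PyTorch whether the Metal backend is built and available; if it isn't, generation falls back to the CPU and gets very slow:

import torch

# Both should print True on a working Apple Silicon setup.
print(torch.backends.mps.is_built())      # was this PyTorch build compiled with MPS support?
print(torch.backends.mps.is_available())  # is the MPS device usable right now?

if torch.backends.mps.is_available():
    x = torch.ones(1, device="mps")  # tiny tensor op to confirm the device actually works
    print(x)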
Looks like the Web UI of the self-hosted install of Fooocus sells out the user to Google Tag Manager.
Can our entire field please realize that running this surveillance is a bad move, and just stop doing it.
I think that's coming from gradio?
Probably, auto1111 does the same - but I agree with GP it shouldn’t be there.
If it isn’t explicitly surveillance, it could effectively be.
Yeah, it is; you just need to set the env var GRADIO_ANALYTICS_ENABLED=False to stop it. That should probably be added to launch.py along with the other env vars being set at launch.
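Something along these lines near the top of launch.py would do it (a sketch; where exactly Fooocus sets its other env vars may differ), as long as it runs before gradio is imported:

import os

# Opt out of Gradio's analytics/telemetry before gradio gets imported.
os.environ.setdefault("GRADIO_ANALYTICS_ENABLED", "False")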
Is this at all usable on a CPU-only system with a ton of RAM?
Not really. There is a very fast LCM model preset now, but it's still going to be painful.
SDXL in particular isn't one of those "compute-light, bandwidth-bound" models like llama (or Fooocus's own mini prompt-expansion LLM, which in fact runs on the CPU).
There is a repo focused on CPU-only SD 1.5.
Yeah, llama runs acceptably on my server, but buying a GPU and setting it all up seems really unfun. Also much more expensive than my hobby budget allows.
You don't need a big one; even an old 4 GB GPU will massively accelerate prompt ingestion.
Yeah, Fooocus is much better if you are going for the best locally generated result. Lvmin puts all his energy into making beautiful pictures. Also, it is GPL licensed, which is a plus in my book.
Eh, I messed around with it for a while. It's okay and good for beginners, but with only a bit more effort you can get better results out of A1111 or ComfyUI.