Easy Stable Diffusion XL on your device, offline

brucethemoose2
35 replies
1d23h

I would highly recommend Fooocus to anyone who hasn't tried: https://github.com/lllyasviel/Fooocus

There are a bajillion local SD pipelines, but this one is, by far, the one with the highest-quality output out of the box, with short prompts. It's remarkable.

And that's because it integrates a bajillion SDXL augmentations that other UIs don't implement or enable by default. I've been using Stable Diffusion since 1.5 came out, and even having followed the space extensively, setting up an equivalent pipeline in ComfyUI (much less diffusers) would be a pain. It's like a "greatest hits and best defaults" for SDXL.

rvz
10 replies
1d23h

Looks like a complete contraption to set up, and very unpleasant to use at first glance when compared against Noiselith.

The hundreds of Python scripts, and the fact that the user has to touch the terminal, show why something like Noiselith should exist for normal users rather than developers or programmers.

I would rather take a packaged solution that just works over a bunch of scripts requiring a terminal.

Liquix
6 replies
1d23h

Installation/setup is dead simple. Up and running in under 3 minutes:

git clone https://github.com/lllyasviel/Fooocus.git

cd Fooocus

pip3 install -r requirements_versions.txt

python3 entry_with_update.py

Filligree
3 replies
1d21h

Let's see...

pip3: command not found

Okay. I'll need to install it? What package might that be in, hmm. Moving on; I already know it's Python.

/usr not writeable

Guess I'll use sudo...

= = =

Obviously I know better than to do this, but very few people would. This is not 'dead simple'! It's only simple for Python programmers who are already familiar with the ecosystem.

Now, fortunately the actual documentation does say to use venv. That's still not 'dead simple'; you still need to understand the commands involved. There's definitely space for a prepackaged binary.
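
For reference, a minimal venv-based setup looks something like this (a sketch, assuming Python 3 is already installed; the file names mirror the commands quoted above):

python3 -m venv venv          # create an isolated environment instead of writing to /usr
source venv/bin/activate      # activate it (on Windows: venv\Scripts\activate)
pip install -r requirements_versions.txt
python entry_with_update.py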

pixl97
0 replies
1d21h

The people that make software that does useful things, and the people that understand system security live on different planets. One day they'll meet each other and have a religious war.

This said, it's nice when developers attempt to detect the executable they need and warn what package is missing.

gerwim
0 replies
1d8h

It depends on your platform, but if you read the readme, there's a prepackaged release with Python embedded.

brucethemoose2
0 replies
1d21h

There are projects that set up "fat" Python executables or portable installs, but the problem in PyTorch ML projects is that the download would be many gigabytes.

Additionally, some package choices depend on hardware.

In the end, a lot of the more popular projects have "one click" scripts for auto installs, and there are some for Fooocus specifically, but the issue there is that they're not as visible as the main repo, and not necessarily something the dev wants to endorse.

zirgs
1 replies
1d20h

Or you can use Stability Matrix package manager.

brucethemoose2
0 replies
1d19h

Yeah, VoltaML is another excellent choice in Stability Matrix.

liuliu
1 replies
1d23h

You have to make trade-offs in software development. Fooocus trades for the best picture rather than the most beautiful interface, and also for simplicity of use. I think that is a good trade-off given that the technology is improving at breakneck speed.

Look, DiffusionBee is still maintained but still has no SDXL support.

Anyone who bet that the technology is done and it is time to focus on the UI is making the wrong bet.

rgbrgb
0 replies
1d22h

This project is really cool and I like the stated philosophy in the README. I think it's making the right trade-off in setting useful defaults and not showing you 100 arcane settings. However, the installation is too hard. It's a student project and free, so I'm not criticizing the author at all, but I think this is a pretty fair and useful criticism of the software, and likely a significant bottleneck to adoption.

Tiberium
0 replies
1d23h

Huh? It has a really simple interface, much much much simpler than anything else that uses SD/SDXL locally. Installation is also simple for Windows/Linux; I don't know about macOS.

stavros
6 replies
1d17h

I was afraid of the Python setup (even though I'm a Python developer), but yep: make the virtualenv, install the dependencies, done. This is amazing; the images it generates are immediately beautiful.

It does look bad that it bundles GTM, though, as a sibling commenter says.

Samples:

https://imgz.org/i9oicVqo/

https://imgz.org/i8Ur3WjW/

https://imgz.org/i5j6r6TZ/

brucethemoose2
5 replies
1d15h

Be sure to try the styles as well. That's actually a separate input from the prompt for SDXL, and most other UIs don't implement the style prompting.

dragonwriter
4 replies
1d10h

> Be sure to try the styles as well. That's actually a separate input from the prompt for SDXL.

No, it's not.

There are two text encoders, but they aren't really “prompt” and “style” inputs.

> and most other UIs don't implement the style prompting.

Most UIs' default mode of operation sends the same input to both text encoders, but at least Comfy has nodes that support sending separate text to them. OTOH, while there may be some cases where sending different text to the two encoders helps in a predictable way, AFAIK most of the testing people have done has shown that optimal prompt adherence usually comes from sending the same text to both.
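
For anyone curious what sending separate text to the two encoders looks like in code, here is a minimal diffusers sketch (model name and prompts are illustrative): in the SDXL pipeline, prompt feeds CLIP ViT-L and the optional prompt_2 feeds OpenCLIP ViT-G.

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# prompt goes to the CLIP ViT-L encoder; prompt_2 goes to OpenCLIP ViT-G.
# Omitting prompt_2 sends the same text to both encoders (the default).
image = pipe(
    prompt="a portrait photo of an astronaut",
    prompt_2="cinematic, dramatic lighting, film grain",
).images[0]
image.save("astronaut.png")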

brucethemoose2
1 replies
1d10h

Hmm well that was a massive misunderstanding on my part.

dragonwriter
0 replies
1d7h

I'm not sure of the origin, but using ViT-L (the encoder shared with SD 1.x) for what you might call the main prompt and ViT-G (the new SDXL encoder, and also a successor to the encoder used as the single encoder in SD 2.x) for a style prompt was a common idea shortly after SDXL launched, so it's understandable.

antman
1 replies
1d8h

What are the two text encoders?

dragonwriter
0 replies
1d7h

OpenCLIP ViT-G and CLIP ViT-L. The latter is the same encoder used in SD 1.x, OpenCLIP ViT-H was used as the encoder in SD 2.x, and ViT-G is, as I understand it, a successor and improvement on ViT-H.

pmarreck
6 replies
1d23h

Have to build it yourself on Mac, and we all know how "fun" building Python projects is

jessepasley
5 replies
1d21h

Just spent about 10 minutes building it on a MacBook Pro M1. I came in with significant bias against Python projects, but getting Fooocus to run was very, very easy.

pmarreck
3 replies
1d16h

I finally had a chance to set it up and yes, it works great!

mrbmbzl
2 replies
1d15h

Did you get Fooocus to run on an Apple Silicon Mac with MPS support? I can't get MPS support running for the love of god. Any help would be much appreciated by me (as well as the 20-plus people currently looking for a solution on GitHub to achieve normal-speed generation, compared to 15 minutes per image :)). Thank you!

pmarreck
0 replies
3h7m

I think so… it's an M1 Max MacBook Pro and it takes about 2-3 minutes per image. I followed all the prerequisites on their Mac install page.

jmpavlec
0 replies
11h26m

Mine takes about 3 minutes per image; I didn't do anything special. Left everything at default settings after my initial install (about 5 days ago). Not speedy, but certainly not 15 minutes. Running on an M1 with 32GB of RAM.

pmarreck
0 replies
1d19h

That's good to know!

neilv
3 replies
1d20h

Looks like the Web UI of the self-hosted install of Fooocus sells out the user to Google Tag Manager.

Can our entire field please realize that running this surveillance is a bad move, and just stop doing it.

stoobs
2 replies
1d16h

I think that's coming from Gradio?

SV_BubbleTime
1 replies
1d16h

Probably, auto1111 does the same - but I agree with GP it shouldn’t be there.

If it isn’t explicitly surveillance, it could effectively be.

stoobs
0 replies
1d15h

Yeah, it is; you just need to set the env var GRADIO_ANALYTICS_ENABLED=False to stop it. It should probably be added to launch.py along with the other env vars set at launch.
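
A sketch of what that might look like near the top of launch.py (GRADIO_ANALYTICS_ENABLED is a real Gradio setting; placing it in launch.py is just the suggestion above):

import os

# Opt out of Gradio telemetry; must run before gradio is imported anywhere.
os.environ["GRADIO_ANALYTICS_ENABLED"] = "False"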

calamari4065
3 replies
1d21h

Is this at all usable on a CPU-only system with a ton of RAM?

brucethemoose2
2 replies
1d20h

Not really. There is a very fast LCM model preset now, but it's still going to be painful.

SDXL in particular isn't one of those "compute-light, bandwidth-bound" models like llama (or Fooocus's own mini prompt-expansion LLM, which in fact runs on the CPU).

There is a repo focused on CPU-only SD 1.5.

calamari4065
1 replies
1d15h

Yeah, llama runs acceptably on my server, but buying a GPU and setting it all up seems really unfun. Also much more expensive than my hobby budget allows.

brucethemoose2
0 replies
1d15h

You don't need a big one; even an old 4GB GPU will massively accelerate the prompt ingestion.

liuliu
1 replies
1d23h

Yeah, Fooocus is much better if you are going for the best locally generated result. Lvmin puts all his energy into making beautiful pictures. Also, it is GPL licensed, which is a + in my book.

stoobs
0 replies
18h3m

Eh, I messed around with it for a while - it's okay and good for beginners, but without much more effort you can get better results out of A1111 or ComfyUI

rgbrgb
34 replies
2d

Just installed it; this is very cool. Local AI is the future I want (and what I'm working on too). A few notes from using it...

Pros:

- seems pretty self contained

- built in model installer works really well and helps you download anything from CivitAI (I installed https://civitai.com/models/183354/sdxl-ms-paint-portraits)

- image generation is high quality and stable

- shows intermediate steps during generation

Cons:

- downloads a 6.94GB SDXL model file somewhere without asking or showing the location/size. I just figured out you can find/modify the location in the settings.

- very slow on first generation as it loads the model; no record of how long generations take, but I'd guess a couple of minutes (M1 Max MacBook, 64GB)

- multiple user feedback modules (bottom left is a very intrusive chat thing I'll never use + top right call for beta feedback)

- not open source like competitors

- runs 7 processes, idling at ~1GB RAM usage

- non-native UX on macOS, missing the hotkeys and help menu you'd expect. Electron app?

Overall 4/5 stars, would open again :)

liuliu
15 replies
1d23h

You should check out Draw Things on macOS. It works well enough for SDXL on 8GiB macOS devices.

miles
10 replies
1d23h

Are you the developer by any chance? If so, it would be helpful to state it.

liuliu
9 replies
1d23h

I am. I thought this was obvious. My statement is objective. I would go as far as to say it is the only app that works on 8GiB macOS devices with SDXL-family models.

cellularmitosis
4 replies
1d21h

"You should check out this thing" has a very different implied context than "You should check out this thing I made". The first sounds like a recommendation from an enthusiastic user, not from the the author. Because of this, discovering that you are the author makes your recommendation feel deceptive.

liuliu
3 replies
1d19h

I am sorry if you feel that way. I joined HN when it was a small tight-knit community without much of a marketing presence. The "obvious" comment is more of a "people know other people" kind of thing. I didn't try to deceive anyone into using the app (and why would I?).

If you feel this is unannounced self-promotion, yes, it is, and can be done better.

---

Also, for the "objective" comment, it was meant to say "the original comment is still objective", not that you can be objective only by being a developer. Being a developer can obviously bias your opinion.

givinguflac
1 replies
1d17h

I think it was obvious. That said, thank you so much for your labor of love! The app is amazing! Any plans for SDXL Turbo support?

liuliu
0 replies
1d16h

Should be in a few days. I asked Stability to clarify whether I can deliver the weights through my Cloudflare bucket, and whether qualifying as non-commercial depends on who runs the model, not who delivers it.

fragmede
0 replies
1d9h

2008, when we both joined, was 15 years ago. In the interim, the userbase has grown. Most people aren't recognizable as the author of an app under discussion, so a simple "Developer here" is appreciated as it was not obvious to me.

vunderba
0 replies
1d18h

Nice app - but for future reference it is very much not obvious to any native English speaker. "You should check out X" sounds like a random recommendation.

adamjc
0 replies
1d22h

How would that be obvious to anyone?

TheHumanist
0 replies
1d19h

What do you mean it was obvious? Only the developer could make that objective statement?

ProfessorLayton
0 replies
1d19h

Whoa, well let me just say thanks for the awesome app!! It's pretty entertaining to spin this up in situations where I don't have internet (airplane, subway, etc.).

I was also surprised by how well it ran on my iPhone 11 before I replaced it with a 15 Pro.

(Let me know if you're looking for some Product Design help/advice, totally happy to contribute pro bono. No worries if not of course!)

rgbrgb
1 replies
1d21h

Thanks. Yeah, I played with your app early on and just fired it up again to see the progress. Frankly, I find the interface pretty intimidating, but it is cool that you can easily stitch generations together.

Unsolicited UX recs:

- strongly recommend a default model. The list you give is crazy long. It kind of recommends SD 1.5 in the UI text below the picker but has the last one selected by default. Many of them are called the same thing (ironically the name is "Generic" lol).

- have the panel on the left closed by default or show a simplified view that I can expand to an "advanced" view. Consider sorting the left panel controls by how often I would want to edit them (personally I'm not going to touch the model but it is the first thing).

You are doing great work but I wouldn't underestimate the value of simplifying the interface for a first-time user. It seems to have a ton of features but I don't know what I should actually be paying attention to / adjusting.

Is there a business model attached to this or do you have a hypothesis for what one might look like?

liuliu
0 replies
1d21h

Agreed on the UX feedback. The app accumulated a lot of cruft moving from the old technologies to the new. This just echoes my early feedback that co-iterating the UI and the technology is difficult: you'd better pick the side you want to be on, and there is only one correct side (and unfortunately, the current app is trying hard to be on both sides).

rcarmo
0 replies
1d4h

Any plans for SD Turbo? Both base and XL models would be a great fit for a mobile device.

heyyeah
0 replies
1d3h

Draw Things is amazing. Great work and thanks for developing it!

mikae1
8 replies
1d20h

> not open source like competitors

Who are the competitors?

quitit
3 replies
1d20h

DiffusionBee: AGPL-3.0 license (Native app)

InvokeAI: Apache license 2.0 (web-browser UI)

automatic1111: AGPL-3.0 license (web-browser UI)

ComfyUI: GPL-3.0 license (web-browser UI)

There's more, but I don't pay enough attention to it

mikae1
1 replies
1d20h

Thanks! https://lmstudio.ai/ too. For the more technically inclined perhaps.

dragonwriter
0 replies
1d19h

I don't think LM Studio competes with Stable Diffusion frontends, even for the technically inclined.

0xDEADFED5
0 replies
1d12h

For people with Intel video cards (all 10 of us!) there's also SD.Next (an automatic1111 fork): https://github.com/vladmandic/automatic

vunderba
2 replies
1d18h

I'd also recommend InvokeAI, an open source offering which has a very nice editable canvas and is very performant with diffusers.

https://github.com/invoke-ai/InvokeAI

UberFly
1 replies
1d16h

I just installed InvokeAI and wish I hadn't. It installs -so much- outside of its target directory. A1111 and ComfyUI are fairly self-contained wherever you put them.

demosthanos
0 replies
1d13h

It's all isolated in a single directory, though, right? I set it up ages ago, but my recollection is that it installs itself in ~/invoke on Linux and stays contained there.

8n4vidtmkvmk
0 replies
17h4m

I like ComfyUI the most now, but it's probably not the most beginner-friendly. It has great features, though, is extensible, and lets you build workflows that work for you and save them so you don't have to click a million times like in Auto1111.

philote
5 replies
1d23h

Another con is it only works on Silicon Macs.

Vicinity9635
4 replies
1d21h

Apple Silicon* I presume?

This could honestly be the excuse I need (want) to order an absolute beast of a macbook pro to replace my 2013 model.

wayfinder
2 replies
1d19h

If you want an absolute beast, especially for this stuff, you probably want Intel + Nvidia. Apple Silicon is a beast in power efficiency but a top of the line M3 does not come close to the top of the line Intel + Nvidia combo.

Vicinity9635
1 replies
1d19h

Well this would just be the excuse. I'm typing this on a Ryzen 5950X w/32 GB of RAM and a 4090. So I guess I already have the beast?

NorwegianDude
0 replies
1d17h

I guess EPYC and a few H100s are the next big step, at a much higher price point...

quitit
0 replies
1d20h

If it's just for hobby/interest work, then just a heads-up that even first-generation Apple Silicon will turn over about one image a second with SDXL Turbo. The M3s, of course, are quite a bit faster.

The performance gains in recent models and PyTorch are currently outpacing hardware advances by a significant margin, and there are still large amounts of low-hanging fruit in this regard.

sytelus
0 replies
1d18h

+1 for asking for the download location.

mtlmtlmtlmtl
0 replies
1d16h

Is that 1GB idle per process or total for all 7 processes?

maxdaten
0 replies
1d12h

If you are interested in the tech stack:

https://noiselith.notion.site/License-61290d5ed7ab4c918402fd...

So yes, it is an Electron app with Svelte, Headless UI, Tailwind CSS, etc.

sophrocyne
13 replies
2d

There are already a number of local inference options that are (crucially) open source, with more robust feature sets.

And if the defense here is "but Auto1111 and Comfy don't have as user-friendly a UI", that's also already covered. https://github.com/invoke-ai/InvokeAI

internet101010
7 replies
2d

I switched to InvokeAI and won't go back to the basic A1111 web UI. I like how everything is laid out, there are workflow features, you can easily recall all the properties (prompt, model, LoRA, etc.) used to generate an image, things can be organized into boards, and all of the boards/images/metadata are stored in a very well-designed SQLite database that can be tapped into via DataGrip.

quitit
6 replies
1d20h

automatic1111: great for the fast implementation of the most recent generative features

comfyui: excellent for workflows and recalling the workflows, as they're saved into the resulting image metadata (i.e., sharing an image shares the image generation pipeline)

InvokeAI: Great UX and community; arguably it was a bit behind in features while the team focused on making the UI work well. Now at the stage of bringing in the best features of competitors. Like you, I can easily recommend it above all other options.

squeaky-clean
2 replies
1d19h

> recalling the workflows, as they're saved into the resulting image metadata (i.e., sharing an image shares the image generation pipeline)

Doesn't A1111 already do this? There's a PNG Info tab where you can drag and drop a PNG and it will pull the prompt, negative prompt, model, etc. And then there's a button to send it to the main generation tab. It doesn't automatically load the model, but that may be intentional because of how long it takes to change loaded models.

quitit
0 replies
1d18h

Comfy is node-based. The saved metadata pulls up the full node workflow.

dragonwriter
0 replies
1d17h

> Doesn't A1111 already do this?

Not that provides the same thing, no, largely because of fundamental design differences.

> There's a PNG Info tab where you can drag and drop a PNG and it will pull the prompt, negative prompt, model, etc. And then there's a button to send it to the main generation tab.

A1111, by nature, has a bunch of disconnected operations in separate tabs and scripts. Even if the PNG captures all of a generation operation that would be executed by a single launch-button click, it's not really equivalent to capturing a whole ComfyUI workflow, which can be the equivalent of a process spanning numerous different tasks in A1111 with manual shuttling of data between tabs and scripts.

A1111 has a bunch of manual "send to X" buttons for the output of runs, so that they can be the input of another task, whereas in Comfy those operations are part of one workflow, with a pipeline connecting the output of one to the input of another. And when saving generation data, those manual shuttle points in A1111 are barriers to what can be part of a single saved generation.

holoduke
2 replies
1d19h

Can you actually use those workflows via some sort of API to automate them from, let's say, a Python script? I played around with Comfy. Really nice, but I would like to automate it within my own environment.

sophrocyne
0 replies
1d19h

Yeah, Invoke's nodes/workflows backend can be hit via the API. That's how the entire front-end UI (and workflow editor/IDE) are built.

I'm positive this can be done w/ Comfy too.

dragonwriter
0 replies
1d17h

> Can you actually use those workflows via some sort of API to automate them from, let's say, a Python script?

Yes, you can, and the workflow JSON format has a reduced "API form" that discards visual/UI related information.

Also, if you are using Python, you could do your automation in Comfy (as custom nodes) instead of outside, too.
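
A minimal sketch of driving it from a script (assuming a ComfyUI instance on its default port and a workflow exported via "Save (API Format)"):

import json
import urllib.request

# Load a workflow that was exported from ComfyUI in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

# Queue it on a locally running ComfyUI instance (default port 8188).
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())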

blehn
1 replies
2d

No idea whether or not the UI is user-friendly, but the installation steps alone for InvokeAI are already a barrier for 99.9% of the world. Not to say Noiselith couldn't be open-source, but it's clearly offering something different from InvokeAI.

demosthanos
0 replies
1d13h

I can't even figure out how one would install Noiselith. It has some text that says "Download for free on your PC", but it's not a button or a link. Maybe they're doing some weirdly locked-down user-agent sniffing and refuse to allow me to even attempt to download any version on Linux?

InvokeAI is installed via a script, sure, but it's also just a few clicks: download, extract, double-click on a specific file, enjoy.

smcleod
0 replies
1d19h

Yeah, InvokeAI is fantastic!

TaylorAlexander
0 replies
1d18h

Yeah "Run Stable Diffusion locally" is a weird pitch since that's already easy to do tbh.

GaggiX
0 replies
2d

Also just Krita with the diffusion AI plugin: https://github.com/Acly/krita-ai-diffusion

tracerbulletx
8 replies
2d

All the real homies use ComfyUI

weakfish
7 replies
2d

Elaborate?

tracerbulletx
6 replies
2d

I'm being kind of tongue-in-cheek, because I understand that this is about making things really easy, and ComfyUI is a node-based editor that most people would have trouble with. But the best UI for local SD generation that the community is using is https://github.com/comfyanonymous/ComfyUI

ttul
2 replies
1d22h

If you are a programmer at heart, ComfyUI will feel very comfortable (pun intended). It's basically a visual programming environment optimized for the type of compositional programming that machine learning models desire. The next thing this space needs is someone to build an API hosting every imaginable model on a vast farm of GPUs in the cloud. Use ComfyUI and other apps to orchestrate the models locally, but send data to the cloud and benefit from sharing GPU resources far more efficiently.

If anyone has a spare thousand hours to kill, I would build that and connect it up with the various front-ends, including ComfyUI, A1111, etc. Not a small amount of effort, but it would be rewarding.

dragonwriter
1 replies
1d10h

> The next thing this space needs is someone to build an API hosting every imaginable model on a vast farm of GPUs in the cloud.

So, Civitai.com, if they had an API for the on-site generation and training functions?

ttul
0 replies
1d10h

Sure, they’d be well placed to do this.

rish
2 replies
1d23h

Agreed. It's worth the learning curve for the sheer power it brings to your workflows. I've always wanted to toy around with node-based architectures, and this seemed quite easy after using A1111 extensively. The community providing ready-to-go workflows has made it quite enjoyable too.

SV_BubbleTime
1 replies
1d16h

I can't seem to get myself to switch. I've only used A1111 a dozen times, and only for funny work images… Comfy just looks rather intimidating.

rish
0 replies
1d5h

I mean A1111 is great if you are a casual user as it has everything ready at your fingertips.

When you want to apply advanced workflows and repetitive tasks or use something cutting edge then Comfy is handy.

Another A1111 alternative to try which focuses on prompt generation is:

https://github.com/lllyasviel/Fooocus

kleiba
7 replies
1d19h

Sales prompt: "Young woman with blonde curls in front of a fantasy world background, come hither eyes, sitting with her legs spread, wearing a white shirt and jeans hot pants."

I mean, really??

KolmogorovComp
2 replies
1d18h

Glad I’m not the only one who found it inappropriate. Feels very much like a dog whistle.

rcoveson
1 replies
1d18h

What's subtle about it? In the dog-whistle analogy, who are the people who can't hear the whistle?

To me this is more like yelling "ROVER! COME HERE BOY!" at the top of your lungs.

samutek
0 replies
1d16h

The actual prompt is "magic world and the girl sitting inside a computer monitor, fantasy, cinematic close up photo."

OP is just offended by the image of an attractive woman, I guess. Apparently that's "creepy" now.

momojo
1 replies
1d18h

I'm genuinely curious how many people in the open source community are pouring their sweat and blood into these projects that are, at the end of the day, enabling guys to transform their MacBooks into insta-porn-books.

SV_BubbleTime
0 replies
1d16h

How many technological revolutions do we need to go through before we just accept and admit that by default it's typically about boobs?

smcleod
0 replies
1d19h

Yeah that’s creepy as.

rcoveson
0 replies
1d18h

If the prompt wasn't somewhat sexual, divisive, or offensive it would be wide open to the chorus of "still not as good as midjourney/dall-e/imagen". Freedom from restriction is one of the main selling points.

ProllyInfamous
7 replies
2d

The 16GB (base model) M2 Pro Mini, despite its overall awesomeness (running DiffusionBee.app / etc)... does not meet Minimum System Requirements (Apple Silicon requires 32GB RAM).

So now I have to contemplate shopping for a new mac TWICE in one year (never happened before).

wsgeorge
4 replies
2d

Currently using SDXL (through Hugging Face Diffusers) on an M1 16GB Mac. It takes 4-5 minutes on average to generate an image. It's usable.
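
For context, that kind of setup is only a few lines; a minimal sketch with diffusers on Apple Silicon (model and options are illustrative):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("mps")                      # Metal backend on Apple Silicon
pipe.enable_attention_slicing()  # trades some speed for lower peak memory

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")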

ttul
3 replies
1d22h

Good lord. I can get a 2048x2048 upscaled output from a very complex ComfyUI workflow on a 4090 in 15 seconds. This includes three IPAdapter nodes, a sampling stage, a three-stage iterative latent upscaler, and multiple ControlNets. Macs are not close to competitive for inference yet.

rsynnott
2 replies
1d21h

I mean, a 4090 would appear to cost $2000, and came out a year ago; it has about 70bn transistors. The M1 could be had for $700 for a desktop, $1000 as part of a laptop, came out three years ago, and has 16bn transistors, some of which are CPU.

An M3 Ultra might be a more reasonable comparison for the 4090.

ttul
0 replies
1d10h

A very fair comment. I use an M2 MBP with max specs. It’s very powerful, but the Nvidia card draws a whole lot more power…

michaelt
0 replies
1d19h

24GB cards weren't always $2000. I've seen people on this very forum [1] who bought two 3090s for just $600 each.

Agree the prices are crazy right now, though.

[1] https://news.ycombinator.com/item?id=37438847

sophrocyne
0 replies
2d

https://github.com/invoke-ai/InvokeAI - runs on Apple Silicon and can squeeze out SDXL images on a 16GB Mac with SSD-1B or Turbo models.

myself248
0 replies
2d

When choosing a machine with non-expandable RAM, you went with the minimum configuration? That's a choice, I suppose, but the outcome wasn't exactly hard to foresee.

AuryGlenz
5 replies
2d

I realize it may be good marketing, but it's odd to have the fact that it's on device and offline be the primary differentiator when that's probably how most people use Stable Diffusion already.

I'd probably focus more on it being easy to install and use, as that's something that isn't done much. For me, if it doesn't have ControlNet, upscaling, some kind of face detailer, and preferably regional prompting, I'm out.

I also kind of wish all of these people that want to make their own SD generators would instead work on one of the open source ones that already exist.

While an app store might be a good idea, in a world with Auto1111 and all of its extensions I think it's going to go over poorly with the Stable Diffusion community, for what it's worth.

michaelt
1 replies
2d

I think there's probably a bunch of people who don't use things like A1111 because of the complexities of the download-this-which-downloads-this-which-downloads-this-then-you-manually-download-this-and-this setup model.

I can see how something simpler might appeal to new users, even if it doesn't appeal to existing users.

AuryGlenz
0 replies
1d23h

Sure, and I agree with that. As I said, I'd probably push that just as much as it being 'offline,' if not more.

solarkraft
0 replies
1d23h

I've used SD on my device, but I found it worth it to pay for the hosted version because it's much faster.

prepend
0 replies
2d

Oddly, I've found many cloud wrappers for Stable Diffusion, so I like the up-front on-device/offline description.

It was weird, when I was first playing with SD, how many packages phoned home or used VMs or whatever instead of just downloading a bunch of stuff and running it.

philipov
0 replies
2d

You hit the nail on the head when you said it's good marketing, but go all the way. The thing you find odd tells you who they want to use their product: you're not their target audience. They are trying to convert people from online-only services like DALL-E, not people who already use SD.

amelius
3 replies
2d

So it's free, but not open source.

What is the catch?

sib
2 replies
1d22h

They will have a non-free (as in beer) version once they exit beta (per the website).

SV_BubbleTime
1 replies
1d16h

With no real way to confirm it doesn’t phone home.

IDK, this all seems weird considering there are four other really good projects that do all of these things already.

stjohnswarts
0 replies
1d8h

What? There are dozens of application-level firewalls out there.

verdverm
2 replies
1d23h

This is when I feel the 24GB memory limit of the MacBook Air.

liuliu
1 replies
1d23h

Again, try Draw Things, it runs well for SDXL on 8GiB macOS devices.

verdverm
0 replies
1d23h

Yeah, I know there are options. I'm more interested in language models than image generation anyway, so llama.cpp.

dreadlordbone
2 replies
1d23h

After installation, it wouldn't run on my Windows machine unless I granted public and private network access. That kinda tripped me up, since it says "offline".

tredre3
0 replies
1d20h

I had a similar experience.

On the first run it downloads about 30GB of data. I don't know whether it would work offline on subsequent runs, because for me it never ran again without crashing!

Also, upon uninstallation it left behind all of its data (not user data, mind you, but the executable itself, its Python venv, its updater, and all the models; uninstalling basically just removed the shortcut in the Start menu).

kemotep
0 replies
1d23h

If you disconnected completely from the internet did it still run?

That is completely wrong to advertise it as “offline” if it requires an active internet connection to run.

tatrajim
1 replies
1d17h

Works beautifully on my MacBook M3 with 128GB.

ukuina
0 replies
1d11h

That's quite a luxurious config!

solarkraft
1 replies
1d23h

I find it interesting that it requires 16GB of RAM on Windows but 32 on a Mac. Unfortunately that leaves me out ...

mthoms
0 replies
1d22h

I think that's probably because RAM on a Mac is shared with the GPU. On Windows, you need 16GB of RAM plus 8GB on the GPU.

mg
1 replies
1d23h

Would it be possible to run Stable Diffusion in the browser via WebGPU?

skocznymroczny
0 replies
1d19h

evanjrowley
1 replies
1d21h

How's support for AMD GPUs? I only saw Nvidia listed.

skocznymroczny
0 replies
1d19h

The main issue with AMD is that to get reasonable performance you need to use ROCm, and ROCm is only available on Linux. They have started porting parts of ROCm to Windows, but it's not enough to be usable yet; that might be different in a few months.

alienreborn
1 replies
2d

Interesting, I will check it out to see how it compares with https://diffusionbee.com, which I have been using for the last few months for fun.

janmo
0 replies
2d

I just checked out both and Noiselith produces much, much better results.

LorenDB
1 replies
1d23h

Why do we never see AMD support in these projects?

stuckkeys
0 replies
1d23h

I think it is a matter of why AMD does not support these projects. NVIDIA is involved everywhere; AMD could easily do the same. At least that's what I have observed on the internetz.

stuckkeys
0 replies
1d23h

Installed it. Ran it. Generated. Slow for some reason. Deleted it. Looks similar to Pinokio, and that is open source.

stets
0 replies
1d22h

Definitely exciting to see more local clients come out. As mentioned in other comments, there are some great ones out already. I've used automatic1111, which is quick and doesn't require a ton of tuning. But it still has lots of knobs and options, which makes it difficult initially. Fooocus is super quick but of course offers less customization.

Then there's ComfyUI, the holy grail of complicated, but with that complication comes the ability to do so much. It is a node-based app that lets you create custom workflows. Once your image is generated, you can pipe that "node" somewhere else and modify it, e.g., upscale the image or do other things.

I'd like to see Noiselith or some of the others offer support for SDXL Turbo -- it came out only a few days ago, but in my opinion it is a complete game-changer. It can generate 512x512 images in about half a second on consumer GPUs. The images aren't crazy quality, but the ability to write a prompt like "fox in the woods", see it instantly, then add "wearing a hat" and see it instantly regenerate is so valuable. Prior to that, I'd wait 12 seconds for an image. Sounds like not a big deal, but being able to iterate so quickly makes local image gen so much more fun.
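
For anyone who wants to try that iteration loop, a minimal SDXL Turbo sketch with diffusers, following the model card's one-step, no-guidance settings (prompt and file names are illustrative):

import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# Turbo is distilled for single-step sampling without classifier-free guidance.
image = pipe(
    "fox in the woods wearing a hat",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("fox.png")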

stared
0 replies
1d23h

I keep getting "Failed to generate. Please try again" 10 seconds after model loading. It is hardly helpful, as trying again gives the same error.

Apple Silicon M1, 32GB RAM, in any case.

smcleod
0 replies
11h12m

Doesn't seem to work with SDXL Turbo or SDXL LCM

seydor
0 replies
1d21h

But what's gonna happen to all those AI valuations if we all go offline?

saintradon
0 replies
1d13h

Very much enjoying this, but I BSOD'd very hard after making an image. (Maybe my PSU or GPU is bad; I need to take a look.)

orliesaurus
0 replies
1d14h

I am running this on my RTX 3070 with 32GB of RAM and a Ryzen 5, and it is flawless!

Literally the reason why I come to HN every day! Thanks, devs :)

m3kw9
0 replies
1d23h

Does not work at all; it needs you to go and find a "model" yourself. Like, just download it for me, man.

causi
0 replies
1d11h

What's the privacy and licensing like for this? I'm honestly too ignorant to know if someone's allowed to use this for commercial purposes, or even if it's sending the generated images/prompts somewhere even if it's rendering locally.

api
0 replies
1d17h

Guernika is a decent one for Mac, available in the App Store.

NKosmatos
0 replies
1d18h

As others have stated, local AI (completely offline after the model/weight download) is the way to go. If I have the hardware, why shouldn't I be able to run all this fancy software on my own machine?

There are many great suggestions and links to other similar/better packages, so follow the comments for more info, thanks :-)