The underlying reason is that current AI businesses and models are failing to capture any significant economic value. We're going to get there, but it will take some more work. It won't be decades, but a few more years would be helpful.
He's leaving to work on decentralized AI? That's exactly what Stability AI was doing before it became clear the economics no longer work out in practice, and starting a new company wouldn't change that. (Emad is an advisory board member to a decentralized GPU company, though: https://home.otoy.com/stabilityai/ )
Obviously this is the polite way to send him off given the latest news about his leadership, but this rationale doesn't track.
Think SETI@home.
Instead of wasting all that compute on Bitcoin, we pretrain fully open models that can run on people's hardware. A 120B ternary model is the most interesting thing in the world. No one can train one now because you need a billion-dollar supercomputer.
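(For context, "ternary" means weights constrained to {-1, 0, +1}. A minimal sketch of BitNet-b1.58-style absmean quantization; the scaling rule follows that paper, everything else here is an illustrative assumption.)

    import torch

    def ternarize(w: torch.Tensor):
        # Scale by the mean absolute value, then round each weight to
        # the nearest value in {-1, 0, +1} (BitNet b1.58-style absmean).
        scale = w.abs().mean().clamp(min=1e-8)
        w_q = (w / scale).round().clamp(-1, 1)
        return w_q, scale

    w = torch.randn(4, 4)
    w_q, scale = ternarize(w)
    # At matmul time the dequantized approximation is w ~ w_q * scale,
    # so the expensive multiplies reduce to adds/subtracts/skips.
    print(w_q)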
SETI@home made sense because there was a lot of data: you download a chunk, do an expensive computation, and return a thin result.
Model training is unlike that. It's a large state that is constantly updated, and the updates require the full, up-to-date state.
This means you cannot distribute it efficiently over a slow network with many small workers.
That's why NVIDIA is providing scalable clusters with specialized interconnects, so they have ultra-low latency and massive throughput.
Even in those setups it takes, e.g., a month to train a base model.
Converted to a distributed setup, the same task would take billions of years - i.e., it's not feasible.
There aren't any known ways of contributing computation without access to the full state. That would require a completely different architecture, not just "different from transformers" but "different from gradient descent", which would basically mean creating a new branch of machine learning and starting from zero.
The safe bet is on "ain't going to happen" - better to focus on the current state of the art and keep advancing it until it builds itself and anything else we can dream of, to reach that "mission fucking accomplished".
That's wrong. What you described is data parallelism, and it would indeed be very tricky to e.g. sync gradients across machines. But this is not the only method of training neural nets (transformers or any other kind) in parallel. If we'd like to train, say, a human-brain-complexity model with 10^15 parameters, we'd need a model parallelism approach anyway. It introduces a bit of complexity, since you need to make sure that each distributed part of the model can run individually with roughly the same amount of compute, but you no longer need to worry about syncing anything (or having the entire state of anything on one machine). The real question is whether you can find enough people to run this who will never be able to run it themselves in the end, because inference alone will still require a supercluster. If you have access to that, you might as well train something on it today.
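(To make the contrast concrete, here's a toy sketch of model/pipeline parallelism in PyTorch: layers pinned to different devices, with only activations crossing the boundary. The two-GPU setup and sizes are illustrative assumptions, not anyone's production setup.)

    import torch
    import torch.nn as nn

    class TwoStageModel(nn.Module):
        # Each stage could live on a different machine; only the
        # (comparatively small) activations cross the boundary,
        # never the full parameter state.
        def __init__(self):
            super().__init__()
            self.stage1 = nn.Linear(1024, 1024).to("cuda:0")
            self.stage2 = nn.Linear(1024, 10).to("cuda:1")

        def forward(self, x):
            h = torch.relu(self.stage1(x.to("cuda:0")))
            return self.stage2(h.to("cuda:1"))  # hand activations to stage 2

    model = TwoStageModel()
    loss = model(torch.randn(32, 1024)).sum()
    loss.backward()  # autograd routes gradients back across the devices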
The lack of data parallelism is implied by the computation being performed.
You gradient descend on your state.
Each step needs to work on up-to-date state; otherwise you're computing a gradient from a state that doesn't exist anymore, and the computed delta is nonsensical if applied to the most recent state (it was calculated on an old one, so the direction your computation produced is now wrong).
You also can't calculate it without access to the whole state. You have to do a full forward and backward pass and mutate the weights.
There aren't any ways of slicing and distributing that make sense in terms of efficiency.
The reason is that too much data at too high a frequency needs to be mutated and then made readable.
That's also the reason NVIDIA is focusing so much on hyper-efficient interconnects: that's the bottleneck.
Computation itself is way ahead of data transfer in and out. Data transfer is the main problem, and there's no known architectural direction that reduces it by the several orders of magnitude distribution would require.
If somebody solves this problem, it'll mean they've solved a much more interesting one, because it'll mean you can locally uptrain a model and inject that knowledge into a bigger one arbitrarily.
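(A toy illustration of the staleness point, with made-up numbers: compute a gradient against old weights, apply it after the shared state has moved, and the update points uphill.)

    # Loss f(w) = w^2 has gradient 2w; the minimum is at w = 0.
    lr = 0.1

    w = 1.0
    stale_grad = 2 * w    # a slow worker computes the gradient at w = 1.0

    # Meanwhile faster workers have already moved the shared state:
    w = -0.5              # loss here is 0.25

    w -= lr * stale_grad  # apply the stale update: w becomes -0.7
    print(w ** 2)         # loss is now 0.49 - the stale step made things worse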
Your gradient descent is an operation on a directed acyclic graph. The graph itself is stateless. You can do parts of the graph without needing to have access to the entire graph, particularly for transformers. In fact this is already done today for training and inference of large models. The transfer bottleneck is for currently used model sizes and architectures. There's nothing to stop you from building a model so complex that compute itself becomes the bottleneck rather than data transfer. Except its ultimate usability of course, as I already mentioned.
Your DAG is big. It's stateless for a single pass; the next pass doesn't operate on it anymore, it operates on the new, updated one from the previous step. And it has fully connected sub-DAGs.
There is nothing stopping you from distributing the execution of assembly/machine code instruction by instruction across CPUs either, yet nobody does it because it makes no sense from a performance perspective.
Or Amazon driving a truck from one depot to another to unload one package at a time, to "distribute" the unloading, because "distributed = faster".
OpenAI had a more distributable algorithm: https://openai.com/research/evolution-strategies
But given that they don't seem to have worked on it since, I guess it wasn't too successful. Maybe there is a way, though.
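(For anyone curious why ES distributes better: each worker needs only the current parameters plus a random seed, and ships back one scalar reward instead of a gradient. A minimal single-process sketch on a toy objective; all hyperparameters are made up.)

    import numpy as np

    def reward(theta):
        # Toy objective: maximized at theta = [3, 3, 3, 3, 3].
        return -np.sum((theta - 3.0) ** 2)

    theta = np.zeros(5)
    sigma, lr, pop = 0.1, 0.02, 50

    for step in range(200):
        eps = np.random.randn(pop, theta.size)  # one perturbation per "worker"
        rewards = np.array([reward(theta + sigma * e) for e in eps])
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        # Estimated gradient: reward-weighted average of the perturbations.
        theta += lr / (pop * sigma) * eps.T @ rewards

    print(theta)  # approaches [3, 3, 3, 3, 3]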
Yes, if there was something interesting there, you'd think something would have happened since 2017. Reinforcement learning (which ES is compared with there) is not particularly famous for its performance (that is its biggest issue and the reason it isn't used that much). Also, transformers don't use it at all.
That is the challenging part indeed.
But if we think of mixture-of-experts models outperforming "monolithic" models, why not? Maybe instead of 8 experts you can do 1000, and that is easy to parallelize. It sounds worth exploring to me.
I don't think MoE allows for that either. You'd have to come up with a whole new architecture that allows parts to be trained independently and still somehow be merged together in the end.
This paper addresses that issue and allows fully independent training of the experts:
This one, from a couple of days ago, might address that issue as well:
I think the MoE models are trained together just like any other network though, including the dispatcher layer that has to learn which "expert" to route each token to. Perhaps you could do some kind of technically worse model architecture where the experts are trained separately, and then a more complex dispatcher that learns to utilize the individually trained experts as best it can?
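(For reference, a minimal sketch of that jointly-trained setup: a learned dispatcher scoring experts per token, with top-1 routing. Real MoEs add load-balancing losses, top-2 routing, capacity limits, etc.; all sizes here are arbitrary.)

    import torch
    import torch.nn as nn

    class TinyMoE(nn.Module):
        def __init__(self, dim=64, n_experts=8):
            super().__init__()
            self.router = nn.Linear(dim, n_experts)  # the "dispatcher"
            self.experts = nn.ModuleList(
                [nn.Linear(dim, dim) for _ in range(n_experts)]
            )

        def forward(self, x):                        # x: (tokens, dim)
            scores = self.router(x).softmax(dim=-1)
            top = scores.argmax(dim=-1)              # top-1 expert per token
            out = torch.zeros_like(x)
            for i, expert in enumerate(self.experts):
                mask = top == i
                if mask.any():
                    # Weighting by the router score keeps the dispatcher trainable.
                    out[mask] = expert(x[mask]) * scores[mask, i].unsqueeze(-1)
            return out

    y = TinyMoE()(torch.randn(16, 64))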
You're right that parameter updates typically require full state, but during my PhD I've explored some possibilities to address this limitation (unfortunately, my ideas didn't pan out in the time I had). That said, there is research that has explored this topic and made some progress, such as this paper:
Transmission speeds aren't fast enough for this, unless you crank up the batch size ridiculously high.
LoRA training/merging basically is "crank up the batch size ridiculously high" in a nutshell, right? What actually breaks when you do that?
Cranking up the batch size kills convergence.
Wonder if that can be avoided by modifying the training approach. Ideas offhand: group data by topic and train a subset of weights per node; or figure out which layers diverge the most and reduce the LR on those only.
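(To make the LoRA analogy concrete: "merging" usually means folding the low-rank delta back into the frozen base weights, and averaging several independently trained adapters is roughly averaging updates computed on disjoint data, i.e. one giant effective batch. Shapes and the plain-average merge rule below are illustrative assumptions.)

    import torch

    d, r = 512, 8
    W = torch.randn(d, d)  # frozen base weight, shared by all nodes

    # Two LoRA adapters (delta = B @ A, rank r) trained independently,
    # e.g. on different nodes with different data shards.
    adapters = [(torch.randn(d, r) * 0.01, torch.randn(r, d) * 0.01)
                for _ in range(2)]

    # Merge by averaging the low-rank deltas into the base weight.
    # This is morally one huge combined batch, which is where the
    # convergence concern above comes in.
    delta = sum(B @ A for B, A in adapters) / len(adapters)
    W_merged = W + delta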
People are estimating that Meta is buying $8-10B in GPUs this year alone.
https://www.cnbc.com/amp/2024/01/18/mark-zuckerberg-indicate...
Yeah, certainly within an order of magnitude of that number.
Yeah, but they are also doing an insane amount of inference and continuously experimenting with developing new sota models.
Reminded me of Petals, which offers distributed ML inference. Anyone tried this?
I would expect most of the big tech firms have the capital to build a gpu cluster like that.
For the record, this already exists in the open source world for Stable Diffusion.
You can host a local horde and do "SETI@home" for Stable Diffusion.
"Decentralised" AI probably means something to do with crypto. Like mint this ocin to get access to that model? That's my guess.
decentralization != coinshit
https://en.wikipedia.org/wiki/Emad_Mostaque
Read his Wikipedia page and tell me he doesn't sound like your run-of-the-mill crypto scammer.
He claims that he holds B.A. and M.A. degrees in mathematics and computer science from the University of Oxford.[7][8] However, according to him, he did not attend his graduation ceremony to receive his degrees, and therefore, he does not technically possess a BA or an MA.[7]
A pretty simple background check would answer that question. If he's claiming those credentials without actually having them, I would assume it would be common knowledge by now.
Someone became a US House Rep while lying about an education they did not have and a completely falsified resume. I wouldn't be so quick to assume that if he was lying everyone would know by now.
In the US, attending your graduation ceremony has zero bearing on whether the university recognizes that you achieved a degree. Is the UK or Oxford different in this regard? Who cares if someone attended a ceremony? This sounds fraudulent at first glance. People with legit credentials don't need to add technicalities to their claims.
Kinda like Deltec's "Deputy CEO"? (Tether's bank), or even Deltec itself:
At the start of 2021, according to their website, it was a 55 year old bank. By the end of 2021, it was a 70 year old bank!
The bank's website is a WordPress site. And their customers must be unhappy - online banking hasn't worked for nearly two years at this point.
Anyway, their Deputy CEO gave this hilarious interview from his gaming rig. A 33 year old Deputy CEO, who by his LinkedIn claimed to have graduated HEC Lausanne in Switzerland with a Master of Science at the age of 15... celebrating his graduation by immediately being named Professor of Finance at a university in Lebanon. While dividing his spare time between running hedge funds in Switzerland and uhh... Jacksonville, FL.
The name of his fund? Indepedance [sic] Weath [sic] Management. Yeah, okay.
In this hilariously inept interview, he claimed that people's claims about Deltec's money movements being several times larger than all the banking in their country was due to them misunderstanding the country's two banking licenses, the names of which he "couldn't remember right now" (the Deputy CEO of a bank who can't remember the name of banking licenses), and he "wasn't sure which one they had, but we might have both".
Once the ridicule and all this started piling on, within 24 hours, he was removed from the bank's website leadership page. When people pointed out how suspicious that looked, he was -re-added-.
The bank then deleted the company's entire website and replaced it with a minimally edited WordPress site, where most of the links and buttons were non-functional and remained so for months thereafter.
I mean fuck it, if the cryptobros want to look at all that and say "seems legit to me", alright, let em.
but now that the crypto boys are back in vogue and returning from hibernation / AI vacations due to price levels, you can combine 2 hype trends into one and capture the imagination & wallets of 2 intersecting circles of fools!
so if these days someone is talking about decentralized anything i'd bet it involves coinshit again
If you want sustainability it does
Like “here is a rug, let me pull it for you!”
I’m guessing Emad sees the “rationale” in the recent revival of crypto prices - and the subsequent demand for altcoins
AI + Crypto bull market is a recipe for grabbing tons of VC money
using old nvidia cards as space heaters
A few years ago it was all about democratized AI
Previous controversy involving the CEO (2023): https://www.forbes.com/sites/kenrickcai/2023/06/04/stable-di...
“ In reality, Mostaque has a bachelor’s degree, not a master’s degree from Oxford. The hedge fund’s banner year was followed by one so poor that it shut down months later. The U.N. hasn’t worked with him for years. And while Stable Diffusion was the main reason for his own startup Stability AI’s ascent to prominence, its source code was written by a different group of researchers. “Stability, as far as I know, did not even know about this thing when we created it,” Björn Ommer, the professor who led the research, told Forbes. “They jumped on this wagon only later on.” “
“ “What he is good at is taking other people’s work and putting his name on it, or doing stuff that you can’t check if it’s true.”
Emad had the genius idea of essentially paying for the branding rights to an AI model. This was viewed as insane in the pre-ChatGPT era, and only paid off massively in retrospect.
Also, all those 'controversies' were mostly the result of an aggrieved co-founder/investor who decided to sell their shares before SD 1.4's success. Emad may not have proven to be competent enough to run a large AI lab in the long run, but those complaints are just trivial 'controversies'.
This was viewed as insane in the pre-ChatGPT era, and only paid off massively in retrospect.
Paid off how?
Perhaps the hundreds of millions in VC investment and company value?
I'm not sure I understand. Are you saying that if a company gets a valuation because of an investment, that creates value?
How would you otherwise define value, except the value that someone is willing to assign to a share of a company or goods when purchasing?
Value can go up and down though..
They are saying it was not clear "buying branding rights to an image model" would lead to any investments, any kind of high valuation, or any other financial success. It is only clear in hindsight.
Lots of people have used SD commercially, so we've been paid off collectively
He did provide thousands of GPUs for training all those open-source and open-weights models, no?
Maybe… he certainly was good at taking credit for it. Not sure they did much more than step in, rebrand something they didn't make, and throw a bunch of AWS GPUs they couldn't actually afford at it, though.
https://petapixel.com/2023/06/05/so-many-things-dont-add-up-...
So where can I download that thing he took credit for?
People forget, moving things around and funding them and making it all work is what makes an entrepreneur. Elon Musk did that and all great founders do that. Emad doesn't have to write the code himself.
What he is good at is taking other people’s work and putting his name on it, or doing stuff that you can’t check if it’s true.
I have to say, this is a quite common ignorant statement that's said about almost every CEO.
I'm not sure if there's more to it in this particular case, but no, CEOs aren't stealing your work. Similarly, marketers aren't parasites. Designers aren't there to waste your time. Many engineers seem to hold the similar belief that others are holding them down or taking advantage of their work. This is just a cognitive bias.
Emad jumped on the train after the major inventions had been made and the PoCs built. He could not have contributed to them unless he had a time machine. (Yes, he contributed to the training costs of SD 1.4, but the point at which he made that decision was not early research.)
Sorry I'm not well versed in the story and it's still not clear to me what he did wrong.
Where is the controversy here? Is the CEO expected to contribute to research? There seems to be some context I'm missing.
Not surprised at all to see how unpopular this sentiment is on HN. For some reason HN seems to love one-dimensional stereotypes for every job that isn't theirs.
Actually, that's not quite right: even the idea that engineers are smarter than managers is very prevalent here.
wow I did NOT know this
I wonder how many ppl among VC/PE circles are also sugar coating their experiences and successes
It's the game to play. Asymmetric information with the goal to attain wealth, influence, power, ego-stroke, whatever opportunity is present.
That's a hit piece, but whatever. Isn't that what the most prominent/funded academics do? As far as I know he is generally known as the CEO of the company that makes SD, not as the creator of SD. It does look like without him these models wouldn't have evolved so much.
An Oxford MA is, confusingly, something you automatically receive a few years after leaving undergrad: https://en.wikipedia.org/wiki/Master_of_Arts_(Oxford,_Cambri...
So there's that
FWIW, Rombach and Esser, the two guys behind the original model, gave some appreciation for Emad on Twitter just now.
Really sad to see. Emad pivoting to posting crypto nonsense on twitter made me think the writing is on the wall for Stability, but I still didn't expect it so soon. I expect they'll pivot to closed models and then fade away.
crypto nonsense on twitter
got any links?
Not twitter but here he is talking about that:
You know it never ceases to amaze me how even the most respected fall prey to this money laundering scheme. If people even spent some time to read about Tether they would not touch this stuff. It's blood money.
Do you even know the tech behind crypto? Just because scammers and similar people use and promote it, doesn't make it a bad technology at all.
Tether isn’t crypto though, in the sense of being decentralized, permissionless, etc
I've never owned any Tether, so don't know much about it. How is it blood money?
Exactly how is cryptocurrency merely a "money laundering scheme"?
With a bold claim like that citations would be in order.
I suspect a lot of them know it's a scam, they just want in on it and don't want to admit it.
Wow, much of what he's saying in that video is a lie with regard to the history of latent diffusion, creating an open-source GPT-3, etc. He's just taking credit for a bunch of work he didn't have much to do with.
Translation of “stepping down to focus on distributed AI”:
Getting fired and making moves to capitalize on the current crypto boom while it lasts
Getting fired by whom? He is both the majority shareholder and controls the board
Really sad to see. Emad pivoting to posting crypto nonsense on twitter made me think the writing is on the wall for Stability
Open-source AI is a race to zero that makes little money, and Stability was facing lawsuits (especially with Getty) whose costs are mounting into the millions, while the company was already burning tens of millions.
Despite being the actual "Open AI", Stability cannot afford to sustain itself doing so.
Source on them burning tens of millions? What are they burning it on?
I mean, is AI a less sketchy space in 2024 than crypto/blockchain in 2024? Two or three years ago sure, I guess, but today?
The drama around OpenAI is well documented, there are multiple lawsuits and an SEC investigation at least in embryo, Karpathy bounced and Ilya's harder to spot than Kate Middleton (edit: please see below edit in regards to this tasteless quip). NVIDIA is pushing the Dutch East India Company by some measures of profitability with AMD's full cooperation: George Hotz doesn't knuckle under to the man easily and he's thrown in the towel on ever getting usable drivers on "gaming"-class gear. At least now I guess the Su-Huang Thanksgiving dinners will be less awkward.
Of the now over a dozen FAANG/AI "boomerangs" I know, all of them predate COVID hiring or whatever and all get crammed down on RSU grants they accumulated over years: whether or not phone calls got made it's pretty clearly on everyone's agenda to wash all the ESOP out, neutron bomb the Peninsula, and then hire everyone back at dramatically lower TC all while blowing EPS out quarter after quarter.
Meanwhile the FOMC is openly talking about looser labor markets via open market operations (that's direct government interference in free labor markets to suppress wages for the pro-capitalism folks, think a little about what capitalism is supposed to mean if you are ok with this), and this against the backdrop of an election between two men having trouble campaigning effectively because one is fighting off dozens of lawsuits including multiple felony charges and the other is flying back and forth between Kiev and Tel Aviv trying to manage two wars he can't seem to manage: IIRC Biden is in Ukraine right now trying to keep Zelenskyy from drone-bombing any more refineries of Urals crude because `CL` or whatever is up like 5% in the last three weeks which is really bad in an election year looking to get nothing but uglier: is anyone really arguing that some meme on /r/crypto is what's pushing e.g. BTC and not a pretty shaky-looking Fed?
Meanwhile over in crypto land, over the same period of time that AI and other marquee Valley tech has been turning into a scandal-plagued orgy of ugly headlines on a nearly daily basis, the regulators have actually been getting serious about sending bad actors to jail or leaning on them with the prospect (SBF, CZ), major ETFs and futures serviced by reputable exchanges (e.g. CME) have entered mainstream portfolios, and a new generation of exchanges (`dy/dx`, Vertex, Apex, Orderly) backed by conventional finance investments in robust bridge infrastructure (LayerZero) are now doing standard Island/ARCA-style efficient matching and then using the blockchain for what it's for: printing a consolidated, Reg NMS/NBBO/SIP-style consolidated tape.
As a freelancer I don't really have a dog in this fight, I judge projects by feasibility, compensation, and minimum ick factor. From the vantage point of my flow the AI projects are sketchier looking on average and below market bids on average contrasted to the blockchain projects, a stark reversal from even six months ago.
Edit: I just saw the news about Kate Middleton, I was unaware of this when I wrote the above which is in extremely poor taste in light of that news. My thoughts and prayers are with her and her family.
I and many many people around me use ChatGPT every single day in our lives. AI has a lot of hype but it’s backed by real crap that’s useful. Crypto on the other hand never really did anything practical except help people buy drugs and launder money. Or make people money in the name of investments.
I don’t know much about him. Was he into cryptocurrency before AI?
I expect they'll pivot to closed models and then fade away.
They already pivoted away from open-licensed models.
I love their product, but I was suspect of Emad ever since he said “There will be no programmers in five years.”[0]
That just sounds so simplistic that I don't believe he believes it himself.
[0] https://finance.yahoo.com/news/stability-ai-ceo-no-human-193...
I think when parsing that statement it's important to understand his (and your) definition of "programmer".
We (I) tend to use the term "programmer" in a generic way, encompassing a bunch of tasks waaay beyond "just typing in new code". Whereas I suspect he used it in the narrowest possible definition (literally, code-typer).
My day job (which I call programming) consists of needs analysis, data-modelling, workflow and UI design, coding, documenting, presenting, iterating, debugging, extending, and cycling through this loop multiple times. All while collaborating with customers, managers, co-workers, check-writers and so on.
AI can do -some- of that. And it can do small bits of it really well. It will improve in some of the other bits.
Plus, a new job description will appear: "prompt engineer".
As an aside, I prefer the term "software developer" for what I do; I think it's a better description than "programmer".
Maybe one day there'll be an AI that can do software development. Developers that don't need to eat, sleep, or take a piss. But not today.
(P.S. to companies looking to make money with AI - make them able to replace me in Zoom meetings. I'd pay for that...)
There are almost no programmers today (you need to do malloc and low-level syscalls in C to be considered a programmer).
I don't think you can be considered a programmer if you can't write your own syscall firmware code in assembly.
I don't think you can be considered a programmer if you can't perforate your own punch cards.
That's right. We invented programming AI a very long time ago, and called it an "assembler". All you had to do was tell the assembler what kind of program you wanted, and it would do the programming work for you!
Then we invented another AI to tell the assembler what kind of program you wanted, and called it a "compiler". All you had to do was tell the compiler what kind of program you wanted it to tell the assembler you wanted, and it would do all the not-exactly-programming work for you!
And so on...
The CEO of NVIDIA said basically the same thing, so I don't know if that's the best metric.
You mean the CEO of the company that just rode the AI hype wave to become one of the top 3 most valuable companies in the world? It's his fiduciary duty to say things like that, whether or not he believes them, the same as every other CEO in the AI space.
There is no fiduciary duty to lie, or make up stories you do not believe.
People really oversell fiduciary duty. Yet the whole point of top-level corporate roles is to steer a company predicated upon opinion, which means that you have great latitude to act without malfeasance.
Unfortunately when Stability shuts down Wall Street might get spooked that the bubble is popping and downvote NVDA
Well it's been almost a year since he said that. You would think we'd have lost the first chunk of programmers to AI by now.
We have, there are a bunch of bottom of the barrel executives that have done hiring freezes and layoffs under the assumption that AI would replace everything. It will go the exact same way as the outsourcing craze that swept the industry in the mid aughts. The executives that initiate the savings will be praised and rewarded lavishly, and then when everything blows up and falls apart those responsible for the short sighted and ineffective cuts will be long gone.
Are there any good numbers on how many positions this has affected?
The worst thing here is not that he doesn’t believe in it himself, but that he does. As George Costanza said, “it’s not a lie if you believe it”
No, my problem is that I don't believe he believes it. It is a lie.
I'm pretty sure Elon Musk also didn't believe in "FSD in 6 months" every time he said it, but it was just marketing.
I think Stability is in an interesting situation. A few suggestions on its direction and current state:
1. Stability AI's loss of talent at the foundational research layer is worrying. They've lost an incredibly expensive moat, and there are enough unsolved problems at the foundation layer (faster models, more energy-efficient models, etc.) for Stability to provide differentiated offerings. Step 1 should be rectifying the core issues of employment and refocusing the company as an AI lab. I have no doubt this will require a re-steering of the ship and a re-focusing of the "mission".
2. Stability AI's "mission" of building models for every modality everywhere has caused the company to lose focus. Resources are spread thin. With $100M in funding, there should be a pointed focus on certain areas, such as imaging or video. Midjourney has shown there is sufficient value capture in just one modality. StableLM, e.g., seems like an early revenue rush and a bad bet with poor differentiation.
3. There is sufficient competition at the API layer. Stability's commitment to being open source will continue to entice researchers and developers, but there should be a re-focus on improvements at the applied layer. Deep UX wrappers for image and video editing, while owning the end-to-end stack for image or video generation, would be a great focal point for Stability that separates it from the competition. People don't pay for images, they pay for images that solve their problems.
Deep UX wrappers for image and video editing, while owning the end-to-end stack for image or video generation, would be a great focal point for Stability that separates it from the competition. People don't pay for images, they pay for images that solve their problems.
Recently, during an interview [1], when questioned about OpenAI's Sora, Shantanu Narayen (Adobe's CEO) gave an interesting perspective on where value is created. His view (paraphrased generously):
GenAI entails 3 'layers': Data, Foundational Models and the Interface Layer.
Sora may not be a big threat because Adobe operates not only at the first two layers (data and foundational models) but also at the interface layer. Not only does Adobe perhaps know the needs and workflow of a moviemaker better than anyone else, but, I guess most importantly, they already have moviemakers as their customers.
So product companies like Adobe (& Microsoft, Google, etc.) are in a better position to monetize GenAI. Pure-play AI companies like OpenAI are perhaps in the B2B business. Actually, they may really be in the API business: they have great data, build great foundational models, and expose the results as APIs, which other companies, closer to their unique customers and their unique needs, can monetize, with some part of those $$ flowing back to the pure-play AI companies.
[1] At 5 mins mark.. https://www.cnbc.com/video/2024/02/20/adobe-ceo-shantanu-nar...
Not only does Adobe perhaps know the needs and workflow of a moviemaker better than anyone else
I only ever heard creatives complain about Adobe and their UI/UX and how they don’t understand their customers.
Never really used any of their products myself though. Maybe they still are best-in-class. I can’t tell.
People love Illustrator
It was good in the 00s but now it's rickety and antiquated, and the GPU acceleration was never implemented correctly.
Figma could build an Illustrator killer in 6 months if they wanted to, and Illustrator would be obliterated.
If they actually tackled this, people would be kicking themselves for having put up with the shambles that is Illustrator for so long.
Figma could build an Illustrator killer in 6 months if they wanted to, and Illustrator would be obliterated
Statements like this are almost always wrong, if for no other reason than that a technically superior alternative is rarely compelling enough by itself. If that weren't the case you would see it happen far more often…
Only because Adobe forced them to by murdering Freehand in front of their eyes.
People also hate Illustrator
I don't know Adobe's business so could be wrong, but maybe "creatives" are not their key customers? If they're focusing on enterprise sales, they're selling to enterprise decision makers.
Every user hates using microsoft products, and don't get me started on SAP. But these are gigantic companies with wildly successful products aimed at enterprise customers.
Every user hates using microsoft products
Only because they've never had a chance to experience the competition.
Having worked at IBM and had to use the Lotus office suite, I can tell you Microsoft won fair and square. And I'm not even talking about the detestable abomination that is Lotus Notes.
If they're selling to enterprise decision makers, aren't they also B2B? In which case they have the same deficiency they stated OpenAI has.
Many years ago it was a good offering, but it's becoming increasingly clear, with the outsourcing and talent drain, that the current teams working on the likes of Photoshop, After Effects, and Premiere don't actually understand how the core tools work, either in their inner workings or even in how they draw their own UI, and couldn't recreate them or change their existing behavior.
Every major change in the last 6 years has been either weird window-dressing changes to welcome panels or new-document panels, in all cases building sluggish, jank-heavy interfaces; try navigating to a folder in the Premiere one and weep as clicks take actual seconds to register.
Or just silly floating tooltips like the ones in Photoshop that also take a second to visibly draw in.
All tangible tool changes exist outside the interface, or you jump to a web interface in a window and back, with the results passed between them in a way that makes it very obvious the developers are trying to avoid touching the core tools' code.
It's very clear that Narayen's outsourcing and not being a product guy has led to this.
It's been like this since at least the late 90s. At this point Photoshop is similar to Windows in that it has at least 6 mismatched UIs from 6 different eras in it (or maybe more).
I only ever heard creatives complain about Adobe and their UI/UX and how they don’t understand their customers.
There are tools no one uses, and there are tools people constantly complain about.
Have been building a generative video editor and doing interviews with enterprise Creative Cloud users. Basically, there's a large knowledge investment in their tools, and Adobe (so far) has shown that their users' knowledge investment won't become useless, because Adobe will continue to incorporate the future into their existing knowledge stack. Adobe doesn't have to be the best, just show they won't miss the boat entirely with this next generation of tools.
Sure, there may be scope for improvement in the products, but the point is they have $billions in sales (year on year) to those customers (ad agencies, movie studios, etc.).
Thanks for linking. I agree and strongly believe product companies are in the best position to monetize Gen AI. Existing distribution channels + companies being extremely fast to add AI features.
Where start-ups like Stability need to rise to compete is at the AI-native level, e.g. with products rethought from the ground up like an AI image editor, or as foundation-level AI research companies, agents, or AI infrastructure companies.
There's no reason Stability can't play in both B2B and API if planned and strategized well, and OpenAI can definitely pull it off with their tech and talent. But Stability has a few important differentiators from OpenAI where, I believe, if they launch an AI-native product in the multimodal space, they stand to differentiate significantly:
- People joined because they believed in Emad's vision of open source, so it is their job to figure out a commercial model for open source. They can retain AI talent by keeping that commitment. To protect their moat and commercialize, they should delay releasing model weights until a product surrounding the weights has shipped first. Still open source and open weights, but this gives them time to figure out a commercial strategy to capitalize on their research. Because of this promise, however, they will not be able to license their technology to other companies.
- Stability's strong research DNA (unsure about their engineering) is so badly fumbled by the lack of a cohesive product strategy that it leads to sub-par product releases. In agreement with the three-'layers' argument, that's exactly Stability's greatest strength and weakness. Their focus on foundational models is incredibly strong and has come at the cost of the interface layer (and ultimately the data layer, since it has a flywheel effect).
The company currently screams a need for effective leadership that can add interface and data layers to its product strategy, so it can build a strong moat beyond a strong research team, which has shown it can disappear at any moment...
People don't want to buy a third of an inch drill, they want a third of an inch hole.
2024 is going to shift into a tough year for AI. Business minded folks are already starting to deeply question where the value is relative to the amount of money spent on training. Many/most of the GenAI companies have interesting ideas but no real business plan. Many of the larger AI companies look very shaky in terms of their governance and long term stability. Stability is showing itself to be very unstable, OpenAI has its mess that still seems not fully resolved, Inflection had a bunch of strange stuff go down earlier this week, and more to come.
I’m a huge fan of the tech, but as reality sets in things are gonna get quite rough and there will need to be a painful culling of the AI space before sustainable and long term value materializes.
Many/most of the GenAI companies have interesting ideas but no real business plan
This is most likely the reason behind the SAMA firing: to be able to re-align with the MIC without terrible consequences, from a PR perspective.
No criticism here, aside from the fact that fully autonomous AI killer robots, unfettered by 'alignment', will be here much sooner than we expected...
And the MIC is where all the unscrupulous monies without audits will come from.
No criticism here, aside from the fact that fully autonomous AI killer robots, unfettered by 'alignment', will be here much sooner than we expected...
What exactly do you mean with this sentence? That less woke/regulated companies will suddenly leapfrog the giants now? What timeframe are we talking here?
And do you mean, for example, non-public mil applications from the US/China or whatever, or private models from unknown players?
One thing I've been wondering: if GPT-4 was 100 mil to train, then there are really a lot of plutocrats, despots, private companies, and states for that matter that could in principle 10x that amount if they really wanted to go all in, and maybe they are right now?
The bottleneck is the talent pool out there, though. But I'm sure there are a lot of people out there, from libertarians to nation states, who don't care about alignment at all, which is potentially pretty crazy / exciting / worrying depending on your view.
That less woke/regulated companies will suddenly leapfrog the giants now?
What kind of "leapfrog" do you think is necessary to produce a "killer fully autonomous robot"?
We've actually had "autonomous killer robots," machines that kill based on the outcome of some sensor plus processing, for centuries, and fairly sophisticated ones have been feasible for decades. (For example it's trivial to build a "killer robot" that triggers off of face or voice recognition.)
The only thing that's changed recently is the kind of decisions and how effectively the robot can go looking for someone to kill.
Of course we're there already sadly. I was just confused by the sentence structure.
This video is from 7 years ago: https://www.youtube.com/watch?v=ecClODh4zYk
The interesting thing is who's actively working on this: despots, private companies, foreign nations, whoever from the talent pool, criminal orgs, or even Western militaries, which mostly work for the Western elite classes.
maybe they are right now?
Aren't they? Pretty sure that most tactical and strategic decisions are automated to the bottom. Drones and cameras with all sorts of CV and ML features. You don't need scary walking, talking androids with red eyes to control battlefields and streets. The idea of "Terminator" is similar to a mailman on an antigrav bicycle.
Business minded folks are already starting to deeply question where the value is relative to the amount of money spent on training
Kinda. In my experience, the bigger issue is that the skillset has largely been diffused.
Overtraining on internal corpora has been more than enough to enable automation benefits, and the ecosystem around ML is very robust now; 10 years ago SDKs like scikit-learn or PyTorch were much less robust than they are today. Implementing a commercial-grade SVM or <insert_model_here> is fairly straightforward now.
ML models have largely been commoditized, and for most use cases the process of implementing models internally is fairly straightforward.
IMO, the real value will be on the infrastructure side of ML: how to simplify and enhance deployment, how to manage API security, how to manage multiple concurrent deployments, how to maximize performance, etc.
And I have put my money where my mouth is for this thesis, as it is one that has been validated by every peer of mine as well.
I agree with you. My reference to businesses questioning value was about the flood of money going into building new models. That's where we will see a significant and painful culling soon, as it's all becoming quite commoditized and there's only so much room in the market when that happens. Tooling, security services, and other things built on top of a commoditized generative AI market open up other doors for products.
flood of money going into building new models
In my experience, there really hasn't been that significant a flood of money in this space for several years now, or at least not at the level suggested by discussion here on HN.
I think HN tends to skew towards conversations around models for some reason, but almost all my peers have been either funding or working on tooling or ML-driven applications since 2021.
I've found HN to have a horrible signal-to-noise ratio nowadays, with experienced people (e.g. my friends in the YC community) deviating towards Bookface or in-person meetups instead.
There was a flood of new accounts in the 2020-22 period (my hunch is it's LessWrong, SSC, and Reddit driven based on the inside jokes and posts I've been seeing recently on HN) but they don't reflect the actual reality of the industry.
I agree that the quality of the posts has decayed dramatically since the start of the COVID pandemic and lots of the Reddit type memes and upvotes have added horrible levels of noise. I can still find gems in it and it’s still miles ahead of Twitter, but I do question my time using it passively and would much rather have a smaller and more focused community again.
I've been in crypto since 2010 and HN hasn't predicted anything. At least with AI the cognoscenti here are on board. With crypto/blockchain the current plan is to pretend it doesn't exist and never speak of it, as the industry never imploded as was long predicted.
Business minded folks are already starting to deeply question where the value is relative to the amount of money spent on training.
Not just training but inference too, right? They have to make money off each query.
Yes. Each query needs to generate enough revenue to pay for the cost of running the query, a proportional share of the cost to train the original model, overhead, SG&A, etc., just to break even. Few have shown a plan to do that or explained in a defensible way how they're going to get there.
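(Back-of-envelope version of that break-even math; every number below is an illustrative assumption, not a real figure.)

    # All inputs are made-up illustrative assumptions.
    gpu_cost_per_query = 0.002    # inference compute, $
    training_cost      = 100e6    # one-time training spend, $
    lifetime_queries   = 50e9     # queries served before the model is obsolete
    overhead_per_query = 0.001    # SG&A, storage, bandwidth, ... $

    amortized_training = training_cost / lifetime_queries
    break_even = gpu_cost_per_query + amortized_training + overhead_per_query
    print(f"${break_even:.4f} per query just to break even")  # ~$0.005 here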
A challenge at the moment is that a lot of the AI movement is led by folks who are brilliant technologists but have little to no experience running viable businesses or building a viable business plan. That was clearly part of why OpenAI had its turmoil: some were trying to be tech purists while others knew the whole AI space will implode if it's not a viable business. To some degree that seems to be behind a lot of the recent chaos inside these larger AI companies.
It's always fascinating to see. The business portions are much easier to solve with an engineering mindset, but it seems to be a common issue that engineers never take them into account.
This says nothing about "technologists" (or, as they're starting to be derided, "wordcels": people who communicate well but cannot execute anything themselves).
It would be... not trivial, but straightforward to map out the finances of everything involved and see if there is room, from any standpoint (engineering, financial, product, etc.), to get queries to break even, or even turn profitable.
But at that point, I believe the answer will be "no, it's not possible at the moment." So it becomes a game of stalling, and burning more money until R&D finds something new that may change that answer (a big if).
the whole AI space will implode if it’s not a viable business
if the Good Lord's willing and the creek don't rise
I'm fairly sure this is why there's a rush to get AI-related capabilities into processors for consumer devices: to offload that computing so providers don't bear the ongoing cost, and it'll probably be more responsive for the user too.
Well RIP Stability AI.
I love their models and I love how they have changed the entire open source AI ecosystem for the better, but the writing was always on the wall for them given how unprofitable they are.
I don't think much of the AI startup scene or social groups like e/acc would have existed if it weren't for the tech that they just gave away for free.
It's interesting how Stability AI and their VC funding have done a much better job of acting effectively as a non-profit charity (because they don't have profits, lol), speeding up AI development and open-sourcing their results, than other companies that were supposed to have been doing that from the beginning.
They really were the true ActuallyOpenAI.
Related to this, if you are an aspiring person who wants to improve the world, tricking a bunch of VC investors to fund your tech and then giving away the results to everyone free of charge is the single best way to do it.
social groups like e/acc
Good riddance.
Hey, let me know when the doomers have built anything that matters because they chose to, and not because they were forced to by market forces.
At least the open source AI people have code that you can use, freely without restriction.
The doomers, on the other hand, don't do anything but try and fail to prevent other people from releasing useful stuff.
But, in some sense, I should be thanking the doomers, because I'd rather that people with such incompetence were the enemy, as opposed to people who might have a chance of succeeding.
Not being e/acc doesn’t make one a doomer. It means being someone with a healthy relationship with technological progress, who hasn’t given up on the prospect of useful regulation where it’s helpful.
The US is great but the best AI researchers aren’t gonna want to live here if it becomes a hyper libertarian hellscape. They want to raise a family without their kids being exploited by technologies in ways that e/accs tell us we should just accept. It’s not sustainable.
if it becomes a hyper libertarian hellscape.
Releasing cool open source AI tech doesn't turn the world into a libertarian hellscape.
You are taking the memes way too seriously.
Mostly people just joke around on Twitter while also building tech startups. e/acc isn't overthrowing the government.
You may want to read Marc Andreessen’s manifesto. E/acc isn’t just about “releasing cool AI tech”.
It means unrestricted technological progress. Unrestricted, including from annoying things like consumer protection, environmental concerns, or even national security.
If e/acc was just about making cool open source stuff and posting memes on the Internet, you wouldn’t need a new term for it, that’s what people have been doing for the past 30 years.
Ok, that's nice and all. But the actual result of all of this is memes and people making cool startups.
Regardless of whether a couple of people who are taking their own jokes too seriously truly believe that they are going to, I don't know, create magic AGI, the fact remains that the actual measurable results of all this are only:
1: funny memes
2: cool AI startups
Anything other than that is made up stuff in either your head, or the heads of people who just want to pump up the valuation of their startups.
Marc Andreessen’s manifesto
Yes, I'm sure he says a lot of things that will convince people to invest in companies that he also invests in. It is an effective marketing tactic. I'm sure he has convinced other VCs to invest in his companies because of these marketing slogans.
But regardless of what a couple people say to hype up their startup investments, that is unrelated to the actual real world outcomes of all of this.
you wouldn’t need a new term for it
The fact that you and I are talking about it actually proves that some marketing terms make a difference, and also that they don't result in, I don't know, the government being overthrown and replaced by libertarian VCs or whatever nonsense people are worried about.
If Marc writes something it doesn’t become the definition of e/acc. Marc is hyperbolic and he gets a lot of clicks and eyeballs. As a VC though, he does it for his interest.
E/acc has many interpretations. In the most basic sense it means “technology accelerates growth”. One should work on better technology and making it widely distributed. Instead of giving away money, one can have the biggest impact on humanity with e/acc.
We've been effectively accelerating for the past 200 years.
Nothing hyper libertarian there.
You are taking the memes way too seriously.
Exactly. There are two groups of people: ones that defend Effective Accelerationism online with a straight face, and ones that take memes too seriously
"Let me know when the people who think AI will destroy the world actually build cool AI toys for me to play with."
Yeah uh.
Hugging Face is also the real, actual OpenAI
How? What models have they put out that are relevant?
Zephyr: https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
Their platform for easy distribution and management of models has sped up the ecosystem more so.
Are you aware of something called the criminal code? This is one of the worst pieces of advice I've ever seen. Tricking people to obtain money? How's this not fraud?
This is one of the worst pieces of advice I've ever seen.
Really? Because stability AI caused a very large amount of good in the world.
It arguably kicked off the entire AI startup industry.
Tricking people to obtain money? How's this not fraud?
It's not fraud because you don't have to lie to anyone. You can tell VCs exactly what you plan on doing. Which is to open source all of your code... and... uhhh... yeah, that will totally make the company valuable.
There are lots of ways of making sales pitches about open source, or similar, that will absolutely pass regulatory scrutiny and are "honest", and yet still have no hope of commercial success and also provide a huge amount of value to the public. Like what stability AI did.
"AI"s are still pretty much vaporwares like 40 years ago. When people get tired of these toys, the bubble will simply burst, and nothing valuable left.
I use ChatGPT almost every day and find it very useful
And how are you gonna verify ChatGPT's results? By Googling?
Here's an example of something I tried at random a few weeks ago. I have a bunch of old handwritten notes that are mind maps and graphs, written in notebooks from 10+ years ago. I snapped a picture of all the different graphs with my phone, threw them into ChatGPT, and asked it to convert them to Mermaid UML syntax.
Every single one of them converted flawlessly when I brought them into my Markdown note tool.
If you're using ChatGPT as nothing more than a glorified fact checker and not taking advantage of the multimodal capabilities such as vision, OCR, the Python VM, and generative imagery, you're really missing the point.
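(For reference, roughly the kind of call that workflow uses, sketched with the OpenAI Python SDK; the model name, prompt, and filename are assumptions and may need adjusting.)

    import base64
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open("mindmap.jpg", "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="gpt-4-vision-preview",  # a vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Convert this hand-drawn mind map to Mermaid syntax."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    print(resp.choices[0].message.content)  # paste into a Markdown note tool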
Exactly how you would verify the result that your human underling yielded. You can even delegate the googling and summarization to the AI and just verify the verification.
It has provided sources and done internet lookups for a while now.
Yes.
Also, research isn't the only benefit; code generation and roleplay bots are pretty good too.
Competitors are catching up, and one day the chat API providers will race prices to the bottom
Doesn't this apply to any product though?
That’s not the same thing as chat gpt being vaporware.
Catching up to what they released a year ago. We don’t know what’s coming up next.
Looks like many AI startups are experiencing a bit of some turbulence and chaos in the recent months:
First the OpenAI rebellion in November, then the Inflection AI acqui-hire, with Microsoft not willing to pay the overvalued $4B and deflecting that to $600M instead (after $0 revenue), and now a bit of in-stability at Stability AI, with the CEO resigning after many employees have left.
What does that say about the other AI companies out there that have raised tons of VC cash and aren't making any meaningful amount of revenue? I guess that is contributing to the collapse of this bubble, with only a very few companies surviving.
Wait you're telling me you can't make the next trillion dollar industry by incinerating capital while making no money?
You have a point, but remember Amazon and Google did exactly that and figured out the business model later.
Big difference is that Amazon and Google had fast growing revenue
this seems super misleading
Amazon nailed high revenue growth from the very beginning, just reinvesting in growth & deferring the margin story. They could have stopped at any time.
Google nailed high traffic from the beginning, so ad sales was always a safe Plan B. The founders hoped to find something more aesthetic to them, failed, and the conservative path worked.
The reason I write that this is misleading is b/c it's very different from the ZIRP YC-era thinking that seems in line with your suggestion:
- Ex: JustinTV used their VC $ to pivot into Twitch, and if that didn't work, game over.
- Ex: Uber raised bigger & bigger VC rounds until self-driving cars could solve their margins / someone else figured it out. They seem to be figuring out their margins, but it was a growing disaster, and it was unclear if they could pull it off while waiting on such an unpredictable miracle.
In contrast, both Amazon & Google were in positions of huge cash flows and being able to switch to highly profitable growth at any time. They were designed to control their destinies, vs have bankers/VCs dictate them.
Amazon was ruthless at making money from day 0.
Amazon was famous for not taking profit, and instead putting profit back into the company, but revenue? Amazon was generating gobs and gobs and gobs of revenue from day 0.
I am quite sure Google had a lot to offer from day one, which is why they were encouraged to start the business. There weren't any open-source Googles.
Yes maybe the business model wasn’t perfect but ad revenue was already well and truly a thing by the time Google invented a better search engine. All they had to do was serve the ads and the rest is history.
Surely the same strategy will work during economic contraction and a period of high interest rates and borrowing costs.
OpenAI rebellion
This phrasing makes me feel like we're living in some techno-future where corporations are de-facto governance apparatuses, squashing rebellion and dissidents :^)
Colonial companies did this all the time, e.g. the British East India Company, and the 1800s American railroads had the Pinkertons. There's also the phenomenon of the company town.
The writing was on the wall for Stability AI after they went on the massive anti-adult content tirade for their newer models.
No, that's the sign they're being sensible, since the thing you want is illegal in most countries.
I looked this up and apparently the controversy is that Stable Diffusion was trained on child porn? How can the model itself not be considered objectionable material then? Does the law not apply some kind of transitive rule to this? And don't they want to arrest someone for having the child porn to train it on?
To say it was “trained on child porn” is just about the most “well, technically….” thing that can be said about anything.
Several huge commercial and academic projects scraped billions of images off the Internet and filtered them as best they knew how. SD trained on some set of those, and later some researchers managed to identify a small number of images classified as CP that were still in there.
So, out of the billions of images, was there greater than zero CP images? Yes. Was it intentional/negligent? No. Does it affect the output in any significant way? No. Does it make for internet rage bait and pitchforking? Definitely.
There are two controversies, and they are the opposite of each other.
1. Some people are mad that Stable Diffusion might be trained on CSAM because the original list of internet images they started with turned out to link to some. (LAION doesn't actually contain images, just links to them.)
This one isn't true, because they removed NSFW content before training.
2. Some other people are mad that they removed NSFW content because they think it's censorship.
That actually isn't the legal issue I meant though. It's not that they trained on it, it's that it contains adult material at all (and can be shown to children easily), and that it can be used to generate simulated CSAM, which some but not all countries are just as unhappy about.
That 'anti-adult tirade' is a strategic choice.
1. Open source even more capable 'adult models' and get sued to oblivion (They still have massive lawsuits)
2. Neuter the model to uselessness and have users abandon it.
Both are bad choices, and require a very, very skilled CEO to thread the needle. Emad failed. that's all.
I don’t know if that is a factor. But the dog whistle is “safety”.
Maybe related to last year's controversy?
“Mostaque had embezzled funds from Stability AI to pay the rent for his family's lavish London apartment” and that Hodes learned that he “had a long history of cheating investors in prior ventures in which he was involved”
It also seems like the company just isn't doing very well, it looks like there have been constant problems since Stable Diffusion was released - not enough funding, people leaving, etc. Which I don't get - you create a massive new piece of software that is a huge leap forward and you can't just get more funding? There have to be big structural issues at Stability.
not enough funding
They raised $110M in October.
Yet for the 6 months before that they were talking about running out of money, and even the month after they got funded were considering selling the company due to money issues.
I'm practically shaking my head in disbelief at all the red flags this guy has while people are still defending him. Stability and its work are great. We should support an open community and ethos. And Emad can still be a shady narcissist con man. These are all compatible views.
The WeWork CEO of the AI bubble
A lot more social good with this one though. Hard to cheer for Stability.ai’s failure.
Stability models are mostly open source. While it was never going to last, Stability put the entire industry in a race to the bottom, all the while building up the open ecosystem.
Wework transferred investor money to office building owners and the CEO himself.
Stability transferred investor money to countless AI users in the world. Emad certainly didn't get a billion dollar payday.
Looks like he went to crypto
EigenLayer's tagline? "Build open innovation ∞ play infinite sum games. Also: @eigen_da" (from their Twitter profile).
Infinite sum games...
LMAO
"infinite games" + "positive sum" => "infinite sum"
has big Sarah Palin energy: "refute" + "repudiate" => "refudiate"
https://www.npr.org/sections/itsallpolitics/2010/11/15/13133...
Decentralized systems, peer to peer, Blockchain, smart contracts, are all important technologies with real use cases. It is not accurate to refer to any of them as simply "crypto" especially in this context.
what's the real reason?
The most obvious potential real reason is Stability isn't making money, and the people that matter don't think Emad was going to be able to turn it around.
Well, that means his replacement will be a money shark, enshittifying Stability AI for higher profits.
There's nothing to enshittify; their models were open sourced. Now they may no longer release future models for free, but it's entitlement to think we'll just get free improvements forever.
Also not all CEO replacements turn out bad. Uber certainly has turned itself around.
6 months ago, I expressed my doubt about the viability of Stability's business model on HN. Mostaque answered the questions with a resounding yes and claimed the business of Stability.AI was better than ever.
Today he resigned.
He still owns a ton of stock, and last month said they were on track to be cash flow positive this year.
Would you please stop doubting? The consequences are just too great. /joke
Emad is such an obvious grifter it’s honestly mad that he attracted so much VC money.
He couldn’t even get his own story straight regarding his education and qualifications which should be a pretty clear disqualifying red flag from the outset.
The Forbes article from last year was dismissed on here as a hit piece but the steady flow of talent out was the clear sign, capped by the original SD authors leaving last week (probably after some vesting event or external funding coming through).
haters always gonna hate no matter what someone does
Exactly, I don't understand why people are not seeing this.
I guess SD3 will never be released, then. What a pity. :-/
That's my guess too. Emad teased SD3 while it looked like he was looking for more money, but without convincing rationale for not releasing it already. The samples may have been heavily cherry-picked, we don't know if it's actually a decent model in practice.
That's silly. For one thing, he says he's still the majority shareholder.
OpenAI? closed source. Stability AI? facing instability.
Startup idea: Unprofitable.ai
Accenture: $1.1 billion in GenAI projects!
Huge news, and his reasoning doesn’t seem to make sense to me. Can anyone elaborate further?
There have been a series of high profile staff/researcher departures, implied to be partially as a result of Emad's leadership. The latest departures of the researchers who helped develop Stable Diffusion could be fatal: https://news.ycombinator.com/item?id=39768402
https://en.wikipedia.org/wiki/Emad_Mostaque
that wikipedia page screams "grifter", wow
Really no wonder he ended up in “decentralized AI” aka crypto grifting. It must be like returning home after a long day.
Bubble starting to burst, history doesn’t repeat itself but it rhymes
let us hope
AI as a field feels like crypto from a few years back
At least, this has the same impact on GPU prices.
Wow, the comments seem mean-spirited. I would say: thanks for releasing open-source models and accepting to pass the torch.
I'm very grateful for SD. But I'm also quite sure SD3 and the future models won't be open.
What's with these weirdly convoluted re-spellings of Arabic/Muslim names to make them look not Arabic/Muslim?
He's not an Arab, he's ethnically Bangladeshi, and that's a common way to spell his name here.
His shares still have majority vote and full board control.
He probably saw all the crypto AI grifters make hundreds of millions and wanted in on the action.
With his name attached, any crypto AI coin will launch straight to $500m mcap
You just got to bribe a few key AI researchers to completely control the future of humanity, lol.
So… instability.ai?
lol, Emad always seemed like an obvious fraud to me. Not quite SBF level, but the same vibe. Whenever someone goes overboard on the nerd look, it's always a red flag.
I can't find much information about Shan Shan Wong, the new co-CEO. Not even a photo of this person on the internet.
Anyone else have information about them?
Recent genAI shakeups in the past week:
1. Inflection AI -- ceo out to MSFT
2. Stability AI -- ceo out to ____ (infinite sum games? with EigenLayer?)
What else? Is there a "GenAI is going great" website yet? (ala "web3 is going great": https://www.web3isgoinggreat.com/)
Fishy, it was the outsider that obliterated the establishment and the current narrative
It was bound to happen anyway...
I interviewed at Stability AI a while ago and that interview was a complete shit show. They quite literally spent 40 minutes talking about Emad and his "vision". I think we actually talked about what they wanted me to do there for like 15 minutes.
I was not feeling confident about them as a company that I wanted to work for before that interview, afterwards I knew that was a company I wouldn't work for.
Midjourney is the most popular Discord server by far with 19.5M+ members, and made $200M in revenue in 2023 with zero external investment and only 40 employees.
The problem has nothing to do with commercializing image gen AI and all to do with Emad/Stability having seemingly 0 sensible business plans.
Seriously this seemed to be the plan:
Step 1: Release SD for free
Step 2: ???
Step 3: Profit
The vast majority of users couldn't be bothered to take the steps necessary to get it running locally so I don't even think the open sourcing philosophy would have been a serious hurdle to wider commercial adoption.
In my opinion, a paid, easy to use, robust UI around Stability's models should have been the number one priority and they waited far too long to even begin.
There have been a lot of amazing augmentations to the Stable Diffusion models (ControlNet, Dreambooth, etc.) that have popped up, with lots of free research and implementations, because the research community has latched onto the Stability models, and I feel they failed to capitalize on any of it.
The more I think about the AI space the more I realize that open sourcing large models is pointless now.
Until you can reasonably buy a rig to run the model, there is simply no point in doing this. It's not like you will be edified by seeing the weights either.
I think an ethical business model for these businesses is to release whatever model can fit on a $10,000 machine and keep the rest closed source until such a machine is able to run them.
The released image generation models run on consumer GPUs. Even the big LLMs will run on a $3500 Mac with reasonable performance, and the CPU of a dirt cheap machine if you don't care about it being slow, which is sometimes important and sometimes isn't.
Also, things like this are in the works:
https://news.ycombinator.com/item?id=39794864
Which will put the system RAM of the new 24-channel PC servers in range of the Nvidia H100 on memory bandwidth, while using commodity DDR5.
The "big" AI models are trillion-parameter models.
The medium-sized models like GPT-3 and Grok are 175B and 314B parameters respectively.
There is no way for _anyone_ to run these on a sub-$50k machine in 2024, and even if you could, the token generation speed on CPU would be under 0.1 tokens per second.
You can get registered DDR4 for ~$1/GB. A trillion parameter model in FP16 would need ~2TB. Servers that support that much are actually cheap (~$200), the main cost would be the ~$2000 in memory itself. That is going to be dog slow but you can certainly do it if you want to and it doesn't cost $50,000.
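For what it's worth, here's the arithmetic behind that as a quick sketch (the $1/GB price is the parent comment's assumption, which replies below dispute):

    # capacity/cost math for holding a 1T-parameter FP16 model in system RAM;
    # the $1/GB figure is the comment's assumption, not a market fact
    params = 1_000_000_000_000                   # 1 trillion parameters
    bytes_per_param = 2                          # FP16
    model_gb = params * bytes_per_param / 1e9    # ~2000 GB, i.e. ~2 TB
    price_per_gb = 1.0                           # assumed $/GB for used registered DDR4
    print(f"{model_gb:.0f} GB of weights, ~${model_gb * price_per_gb:,.0f} in RAM")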
How slow? Depending on the task I fear it could be too slow to be useful.
I believe there is some research on how to distribute large models across multiple GPUs, which could make the cost less lumpy.
You can get a decent approximation for LLM performance in tokens/second by dividing the model size in GB by the system's memory bandwidth. That's assuming it's well-optimized and memory rather than compute bound, but those are often both true or pretty close.
And "depending on the task" is the point. There are systems that would be uselessly slow for real-time interaction but if your concern is to have it process confidential data you don't want to upload to a third party you can just let it run and come back whenever it finishes. And releasing the model allows people to do the latter even if machines necessary to do the former are still prohibitively expensive.
Also, hardware gets cheaper over time and it's useful to have the model out there so it's well-optimized and stable by the time fast hardware becomes affordable instead of waiting for the hardware and only then getting to work on the code.
Why would increasing memory bandwidth reduce performance? You said "You can get a decent approximation for LLM performance in tokens/second by dividing the model size in GB by the system's memory bandwidth"
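For reference, the rule of thumb works with the division the other way around: reading the whole model once per generated token means model size divided by bandwidth gives seconds per token, so tokens per second is bandwidth divided by model size. A minimal sketch, assuming decoding is memory-bandwidth bound:

    # bandwidth-bound decoding estimate: each generated token requires reading
    # (roughly) all the weights once, so tokens/s ~= bandwidth / model size
    def tokens_per_second(model_size_gb, bandwidth_gb_s):
        return bandwidth_gb_s / model_size_gb

    print(tokens_per_second(70, 400))  # 70B params at 8-bit on ~400 GB/s: ~5.7 tok/s
    print(tokens_per_second(70, 50))   # same model on dual-channel DDR4: ~0.7 tok/s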
Even looking on Amazon, DDR4 still seems to cost $2/GB or more, well above $1/GB:
- 2 x 32GB: $142 (~$2.20/GB)
- 2 x 64GB: $318 (~$2.50/GB)
- 8GB: $16 ($2/GB)
- 2 x 16GB: $64 ($2/GB)
- 2TB (16 x 128GB DDR4 ECC): $9,600 (~$4.70/GB) (https://www.amazon.com/NEMIX-RAM-Registered-Compatible-Mothe...)
What does this mean? What motherboards support 2TB of RAM at $200? Most of them are pushing $1,000. With no CPU.
It may not hit $50K, but it's definitely not going to be $2K.
ChatGPT is 20B according to Microsoft researchers. Also, the claim that the big AI models are trillion-parameter models is mostly speculation; for GPT-4 it was spread by geohot.
To be precise, the claim that ChatGPT 3.5 Turbo is 20B is officially a mistake by a Microsoft researcher, who cited a wrong source published before the release of ChatGPT 3.5 Turbo. Up to you to believe it or not, but I wouldn't claim it's 20B "according to Microsoft researchers".
The withdrawn paper: https://arxiv.org/abs/2310.17680
The wrong source: https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is...
The discussion: https://www.reddit.com/r/LocalLLaMA/comments/17jrj82/new_mic...
It's interesting how the paper was completely retracted instead of just being corrected.
GPT-3 was 175B, so it'd be a bit odd if GPT-4 wasn't at least 5x larger (1T), especially since it's apparently a mixture of experts.
Mixtral 8x7b is better than both of those and runs on a top spec M3 Max wonderfully.
Indeed! Also, Mixtral 8x7b runs just as well on older M1 Max and M2 Max Macs, since LLM inference is memory bandwidth bound and memory bandwidth hasn't significantly changed between M1 and M3.
It didn't change at all, rather was reduced in certain configurations.
It's just semantic gymnastics. I'm sure most people will consider LLaMa 70B a big model. Of course if you define big = trillion then sure big = trillion[1].
[1]: https://en.wikipedia.org/wiki/No_true_Scotsman
I will make a 2 trillion parameter model just so your comment becomes outdated and wrong.
Disagree. A few weeks ago, I followed a step-by-step tutorial to download ollama, which in turn can download various models. On my not-special laptop with a so-so graphics card, Mixtral runs just fine.
As models advance, they will become - not just larger - but also more efficient. Hardware advances. Large models will run just fine on affordable hardware in just a few years.
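For anyone curious, the local setup described above is just a few commands. Here's a minimal sketch that queries a locally served Mixtral through Ollama's default REST endpoint (assumes Ollama is installed and `ollama pull mixtral` has been run; the prompt is a placeholder):

    # send a prompt to Ollama's local HTTP API (port 11434 by default)
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "mixtral",
            "prompt": "In one sentence, why is local LLM inference memory-bandwidth bound?",
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])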
I’ve come to the opposite conclusion personally - AI model inference requires burst compute, which particularly suits cloud deployment (for these sort of applications).
And while AIs may become more compute-efficient in some respects, the tasks we ask AIs to do will grow larger and more complex.
Sure, you might get a good image locally, but what about when the market moves to video? Sure, ChatGPT might give good responses locally, but how long will it take when you want it to refactor an entire codebase?
Not saying that local compute won’t have its use-cases though… and this is just a prediction that may turn out to be spectacularly wrong!
Ok but yesterday I was on a plane coding and I wouldn’t have minded having GPT4 as it is today available to me.
Thanks to Starlink, planes should have good internet soon.
Huge models are the best type to open source.
You get all the benefits of academics and open source folks pushing your model forward, and a vastly improved hiring pool.
But it doesn't stop you launching a commercial offering, because 99.99% of the world's population doesn't have 48GB+ of VRAM.
The dream
For a founder maybe, definitely not for employees.
AI startups need a not-insignificant amount of startup capital; you cannot just spend weekends building like you would a SaaS app. Model training is expensive, so only wealthy individuals can even consider this route.
Companies like that have no oversight or control mechanisms when management inevitably goes down crazy paths, and without external valuations it's hard to ascertain the value of option vesting structures.
As a counter-point, with no VCs there's more equity left for employees.
As a counter-counter-point that rarely gets discussed on HN: VCs aren't taking as much of the pie as people think. In a 2-founder, 4-engineer company, it wouldn't be unusual for equity to be roughly:
- 20% investors
- 70% founders
- 2-3% employees (1% emp1, 1% emp2, 0.5% emp3, 0.25% emp4)
- 7% for future employees before the next funding round
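To put toy numbers on that split at a hypothetical $2bn exit (illustrative only; real outcomes depend on share classes and dilution):

    # toy cap-table math for the example split above at a hypothetical $2bn exit
    valuation = 2_000_000_000
    stakes = {
        "investors": 0.20, "founders": 0.70,
        "emp1": 0.01, "emp2": 0.01, "emp3": 0.005, "emp4": 0.0025,
        "future-employee pool": 0.07,
    }
    for who, pct in stakes.items():
        print(f"{who}: ${pct * valuation:,.0f}")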
This is not a fair comparison because you are not taking into account liquidation preferences. Those investors don't have the same class of equity as everyone else. That doesn't matter in the case of lights out success but it matters a great deal in many other scenarios.
Sure. My point was that most employees think that VCs take 80+%, and especially the first few employees usually have no idea just how little equity they have compared to the founders.
You can't run a sustainable business if you take VC money. VCs need an exit.
If valued at $2bn, then even 0.1% is $2m. Not bad.
Early employees may have gotten 2-3% and are completely undiluted.
Yes indeed! Probably furiously vesting now!
Sometimes you need to say fuck the money, I’ve already got enough, and I just want to do what I enjoy. It may not be an ideal model for HN but damn not everything in life is about grinding, P/E ratios, and vesting schedules
Yeah, that's easier to say when you have enough. A lot of employees might not be in that privileged position. The reality for some of these folks might be education loans, families to take care of, tuition for kids, medical bills, etc.
Also oversize houses, expensive cars, etc...
Only if your costs are a lot lower than $200M, which given the price of GPU compute right now, is not guaranteed.
Here's a Stable Diffusion business idea: sign up all the celebrities and artists who are cool with AI, and provide end users / fans with an AI image generation interface trained on their exclusive likenesses / artwork (LoRAs).
You know, the old tried and true licensed merchandise model. Everybody gets paid.
I think the following isn't said often enough: there must be a reason why there are extremely few celebrities and artists who are cool with AI, and it cannot be something as abstract and bureaucratic as copyright concerns, although those are problematic.
It's just not there yet. GenAI output isn't something audiences want to hang on a wall; it's something that evokes a sense of distress. Otherwise everyone would at least be tracing it.
Most people mix up all the different kinds of intellectual property basically all the time[0], so while people say it's about copyright, I (currently) think it's more likely to be a mixture of "moral rights" (the right to be named as the creator of a work) and trademarks (registered or otherwise), and in the case of celebrities, "personality rights": https://en.wikipedia.org/wiki/Personality_rights
People have a wide range of standards. Last summer I attended the We Are Developers event in Berlin, and there were huge posters that I could easily tell were from AI due to the eyes not matching; more recently, I've used (a better version) to convert a photo of a friend's dog into a renaissance oil painting, and it was beyond my skill to find the flaws with it… yet my friend noticed instantly.
Also, even with "real art", Der Kuss (by Klimt) is widely regarded as being good art, beautiful, romantic, etc. — yet to me, the man looks like he has a broken neck, while the woman looks like she's been decapitated at the shoulder then had her head rotated 90° and reattached via her ear.
[0] This is also why people look at a Google Street View image with a ©2017 Google[1] tiled over a blue sky and say "LOL, Google's trying to own the sky", or why people even on this very forum ask how some new company can trademark a descriptive term like "GPT"[2], seemingly surprised this is possible even though there's already a very convenient example in Hasbro having "Transformers".
[1] https://www.google.com/maps/@33.7319434,10.8655264,3a,77.2y,...
[2] https://news.ycombinator.com/item?id=35692476
The point is, generative AI images are not widely regarded as good art. They're often seen as passable for some filler use cases and hard to tell apart from human generations, but not "good".
It's not not-there-yet because AI sometimes generates a sixth finger; it's something on another level separating it from Gustav Klimt, Damien Hirst, Kusama Yayoi, or the likes[0]. It could be that genAI is leaving in something a human artist would filter out, or that the images are so disorganized that they appear to us to be encoding malice or other negative emotions, or maybe I'm just wrong and it's all about anatomy.
But whatever the reason is, IMO, it's way too rarely considered good, gaining too few supportive celebrities and artists and audiences, to work.
0: I admit I'm not well versed with contemporary art, or art in general for that matter
My point is: yes AI is different — it's better.
Always? No. But I chose Der Kuss specifically because of the high regard in which it is held, and yet to my eye it messes with anatomy as badly as if it had put in six fingers (indeed, my first impression when I look closely at the hand of the man behind the head of the woman is that the fingers are too long and the thumb looks like a finger).
Why would those celebs pay Stability any significant money for this, given they can get it for a one-off payment of at most a few hundred dollars in salary/opportunity cost, by paying an intern to gather the images and feed them into the existing free tools for training a LoRA?
I think in this case the celebs are getting paid for using their likeness.
That sounds like the "lose money on every sale" philosophy of the first dot-com bubble, only without even the "but make it up in volume" second half.
"Cool with AI" and "sell my likeness so nobody ever needs to hire me again" are too close for comfort on this one.
And also, here's a way to just make so much pornography of me.
I'm betting the list of folks who would sign the AI license are pretty small, and mostly irrelevant.
You can already do that with reference images, and even for inpainting. No training required. Also no need to pay actors outrageous sums to use their likeness in perpetuity for as long as you do business. The licensing is still tricky anyway, because even if the face is approved and certified, the entire body and surroundings would also have to be. Otherwise you've basically re-invented the celebrity deepfake porn movement. I don't see any A-lister signing up for that.
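For context on how low that barrier already is, here's a minimal sketch using Hugging Face diffusers to apply a pre-trained likeness/style LoRA on top of an open base model (the LoRA path and prompt are placeholders):

    # load an open Stable Diffusion checkpoint and apply a LoRA on top
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_lora_weights("./licensed_style_lora")  # hypothetical local LoRA weights
    image = pipe("portrait in the licensed artist's style").images[0]
    image.save("portrait.png")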
Are we sure that Midjourney is still on that trajectory?
I was a heavy user since the beginning but my usage has dropped to almost 0
I think you're a sample size of one
I mean of course I am a sample size of one when I am speaking about my own experience?
That's why I asked that question to see if others notice something similar or if that's just me
I wonder if MidJourney is still ripping. I’m actually curious if it’s superior to ChatGPT’s Dall-E images… I switched and cancelled my subscription when ChatGPT added images, but I think I was mostly focused on convenience.
If you have a particular style in mind then results may vary, but aesthetically Midjourney is generally still the best; however, DALL-E 3 has every other model beat in terms of prompt adherence.
Image quality, stylistic variety, and resolution are much better than ChatGPT. Prompt following is a little better with ChatGPT, but MJ v6 has narrowed the gap.
That’s not true. He was pretty open about the business plan. The plan was to have open foundational models and provide services to governments and corporations that wanted custom models trained on private data, tailored to their specific jurisdictions and problem domains.
Was there any traction on this? I cannot imagine government services being early customers. What models would they want? Military, maybe, for simulation or training, but that requires focus, dedicated effort, and a lot of time. My 2c.
I've heard this pitch from a few AI labs. I suspect that they will fail, customers just want a model that works in the shortest amount of time and effort. The vast majority of companies do not have useful fine tuning data or skills. Consultancy businesses are low margin and hard to scale.
Leonardo.ai have basically done exactly this and seem to be doing OK.
It's a shame, because they're literally just using Stable Diffusion for all their tech, but they built a nicer front end and incorporated ControlNet. Nowhere else has done this.
ControlNet / InstantID etc. are the really killer things about SD and make it way more powerful than Midjourney, but they aren't even available via the Stability API. They just don't seem to care.
InstantID uses a non-commercial licensed model (from insightface) as part of its pipeline so I think that makes it a no-go for being part of Stability's commercial service.
Re: the first paragraph: these are wild stats, thanks for sharing. How did they fund themselves, if there was no external funding?
Yes, and MJ has no public API either. Same for Ideogram; I imagine they have at least $10m in the bank, and aren't even bothering to make an API despite being SotA in lots of areas.
Yes, and I'm grateful to them for sticking to that plan. :-)
But for the individuals involved, it might also be
Step 2: Leverage fame in AI space for massive VC injection on favorable terms.
There's money to be made for sure, and Stability's sloppy execution and strategy definitely didn't help them. But I think there are also industry-wide factors at play that make AI companies quite brittle for now.
What's insane to me is the fact that the best interfaces to utilize any of these models, from open source LLMs to open source diffusion models, are still random gradio webUIs made by the 4chan/discord anime profile picture crowd.
Automatic1111, ComfyUI, Oobabooga. There's more value within these 3 projects than within at least 1 billion dollars worth of money thrown around on yet another podunk VC backed firm with no product.
It appears that no one is even trying to seriously compete with them on the two primary things that they excel at - 1. Developer/prosumer focus and 2. extension ecosystem.
Also, if you're a VC/Angel reading my comments about this, I would very much love to talk to you.
AI businesses/models are capturing economic value: the problem is that costs are increasing much faster than revenue.
With the current limitations of AI in mind, it often looks like a solution looking for a problem. It's too unreliable, slow, dumb, and too expensive for a lot of the tasks companies would like to use it for.
And that becomes part of the problem because it's hard to sell unreliable technology unless you design the product in a way that plays well with the current shortcomings. We will get there, but it's still a few iterations away.
I don't think it's that SD and LLMs are solutions looking for problems, it's that there are very clear problems to which they provide 90% of a solution and make it impossible to clear the last 10%.
They're the new WYSIWYG/low-code. Everyone that doesn't fully understand the problem space thinks they're some ultimate solution that is going to revolutionise everything. People that do are responding with a resounding 'meh'.
Stable Diffusion is a great example. Something that can generate consistent game assets would be an absolute game changer for the entire game industry and open up a new wave of high tech indie game development, but despite every "oh wow" demo hitting the front page of HN, we've had the tech for a couple of years now and the only thing that's come out of it is some janky half solutions (3D meshes from pictures that are unworkable in real games, still no way to generate assets in consistent styles without a huge amount of complex tinkering) and a bunch of fucking hentai lol.
A game with no consistency in the art is probably already enabled. We've crossed the threshold where something like Magic: The Gathering could be recreated by a tiny team on a low budget.
I don't think the limiting factor here is the software; it looks like we got AI-generated art pretty much as soon as consumer graphics cards could handle it (10 years ago it would have been quite hard). I'd be measuring progress in hardware generations not years and from that perspective Stable Diffusion is young.
Current AI is entirely incapable of generating the balanced and fun/engaging rule sets required for a MtG style game. Sure the art assets could be generated with skilled prompting and touchup but even that is nowhere close to the strong statement you made.
OP likely meant that a Midjourney-level AI can easily generate all the card art.
Obviously, current AIs cannot generate game rulesets because the game feel is an internal phenomenon that cannot be represented in the material domain and therefore AIs cannot train on it.
And even a lot of the hentai is fucking worthless for the same reasons! Try generating some kinbaku. It's really hard to get something where all the rope actually connects and interacts sensibly because it doesn't actually know what a knot is. Instead, you end up with M. C. Escher: Fetish Edition.
The NovelAI V3 model is really good at this, shibari and related stuff, it's a heavily finetuned SDXL model.
Yeah, not sure what you're talking about. Pony Diffusion and ControlNet can fix most of the issues you're describing.
Pretty much what happened with speech recognition for 30 years. That last 10% had to be handled manually. Even if you get 90% right, it still means every second sentence has issues. And as things scale up, the costs of all that manual behind-the-scenes hacking scale up too. We underestimated how many issues involved ambiguity: where N people see the same thing and have N different interpretations. So you see a whole bunch of speech rec companies rising and falling over time.
Now things are pretty mature, but it took decades to get there, and there is still a whole bunch of hacks upon hacks behind the scenes. The same story will repeat with each new problem domain.
We use Whisper for automatic translation, supposedly SotA, but we have to fix its output, I would say, very often. It repeats things, translates things for no reason, has trouble with numbers... It's improved in leaps and bounds, but I'd say that speech recognition doesn't seem to be there yet.
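To illustrate the kind of cleanup involved, here's a minimal sketch using the open-source whisper package, with a crude pass that drops the repeated segments it sometimes gets stuck on (the audio file name is a placeholder):

    # translate speech to English with Whisper, then drop consecutive duplicate
    # segments, one of its known failure modes
    import whisper

    model = whisper.load_model("medium")
    result = model.transcribe("meeting.mp3", task="translate")
    cleaned = []
    for seg in result["segments"]:
        text = seg["text"].strip()
        if not cleaned or text != cleaned[-1]:
            cleaned.append(text)
    print(" ".join(cleaned))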
Human work is much more deterministic than AI, as it encompasses a lot more constraints than what the task specified. Take concept art: while the brief may be a few sentences, the artist knows to respect anatomy and perspective rules, as well as common definitions (when the brief says ship, you know it's the ship concept approved last week). As an artist, I've used reference pictures, dolls, and 3D renders, and one of the most important aspects of these tools was consistency. I don't see large models being consistent without another model applying constraints to what they're capable of producing, like rules defining correct anatomy and extracting the data that defines a character. The fact is we do have tools like MakeHuman [0], Marvelous Designer [1], and others that let you generate ideas that are consistent in their flexibility.
I look at Copilot and it's been the same for me. Either I'm working on a huge codebase, where most of the work is tweaking and refactoring, which is not something I trust an LLM with. Or it's a greenfield project, where I usually write only the necessary code for a task and boilerplate generation is not a thing for me. Coding for me is like sculpting, and LLM-based solutions feel like trying to do it with bricks attached to my feet. You can get something working if you're patient enough, but it makes more sense, and is more enjoyable, to just use your fingers.
[0]: http://www.makehumancommunity.org/
[1]: https://marvelousdesigner.com/
Palworld
We use AI for simple tasks and it already pays off. It's not unreliable if you do it right.
No, Stability isn't getting 99.99% of the money people are making from Stable Diffusion. Their (lack of a) business model is the problem.
OpenAI and midjourney have millions of paying customers.
Tens of millions of freeloaders too.
Every single ad-blocker user is a freeloader to Google/YouTube.
Didn't stop them from being extremely successful.
Pretty sure YouTube was losing a lot of money every year before they very aggressively ramped up ad frequency and duration, as well as subscriptions, in recent years. They could only do that thanks to Google's main cash cow; YouTube would have been dead if not acquired and subsidized by Google.
I am both a Youtube Family Premium and a dedicated ad-blocker user.
Studios are the target here, not consumers. The Pareto principle applies when you need more than a passive user: 20% or less of the serious studios (which are already a minority) will end up providing 80% of the value of any given AI solution.
...and thousands of enterprise customers translating into billions of dollars worth of deals.
The point is, OpenAI can afford to have freeloaders as long as their deals with enterprises and governments are paying for the service.
Midjourney doesn't have a free plan so no free-loaders there and they're making $200M+ with no VCs.
Stability.ai will always suffer from free-loaders due to their fully open source AI.
Stability's recent models (SD3, SV3D, StableLM2, StableCode, and more) are neither open licensed nor planned for release as open licensed.
Sounds like a fairly precarious position they’re in though. This isn’t some new magic tech anymore.
Yeah, but I'm not sure OpenAI is profitable yet (they lost $500M last year), and costs are rising and competition is increasing.
GPT-4 cost $100M+ to train (Altman), but Dario Amodei has said next-gen models may cost $1B to train, and $10B models are not inconceivable.
I'd guess OpenAI's payroll is probably $0.5B (770 highly paid employees + benefits, not to mention hundreds of contractors creating data sets).
It would be negligent for them to aim for profitability right now.
They're doing what they should: growing the customer base while continuing to work on the next generation of the core technology, and developing the support code to apply what they have to as broad a cross-section of problems as they have the potential to offer a solution for.
Stability has dreamstudio.ai, but it never seemed like they were seriously investing in it to try to compete with Midjourney?
So the problem might be two-fold. First, there's an oversupply of companies trying to use AI relative to the current technological capabilities. Second, even when value is created, the companies don't capture the value to profit out of it.
No. There can be tremendous value in AI, you just won't find it here.
The reason is that Stability.ai gave away everything for free. Until recently, they didn't even attempt to charge money for their models.
I've heard the only reason they're not already closed up is that they're reselling all of the rented GPU quota they leased out years ago. Companies are subletting from Stability, which locked in lots of long term GPU processing capacity.
There's no business plan here. This is the MoviePass of AI.
They raised $110M in October, and they say that training a particular model costs them hundreds of thousands of dollars in compute.
They don't lack money.
They needed to take an emergency $50M loan after that, so it clearly didn't last long.
It will be interesting to see when OpenAI gets as open about their costs as their revenue.
I see it this way to be honest:
- companies will aggressively try to use AI in the next 2-3 years, downsizing themselves in the meantime
- the 3-5 year mark will show that downsizing was an awful idea that took too many hits to really be worth it. I don't know if those hits will be in profits (depends on the company), but it will clearly hit that uncanny valley.
- the 6-8 year mark will have studios hiring like crazy to get back the talent they bled. They won't be as big as before, but they will grow to a more sane level of operation.
- the 10-12 year mark will have the "Apple" of AI finally nail the happy medium between efficiency and profitability (hopefully without devastating the workers, but who knows?). Competitors will follow through and properly usher in the promises AI is making right now.
- the 15 year mark is when AI has proper pipelining, training, college courses, legal lines, etc. established and becomes standard fare, no stranger than using an IDE.
As I see it, companies and AI tech alike are trying to pretend we're at the 10 year mark, all the while we're currently in legal talks and still figuring out what and where to use AI to begin with. In my biased opinion, I hope there's enough red tape on generative art to make it not worth it for large studios to leverage it easily (e.g. generative art loses all copyright/trademarkability, even when using owned IPs. Likely not that extreme, but close).
Companies are aware of the current AI generation being a tool and not a full replacement (or they will be after the first experiments they perform).
They will not downsize, they will train their workforce or hire replacements that are willing to pick up these more powerful and efficient tools. In the hands of a skilled professional there will be no uncanny valley.
This will result in surplus funds that can be invested in more talent, which in turn will keep feeding AI development. The only way is up.
Not allowing copyright on AI generated work is a ridiculous and untenable decision that will be overturned eventually.
90% of startups fail, so it's not just a matter of waiting for the tech to get better and having more value - most of the current players will simply fail and go out of business.