HN comments for: What I learned from looking at 900 most popular open source AI tools

hintymad

11 replies

1d14h

2024-03-15 03:43:04 UTC

So many cool ideas are being developed by the community. Here are some of my favorites.

Batch inference optimization: FlexGen, llama.cpp

Faster decoder with techniques such as Medusa, LookaheadDecoding

Model merging: mergekit

Constrained sampling: outlines, guidance, SGLang

So essentially a handful of people are doing God's work. They have deep knowledge on modeling and optimization, and they build amazing libraries for millions of mortals. On the other hand, it'll be hard for an engineer to work on training frameworks or building models with new knowledge or new capabilities (except some small-scale finetuning) or optimization in general -- the hardware cost for doing such work is prohibitive to such engineers.
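For readers who have not met the constrained sampling item in the list above, the core idea can be sketched in a few lines: before sampling, mask out every token the constraint forbids. This is a toy illustration only, not the API of outlines, guidance, or SGLang; the function name `constrained_sample`, the four-word vocabulary, and the logits are all made up for the example.

```python
import math
import random

def constrained_sample(logits, allowed, rng=random):
    """Sample a token id from `logits`, restricted to the ids in `allowed`."""
    if not any(i in allowed for i in range(len(logits))):
        raise ValueError("constraint allows no token in the vocabulary")
    # Mask: disallowed tokens get -inf, so their probability is exactly zero.
    masked = [l if i in allowed else -math.inf for i, l in enumerate(logits)]
    # Numerically stable softmax weights over the masked logits.
    top = max(masked)
    weights = [math.exp(l - top) for l in masked]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical toy vocabulary; a grammar or regex would normally
# recompute `allowed` at every decoding step.
vocab = ["yes", "no", "maybe", "banana"]
logits = [1.0, 0.5, 0.2, 3.0]   # the unconstrained model prefers "banana"
allowed = {0, 1}                # the constraint only permits "yes" or "no"
token = vocab[constrained_sample(logits, allowed)]
```

Whatever the model prefers, `token` can only ever be "yes" or "no", which is why this kind of masking is useful for forcing output into a fixed schema.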

pama

10 replies

1d14h

2024-03-15 04:21:30 UTC

Isn’t that the same with most engineering disciplines though? A nuclear engineer cannot build a nuclear power station at home, a chemical engineer or process chemist doesn’t have access to the industrial grade infrastructure outside of their job, a computer hardware architect cannot hope to design hardware at home and fabricate it at 3nm or better at TSMC. I guess that software engineering was more of an exception to this rule for a while because home computers were amazing enough to help build useful software. Even throughout all these years many people worked on parallel code that runs on large clusters or infrastructure that was not appropriate for operating at home, and now with deep learning a subset of that skill set is very desirable. I agree that additional public contributions to the training process on large clusters would be fantastic; eventually these people will be trained in all the right systems courses and will figure out their way to the jobs where they can apply their skills and grow.

hintymad

8 replies

1d12h

2024-03-15 05:29:03 UTC

But that's not necessarily true for software engineers. The so-called three romances of CS can all be attempted by anyone: compilers, operating systems, and computer graphics. And in recent years, databases, distributed systems, and machine learning algorithms can also be attempted by anyone at home. Only the large models and the associated optimizations are really beyond most people's reach.

gleenn

4 replies

1d11h

2024-03-15 06:49:30 UTC

I'm a CS grad and software engineer and have never in my time heard anyone describe the "three romances". Where did you hear that term and can you give me a TL;DR on why you actually think those fields are the 3?

Joel_Mckay

3 replies

1d9h

2024-03-15 08:53:21 UTC

More like the holy trinity that 99.998% of CS majors couldn't code even if their careers depended on completing it.

https://youtu.be/TRZAJY23xio?feature=shared&t=2346

I don't agree with a lot of what Steve stood for, but he was right about this dynamic range observation.

Cheers =)

eropple

2 replies

1d4h

2024-03-15 13:49:05 UTC

I don't think that's true at all. I think most "CS majors" would need to study how to do those things, but none of them are conceptually difficult and build on the same algorithmic and data structures knowledge you learn for other tasks.

I never built a compiler until I did. I never built an operating system until I did. (I don't remember when I started doing computer graphics, though, because that was a long time ago.)

Joel_Mckay

1 replies

1d3h

2024-03-15 15:14:26 UTC

Sure, one could spend a semester learning to build a rudimentary compiler, or watch your prof build a better solution in under 37 lines of Prolog.

Wish I was joking here... =)

gleenn

0 replies

20h47m

2024-03-15 21:38:31 UTC

Do you have a link to their project? Rich Hickey also always reiterates that Simple isn't Easy. Knowing exactly how something works means you can express it in as concise a way as possible.

molticrystal

0 replies

1d4h

2024-03-15 14:19:15 UTC

I don't think they are denying being able to do any of those things, it just often won't be in the same league. You aren't going to be producing linux, llvm, or unreal, but instead very specific projects which may be remarkable or just a toy: templeos, tinyc, 64k demos. Not ChatGPT, but maybe autocomplete or an inferior llm using the same methods. For the hardware example they gave, while 3nm might be inaccessible, you can fabricate at 180nm, 130nm and 90nm process nodes [0] [1]. Even chemistry and nuclear science aren't beyond the home tinkerer's grasp, but as complexity rises, the ability to acquire, control and synthesize diminishes rapidly.

[0] https://developers.google.com/silicon

[1] https://en.wikipedia.org/wiki/Google_Silicon_Initiative

hashtag-til

0 replies

1d8h

2024-03-15 09:42:25 UTC

I didn't know about the "three romances of CS" terminology. #TIL

dartos

0 replies

1d4h

2024-03-15 13:50:51 UTC

I don’t think anyone who would/could get into low level computer graphics wouldn’t be able to do so in LLM optimization land.

The only barrier to both areas is just the amount of math and gpu knowledge

Nonoyesnoyes

0 replies

20h49m

2024-03-15 21:36:54 UTC

I'm easily able to contribute to a lot of open source projects either out of the box or with little onboarding time.

When I read those optimization blog articles, it feels to me like I would need to take at least half a year or a year as a sabbatical to understand all of it.

chiphuyen

5 replies

1d17h

2024-03-15 00:47:35 UTC

Hi, I'm Chip, the author of the post. I spent waaaay too much time doing research for this. The AI engineering layer was especially hard because so many tools have similar and/or overlapping features. It was also a lot of pain trying to understand the repos that only have Chinese in the README.

The full list of the repos is published here: https://huyenchip.com/llama-police

swyx

1 replies

1d16h

2024-03-15 01:26:33 UTC

great list! as someone who's also trying to map the ai engineering landscape... i wonder what u think of adding other parts of the AI stack (https://www.latent.space/p/dec-2023). right now you have 4 categories and those are all in the text/code-heavy RAG/Agent world, but i think the space has broadened out a bit as i see it. for example, you could add:

- finetuning/other post-pretrain model tools (axolotl, mergekit <- all made and used by people without traditional ML engineer/researcher background)

- multimodal models/frameworks like vocode and comfyui

- AI UX tools like vercel ai sdk

- synthetic data generation tooling? whatever the nous pple have made

open question whether inference frameworks like llama.cpp/ollama or vllm and tgi count as AI Eng tools? again given the background of ggerganov and the students behind the other projects, arguably yes but ofc it starts to bleed into classical mlops here. (update: i see u have them in the "model development" category, ok fair)

chiphuyen

0 replies

1d16h

2024-03-15 01:50:49 UTC

IMO, the classical mlops is closer to the genai stack than most people think. E.g. experiment tracking is the same: with classical mlops, you experiment with hyperparams, with genai, you experiment with prompts. Similarly, finetuning is just an extension of training. Even vector databases for RAG is just vector search + databases, both of which have been around forever.

The post-train world is what I find to be the most fun. Techniques like model merging, constrained sampling, and all the new creative techniques for inference optimization and faster decoding are super cool!
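The remark above that vector databases for RAG are just vector search plus a database can be made concrete with a minimal sketch: store rows of (id, vector, text) and rank them by cosine similarity to a query. The class name `TinyVectorStore`, the two-dimensional vectors, and the documents are hypothetical; real systems replace the linear scan with an approximate nearest-neighbour index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """The 'database' half: rows of (doc_id, vector, text) kept in a list."""

    def __init__(self):
        self.rows = []

    def add(self, doc_id, vector, text):
        self.rows.append((doc_id, vector, text))

    def search(self, query, k=2):
        # The 'vector search' half: brute-force scan, best matches first.
        scored = sorted(self.rows, key=lambda r: cosine(query, r[1]), reverse=True)
        return [(doc_id, text) for doc_id, _, text in scored[:k]]

# Hypothetical embeddings; a real pipeline would get these from a model.
store = TinyVectorStore()
store.add("a", [1.0, 0.0], "doc about compilers")
store.add("b", [0.0, 1.0], "doc about graphics")
store.add("c", [0.9, 0.1], "doc about parsers")
hits = store.search([1.0, 0.0], k=2)
```

Here `hits` contains documents "a" and "c", in that order, since their vectors point closest to the query.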

braza

1 replies

1d16h

2024-03-15 01:32:39 UTC

Hey Chip, thanks for your contributions over the years and the amazing book.

Two questions: From what you researched, how many of those solutions are ready for production? And regarding this mortality, what are, let's say, the top 5 things someone needs to think about before even doing a PoC with those tools?

chiphuyen

0 replies

1d16h

2024-03-15 01:44:43 UTC

Production is a spectrum. Many of the repos I see are still demowares, but at the same time, most companies I've seen are also still at the PoC phase instead of massively scaling up their GenAI use cases.

I don't think the considerations for adopting a tool have changed. It starts from what problem you want to solve, the money/time budget you have for the solution, and the ROI of each solution.

I know it sounds generic, but without more detail, it's hard to give a more concrete answer!

shnkr

0 replies

1d14h

2024-03-15 04:17:45 UTC

cool. this has been my problem all along. thanks for the time you put in.

Der_Einzige

3 replies

1d14h

2024-03-15 04:14:46 UTC

Wish that Automatic1111/Oobabooga/Comfy would be specifically talked about since they are such unique examples of popular open source AI tools.

porkbeer

2 replies

1d13h

2024-03-15 04:28:49 UTC

Well, talk about them! How are they unique?

washadjeffmad

1 replies

1d5h

2024-03-15 12:36:46 UTC

Well, for one, their popularity eclipses the examples given:

AUTOMATIC1111/stable-diffusion-webui - 126K stars

oobabooga/text-generation-webui - 34.4K stars

comfyanonymous/ComfyUI - 28.1K stars

Some people want to avoid drawing attention to chan sites, where a lot of those projects' developers are active, but the omissions are still glaring (a bit like leaving John Prine out of the CMA). These were the first and most famous projects in inference and generative AI, so they tend to be more advanced frameworks, making them less accessible as introductory tools to low-depth end users, which the author doesn't seem to be?

diggan

0 replies

1d5h

2024-03-15 13:20:56 UTC

> Well, for one, their popularity eclipses the examples given:

That's like the least interesting comparison, especially if we want to talk about "unique examples".

I'd say out of those, ComfyUI is probably the most interesting one, as it's naturally extensible and has a node-UI so you can basically reorganize the image generation pipeline and your workflow to however you want.

brianjking

2 replies

1d17h

2024-03-15 00:42:19 UTC

This is a fantastic read.

swyx

1 replies

1d17h

2024-03-15 00:57:36 UTC

elaborate? wondering what stood out to you/what you wanted more detail on.

(am OP but not the author). i always feel like looking at github trends is kinda cool but fail to get deeper insight i can use to inform my work.. it's more of a trailing indicator right?

beauzero

0 replies

1d2h

2024-03-15 16:04:31 UTC

This saved me probably 3 months of work digging up a good summary of the current state of LLMs specifically. This is a highly dense, easily consumable summary for someone like me. Even if it's only 80% encompassing, it's super valuable. Thank you.

Context: Business apps. Bleeding edge curious and supportable day-to-day pragmatic developer. 25+ years web stack, 10% client desktop apps and small business sysadmin. Spending 6+ hours a day currently retooling concepts and where they can fit into day-to-day business stacks. "Retired" from SMB market after spending 25 years there and moved to government hoping for another 20-30. I love what I do.

grbsh

1 replies

1d5h

2024-03-15 12:53:10 UTC

The graph of cumulative repos over time is really interesting — it looks like we may be approaching the end of the S-curve for AI hype. I wonder if the graph will continue to flatten, or if there will be a much higher, longer, slower, and more enduring S curve. I imagine a similar double S curve pattern occurred after 2001 in web. Anyone have ideas for how I could measure this for 1995-2015 web?
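One way to approach the measurement question above, sketched under assumptions: take any cumulative count per period (repos, websites, registered domains), difference it to get new items per period, and check whether recent growth is slower than the growth just before it. The function names and the sample counts below are invented for illustration; they are not data from the article or from the 1995-2015 web.

```python
def new_per_period(cumulative):
    """Turn a cumulative series into per-period increases."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]

def inflection_index(cumulative):
    """Index of the period that saw the largest increase.

    On an S-curve this is roughly where growth stops accelerating.
    """
    growth = new_per_period(cumulative)
    return max(range(len(growth)), key=growth.__getitem__) + 1

def is_flattening(cumulative, window=3):
    """True if average growth over the last `window` periods is lower
    than over the `window` periods before that."""
    growth = new_per_period(cumulative)
    if len(growth) < 2 * window:
        raise ValueError("need at least 2 * window + 1 data points")
    recent = sum(growth[-window:]) / window
    earlier = sum(growth[-2 * window:-window]) / window
    return recent < earlier

# Made-up cumulative counts shaped like a single S-curve.
counts = [1, 2, 4, 9, 20, 40, 60, 71, 76, 78, 79]
peak = inflection_index(counts)
flat = is_flattening(counts)
```

A second, later S-curve would show up as `is_flattening` turning false again after a flat stretch, so running this over a sliding range of periods is one way to look for the double-curve pattern.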

samstave

0 replies

1d5h

2024-03-15 13:03:23 UTC

not very granular, but you could look up Alexa rankings, if available (and whatever the other competitors were), for traffic...

---

Sure, here's what I found:

*Physical Size of the Internet (1995-2000)*:

- In 1995, the Internet had a worldwide user base of less than 40 million⁸.

- By 2000, there were 361 million users worldwide⁸.

- In terms of websites, there were 9,950,491 websites in 2000⁹.

*Internet Bandwidth (1995-2000)*:

- The average internet access speed in 1995 was 24 kbps².

- By 2000, the average internet access speed had increased to 1,116 kbps².

- In terms of telecommunications capacity, it was 2.2 optimally compressed exabytes in 2000⁶.

*Fastest Internet Links (1995-2000)*:

- In the early 1990s, the fastest available modem was capable of transferring data at a maximum speed of 14.4 kilobits per second (kbps)³.

- By the late 1990s, broadband had emerged, offering a maximum theoretical data transfer speed of 512k per second³, which was over nine times as fast as a 56k modem.

gerdesj

1 replies

1d17h

2024-03-15 00:30:59 UTC

This section invites some discussion (says British bloke):

"https://huyenchip.com/2024/03/14/ai-oss.html#the_growing_chi..."

swyx

0 replies

1d17h

2024-03-15 00:39:22 UTC

gotta be careful with china "open source". a lot of them look like this https://github.com/dreamoving/dreamoving-project which is a very nice Apache 2.0 readme file

MattyRad

1 replies

19h54m

2024-03-15 22:31:22 UTC

You can actively see a fresh "hype curve" in the transformer-debugger repo that was posted a couple days ago (https://github.com/openai/transformer-debugger) (star history https://star-history.com/#openai/transformer-debugger&Date).

At the time I saw the repo link posted on HN, it had 1.6k stars/16 hours. What channel/platform are people subscribed to to star it so quickly? Discord? I'm not implying any nefariousness, mind you, I'm only wondering where all the stargazers were referred from so fast and in such volume.

Diris

0 replies

19h33m

2024-03-15 22:52:39 UTC

Personally, I saw it on the tweet from Jan Leike. https://x.com/janleike/status/1767347608065106387?s=20

zone411

0 replies

1d12h

2024-03-15 05:33:34 UTC

Einops and safetensors are not niche! They are just more technical and well-known by people who do more than GPT wrappers starting in 2022 ;)

visitor4712

0 replies

1d13h

2024-03-15 04:40:02 UTC

Thank you for this outstanding work!

userbinator

0 replies

1d14h

2024-03-15 03:58:59 UTC

I find it quite astounding that there are already over 900 open source AI tools (and from the article, it doesn't sound like these are all mainly clones/forks of each other.)

timrogers

0 replies

1d8h

2024-03-15 10:03:14 UTC

Super helpful article - it’s so great to have a zoomed out view of the space.

One question that stands out to me is whether evaluation at the application level should be its own category, rather than folded into bigger groups.

swyx

0 replies

1d17h

2024-03-15 00:26:39 UTC

author highlight tweet thread: https://twitter.com/chipro/status/1768388213008445837

sevagh

0 replies

1d5h

2024-03-15 12:29:30 UTC

Something feels off about sticking Einops (a great utility library for tensor reshaping, been around for years) as a footnote in a "LLM 2023-2024 hype" list - it's an anachronistic inclusion. Feels like the author creates a narrative that starts with LLMs and acts like everything else is a building block for LLMs.

nixlim

0 replies

1d17h

2024-03-15 01:10:38 UTC

Thank you for this. A good read and useful information.

neom

0 replies

1d15h

2024-03-15 02:27:34 UTC

Chip's book is really good FWIW: Designing Machine Learning Systems - https://www.oreilly.com/library/view/designing-machine-learn...