GPT-4 has been out for almost a year now, and it seems that the frantic pace of OpenAI releasing groundbreaking tech every month has come to a halt. Does anyone know what's happening with OpenAI? Has the recent turmoil with sama caused the company to lag, or are they working on some superweapon?
It hasn't come to a halt; there's just too much noise to see what's happening.
We are getting a lot of great models out of China in particular (Yi, Qwen, InternLM, ChatGLM), and some good continuations like Solar.
Lots of amazing papers on architectures and long context are coming out.
Backends are going crazy. Outlines is shoving constrained generation everywhere (sketch at the end of this comment), and LoRAX is an LLM revelation as far as I'm concerned.
But you won't hear about any of this on Twitter/HN. Pretty much the only things people tweet about are vLLM/llama.cpp and Llama/Mistral, but there's a lot more out there than that.
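To give a flavor of what Outlines does: here's a minimal sketch of constrained generation, going by the 0.0.x API (the model name is just an example, and details may differ in your version):

    import outlines

    # Load any HF model; the checkpoint here is only an example.
    model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

    # Constrain output to a fixed set of choices...
    classify = outlines.generate.choice(model, ["positive", "negative"])
    print(classify("Review: 'Works great, would buy again.' Sentiment:"))

    # ...or to anything matching a regex, e.g. an ISO date.
    date = outlines.generate.regex(model, r"\d{4}-\d{2}-\d{2}")
    print(date("The Apollo 11 landing happened on:"))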
Where do you like to keep up to date on these? Arxiv preprints, or some other place?
For Llama-based progress, /r/LocalLlama on Reddit has been my top source of info, although it's been getting a little noisier lately.
I also hang out on a few Discord servers: Nous Research, TogetherAI / Fireworks / OpenRouter, LangChain, TheBloke AI, and Mistral AI.
These, along with a couple of newsletters, basically keep a pulse on things.
Lots of interesting information is fragmented across niche Discords: KoboldAI for merging and RP models in general, LlamaIndex, VoltaML and some others for SD optimization. I could go on and on, and I know only a tiny fraction of the useful AI Discords.
And yeah, /r/LocalLlama seems to be getting noisier.
TBH I just follow people and discuss stuff on Hugging Face directly now. It's not great, but at least it's not Discord.
Surprised someone doesn't just build an AI aggregator for this type of thing; it seems like a really valuable product.
They have! And posted them on HN!
Some are pretty good! Check out this little curated nugget: https://llm-tracker.info/
I used to follow one with a UI that resembled HN itself, but now I can't find it in my bookmarks, lol.
Those rooms move too fast, and they're often segregated good/better/best (meaning the deeper you want to go on a topic, the "harder" it is, politically and labor-wise, to get invited to the server).
Speaking of hard skills: how does one just hang out on a Discord server in any useful fashion? I lost the ability to deal with group chats when I started working full-time; there's no way I can focus on the job and keep track of conversations happening on some IRC or Discord. I wonder what the trick is to use those things as a source of information, other than "be a teenager, student, sysadmin, or otherwise someone with lots of spare time at work", which is what I realized the communities I used to be part of consist of.
I follow https://www.reddit.com/r/LocalLLaMA/
Submissions on HN are welcome.
I have submitted some in the past. Others are submitting them! And I upvote every one I like in /new. But HNers don't really seem interested unless it's llama.cpp or Mistral, and I don't want to spam.
I can't say I blame them either; there is a lot of insane crypto-like fraud in the LLM/GenAI space. I watch the space like a hawk... and I couldn't even tell you how to filter it; it's a combination of self-training from experience and just downloading and testing stuff myself.
I see a ton of papers on X related to this space.
I did try to make a submission of what I thought was an "underreported" LLM two months ago.
https://news.ycombinator.com/item?id=38505986
Zero interest for some reason.
Edit: DeepSeek Coder has 4 submissions on HN with almost zero interest.
submissions << public interest
LoRAX [0] does sound super helpful, so I'd be curious if there are some good examples of people applying it. What are some current working deployments where one has 100s or 1000s of LoRA fine-tuned models? I guess I can make up stuff that makes sense, so that's not really what I'm asking; I'm interested in learning about any known deployments and example setups.
[0] https://github.com/predibase/lorax
There aren't really any I know of, because it's brand new and everyone just uses vLLM :P
No one knows about it! Which is ridiculous, because batched requests with LoRAs are mind-blowing! Just like many other awesome backends: InternLM's backend, LiteLLM, Outlines' vLLM fork, Aphrodite, exllamav2 batching servers, and such. Heck, a lot of trainers don't even publish the LoRAs they merge into base models.
Personally, we are waiting on the constrained-grammar integration before swapping to LoRAX. Then I am going to add exl2 quantization support myself... I hope.
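For anyone wondering what multi-adapter serving looks like in practice, here's roughly what a request against a LoRAX server looks like, going by the project README (host, port, and adapter name here are made-up placeholders):

    import requests

    # Each request can name a different LoRA adapter; the server batches
    # requests for different adapters against the same shared base model.
    resp = requests.post(
        "http://localhost:8080/generate",
        json={
            "inputs": "Summarize this support ticket: ...",
            "parameters": {
                "max_new_tokens": 128,
                "adapter_id": "acme/ticket-summarizer-lora",  # hypothetical adapter
            },
        },
    )
    print(resp.json()["generated_text"])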
FYI, vLLM also just added experimental multi-lora support: https://github.com/vllm-project/vllm/releases/tag/v0.3.0
Also check out the new prefix caching; I see huge potential for batch processing there!
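A rough sketch of the multi-LoRA API, going by the vLLM docs around v0.3.0 (the model and adapter path are placeholders):

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # Base model with LoRA support switched on.
    llm = LLM(model="mistralai/Mistral-7B-v0.1", enable_lora=True)
    params = SamplingParams(temperature=0.0, max_tokens=128)

    # Each generate call can carry its own LoRARequest(name, unique_id, path),
    # so differently fine-tuned adapters share one base model in a batch.
    outputs = llm.generate(
        ["Write a SQL query that counts users by country."],
        params,
        lora_request=LoRARequest("sql_adapter", 1, "/path/to/sql_lora"),
    )
    print(outputs[0].outputs[0].text)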
Missed this, thanks.
Everything is moving so fast!
Very interesting, but the comment you replied to was specifically asking about OpenAI.
Yeah, I misinterpreted "open AI" as "open-source AI".
TBH I do not follow OpenAI much. I like my personal models local, and my workplace likes their models local as well.
I will speculate! They have a model that far surpasses GPT-4 (achieved AGI internally), but sama is back on a handshake agreement that they will only reveal models that are slightly ahead of openly available LLMs. Their reasoning: releasing a proprietary model endpoint slowly leaks the model's advantage, as competitors use it to generate training data.
WTF is AGI?
EDIT: Clearly the point is lost on the repliers. There is no general understanding of what 'general intelligence' is. By many metrics, ChatGPT already has it: it can answer basic questions about general topics. What more needs to be done? Refinement, sure, but transformer-based models have all the qualifications to be 'general intelligence' at this point. The responses are more coherent than those of many people I've spoken with.
Artificial General Intelligence
https://en.m.wikipedia.org/wiki/Artificial_general_intellige...
The problem with this definition 'an agent that can do tasks animals or humans can perform' is that it's not clear what that would look like. If you produce a system that can be interacted with via text input only but is otherwise capable of doing everything a human can do in terms of information processing, is that AGI? Or does AGI imply a human-like embodied form? Why?
Any curve fitting is now called AI, so they had to come up with a new term.
When you take a university or school course, how is that functionally different from curve fitting? Given that arbitrarily complex states can be modeled as high-dimensional curves, all learning is clearly curve fitting, whether in humans, machines, or even at the abiological level (for example, self-optimizing processes like natural selection). Even quantum phenomena are -- at the end of the day -- curve fitting via gradient descent (hopefully it's had enough time to settle at a global minimum!)
Artificial General Intelligence
A meaningless term.
Artificial General Intelligence. OpenAI defines it as "highly autonomous systems that outperform humans at most economically valuable work"
Okay, well, I'll say that at least that's a definition (it has to be good enough at something to make money). Arguably, of course, it already does that. Me personally, I've used it to automate tasks I would previously have shelled out to Fiverr, Upwork, and Mechanical Turk. I've had great success using it to summarize municipal codes from very lengthy documents down to concise explanations of the relevant pieces of information. Since I would previously have paid people to do that (I was running an information service for a friend), I would consider that AGI. I guess the catch here is 'most', but that implies a lot of knowledge about the economy that I don't think OpenAI has. What is 'economically valuable'? Who decides?
At the end of the day, as with most things, AGI is a meaningless term because no one knows what that is.
I wouldn't think this matters as long as you charge enough to use that API. For example, you could have a tiered pricing structure where the first 100k words per month generated costs $0.001 per word, but after that it costs $0.01 per word.
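To make the arithmetic concrete, a quick sketch using the hypothetical rates above:

    def tiered_cost(words: int, tier_limit: int = 100_000,
                    base_rate: float = 0.001, overage_rate: float = 0.01) -> float:
        """Cost under the tiered pricing sketched above (rates are hypothetical)."""
        if words <= tier_limit:
            return words * base_rate
        return tier_limit * base_rate + (words - tier_limit) * overage_rate

    # Generating 1M words: 100k * $0.001 + 900k * $0.01 = $9,100,
    # versus $1,000 at a flat base rate; pricey enough to deter bulk distillation.
    print(tiered_cost(1_000_000))  # 9100.0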
Even then it's still a lucrative proposition, since you only need to generate the dataset once, and then it can be reused.
This kind of pricing would also make it much less compelling for other users. 100k tokens is nothing when you're doing summarization of large docs, for example.
Probably getting bogged down by endless safety testing and paperwork by now
Maybe tangential to this comment, but I had been using 3.5 to write contracts, but a couple of weeks back, even when using the exact same prompts that would've worked in the past, 3.5 started saying, to paraphrase, "for any contracts you need to get a lawyer."
Anyone else had this experience? It seems like they're actually locking down some very helpful use cases, which maybe falls into the "safety" category or, more cynically, in the "we don't want to be sued" category.
As they tack on more and more guardrails, it becomes lazier and less helpful. One way it gets around admitting that it has been instructed not to help you is by directing you to consult a human expert.
I also suspect there's some dynamic laziness parameter used to counteract increased load on their servers. You can ask the same prompt over and over in a new chat throughout the day, and suddenly, instead of writing the code you asked for or completing a task, it will do a small part of the work with "add the code to do xyz here", or explain the steps required to complete a task instead of completing it. It happens with v4 as well.
That's a brilliant risk mitigation mechanism! The AI won't recursively self-improve to superhuman levels if it just keeps getting tired of thinking.
Using it for legal matters is (understandably) explicitly against use policy. It's not meant to be used for legal advice so the response you're receiving is accurate - go get a lawyer.
https://openai.com/policies/usage-policies
It's a toy, it's not meant to be used for anything actually useful - they'll keep nerfing it case by case, because letting users do useful stuff with their models is either leaving money on the table, taking an increased PR/legal risk, or both.
Add 'reducing compute use as much as possible' to the list, I'm sure. I can look at some of my saved conversations from months ago and feel wistful for what I used to be able to do. The Nerfening has not been subtle.
Ha, yeah, that's the right description of my feeling. Like, thanks for being so incredibly useful, ChatGPT, saving me hours of copying and pasting bits and pieces from countless Google searches for "sample warranty deed," only to pull the rug out when I wanted to start depending on your "greatness."
I guess, as one of the other replies here said, it was never "allowed" to give you text for a contract according to the TOS, but then it would've been best if it never replied so effectively to my earlier prompts. Taking it away just seems lame.
Edit: bad typing
GPT4 came out 3 years after GPT3
Slight knitpick
GPT 3.5 came out 2 years after GPT 3
GPT 4 came out 1 year after 3.5
Slight nitpick, but "nitpick" is not spelled with a "k".
knitpic
Actually nitpick IS spelled with a “k”, just not one at the beginning of the word. If we’re gonna be pedantic details matter!
As we now know, GPT-3.5 was essentially an early preview of the "move fast, break things" variety - GPT-4 was already well underway by then.
Law of diminishing returns. The low-hanging fruit has been picked.
https://www.wired.com/story/openai-ceo-sam-altman-the-age-of...
Given how they've struggled to make even GPT-4 economical, it's highly unlikely that larger models would be in any way cost-effective for wide use.
And there's certainly more to be found in training longer on better datasets and adjusting the architecture.
It's important to remember that "not cost-effective" is not the same as "not useful", though.
Depends on how you look at it, I guess. If it's really too expensive and too slow to run for end users, but is able to, say, generate perfect synthetic datasets over time, which can then be used to train smaller and faster models that pay back the original cost, then it is cost-effective, just in a different application.
I don't think that's what that means. I think he means "we don't have to keep making models larger, because we've found ways to make very good models that are also small".
Probably an unpopular opinion:
A probabilistic word generator is still a word generator. It might be slightly better next version, and we've seen it get worse; it's more of the same.
We can talk about AGI all we want, but it won't be built on the same technology as these word generators. There will have to be a technological breakthrough years before we even get close.
The companies focusing on LLMs right now are dealing with
* Generate better words
* Make it cheaper to generate (use less compute)
* Find better training material
There is a ton of money to be made, but it's still more of the same.
And humans are just human generators.
In both cases, intelligence appears to be just an interesting emergent side effect.
1997, Deep Blue beats Garry Kasparov: "And humans are just chess engines. In both cases intelligence appears to be just an interesting side effect."
A few theories:
- They have better stuff, but aren't releasing it yet. They're already at the top; it makes sense they'd hold their cards until someone gets close.
- They are more focused on AGI, and not letting themselves get sidetracked by the LLM race.
- LLMs have peaked and they don't want to release only a minor improvement.
FWIW OpenAI seems to have a corroded definition for AGI that is essentially "[An] AI system generally smarter than humans".
They don't seem to use the typical definition I'm used to, which involves some variation of autonomy or (pseudo-)sentience.
So their LLM race is the race for AGI
This hypothesis has a curious habit of surfacing when OpenAI is fundraising. Together with the world-ending potential of their complete-the-sentence kit.
> Does anyone know what's happening with OpenAI?
Yeah, they're an arms dealer now.
LoRA of war.
I sometimes wonder if they hit a limit the government was comfortable with and their more advanced technologies are only available to the gov. I assume that's your implication.
Maybe there are diminishing returns on improving quality, so they're trying to improve efficiency at a given quality level instead? There is a lot of value in producing results instantly, but more importantly, efficiency will let them serve more users (everyone is bottlenecked on inference capacity) and gain more users by winning on pricing.
Yup, I think this is it. Economies of scale haven't kicked in yet, and if they continue on the same path, it doesn't look achievable.
Thinking differently is again the way forward.
GPT-4 is so far ahead of everything else that it doesn't make much sense to rush a GPT-5 release. They could pick up the extra GPT-5 customers whenever they release it, as they're pretty sure no one else would take those users away in the meantime.
With a higher operating cost as well.
They and Meta have bought up basically all of Nvidia's production capacity to run and train their next models.
OpenAI has a huge number of people working on training data. https://time.com/6247678/openai-chatgpt-kenya-workers/ https://www.nbcnews.com/tech/innovation/openai-chatgpt-ai-jo...
Alignment tax is real.
But they probably have something more powerful internally. GPT-4 took months to be available to the public.
Given the way AI is trained and served, there's likely a huge constraint on the GPU side even if they had GPT-5 ready. Imagine how slow it would be, which is why they focus on creating useful products and APIs around what they have first.
I know, it's really disappointing. They've only completely changed the world like 3 times last year.
Has it come to a halt? If you look only at the absence of GPT-5, then I guess yes. But I see wide access to multimodal models, huge increases in throughput, and not much latency these days. Better turbo models too, though I'd agree that in some areas those have been steps back, with poorer output quality. Add to that list massive cost reductions, so it's easy to justify using any of the models.
I'm sorry, but this comment seems incredibly uninformed. OpenAI released a new version of GPT-4 Turbo quite recently. And just in the last couple of days, they released the ability to use @ to bring any GPT into a conversation on the fly.
Check again in two months