
OpenAI – transformer debugger release

xcodevn
60 replies
10h25m

I must say, understanding how transformers work is arguably the most important research problem in history, assuming that AGI can be achieved by just scaling up current LLM models on text, video, audio, etc.

kindking
44 replies
9h54m

That is a very big assumption.

kromem
40 replies
9h29m

It really depends on the definition.

"Better than the average human at most profitable tasks" is a much lower bar than most people on HN might think.

I have vendors who, instead of filling out a web form that remembers their inputs and eventually even fills everything out for them, print it out and fax it back in.

We're probably only about 2-3 years away from transformers being self-optimizing enough in prompts and evaluations to outpace the average worker in most tasks in most roles. (It won't necessarily be that much cheaper after the multiple passes and context windows required, and crucially probably won't be better at all tasks in most roles.)

If you define AGI as "better than any human at profitable tasks" or "better than average at all tasks" then yes, we're a long ways off and transformers alone probably won't get us there.

lm28469
24 replies
8h3m

"Better than the average human at most profitable tasks"

I think the HN crowd forgets that what really runs the world are min wage workers running around and doing real world things, not code monkeys and glorified typewriters filling Excel sheets. So yes, replacing the bullshit jobs we invented to keep people busy will be relatively easy; that's if you don't account for the fact that you'll now have to create bullshit+ jobs to keep them busy.

And even then we're far away. Sure, it can shit out code for a todo webapp and create semi-realistic images of a monkey eating a burrito, but that's about it. More than a year ago someone bet against me here that ChatGPT would revolutionise the world within a year. Nothing really happened: geeks are excited, execs are buying the hype, tons of money is changing hands, yet there was no 4th industrial revolution.

What happened, though, is that the web is flooded with absolutely useless content, Amazon is full of AI-generated books, and students rely more and more on ChatGPT to generate homework, theses, "find" solutions, &c. It might very well end up being a net negative for the average Joe in the long run.

hotdogscout
18 replies
7h49m

I think the HN crowd forgets that what really runs the world are min wage workers running around and doing real world things, not code monkeys and glorified typewriters filling Excel sheets.

This is not true at all. How many products do you use that come primarily from minimum wage workers?

If the few people responsible for keeping Google Maps running stopped working, the GDP loss would be much bigger than if orders of magnitude more minimum wage workers did the same.

diggan
10 replies
7h47m

Forget "how many", without those people, you'd end up without food on your table. Then the rest of the products you use wouldn't matter.

hotdogscout
8 replies
7h42m

Also not true. I know the big farms of my state and they don't depend on minimum wage workers, and I live in a third world country.

I doubt agricultural wages or truck drivers are minimum wage jobs where you live. Assuming US https://www.nass.usda.gov/Charts_and_Maps/Farm_Labor/fl_allw...

lm28469
3 replies
6h59m

France is one of the biggest agricultural powers in Europe, if not the biggest, yet most farmers can't even generate the equivalent of a 35-hour minimum wage while working 80+ hours a week.

20% of them live in poverty, and half of them make less than 22k euros a year.

Truck drivers earn between min wage and 150% of min wage while being on the road every day with no social life; they drive 8 hours per day and sleep in their fucking truck while some code monkey makes 300k+/year coding memeojis at Apple. Guess which ones will be automated first by OpenAI, lmao.

qvrjuec
1 replies
3h33m

Truck drivers earn between min wage and 150% of min wage

Where are you getting this information? It's absolutely wrong. Long haul truckers (the ones you're saying don't have social lives because they drive 8 hours per day) make $71,196 on average in the US [1].

[1] https://www.ziprecruiter.com/Salaries/LONG-HAUL-Truck-Driver...

allendoerfer
0 replies
1h58m

He is talking about France in the sentence before. There are barely any truckers in Germany with German nationality; they are simply not competitive. Same goes for package delivery.

Just imagine what would happen to a trucker's salary in the US if it were to create a unified market with Mexico and all of Central America.

https://en.wikipedia.org/wiki/2004_enlargement_of_the_Europe...

It's not necessarily a bad thing. The economies of Eastern European countries have been growing, after all, and Western Europe does not have enough workers because of its demographics anyway. My take is that everybody is winning and there is less poverty than before, but some side effects look ugly for a while.

hotdogscout
0 replies
2h54m

half of them make less than 22k euros a year

I'd be extremely happy making this amount. Some people are just accustomed to easier lives.

while some code monkey makes 300k+/year coding memeojis at apple.

A meme position for a privileged caste, largely irrespective of skill, in an institution that can piss money on anything and still succeed.

diggan
3 replies
6h58m

Farm work, especially work that doesn't require specialization (planting, maintaining, harvesting), is pretty much minimum wage work where I live, in Spain. Minimum wage here is ~1300 EUR / month. But it also differs wildly by region here, as some regions are really poor while others rich (relatively).

Besides the farm work, there are food processing workers (cutting, cleaning, basically assembly lines), packaging, workers at warehouses, people who work the counters of the store, and all the support roles for those positions. If you go out to eat, you have all the restaurant personnel to take into account as well.

There is a lot of low skilled labor that goes into making what we eat today. I'm not sure how you could possibly claim that none of those people are on minimum wage.

hotdogscout
2 replies
6h43m

Not all of the work you cited is essential. Would society crumble without retail?

Minimum wage in Spain is significantly more money than anything I've made in my life. It's a very comfortable position for the vast majority of the world.

There is a lot of low skilled labor that goes into making what we eat today. I'm not sure how you could possibly claim that none of those people are on minimum wage.

People doing essential work that isn't trivially replaceable have the bargaining power to charge more than the minimum wage in a moderately free market for human work, and usually they do.

squigz
1 replies
3h28m

Not all of the work you cited is essential. Would society crumble without retail?

Did I miss the part where the other comment mentioned retail, or where you respond to the half dozen other examples of essential work?

Minimum wage in Spain is significantly more money than anything I've made in my life. It's a very comfortable position for the vast majority of the world.

Instead of moving the bar some more, could you just define what minimum wage would be an acceptable bar for you in this conversation?

hotdogscout
0 replies
3h7m

Yes you missed retail, read it again.

https://uk.indeed.com/career/warehouse-worker/salaries

Do you really need me to Google every single known essential position before conceding that society is not maintained by minimum wage workers?

CamperBob2
0 replies
2h36m

Take a look at what happened to farm employment figures over the last 100 years.

It was a good thing.

lm28469
6 replies
7h8m

The people picking up trash in my street stopped working for 2 days and it looks like I live in some third world country now. Two fucking days and it looks like I live in the middle of an open air dump.

If trucks stopped deliveries every city would die in a week

If construction workers stopped building / maintaining we'd be in deep shit in 6 months or less

If the people in warehouses stopped working for a week the economy would tank like it rarely does

Nurses, doctors, bus/tram/train drivers, police, firefighters, ambulances, janitors, trash pickers, plumbers, sewer workers, electricians, people taking care of water treatment plants, power plants, teachers, social workers, ...

You could delete facebook, openai, instagram, twitter, netflix, tesla and 90% of startups from the face of the earth right now and I'd have the exact same life as yesterday. Remove any of the people I mentioned above and society would crumble in no time

And none of these are even remotely close to being automated at all; nobody cares about most of these jobs. But hey, here is a dancing kangaroo: https://www.youtube.com/watch?v=Zuivg5rz_aA

hotdogscout
5 replies
6h52m

Are any of the positions you cited minimum wage workers where you live? Again, assuming US:

https://money.cnn.com/2016/02/24/news/economy/trash-workers-...

https://money.usnews.com/careers/best-jobs/garbage-collector...

You could delete facebook, openai, instagram, twitter, netflix, tesla and 90% of startups from the face of the earth right now and I'd have the exact same life as yesterday. Remove any of the people I mentioned above and society would crumble in no time

Yes because you picked non-essential work. (?)

lm28469
3 replies
6h40m

Yes because you picked non-essential work. (?)

Then again, tell me: who are we automating out of the workforce right now? Trash pickers or code monkeys? Truck drivers or artists?

hotdogscout
0 replies
6h35m

Essential and non-replaceable are different concepts.

fkyoureadthedoc
0 replies
6h16m

All of them? We've been working on and succeeding at automating physical tasks for decades.

evilduck
0 replies
4h18m

Growing up my trash was picked up by a human and the truck crew had two or three people on it jogging house to house to pick up trash as the driver slow rolled through the neighborhood.

Now my trash is serviced by one person who mostly never leaves the cab and who would be better described as a skilled machine operator than as a menial labor role. The work isn't completely automated but technology has reduced the job market for trash truck crews by two-thirds. I'm guessing the barrier is higher now too, requiring training and certifications to run the robotics on the truck instead of physical fitness being the primary prior qualification.

rvnx
0 replies
6h40m

Even further, maybe the world would actually be better without these companies.

Now that there are great inventions like TikTok, teenagers are depressed as hell, and they don't go to meet each other to play soccer together, because the "social" networks are giving the illusion of having that connection.

joenot443
1 replies
5h15m

I think the HN crowd forgets that what really runs the world are min wage workers running around and doing real world things

Is this really true? It's certainly a nice soundbite when you're making class arguments or trying to dunk on the "HN crowd", but I think it falls apart under any level of scrutiny.

Who keeps your lights on? Who drives your bus? Who manages your sewage? Who teaches your kids? Who builds your roads? None of them make minimum wage and would probably be a little insulted to be characterized as such.

It's pretty reductionist to call anyone outside our realm of tech a "min wage worker"; they're just workers like you or me. I think it's a pretty stupid and pointless exercise to subdivide people into useful or non-useful workers, serving no purpose but to further pet the smugness of HN AI skeptics.

weakfish
0 replies
2h31m

I think this comment focuses too much on the “minimum wage” aspect - the core of the argument is that those are roles not at risk to AI in its present state, not necessarily the compensation aspect

ecoquant
0 replies
6h15m

This all reminds me of my asshole great uncle making fun of me as a teenager circa 1997 while I was at the computer and on the internet.

He sarcastically asked me, "Can you meet girls on that?", and then laughed.

He wasn't wrong in the short term but laughably wrong in the long term.

_Algernon_
0 replies
7h15m

there was no 4th industrial revolution.

Yet. The industrial revolution didn't happen in a year either.

CamperBob2
0 replies
2h38m

"More than a year ago"? Really? What did anyone think was going to happen in a year?

This sort of thing usually takes longer than you expect it to, and then it usually happens faster than you expect it to.

shafyy
11 replies
9h27m

"Better than the average human at most profitable tasks"

This is not the definition of AGI. You can't just make up a random definition to fit your argument, lol.

ibejoeb
2 replies
9h11m

What is the definition? The definition of AGI is one of the central points of contention in the biggest industry legal battle.

SilverBirch
1 replies
8h21m

It's only in contention because one of the sides has a tonne of money, a hurt ego, and is willing to pay lawyers to argue the sky is red in order to get revenge on his former colleagues. I don't think anyone would seriously claim OpenAI has achieved AGI today.

exe34
0 replies
7h40m

No, what they have is several narrow ASIs.

exe34
2 replies
9h19m

No, that's the economically dominant definition. The philosophical one will happen much later or may never happen, but human society may change beyond recognition with the first one alone.

exe34
0 replies
5h12m

Dennett describes it in terms of "real magic": the magic that can actually be performed is not considered real magic (it's merely a conjuring trick), whereas "real magic" is the magic that couldn't possibly be done.

worldsayshi
0 replies
9h18m

I don't think the main intention was to define AGI but to zoom in on an interpretation of AGI that would provide enough value to be revolutionary.

johnthewise
0 replies
9h18m

There is no single definition of AGI. Performing most intellectual tasks humans perform today is both general and a form of intelligence, so I too agree with it.

infecto
0 replies
6h6m

My understanding is that AGI has no formal definition as it means different things to different people.

The poster here created his own definition, but what is wrong with that? He set a very specific bar to achieve, something that most "high-level" thinkers in the space have not really done. Isn't the point of discourse to bring your ideas to the table?

falcor84
0 replies
9h19m

I'm actually all in on people making up new definitions for vague terms at the start of an argument as long as they're explicit about it.

And I particularly like this one, which is much more clearly measurable. If you feel AGI is taken, maybe we should coin this one as APGI or something like that

belter
0 replies
8h9m

This is the correct way to approach the question of when and how to achieve AGI. Otherwise, please present here your engineering specification for defining AGI...

On a timeframe for achieving AGI: https://youtu.be/cEg8cOx7UZk?t=1658

goatlover
0 replies
3h38m

Robotics is more important to AGI, because the bulk of human intelligence comes from manipulating and navigating the physical world, which includes a large amount of social interaction. LLMs are tools to assist humans. They aren't automating most jobs away anytime soon.

Jensson
0 replies
2h15m

It needs to be better than an average expert. Humans are general intelligences, since you can train a human to do anything, so a generally intelligent machine needs to be trainable to the level of a human expert; matching an average untrained human isn't worth much.

Buttons840
0 replies
4h54m

I have vendors who, instead of filling out a web form that remembers their inputs and eventually even fills everything out for them, print it out and fax it back in.

Somewhere along the way we built computers so unintuitive that people find printing and faxing easier than our web apps. This isn't completely the fault of any single web app; users have a lot of learned avoidance because of bad experiences with many apps.

In the end, completely automating the job ended up being easier than building a good interface for a human to do the job.

ImHereToVote
2 replies
9h38m

Which might turn out to be correct. Might be wrong also. We have no priors for AGI developing, only NGI, and we know precious little about how to achieve NGI too, except the bedroom way.

joaogui1
0 replies
9h18m

In vitro fertilization too!

furyofantares
0 replies
9h12m

We have a lot of priors - everything we've ever done has not produced AGI.

Maybe scaling transformers is the best way forward. I'm hopeful. But it's a big assumption that it will produce AGI.

reexpressionist
4 replies
8h28m

We may well look back in future years and view the underlying approach introduced in Reexpress as among the more significant results of the first quarter of the 21st century. With Reexpress, we can generate reliable probability estimates over high-dimensional objects (e.g., LLMs), including in the presence of a non-trivial subset of distribution shifts seen in practice. A non-vacuous argument can be made that this solves the alignment/super-alignment problem (the ultimate goal of the line of work in the post above, and why I mention this here), because we can achieve this behavior via composition with networks of arbitrary size.

Because the parameters of the large neural networks are non-identifiable (in the statistical sense), we operate at the unit of analysis of labeled examples/exemplars (i.e., the observable data), with a direct connection between the Training set and the Calibration set.

This has important practical implications. It works with essentially any generative AI model. For example, we can build an 'uncertainty-aware GPT-4' for use in enterprise and professional settings, such as law: https://github.com/ReexpressAI/Example_Data/blob/main/tutori...

(The need for reliable, controllable estimates is critical regardless of any notion of AGI, since the existing LLMs are already getting baked into higher-risk settings, such as medicine, finance, and law.)

infecto
2 replies
6h4m

I hope you are not part of the founding team, but if you are, you truly are doing your startup a disservice. Sharing your startup/ideas is great, but doing it in the form of an advertisement ("underlying approach introduced in Reexpress as among the more significant results of the first quarter of the 21st century") is just weird.

pfd1986
0 replies
5h30m

Given his username I'd say.. he is.

Agreed: disclaimer needed.

blackoil
0 replies
4h59m

Yeah, I am all in for hustling but this post is way over the top, particularly for this forum.

12345hn6789
0 replies
4h43m

Re: this is an advertisement for a product by its own employees.

logicchains
4 replies
9h8m

We already understand how transformers work: their architecture can learn to approximate a very large class of functions-on-sequences (specifically continuous sequence-to-sequence functions with compact support: https://arxiv.org/abs/1912.10077). It can do it more accurately than previous architectures like RNNs because transformers don't "forget" any information from prior items in the sequence. Training transformers to predict the next item in sequences eventually forces them to learn a function that approximates a world model (or at least a model of how the world behaves in the training text/data), and if they're large enough and trained with enough data then this world model is accurate enough for them to be useful for us.
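As a toy illustration of the "no forgetting" point (plain NumPy, not any particular model's code): in a decoder-style transformer, causal self-attention gives every position a direct weighted connection to all earlier positions, rather than funneling history through a fixed-size recurrent state as an RNN does.

```python
import numpy as np

def causal_attention_weights(q, k):
    """Attention weights where position i may only attend to positions <= i."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    future = np.triu(np.ones_like(scores, dtype=bool), 1)  # strictly-upper triangle
    scores[future] = -np.inf                               # mask out future positions
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))  # 4 tokens, 8-dim queries
k = rng.normal(size=(4, 8))  # 4 tokens, 8-dim keys
w = causal_attention_weights(q, k)

# Each row is a distribution over the visible prefix, and the last token
# puts nonzero weight on *every* earlier token: direct access to the whole
# history, unlike an RNN's single compressed hidden state.
print(np.allclose(w.sum(axis=-1), 1.0))  # True
print(bool((w[-1] > 0).all()))           # True
```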

If you're asking for understanding the actual internal world model they develop, it's basically equivalent to trying to understand a human brain's internal world model by analysing its neurons and how they fire.

theGnuMe
1 replies
6h42m

Training transformers to predict the next item in sequences eventually forces them to learn a function that approximates a world model

There is absolutely no proof for this statement.

H8crilA
0 replies
5h29m

Have you used ChatGPT? It can do (at least) simple reasoning, for example simple spatial reasoning or simple human behavior reasoning.

zamfi
0 replies
4h38m

it's basically equivalent to trying to understand a human brain's internal world model by analysing its neurons and how they fire

A major challenge here is that it's very hard/expensive to analyze these neurons, especially at any kind of scale within one human.

Not so with LLMs.

seydor
0 replies
5h15m

We shouldn't assume that either of those tasks is impossible

xxs
1 replies
8h10m

most important research problem in history

That has to be some extremely narrowed version of all research that has happened (or will happen?)

seydor
0 replies
8h0m

Considering that knowing how knowing works is at the top of the ordo cognoscendi, it's not that narrow.

otabdeveloper4
1 replies
5h14m

the most important research problem in history

Probably not.

assuming that AGI can be achieved by just scaling up current LLM models

Lmao.

xcodevn
0 replies
2h17m

Probably not.

Lmao.

resource_waste
0 replies
5h22m

assuming that AGI can be achieved by just scaling up current LLM models on text, video, audio, etc.

Is any sane person actually trying this?

I can't imagine an LLM ever going AGI.

seanieb
24 replies
10h58m

Can't help but think Elon's lawsuit will trigger more releases by OpenAI. His core arguments are BS, but raised legitimate questions about their non-profit status and lack of non-profit related activity.

nonethewiser
7 replies
5h28m

His core arguments are BS, but raised legitimate questions about their non-profit status and lack of non-profit related activity.

These legitimate questions about their non-profit status ARE his core arguments. His complaints are the same as the popular notion on hackernews that OpenAI has failed to be open.

jhonof
2 replies
3h23m

They are part of his core argument, but not the entire argument. The (imo) shakier part of the argument is that he claims to be entitled to damages even though he doesn't own shares in the company. This is atypical; most of the time you lose that right once you get rid of your shares. It seems like he is arguing that he was misled and sold his shares because of that, which will be hard to prove, and even if it is true, isn't really illegal.

_diyar
1 replies
2h52m

I believe at the time it was reported that Musk and the OpenAI board chose to part ways due to his for-profit AI research (in Tesla) posing a conflict of interest to the OpenAI mission.

Hence his argument boils down to: "You made me sell my shares because my 'closed' AI research conflicted with your 'open' non-profit ideals. However, you have since stopped adhering to your mission statement, effectively ceasing to publish your research findings and pursuing for-profit goals. As a consequence, I have lost a bunch of money."

And as nuts as Musk is, I kind of see his point here.

jhonof
0 replies
6m

That argument doesn't really hold a lot of water with me, to be honest. If I sell my shares in a company because it isn't working on tech I agree with, and then it pivots, starts working on tech I agree with, and its shares pop, I am not entitled to sue for damages.

John23832
2 replies
2h49m

These legitimate questions about their non-profit status ARE his core arguments.

Except for the portion where he advocated for a capped-profit arm which he would run under Tesla, the part that is shown in the OpenAI emails. It seems he was okay with it when HE stood to benefit.

You can't believe that Elon is "a great businessman" while also believing that he is totally altruistic and ignores all of the self-serving impulses that have gotten him to where he is.

revscat
1 replies
1h47m

GP is discussing legal arguments, not motivations.

John23832
0 replies
50m

While I am not a lawyer, I think it’s pretty apparent you can’t sue someone for damages for participating in an action that you conspired to be a part of.

refulgentis
0 replies
4h32m

Erm, you get it, it's right in your comment: the lawsuit is about the non-profit status, while HN cares about open weights.

gurgunday
6 replies
6h49m

Why do you think they are BS? You say it raised legitimate questions about their non-profit status and lack of non-profit related activity.

myko
3 replies
4h30m

I think some believe Elon's argument is BS because Elon claims to be upset about OpenAI charging for their work. However, internal communications/texts show he wanted to do the same thing. So it seems like sour grapes that he isn't the one in charge.

ramblerman
0 replies
4h2m

An argument isn’t good or bad because of the motivations of the person behind it.

I get that Elon is not popular right now. But jeez, this is playground-level emotional logic.

_diyar
0 replies
2h44m

I detailed my understanding of this case in a sibling-comment to this one, but I think the fact that he wanted to "do the same thing" (read: use the for-profit engine to fund AI research) only makes his case stronger!

If I understand correctly, the OpenAI board asked Musk to remove himself and sell his stake as his for-profit pursuits conflicted with the mission of open, non-profit research. But after he left, they started a for-profit subsidiary and did exactly what they didn't want him doing. I could see how a judge might side with him.

GreedClarifies
0 replies
2h37m

Let's be real. They think Elon's arguments are BS since he bought Twitter and he destroyed their Leftie safe space.

They are also very upset with Elon's very public shift away from the current Left.

It's Elon derangement syndrome.

rvz
1 replies
6h22m

Why do you think they are BS?

Because me and everyone else here really hates Elon Musk.

You say it raised legitimate questions about their non-profit status and lack of non-profit related activity.

As soon as OpenAI accepted that investment from Microsoft, it has only been used as a for-profit proxy for Microsoft to have its own 'DeepMind' to bring down Google, which Microsoft has tried for years with Bing and the Scroogled campaigns.

OpenAI is now no better than Google DeepMind: both are closed and secretive, and OpenAI has ditched its charter and non-profit status and continues to make excuses for not giving back its research.

marknutter
0 replies
3h4m

Because me and everyone else here really hates Elon Musk.

Speak for yourself.

resource_waste
4 replies
5h23m

When I said this last year, people said

"BUT THEY ARE DEMOCRATIZING IT"

It's good to see that I was always right, even last year. I just didn't have a dad who owned an emerald mine.

simondotau
3 replies
4h39m

People seem to imagine that Errol Musk is dead and Elon got a big inheritance from his daddy's estate. In fact Errol is still very much alive, became bankrupt sometime in the 1990s, and is now a deadbeat living off financial support from his sons.

Was there ever an emerald mine? Maybe. Though nobody has been able to find a scrap of evidence for it. Literally the only "evidence" is Errol's own boast that he bought a half-share of a Zambian mine for 100k ZAR ($50k USD). Hardly the stuff generational empires are made of.

garyfirestorm
1 replies
4h25m

If you had bought $50k worth of $AAPL in 1990, by conservative estimates it would be worth $560 million today. I'd argue that's generational stuff right there.

bitcurious
0 replies
4h20m

If you bought the winning lottery ticket, you’d be a lottery winner.

n2d4
0 replies
4h3m

The Zambian mine is the emerald mine everyone is talking about. From Walter Isaacson's biography:

> [Errol and a Zambian businessman] agreed on a price [for Errol's plane], and instead of taking a payment in cash, Errol was given a portion of the emeralds produced at three small mines that the entrepreneur owned in Zambia. [...]

> Errol, who never had an ownership stake in the mine, expanded his trade by importing raw emeralds and having them cut in Johannesburg. "Many people came to me with stolen parcels," he says. "On trips overseas I would sell emeralds to jewelers. It was a cloak-and-dagger thing, because none of it was legal." After producing profits of roughly $210,000, his emerald business collapsed in the 1980s when the Russians created an artificial emerald in the lab. He lost all of his emerald earnings.

$210,000 in 1986 would be about $600,000 today.

furyofantares
1 replies
2h19m

raised legitimate questions about their non-profit status and lack of non-profit related activity

What is "non-profit related activity"? IANAL but the only justification needed for non-profit designation is not being for-profit.

And maybe their structure with a for-profit subsidiary is up for debate because of this. But you don't get to be non-profit by doing "good stuff" like open sourcing things. You get it by investing all profits back into the business.

Evil Corp for Doing Evil could be a non-profit if it invested all its profits back into the business.

LordDragonfang
0 replies
1h7m

What is "non-profit related activity"? IANAL but the only justification needed for non-profit designation is not being for-profit.

Good thing you are NAL, because this is flat out wrong. "Non-profit" is a term used to refer to specific tax-exempt institutions under US tax law 501(c), and has nothing to do with investing profits "back into the business" (otherwise, you could have called Amazon or Uber "non-profits").

OpenAI[1] is specifically a 501(c)(3)[2] which explicitly requires the organization's purpose be one of the following "exempt" purposes:

charitable, religious, educational, scientific, literary, testing for public safety, fostering national or international amateur sports competition, and preventing cruelty to children or animals. [3]

If it does not continue to fill one of those purposes, it can lose its tax-exempt nonprofit status.

[1] https://projects.propublica.org/nonprofits/organizations/810...

[2] https://www.irs.gov/charities-non-profits/charitable-organiz...

[3] https://www.irs.gov/charities-non-profits/charitable-organiz...

jeanlucas
0 replies
2h9m

Remember, open source is just for the beginning, and for recruiting purposes.

brucethemoose2
0 replies
3h28m

My concern is that Elon has shut the door with a sloppy lawsuit.

If some other entity (a government agency? another company?) comes around to pull this string, it's going to set a bad precedent.

bilsbie
2 replies
7h3m

How many transformers are in a typical LLM? Or is the whole thing considered a transformer?

jerpint
0 replies
6h15m

The architecture as a whole is referred to as a transformer (an autoregressive, decoder-only architecture in the case of GPT-style LLMs).

Note that there are also other types of LLMs that are not necessarily transformer-based (SSMs, RNNs, etc.).

dfgtyu65r
0 replies
6h57m

Normally, the LLM is composed of multiple transformer blocks, where each block consists of a (multi-head) attention component and a fully connected feed-forward component. These blocks are then stacked on top of each other to give the final output of the network.
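As a rough sketch of that stacking (plain NumPy, illustrative only; it omits layer normalization, multiple heads, causal masking, and other details real models use):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # Single-head self-attention over a (seq_len, d) matrix of token vectors.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    return softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v

def transformer_block(x, Wq, Wk, Wv, W1, W2):
    # One block: attention sub-layer, then feed-forward sub-layer,
    # each wrapped in a residual connection.
    x = x + attention(x, Wq, Wk, Wv)
    x = x + np.maximum(0, x @ W1) @ W2  # two-layer ReLU feed-forward
    return x

# The full network stacks many such blocks on top of each other.
rng = np.random.default_rng(0)
d, seq_len, n_blocks = 16, 5, 4
x = rng.normal(size=(seq_len, d))
for _ in range(n_blocks):
    Wq, Wk, Wv, W1, W2 = (rng.normal(size=(d, d)) * 0.1 for _ in range(5))
    x = transformer_block(x, Wq, Wk, Wv, W1, W2)
print(x.shape)  # (5, 16)
```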

jolux
0 replies
2h17m

And isort, which Ruff also supports.

paulcjh
0 replies
10h0m

A very meagre attempt to look like they provide open source tools to help the world safely make AGI.

nuz
0 replies
4h7m

Yearly obligatory open source drop. What was it last time, Whisper?

CityOfThrowaway
0 replies
14h12m

This is pretty cool. I find it intriguing that we're doing neural surgery on LLMs!