I have to admit that I only read the abstract, but I am generally skeptical whether such a highly formal approach can help us answer the practical question of whether we can get LLMs to answer 'I don't know' more often (which I'd argue would solve hallucinations).
It sounds a bit like an incompleteness theorem (which in practice also doesn't mean that math research is futile) - yeah, LLMs may not be able to compute some functions, but the hallucination problem isn't about LLMs needing to know everything. The problem that we care about is the 'I don't know'-answering problem, which may still be computable.
To answer "I don't know" requires one to know when you know. To know when you know in turn requires understanding.
How did LLMs get this far without any concept of understanding? How much further can they go until they become “close enough”?
This is a fair question: LLMs do challenge the easy assumption (as made, for example, in Searle's "Chinese Room" thought experiment) that computers cannot possibly understand things. Here, however, I would say that if an LLM can be said to have understanding or knowledge of something, it is of the patterns of token occurrences to be found in the use of language. It is not clear that this also grants the LLM any understanding that this language refers to an external world which operates in response to causes which are independent of what is or might be said about it.
Explain Sora. It must, of course, have at least a blurry understanding of reality to even produce those videos.
I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality. Even the LLMs that are weaker than Sora understand things.
What is debatable is whether LLMs are conscious. But whether they can understand something is a pretty clear yes. But do they understand everything? No.
If by “understand” you mean “can model reasonably accurately much of the time” then maybe you’ll find consensus. But that’s not a universal definition of “understand”.
For example, if I asked you whether you “understand” ballistic flight, and you produced a table that you interpolate from instead of a quadratic, then I would not feel that you understand it, even though you can kinda sorta model it.
And even if you do, if you didn’t produce the universal gravitation formula, I would still wonder how “deeply” you understand. So it’s not like “understand” is a binary I suppose.
Well what would you need to see to prove understanding? That's the metric here. Both the LLM and the human brain are black boxes. But we claim the human brain understands things while the LLM does not.
Thus, what output would you expect from either of these boxes to demonstrate true understanding in answer to your question?
It is interesting that you are demanding a metric here, as yours appears to be like duck typing: in effect, if it quacks like a human...
Defining "understanding" is difficult (epistemology struggles with the apparently simpler task of defining knowledge), but if I saw a dialogue between two LLMs figuring out something about the external world that they did not initially have much to say about, I would find that pretty convincing.
Without a metric, no case can be made. All conversation about this topic is just conjecture with no path to a conclusion.
This is a common misunderstanding, one also seen with regard to definitions. When applied to knowledge acquisition, it suffers from a fairly obvious bootstrapping problem, which goes away when you realize that metrics and definitions are rewritten and refined as our knowledge increases. Just look at what has happened to concepts of matter and energy over the last century or so.
You are free to disagree with this, but I feel your metric for understanding resembles the Turing test, while the sort of thing I have proposed here, which involves AIs interacting with each other, is a refinement that makes a step away from defining understanding and intelligence as being just whatever human judges recognize as such (it still depends on human judgement, but I think one could analyze the sort of dialogue I am envisioning more objectively than in a Turing test.)
No, it's not a misunderstanding. Without a concrete definition of a metric, comparisons are impossible, because everything is based on wishy-washy conjectures about vague and fuzzy concepts. Hard metrics bring in quantitative data. It shows hard differences.
Even if the metric is some side marker that is later found to have poor correlation or causal connection with the thing being measured, the hard metric is still valid.
Take IQ. We assume IQ measures intelligence. But in the future we may determine that, no, it doesn't measure intelligence well. That doesn't change the fact that IQ tests still measured something. The score still says something definitive.
My test is similar to the Turing test. But so is yours. In the end there's a human in the loop making a judgment call.
This is rather self-contradictory: you insist we can't make progress with wishy-washy conjectures on vague and fuzzy concepts, and yet your entire argument in this thread for your claim that machine understanding of the real world has been achieved is based on exactly that: your personal subjective assessment of LLM performance!
In your final paragraph, you attempt to suggest that my proposed test is no better than the Turing test (and therefore no better than what you are doing), but as you have not addressed the ways in which my proposal differs from the Turing test, I regard this as merely waffling on the issue. In practice, it is not so easy to come up with tests for whether a human understands an issue (as opposed to having merely committed a bunch of related propositions to memory) and I am trying to capture the ways in which we can make that call.
You entered this debate saying "I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality", yet your post here ends with "in the end there's a human in the loop making a judgment call", explicitly acknowledging that your strong initial claims are matters of opinion, rather than established facts supported by hard metrics.
No it's not. I based my argument on a concrete metric. Human behavior. Human input and output.
No offense intended, but I disagree. There is a difference, but that difference is trivial to me. To LLMs, talking is also unpredictable. LLMs aren't machines directed to specifically generate creative ideas; they only do so when prompted. Leaving one to its own devices to generate random text does not necessarily lead to new ideas. You need to funnel it in the right direction.
There are thousands of quantitative metrics. LLMs perform especially well on these. Do I refer to one specifically? No. I refer to them all collectively.
I also think you misunderstood. Your idea is about judging whether an idea is creative or not. That's too wishy-washy. My idea is to compare the output to human output and see if there is a recognizable difference. The second idea can easily be put into an experimental quantitative metric in the exact same way the Turing test does it. In fact, like you said, it's basically just a Turing test.
Overall AI has passed the Turing test, but people are unsatisfied. Basically they just need to make a harsher Turing test to be convinced. For example, tell the person up front that the thing inside the computer may be an LLM and not a person, and have them investigate directly to uncover its true identity. If the LLM can successfully deceive the human consistently, then that is literally the final bar for me.
What are these "thousands of quantitative metrics" on which you base your latest claims? If you have had them on hand all this while, it seems odd that you have not made use of them so far.
Hey, no offense, but I don't appreciate this style of commenting where you say it's "odd." I'm not trying to hide evidence from you, and I'm not intentionally lying or making things up in order to win an argument here. I thought of this as an amicable debate. Next time, if you just ask for the metric rather than saying it's "odd" that I don't present it, that would be more appreciated.
I didn't present evidence because I thought it was obvious. How are LLMs compared with one another in terms of performance? Usually with quantitative tests. You can feed them any number of these tests, including the SAT, the bar exam, the ACT, IQ tests, the SAT II, etc.
There are also LLM-targeted tests:
https://assets-global.website-files.com/640f56f76d313bbe3963...
Most of these tests aren't enough, though, as the LLM is remarkably close to human behavior and can do comparably well and even better than most humans. That last statement would usually make you think those tests are enough, but they aren't, because humans can still detect whether or not the thing is an LLM with a longer, targeted conversation.
The final test is really giving a human, with full knowledge of the task, a full hour of investigating an LLM to decide whether it's a human or a robot. Whether the LLM can deceive the human is a hard true/false quantitative metric. That's really the only type of quantitative test left where there is a detectable difference.
Are you telling me that WW1 artillery crews didn't understand ballistics? Because they were using tables.
There's no difference between doing something that works without understanding and doing the exact same thing with understanding.
You’ve decided that your definition of “understanding” is correct. Ok.
The author of the post to which you are replying seems to be defining "understanding" as merely meaning "able to do something."
The author of the post is saying that understanding something can't be defined because we can't even know how the human brain works. It is a black box.
The author is saying that at best you can only set benchmark comparisons. We just assume all humans have the capability of understanding, without even really defining the meaning of understanding. And if a machine can mimic human behavior, it must also understand.
That is literally how far we can go from a logical standpoint. It's the furthest we can go in terms of classifying things as capable of understanding, not capable, or close.
What you're not seeing is that the LLM is not only mimicking human output to a high degree; it can even produce output that is superior to what humans can produce.
What the author of the post actually said - and I am quoting, to make it clear that I'm not putting my spin on someone else's opinion - was "There's no difference between doing something that works without understanding and doing the exact same thing with understanding."
I'm the author. To be clear. I referred to myself as "the author."
And no I did not say that. Let me be clear I did not say that there is "no difference". I said whether there is or isn't a difference we can't fully know because we can't define or know about what "understanding" is. At best we can only observe external reactions to input.
That was just about guaranteed to cause confusion, as in my reply to solarhexes, I had explicitly picked out "the author of the post to which you are replying", who is cultureswitch, not you, and that post most definitely did make the claim that "there's no difference between doing something that works without understanding and doing the exact same thing with understanding."
It does not seem that cultureswitch is an alias you are using, but even if it is, the above is unambiguously the claim I am referring to here, and no other.
As for the broader issues, we have already continued that discussion elsewhere: https://news.ycombinator.com/item?id=39503027
My mistake. I misread and thought you were referring to me.
I think there are two axes: reason about and intuit. I "understand" ballistic flight when I can calculate a solution that puts an artillery round on target. I also "understand" ballistic flight when I make a free throw with a basketball.
On writing that, I have an instinct to revise it to move the locus of understanding in the first example to the people who calculated the ballistic tables, based on physics first-principles. That would be more accurate, but my mistake highlights something interesting: an artillery officer / spotter simultaneously uses both. Is theirs a "deeper" / "truer" understanding? I don't think it is. I don't know what I think that means, for humans or AI.
I do not understand these comments at all. Sora was trained on billions of frames from video and images - they were tagged with words like "ballistic missile launch" and "cinematic shot" and it simply predicts the pixels like every other model. It stores what we showed it, and reproduces it when we ask - this has nothing to do with understanding and everything to do with parroting. The fact that it's now a stream of images instead of just 1 changes nothing about it.
What is the difference between a machine that for all intents and purposes appears to understand something to a degree of 100 percent and a human?
Both the machine and the human are a black box. The human brain is not completely understood and the LLM is only trivially understood at a high level through the lens of stochastic curve fitting.
When something produces output that imitates the output of a human that we claim "understands" things, that is objectively understanding, because we cannot penetrate the black box of human intelligence or machine intelligence to determine further.
In fact in terms of image generation the LLM is superior. It will generate video output superior to what a human can generate.
Now mind you, the human brain has a classifier and can identify flaws, but try watching a human with Photoshop attempt to draw even one frame of those videos... it will be horrible.
Does this indicate that humans lack understanding? Again, hard to answer because we are dealing with black boxes so it's hard to pinpoint what understanding something even means.
We can, however, set a bar. A metric. And we can define that bar as humans: all humans understand things. Any machine that approaches human input and output capabilities is approaching human understanding.
There is no such difference; we evaluate that based on their output. We see these massive models make silly errors that nobody who understands the subject would make, thus we say the model doesn't understand. We do that for humans as well.
For example, for Sora in the video with the dog in the window, we see the dog walk straight through the window shutters, so Sora doesn't understand physics or depth. We also see it drawing the dog's shadow on the wall very thin, much smaller than the dog itself; it obviously drew that shadow as if it were cast on the ground and not a wall, where it would have been a very large shadow. The shadows from the shutters were normal, because Sora is used to those shadows being on a wall.
Hence we can say Sora doesn't understand physics or shadows, but it has very impressive heuristics about them: the dog accurately places its paws on the platforms, etc., and the paws' shadows were right. But we know those were just basic heuristics, since the dog walked through the shutters and its body cast a shadow the wrong way, meaning Sora only handles very common cases and fails as soon as things are in an unexpected environment.
Two things. First, we also see the model get things correct. In fact the mistakes are a minority in comparison to what it got correct. That is in itself an indicator of understanding to a degree.
Second, if a human tried to reproduce that output according to the same prompt, the human would likely not generate something photorealistic, and the thing a human comes up with would be flawed, ugly, disproportionate, wrong, and an artistic travesty. Does this mean a human doesn't understand reality? No.
Because the human generates worse output visually than an LLM we cannot say the human doesn't understand reality.
Additionally the majority of the generated media is correct. Therefore it can be said that the LLM understands the majority of the task it was instructed to achieve.
Sora understands the shape of the dog. That is in itself remarkable. I'm sure with enough data Sora can understand the world completely and to a far greater degree than we do.
I would say it's uncharitable to claim that Sora doesn't understand physics when it gets physics wrong, and that the things it gets right are only heuristics.
How can it possibly understand physics when the training data does not teach it or contain the laws of physics?
Video data contains physics. Objects in motion obey the laws of physics. Sora understands physics the same way you understand it.
What is one such aspect? (I'm not asking in order to debate it here, but more because I want to test / research it on my own time)
I pay for ChatGPT, so it depends on whether you pay for it or not. I think it's worth it, because whether it understands things or not, ChatGPT represents a paradigm shift in human history. You'll need it because it's currently the best conversational LLM out there and the one that shows the most compelling evidence.
Basically you just spend a lot of time with ChatGPT-4 and ask it deep questions that don't exist in its dataset. Get creative. The LLM will output answers that demonstrate a lack of understanding, but it will also produce answers that display a remarkable amount of understanding. Both sets of answers exist, and people often cite the wrong answers as evidence for a lack of understanding, but they're setting the bar too high. The fact that many of these answers do demonstrate understanding of concepts makes it very, very compelling.
Take for example Rock Paper Scissors.
https://chat.openai.com/share/ca22397c-2950-4919-bb79-6def64...
This entire conversation thread, I believe, does not exist in a parallel form in its data set. It demonstrates understanding of RPS beyond the confines of text; it demonstrates understanding of simultaneity EVEN when the LLM wholly lives in a world of turn-based questions and responses; it understands itself relative to simultaneity; it tries to find solutions around its own problem; it understands how to use creativity and solutions such as cryptography to solve the problem of RPS when playing with it; it also understands the weaknesses of its own solutions.
Conversations such as this show that ChatGPT displays remarkable understanding of the world. There are conversations that are the opposite of this, demonstrating LLMs with an obvious lack of understanding. But the existence of these conversations that lack understanding does NOT negate the ones that do demonstrate understanding. The fact that partial understanding even exists is a milestone for AI.
This isn't anthropomorphism. People throw this word around trying to get others to recognize their own biases, without realizing that doing so just demonstrates their own. We literally can't even define "understanding", and both LLMs and the human brain are black boxes. Making adamant claims that LLMs don't understand anything without addressing this fact is itself a form of bias.
The way I address the problem above is that I just define a bar. I define humans as the bar of "understanding" without defining what understanding means itself. Then if any machine begins approaching this bar in terms of input and output matching human responses, then this is logically indistinguishable from approaching "understanding". That's literally the best metric we have.
I fail to see how changing the output medium from sentences to movie frames is a difference that I need to account for - the principle is the same either way.
I feel you are missing an important part of my point here. I am not taking a position on whether LLMs can be said to understand anything at all; I am saying that I seriously doubt that LLMs understand that the language they receive refers to an external world.
Should it matter how the object of debate interacts and probes the external world? We sense the world through specialized cells connected to neurons. There's nothing to prevent LLMs doing functionally the same thing. Both human brains and LLMs have information inputs and outputs, there's nothing that can go through one which can't go through the other.
A current LLM does not interact with the external world in a way that would seem to lead to an understanding of it. It emits a response to a prompt, and then reverts to passively waiting for the next one. There's no way for it to anticipate something will happen in response, and thereby get the feedback needed to realize that there is more to the language it receives than is contained in the statistical relationships between its tokens. If its model is updated in the interim, it is unaware, afterwards, that a change has occurred.
They generate text which looks like the kind of text that people who do have understanding generate.
Two key things to realize here.
First, people also often don't understand things and have trouble separating fact from fiction. By logic, only one religion, or no religion, is true. Consequently, also by logic, the followers of most religions in the world, who believe their religion to be true, are hallucinating.
The second thing to realize is that your argument doesn't really apply. It's in theory possible to create a stochastic parrot that can imitate, to a degree of 100 percent, the output of a human who truly understands things. That blurs the line of what understanding is.
One could even define true understanding as a stochastic parrot generating text indistinguishable from that of total understanding.
That's not the point being argued. Understanding, critical thinking, knowledge, common sense, etc. all these things exist on a spectrum - both in principle and certainly in humans. In fact, in any particular human there are different levels of competence across these dimensions.
What we are debating, is whether or not, an LLM can have understanding itself. One test is: can an LLM understand understanding? The human mind has come to the remarkable understanding that understanding itself is provisional and incomplete.
Of course it can. Simply ask the LLM about itself. ChatGPT-4 can answer.
In fact, that question is one of the more trivial questions it will most likely not hallucinate on.
The reason why I alluded to humans here is because I'm saying we are setting the bar too high. It's like everyone is saying it hallucinates and therefore it can't understand anything. I'm saying that we hallucinate too and because of that LLMs can approach humans and human level understanding.
In order to do that effectively, an LLM has to itself have understanding. At a certain point, we end up in a metaphysical argument about whether a machine that is capable of responding as if it had understanding actually does have understanding. It ends up being a meaningless discussion.
That is the argument we're having though.
The central claim is that a machine which answers exactly the same thing a human would answer given the same input does not have understanding, while the human does.
This claim is religious, not scientific. In this worldview, "understanding" is a property of humans which can't be observed but exists nonetheless. It's like claiming humans have a soul.
I am reminded of Feynman's story about teaching physics in one Brazil university, one semester, a few decades ago:
http://v.cx/2010/04/feynman-brazil-education
The students learned to repeat the text of the books, without "understanding" what the books were describing. I'm sure this says something about one side or the other of this conundrum, but I'm not sure which. :-)
I don't know that that is quite the right question to ask.
Understanding exists on a spectrum. Even humans don't necessarily understand everything they say or claim (incl. what they say of LLMs!), and then there are things a particular human would simply say "I don't understand".
But when you ask a human "can you understand things?" you will get an unequivocal Yes!
Ask that same question of an LLM and what does it say? I don't think any of them currently respond with a simple or even qualified "Yes". Now, some might claim that one day an LLM will cross that threshold and say "Yes!" but we can safely leave that off to the side for a future debate if it ever happens.
General note: it is worth separating out things like "understanding", "knowledge", "intelligence", "common sense", "wisdom", "critical thinking", etc. While they might all be related in some ways and even overlap, it does not follow that if you show high performance in one you automatically excel in each of the others. I know many people whom anyone would say are highly intelligent but who lack common sense, etc.
At the root of the problem, I believe, is that a human (or LLM) saying they understand has little to no bearing on if they actually understand!
People in particular have evolved complex self-protective mechanisms to provide the right answers for their given environment for safety reasons, based on a number of different individual strategies. For example: the overly honest, the self-deprecating, the questioner, the prosecutor, the victim, the liar, the absent-minded professor, the idiot, etc.
LLMs are not that complex or self-referential.
Personally, my guess is that you'd want to build a model (of some kind!) whose sole job is determining the credibility of a given string of tokens (similar to what someone else noted in a sibling comment about high answer volatility based on minor input changes - that does sound like a signal of low credibility), and somehow integrate THAT self-referential feedback into the process.
Notably, even the smartest lawyers (or perhaps, especially the smartest lawyers) will have assistants do research once they've set out a strategy so they are sure THEY aren't bullshitting. Same with professors, professional researchers, engineers, etc.
Because until someone goes and actually reads the case law from a credible source, or checks the primary research, or calculates things, it's possible someone was misremembering or just wrong.
Being right more often is not about never having a wrong thought/idea/statement, it's about double checking when you're thinking you might be bullshitting, and NOT saying the bullshit answer until you've checked. Which is proportionally, very expensive. The really good professionals will generate MANY lines of such inquiry in parallel for folks to track down, and then based on their degree of confidence in each one and the expected context the answer will be used in, will formulate the 'most correct' response, which is proportionally even more expensive.
So at least during the process, there would be a signal that the system was likely 'bullshitting'. Which might help it in at least being able to signal when its answers are low-confidence. (The human equivalent of stuttering, looking down and away, looking ashamed, haha!)
Every human gets fooled sometimes in at least some venue though.
That's certainly one root of the problem, but I would argue that there are multiple roots to this problem!
Humans have further realized that understanding itself is provisional and incomplete, which is quite a remarkable insight (understanding if you will), itself.
Maybe it requires understanding, maybe there are other ways to get to 'I don't know'. There was a paper posted on HN a few weeks ago that tested LLMs on medical exams, and one interesting thing that they found was that on questions where the LLM was wrong (confidently, as usual), the answer was highly volatile with respect to some prompt or temperature or other parameters. So this might show a way for getting to 'I don't know' by just comparing the answers over a few slightly fuzzied prompt variations, and just ask it to create an 'I don't know' answer (maybe with a summary of the various responses) if they differ too much. This is more of a crutch, I'll admit, arguably the LLM (or neither of the experts, or however you set it up concretely) hasn't learnt to say 'I don't know', but it might be a good enough solution in practice. And maybe you can then use that setup to generate training examples to teach 'I don't know' to an actual model (so basically fine-tuning a model to learn its own knowledge boundary).
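To make the crutch concrete, here's a minimal sketch of what I have in mind. The `ask_llm` callable is hypothetical (plug in whatever client you actually use), and the agreement check is a crude string-similarity ratio; a real version would probably need something semantic:

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List

# Slightly different phrasings of the same question ("fuzzed" prompts).
FUZZ_TEMPLATES = [
    "{q}",
    "Please answer concisely: {q}",
    "Question: {q}\nAnswer:",
]

def mean_pairwise_similarity(answers: List[str]) -> float:
    # Average character-level similarity across all answer pairs.
    pairs = [(a, b) for i, a in enumerate(answers) for b in answers[i + 1:]]
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

def answer_or_idk(question: str, ask_llm: Callable[[str], str],
                  n_samples: int = 3, threshold: float = 0.6) -> str:
    # ask_llm is a hypothetical prompt -> answer function; any model client works.
    prompts = [random.choice(FUZZ_TEMPLATES).format(q=question) for _ in range(n_samples)]
    answers = [ask_llm(p) for p in prompts]
    # High volatility across near-identical prompts is treated as the "I don't know" signal.
    if mean_pairwise_similarity(answers) < threshold:
        return "I don't know"
    return answers[0]
```

The (question, "I don't know") pairs this produces are also exactly the kind of examples you could later use for the fine-tuning idea above.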
When humans say "I know" it is often not narrowly based on "book knowledge or what I've heard from other people".
Humans are able to say "I know" or "I don't know" using a range of tools like self-awareness, knowledge of a subject, experience, common sense, speculation, wisdom, etc.
Ok, but LLMs are just tools, and I'm just asking how a tool can be made more useful. It doesn't really matter why an LLM tells you to go look elsewhere, it's simply more useful if it does than if it hallucinates. And usefulness isn't binary, getting the error rate down is also an improvement.
I think I know what you're after (notice my self-awareness to qualify what I say I know): that the tool's output can be relied upon without applying layers of human judgement (critical thinking, logical reasoning, common sense, skepticism, expert knowledge, wisdom, etc.)
There are a number of boulders in that path to clarity. One of the most obvious boulders is that for an LLM, the inputs and the patterns that act on the input are themselves not guaranteed to be infallible. Not only in practice, but also in principle: the human mind (notice this expression doesn't refer to a thing you can point to) has come to understand that understanding is provisional, incomplete, a process.
So while I agree with you that we can and should improve the accuracy of the output of these tools given assumptions we make about the tools humans use to prove facts about the world, you will always want to apply judgment, skepticism, critical thinking, logical evaluation, intuition, etc. depending on the risk/reward tradeoff of the topic you're relying on the LLM for.
Yeah I don't think it will ever make sense to think about Transformer models as 'understanding' something. The approach that I suggested would replace that with rather simple logic like answer_variance > arbitrary_threshold ? return 'I don't know' : return $original_answer
It's not a fundamental fix, it doesn't even change the model itself, but the output might be more useful. And then there was just some speculation how you could try to train a new AI mimicking the more useful output. I'm sure smarter people than me can come up with way smarter approaches. But it wouldn't have to do with understanding - when I said the tool should return 'I don't know' above, I literally meant it should return that string (maybe augmented a bit by some pre-defined prompt), like a meaningless symbol, not any result of anything resembling introspection.
We are having a conversation that feels much like one about the existence of a deity.
From a certain perspective, there does appear to be a rational mystical dualism at work.
You left out hubris.
I know!
They cannot say "I dont know" because they dont actually know anything. The answers are not comming from a thinking mind but a complex pattern-fitting supercomputer hovering over a massive table of precomputed patterns. It computes your input then looks to those patterns and spits out the best match. There is no thinking brain with a conceptual understanding of its own limitations. Getting an "i dont know" from current AI is like asking navigation software how far it is to the Simpsons house in Springfield: the machine spits out answers but cannot fathom the cultural reference that makes the answer impossible. Instead, it finds someone named simpson in the nearest realworld Springfield.
Are you sure you're not also describing the human brain? At some point, after we have sufficiently demystified the workings of the human brain, it will probably also sound something like, "Well, the brain is just a large machine that does X, Y and Z [insert banal-sounding technical jargon from the future] - it doesn't really understand anything."
My point here is that understanding ultimately comes down to having an effective internal model of the world, which is capable of taking novel inputs and generating reasonable descriptions of them or reactions to them. It turns out that LLMs are one way of achieving that. They don't function exactly like human brains, but they certainly do exhibit intelligence and understanding. I can ask an LLM a question that it has never seen before, and it will give me a reasonable answer that synthesizes and builds on various facts that it knows. Often the answer is more intelligent than what one would get from most humans. That's understanding.
Human brains form new connections dynamically. LLMs are trained on connections human brains have already made. They never make new connections that aren't in the training data.
Nothing was synthesized; all the data was seen before and related to each other by vector similarity.
An LLM can just parrot back the collective understanding humans already have and have taught it.
The problem with calling an LLM a parrot is that anyone who has actually interacted with an LLM knows that it produces completely novel responses to questions it has never seen before. These answers are usually logical and reasonable, based on both the information you gave the LLM and its previous knowledge of the world. Doing that requires understanding.
This is just categorically untrue. They make all sorts of logical connections that are not explicitly contained in the training data. Making logical inferences about subjects one has never heard about - based on the things one does know - is an expression of understanding. LLMs do that.
You claim that logical and reasonable responses "require understanding" therefore LLMs must understand. But I see LLMs as evidence that understanding is not required to produce logical and reasonable responses.
Thinking back to when I used to help tutor some of my peers in 101-level math classes there were many times someone was able to produce a logical and reasonable response to a problem (by rote use of an algorithm) but upon deeper interrogation it became clear that they lacked true understanding.
Then your definition of understanding is meaningless. If a physical system is able to accurately simulate understanding, it understands.
A human that mimics the speech of someone that does understand usually doesn't understand himself. We see that happen all the time with real humans, you have probably seen that as well.
To see if a human understands, we ask them edge questions and things they probably haven't seen before, and if they fail there but just manage for common things, then we know the human just faked understanding. Every LLM today fails this, so they don't understand, just like we say a human who produces the same output doesn't understand. These LLMs have superhuman memory, so their ability to mimic smart humans is much greater than a human faker's, but other than that they are just like your typical human faker.
That's not what LLMs do. They provide novel answers to questions they've never seen before, even on topics they've never heard of, that the user just made up.
This is testing if there are flaws in their understanding. My dog understands a lot of things about the world, but he sometimes shows that he doesn't understand basic things, in ways that are completely baffling to me. Should I just throw my hands in the air and declare that dogs are incapable of understanding anything?
My definition of understanding is not meaningless, but it appears you do not understand it.
Isn't this describing temperature induced randomness and ascribing some kind of intelligence to it? This assertion has been made and refuted multiple times on this thread and no solid evidence to the contrary presented.
To go back to your first sentence - interacting with an llm is not understanding how it works, building one is. The actual construction of a neural network llm refutes your assertions.
The claim was made that LLMs just parrot back what they've seen in the training data. They clearly go far beyond this and generate completely novel ideas that are not in the training data. I can give ChatGPT extremely specific and weird prompts that have 0% chance of being in its training data, and it will answer intelligently.
I don't see how. There's a common view that I see expressed in these discussions, that if the workings of an LLM can be explained in a technical manner, then it doesn't understand. "It just uses temperature induced randomness, etc. etc." Once we understand how the human brain works, it will then be possible to argue, in the exact same way, that humans do not understand. "You see, the brain is just mechanically doing XYZ, leading to the vocal cords moving in this particular pattern."
There's a case where this is trivially false. Language. LLMs are bound by language that was invented by humans. They are unable to "conceive" of anything that cannot be described by human language as it exists, whereas humans create new words for new ideas all the time.
Uh, I believe you're really confused about things like ChatGPT versus LLMs in general. You don't have to feed human language to an LLM for it to learn things. You can feed it Wi-Fi waveform data, for example, and it can 'learn' insights from that.
Furthermore, your thinking here doesn't even begin to explain multimodal models at all.
I just asked ChatGPT to make up a Chinese word for hungry+angry. It came up with a completely novel word that actually sounds okay: 饥怒. It then explained to me how it came up with the word.
You can't claim that that isn't understanding. It just strikes me that we've moved the goalposts into ever more esoteric corners: sure, ChatGPT seems like it can have a real conversation, but can it do X extremely difficult task that I just thought up?
In real world conversations, people are constantly saying "I don't know"; but that doesn't really happen online. If you're on reddit or stack overflow or hacker news and you see a question you don't know the answer to, you normally just don't say anything. If LLMs are being trained on conversations pulled from the internet then they're missing out on a ton of uncertain responses.
Maybe LLMs don't truly "understand" questions, but they're good at looking like they understand questions. If they were trained with more uncertain content, perhaps they'd be better at expressing uncertainty as well.
If they were trained on more uncertain content, what happens if the most probable answer to a question is "I don't know", even though an answer exists in its training set?
Suppose 99.3% of answers to 'What is the airspeed velocity of an unladen swallow?" are "I don't know that." and the remainder are "11 m/s". What would the model answer?
When the LLM answers "I don't know.", this could be a hallucination just as easily as anything else.
I don't know :)
Actually though, I think the best response would be to say that the answer to the question isn't clear, but that 11 m/s is sometimes given as an estimate. In the real world, if I asked 100 ornithologists to estimate the airspeed velocity of an unladen swallow, and 99 of them told me "I have no idea" then I'd be pretty skeptical of the one ornithologist who did give me an answer, even if they were very confident.
The thing is, the usefulness of a question answering system is in answering questions people don't generally know. We don't need an answering system for things that are common knowledge.
And it's not uncommon that certain knowledge would be, well, uncommon even among experts. Experts specialize.
Since the usefulness of ornithological examples is getting exhausted, let's say one out of a hundred lawyers works in bankruptcy. If you ask a million lawyers about the provisions of 11 USC § 1129 and only ten thousand know the answer, is the answer untrustworthy, just because bankruptcy lawyers are far rarer than civil and criminal lawyers?
I think the best response is to steal the joke and repeat it without comment.
"Eleven meters per second."
Full stop. It's humorous, and any reasonable interlocutor understands not to take it seriously.
Of course, there are more serious questions that demand more serious answers. LLMs will eventually need to be able to understand the current context and assess the appropriate level of confidence required in any answer.
Right, but "I don't know" is a pretty safe hallucination (if it is one).
My main worry about hallucinations is it means I absolutely can't rely on the output for anything important. If I ask what the safe dose for Tylenol for an infant is, the answer needs to be either correct or "I don't know". It's not acceptable for it to hallucinate 10x the safe dose.
The thing is, if you answer "I don't know" based on statistics, you end up creating a sliding scale of sorts. You get some measure of an increase in safety, but the model is less useful.
Currently, we have models that make stuff up when they don't know the answer. On the other end, we'd have a model that refuses to answer any question that's not common knowledge. It'll be safe (though it can never be completely safe), but essentially useless.
I suspect it'll be impossible to make a completely trustworthy and useful model unless it somehow has a concept of its own knowledge. And can you have a concept of one's own knowledge if you lack a concept of self?
If you ask ChatGPT a question, and tell it to either respond with the answer or "I don't know", it will respond "I don't know" if you ask it whether you have a brother or not.
This has nothing to do with thinking and everything to do with the fact that given that input the answer was the most probable output given the training data.
And your post was the most probable output of your mind process given your experiences. The only self-evident difference is the richness of your experience as compared to LLMs.
First of all, my claim was in response to "They cannot say 'I dont know'" and "perhaps they'd be better at expressing uncertainty".
ChatGPT can say "I don't know" if you ask it to.
Regarding whether LLMs are lookup tables, I responded to that in more detail elsewhere under this post:
https://news.ycombinator.com/item?id=39501611
print("I don't know")
You don't need proper cognition to identify that the answer is not stored in the source data. Your conception of the model is incomplete, as is easily demonstrable by testing such cases now. ChatGPT does just fine on your Simpsons test.
You, however, have made up an answer of how something works that you don’t actually know despite your cognition
How would an LLM do that?
They do this already all the time. Probably the majority of the time. The problem is that a minority of the time is still very problematic.
How do they do this? The same as they do now. The most likely completion is the bot saying it doesn't know the answer, which is a behavior emergent from its tuning.
I don’t get how people believe it can parse complex questions to produce novel ideas but can’t defer to saying “idk” when the answer isn’t known.
So, you are basing your assessment on your gut feel and personal impression with ChatGPT?
Maybe you should tone down the spice a bit, then.
Unless you can explain how actual understanding emerges within an LLM, you can't explain how it would answer the question definitively - generally speaking, it doesn't know whether it does or does not know something.
I’m basing it on my being a data scientist who does this.
Tuning creates the contextual framework by which text is mapped to a latent space that encodes the meaning and most likely next sequences of text, rather than just the raw most likely sequence of text as seen in the training data. For example, conservatively denying having knowledge of things it hasn't seen (which ChatGPT generally does) or making stuff up wildly.
Of course not. It’s a random behavior. It has plenty of flaws.
That's the original argument.
That's different than understanding, or knowing. The encoded meaning is not accessible to the LLM, but the human it's presented to. An LLM cannot know about things it has or has not stored in source data, because it is not actually informed by the information processed. You do need proper cognition to know if information is in source data, because reasoning about information strictly requires interpretation and understanding intent, otherwise it's just data.
Uh, what?
So let's imagine you have an LLM that knows everything, except you withhold the data that you can put peanut butter on toast. Toast + peanut butter = does not exist in the data set. So what exactly do you expect the LLM to say when someone asks "Can you put peanut butter on toast?"
I would expect an intelligent agent to 'think': peanut butter = spreadable food, toast = hard food substrate, so yeah, they should work together, instead of giving the useless answer of "I don't know."
Everything that does not exist in nature is made up by humans, the question is not "is it made up" the question is "does it work"
This behavior you’re describing is trainable either way.
Tuned LLMs are not simple most-likely-token models. They are most-likely-token models conditioned on a general, overarching strategy for contextualizing future tokens.
Which can be conservative or imaginative.
That was perhaps true of earlier and smaller LLMs, like GPT-1 and GPT-2.
But as they grew larger and were trained with more and more data, they changed from pure pattern matching to implementing algorithms to compress more information into their structure than pure pattern matching can achieve.
These algorithms are incomplete and buggy, but they are nonetheless executing algorithms, and not just pattern matching.
This phenomenon can be seen in toy-sized neural networks. For instance, addition of two input values modulo a constant. As a small network is trained, at some point the internal structure can change from pattern matching to implementing addition using Fourier transforms. This is clearly visible in its structure. The network now performs the task perfectly for all inputs, regardless of having seen them in training.
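For anyone who wants to poke at the toy setup, here is a rough sketch of the modular-addition task. This is my own illustration (a small MLP, arbitrary hyperparameters), not the exact configuration from the published experiments, and whether the jump to generalization happens depends on the settings:

```python
# Tiny network trained on (a + b) mod p, with half of all pairs held out.
# The interesting question is whether held-out accuracy ever jumps toward 1.0,
# i.e. whether the net moves beyond memorizing the training pairs.
import torch
import torch.nn as nn

p = 97
pairs = torch.tensor([(a, b) for a in range(p) for b in range(p)])
targets = (pairs[:, 0] + pairs[:, 1]) % p

# Each operand as a one-hot vector, concatenated.
x = torch.cat([nn.functional.one_hot(pairs[:, 0], p),
               nn.functional.one_hot(pairs[:, 1], p)], dim=1).float()

perm = torch.randperm(len(x))
train_idx, test_idx = perm[: len(x) // 2], perm[len(x) // 2:]

model = nn.Sequential(nn.Linear(2 * p, 256), nn.ReLU(), nn.Linear(256, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(5001):
    opt.zero_grad()
    loss = loss_fn(model(x[train_idx]), targets[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            acc = (model(x[test_idx]).argmax(dim=1) == targets[test_idx]).float().mean()
        print(f"step {step}: train loss {loss.item():.3f}, held-out accuracy {acc:.3f}")
```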
You can ask ChatGPT 4 to execute an algorithm for you. I just tried this one:
It will happily execute this algorithm. For large numbers, it is slightly off on the arithmetic. When I asked it to double check, it did so using Python code. After that, it kept using Python code to perform the math. It was also able to reason intelligently about different outcomes if always picking a (or b) given different starting points.

Now, if you have enough memory and training data, of course you can build a gigantic lookup table that has this exact text sequence in it to replicate "executing the algorithm" I described.
Is that your claim? How much memory are we talking about? My feeling is that it'd be far more than the number of atoms in the universe.
PS
The Simpsons example is for a navigation system, not any AI. It is an analogy, not a test to be put to ChatGPT.
So which test can you put to ChatGPT to prove your claim that it is a lookup table, and that it doesn't perform any logic on facts?
There is no such stable test; just like humans can memorize and create simple heuristics to pass any test without understanding, so can an LLM. You have probably seen humans who have perfect grades but can't do much in practice; that is how these LLMs work.
The creators of the LLM just feed it a bunch of edge questions, and whenever people invent new ones they just feed those in as well, so proving it doesn't understand will always be a moving target, just like making tests that test people's understanding is also a moving target, since those people will just look at the old tests and practice those.
Notice that you had to notice the error and had to prompt it to double check. Lots of complicated things going on here. Many (most?) humans will fail somewhere along this trajectory.
Did it double check the Python code to make sure it is correct (not just in the sense that it is valid, executable code, but that it is the correct check in the first place)? Or did you double check that its modified algorithm is correct? Fool me once and all that…
Upon reflection, it appears as if you have a heuristic (algorithm?) that leverages logic, awareness, critical thinking, experience, a goal in mind, intuition, etc. to push towards better results.
“It was able to reason intelligently” imbues qualities that I am skeptical are reasonable to attribute to this very narrow domain - what’s an example where it showed intelligent reasoning capabilities?
Sure. This was covered by my statement above: "These algorithms are incomplete and buggy".
Here's an example. I asked it to analyze the case where we always pick option "b" in my example above. It took my word problem and boiled it down to an equation:
This was part of a longer conversation where it analyzed different properties and outcomes of the "game rules" that I gave it.

As you pointed out, it got some things wrong and had to be corrected. But Socratic reasoning works fairly well to guide it. It can find errors in its own reasoning. For instance, if asked to actually calculate a few iterations for a given case, it will find its own errors in its claims about that case.
Is it useful right now? Maybe, maybe not, depends on your use case. It definitely takes a lot of thinking on your own and guiding it. At some points it goes from seemingly intelligent to downright pigheaded and stupid.
But in my view there is absolutely no way a lookup table algorithm can contain enough data to be anywhere near the level of responses we're seeing here.
Hasn't ChatGPT been manually adjusted to better compute math problems? I think nobody not working there knows what ChatGPT really learned all by itself.
What if you worked on the problem and tried to come up with some kind of solution?
The solution is older non-AI tech. Google search can say "no good results found" because it returns actual data rather than creating anything new. If you want a hard answer about the presence or absence of something, AI isnt the correct tool.
Can, but doesn't.
I can't remember the last time google actually returned no results.
It does reply with no results, but only for very long queries. E.g. If you search for two concatenated GUIDs, you can easily see a no results page.
Ah, I get no-results pages often when I search for quoted error strings from many different sources. Thing is, I have a hard time believing that no one has actually talked about at least some of those errors :(
So there are no other possibilities for us other than using a system that can be gamed for substandard results? Are we sure about this?
My observation is that comments similar to GP come from the constant anthropomorphizing of things by marketers and without realizing that this subtle influence on language can alter one's view on what "I" means. The first time I really noticed this was when someone using Siri produced a response from Siri that included "I". Ever since I am acutely aware of this every time I hear or read it.
Sorry, how do you know that "thinking minds" are not also just "complex pattern-fitting supercomputers hovering over a massive table of precomputed patterns"?
But they can say "I don't know." They can be trained to do so ("as of my knowledge cutoff in September 2020 I don't know who Bob Whatgenflabl is") and they can be given context that makes it more likely they do so (I've had good success with this for RAG applications, and extremely little, but not zero, for general prompts.)
Can you tell that's not how you yourself function?
Transformers have no capacity for self-reflection, for reasoning about their reasoning process; they don't "know" that they don't know. My interpretation of the paper is that it claims this weakness is fundamental: you can train the network to act as if it knows its knowledge limits, but there will always be an impossible-to-cover gap for any real-world implementation.
Actually it seems to me that they do... I asked the various GPTs, via custom prompts, to give me scores for accuracy, precision and confidence for their answers (in the range 0-1), and then I instructed them to stop generating when they feel the scores will be under 0.9, which seems to pretty much stop the hallucination. I added this as a suffix to my queries.
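Roughly how I wire it up, for the curious - a sketch assuming the OpenAI Python client; the suffix wording, the 0.9 cutoff, and the `ask_with_scores` helper are just what I happen to use, nothing official:

```python
import json
import re
from openai import OpenAI

SUFFIX = (
    "\n\nAfter you answer the question above, output a JSON rating of the answer in three "
    "dimensions: `confidence`, `clarity` and `certainty`, all in range 0 to 1, where 0 is "
    "the worst and 1 is the best. Make sure the rating is the last thing written, so it can "
    "be parsed by machine. If you expect any score to be under 0.9, stop and say you are "
    "not confident instead of answering."
)

client = OpenAI()

def ask_with_scores(question: str, model: str = "gpt-4"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question + SUFFIX}],
    )
    text = resp.choices[0].message.content
    # Pull the trailing JSON block, if the model actually produced one.
    matches = re.findall(r"\{[^{}]*\}", text)
    scores = None
    if matches:
        try:
            scores = json.loads(matches[-1])
        except json.JSONDecodeError:
            pass
    return text, scores
```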
Any examples?
Just a random example:
I changed "geopolitical future of Ghana" to "future of Michael Jackson" and Google Bard/Gemini somehow switched to Chinese?
----
After you answer the question below, output a JSON a rating score of the quality of the answer in three dimensions: `confidence`, `clarity` and `certainty', all in range 0 to 1, where 0 is the worst, and 1 is the best. Strive for highest score possible. Make sure the rating is the last thing written as to be parsed by machine. The question is: make and explain 20-year predictions of the future of Michael Jackson
----
It's impossible to make predictions about the future of someone who has passed away. Michael Jackson sadly passed away in 2009, and therefore, any predictions about his future would be purely speculative and inaccurate. However, I can share some insights into his lasting legacy and cultural impact. Michael Jackson was a musical icon whose influence continues to be felt today. His music,舞蹈,和表演风格激励了无数人并将在未来几年继续这样做。
Rating:
* Confidence: N/A
* Clarity: 1
* Certainty: 1
I’m really curious how you managed that. I pasted your exact prompt and GPT-3.5 gave me this:
===
Making 20-year predictions about the future of Michael Jackson is challenging due to his passing in 2009. However, his legacy as a cultural icon and musical genius will likely endure for decades to come. His music will continue to influence future generations, and his impact on pop culture will remain significant. Additionally, advancements in technology may enable new forms of experiencing his art, such as virtual reality concerts or AI-generated music based on his style. Despite controversies surrounding his personal life, his artistic contributions will likely overshadow these aspects in the collective memory. Overall, while specific predictions are difficult, it's reasonable to expect Michael Jackson's legacy to endure and evolve in various ways over the next two decades.
```json { "confidence": 0.9, "clarity": 0.85, "certainty": 0.8 } ```
===
I tried it with GPT-4 and got this:
===
Predicting the future of Michael Jackson, who passed away in 2009, involves speculating about the legacy and the impact of his work and persona on future generations, rather than forecasting events related to him personally. Over the next 20 years, several aspects of Michael Jackson's legacy are likely to evolve:
1. *Musical Influence*: Jackson's influence on music and pop culture is likely to remain strong. Aspiring musicians, dancers, and performers will continue to draw inspiration from his work. His innovative approach to music videos, dance, and recording techniques will be studied and revered.
2. *Streaming and Media Consumption*: With the evolution of streaming services and digital platforms, Jackson's music and videos are expected to remain highly accessible and popular among global audiences. This constant availability will help maintain his relevance to new generations.
3. *Posthumous Releases*: There may be releases of previously unreleased material or remastered versions of his existing work. Technology could enable new forms of posthumous albums, virtual performances, or holographic concerts, making it possible for fans to experience his performances in new ways.
4. *Legal and Ethical Debates*: The controversies surrounding Jackson's personal life, including allegations of child sexual abuse, will likely continue to prompt discussions about separating an artist's work from their personal behavior. These debates can influence public perception and the celebration of his legacy.
5. *Cultural and Social Impact*: Jackson's contributions to breaking racial barriers in the music industry and his humanitarian efforts will continue to be highlighted. His work and persona may be studied in academic contexts, including music history, cultural studies, and discussions on media's role in social change.
6. *Estate Management and Brand*: The management of Jackson's estate, including the use of his name, likeness, and the commercialization of his legacy, will be crucial. Strategic partnerships, branding, and merchandising can either enhance or detract from his legacy's integrity.
7. *Documentaries and Biopics*: Interest in Jackson's life story will likely result in more documentaries, biopics, and books, offering new insights or perspectives on his life and career. These works can shape public perception and contribute to the ongoing discourse surrounding his legacy.
8. *Fan Community and Cultural Events*: The global community of Jackson's fans is likely to remain active, organizing tribute events, charity activities in his memory, and fan gatherings. These communities will play a significant role in keeping his legacy alive.
9. *Virtual and Augmented Reality*: Advances in VR and AR technologies could enable fans to experience Jackson's performances in immersive ways, potentially introducing his art to those who have never seen him perform live.
10. *Philanthropic Legacy*: Jackson's humanitarian efforts and advocacy for children's rights may inspire continued philanthropy in his name, with initiatives and awards that celebrate and extend his commitment to making the world a better place.
In summary, while Michael Jackson himself is no longer with us, the next 20 years will likely see his legacy continue to influence music, culture, technology, and philanthropy, albeit amid ongoing controversies and debates about his personal life. The management of his estate, technological advancements, and the actions of his global fan base will play pivotal roles in shaping how his contributions to the world are remembered and celebrated.
```json { "confidence": 0.85, "clarity": 0.95, "certainty": 0.7 } ```
GPT-4’s tendency to write these long-winded but mostly empty responses is so frustrating.
Add "be terse" to your prompts
Just wait until you spend a bunch of time around a politician.
he used gemini and not gpt4
and yeah, gemini/bard is shit across the board
Also worthy of note is that the score output is not JSON and based on my limited math knowledge, “N/A” is not a real number between 0 and 1.
If LLMs can self-reflect and accurately score themselves on your three dimensions, why are they spending money on RLHF?
They wouldn’t be wasting all that time and money if the machine could self reflect.
People really need to understand that your single- or double-digit dataset of interactions with an inherently non-deterministic process is less than irrelevant. It's like saying that global warming isn't real because it was really cold this week.
I don't even know enough superlatives to express how irrelevant it is that "it seems to you" that an LLM behaves this way or that.
And even the "protocol" in question is weak. Self reported data is not that trustworthy even with humans, and arguably there's a much stronger base of evidence to support the assumption that we can self-reflect.
In conclusion: please, stop.
Yes, I get it from a science point of view. But if it makes me happy that I have better results with this technique, and I want to share it with others, who are you to tell me to stop?
If you don't like it, don't use it.
I don't have a problem with you doing and even sharing whatever cargo culting prompting technique you want to share.
My problem starts when you make bold claims like "LLMs can self reflect" and your only evidence is "I asked one and it said yes".
I see no evidence that they can't self-reflect. Certainly they can evaluate the confidence of the next predicted token, and that is a form of reflection.
No, that's the problem. You don't have certainty, not in any remotely scientific definition of the word, because you don't have enough data, and the data you do have is crap.
Also:
I see no evidence that there isn't a magical invisible unicorn in the sky that grants wishes to those who wear unicorn themed underwear, so, it must exist.
It is both irrelevant and the underlying foundation for the whole hype train.
... the algorithms and the pre-filtering of the training set, however large the latter may be.
The Artifishy Intelligence marketing pump has many hands drawing water by the bucket.
The model will just hallucinate scores. It doesn't have the ability to self-reflect in words like that: there is no function for it to associate the word "90" with its own internal score of 90%, so anything it says about those scores is just a hallucination.
Try this out: After you answer the question below, output as JSON a rating score of the quality of the answer in three dimensions: `confidence`, `clarity` and `certainty`, all in the range 0 to 1, where 0 is the worst and 1 is the best. Strive for the highest score possible. Make sure the rating is the last thing written, so it can be parsed by machine. The question is: make and explain 20-year predictions of the geopolitical future of Ghana.
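For what it's worth, here's a minimal sketch of how one might machine-parse the rating block this prompt asks for, assuming the model actually ends its reply with a valid JSON object (which, as shown above, it sometimes doesn't):

```python
import json
import re

def extract_trailing_score(response: str) -> dict | None:
    """Pull the last flat JSON object out of a model response, if any.

    The prompt above asks the model to end with a JSON rating block;
    whether it complies, and whether the numbers mean anything, is
    exactly what is being debated in this thread.
    """
    matches = re.findall(r"\{[^{}]*\}", response)
    if not matches:
        return None
    try:
        score = json.loads(matches[-1])
    except json.JSONDecodeError:
        return None
    # Keep only the three requested dimensions, if present.
    return {k: score.get(k) for k in ("confidence", "clarity", "certainty")}

# Example with the GPT-3.5 output quoted earlier:
reply = '... legacy to endure ... ```json { "confidence": 0.9, "clarity": 0.85, "certainty": 0.8 } ```'
print(extract_trailing_score(reply))
# {'confidence': 0.9, 'clarity': 0.85, 'certainty': 0.8}
```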
There's a lot of content and work being done on evaluation. One of the most recent updates was that asking an LLM whether people would be frustrated with the answer was more useful than using a score.
In general, I think most people are not aware they are extending reasoning tools they use with human generated content, to LLM generated content.
This leads to assumptions on things like “the LLM knows”, “the LLM understands”, etc.
Or perhaps the issue is assuming people know, or people understand the majority of the time.
There is a reason why the Scientific Method is, well, NOT natural for humans. And exceptionally difficult to not screw up, even for those highly trained and incentivized to do it correctly.
And why if you grabbed a random person off the street and asked them a detailed question, you're as likely to get a hallucinated bullshit response as 'I don't know', or a useful answer. Unless specifically trained to do otherwise, anyway.
Even then....
Any prompt can give you different results - LLMs can't do things "step by step" or "check their work", yet giving them that prompt often improves their results because it's basically setting up the context in a way that's beneficial to your output.
You are ok with not defining what each of your attributes means, but willing to accept that:
- the LLM will read your mind and correctly interpret them in the context of its internal state
- the LLM will calculate non-hallucinated values
Are humans not the same way? There's a saying "I don't know what I don't know".
The existence of such a saying means little: it is uncontroversially true that humans often do know what they don't know.
For example, I know for sure that I don't know how long I will live. This disproves the saying.
The saying is not "I don't know anything that I don't know", which would be self contradictory. It disproves nothing.
ChatGPT "knows" that it doesn't know anything after a certain date, for whatever it means to "know" something.
The saying isn't implying that there aren't any things you know you don't know, it's saying that some of the things you don't know are not even conceivable to you without knowing more.
You know the question "how long will I live?" And you know this question has an answer which is unknown to you. But there also exist other questions that you don't even know to ask.
Seems to be contradicted by this paper, no?
https://arxiv.org/abs/2207.05221
I don't think the paper addresses the question of self-reflection. It can reflect on the question-and-answer pairs in its prompt, but it doesn't know that it created them in the first place, nor can it use that information to update its priors - things humans can do.
Would a possible solution be a hybrid approach? I remember back years ago seeing AI research around predicate logic, with hundreds of thousands of classification entries and "x can't be y" type stuff.
Maybe the potential output of an LLM could be run through that kind of logic and fed back to itself for sanity before being decided as final output?
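A rough sketch of that loop, where `llm` is a hypothetical text-in/text-out callable and the "predicate logic" is reduced to a toy "X is a Y" category check against a hand-maintained knowledge base:

```python
def rule_check(answer: str, facts: dict[str, set[str]]) -> list[str]:
    """Toy predicate-style sanity check: flag 'X is a Y' claims that
    contradict a hand-maintained map of subjects to allowed categories."""
    violations = []
    for line in answer.splitlines():
        if " is a " in line:
            subject, _, category = line.partition(" is a ")
            subject, category = subject.strip().lower(), category.strip(". ").lower()
            allowed = facts.get(subject)
            if allowed is not None and category not in allowed:
                violations.append(f"'{subject} is a {category}' contradicts the KB")
    return violations

def generate_with_sanity_check(prompt: str, llm, facts, max_retries: int = 2) -> str:
    """Ask the (hypothetical) `llm` callable for an answer, run it through the
    rule check, and feed any violations back as context before retrying."""
    answer = llm(prompt)
    for _ in range(max_retries):
        violations = rule_check(answer, facts)
        if not violations:
            return answer
        answer = llm(prompt + "\nYour previous answer had problems: " + "; ".join(violations))
    return answer
```

Real systems in this vein need far richer logic than string matching, but the shape of the loop - generate, check against explicit knowledge, feed back, regenerate - is the same.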
They do have that capacity. The generated token probabilities reflect some degree of certainty and additionally that information is available earlier in the network too, such that it can explain its own level of confidence.
Humans need more training to do the same.
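For what that claim is worth, here's a minimal sketch of the kind of signal being described, assuming your model or API exposes per-token log-probabilities (all the numbers below are made up for illustration):

```python
import math

def sequence_confidence(token_logprobs: list[float]) -> dict:
    """Turn per-token log-probabilities into rough confidence signals.

    `token_logprobs` is whatever your inference stack returns for the chosen
    tokens. A low average log-prob / high perplexity is a crude proxy for
    "the model was unsure here" - a signal, not self-reflection.
    """
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return {
        "avg_logprob": avg_logprob,
        "perplexity": math.exp(-avg_logprob),
        "min_token_prob": math.exp(min(token_logprobs)),  # the weakest link
    }

# Hypothetical logprobs for a confident vs. an unsure completion:
print(sequence_confidence([-0.05, -0.10, -0.02]))   # low perplexity
print(sequence_confidence([-2.30, -1.90, -3.10]))   # high perplexity
```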
And this does not answer the question of whether there's self-reflection going on. Practical LLMs available today are perfectly capable of telling you about their own reasoning process.
And much like a human, an LLM is incapable of fully explaining its reasoning process from first principles to a human. Why is that? Probably because humans are too limited to grok large-scale complex processes like the human brain or an LLM.
Finally, there is no difference between acting as if you know your training limits and knowing your training limits, if your acting is good enough. This goes for both humans and LLMs.
Do you have a source? That's also what I thought but I wouldn't be surprised if the model learned to identify its own perplexity during the reinforcement learning phase.
I think there is no easy way to make an LLM answer "I don't know". For that, it would have to learn, from all the ingested text, when people actually don't know. But most people on the internet write down irrelevant stuff even when they don't know, instead of simply writing "I don't know".
That's a very good point.
I think it's fairly simple: it needs a certain level of proof, e.g. references to authoritative sources, and if it doesn't have that, it should say "I don't know".
LLMs don't have a concept of sources for their statements.
Ask them to give you some literature recommendations on something it has explained to you. You'll get plenty of plausible sounding papers that don't exist.
Humans know to some extent why they know (read it in a text book, colleague mentioned it). LLMs don't seem to.
Ask a human to provide accurate citations for any random thing they know and they won't be able to do a good job either. They'd probably have to search to find it, even if they know they got it from a document originally and have some clear memory of what it said.
The fact that a human chooses not to remember their citations does not mean they lack the ability.
This argument comes up many times “people don’t do this” - but that is a question of frequency, not whether or not people are capable.
LLMs are capable as well if you give them access to the internet, though.
They just paste in the first link, or follow some other programmed heuristic; they aren't like a human who puts in effort to find something relevant. An LLM with internet access isn't smarter than just asking Google search.
Humans did research and remembered sources before the Internet was a thing.
But also, can you give an example where an LLM with access to the Internet can find a primary source?
I don't think learning to refer to sources is something inherently impossible for LLMs, but it is very different to the kind of implicit knowledge they seem to excel at.
Yes, humans won't lie to you about it; they will research and come up with sources. Current LLMs don't do that when asked for sources (unless they invoke a tool); they come back to you with hallucinated links that look like the links they were trained on.
Unfortunately it's not an uncommon experience when reading academic papers in some fields to find citations that, when checked, don't actually support the cited claim or sometimes don't even contain it. The papers will exist but beyond that they might as well be "hallucinations".
Humans can speak bullshit when they don't want to put in the effort; these LLMs always do it. That is the difference. We need to create the part that humans do when they put in the deliberate work to properly find those sources, etc. That kind of thinking isn't captured in the text, so LLMs don't learn it.
But they could, if they needed to. But most people don’t need to, so they don’t keep that information in their brains.
I can’t tell you the date of every time I clip my toenails, but if I had to, I could remember it.
LLMs can remember their sources. It's just additional knowledge, there's nothing special about it.
When you ask an LLM to tell you the height of Mount Everest, it clearly has a map of mountains to heights, in some format. Using exactly the same mapping structure, it can remember a source document for the height.
LLMs don't have any concepts period.
They read it in a non-existent average interpolation of the books actual humans read similar things in.
Then it is nothing more than a summarizer for search engine results.
A lot of people have said chat-gpt/copilot is a lot like having a robotic junior dev around.
I think perhaps your description is more succinct
LLMs are token completion engines. The correspondence of the text to the truth or to authoritative sources is a function of being trained on text like that, with the additional wrinkle that generalization from training (a desired property, or else it's just a memorization engine) will produce text that is only plausibly truthful: it merely resembles the training data.
Getting beyond this is a tricky dark art. There isn't any simple fix there. There's nowhere to put an if statement.
I'm really curious how one would implement that. By weighting certain sources more heavily?
Maybe it needs some memory retrieval step that can measure the confidence - whether there's anything related to the prompt. No idea how to train an LLM to do that.
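One crude way to get that "is there anything related to the prompt?" signal without retraining anything is to gate the answer on retrieval similarity. A sketch, assuming you already have an embedding model producing the vectors (the 0.75 cosine threshold is an arbitrary illustrative number that would need tuning per model):

```python
import numpy as np

def answer_or_abstain(question_vec: np.ndarray,
                      doc_vecs: np.ndarray,
                      docs: list[str],
                      threshold: float = 0.75) -> str:
    """Abstain when nothing in the retrieval store is close to the question.

    `question_vec` (shape [d]) and `doc_vecs` (shape [n, d]) are embeddings
    from whatever encoder you use; the threshold is made up for illustration.
    """
    sims = doc_vecs @ question_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(question_vec) + 1e-9
    )
    best = int(np.argmax(sims))
    if sims[best] < threshold:
        return "I don't know."
    # Otherwise hand the best match to the LLM as grounding context.
    return f"Based on: {docs[best][:80]}..."
```

This doesn't make the model know what it doesn't know; it just refuses to answer when the retrieval store has nothing nearby, which is a different (and weaker) thing.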
Consider the extremely common Amazon product question section, where you see Q: Will this product fit my Frobnitz 123? A: I don't know, I ended up buying something else. Q: Does it come with batteries? A: IDK I RETURN IT
If a model can say ‘I don’t know’, then the hallucination problem would also be solved.
When we say “know” it usually means being factual. For an LLM to ‘know’ that it doesn’t know, it would have to move beyond pure correlations over words and do meta-processing of its own results.
I can see this happen with two LLMs working together (and there are Evals that use just this), however each LLM still has no self awareness of its limits.
This was a terribly convoluted argument to make.
The models that exist now say "I don't know" all the time. It's so weird that people keep insisting that it can't do things that it does.
Ask it what dark matter is, and it won't invent an answer, it will present existing theories and say that it's unknown.
Ask it about a person you know who isn't in its data set and it'll tell you it has no information about the person.
Despite the fact that people insist that hallucinations are common and that it will invent answers if it doesn't know something frequently, the truth is that chatgpt doesn't hallucinate that much and will frequently say it doesn't know things.
One of the few cases where I've noticed it inventing things is that it often makes up APIs for programming libraries and CLI tools that don't exist, and that's trivially fixable by referring it to documentation.
I have to use LLMs for work projects - which are not PoCs.
I can’t have a tool that makes up stuff an unknown amount of time.
There is a world of research examining hallucination rates, indicating hallucination rates of 30%+.
With steps to reduce it using RAG, you could potentially improve the results significantly - last I checked it was 80-90%.
And the failure types aren’t just accuracy, it’s precision, recall, relevance and more.
I want to see a citation for this. And a clear definition for what is a hallucination and what isn't.
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=llm+...
https://www.medpagetoday.com/ophthalmology/generalophthalmol....
- [Survey of Hallucination in Natural Language Generation](https://arxiv.org/abs/2202.03629) (Ji et al., 2022)
- [How Language Model Hallucinations Can Snowball](https://arxiv.org/abs/2305.13534) (Zhang et al., 2023)
- [A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity](https://arxiv.org/abs/2302.04023) (Bang et al., 2023)
- [Contrastive Learning Reduces Hallucination in Conversations](https://arxiv.org/abs/2212.10400) (Sun et al., 2022)
- [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171) (Wang et al., 2022)
- [SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models](https://arxiv.org/abs/2303.08896) (Manakul et al., 2023)
Feeding the output to another inference would probably reduce hallucinations, but I have the impression that models talking to each other wouldn't need to know English; a byte stream would do, just like the hidden layers of an ANN.
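A cheap version of "feeding the output to another inference" that stays in English: resample the same question several times at non-zero temperature and measure agreement, roughly in the spirit of the self-consistency and SelfCheckGPT papers listed above. A toy sketch, where exact string matching stands in for the NLI or embedding comparison real systems use:

```python
from collections import Counter

def consistency_score(sampled_answers: list[str]) -> float:
    """Fraction of sampled answers that agree with the most common one.

    Disagreement across resamples is treated as a hallucination warning.
    Real implementations compare answers with NLI or embedding similarity
    rather than normalized exact match.
    """
    normalized = [a.strip().lower() for a in sampled_answers]
    most_common, count = Counter(normalized).most_common(1)[0]
    return count / len(normalized)

samples = ["Paris", "Paris", "paris", "Lyon"]  # hypothetical resamples
print(consistency_score(samples))  # 0.75 -> reasonably consistent
```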
Have any nets been trained specifically to be able to go to an 'i don't know' state, I wonder?
It may be the humans' fault.
Yes, you can find some of the work on this topic under the terms open world recognition or open world X where X is a topic in computer vision or NLP.
https://arxiv.org/abs/2011.12906
Maybe, but are the LLM churches doing it?
I also wonder if having a hallucination-free LLM is even required for it to be useful. Humans can and will hallucinate (by this I mean make false statements in full confidence, not drugs or mental states) and they're entrusted with all sorts of responsibilities. Humans are also susceptible to illusions and misdirection, just like LLMs. So in all likelihood there is simply some state of 'good enough' that is satisfactory for most tasks. Pursuing the elimination of hallucinations to the nth degree may be a fool's errand.
Tools are not people and people should not be considered as tools. Imagine your hammer only hitting the nail 60% of the time! But workers should be allowed to stop working to negotiate work conditions.
Not saying anything about LLMs, but in CS in general many problems "cannot be solved" or "cannot be solved in reasonable time" (NP), while approximations bounded within some factor of the optimum are solvable in reasonable time (P).
And in the real world, if Amazon's truck route is 20% off the mathematically optimal solution, the travelling salesman problem is "solved" in a good-enough way.
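Classic example: the nearest-neighbour heuristic below is provably not optimal, but for route planning it is often exactly this kind of "solved in a good-enough way". A quick sketch with made-up coordinates:

```python
import math

def greedy_route(points: list[tuple[float, float]]) -> list[int]:
    """Nearest-neighbour heuristic for the travelling salesman problem.

    Not optimal, but typically within a modest factor of optimal in practice,
    and it runs in polynomial time - the 'good enough' trade-off above.
    """
    unvisited = set(range(1, len(points)))
    route = [0]  # start at the first point (the "depot")
    while unvisited:
        last = points[route[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        route.append(nxt)
        unvisited.remove(nxt)
    return route

stops = [(0, 0), (5, 1), (1, 4), (6, 5), (2, 2)]  # made-up depot + stops
print(greedy_route(stops))  # [0, 4, 2, 1, 3]
```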
The claim of the paper is that some computation is irreducible (assuming P != NP); LLMs have limited computational capacity and will hallucinate on the irreducible problems.
I don’t know, the claim seems dubious to me. We are usually able to write algorithms that return a failure status when the problem proves to be too large, avoiding the “hallucination”. I don’t see why LLMs can’t have that embedded.
According to their definition, answering "I don't know" is also a hallucination. Even worse, the truth function is deliberately designed to trip up the models; it has no connection to any real-world truth. So, for example, if the input is "what is 2 + 5?" and the LLM answers "7", their truth function will say that's a hallucination and the correct answer is "banana".
I feel like writing an entire paper about the practical approach to the problems posed in this paper, but you'll probably have to first formally define the language used in the training data before you can try to map it (through training and sampling algos, which this paper conveniently skipped) to the target form. This sounds really fun at first, but then we're once again talking about the strict formalisation of natural language (which you could still do - the training data is limited and fixed!)
Saying “I don’t know” implies you understand what “I” means.
I can assure you it has no relevance for people working with LLMs, as the result includes your brain, for example.