I would say that there is very little danger of a proof in Lean being incorrect.
There is a serious danger, which has nothing to do with bugs in Lean and which is a known problem for software verification that also applies in math: one must read the conclusions carefully to make sure that the right thing is actually proved.
I read Wilshaw's final conclusions carefully, and she did indeed prove what needed to be proved.
The paper makes a similar point:
Every definition and theorem in mathlib and this project have been checked by Lean’s trusted kernel, which computationally verifies that the proofs we have constructed are indeed correct. However, Lean cannot check that the statements of the definitions and theorems match their intended English equivalents, so when drawing conclusions from the code in this project, translation to and from English must be done with care.
This is precisely why humans will always be involved with creating software.
LLMs already write English better than most native speakers. I wouldn't bet everything on this.
Do you trust an LLM so much that you don't check what it writes before sending the email?
LLMs can write better English, but the curating step is still critical, because they also produce a lot of garbage.
Would you trust a brand new assistant to write up an email for you without proofreading it? How much training would they require before you didn't need that step? How much training / fine-tuning would an LLM need? What about the next-gen LLM?
Remember, we're not talking about a static target here, and the post I replied to set no qualifications on the claim that a human will always be needed to check that the mathematical definitions in the proof match their English equivalents. That's a long timeline on a rapidly moving target that, as I said, already seems to be better than most humans at understanding and writing English.
Depends on the complexity, but for the simpler things I think I could get confident in a day or so. For more complex things, it might take longer to assess their ability.
But I'm not going to trust an LLM blindly for anything.
I won't defend that strong claim, and I limit my answer to LLMs (and mostly just the state of the art). OTOH I believe that trust will continue to be a big topic for any future AI tech.
Again, what does "blindly" mean? Suppose you went a month without finding a single issue. Or two months. Or a year. Must the probability of failure literally be zero before you rely on it without checking? Are you applying the same low failure probability to the human equivalent? Does a zero probability of failure for a human really seem plausible?
There's a reason “if you want it done right, do it yourself” is a saying.
I feel like this conversation is incorrectly conflating "probability of error" with "recourse when things go wrong".
Choosing to handle it yourself does not reduce probability of error to zero, but it does move the locus of consequence when errors occur. "you have nobody to blame but yourself".
One reason people might trust humans over AI regardless of failure rate is the answer to the question "what recourse do I have when there is an error", compounded by "is the error model self-correcting": i.e., when an error occurs, does some of the negative consequence serve to correct the cause of the error or not?
With another human in the loop, their participation in the project, their personal honor, or some similar stake can be jeopardized by any error they are responsible for. On the one hand this shields the delegator from some of the consequences, because if the project hemorrhages errors they can naturally demote or replace the assistant with another who might not make as many errors. But on the other hand, the human is incentivized to learn from their mistakes and avoid future errors, so the system includes some self-correction.
Using a static-inference LLM, the user has little recourse when there is an error. There is nowhere to shift the blame, and you probably can't sue OpenAI over losses or anything like that. It's hard to replace an LLM doing a bad job, aside from perhaps looking at ways to re-fine-tune it or choosing a different model, of which there aren't many materially competitive options.
But the biggest challenge is that "zero self-correction" problem. A static-inference LLM isn't going to "learn from its mistakes": the same input plus the same random seed will always produce the same output, and the same input with a randomized seed will always produce the same statistical likelihood of any given erroneous output.
You'd have to keep the LLM on a constant RLHF fine-tuning treadmill in order for it to actually learn from errors it might make, and that re-opens the can of worms of catastrophic forgetting and the like.
But most importantly, that's not the product presently being packaged, one way or the other, and no company can offer a "learning" option to a single client at an affordable price that doesn't also commoditize all the data used for that learning.
If the LLM required a constant fine-tuning treadmill, you wouldn't actually use it in this application. You could tell if you were on such a treadmill because its error rate wouldn't be improving fast enough in the initial phases while you were still checking its work.
As for what recourse you have in case of error, that's what fine-tuning is for. Your recourse is you change the fine-tuning to better handle the errors, just like you would correct a human employee.
Employees are not financially liable for the mistakes they make either; only their job is at stake. But this is all beside the point: if the LLM's error rate is equal to or lower than a human employee's, why prefer the human?
It will eventually become as chess is now: AI will check and evaluate human translation to and from English.
And if it says the human got it wrong, then tough luck for the human if they didn't. :(
Constructing a sentence is only the last step in writing, akin to pressing the shutter release on a camera. LLMs can turn a phrase but they have nothing to say because they do not experience the world directly. They can only regurgitate and remix what others have said.
Apart from the whole "generating bullshit" thing, sure.
Humans still generate more bullshit than LLMs.
Some do. Not all.
Until they incorporate more of what some of you are writing and lose their advantage.
How do you think humans are doing this, if you don't think machines can ever do anything similar?
You misunderstand. The point here is not about machines being better or worse at some task than humans, but about humans defining the objective function.
This doesn't seem to follow. Why can't computers get better at doing this (anticipating what humans want, or whatever) than people? Some people are better at it than others, and people are not magic, so I'd guess at some point computers will get it too.
I think what the parent post is referring to is that clarifying human intention rather axiomatically involves a human at some stage in the process.
The problem I express relates to the issues people mention about libraries: if a defined concept is used, one has to be sure the definition is correct (i.e., that the right thing has been proved).
Wilshaw's formalization is not vulnerable to this objection, though libraries were used. What is proved is that a certain defined concept satisfies a certain suite of formulas of first order logic. If there is a predicate satisfying that suite of formulas, NF is consistent.
And it very much IS an essential part of my confidence in this proof that conversations between me and Sky Wilshaw reveal that she understands my argument [and that she was able to point out errors and omissions in paper versions!].
Human interaction helps create confidence. But the software is extremely reliable: a philosophical challenge based on bugs in theorem-proving software just is not going to hold water.
There's a (small, grey) link that reads 'edit' among the links at the top of each comment; you can use it if you want to change or add something to a comment you've already written, if you prefer that to replying to yourself.
I tried using it, and I could edit, but the update button did nothing; the edit never got posted. So I'll stick with multiple little replies for now.
The edit happens inline, immediately when you hit update - it won’t take you back to the thread but the comment is updated right above the input field (and in thread). It’s not terribly important either way.
I am sure you know this, but for the audience: the danger can be mitigated somewhat with a "test suite" of theorems and examples about the definitions. These examples can be very simple ("this particular object, with this particular operation, is a group; this other object is not") or much more sweeping and general (e.g. fundamental theorems like "all objects with this property are isomorphic" or "all objects with this property embed canonically in this other construction"). It doesn't prove that your definitions are correctly capturing what your natural-language proof talks about, but it can help you be more confident.
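As a toy illustration of such a test suite (a minimal sketch: the definition MyComm and the examples below are made up for illustration, not taken from mathlib or the NF project, and assume a recent Lean 4 toolchain):

    -- A home-grown definition we want to sanity-check.
    def MyComm (op : Nat → Nat → Nat) : Prop :=
      ∀ a b : Nat, op a b = op b a

    -- "Test suite": simple facts that should hold if the definition
    -- captures what we meant.
    example : MyComm (fun a b => a + b) := fun a b => Nat.add_comm a b
    example : MyComm (fun a b => a * b) := fun a b => Nat.mul_comm a b

    -- And something that should fail to satisfy it: truncated
    -- subtraction on Nat is not commutative.
    example : ¬ MyComm (fun a b => a - b) := by
      intro h
      have h01 : (0 : Nat) - 1 = 1 - 0 := h 0 1
      omega

If one of these little examples refuses to typecheck, the definition probably doesn't mean what you thought it meant.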
Another danger is some sort of bug in Lean itself. This isn't unprecedented in theorem provers [1][2]. These might be hard to hit by accident... but there are larger and larger collaborations where arbitrary people fill in steps (like [3]). Someone trolling one of these efforts by filling in a step using a bug they found might become worth worrying about.
[1]: https://inutile.club/estatis/falso/
[2]: https://www.youtube.com/watch?v=sv97pXplxf0
[3]: https://terrytao.wordpress.com/2023/11/18/formalizing-the-pr...
It's kind of a root of trust problem, isn't it? I think the algorithm for checking proofs is relatively simple. All those fancy tactics boil down to a sequence of rewrites of an expression tree, using a small handful of axiomatic rewrite rules.
The trusted codebase becomes that checking algorithm, along with the "compiler" that translates the high-level language to term rewriting syntax. Formally verifying that codebase is a rather circular proposition (so to speak), but you could probably bootstrap your way to it from equations on a chalkboard.
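As a toy illustration of that division of labor (plain Lean 4, nothing project-specific): tactics are just a front end for constructing a proof term, and it is that term which the small kernel re-checks.

    -- Two proofs of the same statement: one built by a tactic,
    -- one written directly as a term. The kernel checks the
    -- elaborated term in both cases.
    example : 2 + 2 = 4 := by decide
    example : 2 + 2 = 4 := rfl

    -- #print shows the term a tactic proof elaborated to.
    theorem two_plus_two : 2 + 2 = 4 := by decide
    #print two_plus_two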
Note also that there is an independent checker https://github.com/leanprover/lean4checker to ensure that you're not pulling any fancy tricks at the code level: that the compiled output, free of tactics, is in fact a proof.
Congratulations on the verification of your proof! It must be great to have your research life's crowning work formally confirmed! Also a great victory for Quine's New Foundations.
I have shown the consistency of New Foundations. My aim is not actually to promote it as a working set theory. NFU, which admits Choice, is probably better for that. But if there are people who want to use NF as the foundation, it is now seen to be as secure as a rather small fragment of ZFC.
From a foundational perspective, it is also important to note that this proof is one of equiconsistency between NF and the Lean kernel, which itself is hand-checked. Mechanized theorem provers preserve the level of correctness imputed to them via outside injection, from humans or other out-of-band systems.
It certainly isn't a proof of equiconsistency between NF and the Lean kernel. The theory implemented in the Lean kernel is considerably stronger than NF.
This is why specification is much more important than verification / proof. We are bound by how accurate we make our propositions.
Both are very important.
Many congratulations on being formally proved right, Randall!
That's always the problem with these pesky computers.
They do exactly what you tell them to.