
Some notes on Rust, mutable aliasing and formal verification

pron
50 replies
1d8h

I usually love Graydon's posts, but this one -- like many articles about formal methods -- may be misleading for most readers unfamiliar with the field. It's a little like reporting an order of magnitude improvement in transmuting lead into gold without mentioning that there are 29 further orders of magnitude to go before the method is cost-effective in practice.

Does no aliasing make formal verification significantly easier? Absolutely! Does it mean we can expect a practical and cost-effective way to verify real-world programs? Not even remotely. Yes, some programs, circuits, and program components are formally verified every day, but they are the exception rather than the rule; they are comparatively small and they are constructed in particular, careful ways. Nice properties that allow local reasoning about some other properties are important, but they do not materially affect the way we can assure the correctness of mainstream software with sound methods. A reasonable, real-world program that would take a million years of effort to verify without such properties will only take 100,000 years of effort to verify with them. That is good, and it is even useful in practice, because a program that would take a year to verify could now take a month -- making it actually cost-effective -- but the number of such programs is vanishingly small in the grand scheme of things.

A program in a language with no heap, no pointers, no integers -- only boolean variables -- and no loops larger than two iterations (the language is nowhere near Turing-complete) cannot be practically verified (by reduction from TQBF). Not in theory, not in practice, and there aren't even heuristic methods that cover many real-world instances of such programs (of course, some such programs could happen to be easily verifiable). Verification can be made to work for some properties (like memory safety) -- we say that we can make these invariants inductive or composable -- but it turns out that this is not nearly good enough for what software needs.
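
To make the blowup concrete, here's a toy sketch (all names mine, nothing from the thread): for such a boolean-only, loop-free language, the naive sound approach is to enumerate every one of the 2^n input assignments, and the TQBF-hardness result says no general method can do fundamentally better in the worst case.

```rust
// Hypothetical toy verifier. `claim` stands in for a compiled boolean program
// asserting some property; `verify` is the brute-force sound checker.

/// A stand-in boolean program asserting: "if any input is set, the first is".
fn claim(inputs: &[bool]) -> bool {
    !inputs.iter().any(|&b| b) || inputs[0]
}

/// Brute-force verifier: returns the first failing assignment, if any.
/// Cost is 2^n evaluations of `claim` -- fine for toys, hopeless at the
/// size of real programs.
fn verify(n: usize) -> Option<Vec<bool>> {
    (0u64..(1u64 << n)).find_map(|bits| {
        let inputs: Vec<bool> = (0..n).map(|i| (bits >> i) & 1 == 1).collect();
        if claim(&inputs) { None } else { Some(inputs) }
    })
}
```

With one input the claim holds on both assignments; with three, the verifier finds the counterexample where only the second input is set.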

Back in the seventies and eighties, and even the nineties, the hope was that while the "worst case" complexity of verifying programs was known to be intractable, perhaps certain local guarantees and structures employed by programming languages could move us away from the worst case. That has since been proven not to be the case. There remained the hope that the programs people actually write are far enough from the worst case for good heuristic methods to emerge, but even that now seems not to be the case.

Many years ago I gave a talk covering the relevant findings: https://pron.github.io/posts/correctness-and-complexity

The main result is that verification of most of the interesting properties we'd like to verify does not compose. I.e., if we can prove a property for components P1...Pn, then proving that property for P1 ○ ... ○ Pn (where ○ is some composition, such as a subroutine call or message passing) does not just grow super-polynomially in n (which would be good!), but super-polynomially in the size of each of the components, i.e. it would be just as hard to prove as if nothing had been decomposed. In other words, correctness does not decompose.
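
Schematically (my notation, not from the talk): writing T(P) for the worst-case cost of verifying the property on program P, the negative result is

```latex
T\bigl(P_1 \circ \cdots \circ P_n\bigr) \;\neq\; \mathrm{poly}\!\left(n,\ \max_i \lvert P_i \rvert\right)
```

even when each T(P_i) has already been paid: the cost is super-polynomial in the component sizes |P_i| themselves, so decomposition buys nothing in the worst case.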

Yes, some important programs can be soundly verified, and various local properties can make that easier, which is useful. But because software is growing much faster than the scale of effective formal verification, and because we've learned more results about the complexity of software verification, end-to-end verification of "ordinary" software appears further away today than it did 50 years ago. That is why there's a shift in software-assurance research toward unsound methods. A focus on soundness does not, I believe, help the cause of more correct software, but distracts from techniques that have proven more fruitful.

This gap between "we can verify quite a bit" and "what we can verify is a drop in the ocean" -- both of which are true at the same time -- is something that is often missing from discussions of formal methods.

roca
35 replies
1d8h

I think it's reasonable to look forward to verifying that small Rust libraries use 'unsafe' correctly. That alone would be really helpful.

pron
32 replies
1d6h

Even if that were the case, you'd be working very hard, wasting your assurance budget, to prove something underwhelming, like memory safety or other similarly low-level properties. After all that hard work you'd only end up at the same point where most of the software world -- which is already using memory-safe languages -- is today. On the other hand, if you were to spend your budget the way they do (or rather, should) with modern unsound methods, you would likely end up not noticeably worse off than if you did both.

Remember, the goal isn't to prove some technical property, but to find the greatest number of bugs that matter to you -- those you believe are most dangerous for your particular program or library -- within budget. Researchers are often interested in discovering what properties they can verify, but practitioners care about the functional properties they need to verify.

Deductive methods (including type systems) are particularly myopic in that regard precisely because they're so limited in their effectiveness to begin with. People interested in those particular techniques are rightfully very excited when they discover they can do a little bit more than they previously could, even if what they can do is still so far behind what practitioners need (and are getting elsewhere with more practical methods).

In general, the gap between research and what practitioners think it might mean for their industry work is large in this field and greatly misunderstood by practitioners.

naasking
16 replies
1d6h

to prove something underwhelming, like memory safety or other similarly low-level properties. After all that hard work you'd only end up at the same point where most of the software world -- which is already using memory safe languages -- is today.

I'm not sure why this is underwhelming. You reduce your hardware cost budget by using a low-level language while ensuring you don't suffer from the single greatest source of security vulnerabilities in real programs. These have significant real-world impact.

pron
15 replies
1d5h

I'm not sure why this is underwhelming. You reduce your hardware cost budget by using a low-level language while ensuring you don't suffer from the single greatest source of security vulnerabilities in real programs.

There are two problems with your statement. First, it's unclear whether what you save pays for what you gain. Second, the contribution of memory unsafety to vulnerabilities is not binary. If you can cheaply reduce memory unsafety by 95%, that doesn't mean that paying a high price for the remaining 5% is worth it. Yes, memory safety is the single greatest source of security vulnerabilities in some languages (it is not in general, but then again most software is already written in memory-safe languages), but once you've already reduced it by 95% it no longer is. (BTW, I'm not even sure that soundly eliminating the 95% is clearly worth it compared to alternatives, but that's a separate discussion.)

There's this obsession with soundness that is undoubtedly interesting to researchers of sound methods, but isn't actually justified by the real-world needs of software (sometimes it is and sometimes it isn't, but when and by how much is something that can only be answered empirically; it isn't something that can be simply asserted). Software correctness is at least as much art as it is science, and ignoring the empirical aspects that make it an art while focusing only on the aspects that are science, because that's where things are clear-cut, doesn't make the empirical aspects disappear. At the end of the day someone has a budget and they need to spend it in the way that yields the most correct software; how that is best done is an empirical question. After all, you started with an empirical finding about memory safety being a leading source of some bugs, but it simply doesn't follow that sound, total memory safety is the answer to the empirical question we need answered. Soundness is clear-cut and mathematical (and that's also why it's attractive [1]); sadly, software correctness is the very opposite.

We must continue researching sound methods, but if anyone thinks -- knowing what we know now -- that this should be the main focus and the best answer to software correctness then I think they are misjudging the nature of the problem.

[1]: Ironically, people who focus on soundness love proofs, yet they try to skirt around the proofs of the futility of soundness by resorting to empirical -- though untested -- claims.

naasking
14 replies
1d4h

Here's the problem with this argument: the long history of security vulnerabilities demonstrates that people are complete shite at estimating both the likelihood of security problems, and the severity and scope of any unaddressed security problems.

So the only empirical data that we know to be absolutely reliable is this long history of colossal human failure on estimating this risk. That's why I don't trust your argument one bit, and thus, that's why soundness properties are important, because they take flawed human judgment completely out of the equation.

Which isn't to say that I want soundness proven for all properties no matter the cost, but I do take issue with your outright dismissal of soundness. There are low-hanging fruit like memory safety that handle the vast majority of these problems, and so unquestionably have a great bang for your buck.

pron
13 replies
1d4h

But you are again trying to make a non-binary problem into a binary one. We will never be able to soundly verify all the properties we need. That dream is just getting further and further away. 100% correctness is unattainable. So given that, let's say we could only ever hope for 99%. In that case it doesn't matter whether we get rid of 99% of problems using unsound methods or soundly get rid of the cause of 99% of the problems -- the end result is the same. What matters is the cost of getting to that 99% correctness, not the technique.

My choice of technique -- sound or otherwise -- is driven by its cost-effectiveness, not principle. I don't care if you're using a sound technique or an unsound one; I only care about how much it costs to give me the most correctness I need and can get within my budget.

I do take issue with your outright dismissal of soundness

I don't dismiss soundness. I dismiss the focus on soundness out of principle rather than effectiveness. That myopic focus is grounded in the preferences of researchers of sound methods; it is not grounded in the study of software correctness.

naasking
12 replies
1d4h

But you are again trying to make a non-binary problem into a binary one. We will never be able to soundly verify all the properties we need.

I literally said I did not want soundness for all properties no matter the cost, but that soundness for some critical properties is absolutely warranted. The binary for some properties is absolutely there and completely justifiable based on the reasoning I laid out (memory safety in particular).

foldr
6 replies
1d3h

The binary for some properties is absolutely there and completely justifiable based on the reasoning I laid out

As a spectator in this thread, I do not see an argument for this conclusion in any of your posts. You're just insisting on it rather than arguing based on data.

naasking
5 replies
1d3h

You're just insisting on it rather than arguing based on data.

So the unbelievably long history of memory safety vulnerabilities is not an argument for distrusting human judgment on the prevalence or scope of possible unsafety in one's program, and thus an argument for memory soundness? This is literally the argument I presented here [1]. In your mind, this is not an argument based on data?

[1] https://news.ycombinator.com/item?id=40378315

foldr
4 replies
1d3h

It's not an argument for always insisting on eliminating 100% of memory safety bugs via sound methods regardless of budget. That requires a more elaborate argument than just "memory safety bugs are really bad and therefore we should eliminate all of them with certainty".

naasking
3 replies
1d2h

It's not an argument for always insisting on eliminating 100% of memory safety bugs via sound methods regardless of budget.

Firstly, I never said sound methods ought to be used. Secondly, I also never said that memory safety bugs must always be eliminated without qualification; I specifically couched this in a context of security. In that context, ensuring memory safety is absolutely critical. For instance, scientific simulations of the weather or something don't have such considerations.

The whole problem of memory unsoundness is that you cannot predict the scope or severity of unsoundness. They are simply not like other "bugs", so the heuristic arguments the OP presented simply don't work. All other desirable properties derive from memory soundness, so if you cannot rely on it then your system effectively has no guarantees of any kind.

So yes, any system with any kind of attack surface should absolutely ensure memory safety by any means necessary. If you need to use an unsafe language for performance reasons, as per my initial reply, then isolate the unsafety to trusted kernels (like checked arrays) that can be verified by model checking or other proof techniques.
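
A minimal sketch of that "trusted kernel" idea (names and API are mine, not a quote of any real library): all the `unsafe` lives inside a tiny, auditable core whose single proof obligation -- the bounds check -- is established by every safe entry point before the kernel runs.

```rust
/// A fixed-length buffer whose only `unsafe` code is two one-line kernels.
pub struct CheckedBuf {
    data: Vec<u32>,
}

impl CheckedBuf {
    pub fn new(len: usize) -> Self {
        CheckedBuf { data: vec![0; len] }
    }

    /// Safe API: validates the index, then delegates to the trusted kernel.
    pub fn get(&self, i: usize) -> Option<u32> {
        if i < self.data.len() {
            // SAFETY: `i < len` was just checked; this check is the entire
            // proof obligation for the unchecked access below.
            Some(unsafe { *self.data.get_unchecked(i) })
        } else {
            None
        }
    }

    /// Safe API for writes; returns false instead of going out of bounds.
    pub fn set(&mut self, i: usize, v: u32) -> bool {
        if i < self.data.len() {
            // SAFETY: bounds established above.
            unsafe { *self.data.get_unchecked_mut(i) = v; }
            true
        } else {
            false
        }
    }
}
```

Because the unsafe region is this small and its invariant this simple, it's the kind of kernel one could plausibly hand to a model checker, while the rest of the program only ever touches the safe API.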

foldr
2 replies
1d1h

Firstly, I never said sound methods ought to be used.

You kind of did say this: "soundness for some critical properties is absolutely warranted". It seemed clear that you were taking memory safety to be one of the "critical properties".

if you cannot rely on it then your system effectively has no guarantees of any kind.

In the real world you cannot rely on anything absolutely. All guarantees are probabilistic. Rust, for example, does not have a formally verified compiler, so you cannot be absolutely sure that your provably memory safe code will actually compile down to memory safe machine code (or indeed to machine code that does what it's supposed to do in any other respect). Does that mean we need to go off and formally verify the Rust compiler because otherwise none of our systems will have guarantees of any kind? If it doesn't, then note that you are making exactly the same kind of cost/benefit analysis that pron is talking about.

It's easy to list lots of things that systems 'should' do. In a realistic scenario there is a budget of time/effort/attention/money.

naasking
1 replies
23h29m

You kind of did say this: "soundness for some critical properties is absolutely warranted". It seemed clear that you were taking memory safety to be one of the "critical properties".

You can achieve soundness without "sound methods" in the formal sense that you were using the term.

If it doesn't, then note that you are making exactly the same kind of cost/benefit analysis that pron is talking about.

I disagree; there's at least an order of magnitude difference between what we're discussing (probably more). pron is tossing around numbers like eliminating 95% of memory safety bugs, where the safety of runtime-checked arrays will eliminate more like 99.99% of memory safety bugs, where that 0.01% allows for the chance that you made a mistake in your checked-array implementation. There is simply no equating the two.

foldr
0 replies
22h47m

I think "sound methods" in this context just means methods that guarantee soundness. You can have soundness without sound methods, but you can't know (for sure) that you have it.

pron
4 replies
1d2h

but that soundness for some critical properties is absolutely warranted

A particular technique can only be warranted for practical use by its cost effectiveness, not on principle. If you justify soundness for a property because that property is at the root of 90% of bugs then you should still prefer another technique that reduces 90% of bugs more cheaply. It doesn't matter what technique is used to reduce that amount of bugs.

I understand the argument that nothing but soundness covers the full gamut of mistakes people make in practice around memory safety, but at the end of the day, if those mistakes amount to 800 bugs in your program out of 1000, then you should still prefer a technique that reduces 800 bugs -- the same or different ones of equal severity -- for cheaper.

Indeed, there are situations where sound methods are cost-effective, and that is where they tend to be popular.

naasking
3 replies
1d2h

If you justify soundness for a property because that property is at the root of 90% of bugs then you should still prefer another technique that reduces 90% of bugs more cheaply. It doesn't matter what technique is used to reduce that amount of bugs.

This is a false equivalence, as not all bugs are the same and so cannot be trivially equated in this fashion. Memory safety violations permit classes of bugs of significantly higher severity than other types of bugs. Your off-hand disclaimer of "the same or different ones of equal severity" doesn't mean anything, because you cannot predict the severity or scope of even trivial memory unsoundness, so your heuristic arguments of preferring lower cost methods simply don't work. A memory safety bug can range from something as simple as "a process crashed" in the best case, to "we lost control of all our servers worldwide and all of our customer data was compromised".

So I would summarize my dispute with your position as follows:

1. Memory safety is NOT an underwhelming property. Arguably, all other properties of your program derive from memory safety.

2. If you have to use an unsafe language that permits violating memory safety, say for performance reasons, and you have any kind of attack surface, then ensuring memory safety is important. Probably more important than most other properties you otherwise think are important due to #1.

3. Heuristic approaches to memory safety do not necessarily entail a reduction in the severity of bugs and/or security vulnerabilities. Arguments that you eliminated 80% of trivial memory unsoundness are simply not compelling if they leave all of the subtle, more dangerous cases in place.

4. The long history of security vulnerabilities is a main driver of the interest in soundness among researchers, contra your claims that this obsession has no association with real-world needs.

I think I'll just leave it at that.

pron
2 replies
1d1h

This is a false equivalence, as not all bugs are the same and so cannot be trivially equated in this fashion.

Correct, and I tried hard not to make that equivalence.

Memory safety violations permit classes of bugs of significantly higher severity than other types of bugs.

This is not known to be true after eliminating 95% of memory safety violations.

Your off-hand disclaimer of "the same or different ones of equal severity" doesn't mean anything, because you cannot predict the severity or scope of even trivial memory unsoundness, so your heuristic arguments of preferring lower cost methods simply don't work.

Ah! Now that is 100% true, but it cuts both ways. Of course it's a problem that we can't know in advance which bugs will be severe or how probable they are, but saying let's shift resources into sound methods in areas where they happen to work implicitly makes just such a guess. It's alright to acknowledge that measuring effectiveness is hard, but that also means we can't presuppose the effectiveness of something just because it has an easy-to-understand mathematical property. I'm not arguing for or against sound methods; I'm saying that since the question is empirical, it must be answered by empirical means, as that is the only way to answer it.

A pragmatic argument may be that you'd rather risk overspending on correctness than underspending, which essentially means: spend whatever you can on soundness whenever you can have it. The problem is that the world just doesn't work this way, because budgets are limited, and any work that is done to increase confidence in one property must, necessarily, come at the expense of other work that may also have value. There is simply no escape from measuring the cost/benefit of an economic activity that has a non-zero cost.

If I fly on a plane I want to know that the company doing the avionics spent its assurance budget in the way that eliminated most bugs, not that it spent it on ensuring memory safety (when I was working on avionics, we didn't use deductive methods even though they were available at the time precisely because their ROI was low and that's bad for correctness where you want every dollar to do the most good).

Memory safety is NOT an underwhelming property. Arguably, all other properties of your program derive from memory safety.

Except this is the starting point for most software. If your goal is to get to where most software already is -- I would very much characterise that as underwhelming.

If you have to use an unsafe language that permits violating memory safety, say for performance reasons, and you have any kind of attack surface, then ensuring memory safety is important. Probably more important than most other properties you otherwise think are important due to #1.

Yes, but how much should you invest in getting from 0 safety to 90% vs how much from 90% to 100%?

Arguments that you eliminated 80% of trivial memory unsoundness are simply not compelling if they leave all of the subtle, more dangerous cases in place.

I agree, but again -- it cuts both ways. If you invest a significant amount of effort in eliminating the last 5% of memory safety violations you cannot claim that it's more effective than spending your effort elsewhere. All I'm saying is that these are empirical questions and the fact that whether something is sound or not has a clear yes/no answer doesn't really help tackle the empirical problem.

The long history of security vulnerabilities is a main driver in the interest in soundness among researchers, contra your claims that that this obsession has no association with real-world needs.

This is clearly not true, because the focus on soundness has only decreased over the past 50 years and continues to decrease. In the 70s, soundness was considered the only way. Now there's more research into unsound methods.

naasking
1 replies
23h8m

I'm not arguing for or against sound methods; I'm saying that since the question is empirical, it must be answered by empirical means as that is the only way to answer it.

And I say that the history of CVEs empirically shows that any important software with an attack surface must ensure memory safety, because no heuristic approaches can be relied upon.

It's alright to acknowledge that measuring the effectiveness is hard, but that also means we can't presuppose the effectiveness of something just because it has an easy to understand mathematical property.

Except you can measure the effectiveness of memory safety in preventing vulnerabilities in domains with an exposed attack surface. Your argument just reduces to an overall focus on bug count, despite acknowledging that 1) most security vulnerabilities are memory safety related, and 2) that memory safety bugs and other types of bugs can't be equated; and your only "escape hatch" is a supposition that a heuristic approach that "eliminates 95% of memory safety violations" probably doesn't leave anything serious behind. Sorry, 40+ years of CVEs does not make this claim reassuring.

Except this is the starting point for most software. If your goal is to get to where most software is already is -- I would very much characterise it as underwhelming.

It's not the starting point of software that has high performance or low power requirements, or expensive hardware, which is the context in which I took issue with your statement in my first reply. That's why memory safety is not underwhelming in this context.

Now there's more research into unsound methods.

There's more research into those methods because security costs are still a negative externality, and so a cost-reduction focus doesn't factor them in.

Anyway, I feel we're just circling here.

pron
0 replies
17h16m

And I say that the history of CVEs empirically shows that any important software with an attack surface must ensure memory safety, because no heuristic approaches can be relied upon.

That no heuristic approaches can be relied upon to eliminate all memory safety violations -- something that is obviously true -- does not mean that eliminating all of them is always worth it. If memory unsafety is the cause of 80% of security attacks, reducing it by 90% will make it a rather small cause compared to others, which means that any further reduction is a diminishing return.
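
Spelling out the arithmetic with those illustrative numbers:

```latex
\underbrace{0.80}_{\text{share of attacks}} \times \underbrace{(1 - 0.90)}_{\text{fraction remaining}} = 0.08
```

i.e. memory unsafety goes from causing 80% of attacks to about 8% -- well below the 20% caused by everything else -- so the marginal dollar of assurance budget is then better spent on those other causes.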

Except you can measure the effectiveness of memory safety in preventing vulnerabilities in domains with an exposed attack surface ... and your only "escape hatch" is a supposition that a heuristic approach that "eliminates 95% of memory safety violations" probably doesn't leave anything serious behind. Sorry, 40+ years of CVEs does not make this claim reassuring.

Sorry, this doesn't make any logical sense to me. If X is important because it is a cause of a high number of bugs then reducing the number of bugs caused by X necessarily reduces its remaining importance. You cannot at once claim that something is important because it's the cause of many bugs and then claim that what's important isn't reducing the number of bugs. The number of bugs can't be both relevant and irrelevant.

Anyway, I feel we're just circling here.

Yep.

tialaramex
5 replies
1d5h

Remember, the goal isn't to prove some technical property, but to find the greatest number of bugs that matter to you

No. And this has been a particular challenge for poor C++. If we prove a property then the property holds, but if we just found and meticulously fixed 846 of the crucial holes in our system, the bad guys exploit #847 to break in just the same. Is there a #848? #849? Maybe.

We as an industry are significantly under-delivering and we ought to take responsibility for that.

I think you're correct about the gap but wrong about what the gap is. WUFFS exists today. Lots of people on HN have written or even are writing codecs in a general purpose programming language and don't even realise that's a terrible idea. "I'm pretty careful" substitutes for a machine checked proof and the results can be seen easily.

pron
4 replies
1d5h

And this has been a particular challenge for poor C++. If we prove a property then the property holds, but if we just found and meticulously fixed 846 of the crucial holes in our system, the bad guys exploit #847 to break in just the same. Is there a #848? #849? Maybe.

That is an unattainable goal for most properties we care about, particularly if they're as complex as "an unauthorised user cannot obtain some information". Even a complete and total elimination of all memory safety bugs and all concurrency bugs won't get you anywhere near that goal (and you can verify that for yourself by looking at common server vulnerabilities). I should have said that the attainable goal is, etc.

naasking
3 replies
1d3h

particularly if they're as complex as "an unauthorised user cannot obtain some information". Even a complete and total elimination of all memory safety bugs and all concurrency bugs won't get you anywhere near that goal

Actually that's not really true. If you take that as a base and remove ambient authority, then you have capability security which does get you as close to fulfilling that property as is possible to achieve. I'd say that's pretty near that goal.

pron
2 replies
22h24m

I disagree, for multiple reasons:

1. You need to verify that all the sensitive data is indeed guarded with the appropriate capabilities.

2. You need to verify that the capability objects are created only after proper authorisation.

3. You need to verify that the capability objects cannot escape their relevant context.

4. You need to verify that the data is not accessible by "extra-program" means, e.g. the file containing the data may be read via a dumb directory-injection attack.

Some simple sound techniques by the programming language can help with 3 and a part of 2, but not the rest. Indeed the current #1 top vulnerability [1] is a misconfigured or incorrect access control.

[1]: https://owasp.org/www-project-top-ten/

naasking
1 replies
21h39m

Some background:

1. "An unauthorised user cannot obtain some information" is not a property that can be achieved by any access control system, even in theory, e.g. an authorized user can just provide a copy to an unauthorized user. I will outline the only achievable property below.

2. All resources in a capability system are designated by capabilities, just like all objects in a memory-safe language are designated by references.

3. You can obtain more capabilities by accessing capabilities contained within resources designated by other capabilities, just like you can obtain more references by accessing fields of objects that contain references.

4. Capabilities intrinsically combine designation with authorization. If a user has a capability to some resource, they are by definition authorized to access it. There is no such thing as a "context" in which a resource may or may not be accessed; you either have or do not have a capability to it.

5. Capabilities are not ambient in the environment, and so the only path to obtaining capabilities is through the capability set you're given. An example of an ambient authority is the ability to obtain a file handle to any file in a system using an arbitrary file string you can just construct. Another example is being able to convert an arbitrary number into a reference, ie. violating memory safety.

Hopefully it's now clear how there's a formal correspondence between capabilities and memory safe languages that have had ambient authority removed. This does not in any meaningful way restrict our ability to construct programs, at worst we must pass around a few more parameters that we used to access using ambient authorities (like File.Open). Access to resources can now only flow along the reference graph and security analysis now becomes tractable, basically another type of dataflow analysis.

So the security property that you want to verify is that there is no way for user X to access resource Y if given capability set C. This means that the transitive closure of C doesn't include Y.

There are numerous ways to achieve this, but the most trivial way is to use a typed language and create a type specifically to hold Y, and two capability set types for authorized and unauthorized users, and then you can ensure by simple inspection that a Y never appears in any reference placed in a C (or you can be more precise and ensure only that Y never appears in any return/escape position, thus allowing X to operate on Ys indirectly via some API). The type system is already a dataflow analysis, so this just piggybacks off of that to achieve the property you're after. Note that these are comparatively simple steps compared to getting memory safety to begin with.
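
A hedged sketch of that confinement-by-types idea (all names are invented for illustration): the sensitive resource Y gets a dedicated type, and the two capability sets differ in whether that type is reachable from them at all.

```rust
/// The resource Y, wrapped in its own type so its dataflow is trackable.
struct Secret(String);

/// Capability set handed to authorized users: `Secret` is reachable from it.
struct AuthorizedCaps {
    secret: Secret,
}

/// Capability set handed to unauthorized users: no field and no method
/// return type mentions `Secret`, so no chain of accesses can reach Y.
struct UnauthorizedCaps {
    greeting: String,
}

impl AuthorizedCaps {
    fn read_secret(&self) -> &str {
        &self.secret.0
    }
}

impl UnauthorizedCaps {
    /// An API the unauthorized user may call; `Secret` never escapes here
    /// because it was never in this capability set to begin with.
    fn greet(&self) -> &str {
        &self.greeting
    }
}

/// Hand each user only the capability set they're entitled to.
fn make_caps(secret: String) -> (AuthorizedCaps, UnauthorizedCaps) {
    (
        AuthorizedCaps { secret: Secret(secret) },
        UnauthorizedCaps { greeting: String::from("hello") },
    )
}
```

Checking the "transitive closure of C doesn't include Y" property then reduces to inspecting that `Secret` appears nowhere in `UnauthorizedCaps` or its method signatures, which the type checker makes a local, mechanical inspection.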

That said, I agree that "extra-program" means of undermining the foundation of one's secure system obviously undermine its security properties. This is not really a compelling objection to me, any more than it's an objection to a proof of an algorithm's validity that it must assume your computer correctly adds two numbers.

pron
0 replies
6h52m

You're thinking like a language designer rather than a black-hat hacker :)

All resources in a capability system are designated by capabilities, just like all objects in a memory safe language are designated by references.

The problem is that data doesn't begin life as a typed object but as a string of bits. I.e. it passes some "IO barrier" between the outside world and the world controlled by the language. At some point, some code must assign those bits to an object, and this code could be vulnerable.

If a user has a capability to some resource, they are by definition authorized to access it.

This is defining whatever the program does as being correct, but that is not what we usually mean by correctness [1]. It's like saying that anyone who has obtained root privileges is, by definition, authorised to have them.

I fully understand the point about what is attainable and what isn't, but this is another example of my point about how some who think about correctness focus on what can be done without properly considering how much what can be done is worth, which is determined entirely by what needs to be done (or what's desired). The thing that I, the application's owner, want is not to have my data protected by a programming language's type system or, say, by means of a fuzzer, but to have my data not leak to people I don't want to give my data to. In the case of security, the attacker will always go after the weakest point, whatever it is, and the question is by how much you've reduced the chances of a successful attack and at what cost.

Saying I can't give you what you want but here's what I can give you is perfectly fine, but how much what you can give me is worth to me depends entirely on how much it covers what I want. The value of reducing the chance of a successful attack by 70% is the same regardless of the means by which you achieve it, and then becomes purely a matter of cost.

I fully understand the difficulty of judging probabilities, but you need to understand that that is the sole source of value in correctness. What I want is to lower the probability of some failure as much as possible for my $100 of investment. I don't care how it's done. That means that debates over assurance methods should be grounded in that question of cost and value. What makes an assurance method better or worse than another isn't how sound it is but how much it costs per unit of value, which is fault reduction (obviously the relationship isn't linear). Soundness is a technique for controlling value and costs, one that may be more or less effective depending on circumstances. The point is that it's a means, not an end.

roca
4 replies
1d5h

As a practitioner but also a small-time CS researcher, I don't expect verifying that small Rust libraries use 'unsafe' correctly will "waste my assurance budget".

On the cost side: we don't know yet how hard it will be to formally verify small Rust libraries that wrap unsafe code in safe abstractions. That's partly because we don't even know exactly what the specification is that unsafe Rust code has to satisfy. But once we've settled on that and built good proof assistant tools, I think the work will usually be pretty easy, because most of these libraries are fundamentally quite simple and small, even including all their dependencies.

You correctly identified compositionality as a key issue. Fortunately, verifying correct use of 'unsafe' is inherently compositional: knowing nothing but the code of the library and its dependencies, we must prove that the necessary safety properties hold in all possible contexts.

On the benefit side: proving that libraries do not undermine Rust's safety guarantees is important, and the importance grows with the popularity of the library. And of course those guarantees aren't just "memory safety" but data-race-freedom and the general "mutable-xor-shared" property --- the importance of which is the subject of the OP here!

pron
3 replies
1d5h

You correctly identified compositionality as a key issue. Fortunately, verifying correct use of 'unsafe' is inherently compositional

Yes, it is an inductive property. Sadly, not a very powerful one.

On the benefit side: proving that libraries do not undermine Rust's safety guarantees is important, and the importance grows with the popularity of the library.

It is only important if you take it as an axiom that all programs must be proven to be memory-safe. That may be an axiom accepted by some Rust fans, but it is obviously not an axiom of software. What software needs are methods that reduce dangerous bugs (where "dangerous" means different things for different programs) in the most cost-effective way. It may well be the case that, say, soundly eliminating 95% of bugs caused by memory unsafety and using the rest of the budget on unsound methods is more effective than spending it on eliminating the last 5% of those particular bugs. Software's needs do not currently justify complete sound memory safety at all costs, so the question of cost is still the most relevant one. There is no justification to focus on one cause of bugs just because it can be soundly eliminated.

And of course those guarantees aren't just "memory safety" but data-race-freedom and the general "mutable-xor-shared" property --- the importance of which is the subject of the OP here!

Neither of these are worthwhile enough to move the needle. This is precisely what I mean by the myopic focus on what can be done as opposed to what needs to be done. If the gap is 10 orders of magnitude, narrowing it by 1 doesn't have a big impact, unless it were completely free, which it isn't.

roca
0 replies
21h43m

Your focus on "the most cost effective way" is fine in principle but in practice, we don't know ahead of time what bugs exist and what their impact is. Experience has shown that the prevalence and impact of memory safety and data race bugs is very difficult to predict, and the impact has very high variance. There are so many examples of memory safety bugs and races that lie dormant for years and suddenly become very important (e.g. because someone malicious discovered them, or because the software started being used in a different context).

Note that these safety bugs are really different from most other kinds of bugs in that their potential impact is unbounded: they can cause the program to behave in arbitrary ways. Logic errors, even stuff like non-UB integer overflow, have impact that is still constrained by the expected semantics of the programming language.
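As a small illustration of that distinction (my example, not the commenter's): Rust treats integer overflow as defined behaviour, so its impact stays bounded by the language semantics rather than becoming arbitrary, unlike UB:

```rust
fn main() {
    let x: u8 = 250;
    // Overflow is not UB in Rust: in release builds you get a defined
    // two's-complement wraparound (and a panic in debug builds)...
    assert_eq!(x.wrapping_add(10), 4);
    // ...or, with checked arithmetic, an explicit None -- never arbitrary
    // behaviour somewhere else in the program.
    assert_eq!(x.checked_add(10), None);
    assert_eq!(x.checked_add(5), Some(255));
}
```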

Given that uncertainty, it makes sense to take precautions. If you have cheap tools to eliminate those bugs, do it just in case. That might even be cheaper than trying to figure out ahead of time how important those bugs are going to be over the lifetime of the software (which is often an impossible task anyway).

roca
0 replies
21h54m

Ah, you seem to think I proposed using formal verification to eliminate 100% of memory safety bugs no matter what the cost. In fact I didn't say that, and I don't recommend that. There will certainly be cases, typically in very large libraries that use 'unsafe', where the invariants you have to prove are very complicated and fragile and difficult to verify, and therefore the cost outweighs the benefit.

What I'm looking forward to is formalisms and tools to pick the "low-hanging fruit" --- as I said, verifying the correctness of small libraries that use 'unsafe'. In small libraries, in general, invariants and proofs are smaller and proof automation is correspondingly more effective.

roca
0 replies
21h51m

If the gap is 10 orders of magnitude, narrowing it by 1 doesn't have a big impact, unless it were completely free, which it isn't.

It doesn't need to be completely free to have a positive net benefit.

zozbot234
3 replies
1d5h

to prove something underwhelming, like memory safety or other similarly low-level properties.

Memory safety is table stakes. If you can't even prove memory safety, your program doesn't really have any well-defined semantics after all.

you'd only end up at the same point where most of the software world -- which is already using memory safe languages -- is today.

Which is another way of saying that the modern software world is already using lightweight formal methods in an end-to-end fashion. Type checking is one such use of lightweight formal methods.

Unsound methods can be useful to surface possible bugs quickly and efficiently, much like fuzzing does. But they're not a kind of verification - they provide no guarantee of correctness.

pron
2 replies
1d5h

Which is another way of saying that the modern software world is already using lightweight formal methods in an end-to-end fashion.

Sure, but only for specific properties that are inductive. Most properties that programmers care about aren't. It is precisely this kind of extrapolation from one property to another that doesn't work.

But they're not a kind of verification - they provide no guarantee of correctness.

Correct, but: 1. nothing else does either (again, sound formal verification only works in very limited situations -- limited either in the property it guarantees or in the programs it can guarantee it for -- and is falling further and further behind what the software world needs), and 2. unlike for algorithms, a "guarantee" is not and cannot be the bar for real software running on physical hardware, as you cannot guarantee the real behaviour of a software system even if its algorithm is guaranteed to be correct. Unlike an algorithm, a software system is ultimately a physical object, not a mathematical one. To crudely oversimplify, you don't care about any confidence beyond the probability of a bit flip.

In other words, you're pointing out that the current state-of-the-art in the field is not as good as something we don't (perhaps can't) have and don't need to begin with.

zozbot234
1 replies
1d5h

Sure, but only for specific properties that are inductive. Most properties that programmers care about aren't.

Yes but in practice, we keep finding ways of making more such interesting properties "inductive" by placing appropriate restrictions on how the program can be designed. This is how the Rust borrow checker works, at a high level.

pron
0 replies
1d4h

Not really, and this is what I meant: an order-of-magnitude improvement is very exciting to the researcher, but when 29 orders of magnitude remain, it doesn't make a big practical impact. Even something as simple as the borrow checker -- which only brings you to about where most software, with all its bugs, already is -- comes at the cost of only being able to use memory in certain ways. I.e., to make the property inductive, it had to be made coarser (indeed, that is what is always done when looking for inductive properties [1]), which requires more effort from the user.
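A standard Rust example of that coarseness (my sketch, not from the comment): taking mutable references into two disjoint halves of a slice is perfectly memory-safe, but the borrow checker's aliasing rule is too coarse to see the disjointness, so the standard library provides split_at_mut (implemented with unsafe internally) to recover the pattern:

```rust
fn main() {
    let mut v = [1, 2, 3, 4];

    // Memory-safe, but rejected by the borrow checker's coarse rule:
    // let a = &mut v[0];
    // let b = &mut v[2]; // error[E0499]: cannot borrow `v` as mutable twice

    // The library escape hatch: split_at_mut establishes disjointness once
    // (with unsafe inside std) and returns two non-aliasing mutable borrows.
    let (left, right) = v.split_at_mut(2);
    left[0] += 10;
    right[0] += 10;
    assert_eq!(v, [11, 2, 13, 4]);
}
```

The inductive "mutable-xor-shared" rule is preserved, but only because the fine-grained reasoning was pushed into a trusted unsafe primitive.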

As I've written above, over the past two or three decades the theory has only taught us that things are more difficult than we thought, while the capabilities of sound verification have fallen further away from the needs of mainstream software. The trajectory of both the theory and the practice of sound methods has been negative. All the while, we've made some nice strides in unsound methods. Tony Hoare started recognising this in the nineties (which somewhat shifted his 70s view on the subject), and the trend has continued: yes, there are improvements in sound methods, but the problems become harder faster than those improvements, and the improvement in unsound methods is also faster.

[1]: That is seen very elegantly in TLA+, but I digress.

zozbot234
1 replies
1d7h

That would require a formal model of how 'unsafe' works, and we still don't have that. Pointer provenance is an especially difficult issue, also affecting C/C++ specification efforts.

roca
0 replies
1d6h

Indeed, we don't have that model, and we definitely need it. I don't think it's intractable. We've made a lot of progress. I would like to see more focused effort on this particular problem from the academic community.

pron
7 replies
1d6h

P.S.

A focus on soundness does not, I believe, help the cause of more correct software but distracts from techniques that have proven more fruitful in practice (and supported by theory) -- sometimes to the surprise of those who used to think that soundness is the only way, e.g. http://users.csc.calpoly.edu/~gfisher/classes/509/handouts/h.... This happens over and over: there is some important improvement in sound methods while at the same time unsound methods are making much bigger strides toward more correct software in practice. Here's just one very recent example (which isn't even necessarily state-of-the-art in unsound methods): https://antithesis.com

Of course, this is a very complex topic, because there are some specific situations where sound techniques are more powerful and practical.

jerf
4 replies
1d4h

Which touches on the other problem: the difficulty of getting people to even use the easy, low-cost techniques we already know. Here we are talking about SAT solving and formula propagation and how unsafe Rust interacts with those things, while out in the world I experience a moment of shock if I so much as see

    type Username string
    type Password string

    func Login(Username, Password)
instead of

    func Login(username string, password string)
in code. (Obviously these type signatures are just complex enough to make my point and would generally have more to them in the real world.)

Or at times even just encountering a unit test is enough to surprise me. Programmers need to get really, really tired of fixing the same bug(s) over and over in QA before they finally have the brainwave of adding their first unit test somewhere around the 25,000 line point.

I wish formal methods all the best and I wish I had more time to spend with them, but I'm awfully busy fighting upstream just to keep the unit tests running every time a new developer joins the project. I'm just asking for you to please install the git precommit hook or actually pay attention to the CI or whatever. Please. I'm not asking you to learn a SAT solver. Just keep the tests passing, and if a test broke because it's no longer applicable, please remove it instead of just letting it break for the next couple of years. etc.

dboreham
2 replies
1d3h

before they finally have the brainwave of adding their first unit test somewhere around the 25,000 line point

In my experience the driver for this is "whoever has the money", not programmers. The programmer who carefully writes tests from line 0 gets laid off/told to speed up while the programmer who hacks together a bunch of janky code with no tests, but demos something cool to the <person with the money> is not laid off.

jerf
1 replies
1d2h

Only if the programmers are doing it wrong. I add unit tests from day one not because I'm just that disciplined... honestly I think my discipline may even be a touch sub-average compared to a lot of programmers... I add them from day one because they speed me up, starting somewhere around the third week or so (so we're not talking years before it pays off here). I do not understand how programmers are OK with making a change and then only finding out later that they broke something distant that they wouldn't have thought would break. It would be so frustrating. Or decide to refactor something in week 4 without unit test support.

Your janky code guy may beat me in the first month but after that I'm quite likely going to pass him and never look back.

pas
0 replies
9h23m

Revealed preference of business is to move fast and break things. ("Any idiot can build a bridge, it takes a professional (an engineer!) to build one that barely stands.")

I grieve with you but it seems the way out of this is through some kind of magic potion cooked of the usual forbidden ingredients of: free and open source, being your own boss, idealism, etc.

vlovich123
0 replies
1d1h

It's not a bad idea to strongly type the username and password so that you can't so easily pass in unexpected input to the login / can centralize validation (e.g. username and password can't be more than 100 bytes each to attempt a login). It also lets you implement things like secure erasure of the password once it gets dropped.

Weakly typing that though is a sin. I wish Rust made creating strongly typed wrappers around primitives a bit more ergonomic.
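For readers unfamiliar with the pattern, here is a minimal sketch of the newtype boilerplate being lamented (all names hypothetical): each wrapper is a distinct type with zero runtime cost, but each needs its own constructor, validation, and accessors, which is the ergonomic papercut:

```rust
// Distinct types: the compiler now rejects swapped or raw-string arguments.
struct Username(String);
struct Password(String);

impl Username {
    // Validation is centralized in the constructor, as the comment suggests
    // (e.g. rejecting empty or over-long usernames before a login attempt).
    fn new(s: &str) -> Option<Username> {
        if s.is_empty() || s.len() > 100 {
            None
        } else {
            Some(Username(s.to_string()))
        }
    }
}

fn login(user: &Username, _pass: &Password) -> bool {
    // Hypothetical stand-in for a real authentication check.
    !user.0.is_empty()
}

fn main() {
    let user = Username::new("alice").expect("valid username");
    let pass = Password("hunter2".into());
    assert!(login(&user, &pass));
    // login(&pass, &user) would be a compile error, unlike with bare strings.
    assert!(Username::new("").is_none());
}
```

A real version would also implement traits like Zeroize-style secure erasure for Password, which is exactly where the per-wrapper boilerplate piles up.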

pas
0 replies
9h20m

By any chance, can you elaborate a bit on what these unsound methods are and what the current state of the art is? And what does Antithesis do? (Is it a fuzzer?)

ngruhn
0 replies
2h0m

What does "unsound method" mean in this context? Doesn’t that mean, if it finds a bug that this may not actually be a bug? I.e. it may give false positives?

littlestymaar
5 replies
1d3h

Hi Mr Presler,

I don't know if you realize that you've preached the exact same argument here on HN a hundred times over the past 6 years at least, without ever adding nuance after the many comments you've received on the topic. You're not entirely wrong, but also far from entirely correct, as half the world interested in the topic has had many chances to show you over the hundreds of times you've made that argument here.

And what's fascinating (besides your relentless preaching) is that the complete absence of any evolution in your position after all this time shows that you haven't learned anything from these discussions with others. Nothing.

That really sounds like pathological close-mindedness at this point. It means that everybody is wasting their time when engaging with you, and that's really something you should try and think about.

Regards

pron
4 replies
1d2h

Hi. I think that's the normal dynamics when an expert in a particular subject discusses something with laymen. I'm not a researcher, but I have applied a range of formal methods for software verification for about a decade on systems ranging from safety-critical real-time embedded software to distributed databases. I constantly learn from people with a similar or higher level of experience. Even in my comments today you can find me mentioning a technique I only learnt about relatively recently from the people at Antithesis (I'd known about it since FoundationDB, but I didn't know as much about the details of their state exploration heuristics).

It is one of the greatest disappointments of my career that that group is still small; even most researchers don't have much experience in the mainstream application of formal methods, and they each tend to focus on a narrow range of methods. The vast majority of practitioners care about correctness, but very few explore approaches other than simple testing or the use of simple type systems; even something as basic and as old as non-trivial assertions -- a form of formal specification -- is underutilised.
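As an illustration of that last point (my example, not the commenter's): a non-trivial assertion states an invariant rather than a single expected value, making it a tiny executable specification checked on every run:

```rust
// Hypothetical helper: combine two slices into one sorted vector. The
// debug_asserts below are the "non-trivial assertions": they specify
// properties of the output (sortedness, length preservation) rather than
// pinning down one particular answer the way a unit test would.
fn merge(a: &[i32], b: &[i32]) -> Vec<i32> {
    let mut out: Vec<i32> = a.iter().chain(b.iter()).copied().collect();
    out.sort();
    // Postcondition 1: the result is non-decreasing.
    debug_assert!(out.windows(2).all(|w| w[0] <= w[1]));
    // Postcondition 2: no elements were gained or lost.
    debug_assert_eq!(out.len(), a.len() + b.len());
    out
}

fn main() {
    assert_eq!(merge(&[3, 1], &[2]), vec![1, 2, 3]);
}
```

Unlike an external test, the specification travels with the code and is exercised by every input the program ever sees in debug builds.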

BTW, a complicating factor is that verification techniques often don't extrapolate well from one application domain to another; a method that is effective for cryptography may not be effective for distributed systems and vice-versa.

littlestymaar
3 replies
1d1h

I think that's the normal dynamics when an expert discusses something with laymen in a particular subject.

But this is hubris, or at least arrogance, on your part: this is a very broad subject with many facets (reliability, security [1], language UX, teachability, productivity in heterogeneous teams, and so on), and most of the people who've debated with you are in fact more expert than you in at least one aspect of the topic. Your insistence on seeing them as mere "laymen" is what makes you unable to improve your understanding of the big picture, which comes from aggregating the wisdom of experts in topics complementary to yours.

Your inability to do so is likely the reason for your intellectual stagnation.

And this kind of antagonistic posturing is actually a problem for the whole world, because as you are proudly arguing about the “failures of sound methods” and the success of the alternative, the truth is that even if the real world deployment of TLA+ far exceeds the one of Coq, it's still pretty negligible in the grand scheme of things. And arguing that “but distributed systems do use TLA+ all the time” would be as pointless as saying “but Airbus is using Coq to make planes fly”.

In fact, none of the state of the art methods have gained widespread use, and this industry as a whole keeps pumping out tons of broken software every day without any sign of changing course.

The only successful (in terms of adoption) means of reducing the numbers of bugs are the following:

- static typing [2]

- unit testing

- fuzzing

All of them are pretty bad at their job on average, but together they avoid at least a billion dollars' worth of bugs every year, which is still much better than any of the technically superior alternatives that remain practically unheard of by most software developers. The jury is still out on what can actually work at scale.

Preaching for years on internet forums that your option of choice is the good one, despite its lasting failure to gain mainstream adoption, is as delusional as the posture you are criticizing, and at least those people are trying something instead of wasting their time being patronizing over the internet.

[1] Security is very different from reliability in that it has an adversarial nature: it doesn't matter if a bug is "less likely than hardware failure" on random input if an attacker can trigger it on purpose with high probability. This is, for instance, something you have been consistently overlooking over these years.

pron
2 replies
23h36m

TLA+ and Coq (at least as far as their use in software goes) are both examples of formal specification languages offering deductive proof systems (and TLA+ also offers a model checker -- yet another sound method). And I don't think you'd find me selling either one of them in this conversation, in which I mostly focus on the limitations of such sound tools.

Unit tests (depending how you interpret the properties of interest) and fuzzing are indeed probably the most famous examples of unsound methods, and so it sounds like you actually agree with me: it is unsound methods that have been more successful in practice except for the simple properties proven by simple type systems. But no matter what argument you find more or less convincing, you should try to listen and learn -- because it's clear that you don't understand the conversation -- rather than shout from the peanut gallery.

If you find the terminology unfamiliar, you can think of sound methods as "static" ones [1], i.e. they don't execute the program, while unsound methods are usually "dynamic", i.e. require executing the program [2]. Indeed, my point is that "static" and "dynamic" methods are most effective when used in combination, and it is the static methods that are more cost effective for simple properties -- their effectiveness drops as the properties become more "interesting" -- and it is the dynamic methods that are more cost-effective for more complex properties. It sounds like you agree.

Now perhaps you can understand my original point: a certain attribute of a programming language that makes certain sound (static) methods easier when verifying certain classes of properties doesn't help all that much if those methods are so limited to begin with for that class either in the range of properties for which they're effective or their scalability. I.e. making a method that, in the relevant situations, is 100x too expensive for your budget only 90x more expensive for your budget doesn't have much of an impact on its overall practicality.

[1]: This is more of a large overlap than an equivalence. A human or LLM reviewing code is also a static form of correctness assurance, but it isn't a sound one.

[2]: I won't confuse you with abstract interpretation and model checking, both of which are sound and can be viewed as either static or dynamic depending on your perspective. I consider them static as they don't rely on full concrete executions of the program. Some interesting unsound methods like concolic testing have some "static" parts, but I consider them ultimately dynamic.

littlestymaar
1 replies
22h44m

I only mentioned Coq vs TLA+ because I've read your blog and I know you have strong taste preferences between those two.

That's because, unlike you, I actually try and understand people I don't agree with and see where they come from and under what perspective their views are indeed interesting, and that's how you learn how to understand complex (not difficult) topics.

But that, my friend, requires that you don't immediately consider yourself superior to your interlocutors and instead first treat them as peers who have a different perspective on things because they have a different background. That's the inevitable condition for improvement.

Thinking of others as mere peasants is probably more intellectually comfortable, but it's also far less constructive. Needless to say, it's also not a great way to earn respect.

DEADMINCE
0 replies
6h11m

Your lecturing here seems to be making a lot of insulting assumptions unnecessarily. It's probably best to try and avoid giving 'advice' like this.

Animats
42 replies
1d10h

That's a good article.

Having worked on program verification in the past, Rust looks like the most useful modern language to which formal methods can be applied. Rust's rules eliminate so many of the cases that are hard to formalize.

Big remaining problems involve deadlock analysis, both in the thread sense and in the Rc/borrow sense. Those are somewhat equivalent. I'd like to have static deadlock analysis in Rust. If you had that, I think you could get safe back-pointers, too. If you can prove that all borrow and upgrade calls can't fail, you can eliminate most reference counts. That gives you free interior mutability when that's possible.
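A sketch of the runtime checks in question (assuming std's Rc/RefCell/Weak): both Weak::upgrade and RefCell::try_borrow_mut can fail at runtime, and it is exactly this count-and-flag bookkeeping that an analysis proving the calls infallible could compile away:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

fn main() {
    let node = Rc::new(RefCell::new(42));
    // A "safe back-pointer": it does not keep the node alive.
    let back: Weak<RefCell<i32>> = Rc::downgrade(&node);

    // Both checks below run at runtime today. If static analysis proved
    // they can never fail here, the reference count and borrow flag could
    // in principle be elided -- "free interior mutability".
    if let Some(strong) = back.upgrade() {            // fails if node was dropped
        if let Ok(mut v) = strong.try_borrow_mut() {  // fails if already borrowed
            *v += 1;
        }
    }
    assert_eq!(*node.borrow(), 43);
}
```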

The author's comment on provers is interesting. The big problem with theorem provers is that they're developed by people who like to prove theorems. They're in love with the formalism. This leads to a UI disconnect with programmers.

You can knock off most things you have to prove with a SAT solver. But you'll need something heavier for the hard problems. Coq is too manual. The author thinks ACL2 is too functional. Not sure what to do there, but I've been out of this for decades.

Machine learning may help in guiding theorem provers. Most programs aren't that original in terms of control flow and data usage. So inferring a proof plan from other code may work. ML systems can't be trusted to do the proof, but may take over guiding the process.

klabb3
16 replies
1d5h

Big remaining problems involve deadlock analysis, both in the thread sense and in the Rc/borrow sense. Those are somewhat equivalent. I'd like to have static deadlock analysis in Rust. If you had that, I think you could get safe back-pointers, too. If you can prove that all borrow and upgrade calls can't fail, you can eliminate most reference counts. That gives you free interior mutability when that's possible.

I’m not a formal verification guy, but I don’t think locks in the traditional sense are helpful. No compiler will give you anything useful to work with wrt locks. You just accept and live with the danger.

In fact, locks and ref counts are runtime constructs in Rust. Imo they “go against” even Rust’s own strong principle of single ownership - these things are shared mutable state. Arc broke Rust’s RAII model so much that they had to remove scoped threads (which need destructors to run before end-of-scope).

Anyway, rust aside. Global refcounts have cyclic and leak issues that again are global. I don’t think we can necessarily get rid of them, but I do believe containing them is better, eg with arena-style scoping.

As for locks my gut feeling is we need an asymmetric data structure, for instance a channel. In eg Go you can break a channel into sender and receiver, and have the sender goroutine do `defer close(ch)` which is guaranteed to run if the rest of thread finishes, even if there’s a panic. Doesn’t necessarily have to be channels but separating the roles (reader-writer / producer-consumer) is much easier to reason about, and that probably helps formal analysis too.
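The same role separation exists in Rust's std::sync::mpsc, where the channel is split into a Sender and a Receiver at construction, and dropping the Sender (including during panic unwinding) closes the channel for the Receiver. A rough sketch:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Roles are split at construction: tx can only send, rx can only receive.
    let (tx, rx) = mpsc::channel();

    let producer = thread::spawn(move || {
        for i in 0..3 {
            tx.send(i).unwrap();
        }
        // tx is dropped when this closure ends -- also during unwinding if
        // the thread panics -- which closes the channel, the analogue of
        // Go's `defer close(ch)`.
    });

    // The receiving iterator ends cleanly once the sender side is gone.
    let received: Vec<i32> = rx.iter().collect();
    producer.join().unwrap();
    assert_eq!(received, vec![0, 1, 2]);
}
```

Because each role is a distinct type with single ownership, there is no shared mutable lock state for the programmer to reason about, which is the property that makes the pattern friendlier to formal analysis.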

zozbot234
13 replies
1d4h

Scoped threads have been reintroduced to Rust since ver 1.63, released in August 2022. There's also significant interest in introducing true linear types to Rust, where destructors would be guaranteed to run - however the changes required for that are large and will probably have to be introduced in a new Rust edition.

the_mitsuhiko
11 replies
1d3h

however the changes required for that are large and will probably have to be introduced in a new Rust edition.

Or more likely: never. The odds of linear types making it into the language are nonexistent, from what I can tell.

CooCooCaCha
10 replies
1d2h

Unfortunately rust language development has practically ground to a halt. There is a trickle of work but it’s highly unlikely that major language features will happen any time soon.

estebank
5 replies
1d2h

I'm not sure what you mean. There are fewer flashy announcements for things that block people, but development continues unabated, with large refactors that enable things that weren't possible before. Things like Return Position Impl Trait In Traits, Derefs in patterns, a new trait system checker, enhancements to const eval, generators, and that's just off the top of my head. Not everything is gonna land soon, but development is happening. The major language feature work is happening; the features are just large enough or contentious enough that stabilizing them will take time.

CooCooCaCha
4 replies
1d

Oh please, they’ve landed a few nice things but larger items like generators are likely not coming soon.

Just look at the async roadmap for example, how many years has it been and how much progress has been made?

As someone who is firmly in the Rust community it’s annoying how everyone seems to know this but you can’t fucking talk about it. I can’t tell you how many times I’ve seen passive aggressive comments like “this would be much easier if we had variadics… maybe someday”

estebank
3 replies
23h27m

landed a few nice things but larger items like generators are likely not coming soon.

Generators got blocked on settling AsyncIterator/Stream. The work for it is happening. It's taken too long, but not because people are sitting on their thumbs.

Just look at the async roadmap for example, how many years has it been and how much progress has been made?

Hard things are hard and take time.

As someone who is firmly in the Rust community it’s annoying how everyone seems to know this but you can’t fucking talk about it.

You can say whatever you want to say, everyone can. But expect pushback from the people pulling the cart when you state that the cart isn't moving just because it isn't moving as fast as you want in the exact direction you want. Just take a damn glance at the review queue, or even the release notes. A ton of work is being done, at different levels of difficulty, and the things that aren't stabilized quickly are the ones with research-level open questions.

I can’t tell you how many times I’ve seen passive aggressive comments like “this would be much easier if we had variadics… maybe someday”

I don't know why you read that as passive-aggressive, but I certainly see your reply as dickish.

CooCooCaCha
2 replies
23h6m

If the cart is moving slowly I’m going to point out it’s moving slowly. If you want to take that personally that’s your problem, but I’d prefer to be honest about the situation.

The fact of the matter is significant language features have stalled. And comments like the one I quoted are passive-aggressive when you consider they're casting doubt on whether certain language features will ever see the light of day.

It’s especially disheartening when you don’t really hear much from the language team and it doesn’t seem like progress is being made on these language features.

klabb3
1 replies
19h34m

GGP here. I think you’re both right. As a formerly involved contributor I can assure you that there are legitimately hard problems, and that there are very bright people working on it.

However, the reason why it’s so hard, is imo because Rust matured itself so quickly, by basically having a bunch of 10x mega brains lay a foundation of insufficient but unchangeable features and then leave a lot of details to be solved later when (a) there are much more people and stakeholders involved and (b) there are much more promises made around governance, stability etc. And, as it turns out, laying the “remaining pieces of the puzzle” is harder, especially if you’re not allowed to move the initial pieces.

This in turn creates an incongruence for outside observers, where it appears like development is slower (which is true) because people are incompetent/lazy/prioritize badly (which is not true).

The mistake here, in my humble opinion, is moving ahead so quickly many years ago. Async is the perfect example of this, but as I alluded to earlier, I think allowing leaks was a similarly rushed decision, and that was much earlier. If I were to bet, people were excited about maturing the language for production, but, since there was sooo much uncharted territory, there were novel subtle challenges which during the frenzy appeared like they could be solved quickly with brainpower alone to meet the deadlines and keep the excitement going. But even the wrinkliest of brains can’t deduce the implications of 4th order effects down the line - you have to try it out and build real life things. You have to be humble, when everyone else isn’t.

CooCooCaCha
0 replies
14h49m

This basically confirmed what I expected, thank you. I noticed development started to slow after 1.0 and when the Rust foundation was created.

I think there was a post about how there would never be a rust 2.0 and the foundation seems to have driven a lot of people away.

the_mitsuhiko
3 replies
1d1h

The reason I think this has low chances of landing has less to do with the Rust developers than with the complexity of the feature and its invasiveness to the ecosystem.

CooCooCaCha
1 replies
1d

Yeah, like I said it’s a trickle at this point.

the_mitsuhiko
0 replies
6h28m

You're misunderstanding the problem entirely. Even if you could implement it overnight, it would be unlikely to ship.

littlestymaar
0 replies
22h55m

I mostly agree with your sentiment, but at the same time I remember being quite skeptical of the possibility of Rust getting async/await at all, because I saw no way around the lack of immovable types in the language, and I thought it would be way too disruptive to add them. And then Pin came and allowed us to work around that issue (don't get me wrong, Pin is a dirty hack and I hate it, but at least it works, even if it makes the ergonomics questionable for some use-cases…).

Since then I reserve my judgment on stuff I think to be impossible.

klabb3
0 replies
1d2h

Scoped threads have been reintroduced to Rust

Right, but those are a mere shadow of what they used to be, which was JoinHandle with a lifetime parameter. You can’t build a thread pool or async runtime with static borrowing across tasks, which makes the use case very narrow.

There's also significant interest in introducing true linear types to Rust, where destructors would be guaranteed to run

I’m glad to hear that. I’ve always hoped it’s possible to introduce them in a non-breaking way through a new auto trait, but there are many delicate details around unwinds and such that may be gnarly. As well as an accumulated dependence on Arcs in library code.

Animats
1 replies
10h54m

I’m not a formal verification guy, but I don’t think locks in the traditional sense are helpful. No compiler will give you anything useful to work with wrt locks. You just accept and live with the danger.

Deadlock analysis is well understood. You have to prove that locks get locked in the same order regardless of the path taken. That can be easy or hard. In many cases, if you have the call graph annotated with where locks are set and cleared, you can try to prove that the program is deadlock-free. I'd like to see Rust tools to help with that.
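One common way to make "locks are locked in the same order regardless of the path taken" true by construction is to pick a stable global ordering for acquisitions. A minimal sketch (the address-ordering trick and `lock_pair` are illustrative, not from the thread):

```rust
use std::sync::{Mutex, MutexGuard};

// Acquire two mutexes in a canonical order (by address), so that no matter
// which argument order callers use, the underlying acquisition order is
// identical and the classic two-lock deadlock cycle cannot form.
fn lock_pair<'a, T>(
    a: &'a Mutex<T>,
    b: &'a Mutex<T>,
) -> (MutexGuard<'a, T>, MutexGuard<'a, T>) {
    let (first, second) = if (a as *const Mutex<T> as usize) < (b as *const Mutex<T> as usize) {
        (a, b)
    } else {
        (b, a)
    };
    (first.lock().unwrap(), second.lock().unwrap())
}

fn main() {
    let m1 = Mutex::new(1);
    let m2 = Mutex::new(2);
    // Both call orders acquire in the same underlying order.
    let (g1, g2) = lock_pair(&m1, &m2);
    assert_eq!(*g1 + *g2, 3);
    drop((g1, g2));
    let (g1, g2) = lock_pair(&m2, &m1);
    assert_eq!(*g1 + *g2, 3);
    println!("same acquisition order on every path");
}
```

A tool proving deadlock-freedom over an annotated call graph is essentially checking that every path respects some ordering like this one.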

klabb3
0 replies
5h57m

Right, but that’s a global property of everything that uses the lock, so if you’re writing a library you'd have to run the analysis against real or synthetic call graphs. Plus it’s very easy to accidentally break that property if you’re not using the lock in an analysis-friendly way, no? Instead, role-separated data structures like e.g. channels, run-once functions and atomic value swaps are, due to their constraints, harder to use in ways that break analysis, and can be analyzed locally.

Say eg my library is acting as a producer. Ideally I can check that I am producing 0-N items and that it closes the channel. Analyzing the consumer can be deferred and decoupled, which is good because it may not exist yet.
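A minimal sketch of that producer contract using std's mpsc channel (`produce` is a made-up library function): the "closes the channel" part is just dropping the sender, which is locally checkable without ever seeing the consumer.

```rust
use std::sync::mpsc;
use std::thread;

// Library-side producer: sends 0..n items, then closes the channel by
// letting `tx` drop. This is the locally analyzable contract described
// above; the consumer can be written and analyzed separately.
fn produce(tx: mpsc::Sender<u32>, n: u32) {
    for i in 0..n {
        if tx.send(i).is_err() {
            break; // consumer went away; we just stop producing
        }
    }
    // tx dropped here => channel closed (the postcondition).
}

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || produce(tx, 3));
    // Iteration terminates because the producer closed the channel.
    let items: Vec<u32> = rx.iter().collect();
    assert_eq!(items, vec![0, 1, 2]);
    println!("{:?}", items);
}
```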

rtpg
10 replies
1d10h

I think that Coq/Agda/Lean and friends are really going to be the winners in the proof space. Interactivity is a pretty good model for the feedback loop, and they're systems that exist and work, without a lot of asterisks.

The biggest thing I think these tools are missing out of the box is, basically, "run quickcheck on my proof for me". It's too easy to have some code, and try to prove some property that isn't true, and rip your hair out thinking "why can't I get the proof working!"

If halfway through your proof you end up in a thing that seems nonsensical, it would be nice for these tools to just have a "try to generate a counter-example from here" command that spits out a thing.

Proofs are so path-dependent, and it's not trivial to do in general. But I think they are all very close to greatness, and every time I've gone through the effort to formally verify a thing, it's been pretty illuminating. Just... if you're trying to prove some code, the process should take into account the idea that your code might be buggy.
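A minimal sketch of that "try to generate a counter-example from here" workflow in plain Rust, with no proof assistant involved (`property` and the candidate set are invented for illustration): before sinking time into a proof, brute-force a biased candidate set for a refutation, quickcheck-style.

```rust
// A plausible-looking but false claim: incrementing always increases the
// value. It fails at i32::MAX, where wrapping_add(1) wraps to i32::MIN.
fn property(x: i32) -> bool {
    x.wrapping_add(1) > x
}

// Return the first candidate that refutes the property, if any.
fn find_counterexample(candidates: impl IntoIterator<Item = i32>) -> Option<i32> {
    candidates.into_iter().find(|&x| !property(x))
}

fn main() {
    // Edge cases first, the way quickcheck-style generators bias input.
    let edge_cases = [0, 1, -1, i32::MIN, i32::MAX];
    let cx = find_counterexample(edge_cases);
    assert_eq!(cx, Some(i32::MAX));
    println!("counterexample: {:?}", cx);
}
```

Finding `i32::MAX` here in seconds is exactly the feedback you want before spending an evening wondering why the proof won't go through.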

pjmlp
7 replies
1d10h

I also quite like Idris. The problem with these languages is that they require being a level 10 Mage, while most folks are still coming to terms with the fact that magic exists.

kccqzy
3 replies
1d2h

Most programmers don't have enough previous experience proving things so this is unnatural for them. It is sadly impossible to require programmers to have been a math major to get them the mental facility to work with these languages.

trealira
2 replies
22h50m

At my university, all computer science majors have to take a course in discrete math, which involves writing a couple of different types of proofs: proof by induction (including strong induction) and proof by contradiction. However, that was the only class I've ever taken where I had to write proofs regularly.

Anyway, I think that, at a minimum, at least one class like that should be part of every CS major's education, if it isn't already. That way, proofs are less intimidating, and things like Coq and Isabelle/HOL are less intimidating.

kccqzy
1 replies
22h9m

I agree with you, but the ship has sailed to retroactively require all programmers to have such experience. Also, the typical introductory discrete math course likely doesn't require writing proofs in a language precise enough to be formalized. I would think that we need at least one more introductory course to logic (first order logic, Peano axioms) and an introductory course to type systems (structural induction, progress and preservation, etc).

trealira
0 replies
21h49m

We also learned some first order logic, enough to write the proofs in formal language, though we didn't learn anything about type systems.

You're probably right about the ship having sailed, though. It's probably harder to get people to learn new things when they haven't been in school for a long time.

rtpg
1 replies
1d7h

I found the onboarding to be super rough… I think it’s the sort of thing where you really want to do a full day workshop on it to get started.

The interactive elements really can’t be done justice in looking at readme’s or textbooks alone IMO

Tainnor
0 replies
1d6h

In my case (Lean), I've been getting a lot of help through the Zulip community. There's a dedicated #new_members stream where anybody can ask total beginner questions. It's quite humbling actually when you have someone like Kevin Buzzard answering your noob questions.

I'm sure other provers (Coq, Idris, etc.) also have online communities where you may be able to ask questions.

Tainnor
0 replies
1d6h

I'll have to agree with this. In particular, there always seems to be some stage where you'll have to think carefully about how the simplifier works exactly, about different types of equality (syntactic vs. definitional vs. provable), etc. And changing internal representations of objects will also mean that old proofs might not compile anymore. And tactics are often rather opaque blackboxes.

I say this as a non-expert who has been enjoying writing some Lean recently and has previously also written some Coq and Idris.

themulticaster
0 replies
1d7h

Isabelle/HOL has Quickcheck, which is precisely what you think it is. Although it only works if the free variables in the thesis you'd like to prove are sufficiently simple AFAIK (think integers or lists). The more powerful alternative to Quickcheck is Nitpick, which is a purpose-built counterexample finder.

The workflow you'd like already exists one to one in Isabelle: You can simply type "nitpick" halfway through a proof and it will try to generate a counterexample in that exact context.

Animats
0 replies
1d8h

"try to generate a counter-example from here"

For the theories behind some SAT solvers, that's quite possible. There's a set of problems that are completely decidable: those composed of add, subtract, multiply by constants, logic operators, and the theory of arrays and structs. That covers most things which cause programs to crash and many of the things you want to write in assertions.
xavxav
7 replies
1d10h

You can knock off most things you have to prove with a SAT solver. But you'll need something heavier for the hard problems. Coq is too manual. The author thinks ACL2 is too functional. Not sure what to do there, but I've been out of this for decades.

Agreed, though I think Lean is making big progress in the UX front of ITP verification through its extensive meta-programming facilities.

My issue with these tools applied to verifying external languages like Rust, is that the proof is not written in the target language, forcing your developers to learn two languages.

I have recently been thinking about what a "verification aware Rust" would look like, colored by my experience writing Creusot, the work on Verus and Aeneas and my time in the lab developing Why3. I think building such a language from the ground up could provide a step change in ease of verification even from Rust, particularly with regard to those "hard parts" of the proof.

GregarianChild
6 replies
1d8h

"verification aware Rust" ... building such a language from the ground up could

Could you sketch in a few bullet point what you think is missing and how to fix the gaps?

In my experience a core problem is that adding rich specifications slows down rapid evolution of programs, since they proliferate transitively along the control flow graphs. We can see this most prominently with Java's checked exceptions, but most effect annotations (to give an example of simple specification annotations) suffer from the same cancerous growth. This is probably at least partly intrinsic, but it could be improved by better automation.

xavxav
5 replies
1d6h

Could you sketch in a few bullet point what you think is missing and how to fix the gaps?

Not yet, I am planning on writing some blog posts about it but there are still enough fuzzy points I don't want to share yet.

... since they proliferate transitively along the control flow graphs. ... suffer from the same cancerous growth.

This is one of the biggest and most difficult issues of verification, I fully agree. The 'promise' of verified code is that you don't need to understand the code, you can just look at the smaller, higher-level spec, but right now verified code has a specification & proof overhead > 1, it needs to drop to < 0.1 for it to become economical.

Animats
3 replies
1d

Yes, keep plugging on that one. Verification statements should be in the same language as the program, and be syntax and type checked on every compile. Not in comments. That keeps them consistent with the code.

weinzierl
2 replies
1d

Sorry that this is a very basic question but I never had any of this in uni and I find it super exciting:

If I am trying to prove something I can state in my target language how would statement and proof differ?

If the thing to prove is not expressible in the target language (e.g. the fact that a function terminates) I would have to use a separate language anyway.

Could you give an example of how this could hypothetically look (preferably in pseudo-Rust)?

opnitro
1 replies
21h29m

Different user, but sure! Three examples:

1. You might have two algorithms for computing the same thing, one simple and obviously right, and one complex and optimized. So you want to state that they always produce the same thing.

  fn my_complex_algo(x: Input) -> o:Output ensures o == simple_algo(x)

2. You might have high level "properties" you want to be true about your program. For example, a classic property about type-checkers is that they are "sound" (meaning that if you say a program is well typed, then when you run it it must be impossible to get a type error).

3. You might have really simple properties you want to prove, for example that an "unreachable!()" is indeed actually unreachable.
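The `ensures` clause in example 1 is pseudo-syntax a verifier would discharge for all inputs. As a hedged sketch in ordinary Rust (function names invented for illustration), the best you can do without a verifier is spot-check the equivalence dynamically:

```rust
// Simple and obviously right: sum 0..=n directly.
fn simple_algo(n: u64) -> u64 {
    (0..=n).sum()
}

// Complex and optimized: Gauss's closed-form formula.
fn my_complex_algo(n: u64) -> u64 {
    n * (n + 1) / 2
}

fn main() {
    // A verifier would prove `ensures o == simple_algo(x)` for every input;
    // plain Rust only lets us check a handful of them.
    for n in [0, 1, 2, 10, 1000] {
        assert_eq!(my_complex_algo(n), simple_algo(n));
    }
    println!("spot checks passed");
}
```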

Animats
0 replies
10h58m

you might have two algorithms for computing the same thing, one simple and obviously right, and one complex and optimized.

Right. Those are the kinds of systems that are easy to specify concisely. Databases, for example, can be specified as giant arrays with linear search.

ratmice
0 replies
1d4h

To me at least, the most difficult part of verifying Rust code has been the lack of any pre/post conditions on the standard library; I quite often end up rewriting chunks of the std library, pulling them directly into my program so I can verify them.

I personally think it would be interesting if we had a nightly-only/unstable pre/post conditions in the std library. Then one could take the commit of the compiler that corresponds to a stable release and verify against that. That is the overhead I seem to be bumping my head against whenever I try to verify some rust code at least.
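One stopgap while std lacks such annotations, sketched with a hypothetical wrapper name (`checked_binary_search` is invented; the contracts are restated as `debug_assert!`s rather than anything a verifier consumes):

```rust
// Wrap a std call and state the pre/postconditions a verifier would want
// on std's own signature. The asserts document (and dynamically check)
// the contract; a verification-aware std would carry them natively.
fn checked_binary_search(slice: &[i32], target: i32) -> Result<usize, usize> {
    // Precondition: the slice must be sorted for binary_search to be meaningful.
    debug_assert!(slice.windows(2).all(|w| w[0] <= w[1]), "slice must be sorted");
    let r = slice.binary_search(&target);
    // Postcondition: on success, the returned index actually holds the target.
    if let Ok(i) = r {
        debug_assert_eq!(slice[i], target);
    }
    r
}

fn main() {
    assert_eq!(checked_binary_search(&[1, 3, 5], 3), Ok(1));
    // Err carries the insertion point that would keep the slice sorted.
    assert_eq!(checked_binary_search(&[1, 3, 5], 4), Err(2));
    println!("contracts held");
}
```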

MaxBarraclough
3 replies
1d8h

Machine learning may help in guiding theorem provers

Interesting idea. There would be real upside where the machine learning system gets it right and guides the prover to a valid proof, and little downside if it gets it wrong. (The prover won't derive an invalid proof, it will just fail to derive a valid one.) I don't think there are many applications of machine learning that have this property.

eru
2 replies
1d6h

(The prover won't derive an invalid proof, it will just fail to derive a valid one.) I don't think there are many applications of machine learning that have this property.

Don't basically all applications of any heuristic solver to NP problems have this property? (Including problems that are trivially in NP because they are already in P.)

MaxBarraclough
1 replies
1d5h

Yes, sure, I could have phrased that better. There are no doubt many possible applications of the idea. The recent headline-grabbing applications of machine learning tend not to be like this.

eru
0 replies
10h23m

Yes, agreed.

Though even amongst the recent-headline-grabbing stuff there are some areas that you can improve at by the equivalent of AlphaGo's self-play. Basically, when you have a slow and expensive option to use a weaker version of your system to produce better decisions / smarter data, and then use those to train the system.

zozbot234
0 replies
1d9h

The issue with SAT/SMT for program verification is it's too much like dark magic. It's acceptable if you can have it generate a proof certificate for the properties you care about and commit these with the Rust source code - but AIUI, generating usable proof certificates for 'unsat' proofs is a hard problem where you might be better off writing out the proof interactively.

nextaccountic
0 replies
1d1h

F* is this mythical language that proves stuff automatically (with an SMT solver, more powerful than SAT), but still lets you prove things manually if the automated solver fails.

The crypto routines in Firefox and WireGuard are actually written in F* (or more specifically Low*, a low-level DSL embedded in F*), not in Rust, and are fully verified.

https://project-everest.github.io/

https://mitls.org/

akira2501
12 replies
1d9h

The typical Rust program has either no refcells, or a very small number, just as it has either no unsafe code or a very small amount

Really? How are we quantifying this?

littlestymaar
10 replies
1d9h

I don't think anyone has actual measurements on this, but at the same time anyone who's written and read a significant amount of Rust[1] can tell you that unsafe and RefCell aren't something you encounter every day (or even every year when it comes to RefCell).

[1] for my part I've been doing it full time for the past 7 years, and also teach Rust at university.

the__alchemist
6 replies
1d4h

Mutex<RefCell<T>> is a common pattern in embedded for setting up globals that are accessed from a critical section.

steveklabnik
5 replies
1d4h

… sorry, I’m having my first coffee of the morning, so I’m probably missing something: what would putting a RefCell inside a mutex buy you? You’re already ensuring unique access with the Mutex, how would you ever get multiple accesses to the RefCell?

the__alchemist
4 replies
1d4h

I'm not sure! The critical-sectioned Mutex prevents the multiple accesses. Note that you generally have an Option inside the RefCell, so you can initialize it as a static, then assign to it later. Maybe that's related.

For copy types, you can use `Cell` instead. And of course, atomics for primitive types.

steveklabnik
3 replies
1d1h

Wouldn't you want the option outside? That's the usual pattern with "I want to assign later."

Anyway, I haven't seen this type used in the embedded projects I've worked on, do you have an example? Maybe I can suss out the reason from that.

the__alchemist
2 replies
1d

I don't have examples, as the OSS code bases I can find either A: Don't have practical examples, or B: Use RTIC or Embassy.

Here's what happens if you leave out the RefCell:

  TEST.borrow(cs).replace(Test {});

  error[E0596]: cannot borrow data in a `&` reference as mutable

If you use Option on the outside, I'm not sure how to then set or access the variable:

  static TEST: Mutex<Option<RefCell<Test>>> = Mutex::new(None);

What would you recommend as an alternative? Use case: Sharing state between interrupt handlers and similar.

steveklabnik
1 replies
21h45m

Ah! So this is an unfamiliarity issue on my part: I didn’t realize cortex_m::interrupt::Mutex has a different API than std::sync::Mutex or the various spinlock mutices. It deliberately only provides immutability because they want you to be able to choose how to do the interior mutability yourself. Now this all seems reasonable. Tricky!

(With hubris the interrupt stuff is abstracted away so you don’t need to access stuff this way, hence at least some of that unfamiliarity.)

the__alchemist
0 replies
21h35m

I appreciate the insight! Forgot to mention it was that Mutex. Of note, the syntax is kind of messy (Especially the access syntax, which I omitted), but it's not so bad with macros.
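For the record, a minimal single-threaded sketch of the pattern under discussion. `IrqMutex` here is a stand-in for `cortex_m::interrupt::Mutex`, which only ever hands out `&T` (and whose real `borrow` also takes a critical-section token, omitted here), so interior mutability has to come from the RefCell inside it:

```rust
use core::cell::RefCell;

// Stand-in for cortex_m::interrupt::Mutex: it only hands out shared
// references; mutation must go through the RefCell it wraps.
struct IrqMutex<T>(T);

// In cortex_m this is sound because borrows only happen inside a critical
// section; here it only backs a single-threaded demo.
unsafe impl<T> Sync for IrqMutex<T> {}

impl<T> IrqMutex<T> {
    const fn new(t: T) -> Self {
        IrqMutex(t)
    }
    fn borrow(&self) -> &T {
        &self.0
    }
}

struct Test {
    count: u32,
}

// Initialized as None at static-init time, assigned later (e.g. in main).
static TEST: IrqMutex<RefCell<Option<Test>>> = IrqMutex::new(RefCell::new(None));

fn main() {
    // Late initialization, then mutation through the shared reference.
    TEST.borrow().replace(Some(Test { count: 0 }));
    if let Some(t) = TEST.borrow().borrow_mut().as_mut() {
        t.count += 1;
    }
    let count = TEST.borrow().borrow().as_ref().map(|t| t.count);
    assert_eq!(count, Some(1));
    println!("count = {:?}", count);
}
```

As noted above, the real syntax gets messier once the critical-section token is threaded through, which is why people wrap it in macros.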

kalkr
2 replies
1d8h

What about Arc<Mutex<T>> ?

roca
0 replies
1d7h

In six years I wrote a lot of the 200K lines of Rust for Pernosco --- a fairly sophisticated application server for a SaaS debugger. There are 40 instances of the string 'Arc::new(Mutex::new('. A lot of them are shared caches and other innocuous things. There are 22 occurrences of 'RefCell::new('. There are other uses of Arc, and other uses of Mutex, but fundamentally the system has not collapsed into an arbitrary graph of mutable data.

Of course, one's mileage can vary. Most of our system is designed to be stateless at the top level, which helps.

littlestymaar
0 replies
1d7h

It's indeed more common than RefCell, but it's not like you're using them everywhere either (as using multiple locks is a recipe for deadlocks, so you should always be very mindful about how you use shared memory between threads).

I've run a quick analysis on the 1,145,666 lines of Rust code (excluding comments and blank lines) I have on my computer, which belong to several code bases I've worked on over the past few years (professional proprietary code, side projects and open source projects I just cloned to fiddle with), and here are the results:

- unsafe: 8247 matches, 1 every 138 lines (though keep in mind most of those aren't about mutable aliasing at all, like FFI or SIMD intrinsics)

- RefCell: 252 matches, 1 every 4546 lines

- Arc<Mutex<: 199 matches, 1 every 5757 lines

I think it's fair to say that “The typical Rust program has either no refcells, or a very small number, just as it has either no unsafe code or a very small amount”

ModernMech
0 replies
1d4h

Rust developers are loath to use RefCells when a mutable borrow does the trick. It's preferable to rearchitect your solution than to use a RefCell. RefCells are a last resort, just as with unsafe.

titzer
9 replies
1d4h

I don't really get the gushing over this post in the comments, it papers over entire fields of program analysis in a single breath. While I like Graydon and respect his point of view, this paragraph is hopelessly over-simplified.

Conversely, languages that have a GC have unfortunately often not felt it necessary to bother having strong local reasoning support. Java for example has a GC and also uncontrolled mutable aliasing, so weak local reasoning and consequently more-challenging formal verification. If you write to an object in Java every other variable referencing that object "sees" the change, so any logical formula that held over all those variables might be invalidated by the write unless you use some fancy method of tracking possible write-sets or separating the heap.

First of all, there have been several languages that have regions in their type system and are still backed by a GC, so this is a strawman. Take, for example, Pony. Second, there is an entire field of pointer analysis and escape analysis that can and does infer uniqueness and reason about whether two references can be aliases. Third, the whole point of static types is to slice apart the heap into independently non-aliasing parts using this technology called "classes" and "fields". We're not talking about JavaScript here; I think we should stop pretending that Java and C# and Scala and a bazillion GC'd languages don't have local reasoning about mutable state.

mrkeen
6 replies
1d3h

I think we should stop pretending like Java and C# and Scala and a bazillion GC'd languages don't have local reasoning about mutable state.

Pop-quiz: what does this print out?

  void myMethod(final Map<String, String> map) {
    map.remove("key");
    int oldSize = map.size();
    map.put("key", "val");
    int newSize = map.size();
    System.out.println(newSize - oldSize);
  }
EDIT: Apologies, I misread your double negative.

Nope. I reread it again and I'm still thoroughly confused. Java does have local reasoning, as opposed in particular to JavaScript, which does not?

vlovich123
5 replies
1d1h

Doesn't the answer depend on whether or not the map is being mutated by other threads concurrently?

To be fair, even in Rust that answer would depend on the specific map implementation - for example if the map internally held a lock to expose a thread-safe non-mut interface, there's a race between when you remove/insert and when you read the size. That suggests that the whole concept of local reasoning about mutable state gets pretty messy pretty quick unless you stick to very basic examples.

mrkeen
1 replies
23h55m

Doesn't the answer depend on whether or not the map is being mutated by other threads concurrently?

Yes. To reason about it here, you need to look elsewhere. Reasoning locally about your code doesn't tell you what happens.

To be fair, even in Rust that answer would depend on the specific map implementation

I'm not convinced that Rust would disappoint here - this example is the essence of what Rust promises to do safely (well, that and GC-less GC). Per tfa:

for a very, very long time the field has been held back by programming languages with too much mutable aliasing to be tractable.

And Rust makes this better?

Yes. Rust references are subject to the "shared-xor-mutable" rule

If you're reading and writing to your map, then no-one else is. It's the reward you get for pushing through the pain of lifetimes/ownership that other languages don't impose on you.

To be fair, I haven't touched Rust since the latest push for async programming... has it really just dropped 'share-xor-mutable' on the floor?
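For what it's worth, the pop-quiz rewritten against `&mut HashMap` in Rust (a sketch; names mirror the Java version). Under shared-xor-mutable, holding the `&mut` guarantees no other alias, thread or otherwise, can touch the map between the calls, so the printed value is locally decidable:

```rust
use std::collections::HashMap;

// With &mut, exclusive access is guaranteed for the whole call, so the
// remove/insert/size arithmetic below always yields 1 for a standard map.
fn my_method(map: &mut HashMap<String, String>) -> i64 {
    map.remove("key");
    let old_size = map.len() as i64;
    map.insert("key".to_string(), "val".to_string());
    let new_size = map.len() as i64;
    new_size - old_size
}

fn main() {
    let mut m = HashMap::new();
    m.insert("key".to_string(), "x".to_string());
    m.insert("other".to_string(), "y".to_string());
    assert_eq!(my_method(&mut m), 1); // "key" present: remove then re-add

    let mut empty: HashMap<String, String> = HashMap::new();
    assert_eq!(my_method(&mut empty), 1); // "key" absent: insert adds one

    println!("always 1");
}
```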

vlovich123
0 replies
23h0m

If you're reading and writing to your map, then no-one else is. It's the reward you get for pushing through the pain of lifetimes/ownership that other languages don't impose on you.

I think you have failed to engage with the thrust of what I said: if the map is Sync in Rust, your ability to reason locally about whether this piece of code works just by looking at this function goes out the window, because the map may be mutated by another thread between your mutation and your read of the size.

If you're reading and writing to your map, then no-one else is. It's the reward you get for pushing through the pain of lifetimes/ownership that other languages don't impose on you.

Only if you use mutable references. A lot of Sync code doesn't necessarily give you a mut reference to enforce that exclusion (e.g. lock-free data structures).

To be fair, I haven't touched Rust since the latest push for async programming... has it really just dropped 'share-xor-mutable' on the floor?

Async has nothing to do with it & share^mut is still correct. I'm trying to highlight that with Sync you can lose that guarantee, because the interfaces all become shared even if they mutate under the hood in a thread-safe way.

iknowstuff
1 replies
1d

Only if the map is Sync :) and also, if map is a &mut, not even another thread will modify it while you hold that reference.

vlovich123
0 replies
22h59m

Correct, I'm just highlighting that there are gaps in Rust's ownership model where local reasoning fails. They're fairly narrow compared to other languages, but you do have to know about them.

vlovich123
0 replies
22h45m

Ah - the author does call out that Rust can’t do local reasoning perfectly. The other place it comes up aside from the Sync example I describe is RefCell which similarly lets you mutate from behind a shared reference thus also violating local reasoning. The author also describes why Rust deviated here (the niche Rust was moving into with systems programming often needs an escape hatch for some small amount of non-local mutability)
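A tiny illustration of that escape hatch (names invented for the example): RefCell permits mutation through a shared reference, which is exactly the point at which purely local reasoning about `&T` stops holding.

```rust
use std::cell::RefCell;

// Mutation through a *shared* reference: callers holding only &RefCell<u32>
// can still change the state, which is the local-reasoning escape hatch
// described above (enforced dynamically instead of at compile time).
fn bump(counter: &RefCell<u32>) {
    *counter.borrow_mut() += 1;
}

fn main() {
    let c = RefCell::new(0);
    let shared: &RefCell<u32> = &c;
    bump(shared);
    bump(shared);
    // The owner observes changes made through shared references.
    assert_eq!(*c.borrow(), 2);
    println!("{}", c.borrow());
}
```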

zozbot234
0 replies
1d4h

...there is an entire field of pointer analysis and escape analysis that can and does infer uniqueness and reasons about whether two references can be aliases.

You can't do this automagically in the general case, you really do need something much like separation logic. (And of course the Rust borrow checking semantics can be seen as a simplified variety of separation logic.)

Classes and fields don't give you completely local reasoning about mutable state, because class/object A can end up depending on the mutable state of class/object B. And class inheritance as found in Java adds all sorts of further complexity of its own, especially as programs evolve over time.

thinkharderdev
0 replies
1d2h

First of all, there have been several languages that have regions in their type system and are still backed by a GC, so this is a strawman

Is it? He's not claiming that GC somehow makes it impossible to have strong local reasoning. He's just saying that, for whatever reason, the designers of most languages have made choices contrary to that goal. Which seems obviously true to me. Counter-examples exist of course but looking at the mainstream GC'd general purpose programming languages (Java, C#, Python, etc) it is true.

Third, the whole point of static types is to slice apart the heap into independently non-aliasing parts using this technology called "classes" and "fields"

The whole point according to who? "Object-oriented" design means many things to many people and from what I understand the original idea was gesturing at something that we would now call the "actor model" which legitimately does try and carve up the heap into non-aliased, owned data structures which can only be mutated through message passing, but OO as it is actually implemented in the real world seldom even approaches those original goals. And languages like Java do nothing to prevent you from storing mutable references to the same object in multiple different objects. This is drawn out more eloquently than I am capable of doing in this post https://without.boats/blog/references-are-like-jumps/

throwaway17_17
7 replies
1d10h

I really like that we are now two posts deep from a quote from a ~1973 paper by Hoare. I posted a really long comment in the original boats article's HN thread pointing out that skewing the quote towards a Rust-centric view (while understandable given boats' history with the language) was an artificial narrowing of the scope of Hoare's critique. Now Graydon has taken that narrow slice and used it as a jumping-off point to discuss a few interesting areas/design-points of Rust. While I still think my comment was correct, the ensuing discussion by Graydon in this blog was more than sufficient to overcome any technical gripes with the inciting article.

throwaway17_17
1 replies
21h34m

Thanks, how do you get the actual link for a particular post?

lolinder
0 replies
21h11m

Click on the timestamp in the comment (7 hours ago, or whatever). That'll take you to a page that has that comment as the root, so you can just copy that URL.

throwaway17_17
1 replies
1d10h

My original comment about boat’s post citing the Hoare paper:

After the opening quote by Hoare and a quick intro paragraph, the author says 'What Tony Hoare was writing about when he said that references are like jumps was the problem of mutable, aliased state.' withoutboats is a well known Rust developer, so I guess this assumption about Hoare's quote should not be a surprise. However, I don't see the text of the quote as reinforcing that perspective.

I can see an argument for the article's extrapolation: taking Hoare's seemingly more general lamentations about the semantic existence of references (a semantic group containing at least 'reference, pointer, or indirect address into the language as an assignable item of data') and then applying the Rust model of non-aliased, mutable state as an attempt to circumvent the problems Hoare is addressing. But that argument takes a narrowly scoped perspective on a problem, corrects that small slice of the greater problem, and then announces that the entirety of the problem no longer exists. Like I said, it's understandable why the article would take this direction; I just don't think it truly addresses the totality of Hoare's critique of references as an undesirable semantic abstraction.

The title of the section withoutboats quoted and its first sentence are unfortunately left out of the selection: "8. Variables: One of the most powerful and most dangerous aspects of machine code programming is that each individual instruction of the code can change the content of any register, any location of store, and alter the condition of any peripheral: it can even change its neighboring instructions or itself. Worse still, the identity of the location changed is not always apparent from the written form of the instruction; it cannot be determined until run time, when the values of base registers, index registers, and indirect addresses are known."

[This paragraph is essentially acknowledging that low-level machine code gains its computational power from its unlimited ability to alter the state of computation]

Also, the ... following the quote's second paragraph omits the following: "For example, in ALGOL 68, the assignment x := y; always changes x, but the assignment x := y+1; if x is a reference variable may change any other variable (of appropriate type) in the whole machine. One variable it can never change is x!"

[This quote, which was removed, makes it explicit that Hoare is addressing the mere existence of references]

Both of the omitted sections tend very strongly toward Hoare's actual critique being that the semantic concept of references in high level languages is problematic, not merely mutable state. There is some natural extension of Hoare's discussion of references as a means of assignment, which does lead to the 'spooky action' occurrences. However, following this section of Hints on Programming Language Design, Hoare talks for a bit about structured programming, scope, and parameter importance, discussing how, without references, programmers have disjoint names for semantic objects, and those are only altered/mutated by passing the sole existing object to a procedure as a parameter and having it passed back.

Overall, the TL;DR may be negatively stated as: Rust developer gonna view things through a Rust-y lens. However, I think that is an incorrect reading. withoutboats skipped a crucial step in going from Hoare's critique of references: they went directly from the text to an interpretation of the critique focused on aliased, mutable state. There is some discussion to be had about Hoare's assumption of a single semantic object existing in a one-to-one correspondence with a disjoint source code name, especially in the context of the multi-processor and networked programming prevalent in 2024.

While I think that a more general solution to Hoare's problem exists and acknowledge Rust's attempts to at least tame a portion of the problem, I don't think any language has 'fixed' this issue.

the_mitsuhiko
0 replies
1d10h

Both of the omitted sections tend very strongly toward Hoare's actual critiques being the semantic concept of references in high level languages being problematic, not merely mutable state.

Both sections however talk about mutations. Immutable references are not any different from inputs into a function.

rstuart4133
0 replies
20h19m

Wow, 1973. 50 years ago. A professional lifetime.

Hoare got so close. He found the problem, and in hindsight solving it with a type system that enforces shared xor mutability seems like a small step. Had he figured out back then that that small step was the solution, he would have saved us 50 years of enormous amounts of pain.
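The "shared xor mutability" rule can be seen directly in the borrow checker. A minimal sketch (my own illustration, not from the thread): any number of shared borrows may alias, or exactly one exclusive borrow may mutate, never both at once.

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    // Many shared (&) borrows may alias freely, because none can mutate.
    let a = &v;
    let b = &v;
    assert_eq!(a.len() + b.len(), 6);

    // Once the shared borrows are no longer used, an exclusive (&mut)
    // borrow is allowed; `m` is statically the only live reference to `v`.
    let m = &mut v;
    m.push(4);
    assert_eq!(v.len(), 4);

    // Uncommenting the lines below is rejected at compile time,
    // because a shared borrow cannot coexist with an exclusive one:
    // let c = &v;
    // let m2 = &mut v;
    // println!("{}", c[0]);
}
```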

pron
0 replies
1d6h

I really like that we are now two posts deep from a quote from a ~1973 paper by Hoare

Thing is that by the mid-nineties Hoare had already noticed that there might be something a little misguided with the very foundation of '70s-era focus on soundness: http://users.csc.calpoly.edu/~gfisher/classes/509/handouts/h...

Since then, more advanced unsound methods than simple tests have only made unsound methods pull ahead of sound methods even further (I've posted other comments on this page to that effect).

mst
3 replies
1d6h

One way to look at reference counting is as a sort of eager, optimized form of garbage collection that works in the case that you have strictly acyclic data (or can tolerate leaking cycles).

It's interesting to note that python operates with the refcounting + tracing hybrid described (and I know of production deployments where they force the tracing collector to e.g. only run once every N requests because python's never got that fast either).

perl, meanwhile, is pure refcounting but has weak references so you can make data acyclic from a refcounting POV at the cost of having to pay attention to which parts of the cyclic structure you keep references to (and for cyclic object graphs there are some tricks you can pull to move the weak link around so you don't have to pay such attention, but that would be a whole comment on its own).
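The break-cycles-with-weak-references approach described above has a direct analogue in Rust's own Rc/Weak. A minimal sketch (the Node/build_tree names are mine, purely illustrative): the child points back at its parent weakly, so the pair forms no strong cycle and is freed normally.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// A parent/child pair: the back-pointer is a Weak reference,
// so the structure is acyclic from a refcounting point of view.
struct Node {
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn build_tree() -> (Rc<Node>, Rc<Node>) {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(vec![]),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

fn main() {
    let (parent, child) = build_tree();
    // The child's back-pointer is weak, so it does not keep `parent` alive:
    assert_eq!(Rc::strong_count(&parent), 1);
    // `child` is held by our binding plus the parent's children vec:
    assert_eq!(Rc::strong_count(&child), 2);
    // The weak link can still be upgraded while the parent is alive:
    assert!(child.parent.borrow().upgrade().is_some());
}
```

Had the back-pointer been a strong Rc, dropping both bindings would leak the pair — exactly the "tolerate leaking cycles" case mentioned above.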

koka does refcounting in theory but tries to lift as much of it to compile time as possible in practice, plus if you do e.g. 'y = x + 1' and x has only a single reference to it and is guaranteed to not be used afterwards, it re-uses the storage for y and so can do in-place mutation.

nim, meanwhile, offers something called ORC, which is automatic reference counting plus an implementation of Bacon+Rajan's Recycler algorithm which is designed to collect -only- cycles in an otherwise refcounting based system and as such is really rather fast at doing so.

Coming back to Rust: There's a stop the world Recycler implementation (rather than concurrent as intended both by the original paper and the author's future plans if the itch returns) here - https://github.com/fitzgen/bacon-rajan-cc - released as the bacon-rajan-ccc crate from this fork - https://github.com/mbartlett21/bacon-rajan-cc - and the README for those contains an Alternative section linking to other experiments in the same area.

(I've yet to play with any of the rust versions but love Recycler sufficiently as a concept that I suspect if/when I encounter a situation where it might help I'll be going back and doing so)

If you're having trouble finding the Recycler papers, I stashed a copy of each under https://trout.me.uk/gc/ (and if you're a similar sort of strange to me, you might also enjoy the ones stashed under https://trout.me.uk/lisp/ ;)

mst
0 replies
10h20m

Oh, fantastic, a rust implementation that already hives it off into a separate thread.

Seems the README mentions neither 'Recycler' nor the authors so none of my searches found it; possibly I should file an issue suggesting the author adds those because this is (at the very least) way closer to what I was looking for.

Thank you!

mccr8
0 replies
20h47m

Firefox also uses reference counting plus a trial deletion based cycle collector to manage C++ DOM objects (and cycles through JS). In fact, Graydon was responsible for the initial implementation.

worik
2 replies
22h26m

I am impatient with formal verification of programmes.

Proving that a programme correctly implements a specification is interesting theoretically but of little practical use

Correct specification is as hard as correct programming, so the difficult problems are not solved, they are moved.

There are practical use cases for formal methods, but they are rarely encountered

nyssos
1 replies
21h54m

Correct specification is as hard as correct programming

A complete specification, sure, but that's not really the goal. The things we want to prove are generally just a few key properties: this function always terminates, that one always returns a sorted array, etc. And once you can do that you can impose these things as preconditions: e.g. you must demonstrate that an array has been pre-sorted before you pass it to this function.
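The pre-sorted precondition can be imposed with a type whose constructor is the only way to establish the invariant. A hedged sketch in Rust (the Sorted and contains names are illustrative, not from the thread): any function taking a Sorted statically knows the invariant holds.

```rust
// The only way to obtain a `Sorted` is through `new`, which sorts;
// possessing a `Sorted` is thus a proof of the sortedness invariant.
pub struct Sorted(Vec<i32>);

impl Sorted {
    pub fn new(mut v: Vec<i32>) -> Self {
        v.sort();
        Sorted(v)
    }
    pub fn as_slice(&self) -> &[i32] {
        &self.0
    }
}

// binary_search requires sorted input; the signature makes the
// precondition unforgeable rather than merely documented.
fn contains(haystack: &Sorted, needle: i32) -> bool {
    haystack.as_slice().binary_search(&needle).is_ok()
}

fn main() {
    let v = Sorted::new(vec![3, 1, 2]);
    assert!(contains(&v, 2));
    assert!(!contains(&v, 5));
}
```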

worik
0 replies
18h20m

Yes.

I am not against all formal methods. Pre- and post-conditions for loops, etcetera. When they are useful they should be used (in many cases they are built in: `for i = k; i < l; i++` has an initialisation, a loop condition, and a step all in one statement.)
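Those lightweight contracts can also be written out explicitly as runtime-checked assertions. A small sketch (sum_range is a made-up example, not from the thread): the pre- and post-conditions are debug_assert!s, and the loop invariant is recorded as a comment where a heavier tool would discharge it as a proof obligation.

```rust
// Sums the half-open range [k, l), with its contract spelled out.
fn sum_range(k: usize, l: usize) -> usize {
    debug_assert!(k <= l, "precondition: k <= l");
    let mut total = 0;
    let mut i = k;
    while i < l {
        // loop invariant: total == k + (k+1) + ... + (i-1)
        total += i;
        i += 1;
    }
    debug_assert!(i == l, "postcondition: loop ran to completion");
    total
}

fn main() {
    assert_eq!(sum_range(2, 5), 2 + 3 + 4);
    assert_eq!(sum_range(3, 3), 0); // empty range sums to zero
}
```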

What I am impatient with is the idea, very hot thirty or so years ago, that by using formal specifications and rules of inference we could eliminate bugs.

slushy-chivalry
0 replies
22h38m

I never get tired of reading articles about typestate pattern

shivanshu120
0 replies
1d9h

This is a very good article about Rust

kzrdude
0 replies
1d9h

Written on May 15th 2024, which was the 9th anniversary of Rust 1.0

eggy
0 replies
1d4h

I just posted a relevant comment on the F* article a minute ago. I would subjectively prefer F* because of syntax over Rust, but for the show control software we are developing, we are going with Ada/SPARK2014. To draw the crowd currently using Ada/SPARK2014, Rust needs a formal, published standard like the other legacy PLs, plus formal verification tools and some real-world safety-critical, high-integrity applications in its back pocket.

"I prefer F#/F* syntax, but I had to go with Ada/SPARK2014 for the safety-related control systems I am trying to verify formally and use for high-integrity applications. Rust is making some inroads with AdaCore and Ferrous Systems partnering on providing formal verification tools for Rust like they do for Ada/SPARK2014, but Rust still doesn't have a published standard like C, Common Lisp, Prolog, Fortran, COBOL, etc. Plus, the legacy is immense for Ada and SPARK2014."

bionhoward
0 replies
1d

I just wish we could write these compile-time type-guard kinds of things more simply, because when the trait bounds metastasize it can get hard to read these type-level programs

“Where <<<T as F>::Output as G>::Output as H>::Output: HList<Head = A, Tail = B> + Z

type Output = <<<T as F>::Output as G>::Output as H>::Output;”

Put some Cons<Cons<Cons<U8, Nil>>>> in there and then realize numbers like U8 (from TypeNum) is itself a nesting of Cons<Cons<Cons< of binaries and it just winds up feeling like you’re looking at a massive nesting of <<<>>>
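For what it's worth, the Cons/Nil machinery being complained about can be reconstructed in a few lines. This sketch uses my own illustrative definitions (not TypeNum's or frunk's actual code) just to show the flavor of type-level recursion that produces those nested `<<<>>>` towers:

```rust
// A minimal heterogeneous list encoded purely in types.
struct Nil;
struct Cons<H, T>(H, T);

// The "program" is a trait impl recursing over the type structure;
// LEN is computed entirely at compile time.
trait HList {
    const LEN: usize;
}
impl HList for Nil {
    const LEN: usize = 0;
}
impl<H, T: HList> HList for Cons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

fn main() {
    // (u8, &str, bool) as nested Cons cells.
    type List = Cons<u8, Cons<&'static str, Cons<bool, Nil>>>;
    assert_eq!(<List as HList>::LEN, 3);
}
```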

Consequently you can make progress in this area and get it to verify what you want, but the error messages can be massively complicated and divergent from the way the human developer writes their code (because of the “types vs values” difference producing wildly different error messages which are hard to customize) and the implementation can be a massive pain in the ass.

Developer experience here is surprising because it’s not “just functions” but rather combinations of functionalities with certain generics and associated types. You want to write a runtime style check but execute it at compile time, but to make that happen requires quite different code (in my experience)

All this to say, perhaps making type level comptime rust more simple and functional and readable is ultimately what would enable a lot of these formal analysis projects to provide a good experience for their maintainers and users

TLDR, wtb “comptime” Rust!

P.S. Are we ossified on Pin<&mut Self> for Future? I wish I could tinker with different underlying implementations of async/await but have no clue how to get started

Jaxan
0 replies
1d11h

I read the linked article by Boats and found it great. I was really amazed by the 50-year-old quote by Hoare; it's still relevant and very well put.