Seems to rely on at least one known compiler bug (admittedly one open since 2015) https://github.com/rust-lang/rust/issues/25860
It’s good to see that at least Miri catches the issues. Hopefully the compiler can catch up with that!
If you need a compiler sidecar anyways, why not just have a simpler language where all the borrow checking is done in a sidecar?
You don't "need" a compiler sidecar unless you're specifically writing code to exploit obscure bugs. This is a 0.0001% case
Yeah, or if you're using someone else's code that you're trusting not to have "obscure bugs" put there to smuggle in a vuln.
If you run somebody else's code and you don't trust that person, you need to read their code carefully anyway. Safe Rust code can do all kinds of shenanigans including:
* Wiping your disk
* Downloading and running code from the internet
* ...
Rust's safety guarantees don't exist to protect you from a malicious Rust programmer. They exist to protect you from mistakes a well meaning but fallible Rust programmer can make.
Rust's safety guarantees don't exist to protect you from a malicious Rust programmer.
This greatly ignores human behavior. A LOT of people who review Rust code are going to concentrate only on unsafe blocks (and that's if you're lucky), because Rust is advertised as "Rust gives you safety", and the word "safety" strongly suggests "don't worry about these things, they are safe". If the advertised safety is a false sense of safety, that's a problem.
Do you have any data backing that argument? I could just as well make the argument that Rust has an extensive "correctness above all else" culture that encourages people to thoroughly review pull requests. Without data both of these claims are just that: claims.
these people you're talking about would make the same mistake with code written in any language.
Nonetheless, it's far less of an issue than in any other language. You don't need 20 lines of crazy code to slip a serious security vulnerability into C, you can add one with a single character changed half of the time.
This is exaggerated re. C, but it is true that the important thing for practical security is whether memory safety is violated in practice, and obscure compiler bugs you have to go out of your way to try to exploit don't affect that. That being said, this bug is pretty bad, and it should be fixed.
I should point out that the reason the bug is difficult seems to mainly be because of backwards compatibility concerns. If compatibility weren't an issue, Rust could just ban function contravariance and nobody would care.
It's not; changing a == to != or a < to <= is very often enough to fully compromise something.
Miri is fantastic, but also isn't a "compiler sidecar," exactly. It is an interpreter. It's closer to a sanitizer than a compile-time check. It also means a huge, huge overhead to running your code, and in a performance-sensitive language that's a tough sell.
It's also only needed for when you dip into unsafe, which is not particularly often, and so imposing this on all code would be the wrong tradeoff.
Well libstd has unsafe all over the place. And as OP points out it catches this unsoundness issue in the language even though there's no unsafe anywhere, so it goes beyond just "dip into unsafe".
Sure, but the unsafe in libstd is some of the most tested unsafe in existence. And sure, there are some soundness bugs, but in my experience they are edge cases that rarely come up. I don't think I have personally ever run into one in the wild.
Regardless, it is not feasible for you to run your code under miri all the time, so you have to pick and choose where you use it, as a practical matter.
Yup, just highlighting that it goes beyond just runtime checking of the safety of unsafe code :)
Does polonius catch the issue?
It's weird to see lifetime correlations expressed as
&'a &'b
and not with where 'a: 'b
I suppose I don't understand enough of how the first is different from the second one, and how it impacts this issue.

It's just a reference to a reference. It has nothing to do with lifetime correlations.
Also, it's a lot less weird if you don't stop in the middle of the type. The entire type is for example
&'a &'b u8
or &'a &'b ()
So the outer reference has lifetime 'a and the inner reference has lifetime 'b.

It has nothing to do with lifetime correlations
Yes it does; there's an implied `'b: 'a` outlives relationship that's required for the type to be well-formed.
In fact, I believe that's exactly where this bug lies. You're effectively able to trigger a case in which passing `&'a &'b` without providing any correlation (`where 'a: 'b`) that one would normally be required to provide makes the compiler behave as if those correlations were passed, albeit inferred incorrectly.
Edit (since I cannot edit my comment after 2 hours): for the code to properly verify the lifetimes one needs to write
where 'b: 'a
meaning 'b lives at least as long as 'a. So 'b can live the same time, or longer, but not less.

The first one creates an implied bound, while the second is an explicit bound. The important thing about implied bounds is that they allow late binding of lifetimes to function pointers (or closures): that is, you can have a single function pointer type like "for<'a> fn(&'a str)" that can be called with any lifetime 'a. Here, the lifetime 'a is called "late-bound", since it can be different with each call.
In contrast, if a lifetime is subject to an explicit where bound, it must be "early-bound": each function pointer must choose one particular value for that lifetime, which must be upheld for every call. For a practical example, you might have tried to declare a closure with a &T parameter, only to get lifetime errors when you call it twice. This is because the lifetime in the closure type is early-bound and must be the same for every call. (Sometimes the compiler figures out that a late-bound lifetime is desired, but the rules are very subtle. This is also why it's difficult to write a function that accepts a generic async closure.)
Late binding is necessary for certain kinds of variance, which is a big part of this issue. Function pointers are contravariant in their parameters, which means if you have a type like "fn(&'short str)", you can cast it to "fn(&'static str)", since a 'static reference will always be valid for 'short. They're also covariant in their return type, so that "fn() -> &'static str" can be cast into "fn() -> &'short str". But when performing these variance transformations with late-bound lifetimes, the compiler doesn't always take into account the implied bounds in the source type properly, which allows you to perform casts that aren't actually sound.
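A small sketch of that variance in action (made-up function names), showing which coercions the compiler accepts:

```rust
// Sketch of fn-pointer variance with lifetimes (hypothetical names).
fn take_any(_: &str) {}                    // for<'a> fn(&'a str)
fn give_static() -> &'static str { "hi" }  // fn() -> &'static str

fn demo<'short>(s: &'short str) {
    // Contravariant parameters: a fn that accepts any &str can be used where
    // a fn accepting only &'static str is expected.
    let f: fn(&'static str) = take_any;
    f("static literal");

    // Covariant return type: a fn returning &'static str can be used where a
    // fn returning &'short str is expected.
    let g: fn() -> &'short str = give_static;
    let _ = g();
    let _ = s;
}

fn main() {
    demo("hello");
}
```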
Without compiler bugs, on stock Linux, you can achieve the same thing via /proc/self/mem. Rust makes this very easy because you can get the raw address from safe code, so no guessing is required which memory location to patch.
For the compiler bug, as someone who never strayed into those regions of Rust programming, it's not clear to me how likely it is that someone would write such code by accident and introduce a memory safety issue into their program. I browsed the Github issues, but couldn't find an indicator how people encountered this in the first place.
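For the /proc/self/mem point, a minimal sketch (Linux only, not code from cve-rs): getting the address is entirely safe Rust, and the actual write is done by the kernel on the program's behalf.

```rust
// Safe Rust overwriting its own memory via /proc/self/mem (Linux).
use std::fs::OpenOptions;
use std::io::{Seek, SeekFrom, Write};

fn main() -> std::io::Result<()> {
    let x: u8 = 1;
    // Creating and casting a raw pointer needs no `unsafe`; only dereferencing would.
    let addr = &x as *const u8 as u64;

    let mut mem = OpenOptions::new().write(true).open("/proc/self/mem")?;
    mem.seek(SeekFrom::Start(addr))?;
    mem.write_all(&[42])?; // the kernel writes over `x` behind the compiler's back

    println!("{x}"); // may print 42, even though `x` was never mutated in Rust
    Ok(())
}
```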
Without compiler bugs, on stock Linux, you can achieve the same thing via /proc/self/mem.
The documentation addresses this case specifically: https://doc.rust-lang.org/stable/std/os/unix/io/index.html#p...
"Rust’s safety guarantees only cover what the program itself can do, and not what entities outside the program can do to it. /proc/self/mem is considered to be such an external entity..."
The /proc/self/mem thing shouldn't be a problem because you won't run into it accidentally. What I'm wondering is whether this compiler bug is in the same category, or whether a Rust programmer is somewhat likely to run into this bug (or other I-unsound compiler bugs) by accident if their approach is more or less “tweak things until they typecheck and pass the borrow checker”.
I have been programming in Rust since late 2012. I don’t think I have ever run into a soundness bug like this in the wild.
That doesn’t mean that it never happens, but it is a very rare occurrence, not something that Rust programmers deal with regularly.
What a wonderful license! https://github.com/Speykious/cve-rs/blob/main/LICENSE
0. You just DO WHATEVER THE FUCK YOU WANT TO as long as you NEVER LEAVE A FUCKING TRACE TO TRACK THE AUTHOR of the original product to blame for or held responsible.
A thing of beauty ^_^ I've seen a lot of licenses that require attribution[0], so a license that explicitly demands that you not credit the author is a lovely twist :)
[0] Read: Nearly all of them; even permissive BSD-like licenses usually(?) require that much.
I could get behind that type of short, clear legalese.
The real question is: what is the chance that somebody runs into this while implementing something common.
It's not about frequency; it's about the possibility that just one of these unsoundness bugs gets through to production and screws things up. The whole point of memory-safe languages is that, unless you (or a dependency author) dive into the dark arts of whatever language you're working in, you are guaranteed not to encounter use-after-free or null-pointer dereferences. Practically, maybe such a bug gets patched quickly, and no one exploits it as a vulnerability. In hindsight, then, it wasn't so bad. But you don't want to be the one who gets called at 2 AM because an impossible segfault happened and wrecked the database.
What’s your shell prompt config that you screenshotted in the README?
looks like https://starship.rs/
GLWTS(Good Luck With That Shit) Public License
The author has absolutely no fucking clue what the code in this project does. It might just fucking work or not, there is no third option.
lol
Sounds like an updated version of WTFPL
Are you the safest because you're Rust? Or are you Rust because you are the safest?
Oh, cool. They implemented a `download_more_ram()` function![0]
Does it crash safely as well? I did not test it; I have more than 640 KB of RAM.
- [0] https://github.com/Speykious/cve-rs/blob/d51f52dd64f148a086e...
Looking at the syntax in that bug report makes me think I could never learn rust.
`&` is a reference.
`'a` is a label for a memory area, like goto labels in C but for data. You can read it as (memory) pool A. Roughly, when a function is entered, a memory pool is created, then destroyed at exit.
`'static` is a special label for static data (embedded constants).
`()` is nothing, like void in C.
`&()` is a reference to nothing (an address), like `void const *` in C.
`&&()` is a reference to a reference to nothing, like `void const * const *` in C.
`&'static &'static ()` is like a `void const * const *` pointing to built-in data in C.
`&'a &'b ()` tells the compiler that the second reference is stored in pool 'a, while the data is stored in pool 'b. (The first reference is in the scope of the current function.)
`_` is a special prefix for variables that tells the compiler to ignore the warning about an unused variable. Just `_` on its own is valid too (as a discard pattern).
Let's say that `&'a` is from a function `a()`, while `&'b` is from a function `b()`, which is called from the function `a()`. The trick here is that we relabel a reference from pool `'b`, from the inner function, to pool `'a` from the outer function, so when the program exits from function `b()`, the compiler will destroy memory pool `'b`, but will keep the reference to the data inside it until the end of function `a()`.
This should not be allowed.
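A tiny sketch of the signature shape in question (made-up name), showing the implied bound that the whole trick abuses:

```rust
// For `&'a &'b ()` to be a well-formed type, the inner reference must live at
// least as long as the outer one, i.e. there is an implied bound `'b: 'a`.
fn relabel<'a, 'b>(_proof: &'a &'b (), data: &'b u8) -> &'a u8 {
    // Returning `data` at lifetime 'a is only accepted because the compiler
    // infers `'b: 'a` from the first argument.
    data
}

fn main() {
    let unit = ();
    let inner = &unit;
    let value = 7u8;
    let r: &u8 = relabel(&inner, &value);
    println!("{r}");
}
```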
Thank you very much for spending time explaining this. It makes more sense with that kind of description. I'm still not sure how long it would take someone like myself to be comfortable and proficient with the syntax.
I guess stated another way, I don't generally have issues reading code from a wide swath of languages. If someone plopped me in front of a rust codebase I'd be at the mercy of the manual for quite a long time.
Thank you again, sincerely.
Rust decided to spend very little budget on syntax. The issue, though, is that where it took said syntax from is very wide.
& is used for references in a lot of languages.
() is a tuple, same syntax as Python
'a isn't even novel: it was taken from OCaml, which uses it for generic types. Lifetimes are generic types in Rust, so even though it's not for an identical thing, it's related enough to be similar.
_ to ignore a name for something has a long tradition in programming, sometimes purely as a style thing (like _identifier or __identifier), but sometimes supported by the language.
fn name(args) -> ret {} as function syntax is not SUPER unusual. The overall shape is normal, though the choice between "fn," "fun," "func," or "function" varies between languages. The -> comes from languages like Haskell.
<> for generics is hotly debated, but certainly common among a variety of languages.
the "static name: type = value;" (with let instead of static too) syntax is becoming increasingly normalized, thanks to how it plays with type inference, but has a long history before Rust.
So this leads to a very interesting thing, where like, it's not so much that Rust's syntax is entirely alien in the context of programming language syntax, but can feel that way unless you've used a lot of things in various places. And that also doesn't necessarily mean that just because it's influenced from many places that this means it is coherent. I think it does pretty good, but also, I'd refine some things if I were making a new language.
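To make that concrete, here's a small made-up snippet using the pieces above:

```rust
// Made-up example combining the bits of syntax mentioned above.
static GREETING: &str = "hello";                     // `static name: type = value;`

fn first<'a, T>(items: &'a [T]) -> Option<&'a T> {   // fn, <> generics, ->, &, 'a lifetime
    let _count = items.len();                        // `_` prefix silences the unused-variable warning
    items.first()
}

fn main() {
    let pair = (GREETING, 1u8);                      // () tuple, same syntax as Python
    println!("{:?}", first(&[pair]));
}
```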
Which parts would you change?
About a year ago I wrote this: https://steveklabnik.com/writing/too-many-words-about-rusts-...
I do think there's some good criticism of doing this, though, and so even a year later it's not clear to me it's a pure win.
I am sympathetic to the vague calls to action by Aria et al. to improve the syntax for various unsafe features. I understand why we ended up where we ended up but I think overall it was a mistake. I am not sure I agree with her specific proposals but I do agree with the general thrust of "this was the wrong place to provide syntactic salt." (my words, not hers, to be clear)
Ideally `as` wouldn't be a thing. Same deal, this is just one of those things that's the way it is due to history, but if we're ignoring all that, would be better to just not have it.
I am still undecided in the : vs = debate for structs.
I am sad that anonymous lifetimes in structs died six years ago, I think that would be a massive help.
Probably tons of other small things. Given the context and history, I think the Rust Project did a great job. All of this is very minor.
FWIW, as a high performance C++ dev who likes Rust but considers unsafe the biggest issue by far, it's encouraging to see important folk within the project believe there are issues. Too often as a relative Rust outsider it feels like the attitude is "but the unsafe situation is okay because you'd barely ever actually need it!". Hope that unsafe continues to improve!
I appreciate the kind words, but haven't been a part of the project for a while now. I hope they share this opinion, but I don't know.
Ah okay, didn't know that. Good luck wherever you are!
That blog post says that "Rust doesn’t have named parameters, and possibly never will". Could you clarify why that is the case? In my experience, named arguments in all but the simplest - i.e. unary and some binary - function calls give a massive boost to readability. And for Rust specifically, they would also provide an easy, unambiguous way to "overload" functions, similar to Swift (I'm putting "overload" in quotes here because it's not really overloading, as parameter names simply become part of the function name). So, why weren't they considered, or if they were considered, why were they rejected so emphatically that you don't anticipate them showing up even in the future?
Part of it is that it's not just named parameters: you should have a good story for default parameters too, and I think one or two others? Variadics feel similar but separate. Oh, and anonymous structs would allow you to emulate them while not adding the feature directly.
Therefore, it ends up being a really huge thing, with lots of design space. This is combined with the fact that not everyone sees this as a good thing.
Being slightly controversial, plus being really large, plus there being a lot of other things to do, would make me surprised if they ever land.
Here's a link from eight years ago with a link to lots of other related proposals: https://internals.rust-lang.org/t/pre-rfc-named-arguments/38...
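For completeness, a sketch of the usual workaround today (hypothetical API): an options struct plus Default gets you most of the ergonomics of named and default parameters.

```rust
// Hypothetical API using the options-struct pattern.
#[derive(Default)]
struct ConnectOptions {
    timeout_ms: u64,
    retries: u32,
    keep_alive: bool,
}

fn connect(host: &str, opts: ConnectOptions) {
    println!(
        "{host}: timeout={}ms retries={} keep_alive={}",
        opts.timeout_ms, opts.retries, opts.keep_alive
    );
}

fn main() {
    // Field names act like named arguments; `..Default::default()` fills in the rest.
    connect("example.com", ConnectOptions { timeout_ms: 500, ..Default::default() });
}
```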
The = on functions idea is particularly interesting by comparison with Scala, which has shifted towards requiring = over time.
This is not a representative sample of Rust. That's explicitly triggering edge cases which requires abuse of syntax you wouldn't normally see.
Check out this for something more realistic that anyone should understand https://github.com/ratatui-org/ratatui/blob/main/examples/ca...
You will never actually encounter code like this in practical use.
And even if not, this example isn’t particularly hard to decompose and understand once you have a basic grasp of the independent underlying bits (generics, lifetimes, borrows). It’s just combining all of them pathologically into an extremely terse minimal reproduction.
It’s like saying you could never understand Java because someone linked to an AbstractClientProxyFactoryFactoryFactoryBean.
I write rust professionally and really like it, but I do think that its noisy syntax is a flaw. I'm not sure how I would clean this up, but it really can be inscrutable sometimes.
To be fair, even for someone who is comfortable in Rust, the example is definitely on the rather confusing side of the language. (Not entirely surprising--it's a bug about a soundness hole in the language, which invariably tends to rely on overuse of less commonly used features in the language).
In particular, explicit use of lifetimes tends to be seen as somewhat more of a last resort, although the language requires it more frequently than I like. Furthermore, the soupiest of the syntax requires the use of multiple layers of indirection for references, which itself tends to be a bit of a code smell (just like how in C/C++, `T **` tends to be somewhat rare).
As an anecdote, it took me roughly a year. I tried it several times. I still use .clone() a lot, but it is getting better. :)
The real question is what the use of Rust would be for you. Do you work on anything where Rust could be of value?
If memory safety needs this level of fuckery to get going, it's not worth it for vast majority of software. Maybe stick to a GCed language until we can come up with a sane way to do it. These examples look cryptic as hell.
It's cryptic because it is a reduced test case to show off a bug. 99.9% of code doesn't look like this.
Just because "int ((foo)(void ))[3]" is a valid C declaration doesn't mean that all of C code looks like that.
This is basically the Rust equivalent of the obfuscated C code contest and indicates little to nothing about the actual Rust people write.
C++ has the same level of fuckery without memory safety. I don't think there is an extreme level of fuckery in Rust. I wish they got more influence from the ML languages than C++ but it is not unbearable.
It's a level of fuckery required to trigger a compiler bug. You don't see code like this in real code bases.
I think that's the best description of a lifetime I've seen so far.
That is not syntax you would ever see.
I wrote pretty basic rust stuff and had to use lifetime syntax quite a bit. If you are just forcing deep copies all the time, you might not see it often, but if you're just going to deep copy everything always, you’re probably going to be better off with another language to be honest as you’re tossing away much of the benefit of using a language like rust anyway.
I rarely use lifetimes and I don't deeply copy everything. This tool is probably the fastest in its class - does this code look like it has a lot of lifetimes, cryptic syntax, or deep copying?
- https://github.com/pkolaczk/latte/blob/main/src/main.rs
- https://github.com/pkolaczk/latte/blob/main/src/exec.rs
There was one fundamental "aha" moment for me when it clicked: move semantics. Once I learned it, suddenly 99% of stuff became simple (including making async quite nice really, contrary to popular HN beliefs).
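For anyone who hasn't hit that moment yet, the gist of move semantics in a few lines:

```rust
// Ownership moves on assignment and on passing by value.
fn consume(s: String) {
    println!("{s}");
}

fn main() {
    let s = String::from("hello");
    consume(s);           // `s` is moved into `consume`...
    // println!("{s}");   // ...so this line would not compile: use after move

    let t = String::from("world");
    consume(t.clone());   // cloning sidesteps the move, at the cost of a copy
    println!("{t}");      // `t` is still usable here
}
```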
You can cherry pick all you want, although I would generally agree that “top level” executables will most likely bump against lifetimes far less frequently than libraries, frameworks, modules etc.
I would also generally agree that a single pass through the official book should mostly be enough to be able to read and understand the op.
Therefore I cited the core of the program as well. It also does a bunch of non trivial async processing there.
I’ve written at least two other non-toy Rust programs and I haven’t hit the “lifetimes everywhere” problem there either. I guess most lifetime-related problems stem from trying to program in the Java/Python/JS style in Rust - from overusing references and indirection. If you model your program as a graph of objects referencing each other, then I can imagine you'd have a hard time in Rust.
I use a lot of deep copies in my everyday Rust and the performance is still 10x that of the Python I started using in 2009.
One of my criticisms of Rust is that it’s a very RSI-causing language. Very heavy on reaching for awkward keys.
I mean, a read through the guide makes this very readable, but it doesn’t change that it looks like symbol soup and involves a lot of awkward, repetitive strain to type out.
This is one area where I surprisingly have very few complaints about the symbols used. My personal pet peeve is languages that use the walrus operator (:=) for variable assignment/declaration. Those two characters are on the same side of the keyboard, in the same finger column, so they're really slow and painful to type. If they were two keys for two different fingers, or better yet, for two different hands, that would be awesome (I have the same issue at work with the Warsaw region being abbreviated to "waw"). I write Rust daily and have yet to find such an inconvenience.
Odd. I'll give you that it's slow to type, but I can't agree that it's painful. Colon is on the home row under my pinky, and equal is next to the delete/backspace key that I hit all day long.
Depends on the keyboard layout you're using. Most languages are more convenient on an English layout than certain others because the syntax was most likely designed by someone using a qwerty keyboard. The symbols /[]'\;= are all accessible without modifier key.
I have a custom layout on my keyboards which makes each symbol reachable without moving my hands, which made it a lot more pleasant.
That's like saying you could never learn JS because of the following valid JS snippet, when in real life you would never see such a thing unless someone is trying to obfuscate the code:
wat
You could also think of it as "look at all the weird syntax olympics you need to do in order to trigger a compiler edge case!"
The crazier the reproduction, the more interesting the bug. A trivial bug isn't interesting. A bug where two edge cases interact to do something more than the sum of their parts is quite interesting. You learn something about the edge where there's an interaction, and about the underlying system as a whole.
(Have you ever looked at a fuzzer-generated failing piece of data and not thought "oh whoa, good point"? That's interesting!)
You should treat this the same as you would something like the Obfuscated C competition. Just because it's supposed to compile and run properly doesn't mean it's good code.
You're right, but in this case it shouldn't compile :)
Once you conquer the headaches and eye-strain, it's not too bad.
They've just written terse code with lots of single-character identifiers. Not the style one would / should use for real-life development of readable code.
As noted on that GitHub page, fixing the bug will become possible once rustc adopts a new trait solver: https://blog.rust-lang.org/inside-rust/2023/07/17/trait-syst...
Can you quote the relevant part? Because IIUC lifetime checks have nothing to do with the trait solver.
There's a new borrow checker however, but that's not going to fix this either
https://github.com/rust-lang/rust/issues/25860#issuecomment-...
“Similarly, the reason why niko's approach is not yet implemented is simply that getting the type system to a point where we even can implement this is Hard. Fixing this bug is blocked on replacing the existing trait solver(s): #107374. A clean fix for this issue (and the ability to even consider using proof objects), will then also be blocked blocked on coinductive trait goals and explicit well formed bounds. We should then be able to add implications to the trait solver.
So I guess the status is that we're slowly getting there but it is very hard, especially as we have to be incredibly careful to not break backwards compatibility.”
https://blog.rust-lang.org/inside-rust/2023/07/17/trait-syst...
“The new trait solver implementation should also unblock many future changes, most notably around implied bounds and coinduction. For example, it will allow us to remove many of the current restrictions on GATs and to fix many long-standing unsound issues, like #25860. Some unsound issues will already be fixed at the point of stabilization while others will require additional work afterwards.”
Apparently there are 84 open issues which the Rust developers consider to be soundness issues. https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Ao...
That's it. Rust's claims are over and done with. How can it be safe if it's unsound? By the principle of noncontradiction and the laws of thought itself, let no one speak that word anymore.
You can call Rust memory safer, but until those bugs get fixed, it's wrong to call it safe.
To be fair though, almost every language is unsound by that metric. In fact, only an implementation that is verified and translated from a formal specification, with the verifier and translator also similarly verified (ad infinitum), would be sound. Even PL researchers are not that strict---many academic papers don't actually prove soundness in full, because the proof process is more or less automatic for the vast majority of cases to those who know enough PL theory. (I can't say it's "mechanical", since reasoning is still required. But the reasoning itself is not too involved for the intended audience.)
Just as not every security vulnerability is equally fatal, not every soundness bug is equally fatal. I reckon there are about three levels of severity: inherent to the design itself, not inherent to the design but reasonably user-visible, and pathological. As pcwalton pointed out, Miri does show that this particular soundness bug is NOT inherent to the language design, and I believe that's true for most of the 84 unsound bugs (please let me know of any counter-example though, I haven't fully checked them). It remains to be seen whether there exist soundness bugs that are still user-visible enough.
But languages like C / C++ don't claim to be safe.
Even Rust doesn't claim to be "safe" without a qualifier. From the website:
Note that it explicitly mentions "memory" safety and "thread" safety, which have specific but reasonable definitions in Rust (for example, memory safety doesn't cover memory leaks). It also explicitly mentions which parts of Rust are responsible for such guarantees, namely the "type system" and "ownership model", and it is reasonable to claim that an ideal implementation of both would indeed completely achieve such guarantees. To be clear, the current implementation is also very close to that ideal, which makes such a claim meaningful, but there is always a difference between the ideal and the practice.
Then they should say "we're researching how to create a memory safe language" or "we've built a memory-safer language" rather than saying what they're saying.
By that standard, no general-use programming language (even eg Python or Java) could ever be called "memory-safe". That level of pedantry is occasionally necessary, but not usually useful. In practice, I'd confidently wager the vast majority of Rust programmers have never encountered a soundness bug in the core language when not specifically hunting for one (I certainly haven't).
I'd wager the vast majority of Linux users have never encountered a memory safety issue with the Kernel. Memory safety issues are usually rare enough in world class C code that, by the time it reaches end-users, you have to actively go out of your way to exploit it.
Rust built its reputation around the idea that they can crush security bugs by making them impossible. They should be holding themselves to a higher standard than that "in practice" leeway. If a malicious actor can tease Rust into behaving in a way that contradicts its safety guarantees, then it could be serious.
Maybe your corporate policy is to configure Rust to allow zero unsafe code. Some crate you're depending on gets hijacked. It uses the cve-rs trick to crash your system even though Rust says it's 100% safe code.
The concept of safety is pretty different in a kernel and in a programming language. The kernel can be attacked directly with malicious calls, while it's the program created by the Rust compiler that can be attacked (unless you envision Rust as a sandboxing technology, which it absolutely is not).
The safety in a programming language is mostly about protecting programmers from themselves. The probability of a programmer writing this kind of code by mistake is close to zero, as opposed to UB in C or C++, which is pretty common. To make a vulnerable program with this kind of issue, the programmer would have to do it on purpose, which is unlikely outside of this kind of joke repository.
Yes, because they were caught in review and tests, or were patched in a bugfix release before being widely exploited. Rust catches safety issues during compilation, before you test or commit.
Rust's safety is meant to protect against a certain class of programmer mistakes. You still need to audit your dependencies and sandbox untrusted code; the language designers have never claimed otherwise.
If you like pedantry, yeah. But it helps to know that almost everyone else doesn't like that pedantry.
I agree with jart on this. Rust has gained a reputation for overzealous marketing (also called evangelism), and we who like and use Rust should do everything possible to reverse that reputation, by being honest, even to a fault, about known limitations.
Don't get me wrong, I too agree with your sentiment (even though technically it wasn't Rust's own fault), but I also believe that jart's claim was something else, and the comparison with academic papers should be sufficient to show that.
No general-purpose programming language can be "perfectly safe". At the limit, you can always write to `/proc/self/mem`, actuate a robotic arm via USB to smash the CPU, etc.
A large portion of the bugs in that list require unstable language features. Most of the remainder are codegen bugs (miscompilations, ABI mismatches, linker problems, etc). The list of core type system soundness bugs is a lot shorter, it's tracked here: https://github.com/orgs/rust-lang/projects/44
That list of core type system soundness issues is long enough that it'd probably be easier to prove Rust style checks are unsound due to incompleteness / halting problem issues rather than fixing all those issues.
The core of the Rust safety model has been proven sound: https://plv.mpi-sws.org/rustbelt/ If you peruse the list, all the issues are either a) places where the actual implementation falls short of the theoretical model, but there's a plan to fix it, or b) edge-case interactions with peripheral language features (statics and dynamic dispatch).
Statics are hardly a "peripheral language issue", though.
The issue in question, https://github.com/rust-lang/rust/issues/49206, is such an edge case that there are 0 known examples of it causing a problem in practice, despite the issue being around for many years.
The number of GitHub issues does not correlate to the actual number or severity of issues. After all, rust-lang/rust has more than 120K issues and 9K open issues, but it doesn't mean that Rust has too many issues to solve---those GH issues are mostly means to manage tasks or user tickets to track.
Everything is always a new solver away.
a sufficiently advanced new solver. Just like how the Lisp guys insisted for decades that a sufficiently advanced compiler would solve all their performance woes. Right up until it became obvious that it couldn't.
Feels a bit like early chess engines. They tried to create super sophisticated heuristics to determine move quality, but then one person (or more probably multiple people independently more or less simultaneously) realized it's both easier and better to just do more or less the simplest thing that can possibly work, but do it as much as possible in the allotted time.
In other words - don't try to logically decide on the best move with some super advanced set of rules. Just assign a fairly basic and straightforward score based on calculating every reasonable move out 15, 20, 30 moves deep.
In the case of the trait solver, there is a clear end goal to reach: "the type system should be sound." This is a yes/no question, which you can prove/disprove mathematically.
(I suppose chess engines also have an end goal of perfect play, but that is a much harder problem)
There's a very simple solver that completes that goal: one that reports that it cannot prove any code as correct. So there must be a secondary goal of "maximize the amount of accepted code", which is a significantly less trivial question.
Yes, exactly.
Saying "just do the thing correctly, duh!" is easy. Ya know the saying about the difference between theory and practice?
I guess you have to add an implied "and preserves backward compatibility for (>99.9% of) sound code that compiles today" condition
A sound type system has strong limits on expressiveness, and even stronger if you want type inference to work. It's very easy to start tripping over those limits as you add features, as users of some Haskell extensions can attest.
This is beside the point, but the only way an engine can possibly search to depth 20 is by being extremely selective, using heuristics to guess move quality (such as static exchange analysis, the killer move heuristic, history tables, countermove tables, follow-up tables, late move reductions, &c). The discovery was that it's often better to guess move quality dynamically, based on the observed utility of similar moves in similar contexts, rather than statically. But this is not especially "simple", and it still (necessarily!) involves judging up front a few moves as likely-good and the vast majority of moves as unworthy of further examination.
Both type A engines (to use Shannon's terminology) and type B engines are relatively weak. The strong engines are all hybrids.
Is it going to be even slower than the current one?
Who says the current one is slow?
Just because the compiler as a whole isn't the fastest one around doesn't mean the responsibility falls equally on every constituent part of the compiler.
The most obvious slowdown people perceive in compile times is due to monomorphization. It is a combinatorial explosion when instantiating every generic type that is in use, and it exacerbates the issue of rustc producing verbose LLVM IR.
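Roughly, the effect looks like this (a made-up example):

```rust
use std::fmt::Debug;

// Every distinct T used here gets its own compiled copy (monomorphization),
// so the amount of IR handed to LLVM grows with the number of instantiations.
fn describe<T: Debug>(value: T) {
    println!("{value:?}");
}

fn main() {
    describe(1u8);           // instantiates describe::<u8>
    describe("hi");          // instantiates describe::<&str>
    describe(vec![1, 2, 3]); // instantiates describe::<Vec<i32>>

    // Dynamic dispatch compiles a single copy instead, trading codegen volume
    // for a vtable call at runtime.
    let boxed: Box<dyn Debug> = Box::new(3.14);
    println!("{boxed:?}");
}
```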
My biggest concern with Rust is the sloppiness around the distinction between a "compiler bug" and a "hole in the type system." This is more of a specification bug even though Rust doesn't have a formal specification: the problem is the design of the language itself, not that a Rust programmer wrote the wrong thing in the source code: https://counterexamples.org/nearly-universal.html?highlight=...
That bug is marked as I-unsound, which means that it introduces a hole in the type system.
And so are all other similar bugs, i.e., your concern seems to be unfounded, since you can actually click on the I-unsound label, and view all current bugs of this kind (and past closed ones as well!).
Perhaps I should have said "hole in the type theory" to clarify what I meant.
To be clear I wasn't trying to imply the rustc maintainers were ignorant of the difference. I meant that Rust programmers seem to treat fundamental design flaws in the language as if they are temporary bugs in the compiler. (e.g. the comment I was responding to) There's a big difference between "this buggy program should not have compiled but somehow rustc missed it" and "this buggy program will compile because the Rust language has a design flaw."
But it's not a fundamental design flaw in the language, nor is it a "hole in the type theory". It's a compiler bug. The compiler isn't checking function contravariance properly. Miri catches the problem, while rustc doesn't.
I believe it really is a flaw in the language, it's impossible for any compiler to check contravariance properly in this edge case. I don't think anything in this link is incorrect: https://counterexamples.org/nearly-universal.html?highlight=... (and it seems the rustc types team endorses this analysis)
I am not at all familiar with Miri. Does Miri consider a slightly different dialect of Rust where implicit constraints like this become explicit but inferred? Sort of like this proposal from the GH issue: https://github.com/rust-lang/rust/issues/25860#issuecomment-... but the "where" clause is inferred at compile time. If so I wouldn't call that a "fix" so much as a partial mitigation, useful for static analysis but not actually a solution to the problem in rustc. I believe that inference problem is undecidable in general and that rustc would need to do something else.
I think you're misinterpreting what "the implicit constraint 'b: 'a has been lost" in that link means. What it means is that the compiler loses track of the constraint. It doesn't mean that the language semantics as everyone understands them allows this constraint to be dropped. That sentence is describing a Rust compiler bug, not a language problem.
I think you're misinterpreting what "implicit" means here. It doesn't mean "a constraint rustc carries around implicitly but sometimes loses due to a programming bug." It means "a constraint Rust programmers use to reason about their code but which rustc itself does not actually keep track of." "The implicit constraint 'b: 'a has been lost" should more precisely read "the implicit constraint 'b: 'a no longer holds."
Look closely at what happens:
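The code in question is, roughly, the minimal reproduction from issue #25860:

```rust
// Roughly the #25860 reproduction: no `unsafe`, yet any lifetime becomes 'static.
static UNIT: &'static &'static () = &&();

// The implied bound 'b: 'a comes from the &'a &'b () argument.
fn foo<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T {
    v
}

fn bad<'a, T>(x: &'a T) -> &'static T {
    // Coercing `foo` to this fn-pointer type is where the implied bound gets lost.
    let f: fn(&'static &'a (), &'a T) -> &'static T = foo;
    f(UNIT, x)
}
```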
There is an implicit "where 'b : 'a" clause at the end of this declaration - this clause would be explicit if Rust were more like OCaml. The reason this clause is there implicitly is that in correct Rust code you can't get &'a &'b if 'b : 'a doesn't hold. So when a human programmer reasons about this code they implicitly assume 'b: 'a even though there's nothing telling rustc that this must be the case.

This runs into a problem with the requirements of contravariance, which let us replace any instance of 'b in foo with any type 'd such that 'd : 'b, in particular 'static.
Since contravariance allows us to replace the &'a &'b with &'a &'static without changing the second &'b, the completely implicit constraint 'b : 'a is no longer actually being enforced, and there's no way for it to be enforced. This is not a compiler bug! It's a flaw in the design.

In particular I don't think there's anything especially magical about 'static here except that it works for any type 'a[1]. If you had a specific type 'd such that 'd : 'a then I think you could trigger a similar bug, converting references to 'b into references to 'd.
[1] Actually maybe "static UNIT: &'static &'static () = &&();" is more critical here since I don't think &'d &'d will work. Perhaps there's a more convoluted way to trigger it. This stuff hurts my head :)
Because Rust doesn't have a specification, we can argue back and forth forever about whether something is part of the "language design". The point is that the design problem has been fixed by desugaring to bounds on "for" since 2016 [1]. The only thing that remains is to implement the solution in the compiler, which is an engineering problem. It's not going to break a significant amount of code: even the bigger sledgehammer of banning contravariance on function types was measured to not have much of an impact.
As far as I can tell, the main reason to argue "language design" vs. "compiler bug" is to imply that language design issues threaten to bring down the entire foundation of the language. It doesn't work like that. Rust's type system isn't like a mathematical proof, where either the theorem is true or it's false and one mistake in the proof could invalidate the whole thing. It's a tool to help programmers avoid memory safety issues, and it's proven to be extremely good at that in practice. If there are problems with Rust's type system, they're identified and patched.
[1]: https://github.com/rust-lang/rust/issues/25860#issuecomment-...
His point seems to be that there are reasons in principle you cannot do this (holding the design fixed).
It's a little like making a type system turing-complete then saying, "we can fix the halting problem in a patch".
If you change the design, then you can fix it. This is what I interpret "design flaw" to mean.
The design problem has been fixed, as I noted above.
Interestingly enough, in practice even mathematical proofs aren't like that either: flaws are routinely found when papers are submitted but most of the time the proof as a whole can be fixed.
Wiles first submission for his proof of Fermat's last theorem in 1993 is the best known example, but it's in fact pretty frequent.
In the language, the form with the implicit bound should be equivalent to the form with an explicit `where 'b: 'a` bound, because the type &'a &'b is only well-formed if 'b: 'a. However, in the implementation, only the first form, where the constraint is left implicit, is subject to the bug: the implicit constraint is incorrectly lost when 'static is substituted for 'b. This is clearly an implementation bug, not a language bug (insofar as there is a distinction at all—ideally there would be a written formal specification that we could point to, but I don’t think there’s any disagreement in principle about what it should say about this issue).

See my other comment above: https://news.ycombinator.com/item?id=39448909
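Concretely, the two spellings being compared are roughly:

```rust
// Implicit: the bound 'b: 'a follows from the well-formedness of &'a &'b ().
fn foo_implicit<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T {
    v
}

// Explicit: the same bound written out; this form is not affected by the bug.
fn foo_explicit<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T
where
    'b: 'a,
{
    v
}
```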
The point is that rustc does not even implicitly have the "where 'b: 'a { v }" clause. The programmer knows "where 'b: 'a { v }" is true because otherwise &'a &'b would be self-contradictory, and in most cases that's more than enough. But in certain edge cases this runs into problems with the contravariance requirement since we can substitute either one of the 'bs with a type that outlives 'b (i.e. 'static), and the design of contravariance lets us do this pretty arbitrarily.
We could choose to design the language as I wrote where the constraint is implicitly present and variance consequently ought to preserve it, or as you wrote where it is not present at all and variance is unsound. But we want Rust to be sound, so we should choose the former design. We would need a very compelling reason to prefer the latter—for example, some evidence that “it would break too much code” or “it’s impossible to implement”—but I don’t see anyone claiming to actually have such evidence. The only reason the implementation has not been updated to align with the sound design is that it takes work and that work is still ongoing.
You mean that Miri catches it at runtime, right? If so, that hardly demonstrates anything about the difficulty or lack thereof of fixing rustc’s type checker, since Miri is not a type checker and doesn’t know anything about “lifetimes” in the usual sense.
I agree that this isn’t a “fundamental design flaw in the language”, but Miri is irrelevant to proving that.
Yes. I'm not saying that Miri demonstrates anything other than that it's not a language issue.
I’m not sure it even demonstrates that. For comparison, C’s inability to prevent undefined behavior at compile time definitely is a fundamental weakness of the language. Yet C has its own tools that, like Miri, can detect most undefined behavior at runtime in exchange for a drastic performance cost. (tis-interpreter is probably the closest analogy, though of course you also have ASan and Valgrind.)
Or to put it another way, the reason that Rust’s implied bounds issue is not a fundamental language issue is that it almost certainly can be fixed without massive backwards compatibility breakage or language alterations, whereas making C safe would require such breakage and alterations. But Miri tells us nothing about that.
The issue isn't contravariance but that higher ranked lifetimes can't have subtyping relations between them in rustc's implementation, so the fn pointer type is treated as a more general type than it is. This user is still wrong, the problem with rustc just isn't how it handles contravariance.
Why is this distinction important? It's still something you fix by changing what programs the compiler accepts or rejects. Or were you trying to imply this is unfixable?
If it's been known since 2015 and not fixed, that's pretty suggestive.
It's been backed up behind a giant rewrite of more or less the entire trait system. This has been an enormous, many-year project that's been years late in shipping. The issue isn't so much the difficulty of the problem as the development hell that Chalk (and traits-next) has been stuck in for years.
Is there anything we users can do to help get that project out of development hell? Besides not constantly getting the developers down, and even burning them out, with uninformed complaints and criticisms, which is something I try not to do.
That means nothing. In Rust, "everyone is a volunteer" and you're not allowed to expect things to be fixed unless you do it yourself - so the fact that this hasn't been fixed is simply an artifact of the culture, not necessarily the difficulty of the problem.
The distinction matters because any existing code that breaks with the compiler fix was either relying on "undefined behavior" (in the case of a compiler bug incorrectly implementing the spec), so you can blame the user, or it was relying on "defined behavior" (in the case of a compiler bug correctly implementing a badly designed spec), so you can't blame the user.
I suppose the end result is the same, but it might impact any justification around whether the fix should be a minor security patch or a major version bump and breaking update.
Well first of all, Rust doesn't even have a spec. And I would also advocate for not blaming anyone, let's just fix this bug ;)
Rust's backwards compatibility assurances explicitly mention that breaking changes to fix unsoundness are allowed. In practice the project would be careful to avoid breaking more than strictly necessary for a fix.
In the case of user code that isn't unsound but breaks with the changes to the compiler/language, that would be breaking backwards compatibility, in which case there might be a need to relegate the change to a new edition.
Probably better to be maximally pedantic here:
- Assume our language has a specification, even if it's entirely in your head
- a "correct" program is a program that is 100% conformant to the specification
- an "incorrect" program is a program which violates the specification in some way
Let's say we have a compiler that compiles correct programs with 100% accuracy, but occasionally compiles incorrect programs instead of erroring out. If the language specification is fine but the compiler implementation has a bug, then fixing the compiler does not affect the compilation behavior of correct programs. (Unless of course you introduce a new implementation-level bug.) But if the language specification has a bug, then this does not hold: the specification has to change to fix the bug, and it is likely that at least some formerly correct programs would no longer obey this new specification.
So this is true: "It's still something you fix by changing what programs the compiler accepts or rejects."
But in one case you are only changing what incorrect programs the compiler accepts, and in the other you are also changing the correct programs the compiler accepts. It's much more serious.
It does not, and at the current pace, it might never have a spec.
The reason there is no spec - not even an hypothetical spec in my head - is that the exact semantics of Rust has not been settled.
With the constraints the Rust project are operating with, the only way forward I can think of is following the ideas laid out in post
https://faultlore.com/blah/tower-of-weakenings/
With the understanding that you can have multiple specs if one is entirely more permissive than the other (and as such, programmers must conform to the least permissive spec, that is, the spec that allows the smallest number of things)
But the problem is, Rust doesn't even have this least permissive spec. Or any other.
It’s a consequence of not having a formal and formally-proved type system.
Looking over the code, it relies entirely on that single bug.
On the one hand that's encouraging - you really need to want to trigger it, it doesn't look like a bug that's likely to get accidentally written. Even then, it required a bunch of anti-optimization barriers and stuff to actually be exploitable.
On the other hand, one compiler bug and safe, well, isn't
I still like Rust.
I thought it was just using the totally safe transmute, this one: https://blog.yossarian.net/2021/03/16/totally_safe_transmute...
But I guess there is another trick.