Perhaps worth mentioning is that someone attempted to port this to Rust and got about 60,000 lines of code into it before archiving the project. I feel like comparing these two efforts would be an interesting case study on the impacts, benefits, and difficulties involved in rewriting from C++ to Rust.
I've been using rr for a while now, and it's a game-changer for C/C++ debugging on Linux. The deterministic replay and reverse execution features are incredibly powerful, especially for tracking down those elusive bugs. It's like having a time machine for your code! Anyone else had experiences with rr making their debugging process smoother?
I've used it for years for vulnerability development (when you need to debug why a particular exploitation run went wrong etc.).
It is indeed marvelous for many situations, except for concurrency bugs ;)
Why isn't it good for concurrency bugs? rr doesn't record the entire state up to the race condition?
It serializes your program's execution, because you can't atomically snapshot memory when there are multiple threads writing to it.
This is true, but for at least some race conditions the rr "chaos mode" can help in tracking them down. Chaos mode randomly adjusts rr's thread scheduling, sometimes not running a thread at all for a few seconds; you run the program a bunch of times until it hits the timing window for the race condition, and then when it does you debug the recording. IME it works at least sometimes, but it does require that the thing you're testing doesn't take too long to fall over, since you need to do multiple recordings of it.
https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mo...
RR supports recording multithreaded programs, but only runs one thread at a time, so it cannot trigger data race bugs.
It can find higher level concurrency bugs though, and has a "chaos mode" that schedules threads in an unfair random manner.
It is sometimes useful with concurrency bugs, especially when used in "chaos" mode. Although it probably masks some classes of concurrency bugs. I wonder if rr + TSAN works.
My previous job was in compiler engineering and I worked on upstream LLVM for a little while. rr was indispensable for me — not just for debugging, but as a tool for understanding how the compiler works in general. For example: you can set a watchpoint on an object in memory, and reverse-execute to the point where it was instantiated. This lets you answer questions such as ‘which optimisation pass was responsible for creating this DAG node?’ and so on.
Really powerful tooling — I haven’t kept up with it for a while, but last I checked it didn’t work on aarch64/arm64 because of a lack of certain performance counters only available on x86 hardware. The lack of support for other languages (e.g. Go) is also a shame. This has unfortunately rendered it unviable for my day job!
If you’re working in a stack where it’s supported, it’s really quite special.
I believe rr should support other languages exactly as well as gdb does; I've looked at some Rust in it once or twice at least, and IIRC even some Go; though I'm happy enough with doing assembly-level breakpoints/whatnot so I can't comment on more fancy things. Even some OpenJDK JIT! (though there of course source-mapping/stacktraces are non-existent, never mind reading variables, but even then I managed to hack together some basic source-mapping at https://github.com/dzaima/grr#java-jit; though it's rather unusable with rr due to OpenJDK's machine code dumping going directly to stdout, and general slowness).
That said, it would be very neat to have a more complete standard debugging interface across ahead-of-time-compiled code, interpreters, and JITs.
And it does have some aarch64 support by now, though it'll still require a suitable CPU with necessary perf counters.
I do know that RR works for Julia. Not sure why it wouldn't work for Go.
Julia goes through LLVM, which has good DWARF support (and, additionally, until recently, Julia hosted rr's CI, so Julia directly cares about rr). Whereas AFAIU Go does its codegen fully from scratch, without much regard for gdb/DWARF quality.
rr supports Aarch64 pretty well now.
It says on https://rr-project.org/ that it supports Go programs; what's the status?
You might be interested in the rather spectacular upgrade to it, called Pernosco:
It's based on rr (same author) but vastly better.
There are other commercial reverse debuggers, like Undo.
Pernosco is more than merely a reverse debugger.
rr is awesome. My typical debug workflow:
* rr record
* rr replay until crash or exception (set up catchpoint for that)
* set up hardware watchpoint for borked data that caused the issue
* work backwards in the data flow through watchpoints and reverse-continue
I would say this fundamentally changed how I approach debugging.

Is it truly only for C/C++?
My limited understanding says a debugger needs: a list of symbols (.pdb files on Windows, can't remember what they are on Linux), understanding of syscalls, and a few other similar things. I thought they don't care too much what generated the binaries they are debugging (obviously as long as it's native code).
Doesn't rr work with other languages like rust, zig, odin, nim, and similar ones? Obviously, I wouldn't expect it to work for python, js, c# and other languages with managed memory.
rr uses gdb as the actual debugger part, so anything that works in gdb will work in rr. (You won't get rr running on Windows though, as it is very much Linux-specific, having to wrap all of its syscalls. The Linux symbol info format is DWARF.)
Ok yeah, of course. I'd even argue that cross-platform debuggers are not a thing to be desired. Too much low-level integration with the operating system is needed when implementing one.
I did use gdb recently on Windows, but this was for a cross compiled program using mingw. Not sure it works for programs made with MSVC.
I disagree. You shouldn’t have to learn two debuggers just because you occasionally have to use a different OS. GDB has the right architecture here; the actual debugging operations are implemented by a gdbserver, and gdb is only the user interface that lets the user tell the server what to do. When you’re debugging on a different platform you use a different gdbserver and keep using the same user interface that you are familiar with.
When you replay a recording, rr first starts its custom gdbserver (which reads from the recording instead of from a live process) then starts a gdb process that connects to it.
Why? From the user's pov, what 'low level integration with an OS' is there that couldn't/shouldn't be abstracted into 'generic debugging functionalities'?
It also works with Go; I think support for it is built into GoLand too
GoLand uses Delve, which has rr support
Interestingly, gdb (and in turn rr) has some limited support for debugging Python. At least you can get a Python backtrace, but I didn't have success in setting Python breakpoints.
Yeah, 'cause you're technically debugging the Python interpreter. I've had some success with tracing tools designed for C/C++ on a Python project. It was not easy to set up and will obviously include frames from the interpreter.
Tho, it feels wrong to expect a tool designed for native binaries to work well with python in this context. And that's ok. It feels lucky when it works as much as it does.
I use it with Zig. It's pretty handy in conjunction with Zig's allocator because it writes 0xaa bytes upon free and doesn't reuse addresses, so it very likely causes a crash, then you can put a watchpoint on the memory and rewind to the point where it got freed.
That sounds really neat, is there more information on this?
*Edit
Found this, https://zig.news/david_vanderson/using-rr-to-quickly-debug-m...
I've gotten rr to work with very specific builds of RPython before, but you might be surprised at the ongoing interest:
https://github.com/python/devguide/issues/1283
https://morepypy.blogspot.com/2016/07/reverse-debugging-for-...
https://github.com/mesalock-linux/mesapy/blob/mesapy2.7/READ...
We use RR a lot with Julia. It only gives you a GDB view of the system, but it can work with any interpreted or compiled language.
Things that don't work are drivers that update mapped addresses directly. An example of this is CUDA: in order to replay, one would need to model the driver interactions (and that's even before you get to UVM).
Another great thing is that RR records the process tree and so you can easily look at different processes spawned by your executable.
People use it for Rust: https://bitshifter.github.io/rr+rust/index.html#1
For JS, there's http://replay.io
Yes, it can generally debug any language that compiles to a binary with proper debug information.
GDB's built-in reverse debugging: https://www.sourceware.org/gdb/wiki/ProcessRecord/Tutorial
I assume rr provides more features and flexibility. Anyway, I want to mention that GDB itself has been able to reverse debug for some time now.
If you want to mention this, then you very clearly haven't actually tried it. The implementation in GDB is more convenient than rr (you can start/stop recording at will), but it is also orders of magnitude less efficient. It's only usable for very small code snippets. Otherwise it takes effectively forever and/or runs out of resources.
> runs out of resources
RAM? What kind of dev box runs out of RAM in 2024? I built a 64GB RAM dev box during the COVID-19 crisis. I have never once come close to using all that RAM, even with a squillion Chrome tabs open.

Still, thank you for sharing your first-hand experience. Did you ask the GDB dev team for any feedback on the slow performance?
Umm, embedded devices without active cooling, e.g. those NodeBs sitting on cellphone towers, come to mind here; there are quite a few other similar examples I can think of.
gdb record and replay will absolutely eat up your piddly 64 gigs of ram in a few minutes if you just let it loose. it will eat your 12-terabyte hard disk too, it just takes a little longer. the gdb manual has many helpful tips for how to allocate your ram carefully to record and replay, as well as using a disk buffer instead of recording to ram, so the gdb dev team is already well aware of the problem
Uuh, any kind of dev box that requires more RAM than is available? I promise you that storing that much data about the runtime can eat an enormous amount of memory.
rr predates the one in gdb if I am not mistaken
Actually the gdb implementation predates rr, but (as an rr maintainer) I have to say that it is vastly inferior to rr. It's about 1000x slower than rr, and can't record across system calls or multiple threads or processes. It's so limited it's really a different feature.
Thanks. Can you explain why rr is so much more efficient?
rr was introduced in 2014 [1].
gdb reverse debugging was introduced in 2009 [2].
You can see a fairly comprehensive history of time travel debugging here [3].
Not to say the built-in gdb reverse debugging was any good. It had (has?) like 1,000,000% overhead which is basically unusable. At least some implementations in the history that were introduced earlier only had ~1,000% overhead or less in general. Yes, a literal 1,000x overhead difference.
[1] https://robert.ocallahan.org/2014/03/introducing-rr.html
Thanks for the info and links :)
I have successfully used GDB's built-in reverse debugging once, on a platform that rr didn't yet support at the time.
It worked, it helped me track down the bug, but it was painfully slow; I had to limit the size of the input to make it possible to use at all (and thankfully I was still able to repro the problem after doing so).
gdb's built-in replay implementation imposes a slowdown of about 10000× on your program, so if you can binary-search to the desired program state in less than 10000 restarts of the program, that will take less machine time than using reverse execution. in fact, the slowdown is large enough that even interactively navigating the debugger close to the right state repeatedly and then restarting the program is often enough
i have been able to use gdb's replay functionality usefully because i had an input file which crashed the program within a fraction of a second after startup. this meant that i could navigate backward from "this variable is wrong" to "how did this variable get set to that wrong value?" in only several minutes of waiting on the computer
rr is really cool, but almost every time I have decided to pull it out as one of the "big guns" it turns out that I have a concurrency bug and so rr is unable to reproduce it.
Despite that, it would be very, very, very cool if some languages built rr directly into their tooling. Obviously you can always "just" use rr/gdb, but imagine if rr invocations were as easy to set up and do as pdb is in Python!
Chaos mode is an option when invoking rr that can expose some concurrency issues. Basically it switches which thread is executing a bunch to try and simulate multiple cores executing. It has found some race conditions for me but it’s of course limited
Unfortunately that only works for large-scale races, and not, say, one instruction interleaving with another one on another thread without proper synchronization. -fsanitize=thread probably works for that though (and of course you could then combine said sanitizer with rr to some effect probably).
One option would be to combine chaos mode with a dynamic race detector to try to focus chaos mode on specific fine-grained races. Someone should try that as a research project. Not really the same thing as rr + TSAN.
There's still the fundamental limitation that rr won't help you with weak memory orderings.
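To make the limitation concrete, here is a minimal Rust sketch (a hypothetical example, not from this thread) of the kind of weak-memory race that serialized record-and-replay can never reproduce: both stores use Relaxed ordering, so weakly ordered hardware may make them visible out of order, but with only one thread running at a time that reordering cannot be observed.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let writer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // Relaxed: nothing forces this store to become visible after DATA.
        READY.store(true, Ordering::Relaxed);
    });
    let reader = thread::spawn(|| {
        while !READY.load(Ordering::Relaxed) {} // spin until the flag is seen
        // On weakly ordered hardware (e.g. ARM) this can print 0; under rr,
        // which executes one thread at a time, it never will.
        println!("DATA = {}", DATA.load(Ordering::Relaxed));
    });
    writer.join().unwrap();
    reader.join().unwrap();
}
```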
Yeah, the reason it only works for these coarser race conditions is that RR only has one thread executing at a time. Chaos mode randomizes the durations of time allotted to each thread before it is preempted. This may be out of date. I believe I read it in the Extended Technical Report from 2017: https://arxiv.org/pdf/1705.05937
I haven't tried TSan with rr, but MSan and ASan work quite well with it (it's quite slow when doing this); seeing the sanitizer trigger and then following back what caused it to trigger is very useful.
Another thing that rr sadly doesn't support is GPUs. I'd love to use it but most of my stuff involves GPUs in some way or another.
I actually had a concurrency bug that I was able to capture with rr: an MPI job where I only ran rr on rank 0 and managed to figure out where a different send/recv ordering was causing issues. In fact, it was also a Python model that ties in with a lot of native code generation, so quite a complex issue.
Yeah same for me. Actually the time I really wanted it was on Mac and unfortunately it only works on Linux.
There is UndoDB, which works on Mac and maybe with multithreading (not sure about that), but unfortunately it costs about $50k.
I've used rr very successfully for reverse engineering a large code base, using a break on variable change combined with reverse-continue. It took the time needed to extract the critical logic way down.
That sounds very interesting; do you have a write-up on this that you are willing to share?
This is the usual killer feature of something like rr. You debug, look at some variable: `p whatever`. You see that its value is wrong. You want to know where this wrong value came from, so you `watch -l whatever` and `rc`. Bam!
There are some bugs I would never have figured out without this technique. It feels like cheating.
Totally. rr is nothing short of a revolution in debugging.
It’s not cheating, it’s technique!
May I assume the large codebase was written in a language with (for lack of a better term) dynamic types?
No, you may not.
On Windows you can use WinDbg for the same thing. It has better support for debugging multi-threaded issues.
https://www.forrestthewoods.com/blog/windbg-time-travelling-...
WinDbg uses an instruction-level emulation implementation of time travel, so it incurs the 10-20x slowdown associated with that technique. rr uses a record-and-replay implementation, which can incur far less overhead when done correctly. Last I saw, rr has overhead in the 2x slowdown range and, if I remember correctly, I have seen a different record-and-replay time travel debugger in the 10% range.
10% is 100x cheaper than WinDbg and cheap enough to leave on all the time in production. That is a game-changer.
> 10% is 100x cheaper than WinDbg
If you’re gonna throw around numbers like this, you need to cite an actual tool, not “if I remember correctly there exists a unicorn”.
More precisely, rr records just system calls & interrupts, so its overhead is largely proportional to how syscall-heavy and multithreaded your code is. If it's single-threaded and mostly just pure computation, you can easily see ~0% slowdown, with replay at the exact same speed. (a quick simple test shows that 'mmap'+'munmap' of a file is 0.5ms, vs 0.006ms outside rr; but many syscalls are buffered in rr's userspace and thus way faster (e.g. a 'stat' on the same file path is only like 300ns slower than native))
Has anyone gotten rr to work with OpenGL or Vulkan? It seems to always crash for me after making an OpenGL call.
VirGL might help; it redirects OpenGL calls over a socket. Start virgl_test_server and run your app with the extra environment vars `__GLX_VENDOR_LIBRARY_NAME=mesa LIBGL_ALWAYS_SOFTWARE=1 GALLIUM_DRIVER=virpipe`.
Can someone explain how it works?
Are rr’s problems with Ryzen CPUs now firmly in the past or not?
Yes, I use rr all day every day (to record Firefox executions) on a rather recent Threadripper Pro 7950, and also with Pernosco. The rr wiki on GitHub explains how to make it work. Once the small workaround is in place it works very reliably.
Long ago, VMWare workstation supported doing this, but not just for userspace programs but also for kernels and even drivers, in a VM. The feature shipped and existed for a few versions before it was killed by internal politics.
And, I guess, before that, there was AMD SimNow, which was extensible via plugins and gave a plugin full control over the CPU being emulated. I wonder if something like that is available somewhere?
I use rr almost every day, along with a gdb frontend: cgdb.
`rr record /tmp/Debug/bin/llvm-mc a.s && rr replay -d cgdb`
I've had success stories with some bugs only reproducible with LTO. Without rr they would have been a significant challenge.
It would be nice if the Linux kernel could be debugged with rr. Has anyone had success with a kernel under rr+qemu? :)
What's the benefit of using cgdb when you can use gdb's `layout src`?
I used this to help make my toy JIT compiler: https://github.com/danthedaniel/BF-JIT
Super useful, especially considering I know barely anything about x86-64.
See also https://pernos.co/ which is based on rr but adds a queryable database of the whole program execution, which allows you to do things like this:
[...] just click on the incorrect value. With full program history Pernosco can immediately explain where this value came from. The value is tracked backwards through events such as memcpys or moves in and out of registers until we reach a point where the value "originated" from, and each step in this process is displayed in the "Dataflow" panel that opens automatically. There is no need to read or understand the code and think about what might have happened, we can simply ask the debugger what did happen.
A couple of previous discussions:
https://news.ycombinator.com/item?id=31617600 (June 2022)
Is it possible to use this with C/C++ code compiled to dll/so and called by Python?
Curious if anyone has tried rr for/on Android? It seems possible to cross-compile it, and it could be a good tool for native-side debugging.
[off-topic] does anyone here who regularly uses a debugger (even just breakpoints and watchers in their IDE) use it for async execution? I've never tried, but I'm just trying to think how all that jumping around the executor and any runtime would work (if at all).
I don't understand the "rewrite X/Y/Z in Rust" trend that has been going for a few years.
I'm not familiar with Rust, but I'm almost sure it has a good C interoperability. If a certain piece of software is working well, what is the benefit of rewriting it in Rust?
There's a bit of a higher abstraction ceiling in Rust so in theory if you are successful rewriting a thing in Rust then you now have a codebase that's easier to change confidently.
This sort of property is nice to have in huge codebases where you really start losing confidence in shipping changes that don't subtly break things. But of course a huge codebase is hard to rewrite in general...
Compared to C, yes, but not compared to C++.
This isn't really true. Rust has a much better type system. When writing generic code the impact is enormous.
C++ doesn't have a real Empty Type, and it thinks Units have non-zero size. In practical terms this makes it incredibly wasteful and in terms of a clear abstraction it encourages you to come up with a hack that's unclear but efficient.
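As a quick illustration of the size point (a minimal sketch, not from the parent comment): in Rust, unit types genuinely occupy zero bytes, while in C++ every complete object type has a sizeof of at least 1.

```rust
use std::mem::size_of;

struct Marker; // a unit struct: exactly one value, no data

fn main() {
    // Rust's unit types really are zero-sized...
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<Marker>(), 0);
    // ...so tagging a value with a marker costs nothing, whereas the
    // equivalent C++ member would add at least a byte plus padding.
    assert_eq!(size_of::<(Marker, u64)>(), size_of::<u64>());
    println!("all assertions passed");
}
```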
…or you can just waste a few bytes? It's not a big deal.
You can, but that makes the type system worse. Also depending on how these few bytes are used, they can add up and drag down performance.
Copying a bunch of stuff because the borrow checker won't let you share it can drag down performance as well. Yes, I do understand why one might conclude that tradeoff is worth it. But it is a tradeoff.
The original topic was the abstraction ceiling. There are a bunch of abstractions which C++ just can't express.
Funnily enough, because the borrow checker is so strict I feel more confident writing complex borrowing logic that I wouldn't dare attempt in C or C++, because even if I were to get everything right (a big if), there's no assurance that a later refactor wouldn't subtly break the code. The borrow checker sometimes makes you copy data that you thought you didn't need to, but more often than not it is enforcing an actual edge case that would have been a bug had the borrow checker not been present. If the copy is indeed so critical, you can also ease your pain with runtime checks instead using Rc/Arc, but that's another discussion.
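A hypothetical example of such an edge case, sketched below: the commented-out line is rejected by the borrow checker because `push` may reallocate the vector's buffer while `first` still points into it; the equivalent C++ compiles silently and is undefined behavior.

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0]; // shared borrow into v's heap buffer
    // v.push(4); // compile error if uncommented: push needs `&mut v`
    //            // while `first` still borrows it, and a reallocation
    //            // would leave `first` dangling
    println!("first = {first}");
}
```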
No, my point is that it doesn't. If your zero-sized types have nonzero size, your type system is not any worse: it's just less efficient.
If you're focused on just the theoretical correctness of the type system, go back to my first critique: C++ does not have Empty Types. So immediately a whole class of problems that are just a type system question in Rust are imponderable, you can't even say what you meant in C++
C++20 added `[[no_unique_address]]`, which lets a `std::is_empty` field alias another field, so long as there is only 1 field of that `is_empty` type. https://godbolt.org/z/soczz4c76 That is, example 0 shows 8 bytes, for an `int` plus an empty field. Example 1 shows two empty fields with the `int`, but only 4 bytes thanks to `[[no_unique_address]]`. Example 2 unfortunately is back up to 8 bytes because we have two empty fields of the same type...
`[[no_unique_address]]` is far from perfect, and inherited the same limitations that inheriting from an empty base class had (which was the trick you had to use prior to C++20). The "no more than 1 of the same type" limitation actually forced me to keep using CRTP instead of making use of "deducing this" after adopting c++23: a `static_assert` on object size failed, because an object grew larger once an inherited instance, plus an instance inherited by a field, no longer had different template types.
So, I agree that it is annoying and seems totally unnecessary, and has wasted my time; a heavy cost for a "feature" (empty objects having addresses) I have never wanted. But, I still make a lot of use of empty objects in C++ without increasing the size of any of my non-empty objects.
C++20 concepts are nice for writing generic code, but (from what I have seen, not experienced) Rust traits look nice, too.
It's probably mean for me to say "empty type" to C++ people because of course just as std::move doesn't move likewise std::is_empty doesn't detect empty types. It can't because C++ doesn't have any.
You may need to sit down. An empty type has no values. Not one value, like the unit type which C++ makes a poor job of as you explain, but no values. None at all.
Because it has no values we will never be called upon to store one, we can't call functions which take one as a parameter, operations whose result is an empty type must diverge (ie control flow escapes, we never get to use the value because there isn't one). Code paths which are predicated on the value of an empty type are dead and can be pruned. And so on.
Rust uses this all over the place. C++ can't express it.
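A small self-contained sketch of what that looks like in practice, using a hand-rolled `Never` enum as a stand-in for Rust's built-in `!`:

```rust
// An empty type: an enum with no variants has no values at all.
enum Never {}

// Any function "returning" Never must diverge; it can never actually return.
fn fail(msg: &str) -> Never {
    panic!("{msg}");
}

// Matching on an empty type needs no arms, so this typechecks at any T:
// the compiler knows the code path is dead.
fn absurd<T>(x: Never) -> T {
    match x {}
}

fn main() {
    // A Result whose Err case provably cannot occur.
    let r: Result<u32, Never> = Ok(7);
    let v = match r {
        Ok(v) => v,
        Err(e) => absurd(e), // dead code, prunable by the compiler
    };
    println!("{v}");
    let _ = fail; // reference fail so the sketch compiles without warnings
}
```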
void is an empty type in C++. It's less useful than it could be, but it does exist.
void isn't a type. If you try to use it as a type you'll be told "incomplete type".
People who want void to be a type in C++ (proponents of "regular void") mostly want it to be a unit type. If they're really ambitious they want it to have zero size. Generally a few committee meetings will knock that out of them.
Help me out here.
What is this empty type for? Could you provide an old man with a nice concrete example of this in action? I've used empty types in C++ to mark the end of recursive templates, which I used to implement typelists before variadic templates were available.
But then you mention being unable to call functions which take an empty type as a parameter. At which point I cease to understand the purpose.
I don't know that I'll be able to convince you but I'll give a couple of examples.
What is the type of the expression "return x"? Rust says that's !, pronounced Never, an empty type. This expression never has a value; control flow diverges.
So this means we can just use simple type arithmetic to decide that a branch which returns contributes nothing to the type of the expression - it has no possible value. This isn't a special case, it's just type arithmetic.
Ok, let's introduce another. Rust has a suite of conversion traits: From, Into, TryFrom and TryInto. They're chained, so if I implement From<Goose> for Doodad, everybody gets the three other implied conversions. But the Try conversions are potentially fallible, hence the word Try. So they have an error type. Generic code handling the error type of a potentially failing conversion will thus be written, even if in some cases the conversion undertaken chains back to my From<Goose> code. But wait, that conversion can't fail! Sure enough, the chained TryFrom and TryInto implementations produced will have the error type Infallible, which is an empty type.
So the compiler can trim all the error handling code, it depends upon this value which we know can't exist, therefore it never executes.
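A minimal sketch of that chain (the `Goose`/`Doodad` names follow the comment above; this assumes the Rust 2021 prelude, which brings `TryFrom` into scope):

```rust
use std::convert::Infallible;

struct Goose;
struct Doodad;

// Implementing From yields TryFrom for free via std's blanket impl,
// with the associated error type Infallible.
impl From<Goose> for Doodad {
    fn from(_: Goose) -> Doodad {
        Doodad
    }
}

fn main() {
    // A generic caller is obliged to handle the Err case...
    let result: Result<Doodad, Infallible> = Doodad::try_from(Goose);
    let _doodad = match result {
        Ok(d) => d,
        // ...but Infallible has no values, so this arm is provably dead
        // and the compiler can prune the error handling entirely.
        Err(never) => match never {},
    };
}
```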
Thanks for the clarification.
Can you instantiate an empty type? If yes, are all instances unique? Years ago, I was surprised to learn how C++ handles the (essentially) empty type (no data): a single byte to differentiate each instance.
That's unit. The empty type is a type you cannot instantiate.
Do you have a link to an example where this matters?
Sounds like a perfect situation for a strangler pattern? Wrap or transpile the original code into a language with stronger refactoring support and the rest should become incrementally easier.
The advantages of Rust only come when you actually use the Rust-provided abstractions, especially those around allocation and concurrency. Even if transpiling is possible, the code would still not be structured in the Rust way, and you wouldn't have any of the benefits. The same goes for wrapping.
Presumably the idea here is to support Rust replay debugging, not just to rewrite a C/C++-targeting replay debugger in Rust.
I'm not so sure of this. Most code in a debugger doesn't have much to do with the source language. Line mapping and figuring out values for variables happen via symbol information, e.g. DWARF, and compilers for multiple languages can produce that in the same format.
Does that neutral way of debugging work as well as having first-class explicit support for the language, or is it more of a lowest common denominator (kind of like what e.g. language support over LSP can offer, and how conveniently, compared to what a native Lisp or Smalltalk environment's language support can offer)?
One debugger feature I thought of while writing the comment was the ?? command in windbg. That takes an expression and evaluates it. Gdb also does this which I've used with the "print" command to print a C expression, including pointer casts and such. That would obviously require language support.
But then again, you don't need to code everything in the same language either. You could write a Rust parser in another language. Or a modular interface to dispatch knowledge of a programming language (does Microsoft's "language server" concept work this way?)
The latter.
The neutral way of debugging is really debugging the raw machine code of a process. This requires OS integration for your low level manipulation primitives. To add language support, you then need to figure out how to define the semantics of your manipulations in terms of the low-level primitives.
If you have a rich runtime, you can add language-level debugging facilities that can operate at a higher level. However, this requires you to implement portions of a debugger in your runtime. Now you have to maintain a language, runtime, and debugger. It also means that if new debugger techniques are invented, such as time travel debugging, you do not get them for free since you embedded a debugger of your own design. So, like many similar things, it is a trade-off of specialization versus maintenance. The perennial question of use a library, or do it yourself.
It works with pretty much any compiled language.
Does it understand the semantics, primitives, and structures of any compiled language?
Or is it more of a lowest common denominator experience, where e.g. all of them are constrained to common semantics of C/C++?
That's up to debuggers to implement. lldb, gdb, and delve all support rr traces.
Recently the federal government issued a security advisory encouraging all new development to be done in Rust. I'm not sure which agencies this was meant to cover, but it struck me as very unrealistic, having just done years of Java development for a state agency. https://www.nextgov.com/cybersecurity/2024/02/white-house-ur...
It doesn't say that, no. Even in the title it says "memory-safe languages", which of course includes Java.
Rust is "only" mentioned in the context of a C / C++ replacement, which tends to be a different area.
Yes that is correct, I had read a different article that had a more rust first focus. I just tried to google for a reference to the news release I was referencing, but yea it doesn't really support my comment :(
Well, I guess that's it...
Maybe Rust will take the same path as Ada.
One benefit is it's easier to hide malicious code thanks to Rust's complicated syntax.
It's far easier to hide malicious code in C or C++: just write some subtle undefined behavior that you can write an exploit against. Developers do that all the time even when they're not trying to be malicious. In Rust you'd have to wrap it in "unsafe" which draws attention.
>If a certain piece of software is working well, what is the benefit of rewriting it in Rust?
The "working" C program has a high risk of undiscovered bugs relating to concurrency and memory safety. Rust lets you rule out a large swathe of them by construction. Rust's type system is also far more expressive, which in many cases enables cleaner domain modelling.
You're correct that this is the stated justification most of the time.
It should be nuanced though, because the working C program has a risk, but the risk is a function of the size of the codebase, its age, and the number of audits it has undergone.
It is definitely easier to write bugs in C due to the additional freedom you have, but it is not necessarily a "high" risk for mature C libraries.
It is definitely not as advisable to just replace all C with Rust, but it is advisable to prefer memory safety in new projects.
From the perspective of an rr maintainer, Sid's work was good and we were supportive of it. The main issues with migrating to that as the "blessed" version are that 1) rr has accumulated a decade of very hairy fixes for crazy kernel/process behavior that we feared could be lost during a port and 2) there's a closed source project (remix[0]) built on top of rr that would have needed to be ported too.
[0] https://robert.ocallahan.org/2020/12/rr-remix-efficient-repl...
It would be a good but difficult analysis; at a quick check, rr 1.0 took 3 years and significant contributions from around 3 or 4 people (I saw at least 5 people contributing), and the rr we have today is 10 years of further work on that.