The cost of argument passing is rarely well understood, and I'm glad someone wrote this. People routinely pass 24-byte objects by value in places like Google, and the cost of that practice doesn't show up in a profiler because it's spread across every function.
"in places like Google".
Are you speaking from experience?
As a former Googler, I can say with certainty that the guideline for passing any non-primitive is by pointer or ref.
string_view might be the only exception I can think of.
I saw std::function and std::string (e.g. TotW 117, https://abseil.io/tips/117) being passed by value a lot in newer google3 code. Both are larger than 16 bytes.
This is done so you can use std::move to take ownership of the allocated memory in these objects rather than do a new allocation. Passing by value rather than by rvalue reference lets your function be more flexible at the call site. You can pass an rvalue/move or just make a copy at the call site, which means the caller (who actually knows whether a copy or a move is more appropriate) gets to control how the memory gets allocated.
An unnecessary memory allocation is much more of a performance hit than suboptimal calling convention.
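To make the copy-versus-move accounting concrete, here is a minimal sketch. `Tracked` and `Holder` are hypothetical types invented for illustration, not anything from Google's codebase; `Tracked` only counts how often it is copied or moved.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type for illustration: counts how often it is copied or moved.
struct Tracked {
  static inline int copies = 0;
  static inline int moves = 0;
  std::string payload;

  explicit Tracked(std::string s) : payload(std::move(s)) {}
  Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
  Tracked(Tracked&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
};

// Hypothetical sink: takes its argument by value and moves it into storage.
class Holder {
 public:
  explicit Holder(Tracked t) : value_(std::move(t)) {}
  const Tracked& value() const { return value_; }

 private:
  Tracked value_;
};

// Returns {copies, moves} observed when the caller passes an lvalue.
std::pair<int, int> CostOfPassingLvalue() {
  Tracked t("a string long enough to defeat the small-string optimization");
  Tracked::copies = Tracked::moves = 0;
  Holder h(t);  // caller keeps t: one copy into the parameter, one move into value_
  return {Tracked::copies, Tracked::moves};
}

// Returns {copies, moves} observed when the caller gives up ownership.
std::pair<int, int> CostOfPassingRvalue() {
  Tracked t("a string long enough to defeat the small-string optimization");
  Tracked::copies = Tracked::moves = 0;
  Holder h(std::move(t));  // one move into the parameter, one move into value_
  return {Tracked::copies, Tracked::moves};
}
```

The lvalue call costs one copy plus one move, and the std::move call costs two moves and no copy, so the only allocation happens when the caller chose to keep its own object.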
In that case, the optimal interface should take std::string&&, no? But it's awkward.
Wouldn't this be very annoying to work with, because now you have to explicitly move or copy the string whenever you want to construct one of these objects?
The function could accept a universal reference instead of an rvalue reference; this avoids the dance the caller has to do to pass a copy.
IMO it's hard to beat pass by value considering both performance and cognitive load.
Yeah, making it accept a universal reference would fix it...
...but that requires the argument to be a type from a template D: so you'd have to write:
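A minimal sketch of such a signature, assuming a hypothetical `Widget` class with a `set_name` member (neither name is from the thread):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only for illustration.
class Widget {
 public:
  // S is deduced, so S&& is a forwarding ("universal") reference:
  // lvalue arguments are copied into name_, rvalue arguments are moved.
  // Nothing restricts S to std::string, so any assignable type is accepted.
  template <typename S>
  void set_name(S&& name) {
    name_ = std::forward<S>(name);
  }

  const std::string& name() const { return name_; }

 private:
  std::string name_;
};
```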
...and that's not quite right either, since you'd want it to be either an rvalue reference or a const lvalue reference.
Explicitly having to copy or move is a desired coding style IMO.
It's kinda what Rust forces you to do, except that std::move is implied. Anything taken by value is equivalent to taking by && unless the type is explicitly marked as Copy (i.e. it can be trivially copied and the copies are implicit).
But yeah, in a C++ codebase, good modern practices are often verbose and clunky.
The vanilla interface often allows what's right depending on the context (e.g., if an unnamed return value is passed in, it can be move constructed, otherwise it can be copied). I saw some code bases provide an overload for a const ref and a normal type.
Google builds non-PIE, non-PIC, static, profile-guided, link-time-optimized, and post-link-optimized binaries and probably DGAF about calling conventions.
I have seen the assembly output of Google code, and I will say that my previous comment still stands.
Passing std::function by value is almost definitely wrong these days with absl::AnyInvocable (if you need to store the type) and absl::FunctionRef (if you don't). Rough analogues in the standard are std::move_only_function (C++23) and std::function_ref (C++26).
std::string in the case you cited is only really relevant if you std::move() into it and you would otherwise incur a string copy. Yes, it's bigger than 16 bytes (24 bytes), but that pales in comparison to the alternative.
(Taking std::string&& would eliminate the possibility of misuse / accidental copies, but that pattern is generally discouraged by Google's style guide for various reasons.)
Also, just because you see a certain pattern an awful lot even at Google doesn't mean that it's best practice -- there are plenty of instances of protobufs being passed by value...
Non-trivially-copyable types are never passed in registers, regardless of size.
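Whether a type falls into that category can be checked at compile time. The sketch below assumes the Itanium C++ ABI (used on x86-64 Linux and macOS), where a type with a non-trivial copy/move constructor or destructor is passed by invisible reference; `std::is_trivially_copyable` is a close proxy for that rule, not the ABI's exact wording. `PlainHandle` and `OwningHandle` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical wrapper with no user-provided special members.
struct PlainHandle {
  long id;
};

// Same layout, but the user-provided destructor makes it non-trivially copyable.
struct OwningHandle {
  long id;
  ~OwningHandle() {}
};

constexpr bool kPlainIsTrivial = std::is_trivially_copyable_v<PlainHandle>;
constexpr bool kOwningIsTrivial = std::is_trivially_copyable_v<OwningHandle>;
constexpr bool kStringIsTrivial = std::is_trivially_copyable_v<std::string>;
constexpr bool kStringViewIsTrivial = std::is_trivially_copyable_v<std::string_view>;
```

On the mainstream standard libraries only `PlainHandle` and `std::string_view` come out as trivially copyable here, even though `OwningHandle` is no bigger than `PlainHandle`.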
Please prove me wrong on these points. My current belief and understanding is that:
1. `foo(T)` indicates polymorphic over both `T::T(const T&)` and `T::T(T&&)`. This gives you benefits of both pass-by-move (using `std::move` as needed), or copy.
2. Usage of `foo(T&&)` signals a code smell or an anti-pattern, as `foo(T)` should be used instead unless it is a perfect-forwarding / universal reference `template <typename T> foo(T&&)`.
Yes. A lot of 24-byte structs at Google are passed by value.
Indeed they are. I read about it from hackernews.
I learned about it when I was at Google working on performance.
0o...
What about std::span? std::optional? etc.
std::span should apply, but std::optional is at least as large as the type it wraps.
Almost always 8 + the size of the type, since it's a one-byte bool followed by 7 padding bytes, followed by the type itself.
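The overhead is easy to measure. A small sketch, with `OptionalOverhead` as a hypothetical helper; the exact numbers are implementation-defined, so treat them as observations about the common standard libraries rather than guarantees:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Extra bytes std::optional<T> adds on top of T on this implementation.
template <typename T>
constexpr std::size_t OptionalOverhead() {
  return sizeof(std::optional<T>) - sizeof(T);
}
```

On the mainstream x86-64 standard libraries, `OptionalOverhead<std::int64_t>()` and `OptionalOverhead<std::string>()` are both 8, while `OptionalOverhead<char>()` is 1: the engaged flag is padded out to the alignment of the wrapped type, which is 8 for most types that hold a pointer or a 64-bit integer.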
View types like span, string_view, FunctionRef: absolutely pass by value. They're designed to be cheap to copy, and they provide type erasure which can make your API more flexible.
You also usually want to pass smart pointers (unique_ptr, shared_ptr) by value to avoid breaking your ownership model, or pass by raw pointer or reference to the held data. The obvious case that comes to mind where you'd want to write const unique_ptr<T>& is when iterating over a vector<unique_ptr<T>>. Otherwise, pass ptr.get() for an optional param, or *ptr for a required param where you've already checked ptr != nullptr.
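A rough sketch of those conventions, using hypothetical `Node`, `Registry`, `ValueOr` and `Doubled` names:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node { int value = 0; };  // hypothetical payload type

class Registry {
 public:
  // Takes ownership: unique_ptr by value, so the caller must std::move it in.
  void Add(std::unique_ptr<Node> node) {
    assert(node != nullptr);
    nodes_.push_back(std::move(node));
  }

  int Sum() const {
    int total = 0;
    // Iterating a vector<unique_ptr<T>> is the case for const unique_ptr<T>&.
    for (const std::unique_ptr<Node>& node : nodes_) total += node->value;
    return total;
  }

 private:
  std::vector<std::unique_ptr<Node>> nodes_;
};

// Optional parameter: raw pointer that may be null; no ownership is transferred.
int ValueOr(const Node* node, int fallback) { return node ? node->value : fallback; }

// Required parameter: reference; the caller has already checked for null.
int Doubled(const Node& node) { return node.value * 2; }
```

A caller holding `std::unique_ptr<Node> n` would write `ValueOr(n.get(), -1)`, `Doubled(*n)`, and `registry.Add(std::move(n))`, after which `n` is null.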
Wrapper types like optional, variant, StatusOr? Usually pass by reference, unless the pointed-to type is small (<= 8 bytes).
One common pattern where I see wide structs being passed by value is the use of option structs (https://abseil.io/tips/173) that are usually constructed via designated initializers. There is some marginal benefit to being able to std::move() out individual fields but it's not really a big deal either way, as said computation is usually only done once on program initialization.
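For readers who haven't seen the pattern, a minimal sketch follows. `ServerOptions`, `Server` and `MakeExampleServer` are hypothetical names, and designated initializers need C++20 (or a compiler extension in earlier modes).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical option struct in the style described by abseil.io/tips/173.
struct ServerOptions {
  std::string host = "localhost";
  int port = 8080;
  bool verbose = false;
};

// Hypothetical consumer: takes the wide struct by value, then moves the
// string field out instead of copying it.
class Server {
 public:
  explicit Server(ServerOptions opts)
      : host_(std::move(opts.host)), port_(opts.port), verbose_(opts.verbose) {}

  const std::string& host() const { return host_; }
  int port() const { return port_; }
  bool verbose() const { return verbose_; }

 private:
  std::string host_;
  int port_;
  bool verbose_;
};

// Designated initializers name only the fields that differ from the defaults.
Server MakeExampleServer() {
  return Server(ServerOptions{.host = "example.internal", .port = 443});
}
```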
Passing larger objects by value is allowed or even encouraged if it lets you avoid a copy (e.g. if you’re about to move the object).
Pass by value / pass by ref is quite a bit of mental overhead as it effectively affects your ABI/API. Zig tries to not force this so as long as you "pass by value", the compiler can actually decide to pass it by reference. It does expose this kind of footgun though https://github.com/ziglang/zig/issues/5973#issuecomment-1330...
Oof that's a very nasty bug. Is this still relevant in Zig or is there a workaround in the language? I'm not familiar with Zig, heard some good things about it, but this looks like a showstopper.
still relevant today, but as someone who writes a lot of Zig code I haven't ever really encountered it in the wild. You definitely could, though, and it'd be wildly confusing.
Luckily, the Zig core team has recognized it is an issue and plans to address it before 1.0 :)
It's rare to hit it, but if you do, having it happen silently is not ideal for sure.
I still think noalias-by-default is the way to fix this.
https://github.com/ziglang/zig/issues/1108
You get all the benefits of Zig being able to choose the function ABI, but if the optimization would have caused a bug, you'll get an immediate panic at the function entrypoint, instead of silently corrupted data.
The issue is still Open.
It’s a bit confusing when compared with other programming languages. But the Zig docs’ introduction to structs is pretty clear that they can be pass-by-value or pass-by-reference at the compiler’s discretion. If you need a copy, make a copy; if you need a reference, pass a pointer; if you don’t care, the compiler will pick one.
I think this is the same in the still experimental Carbon
There is also another paradigm where parameters are marked as in, out or inout, as in D (?) and cpp2, which makes the intent clear.
Apparently, it's not (anymore?):
https://github.com/ziglang/zig/issues/5973#issuecomment-1801...
Does the C++ ABI also spill 24-byte objects to the stack for each call? I guess I don't expect std::string or std::function parameters to be fast but it's still surprising.
Spilling big objects up the call stack is often not nearly as bad. Most of the time this happens, the object being spilled is constructed during the function, so the copy on return is actually elided.
That’s only true if the function takes no argument. Otherwise the copy on return is unavoidable.
That's not actually true: https://en.cppreference.com/w/cpp/language/copy_elision
I think parent is confused with NRVO typically not being triggered when returning a function argument.
In the AMD64 ABI, anything larger (that doesn't fit in registers) is returned through what is basically an output pointer argument. The caller reserves space for the return value and gives the function a pointer to it, which the function then fills with the return value. Both sides are easy to optimize and no unnecessary copies are made. The function can construct the return value directly at the given address, and the caller can pass a pointer straight to a variable if the return value is used to initialise that variable.
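The effect is observable from within the language. The sketch below assumes C++17, whose guaranteed copy elision requires a returned prvalue to be constructed directly in the caller's object; `Big`, `MakeBig` and `ConstructedInPlace` are hypothetical names. Copy and move are deleted so the compiler has no license to introduce a temporary.

```cpp
#include <cassert>

// Hypothetical type, 24 bytes on typical 64-bit targets, that records the
// address it was constructed at. It can be neither copied nor moved.
struct Big {
  const Big* constructed_at;
  long long a = 0;
  long long b = 0;

  Big() : constructed_at(this) {}
  Big(const Big&) = delete;
  Big(Big&&) = delete;
};

// Returns a prvalue; with C++17 this compiles despite the deleted copy/move.
Big MakeBig() { return Big(); }

// True if the object was built directly in the caller's variable.
bool ConstructedInPlace() {
  Big result = MakeBig();
  return result.constructed_at == &result;
}
```

`ConstructedInPlace()` returns true: the address seen inside the constructor is the address of `result` in the caller, which is the output-pointer mechanism described above seen from the source level.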
The tradeoff here is that if you pass a pointer to that 24-byte object instead, when you actually need to use the object you need to deref that pointer. But nothing guarantees that the pointed object is nearby! You could very well cause a cache miss and wait a long time (like 100 nanoseconds) to fetch that 24-byte object from main memory.
If that same object is directly passed it's just on the stack. So it's most likely in the cache.
Yes, but this concern is orthogonal to calling convention. If the parent function does anything to the object, it will be in cache already. If it does not, but loads the fields into registers, the parent will take the cache miss instead of the function. You neither gain nor lose anything...
It's usually not about a single parent function and a called function. It's usually a long chain of calls where functions keep passing them around, from the point of construction. Imagine some function first creates that 24-byte object, then passes it around by pointer. Ten function calls later, dereferencing that pointer will need access to main memory. Now imagine that first function passes the object by value; ten function calls later perhaps the contents of the object are duplicated on the stack ten times, but there's no cache miss.
It's a tradeoff between reducing memory usage and reducing cache misses.
+1 for pointing out that profiling often fails entirely to find widely distributed overhead, like that built into calling conventions.