Hubris is really, really nice. I've spent half an hour reading some of the kernel code and it's exceptionally clear and well written - a far cry from the ifdef macro soup, two-letter-variable-name-loving, comment-starved C code I've seen previously. A good bit of bedtime reading!
I recommend leafing through it: https://github.com/oxidecomputer/hubris/blob/b44e677fb39cde8...
It bothers me deeply how much of the C ethos can be boiled down to, "We can't be bothered to learn to type at a reasonable speed"
Disk space for source code hasn't really been a problem for forty years (binaries, definitely), and yet we are still being stingy with variable names.
Identifiers aren't short in C (or any other language) because disk storage is expensive but because readability suffers with long identifiers.
The example here is very much C-style (short, lowercase, underscores). The list of arguments is called "args" (short and descriptive). It is not called argumentListArray.
It is not a coincidence that math has very short, well-known identifiers. The rate of change in horizontal position is called x'. It is not called posHorizontalChangeRate, simply because the latter is harder to read.
There is, of course, such a thing as having too short identifiers.
You’re proving the original point - it should be named “arguments” because that is what it is.
Saving … 5 bytes? by naming it “args” instead is exactly OP’s point.
I don’t personally find “arguments” any clearer than “args”. Not sure what’s gained by calling it arguments other than 5 extra characters.
How long have you been programming? You got used to the short name, and now it is a word to you.
I don't think args is the right example to pick for this argument. As you say, it is what that thing is called idiomatically, so I think it is fine.
It is more about what happens when your code needs to parse an internationally formatted phone number from a string into a struct (or pick anything else specific to your problem domain). What do you call that function? Do you call it prsphi (prs for parse, ph for phone number and i for international)? Or do you call it parse_international_phone_number? Or maybe international_phone_string_to_phonenumber_struct (or something similar in CamelCase)?
That is where the difference starts to matter. And I for one don't want to read code with many prsphi's around.
That's a false dichotomy; you could call it for instance parse_int_phonenum, abbreviating "international" and "number" to the still understandable "int" and "num" (and smashing together "phone number" into "phonenum" since it's a single concept in this code base). It's nearly half the length of your suggestion, while still being understandable at a glance.
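To make the comparison concrete, here is a minimal sketch of the three candidates from this sub-thread as C prototypes (the struct and its field names are invented purely for the example):

    /* Hypothetical result type; the field names are made up for illustration. */
    struct phone_number {
        char country_code[4];
        char national_number[16];
    };

    /* The three naming styles under discussion, as declarations. Each would
       parse a string such as "+46 70 123 45 67" into the struct, returning
       0 on success and -1 on failure. */
    int prsphi(const char *s, struct phone_number *out);
    int parse_international_phone_number(const char *s, struct phone_number *out);
    int parse_int_phonenum(const char *s, struct phone_number *out);

The difference shows up most at the call sites, where the fully abbreviated name gives a reader no foothold without prior exposure to the codebase.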
Hah :D I was confused about why you would change the signature to parse integers when I clearly said it was parsing strings. Turns out you were not calling it "parse integer" but "parse international". I guess that's a point against that name, then?
If you work with the code enough, prsphi becomes as idiomatic as args. The question is which should be allowed to become idiomatic in the first place. Arguments is of course common enough across a lot of different projects that you can make a stronger argument that perhaps it should - and in fact it might be better, since nobody thinks of the dictionary definition ("a discussion in which disagreement is expressed; a debate") when they see args.
I have never worked with phone numbers outside of a class assignment so prsphi is not something I would understand. However if I was in a codebase that worked with phone numbers often I'd probably get to know it and like the shortness.
That same argument applies to the word “argument”. What an “argument” is in the context of a function call is significantly more complex than the shorthand of “argument” vs “arg”.
In any case, I can remember when I started programming, and no, I didn't have any problem remembering that arg is short for argument. I'd seen acronyms and shortened words all my life before programming, after all. However, I do remember being confused by the concepts of argument and keyword argument in the abstract.
In short, I really think this is making a mountain out of a molehill. If someone doesn't know what it means, the answer is "arg is short for argument" and then you move on.
Who do they ask? They have just interrupted someone else's flow to ask - worse, they might ask the wrong person and interrupt more than one person's flow in getting an answer. There is a high cost to even trivial questions, and the more of them someone needs to ask, the worse it gets.
The point is that there is a balance, and you need to find it.
If someone can’t deal with such totally standard and easily discoverable terminology, they don’t stand a chance. If they can’t ask their coworkers questions when confused, they (and the company) don’t stand a chance. You are far beyond the point of “balance” here.
I take pedagogy very seriously and work very hard for both experienced and inexperienced people to understand what I do, but this is just ridiculous. This is a non-issue.
Someone else gave these same examples but I find them compelling. Do you find atoi, strrchr or srand similarly clear? What about mbstowcs?
I wouldn't call them similarly clear, no. But that's not really pertinent to the question of args vs. arguments.
Identifiers in C are short because of expectations set by the standard library, which evolved when C had a hard limit of 6 characters for externally visible symbols.
If it bothers you so much, I'm happy to tell you that that is not the reason for it.
Could you spend some effort to explain the reason then?
Fwiw, I've written some Real C++ - think audio encoders/decoders for voice recognition serving 1B MAUs on the server, and an audio encoder running on 2B devices.
Each and every time there were insane excuses that boiled down to the fact that only a few people wanted to write C++, and they found infinite ways to justify not doing things, like all of us humans do.
My favorite was the "don't write comments because as soon as you do they're out of date."
Even though this is recognizably trollish tripe at the extreme to which it was implemented, I still find it hard to ignore that brainworm in my day-to-day coding.
I think that’s a common mutation of the far more useful and therefore relevant “Don’t trust comments” which everyone learns at some point.
"Don't trust people" is a heuristic to avoid being scammed. But it is much nicer to live in a society where you can trust most of the people. In the same way, it is nice to work with a code base where you can trust comments by default. Which is the case for many open source projects.
Personally it’s always the programs you can generally trust that end up reinforcing the idea. On average you may be far better off trusting comments but that one time they fail you is all the worse because of that trust.
It’s the same deal with compilers and computer hardware. They work so well that they become a blind spot, and yet I’ve had both fail me.
> My favorite was the "don't write comments because as soon as you do they're out of date."
A better version of that is “if the code and comments disagree, there is a good chance that both are wrong”.
If code and comments get out of sync, at best someone has been careless, and you need to be on the lookout for more carelessness in changes made at the time. At worst changes have been made without fully understanding what is going on, so there is a huge can of edge-case worms about to burst open.
The reason is readability, not disk space or typing speed. When your screen resolution is 24 lines of 80 characters (a common screen resolution back when C became popular), having longer variable names means statements often have to be split over several lines, and less code fits on the screen at the same time. Even today, having shorter lines makes it easier to see several versions of the code side-by-side (for instance, when doing a merge, it's not unusual to see three versions of the code at the same time).
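To make the effect concrete, here is a contrived C sketch (all of the identifier names are invented for the example): the fully spelled-out version cannot express the computation on a single 80-column line, while the abbreviated version fits easily.

    /* Contrived example: the same computation with long and with short names. */
    double rate_with_long_names(double current_horizontal_position,
                                double previous_horizontal_position,
                                double elapsed_time) {
        /* Even the body alone will not fit in 80 columns on one line. */
        return (current_horizontal_position - previous_horizontal_position)
               / elapsed_time;
    }

    double rate_with_short_names(double cur_x, double prev_x, double dt) {
        return (cur_x - prev_x) / dt;  /* fits with plenty of room to spare */
    }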
The reason was that C compilers had little memory to work with, hence the limit of 6 characters for identifiers.
Who cares though? Just use an IDE or Copilot or something.
Ah, yes, we are all elucidated once again. Thanks for the ctx
C (the language) does not cause people to write unmaintainable or hard-to-understand code. Programmers who choose to write unmaintainable or hard-to-understand code are the problem. I have seen great C code, good C code, mediocre C code, etc. It really depends on who is writing it. This is true of all languages.
In short, technology cannot fix people problems.
Well then it’s good that OP didn’t claim that C the language causes people to write such code. They said “The C ethos”, not “The C language.” It’s not about the language’s technical requirements, it’s about what’s idiomatic in a language, how it’s taught, and what style is used by the vast majority of the existing corpus of code written in that language.
Look at the C standard library's function names: vsnprintf/strdupa/acosh/ftok… Compare it to something like Objective-C at the other extreme, where method and variable names tend to be fully spelled out, with no abbreviations and a full description of what's being done (`-[NSString stringByAppendingString:]`, etc.)
Is it due to some technical requirement? Is stringByAppendingString illegal in C because it’s too long? Is strdup illegal in ObjC because it’s too short? Of course not! But why do we see this everywhere so consistently? Why does C have short indecipherable function names and ObjC have such long ones, if the language doesn’t require it?
Because idioms matter. If you’re learning C, you’re learning the way it is typically taught. You’re reading other C code. You’re encouraged to program the way other C programmers program. You’re likely using the standard library a lot. Likewise ObjC.
This means two things:
- Yes, in a very obvious sense, it's not the language's fault, it's the programmers' fault.
- But also, paradoxically, it is the language’s fault, because a language is not just a set of syntax in a vacuum, it’s also a corpus of existing code, a set of idioms, a community of people, and a way of thinking of things. C absolutely causes people to write hard-to-understand code when viewed through this lens.
Your comment has been written in so many ways in so many threads discussing programming languages, it’s absolutely tiring. Yes, you can write terrible code in any language, and you can write great code in any language (well almost any… probably not brainfuck.) Nobody is arguing that. When we discuss whether one language is more readable than the other, we’re always talking about the qualities of typical code you actually see in a language, and about what style of code that language encourages.
It used to be a language limitation, first practical, later codified for portability. Originally (C89) it was 6 characters for anything externally visible; these days (C99) it's 63 ASCII characters for internal identifiers and 31 characters for external identifiers (an implementation can allow longer, but does not have to - those are the minimum significant, i.e. preserved, identifier lengths).
Same reason why Pascal has symbols limited to 10 characters and doesn't preserve case - the original implementation mapped identifiers into ten 6-bit characters packed into a 60-bit word.
Similarly some of the oldest Common Lisp functions retain aspects of encoding characters to fit into 36 bit words efficiently.
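A sketch of what those minimum limits mean in practice (the identifiers are invented for the example): a strictly conforming C99 implementation only has to distinguish external names by their first 31 characters, so a toolchain operating at the bare minimum could treat these two declarations as the same symbol.

    /* Both names share their first 31+ characters; under the C99 minimum
       limits an implementation may treat them as one external symbol. */
    int external_symbol_with_a_long_name_one;
    int external_symbol_with_a_long_name_two;

Every modern toolchain distinguishes far longer names, of course; this is only about the floor the standard guarantees.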
It depends on whether you take the standard library into consideration.
atoi()? strrchr()? srand()?
I'd argue the obtuseness of the standard library's function names at least influences the legibility of the programs written against them.
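For anyone who hasn't internalized them yet, here is a small self-contained sketch glossing what those names conventionally abbreviate (the expansions are my reading of the usual glosses, not official definitions):

    #include <stdio.h>
    #include <stdlib.h>   /* atoi, srand, rand, mbstowcs */
    #include <string.h>   /* strrchr */

    int main(void) {
        /* atoi: "ASCII to integer" - parse an int out of a string. */
        int n = atoi("42");

        /* strrchr: "string reverse-search character" - last occurrence of a char. */
        const char *ext = strrchr("archive.tar.gz", '.');   /* points at ".gz" */

        /* srand: "seed random" - seed the pseudo-random number generator. */
        srand(7u);

        /* mbstowcs: "multibyte string to wide-character string". */
        wchar_t wide[16];
        size_t converted = mbstowcs(wide, "hi", 16);

        printf("%d %s %d %zu\n", n, ext, rand() % 10, converted);
        return 0;
    }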
I suppose you are too young to remember that back in the day there was no IDE helping you with autocompletion, so one character less in the name was one character less to type.
Furthermore, I'm ready to bet you are from the USA, and you forgot that most of the world (even the programming world) does not speak English natively, thus "SeedRandom" is NOT clearer than "srand": it's just as cumbersome and longer to type while reading character by character on a paper manual.
That's a statement that comes up a lot. And while it is technically true, it really isn't in practice. We can and do design systems to expose less of the surface that is an open knife and more of the safe handle side. It is near-universally true that the easier-to-access way is the one that will be taken more often.
To be fair to C, it was designed long before we as a society really appreciated that, so there is a lot of old code whose authors just couldn't have known better. And some (probably few) will even have managed to write readable code! But that doesn't mean that languages can't encourage good behavior and discourage bad (I'm not even sure where I come down wrt Rust here), and that will, somewhat reliably, end in better or worse code. Not perfectly reliably, of course. But it's e.g. really hard to write Python code where local control flow isn't reasonably obvious (since block structure is enforced by whitespace). Not impossible, of course.
There are lots of examples of hard-to-understand Python local control flow; Python is a long way from being consistent in encouraging good behaviour. When comparing the same type of code, let's say arg parsing, C is usually worse. In Python you can get lost in library hell looking at the details of that code, but in C it's just knowledge of the basic functions of the language, however terse it might be. On average I agree.
This comes up every time someone advocates for C, and it's basically a refusal to learn anything about process and safety culture since the 1950s. "Poka-yoke" was invented in the 1960s. The programming equivalent, the use of type systems and other proof systems to automatically detect or avoid errors, is strongly resisted by C developers, who seem to want to keep writing CVEs.
While that's true, what languages have is culture: formatting conventions, names of built-in stuff, books that teach the basics, and the expectations of devs who already know the language, which the newbies are expected to work with. It's often not on the programmer, but rather on the complex interactions between the programmers.
I get that we as an industry went a bit overboard when Java + OOP was in vogue, with entire sentences being method names, but I'm with you on all the people I see making super-compressed variable names whose contexts are longer than a for-loop initializer.
And don't even get me started on trying to text-to-speech your `mCtx` or whatever.
I'm not saying TTS is there (I have no idea) but from an ideal point of view, to have it be equivalent, surely it would read 'm context', which then isn't so bad (maybe m is clear from the ..err.. context).
You could run a LLM pass to "expand" abbreviations and acronyms before running it through text to speech.
Although more expensive, it worked wonders for me with text containing units of measurement (°F, °C, m/s, etc).
> It bothers me deeply how much of the C ethos can be boiled down to, "We can't be bothered to learn to type at a reasonable speed"
I can understand it from way back when. People were moving from assembly and other languages where everything was terse by requirement so variable/function/other names were short out of habit, though even then your assembly should have had comments about what was in each register & why etc. Also, comments were relatively terse because people were working on small screens and in extreme cases were concerned about the size of code files. None of this is adequate excuse for doing things wrong now, decades later, but such habits have momentum: people with the habits write books and local style guides, and also short names are baked into the standard library, so new people are “infected” by the habits, and so it rolls forward.
The standard screen size was 24 lines of 80 columns, and you couldn't expect a reader of your code to have a terminal with higher resolution than that. So the standard is to make your code fit in 80 columns; using longer identifiers means statements will end up taking multiple lines, which means less code can fit on the screen, making it less readable.
Even today, terminal emulators like Gnome Terminal still open by default with 24 lines of 80 columns, so there's still value in making code readable in that resolution (though nowadays, the main advantage is making it easier to read code side-by-side in multiple windows).
Most of the bad habits stem from how C is taught.
It should not be the first language freshmen are taught in universities. All the effort will be wasted on teaching basic programming instead of the actual language and good programming habits.
Pointers and the memory model are core concepts and should be taught early on. The worst programming books I have seen introduce them in the very last chapter, often called "advanced language features" or something similar.
It is not a compiled scripting language for Unix systems either and should not be programmed with a text editor on a remote machine. The programmer should have a proper development environment installed locally.
Valgrind and Clang's sanitizers should be used extensively. There is no such thing as partial credit when it comes to memory correctness.
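A minimal sketch of the kind of exercise that drives this home (the bug, file name and build lines are just an example; the flags assume clang or gcc with AddressSanitizer available, plus Valgrind):

    #include <stdlib.h>

    int main(void) {
        int *a = malloc(4 * sizeof *a);
        a[4] = 1;          /* off-by-one: writes one element past the allocation */
        free(a);
        return 0;
    }

    /* Build and run once with the sanitizers, then once under Valgrind:
     *   cc -g -fsanitize=address,undefined overflow.c -o overflow && ./overflow
     *   cc -g overflow.c -o overflow && valgrind ./overflow
     * Both diagnose the heap-buffer-overflow; without them the program will
     * often appear to "work", which is exactly why partial credit is not a thing.
     */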
If a certain QuakeNet IRCop is reading this, thank you. Your uni C course managed to avoid all the pitfalls above.
"should not", "They should have" ... well, it had to! And they didn't have.
There is no point in trying to forget where C comes from.
"My computer is a LITERAL TELETYPE, so I will enshrine typographical parsimony in every part of this OS/Language I am creating" -K&R
Code quality has zero correlation with typing speed.
It's funny because the article even talks about the importance of reducing space for binaries.
A lot of commercial code is like this, regardless of the language; there just happen to be people trying to sell new languages. They desperately want to convince you that this is all the fault of C and its nebulous "ethos," and if you just use their new language, it will all magically go away and possibly improve human rights somehow.
AI might kill this convention when everyone pipes their old crusty C code through it and all of a sudden all the variables are clean and named in the fashion the user loves best (because the AI has learned exactly the quirks of how a specific coder likes it).
AI might also make unicorns real and usher in world peace, while we're wishing for things.
Unicorns are merely an engineering problem tho (specifically, genetic engineering).
So is world peace, while we're saying things that contain the word "merely". :)
One dev uses an autoformatter (black for Python) for his codebase. When he wants to do a side-by-side diff on a laptop, he reduces the line width to 70 characters and reformats his entire codebase, works with his diff tool, then reformats back to the original width.