Before writing compilers, I think it's best to understand computer architectures, and what needs to be generated by the compiler to yield the most efficient machine code.
Unfortunately, in my experience, computer architecture and even systems programming are domains that schools/universities appear to be systematically de-prioritizing more and more, presumably because they are seen as too technical.
That knowledge however is instrumental in landing some of the best jobs in the industry.
In my experience, no one tends to see it as “too technical.” The typical (and imo obvious, logical) reason not to emphasize low-level details is because they’re implementation-specific, immediately outdated, and not particularly generalizable.
As an aside, I also tend not to get along with the “systems programmer” types because they tend to make knowing these kinds of extremely specific factoids their entire professional personality. You end up with people that can write assembly but have no idea what a functor is.
What's the purpose of functors? It appears to have a different meaning in different programming languages.
Functors come from category theory. They're relevant in programming languages that are based on it, e.g. OCaml, Haskell, and other sorts of overly academic, impractical, esoteric stuff.
Like most ideas in programming based on theory, it's ultimately a very trivial thing, whose theoretical foundation is quite unimportant outside of helping design a minimal set of operations for your programming language.
It's not particularly important in the grand scheme of things. People even build and use monads with no understanding of the background behind it.
> People even build and use monads with no understanding of the background behind it.
And then they miss the common pattern between them, so they miss the opportunity for abstraction and a common interface over the pattern. That’s the whole point of category theory: recognizing large classes (in the mathematical sense) of objects by their universal properties. It’s useful in programming language design because it gives you the ability to build useful, generic, water-tight abstractions.
A lot of abstractions built without any regard to the mathematics behind them wind up being leaky and difficult to apply appropriately because they rely on vague intuitions rather than simple mathematical properties.
I feel there is an impedance mismatch between the mathematical category-theory side and what is needed for programs. A case in point is the infamous complex diagram for the Lens library in Haskell. It abstracts the notion of getters and setters to some extreme and takes several hours of work to really fathom. Compare that to Go, where the tutorial uses balanced-tree comparison as an intro example for goroutines; when the language gets out of your way, I feel it is much nicer. Haskell is more of a playpen for PL ideas. Some of those ideas get promoted into the mainstream when shown to be very useful! So Haskell is very valuable in that sense, but it can be difficult to write code in, especially if you want to use libraries built on complex category-theoretic machinery.
> A case in point is the infamous complex diagram for the Lens library in Haskell.
That case is due to history and path-dependence. The theory and abstraction of the lens library was developed long after Haskell the language was designed. If Haskell were rebuilt today from the ground up with lens in mind you wouldn’t have that mess. Unfortunately, fixing it now would be too much of a breaking change.
It's also a common psychological pattern: you start from fuzzy experience and semi-defined patterns; later you see more rigorous and precise definitions, but they confuse you a bit; and one day you get it fully. Math, Haskell, etc. tend to live in the latter stage.
Agreed. Likewise, you end up with different names for the same concepts across different facets of the industry, which actually makes the profession as a whole harder to learn and makes communication across different communities harder.
Programming practice shows that overly generic constructs aren't that useful and typically lead to over-engineering, difficulty of maintenance and reduced productivity.
There are enough complexities in the problem domain and the system itself, so KISS is king.
Good software is usually built by focusing on the actual problem being solved, and only generalizing solutions once a sufficient number of specific ones have been built and their commonalities identified, so they can be lifted into a generic implementation.
The most impact language purists tend to have is when some of their features end up adopted and adapted by practical languages (C++ or even C#).
Concepts without names are difficult to reason about.
I think everyone who knows what they are would agree that the definition of "functor" can be learned in ten minutes. Recognizing the same concept being applied in different situations is the value. (As my sister comment says.)
A functor is a mapping between categories, just like a function is a mapping between sets.
What kind of insight this yields for programming is quite open to interpretation and depends on how you choose to formally describe its semantics (and most popular programming languages don't have formal semantics).
The problem with category theory as it applies to programming is it's using a hydraulic press to crack eggs.
Telling people about the power of hydraulic presses when they just want to make omelets is pretty unhelpful.
In programming, a functor is a structure with “elements” which you can apply a function to.
The idea is that the type of the elements is a parameter, called A, say. Then you use functoriality to change all the elements to something of a different type (or to a different element of the same type).
For instance, List(A) is the type of lists of elements of type A. If you have a function f taking input of type A and giving output of type B, you can “apply the functorial action” of List to transform a List(A) into a List(B). For lists, the functorial action is simply mapping the function over each element. But being able to abstract over all functors can give very general implementations of interesting algorithms and patterns, which then apply to many situations.
I don't think they are. So long as you were not trained before the 70s, any knowledge there is still very relevant today.
ISAs are still very similar, and so is the relative performance of the various operations.
Registers, caches, pipelines, superscalar execution... it's all still mostly the same, and still drives what the codegen should look like.
And yet you have so many people that don't know the considerations of what makes a simple loop fast or not.
Knowing "the considerations of what makes a simple loop fast or not" is irrelevant to solving a computational problem. We as programmers should not need to worry about details of the hardware or toolchain (e.g. compiler) we're using.
Granted, the reality is that most people are doing the equivalent of digital plumbing, not necessarily solving computational problems.
Plumbers have done more to save lives and enable the luxuries we take for granted than most. I would be lucky if my work could compare to that.
I think we need software engineers coming from electrical engineering, mathematics, humanities, and all walks of life. If you’re not interested in compilers, maybe the comments for a link about compilers isn’t your place to be.
I come from an EE background. I didn't understand the importance of computation in general until entering industry, and I want to understand compilers before I leave this earthly plane, just for the sake of understanding them.
The role of the compiler is to automatically perform the task of translating a program in a given programming language to a program that runs on a computer, making the best possible use of that computer's resources.
Writing a compiler with shallow understanding of the computer you're targeting is definitely possible, but ultimately wasteful.
New challenge to self: write a functor in pure assembly.
What, such as a linked list? This is a first week assignment in a systems course.
After some reading I have come to two conclusions:
1) I have written many functors in my life without describing them as such
2) A "pure assembly" functor doesn't make much sense. Functors, in both the CS and mathematics senses, require the concept of a type, which is absent once you are dealing with stacks and registers. My best idea is something for when you have mixed-width registers - e.g., say you've got some 128-bit and 256-bit registers. Your functor could apply a mapping from one to the other (and would probably be nothing more than a move instruction with the appropriate sign extension).
Excellent final sentence!
I have been programming since I was 11 years old and I've been professionally programming in various languages and domains for roughly 15 years, yet I had to Google what a functor was.
Your attack on systems programmers holding specific knowledge as holy while pointing to their ignorance about another bit of knowledge you consider more relevant seems kind of ironic to me.
I guess it's one man's meat, another man's poison. That seems to be my ideal job: I have always wanted to write C and assembly language programs in my career, but I don't care much about functional programming.
There are things that are generally true across almost all architectures and have been for a long time. Like cache lines, cache hierarchies, false sharing, alignment, branch prediction, pipelining, etc.
But I agree it isn't really necessary to write a compiler. Compilers tend to be pretty much the worst case from a microarchitectural point of view anyway - full of trees of small objects etc.
In my opinion, one of the biggest industry advancements we have had is that compilers can now be written without that level of low-level detail. There is still so much work to be done at the compiler level that really shouldn't concern itself with the (micro)architectural level of computers.
It seems the trend is toward increased diversity of hardware, with the dominance of x86 being challenged by smartphones, energy-optimal cloud computing, GPU computing, and ML accelerators.
LLVM allows hardware manufacturers to more easily provide mainstream language support for their platform than before, but the problem here was mostly that GCC was hostile to modular design, not really theoretical advances.
In terms of frontends, I guess we're seeing more languages reach C-level performance thanks to LLVM again.
But in terms of optimizations driven by theory? There were some significant advancements in generic auto-parallelization for imperative languages, about a decade ago I think. But it doesn't magically solve the codegen problem, and it remains hampered by language semantics that are not always parallelization-friendly.
There were a bunch of improvements driven by making languages more hardware-aware, e.g. the C++ concurrency memory model introduced in 2011, which was widely copied by other low-level programming languages.
We're also seeing more and more libraries that are specifically designed to target hardware features better.
So ultimately it looks like most of the advances are driven by better integration of how the hardware works throughout the compiler, language and community.
If they made JavaScript as fast as C, the software industry would become a cheap perversion of what it once was, and I would pick up my toys and go home.
Could you elaborate what you think are the best jobs in the industry and why?
Compilers are amplifiers.
Compiler optimisations give you faster/cheaper execution of every program.
Type systems and linters detect errors without writing or running tests.
Programming languages are tools for thought. Matching one to the domain makes the domain easier to reason about, lets people solve bigger problems.
Whether that correlates with "best jobs" is subjective. The value proposition of making other programmers more productive is really good.
If you think that's an exaggeration, consider writing a web browser in machine code.
I agree that “best jobs” is a bit ambiguous, but I think “landing the best jobs” is unambiguously getting a job that benefits the job-getter to an unusual extent (very well paying, or at least well paying and very stable). It is an expression.
The surprising thing about this, I think, is that hardware work is generally expected to be quite underpaid around here compared to programming.
GPU-related tasks are still very much in need of computer architecture and compiler skills. Yes, you don't need to study x86 or MIPS, but CUDA presents an even weirder architecture than those.