While writing a text editor, a compiler, an operating system, or a raytracer might make you a better programmer, it won't make you a better software engineer. In fact, it might make you worse at software engineering, because it embodies the disastrous "Not Invented Here" doctrine.
Hackers like to obsess about Big-O, data structures, HoTT, and other high-theory stuff, yet the following skills, essential for software engineering, are almost never discussed and even more rarely practiced:
- Deciding what to write yourself and what to take from a library
- Identifying high-quality libraries and frameworks that meet your project needs
- Deciding where optimization is worth the effort and where it is not
- Writing code that will still be readable to you (and others) a few years from now
- Thinking about the project as a large-scale, complex system with software and non-software dependencies
In that spirit, I offer the following alternative challenge: Create a web search engine. Don't bother with string matching algorithms etc., others have already done that for you. "Just" make a search engine (and crawler) that can actually work, even if it only supports a subset of the web and a single concurrent user at the beginning.
I don't understand why you think a search engine requires the use of "real engineering skills" like picking and choosing libraries and identifying which opportunities will yield fruitful optimization.
Literally all of the listed projects, text editors, compilers, operating systems, and ray tracers, can exercise the exact same activities.
I'm more inclined to think that your comment is really more revealing about what kind of knowledge is in your own personal wheelhouse. Having worked on a ray tracer that was subsequently used on many feature films, I can assure you that all of your bullet points apply to that project, too.
It's much more a matter of whether you want to do something small scale and fun, or whether you want to suck all the joy out of it by applying the same soul crushing constraints we already get paid to do in our day jobs. Bleh.
In the linked article, these projects are all explicitly described as opportunities to learn about low-level stuff like how to efficiently store editable text. The difference with a web search engine is that nobody today can build such a thing completely from scratch, therefore it forces you to give up the toxic NIH mentality, which in my experience is usually driven by elitism and ego (on full display in this comment thread), not by some lofty desire to learn something new.
I'm sorry, but can you substantiate this claim? I've seen no indication that a search engine is not buildable from scratch at all.
Sigh ... Off to the silicon mines it is then ... again /s
You mean, the beach?
It is a joke based on needing to define "from scratch". Similar to, "how do you bake a cake from scratch? First you must create the universe." Op is gathering the raw ingredients to start fabricating his chips.
Yeah, that’s sand. :)
Assuming you get the TCP/IP stack for free, you still need to build fully-featured HTTPS and a "webscale" multi-server database for document storage from scratch. The crawler is easy and so is something like PageRank, but then building the sharded keyword text search engine itself that operates at webscale is a whole other project...
The point is that it's too much work for a single person to build all the parts themselves. It's only feasible if you rely on pre-existing HTTPS libraries and database and text search technologies.
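For a sense of how small the "easy" part is, here is a minimal sketch of PageRank by power iteration. It is an illustration under stated assumptions, not what any real search engine runs: the link graph fits in memory as a dict, and the names `pagerank`, `toy_web`, `damping`, and `iterations` are made up for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

The hard part described above (sharding this across servers and keeping it fresh as the crawl grows) is exactly what the sketch leaves out.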
Simple methods of search like exact matching are very fast using textbook algorithms. There are well-known structures, like the suffix tree, that can search millions of documents in milliseconds.
I'm enjoying reading your discussion and just wanted to add that some people do write large-scale software from scratch nowadays: for instance, Marginalia for web search, and Andreas Kling and team for an operating system and web browser.
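As a rough illustration of textbook exact matching, here is a sketch that uses a suffix array (a simpler relative of the suffix tree, swapped in here to keep the code short). The construction is deliberately naive and would not scale to millions of documents; the function names are made up for the example.

```python
def build_suffix_array(text):
    # Naive construction: sort every suffix start position by the suffix itself.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, suffix_array, pattern):
    """Return the sorted start offsets of every occurrence of pattern in text."""
    m = len(pattern)

    def prefix(k):
        start = suffix_array[k]
        return text[start:start + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[first:lo])


text = "banana bandana"
sa = build_suffix_array(text)
hits = find_all(text, sa, "ana")
```

Once the array is built, each query is two binary searches, which is where the speed comes from.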
Marginalia, which I'm a big fan of, is not "written from scratch" (it would be stupid to do so). Check the project on GitHub, it has lots of third-party dependencies.
I definitely lean more toward NIH than what's conventionally considered wise, but most of the time it's not NIH for the sake of NIH.
I do pull a lot of libraries, but an enormous amount of what the search engine does is very much built from scratch. The libraries generally deal with parsing common formats, compression, serialization, and various service glue like dependency injection. I think the number of explicit dependencies is a bit inflated by the choice to not use a framework like Spring Boot, which pulls many of the same (or equivalent ones) implicitly.
What makes the search engine a search engine, the indexing software (all the way down to database primitives like btrees etc.), a large chunk of the language processing, and so forth; that's all bespoke. I think it needs to be. A lot of existing code just doesn't scale, or has too many customizations that would add unnecessary glue and complexity to my own code.
I'm going to echo SerenityOS Andreas and suggest that it's a skill like any other. If you shy away from building custom solutions to hard problems, you will never be good at it; and it will become a self-fulfilling prophecy that these NIH solutions are too hard to build.
At the same time, there's a time and a place and you should indeed be judicious as to when to roll your own solutions, but maybe that time and place is exactly in a hobby project like the ones suggested in this thread (and is how my search engine started out; a place to dick around with difficult problems).
I'd also add that being able to tackle problems yourself, rather than needing a library to do all the heavy lifting at all times, is a great enabler. Sometimes there is no adequate library, but that doesn't mean the conclusion has to be "welp, I guess we can't do that yet..."
I guess the first step in writing large scale software is to become Swedish!
Long cold winters dimly lit by a flickering CRT.
A simple search engine is certainly doable from scratch in a matter of weeks, complete with most things expected from a search engine.
Similar to compilers or tiny OSes, one can go as hardcore as necessary, or just stick to basic stuff.
Of all the typical personal-challenge-style projects, databases are probably the hardest to build, and even that is not impossible.
making a database is easy. making a fast database is what's hard.
even implementing all of sql92 or datalog without concern for efficiency is fairly complex
implementing a toy datalog should take no more than a week.
maybe less
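To make "toy datalog" concrete, here is a minimal sketch of naive bottom-up evaluation with no concern for efficiency. The assumptions are mine: facts and atoms are plain tuples, variables are strings starting with an uppercase letter, every head variable also appears in the body, and there is no negation and no parser.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def unify(atom, fact, bindings):
    """Match one rule atom against one ground fact; return extended bindings or None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    bindings = dict(bindings)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings


def evaluate(facts, rules):
    """Naive evaluation: re-apply every rule until no new fact is derived."""
    known = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            envs = [{}]
            for atom in body:
                # Join the partial bindings so far against every known fact.
                envs = [b for env in envs for fact in known
                        if (b := unify(atom, fact, env)) is not None]
            for env in envs:
                args = tuple(env[t] if is_var(t) else t for t in head[1:])
                derived.add((head[0],) + args)
        if derived <= known:
            return known
        known |= derived


facts = {("parent", "alice", "bob"), ("parent", "bob", "carol"), ("parent", "carol", "dave")}
rules = [
    (("ancestor", "X", "Y"), [("parent", "X", "Y")]),
    (("ancestor", "X", "Z"), [("parent", "X", "Y"), ("ancestor", "Y", "Z")]),
]
result = evaluate(facts, rules)
```

Getting from here to something fast (semi-naive evaluation, indexes, join ordering) is where the real work is.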
Back when online learn-to-code courses like Codecademy and Udemy were a fad, I remember that one of them (and unfortunately I don't remember which, and Google, ironically, turns up nothing) taught how to build a search engine in Python from scratch as a first project for complete beginners. I thought it had a reasonable level of complexity for this task.
You can still find search-engine-from-scratch courses on Udemy, complete with all the necessary algorithms [1].
[1]: https://www.udemy.com/course/build-a-search-engine-with-pyth...
was it udacity's cs101?
Yes! Thank you.
Incidentally, the first Google search result for
"introduction online cs course that uses Python the build a search engine from scratch"
is udemy CS 101
Even with my smartphone's autocorrect mistakes of 'introductory -> introduction' and 'to -> the'
You know those car guys who will rebuild their engine, just because? Or those retro computer guys that will recap an ancient board rather than buying a modern pc?
For you it is a job, for me it is a hobby. I have no interest in making something 'professional'; I want to take it apart to understand how it works. Want to understand how a text editor works? Write one.
What component of that is ego?
It seems to me you're the elitist, you're not far off saying mere users shouldn't be allowed to modify their own software, shouldn't be allowed to install software that hasn't been okayed by the people who know what they're doing.
Your lack of constraints on personal exploration in software is interpreted as hubris by people who came to software seeking high paying regimented recipe-following.
Please can you define "from scratch"?
I encountered some push back on this in my "code editor from the ground up" post[1]. I think the only reasonable definition of from scratch is:
Does not have domain-specific dependencies.
So a code editor based on ACE or CodeMirror would not be from scratch, obviously, but one that involves writing all of the domain-specific logic would be. Using generic libraries doesn't stop something being from scratch. (In my case Tree-sitter is arguably domain-specific, but an early version did use a hand-coded JavaScript tokeniser in its place.)
[1] https://news.ycombinator.com/item?id=34577246
Really? I think there are a couple out there. The main issue today is scale, but you could constrain that by limiting your crawler to a fixed set of sites.
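Constraining the crawl this way takes very little code. Below is a rough sketch using only the standard library; the host names, `ALLOWED_HOSTS`, `in_scope`, and `next_urls` are all hypothetical, and fetching, robots.txt handling, and politeness delays are left out.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical fixed set of sites the crawler is allowed to visit.
ALLOWED_HOSTS = {"example.org", "docs.example.org"}


def in_scope(url, allowed_hosts=ALLOWED_HOSTS):
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts


def next_urls(page_url, hrefs, seen, allowed_hosts=ALLOWED_HOSTS):
    """Resolve raw hrefs against page_url and keep the unseen, in-scope ones."""
    accepted = []
    for href in hrefs:
        # Make the link absolute and drop any #fragment.
        url = urljoin(page_url, href).split("#", 1)[0]
        if url not in seen and in_scope(url, allowed_hosts):
            seen.add(url)
            accepted.append(url)
    return accepted


seen = set()
frontier = next_urls(
    "https://example.org/a/",
    ["b.html", "/c#top", "https://other.net/x", "mailto:me@example.org", "b.html"],
    seen,
)
```

Everything off the allowlist is dropped before it reaches the queue, so the size of the crawl is bounded by the sites you picked.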
If you want to build a compiler from scratch, you must first invent the universe. Peeling back abstractions to see how things could or should work is perfectly fine, even for professionals.
Case in point, I've spent a year excising bloated frameworks from my stack at work and replacing the few corners we needed from those frameworks with, e.g., 50 lines of curl calls. The C compiles instantly, is tailored for our tiny use case, let us deliver the one related feature we wanted much more quickly, and removed chains of dependencies.
Being reliant on far-too-abstract libraries and frameworks to do simple jobs is also a curse. But nobody at work has the experience to know that curl was sitting right there on our image available for our use. And nobody has that experience because nobody took the time to build something from low level libraries. Now we know how and can make an intelligent decision without defaulting in either direction because we were afraid to try.
While NIH ultimately depends on multiple different aspects, I would argue that, ironically, you're more likely to create an NIH-qualifying product with your approach.
Because when you write things from scratch, you actually have the space to innovate. But when building a product from preexisting puzzle pieces, there is much less space to make any useful changes that would make your product actually stand out from existing alternatives (which is what NIH is all about).
Well, if I’m going to build a web search engine from scratch don’t I first need to write a compiler from scratch? Which means I first need to write an editor from scratch…
Amen. And further, what better prepares a programmer to assess the relative costs of implementing a thing vs using a library providing that thing than having attempted an implementation?
Learning by doing is a valid approach, and this can even be called fun.
That's my read as well.
This is like telling a student learning an instrument never to practice scales or études because there are other skills necessary for playing in an orchestra. Those other skills are important, but you have to develop your chops at some point, and being a better programmer will certainly help with basically all those skills you listed anyway.
No, it's like recognizing that an orchestra conductor doesn't need to play every instrument (or even any instrument!) in order to make music.
Or alternatively, that a violin player doesn't need to understand the physics of acoustic dispersion in order to be the best in the world at playing the violin.
Yes a software engineer understanding data structures and runtime complexity is akin to a violin player understanding the physics of acoustic dispersion.
Is it not? I can't tell if you're being sarcastic or not.
The analogy would be a software engineer having to understand resistors and capacitors and transistors.
Also, I don’t think you’ll find a single conductor on this planet who is not skilled in at least one instrument. Terrible analogy.
I would say, given how absurd the statement is, sarcastic.
So here’s the thing… I did an EE/CS dual degree. I’m going to focus on the EE side of it for a bit.
We learn the high level practical behaviour of devices: resistors, capacitors, BJT and MOSFET transistors, opamps, logic gates, etc in 2nd year. What we learn there is enough to be an effective practical circuit designer.
We also dive deep into how that stuff works under the hood. One course I remember fondly started the first lecture with “This course is called semiconductor physics. To understand semiconductor physics you need to understand solid state physics. To understand solid state physics, you need to understand quantum mechanics. We have a lot of material to cover so let’s get started.” A similar course was Electric and Magnetic Fields, where we spent a ton of time applying vector calculus to situations of varying complexity to get a better understanding of how the things we learned in 2nd year actually work and some of the limitations that you wouldn’t pick up on from the practical (generally first-order) models.
You don’t get to graduate with an EE degree without learning this stuff. And in my mind, that stuff is like a mixture of data structures, runtime complexity, assembler, and cache coherency: fundamental understanding that you won’t use directly in your day-to-day work, but that underpins everything you do.
The job market for EEs...still isn't as hot as for CS people. However, you could also focus on DSP in EE and use it to get a head start on machine learning. This assumes you are ready to specialize, however. Anyway, none of it should go to waste.
Ok, this is clearly a side-topic AND at the risk of being pedantic: Is this actually true?
Like, I can see how theoretically one could learn to sight-read music well enough to be able to direct an orchestra of individual musicians, then do enough ear-training to identify enough notes to keep tabs on everyone (especially if you're conducting a high school or (god help you :) ) middle school orchestra), etc, etc.
But does anyone actually do that? Has anyone ever done that?
"I'm going to learn how to conduct an orchestra without learning any instruments" kinda feels like "I'm going to become a software engineer using an LLM instead of learning any foundations"
(To be clear - the LLM path might be viable in the future, possibly the near future, but at least today it's not quite there. My apologies in advance if my analogy doesn't work in the future :) )
No, it's not true, there are no prominent conductors that cannot play an instrument (or sing at a high level).
A conductor should have a deep understanding of music, theory, and rehearsal pedagogy. At least in current western schools of music I don't see how you would explore these topics outside of the study of an instrument. Maybe there is some esoteric path to this; I just can't imagine how you go to Berklee and begin to explore the nuances of a composition without ever engaging with it as an instrumentalist.
On top of that, the conductor isn't just engaged at performances, they're also responsible for leading rehearsals. If they've never deliberately practiced learning music from an instrumentalist's perspective, I think they would be very hard-pressed to structure strategies for getting the larger group to a high level.
I guess it depends what you mean by "play an instrument" - almost all proficient conductors have proficiency in at least one orchestral instrument.
There is at least one (so I'm speculating he's not the only one, but it is likely to be rare) proficient conductor (Leopold Stokowski) who had no real proficiency with any instrument but he did have some very rudimentary piano training... and then pretty much taught himself conducting. Whether that rudimentary piano ability counts as "playing an instrument" is the question.
the skills of conducting do not require any instrument. Many conductors tell someone how to play their part despite not knowing how to play it themself. However it is hard to imagine anyone learning music theory not in context of learning an instrument.
There was actually a show about this: https://en.m.wikipedia.org/wiki/Maestro_(British_TV_series)
Are you an orchestra conductor or a professional violin player?
I'd rather put the violin player with understanding of acoustic dispersion in the same bracket as a software engineer who understands the effects of magnetic interference on a memory controller.
i heard this before and it seemed dubious to me
on investigating, it turned out that there were literally zero famous orchestra conductors who don't know how to play any instruments. most of them know how to play numerous instruments and do so in their spare time, though there were one or two who had stopped. from my experience with musicians i think it's likely that they could play any conventional orchestra instrument after a short exploration period, even if they don't have documented histories of playing it
i suspect that this is true of non-famous orchestra conductors as well, but i could only find public information about the famous ones
of course that doesn't prove anything about programming, which is not the same skill as playing a musical instrument or conducting an orchestra, but it does show that you're constructing your arguments without much concern for veracity
the fact that in https://news.ycombinator.com/item?id=38769008 you claimed that web search engines are not built by people who work on string matching or distributed database transaction consistency further calls your credibility into question
This is correct, a conductor does not need this. But you are not the conductor in this analogy, you are a player of one or more instruments.
Nowadays (and increasingly, going forward) it's possible to be a very productive programmer without knowing much low-level stuff. This may seem unfair to those who spent years wrestling assembly and then C pointers but that's just today's reality.
It's not possible to be a "productive" musician on a traditional instrument (i.e. excluding iPads) without knowing how to play scales, chords, etc.
This is also the reason a computer with 4GB of RAM can't run Gmail and Spotify at the same time.
That’s demands for faster delivery of software and ridiculous demands on user experience.
Does looking down on higher level devs make you feel superior or something like that?
Absolutely not. It's about bloat. So often people "just want to code", or eschew optimizing while repeating the mantra that "premature optimization is the root of all evil", that you end up with bloated software.
Google and Spotify are supposed to get among the best / most productive developers though
Tell that to most guitarists. Vast majority are self taught and can’t read music; a number are excellent players.
They can't read music, but they can still make their hands move in repeatable patterns without thinking about each finger position, which I think is roughly the equivalent of implementing quicksort.
As someone who has played guitar on and off for years as a hobby, I'm shocked by how much harder it is than programming.
I can pick up almost any random codebase on GitHub, and as long as they use libraries and don't touch low level stuff, I can probably start working with it in 10 minutes to a day at most.
Then, I could leave that project for a year and be almost as productive as when I left. So much of the action is on the screen, not in the head, and there's no muscle memory required.
If any instrument was as easy as coding... I think a lot of coders might have music careers instead....
Hard disagree. Those “productive” programmers are a nightmare, their code is a barely decipherable jumble of randomly gluing function calls together until they kinda sorta do something that approximates working some of the time. They cause an unfathomable amount of damage, lost productivity due to bugs and data nightmares, lost data, and headaches.
They do make idiot managers happy though, because look at all those features and scrum points they “finished!” That’s why armies of them will always be there in the software ecosystem, blithely and ignorantly a net drain on whatever unlucky company is currently employing them, busily making a mess for more diligent programmers to clean up for the rest of eternity. It’s called “job security.”
What if we don't want to be "software engineers"?
I'm still not totally convinced that "software engineer" is even a thing, frankly.
Who do you think puts systems like a web search engine together?
It's certainly not people who are lost in details like string matching or how ACID is implemented in distributed databases.
Of course you need all these things in order for the search engine to work – but if you write them yourself, even if you think about them excessively, you will never finish.
Software engineering is knowing that you don't need to know everything, you only need to know where to find everything. "Programming" is an important part of it, but far from the only one.
A really good programmer who is free from the tyranny of software “engineering”.
I believe this more and more as I go through my career. There is usually one person carrying progress hard. If you can get three or four of those people and another one to coordinate them you can do truly amazing things. Usually though you just need one unencumbered by bureaucracy.
Not sure what you guys think software engineering means, but it's definitely not the same as what I think it means.
Software engineering is whatever definition that allows me to procrastinate and sleep at night knowing I don’t need to learn more.
If you don't want to learn you chose the wrong profession.
To expand, the software industry moves fast, and you have to keep learning to stay afloat, or risk stagnating and ending your career early. Look at where software was 5 or 10 or even 20 years ago. There was no React and no Rust 20 years ago, nor was there even Git! Whatever you know now is going to be out of date in a matter of years. Good luck getting a job if that's all you ever want to learn.
Conversely, 10 years ago we also had Grunt, Gulp, etc, which were poor implementations of build systems we had ~40 years ago.
It’s not all progress.
Stagnating yes, but not ending career early. There will be python/java/c/c++ jobs for decades just keeping the lights on for non-tech companies.
Say that to people in this thread.
i've met a good fraction of the first 128 google employees, and this could not be farther from the truth:
the people who thought those things were unimportant details were the ones who tried to compete against google and failed, like lycos, inktomi, and pets.com
like, check out the wikipedia article on udi manber, who is best known for spending 15 years 'lost' in string matching:
the fact that in https://news.ycombinator.com/item?id=38769506 you claimed that orchestra conductors don't need to play any instrument further calls your credibility into question
BTW at Yahoo in the 2000s I learned that Manber had invented at least one new trade-secret algorithm while he was there. The one I ran across wasn't for string matching but it was a fairly basic kind of thing.
Literally by stitching together people you’ve listed here and making them work together:
Being able to write something from scratch is not the same as always writing from scratch. Let me turn this around. Who would you trust with writing a search engine: a software engineer who has worked on and written a search engine from scratch, or someone who only knows about it theoretically?
I hear this excuse more and more when it comes to threads like this. You don’t need to defend your life position, it’s okay to not know everything.
I find your comment quite ironic considering the engineers that built Google search did it by developing novel software systems, to great success. The Google codebase is basically the definition of NIH syndrome.
Maybe the *term* "software engineer" is not an established thing.
But these points raised by GP... are basically self-evident.
... But since they don't have immediate effect (along the tune of "look I made this shiny thing myself in a weekend!"), it's hard to brag about it and perhaps even hard to establish causality that your skills in the areas above contributed to the software/business' success. But you can always brag about how you implemented a Fibonacci heap or something and made your heap operations 0.2ms slower.
I do that shit every day at my job. When I’m just coding for fun at home I want to implement crazy data structures, otherwise what’s the point?
The good news is that the field of computer programming is huge - so there is room for all sorts of preferences, skills, and abilities.
And sure, use of the word "engineer" will cause angst among some - because it used to have a precise meaning which has been diluted (and in the case of software, ignored.)
Indeed the list of projects runs the gamut of where a programmer might go in life. They might do UI, or games, or work on an OS or compiler. More likely they'll work at some random company doing databases and reports. Or they'll be Web focused doing lots of JavaScript. Or they'll be deeply involved in the field of Advertising (Advertising Engineer anyone?)
Of course wherever you end up, you can still have some fun. And if you're just starting out then it can help you understand your desires, and limitations.
So don't worry too much about the "engineer" word. It's pretty irrelevant. Even the title of the thread is "programmer" not "engineer".
Not ignored! It's precisely because software engineering discards the very parts of programming that are closest to real engineering that this term came to describe it. "Don't bother learning the fundamentals, but focus on bureaucratic ritual instead" is a message that requires excellent marketing.
I honestly don't see a difference between programmer and software engineer. The one I see a difference with is "coder"
This sounds like sensible advice. But there are a few problems with making software purely from ready-made building blocks. Here are two of them:
1) Much more often than not, the ready-made building blocks are crap. Your software will reflect that crappiness, and your life will consist not of writing software, but of maintaining and massaging crap.
2) Even if your ready-made building blocks are high-quality, they limit what kind of software you can write. Ideally, you imagine what you want your software to do, and then you write software to do it. You create the building blocks in that process. If you are only using ready-made building blocks, there is a lot of great software you just cannot write. And I think we all see that, because there isn't much great software around.
These problems might not be a concern for you. If you write software that is pretty much just more of the same, 2) does not apply to you. And while 1) is still problematic, it will be just manageable because you use these building blocks mostly on their happy path.
But I have bad news for you. This kind of software will soon be written not by you, but by an AI.
I claim that the opposite is true: The reason there are so few software projects that are really great is because high-level skills, not low-level ones, are in short supply. Every CS graduate can implement Paxos. The number of people with the experience needed to take an implementation of Paxos, and many other components, and put them together into something that works really well is much, much smaller.
AI is actually much better at writing low-level code than high-level code. If you tell it to implement a piece table or a Fibonacci heap, it will do so without problems, in any language. If you tell it to create a simple CRUD application using existing libraries, it will output something that doesn't actually work.
I guess it all depends on your definition of low-level. High-level can also be low-level, just with different building blocks, if the building blocks are based on well-specified abstractions.
I am actually currently writing a novel text editor, and I can guarantee you, my requirements mean that an AI cannot help me with that directly, because the underlying mechanisms I need don't exist yet. The AI is still helpful though when researching existing mechanisms, and to pinpoint why they don't work for me.
If your low-level code is just a copy or adaptation of something existing, then yes, AI is really good at that. Something novel though, AI cannot help you much here.
On the other hand, simple CRUD applications will all be done by AI very soon. Probably not using existing libraries though, because these are crap.
But there is certainly a role for humans currently in high-level design, I agree with you here. But this high-level design crucially relies on your ability for low-level design, with the ability to imagine and then implement (possibly with the help of AI) the building blocks you need, not just reuse the ones that are there.
You have to be pretty smart to even think of an idea that can't be done with existing building blocks.
It seems to show up more in hobby stuff, or in cutting edge projects.
I'm always excited whenever I see any kind of algorithm work, because it's such a novelty. Most everything I do is just a UI around a few libraries.
There is a world of difference between "can be done" and "can be done well". Existing building blocks are typically good for the former but often not the latter, or you have to exercise a good amount of judgement when piecing things together.
Existing libraries for writing CRUD apps are anything but crap.
If we can’t create special libraries that work well for common tasks then as a species we are too dumb.
Which ones are your favourite libraries, let's say for a TypeScript only code base?
If you feel like telling us about your novel approach to text editing, I'd like to hear it (or I'll just wait for the announcement post. That's cool too).
I have lived evidence that not nearly every CS graduate even knows about the topic that Paxos addresses.
Where in the world can you get a computer science degree without hearing about consensus algorithms?
Pretty much everywhere below a country's top 3 or top 5?
I've got a master's degree and I don't think anyone even mentioned the word "paxos" or "raft" during my 5 years there.
Fortunately the internet is a thing and I could read something about this topic.
I know graduates from various public schools, and the people who have heard about Paxos (let alone can implement it) are a tiny percentage.
You’ve just shifted the goalpost twice here. I never learned about Paxos, but I did learn about consensus algorithms. Second, learning about something, and implementing something are completely different. Most CS graduates have an idea of how syntax parsing works. How many can implement a parser? What about a syntax highlighter? Most graduates have an idea of how an OS works. How many can build an OS?
I guess if you’re always using libraries though, you may mistakenly be thinking that these libraries are just doing trivial work. Once you dive in and try to implement the low level stuff, then you realize how big of a disconnect there is between a fuzzy idea in your head and the lines of code that constitute that idea in reality.
Is all of this satire?
Just plain old elitism.
Press X to doubt
I think those who deal with OS kernels, compilers etc. on a daily basis also understand those points. They use various tools to make code review, time management etc. easier.
Not all hackers are overly obsessed with the most efficient Big O and various low level (sorry pun intended) details.
A NAND gate is low level. Big O is just theory that anyone should know if he's worth something as a programmer.
Most of the time, big O is not relevant if you're not writing a new algorithm. For the usual CRUD app? Not so much. Heck, even for more involved work you will not touch it, because your solution will be called by, or will call, some external tools.
The good news is that, while still valuable, you won't have to lose time creating a new broken wheel. The bad news is that this part of CompSci is now mostly fundamental research.
This is just wrong. Understanding the difference between O(1), O(n) etc is essential for literally everyone who writes code. Every single programmer is better with this understanding than without it. You should know the complexity of the code you write - and most of the time that doesn't even require actively thinking about it. If you know the basics of complexity analysis it's just intuitive.
No it isn’t.
It’s a great thing to learn and understand, and essential for designing and maintaining data-intensive systems, but your statement simply isn’t true.
Sure, you don't strictly need it to write working code. But you will very quickly run into situations where you're writing unnecessarily slow code because you don't know what you're doing.
To me, it's essential.
This attitude is precisely why we have computers thousands of times more powerful than we had in the 90s, and yet they perform the same tasks slower.
It becomes very important as soon as you start getting statement timeouts in your db queries.
And then you get needlessly inefficient CRUD apps that take minutes or even hours to generate a simple report because they're doing accidentally quadratic or even cubic work somewhere.
Understanding algorithm costs can be extremely important when building a simple CRUD app because it will help you reason how things will scale with the size of the data you’ll see in the real world.
It’s easy to make things that are accidentally quadratic but are fine until your big customer has 10 times the data and they suddenly aren’t fine anymore.
Doesn’t mean you need to optimise everything, but it doesn’t mean you shouldn’t think about this stuff.
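A minimal Python sketch of the "accidentally quadratic" trap, for illustration only. The function names are hypothetical and not taken from any real codebase. Both functions remove duplicates while keeping order, but the first checks membership against a list, so each check may scan everything seen so far.

```python
def dedupe_quadratic(items):
    """Order-preserving dedupe. `x not in seen` scans a list, so the
    total work is O(n^2) when most items are distinct."""
    seen = []
    for x in items:
        if x not in seen:      # linear scan of `seen`
            seen.append(x)
    return seen


def dedupe_linear(items):
    """Same result, but membership is checked against a set, so the
    total work is O(n) on average."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:      # average O(1) hash lookup
            seen.add(x)
            out.append(x)
    return out


def worst_case_comparisons(n):
    """Element comparisons done by dedupe_quadratic on n distinct items:
    0 + 1 + ... + (n - 1)."""
    return n * (n - 1) // 2
```

With all-distinct input, going from 1,000 to 10,000 rows takes the list version from 499,500 to 49,995,000 comparisons, roughly 100 times the work for 10 times the data, which is the "big customer" scenario above.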
I don’t really see how one could have any hope of performing engineering with any sort of rigor while throwing away big-O.
Then again, big-O is useless in many cases because real computers have too many arbitrary performance thresholds.
I suspect software engineering is impossible, or at least, nobody has made the model required to do it.
Going from O(n) to O(1) on an operation where n is 10 could be a performance downgrade. That doesn't mean it's useless, it just means there's more to it than the big O.
Asymptotic complexity is about how things scale up. You should care about it when you're working with things that scale. Doesn't mean it's useless if your data is static, just means you need to understand when to go for the high scaling solution and when not to.
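To make the "n is 10" point concrete, here is a toy cost model in Python. The per-operation costs (2 ns per scanned element, 50 ns for a hash-based lookup) are made-up numbers chosen only to show where the crossover sits; real constants depend on the machine, the data and the implementation.

```python
def linear_scan_cost_ns(n, per_item_ns=2.0):
    """O(n): cost grows with the number of elements scanned.
    per_item_ns is an assumed constant, not a measurement."""
    return per_item_ns * n


def hashed_lookup_cost_ns(fixed_ns=50.0):
    """O(1): cost does not depend on n, but the constant is larger.
    fixed_ns is an assumed constant, not a measurement."""
    return fixed_ns


def cheaper_lookup(n):
    """Return which strategy the toy model predicts is cheaper for size n."""
    if linear_scan_cost_ns(n) < hashed_lookup_cost_ns():
        return "scan"
    return "hash"
```

Under these assumed constants, `cheaper_lookup(10)` picks the O(n) scan (20 ns against 50 ns), while `cheaper_lookup(1000)` picks the O(1) lookup (2,000 ns against 50 ns). Big O tells you which one wins eventually, and the constants tell you where "eventually" starts.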
It can be useful in general, but it is hard to provide the type of tolerances expressed in physical units you’d expect in an engineered solution when all the constants are ignored.
I'm not sure what you mean. Sounds a bit like you might be misunderstanding what we're talking about.
It is a bit rude to become confused and then assume the other person was the one with the misunderstanding.
I'm just curious about what you're talking about. Tolerances and physical units are not usually mentioned in relation to time/space complexity. I'm open to the idea that you may know something I don't, feel free to enlighten me.
if your input data doesn't scale then either
① you're writing a real-time control program such as a motor controller and you care about not just the asymptotic complexity of your algorithms but their worst-case execution time, in microseconds; or
② you should just do the computation by hand with pencil and paper instead of writing and debugging a program to do it for you
at the point that the input data becomes too big for ② to be an appealing option, n is at least 100, which means you probably care a lot about whether your program is O(n) or O(n⁴), at least if you wrote it in something slow like python
maybe the time to start worrying about it is after you get the program debugged and running for the first time
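A rough back-of-the-envelope version of the n = 100 comparison, as a Python sketch. The throughput of 10 million basic steps per second is an assumed ballpark for an interpreted loop, not a benchmark, and the helper names are hypothetical.

```python
def step_count(n, exponent):
    """Basic steps for an algorithm whose work grows as n**exponent."""
    return n ** exponent


def estimated_seconds(steps, steps_per_second=10_000_000):
    """Convert a step count to seconds under an assumed throughput."""
    return steps / steps_per_second


n = 100
linear_steps = step_count(n, 1)    # 100 steps
quartic_steps = step_count(n, 4)   # 100_000_000 steps
```

Under that assumption the O(n) version finishes in about 10 microseconds and the O(n⁴) version takes about 10 seconds, a factor of a million at an input size that barely rules out pencil and paper.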
So you say software engineering is not real engineering? I will drop this link here:
https://youtu.be/RhdlBHHimeM?feature=shared
It is an hour long and he’s a good speaker, but what’s his point, and where does he get to it?
Notice it said programmer, not software engineer.
Speaking for myself, software engineering is something I do because I have bills to pay. Programming is something I started doing because it’s fun. The projects in this article are the types of projects I would do for fun if I had the time.
Sudden clarity.
I feel like printing it and putting it next to my workstation, so that I remember this both when doing something fun and when having to work so that my bills get paid.
Yeah, this is a very important realization to make in a career of a software engineer.
Other points to internalize:
- in business, software is made to either earn money or pay less money. All other effects are at best secondary
- code is a liability as much (or more) as it is an asset
- programming is the easy 0-20% of the job of the software engineer, the rest is making sure the right thing gets programmed
- no matter what they tell you it’s a people problem
- software is always broken but it keeps running as long as practitioners are keeping it within the operational envelope
I like watching Tsoding in my spare time and I love what he calls "recreational programming". That is what we need imho, more recreational programming.
Also, I'd add that doing HoTT or other such "high theory" stuff actually enhances my understanding of the fundamentals and gives me the machinery to handle complexities systematically that building a mere web search engine would not. I already have a job for building such commodity stuff everyday.
Same for me. At work I strive to do stuff fast and in a way that is easy to maintain, modify and extend, to see the big picture, take the right kind of shortcuts, and understand the business and customers, but that is not where the fun is.
I enjoy the time when I am alone with the PC, far from user stories and scrums.
I love to program, to convince the PC to do the trick I tell it to do. What I enjoy also is CS, algorithms, data structures, abstract thinking.
If you want to learn how things scale across a team and last for years, read or contribute to open source code.
It takes years for a single person to get a project to the point where it's a good learning ground for scaling and maintenance.
Gluing a few libraries together is real software engineering but unless you're really invested in the outcome it's not that engaging and it's not that educational.
That's only true if complex systems don't interest you.
Personally, I have always found the experience of "putting the pieces together" and orchestrating highly diverse systems into a coherent whole to be much more educational than learning about algorithmic details. I also generally find making things work well more interesting than making things work.
Gluing a few libraries together will certainly produce a complex system.
So if I'm understanding your argument correctly, if you enjoy it, it's what everybody else should do. The other points are just post-hoc justification.
I think your view of low-level projects from scratch is simply wrong, if you describe them as 'algorithmic details'.
Do you have any good resources for what you mentioned?
I wish. As you can plainly see in this thread, most "real programmers" consider such things beneath them, and the scarcity of relevant resources is a natural consequence.
Nobody considers them beneath them, you just found a straw man and keep doubling down on it.
Have good critical thinking skills and you’re pretty much there.
Yea, no thanks.
If I'm going to work on something in my spare time, I am going to work on something that I am interested in and find fun. I generally loathe the dirty work often involved in software engineering and debugging all of these dependencies all the time. It's exhausting. Doing things from scratch puts me in charge.
And I mean, creating a search engine is the ultimate "not invented here" project. I'd just pay for and use the Google, Bing, or whatever other API to do things for me. So it doesn't even make sense in the context of your objections.
Except that none of the established search engines actually does full-text search (anymore). There is no way to prevent those engines from breaking apart and rewriting your query as they see fit. No matter how many quotation marks you put around it, and even if you enable "Verbatim" mode or whatever.
So if you just implement the option to do that (which is actually the default if you leverage a standard FTS library), you already have something that AFAIK you can't get anywhere else. Then you can work on kicking the blogspam out of your index, and before long you will have a search engine of real value for yourself (and possibly others).
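As a sketch of what "leveraging a standard FTS library" could look like, here is a tiny verbatim phrase search using SQLite's FTS5 extension from Python. This assumes the local `sqlite3` build has FTS5 enabled, which is common but not guaranteed. The table name `pages`, the column names and the helper functions are hypothetical.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 index from (url, body) pairs.
    The url column is stored but not indexed, so only body text matches."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url UNINDEXED, body)")
    conn.executemany("INSERT INTO pages (url, body) VALUES (?, ?)", docs)
    return conn


def phrase_search(conn, phrase):
    """Return urls whose body contains the phrase as consecutive tokens.
    Wrapping the phrase in double quotes makes FTS5 treat it as one
    phrase; embedded double quotes are escaped by doubling them."""
    quoted = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT url FROM pages WHERE pages MATCH ? ORDER BY rank",
        (quoted,),
    )
    return [row[0] for row in rows]
```

The engine matches the tokens you typed, in the order you typed them, with no synonym expansion or query rewriting. The default tokenizer still folds case, so "Text Editor" and "text editor" are treated the same.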
Or, you can build a crappy clone of a text editor from the 70s.
How much do you think hosting would cost?
This sounds like an excellent project! I look forward to following your work on GitHub. Everybody needs a crappy search engine from the 90s!
I don't understand why you believe low-level "intrusive" programming and general software engineering are mutually exclusive. There is a big, open field for high quality, low-level software to be written well bearing good practices and practical decisions.
1. To be fair, these kinds of "hacking" projects aren't technically meant for learning software engineering in the context of a job, but they are really interesting and a treasure trove of new knowledge for anyone willing to put in the effort. I don't think it's a bad thing. A person can learn to be a good software developer and simultaneously have deep knowledge of the systems they are working with.
2. The goal of this project is not to research and implement some novel algorithm to replace existing works. The goal is to learn how things work. You are still going to use libraries, procedures and algorithms designed by other people and compile them into a cohesive system, it is merely the depth and complexity of the problem that makes it an interesting learning challenge. It's not about some absurd NIH philosophy, it's about curiosity.
3. Software development is not as straightforward as you make it seem. It's not just about best practices, management and finding an optimal way to combine existing libraries into your project. Sometimes you run into unforeseen issues, strange niches that your libraries may not account for. Sometimes you need to dive into the specifics of these libraries, understand at a lower level why they do not function for your use case. You might need to "hack" these libraries. Fork and customise them to truly fit in your project and produce the desired results. And the only thing that will help you then is deep knowledge and the ability to work with low level software.
Hmm, while I don't fully agree with his comment,
your comment reminded me how much I hate discussing programming languages and programming ecosystems with C people.
It feels like C/low lvl people often refer to some "standard" that they treat as a bible of programming languages even when nobody mentions C and they treat GCC/CLANG as a perfect reference for compilers.
So, while high level devs don't understand low lvl stuff, I believe low lvl programmers aren't creative at a very abstract level, at very abstract system modeling, cuz they're too used to modeling everything with integers in C.
And they're so used to being hurt by the C ecosystem that they believe it is "normal".
High level people (web developers) spend a lot of time arguing about their buzzwords like microservices, ddd, oop vs fp, etc.
Honestly, that is fair. My perspective on this is as a high-level developer, looking into how the things I am using daily actually work. I haven't actually interacted with any proper low-level developer, kind of hard to come across them where I am tbh, so I don't have any clue about how snobbish they can be, but I have seen a fair share of high level devs do it too, as you mentioned. I kind of get it? It's like a way to get validation on the technology you are stuck with for the rest of your working lives, I'm guilty of pushing Flutter onto people despite having had a painful experience with it (state management is a nightmare).
But hey, ultimately I just think people should satisfy their curiosities. My work is almost always in Python, JS or Java, but at some point I got tired of just doing regular software development stuff, and lately I've happened to gain interest in low level development. (Like any definitely sane person I chose to start with Rust and decided my first ever Rust project should be an emulator, so maybe I'm just dead inside and I don't know it yet)
There is a need for both "solving complex problems alone" and "learning how to work as a team developing a complex project within a timeline". The Software Engineering part is best learnt in industry under a mentor.
The tooth extraction skill is fundamental if you are a dentist. You learn it in school. Knowing how to market and sell that skill is valuable but you can learn it while working.
CS I learned in school, engineering I learned while working.
This will be a wonderful feature for copilots, to allow for the human to provide a verbal narrative of the logic/process or even the environment and of others in the room commenting on why a such-and-such is being done, and the copilot elegantly documenting the narrative.
Wait until we have "Thick Code" (where an app comes with an AI generated Mini-Documentary-Series on the creation of the piece (as an optional function of your enterprise Copilot Agent (With the [Security Assistant] [Compliance Assistant] [Legal Assistant] [Media Assistant] and whatever Alignment Agents are required and inform the narrative of the project.
EDIT: @Tao3300 - I do that all the time. You just need to be a better MeatPT and infer "like, my, opinion, man"
I suffered a fatal error because I expected a few end parentheses there.
Gentlemen, you can't hack in here; this is Hacker News.
I'm shocked, SHOCKED, to find that hacking is going on in here!
one wonders whether you’ve created any of them. having completed two of the items on the list, i can confirm that it makes you better, whichever way you’d like to measure it. as a google engineer told me once: operating systems and compilers are where all the good ideas are. study them! only thing that beats that is probably writing them yourself, even if toy versions.
I'm working on a hobby OS. It's mostly just a fun project, but there's certainly a lot of deciding what to write and what to use, and identifying quality libraries when you want to use something. You can start from nothing and build your own board and bring it up with no software you didn't write or you can use a commercially available board, bios or uefi boot, use an existing boot loader, use an existing libc, etc etc, and explore the specific thing you're interested in, while still earning a lot of systems knowledge.
Deciding when to optimize isn't too hard when nobody is ever going to use it (only optimize run time if something really takes too long, otherwise don't do anything tremendously stupid, but optimize for less development time because this is taking forever)
Readable code would be nice, because sometimes it takes a long time to get back to it, but I've also designed in a real reason to come back and retouch everything (x86 -> amd64; aarch64 as a stretch) and maybe I'll make things reasonable then... Although it's not my strongest skill.
Even most simple OSes become complex. Maybe the dependency chain isn't too long though.
Seems to tick all your boxes, IMHO.
Idk why everyone is piling on this comment, they clearly make a distinction between programming and software engineering.
There are a few things in this comment that are causing noise in my mind.
I think it is a short-sighted definition of SE, but in any case, there is no mention anywhere in the article that this was aimed at software engineers.
Also, although I appreciate the concerns of "Not Invented Here" doctrine, we also have the other side of the spectrum, where we don't know how to do a left-pad anymore [1], and introduce dependencies everywhere.
The third point is that I don't think that having more knowledge of how things work contributes to the "Not Invented Here" mindset. I would argue it is the opposite. When engineers are hungry to learn, they find excuses to feed that appetite at work, but once they know the tradeoffs and the amount of effort it takes to make something work, they will think twice before starting anything from scratch.
[1] https://qz.com/646467/how-one-programmer-broke-the-internet-...
Isn't this the main point of the editor war? Programmers built tools to enhance their programming abilities, and then argued about which tool is the best one...
There is a lot of disagreement towards this idea in the thread, but I really like this project suggestion for someone to make a multi-faceted project that'll help someone "grow up" and it resonates with something I've long felt missing from comp sci education: that it doesn't focus enough on studying and replicating big successful projects.
You mostly learn by doing, true, but you learn by copying what the masters did. Software engineering education feels more like taking classes and writing code for exercises, whereas it probably should have more projects of the form "study this complicated-implementation-of-X and use those techniques in your implementation of Y", where X = Google Search, PostgreSQL, whatever, and Y is a complicated project [for one person or a small team] like your own single-concurrent-user search engine.
I've held this idea for a really long time, in fact, I've not actually looked at comp sci education recently, so maybe this isn't true anymore, or maybe it never has been.
I figured that was the point here. It's nice to follow tutorials for a new language or framework or whatnot, but to really stress and test what you absorbed, you want some non-trivial project with minimal dependencies. The game section exemplifies this the best:
This kind of game won't go on sale, nor even be a portfolio piece for a non-junior engineer, but it's a good enough idea for someone like me who, say, really wants to solidify their Rust skills and show what my and the language's weaknesses are.
After I can get that done, I may feel confident enough to move on to work on a toy renderer based on what I figured out in the game. And then later contributing to a proper engine that already has a lot of groundwork figured out. I'd just be getting in the way if I jumped straight into trying to throw anything but the simplest PRs into Bevy.
You're making a strawman here, based on a very much false dichotomy. The things you list here are all very important.
But something you learn from writing a larger piece of software yourself is that you get to appreciate complexity and "bulk" (for lack of a better term) that you yourself created. You get to ask yourself questions like:
Is all this code needed? (Why is it 400 lines to do X?!) Why did this get so involved/complicated?! How do I progress (eat the elephant)? Can I (teach myself to) make small iterative units of progress every day, for instance?
These kinds of katas are very very useful for a 5-7 year beginner (which I use as a label of respect - the longer we can remain beginners, the longer we learn with an open attitude!)
Honestly this just sounds like me when I first encountered coding puzzles like project euler / leetcode and interview challenge practice problems. I made up excuses like this because I simply didn't want to spend time on lower level problems and instead wanted to feel good sticking to my high level application comfort zone.
I eventually learned to like solving lower level problems and I'm much better off for it. And these days I can tell people who haven't paid their dues because the only solutions they ever come up with are slow quadratic ones no matter the occasion.
No, building your own text editor, compiler, OS, ray tracer is not going to make you a worse engineer nor keep you from learning how to google and evaluate libraries or think critically, nor are those things reserved for an engineer who isn't building those things.
This is the distinction between solving "local" problems and solving problems at scale.
But I think there is a transitional phase where requirements (gathering, documenting), unit testing, UI design (human or programmatical) are essential. Learning these as nuts-and-bolts in any language is essential to becoming skilled in the craft.
This is a really important skill that I find difficult and frustrating. Does anyone have good advice or resources?
Actually reimplementing something that already exists is a great way to get an understanding of it, precisely so you can deal with the questions you mention.
My rule of thumb is that you should always understand at least one level of abstraction down.
I get your point against NIH, but I don't see how doing such a project on your own in your free time could possibly make you a worse programmer. If anything, you would likely learn exactly what it takes to go down the NIH path so strictly and be able to make better choices.
I agree with this, only working on a project with certain expectations of delivery can make you a better software engineer.
I could not disagree more.
If it is not "not invented here", then in the limit it is not invented anywhere, because nobody knows how to build these things anymore, and we have been on that path, more or less.
Perhaps you were visualizing that guy at your job who creates his own libraries and obscure systems within the system before telling anyone, and who is generally a bad peer.
But the universe is much bigger and more diverse than that. People who have programmed their own things, but who do a Google search first and use other software because they know it’s more feature-complete, better known, more trusted and better supported in the future, are the best to work with.
We are all using the products of not invented here syndrome right now, don’t forget that.
To me learning the engineering part is easiest. And also not terribly valuable if you don't know CS fundamentals.
I agree that these are all valuable things for a software engineer to learn, but in my experience, it’s valuable to at least take a crack at the projects listed in the post. I wasn’t under any illusion that I was going to build anything amazing, and it gave me a deeper appreciation of why you don’t try to build everything in-house.