One that is not mentioned here but that I like as a general principle is that you cannot have exactly once delivery. At most once or at least once are both possible, but you have to pick your failure poison and architect for it.
Excellent list; I like the pragmatic and down-to-earth explanations. No buzzwords, no "microservices" (:
I'd say that a good amount of this advice also applies to one-box systems. There can be lots of kinda/sorta distributed sub-components to consider — could be IPC between programs, or even coordination amongst threads in one process. Even the notion of unified memory on one box is a bit of a lie, but at least the hardware can provide some better guarantees than you get in "real" distributed cases.
A lot of the advice where they compare "distributed" to "single-machine" could pretty well apply to "multi-threaded" vs "single-threaded," too.
And on another axis, once you make a program and give it to various people to run, it becomes sort of a "distributed" situation, too — now you have to worry about different versions of that program existing in the wild, compatibility between them and upgrade issues, etc. So things like feature flags, mentioned in the article, can be relevant there, as well.
It's perhaps more of a spectrum of distributedness: from single-CPU to multi-CPU, to multi-computer-tightly-connected, to multi-computer-globally-distributed, with various points in between. And multiple dimensions.
The neighboring universe, so tantalizingly close, where AMD gave us different memory spaces for each chiplet, is something I think about often. Imagine, we could all be writing all our code as beautiful distributed memory MPI programs. No more false sharing, we all get to think hard and explicitly about our communication patterns.
This is already here. It's called NUMA. You can access all memory from any CPU, but accessing memory that's connected to your CPU makes the access faster. NUMA-aware operating systems can limit your process to a CPU cluster and allocate memory from the same cluster, then replicate this on the other clusters, so they all run fast and only transfer data between clusters when they need to.
I'd say that a good amount of this advice also applies to one-box systems.
Nothing in "distributed systems" implies any constraint on deployment. The only trait that's critical to the definition is having different flows of control communicating over a network through message-passing. One very famous example of distributed systems is multiple processes running on the same box communicating over localhost, which happens to be where some people cut their distributed system's teeth.
Good article. I did notice it's 8 years old. Some stuff never changes I guess
8 is the new 11 I see. (Article dates to 2013 ;)
lol
Covid years don’t count.
The article should really mention CALM (Consistency as Logical Monotonicity) [1]; it's much easier to understand and a more fundamental result than CAP. It is also much more applicable and enables people with little experience to build extremely robust distributed systems.
Idempotence, CRDTs, WALs, Raft: they are all special cases of the CALM principle.
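To make "logical monotonicity" concrete, here's a hypothetical grow-only counter (G-counter) CRDT in Python (not taken from the article or the CALM paper): its state only grows and its merge is an element-wise max, so replicas converge no matter how messages are reordered or duplicated.

    # Hypothetical sketch of a grow-only counter (G-counter) CRDT.
    # State only grows, and merge is an element-wise max, so applying
    # updates in any order, any number of times, converges to the same
    # value. That order/duplication insensitivity is the logical
    # monotonicity CALM is about: no coordination needed for consistency.
    class GCounter:
        def __init__(self, node_id: str):
            self.node_id = node_id
            self.counts = {}  # node_id -> count contributed by that node

        def increment(self, n: int = 1):
            self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

        def merge(self, other: "GCounter"):
            # Element-wise max: commutative, associative, idempotent.
            for node, count in other.counts.items():
                self.counts[node] = max(self.counts.get(node, 0), count)

        def value(self) -> int:
            return sum(self.counts.values())

    a, b = GCounter("a"), GCounter("b")
    a.increment(3)
    b.increment(2)
    a.merge(b)
    a.merge(b)  # duplicate delivery changes nothing
    print(a.value())  # 5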
The article predates your paper by 6 years.
Fair, we should add [2013] to the title though.
I looked at the Bloom repo and it seems somewhat stale; do you know if this is something that's still being worked on?
I think a lot has changed since 2013 when this article was written. Back then cloud services were less mature and there were more legitimate cases where you had to care about the theoretical distributed aspects of your backend architecture, although even then these were quickly disappearing, unless you worked at a few select big tech companies like the FAANGs.
But today, in 2024, if you just standardize on AWS, you can use one of its services for pretty much anything. And that AWS service will already be distributed on the backend, for free, in the sense that you don't have to worry about it. Additionally, it will be run by AWS engineers for you, with all sorts of failovers, monitoring, alerting, etc. behind the scenes that will be much better than what you could build yourself.
So these days, for 99% of people it doesn't really make sense to worry too much about this theoretical stuff, like Paxos, Raft, consistency, vector clocks, Byzantine failures, CAP, distributed locks, distributed transactions, etc. And that's good progress; it has been abstracted away behind API calls. I think it's pretty rational to just build on top of AWS (or similar) services, and accept that it's a black-box distributed system that may still go down sometimes, but it'll still be 10-100x more reliable than if you try to build your own distributed system.
Of course, even for the 99%, there are still important practical things to keep in mind, like logging, debugging, backpressure, etc.
Another thing I learned is that some concepts, such as availability, are less important, and less achievable, than they seem on paper. On paper it sounds like a worthy exercise to design systems that will fail over and come back automatically if a component fails, with only a few seconds of downtime. Magically, with everything working like a well-oiled machine. In practice this pretty much never works out, because there are components of the system that the designer didn't think of, and it's those that will fail and bring the system down. E.g. see the recent CrowdStrike incident.
And, with respect to the importance of availability, of the ~10 companies I worked at in the past ~20 years there wasn't a single one that couldn't tolerate a few hours of downtime with zero to minimal business and PR impact (people are used to things going down a couple of times a year). I remember having outages at a SaaS company I worked for 10 years ago; no revenue was coming in for a few days, but then people would just spend more in the following days. Durability is more important, but even that is less important in practice than we'd like to think [1].
The remaining 1% of engineers, who get to worry about the beautiful theoretical AND practical aspects of distributed computing [because they work at Google or Facebook or AWS] should count themselves lucky! I think it's one of the most interesting fields in Computer Science.
I say this as somebody who deeply cares/cared about theoretical distributed computing, I wrote distributed databases [2] and papers in the past [3]. But working in industry (also managing a Platform Eng team), I cannot recall the last time I had to worry about such things.
[1] PostgreSQL used fsync incorrectly for 20 years - https://news.ycombinator.com/item?id=19119991
The downside is you are walking into an inviting mono/duopoly. Things like this and Cloudflare are amazing until the wind changes and the entire internet shares a single failure mode.
In my experience, the biggest risk with AWS and similar is not the dependence itself, it's the cost. With AWS you pay a lot more for the compute than if you just rent the hardware for $25/server with Linux, and then go from there. It does make sense for some shops, where you have scale and can make calculated trade-offs between 100s of engineers' salaries and cloud costs (like Netflix).
But I think a lot of small startups end up overpaying for their cloud costs even though they don't take advantage of the elasticity/scaling/replication/etc.
Also, a lot of BigCos are just on public clouds because that's the thing now, but their stacks, software and teams are so broken, they don't actually take advantage of the cloud. Their engineering teams are mediocre, and so is the software, so it breaks all the time. So they end up paying for something they can't take advantage of, because they don't need to scale, they don't need high uptime and/or can't achieve it because their stuff running on top of the cloud is a horrible unreliable spaghetti anyway.
If I were to do a SaaS-y startup today, I'd just rent cheap dedicated hardware; I'd use OVH [1]. I'd start with 1 reasonably beefy server for like $25-50/mo and see how far it gets me. I would use S3 for unlimited, reliable storage (maybe even RDS for the DB), but everything else I'd keep out of AWS, running on my cheap server. I.e. I'd keep my data in AWS, because that's super cheap and worth it.
Later, if the thing is super successful, I'd go to $250-500/mo and get 5-10 reasonably beefy servers, and start to split things apart. I'd still not worry too much about replication and such; I'd just do backups, and take the hit if there's a problem and restore from the last backup point. I think this would get me pretty far, all the way to when I want to bring BigCo customers aboard who need all sorts of ISO standards and such. At that point the whole thing will stop being fun anyway and it's time to let other people worry about it..
And in terms of hiring, I'd make it clear at interview time that this is the stack, and we're not going to do microservices and Kubernetes and all that crap just because others are doing it, and I'd hire from the remaining pool.
[1] I've been on it for my personal devboxes for ~7 years, it just works
Is "Young Bloods" a common term for beginner/newbie?
https://www.merriam-webster.com/dictionary/youngblood
young· blood
1: a young inexperienced person
especially : one who is newly prominent in a field of endeavor
2: a young African American male
The first known use of youngblood was in 1602
So it's a bit archaic but not abnormal. It has been used as a surname, too.
If you can fit your problem in memory, it’s probably trivial.
A corollary: "in-memory is much bigger than you probably think it is."
I thought I knew what a large amount of RAM was, and then all the major clouds started offering 12TB VMs for SAP HANA.
edit: this seems like it's touched on very briefly with "Computers can do more than you think they can." but even that only talks about 24GB machines (admittedly in 2012, but still, I'm sure there were plenty of machines with 10x that amount of RAM back then)
Even comparatively senior engineers make this mistake relatively often. If you're a SaaS dealing with at most 100GB of analytical data per customer, (eventually, sharded) postgres is all you need.
Distributed systems are different because they fail often
The key here is not just the rate of failure, but the rate of failure in a system of multiple nodes.
And - "distributed systems problems" don't only arise with several servers connected by a network. Any set of nodes with relations between them - files on disk linked logically, buffers on different IO devices - these are also going to face similar problems.
Absolutely. In fact, it's a class of problems that can - and do - arise on any software system comprising more than a sole single-threaded process that's been locked in memory.
Some old-timers love to scoff at the inordinate amount of complexity that comes from mitigating these issues, and will complain that it would all be so much simpler if you would just run your software on a single server.
In reality, that was barely true even back in the AS/400 or VAXft days - and even then it didn't apply to the rather more chaotic multi-user, multi-process Unix world.
I share this doc with the most promising people I get to work with.
When I worked at Lookout, Jeff Hodges shared this essay as a presentation, and ended it with a corollary: don't pretend that engineering isn't political. People that think that the code speaks for itself are missing out on important aspects of how to influence the way things are built and how to truly get results.
Ten years later, and there are few people who still so concisely understand the intersection of engineering leadership and those table-stakes capabilities we normally associate with SRE / DevOps.
there are few people who still so concisely understand the intersection of engineering leadership and those table-stakes capabilities we normally associate with SRE / DevOps.
I'm curious what else is good to read about this topic, if anything comes to your mind?
(2013)
Previous discussions from way back:
https://news.ycombinator.com/item?id=5055371
346 points|jcdavis|12 years ago|42 comments
https://news.ycombinator.com/item?id=12245909
386 points|kiyanwang|8 years ago|133 comments
I guess most of it's still highly relevant since not enough people have clamored for the (2013) tag.
Past comments: https://news.ycombinator.com/item?id=23365402
(2013)
I had the pleasure to briefly work with the author of this post within the last few years. Jeff was one of the most enlightening and positive people I've ever learned from. He was refreshingly honest about what challenges he was having, and delightfully accessible for mentorship and advice.
Yes, but in practice this is not a problem, because the bits that are impossible are so narrow that turning at-least-once into exactly-once is easy enough that it's offered as a service by cloud vendors: https://cloud.google.com/pubsub/docs/exactly-once-delivery
Those only work because they have retries built in at a layer that runs inside your service. You should understand this because it can have implications for the performance of your system during failures.
For example, you can have an exactly once implementation that is essentially implemented as at least once with repeated idempotent calls until a confirmation is received. Idempotency handling has a cost, confirmation reply has a cost, retry on call that didn’t have confirmation has a cost, etc.
It's often not a problem because it's often easy to make a call idempotent. Consider the case where you attach an ID to every event and the subscriber stores data in postgres. You stick everything in a transaction, persist each event's ID, add a unique index to that column, handle failure, and bang, it's now idempotent.
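A minimal sketch of that pattern in Python, assuming psycopg2 and made-up table/column names (processed_events, accounts), not the poster's actual schema:

    # Hypothetical idempotent consumer. The unique constraint on
    # processed_events.event_id turns a redelivered event into a no-op,
    # inside the same transaction that applies the event's effects.
    import psycopg2

    def handle_event(conn, event_id: str, payload: dict):
        with conn:  # commits on success, rolls back on exception
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO processed_events (event_id) VALUES (%s) "
                    "ON CONFLICT (event_id) DO NOTHING",
                    (event_id,),
                )
                if cur.rowcount == 0:
                    return  # already processed: drop the duplicate
                # Apply the event's effects in the same transaction.
                cur.execute(
                    "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                    (payload["amount"], payload["account_id"]),
                )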
And if that service called an external system but failed before committing the transaction? I’m not sure you should be using db transactions in distributed systems as you can’t recover from partial failures.
"Redelivery versus duplicate" is doing quite a lot of work in there. This is an "at least once" delivery system providing building blocks that you can use to cope with the fact that it's physically impossible to prevent redelivery under some circumstances, which are not actually that rare because some of those circumstances are your fault, not Google's, etc.
My experience is that building streaming systems using “exactly-once delivery” primitives is much more awkward than designing your system around at-least-once primitives and explicitly de-duplicating using last-write-wins. For one thing, LWW gives you an obvious recovery strategy if you have outages of the primitive your system is built on; a lot of the exactly-once modes for these tools make failure recovery harder than necessary.
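For illustration, a hypothetical last-write-wins register in Python (nothing from the parent's actual system): replaying an at-least-once stream through it in any order converges to the same state, because only the largest (timestamp, writer) pair wins.

    # Hypothetical last-write-wins (LWW) register. A write only takes
    # effect if its (timestamp, writer_id) is the largest seen, so
    # duplicates and reordering don't change the final state. The
    # writer_id tie-breaker keeps ties deterministic.
    class LWWRegister:
        def __init__(self):
            self.value = None
            self.stamp = (float("-inf"), "")  # (timestamp, writer_id)

        def apply(self, value, timestamp: float, writer_id: str):
            stamp = (timestamp, writer_id)
            if stamp > self.stamp:
                self.stamp = stamp
                self.value = value

    reg = LWWRegister()
    events = [("v1", 1.0, "a"), ("v2", 2.0, "b"), ("v1", 1.0, "a")]  # duplicate
    for value, ts, writer in reversed(events):  # delivered out of order
        reg.apply(value, ts, writer)
    print(reg.value)  # v2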
at-least-once/exactly-once/at-most-once delivery are all weirdly named from the wrong perspective. From the sender's perspective there are only two options: send once and send lots. Behold:
- you send a message
- you receive nothing back
- what now
There is no algorithm that lets you implement exactly-once delivery in the face of delivery instability. Either you don't resend and you implemented at-most-once, or you resend and you implemented at-least-once.
You might say, "but hey, the receiver of course sends acks or checkpoints; I'm not a total buffoon". Sure. Let's game that out:
- you send message 8
- you get an ack for message 7
- you receive no more acks
- what now
Every system you'll use that says it implements exactly-once implements send lots and has some mechanism to coalesce (i.e. make idempotent) duplicate messages.
Apache Flink does provide end-to-end exactly-once guarantees when coupled with data sources and data sinks that participate in its checkpointing mechanism. See:
- An Overview of End-to-End Exactly-Once Processing in Apache Flink (with Apache Kafka, too!) — https://flink.apache.org/2018/02/28/an-overview-of-end-to-en...
- Flink's Fault Tolerance Guarantees — https://nightlies.apache.org/flink/flink-docs-release-1.20/d...
Exactly once processing != exactly once delivery
And on top of that, this _end-to-end exactly-once guarantee_ makes the result look like the message was processed in the pipeline only once, while in reality a message may be processed twice (or more) in the pipeline in case of a temporary failure; only one version will be committed to the external system.
But it’s mostly the processing that is interesting. Like this message I write now might see several retransmissions until you read it, but it’s a single message.
*Between two distributed systems that don't share the same transactionality domain or are not logically monotone.
It's easy to see that moving data from one row to another is doable in a clustered database, and could be interpreted as a message being delivered.
The point is that you can get exactly once delivery, if your whole system is either idempotent, or you can treat the distributed system as one single unit that can be rolled back together (i.e. side effect free wrt. some other system outside the domain).
Both are cases of some form of logical monotonicity, idempotence is easier to see, but transactionality is also based on monotonicity through the used WALs and algorithms like Raft.
The article should really mention CALM (Consistency as Logical Monotonicity); it's much easier to understand and a more fundamental result than CAP. https://arxiv.org/pdf/1901.01930
If you have exactly-once delivery, there's no difference between being idempotent or not. The only effect of idempotence is to ignore extra deliveries.
If idempotence is a requirement, you don't have exactly-once delivery.
Idempotence allows you to build systems that in their internal world model allow for exactly once delivery of non-idempotent messages over an at least once medium.
Exactly-once delivery semantics doesn't equal exactly-once physical messages.
Can’t stress this enough. Thank you for mentioning it!! In my career I’ve encountered many engineers unfamiliar with this concept when designing a distributed system.
tbf, distributed systems aren't exactly easy or common, and each of us either learns from others or learns it the hard way.
For example, and I don't mean to put anyone on the spot, Cloudflare blogged they'd hit an impossibly novel Byzantine failure, when it turned out to be something that was "common knowledge": https://blog.cloudflare.com/a-byzantine-failure-in-the-real-... / https://archive.md/LK8FI
Yeah it’s very common since it becomes an issue as soon as you get sockets involved in an app. A lot of frontend engineers unknowingly end up in distributed system land like this. Coordinating clients can be just as much a distributed systems challenge as coordinating servers—often it’s harder.
It's impossible to have at least once delivery in an environment with an arbitrary level of network failure.
Sure you can: keep sending one message per second until you receive a response from the recipient saying "OKAY I HEARD YOU NOW SHUT UP!"
You can have at-least-once-or-fail though. Just send a message and require an ack. If the ack doesn’t come after a timeout, try n more times, then report an error, crash, or whatever.
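A hypothetical Python sketch of that "at-least-once-or-fail" loop; send_once and wait_for_ack are stand-ins for whatever transport is actually in use, not a real API. Note that a timeout doesn't prove the message wasn't delivered (the ack itself may have been lost), which is exactly why the receiver still has to de-duplicate.

    # Hypothetical "at-least-once-or-fail" sender. send_once() and
    # wait_for_ack() are placeholders for the real transport. A timeout
    # does NOT mean the message wasn't delivered; the ack may have been
    # lost, so the receiver must still tolerate duplicates.
    import time

    class DeliveryFailed(Exception):
        pass

    def send_with_retries(send_once, wait_for_ack, message_id, payload,
                          attempts=5, timeout=1.0):
        for _ in range(attempts):
            send_once(message_id, payload)
            deadline = time.monotonic() + timeout
            while time.monotonic() < deadline:
                if wait_for_ack(message_id):
                    return  # acked: delivered at least once
                time.sleep(0.05)
        raise DeliveryFailed(f"no ack for {message_id} after {attempts} attempts")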
The important part of this lesson is "and you don't need it".