How I write HTTP services in Go after 13 years

mtlynch
122 replies
21h34m

> The Valid method takes a context (which is optional but has been useful for me in the past) and returns a map. If there is a problem with a field, its name is used as the key, and a human-readable explanation of the issue is set as the value.

I used to do this, but ever since reading Lexi Lambda's "Parse, Don't Validate," [0] I've found validators to be much more error-prone than leveraging Go's built-in type checker.

For example, imagine you wanted to defend against the user picking an illegal username. Like you want to make sure the user can't ever specify a username with angle brackets in it.

With the Validator approach, you have to remember to call the validator on 100% of code paths where the username value comes from an untrusted source.

Instead of using a validator, you can do this:

    type Username struct {
      value string
    }

    func NewUsername(username string) (Username, error) {
      // Validate the username adheres to our schema.
      ...

      return Username{username}, nil
    }

That guarantees that you can never forget to validate the username through any codepath. If you have a Username object, you know that it was validated because there was no other way to create the object.
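A fuller version of that snippet might look like the sketch below. The concrete rules (non-empty, no angle brackets), the `ErrInvalidUsername` name, and the `String` accessor are assumptions made for illustration; the comment above leaves the schema check elided.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Username wraps a string that has passed NewUsername's checks.
// The field is unexported, so code in other packages cannot set it directly.
type Username struct {
	value string
}

// ErrInvalidUsername is a hypothetical error value for this sketch.
var ErrInvalidUsername = errors.New("invalid username")

// NewUsername parses raw input into a Username. The rules used here
// (non-empty, no angle brackets) are illustrative assumptions.
func NewUsername(raw string) (Username, error) {
	if raw == "" || strings.ContainsAny(raw, "<>") {
		return Username{}, ErrInvalidUsername
	}
	return Username{value: raw}, nil
}

// String returns the underlying value for display or storage.
func (u Username) String() string { return u.value }

func main() {
	for _, raw := range []string{"mtlynch", "<script>"} {
		u, err := NewUsername(raw)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", raw, err)
			continue
		}
		fmt.Printf("%q accepted as %s\n", raw, u)
	}
}
```

Functions that need a username can then take a `Username` parameter instead of a `string`, so an unparsed value produces a compile error at the call site.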

[0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

xboxnolifes
24 replies
20h59m

Crazy that actually using your type system leads to better code. Stop passing everything around as `string`. Parse them, and type them.

stickfigure
18 replies
20h6m

There's a name for this anti-pattern: "Stringly typed"

tdeck
8 replies
16h41m

This term is typically used to refer to things like data structures and numerical values all being passed as strings. I don't think a reasonable person would consider storing a username in a string to be "stringly typed".

samatman
4 replies
16h3m

The One True Wiki[0] says "Used to describe an implementation that needlessly relies on strings when programmer & refactor friendly options are available."

Which is exactly what's going on here. A username has a string as a payload, but that payload has restrictions (not every string will do) and methods which expect a username should get a username, not any old string.

[0]: https://wiki.c2.com/?StringlyTyped

tdeck
3 replies
12h36m

I don't agree that this example is more "programmer friendly". Anything you want to do with the username other than null check and passing an argument is going to be based directly on the string representation. Insert into a database? String. Display in a UI? String. Compare? String comparison. Sort? String sort. Is it really more "programmer friendly" to create wrapper types for individual strings all over your codebase that need to have passthrough methods for all the common string methods? One could argue that it's worth the tradeoff but this C2 definition is far from helpful in setting a clear boundary.

Meanwhile the real world usages of this term I've seen in the past have all been things like enums as strings, lists as strings, numbers as strings, etc... Not arbitrary textual inputs from the user.

cwilkes
2 replies
12h11m

You inherit some code. Is that string a username or a phone number? Who knows. Someone accidentally swapped two parameter values. Now the phone number is a username and you’ve got a headache of trying to figure out what’s wrong.

By having stronger types this won’t come up as a problem. You don’t have to rely on having the best programmers in the world that never make mistakes (tm) to be on your team and instead rely on the computer making guard rails for you so you can’t screw up minor things like that.
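As a rough sketch of that guard rail in Go: the `Username` and `PhoneNumber` types and the `Notify` function below are hypothetical names, and validation is left out to keep the focus on the swapped-argument case.

```go
package main

import "fmt"

// Username and PhoneNumber are hypothetical wrapper types. Both hold a
// string, but the compiler treats them as unrelated types.
type Username struct{ value string }
type PhoneNumber struct{ value string }

// Notify is a hypothetical function that needs one of each.
func Notify(user Username, phone PhoneNumber) string {
	return fmt.Sprintf("texting %s at %s", user.value, phone.value)
}

func main() {
	user := Username{value: "cwilkes"}
	phone := PhoneNumber{value: "+1-555-0100"}

	fmt.Println(Notify(user, phone))

	// Swapping the arguments is a compile error rather than a runtime
	// surprise, because PhoneNumber cannot be used as Username:
	// Notify(phone, user)
}
```

With two plain `string` parameters, the swapped call would compile and the mistake would only show up at runtime.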

quickthrower2
1 replies
6h42m

I agree on the one hand but empirically I don’t think I have seen a bug where the problem was the string for X ended up being used as Y. Probably because the variable/field names do enough heavy lifting. But if your language makes it easy to wrap I say why not. It might aid readability and maybe avoid a bug.

I would probably type to the level of Url, Email, Name but not PersonProfileTwitterLink.

baq
0 replies
1h30m

I’ve refactored a large js code base into ts. Found one such bug for every ~2kloc. The obvious ones are found quickly in untyped code, the problem is in rare cases where you e.g. check truthiness on something that ends up always true.

antonvs
2 replies
15h5m

It definitely is stringly typed. It's just that it's a very normalized example of it, that people don't think of as being an antipattern.

If you want to implement what Yaron Minsky described as "make illegal states unrepresentable", then you use a username type, not a string. That rules out multiple entire classes of illegal states.

If you do that, then when you compile your program, the typechecker can provide a much stronger correctness proof, for more properties. It allows you to do "static debugging" effectively, where you debug your code before it ever even runs.

wruza
1 replies
7h5m

I don’t get what you’re about. The root comment clearly presents a structure of a separate type. The fact that it happens to contain a single string field is completely irrelevant (what type an actual username should be, a float?). “Stringly typed” is about stringifying non-string values to save typing work and is not applicable here in the slightest.

quickthrower2
0 replies
6h47m

I wasn’t sure who was right. I’ll tie break with https://wiki.c2.com/?StringlyTyped= which pretty much says what you just said

zeroCalories
5 replies
15h37m

I've also seen it called primitive obsession, which is also applicable to other primitive types like using an integer in situations where an enum would be better.
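For the integer case, the usual Go idiom is a defined integer type with `iota` constants. The sketch below uses a hypothetical `Status` type and `ParseStatus` helper; note that Go neither checks exhaustiveness nor stops someone from writing `Status(42)`, so this is weaker than an enum in languages that have one.

```go
package main

import "fmt"

// Status is a hypothetical enum-like type built on int.
type Status int

const (
	StatusPending Status = iota
	StatusActive
	StatusClosed
)

// ParseStatus converts an untrusted integer into a Status, rejecting
// values outside the declared constants.
func ParseStatus(n int) (Status, error) {
	s := Status(n)
	switch s {
	case StatusPending, StatusActive, StatusClosed:
		return s, nil
	}
	return 0, fmt.Errorf("unknown status %d", n)
}

// String gives each known value a readable name.
func (s Status) String() string {
	switch s {
	case StatusPending:
		return "pending"
	case StatusActive:
		return "active"
	case StatusClosed:
		return "closed"
	}
	return fmt.Sprintf("Status(%d)", int(s))
}

func main() {
	s, err := ParseStatus(1)
	fmt.Println(s, err)

	_, err = ParseStatus(7)
	fmt.Println(err)
}
```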

fbdab103
2 replies
13h33m

Definitely used to fall for primitive obsession. It seemed so silly to wrap objects in an intermediary type.

After playing with Rust, I changed my tune. The type system forces you onto the correct path, so a lot of code became boring because you no longer had to second-guess what-if scenarios.

zeroCalories
0 replies
12h15m

Yeah, modern type systems are game changers. I've soured on Rust, but if Go had the full Ocaml type system with match statements I think it would be the perfect language.

xboxnolifes
0 replies
13h17m

> Definitely used to fall for primitive obsession. It seemed so silly to wrap objects in an intermediary type.

A lot of languages certainly don't make it easy. You shouldn't have to make a Username struct/class with a string field to have a typed username. You should be able to declare a type Username which is just a string under the hood, but with different associated functions.
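Go does allow this through a defined type. A minimal sketch, assuming the same illustrative rules as before and hypothetical `ParseUsername` and `Mention` names:

```go
package main

import (
	"fmt"
	"strings"
)

// Username is a defined type whose underlying type is string.
type Username string

// ParseUsername is a hypothetical parser; the rules are assumptions.
func ParseUsername(raw string) (Username, error) {
	if raw == "" || strings.ContainsAny(raw, "<>") {
		return "", fmt.Errorf("invalid username %q", raw)
	}
	return Username(raw), nil
}

// Mention is a method that exists on Username but not on plain strings.
func (u Username) Mention() string { return "@" + string(u) }

func main() {
	u, err := ParseUsername("xboxnolifes")
	fmt.Println(u.Mention(), len(u), err)

	// The trade-off: any package can convert a string directly, which
	// skips ParseUsername entirely.
	bypass := Username("<oops>")
	fmt.Println(bypass)
}
```

Built-ins such as `len`, comparison, and sorting keep working without passthrough methods, but the conversion at the end shows why this form gives a weaker guarantee than a struct with an unexported field.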

skitter
1 replies
15h0m

Sadly enums are too advanced of a concept to be included in Go.

eru
0 replies
14h23m
timcobb
2 replies
19h23m

Bash :(

nsguy
1 replies
17h48m

JSON

tdeck
0 replies
16h41m

TCL

tuyguntn
2 replies
1h43m

things have different costs.

Types stop you from making some mistakes, but they also limit extensibility. Imagine an enum with 4 values, and you want to add a fifth because one of the services 10 levels deep needs a new value. How does that usually go in strongly typed languages? You go and update every service until the new value has propagated to the lowest level, the one that actually needs it.

Now imagine doing the same with strings: you can validate at the lowest level, and the upper levels just pass the value along as is. If upper layers have conditionals based on the value, they can still limit their logic to the values they know.

ndriscoll
1 replies
11m

Why would you need to update code that isn't matching on the value? It just knows it has an X and passes it to a function that needs an X.

tuyguntn
0 replies
3m

if you don't update the code in intermediate layers, some automated validation based on enum values will fail, which also drops the request

munk-a
1 replies
18h8m

As a PHP developer I am frankly disappointed you think that we only do that with strings. I've got an array[1] full of other tools.

1. Or maybe a map? Those keys might have significance I didn't tell you about.

xboxnolifes
0 replies
18h4m

I originally typed out `int` and wanted to do more, but I try to keep my comments as targeted as possible to avoid the common reply pattern of derailing a topic by commenting on the smallest and least important part of it. If I type `string`, `int`, `arrays`, `maps`, `enums`... someone will write 3 paragraphs about enums are actually an adequate usage of the type system, and everyone will focus on that instead of the overarching message.

the_gipsy
24 replies
19h19m

It's not guaranteed at all, that's where go's zero-values come in. E.g. nested structs, un/marshaljson magic methods etc. How do you deal with that?

stouset
19 replies
18h46m

Every struct requiring its zero value to be meaningful is probably one of the worst design flaws in the language.

randomdata
17 replies
17h46m

There is no such requirement. Common wisdom suggests that you should ensure zero values are useful, but that isn't about every random struct field – only the values you actually give others. Initialize your struct fields and you won't have to consider their zero state. They will never be zero.

It's funny seeing this beside the DRY thread. Seems programmers taking things a bit too literally is a common theme.

stouset
16 replies
17h33m

> Initialize your struct fields and you won't have to consider their zero state.

“Just do the right thing everywhere and you don’t have to worry!”

You can’t stop consumers of your libraries from creating zero-valued instances.
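A small sketch of what the two sides are arguing about. `IsZero` and `Ticket` here are hypothetical additions, and `IsZero` only works because this particular `NewUsername` rejects the empty string.

```go
package main

import (
	"errors"
	"fmt"
)

// Username can only be given a non-empty value through NewUsername.
type Username struct{ value string }

// NewUsername rejects the empty string (an assumption of this sketch).
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username must not be empty")
	}
	return Username{value: raw}, nil
}

// IsZero reports whether u is the zero value, meaning it never came out
// of a successful NewUsername call.
func (u Username) IsZero() bool { return u.value == "" }

// Ticket is a hypothetical struct that embeds the type as fields.
type Ticket struct {
	Requestor Username
	Assignee  Username
}

func main() {
	// Neither of these calls NewUsername, and both compile, even from
	// another package.
	var u Username
	t := Ticket{}

	fmt.Println(u.IsZero(), t.Assignee.IsZero())
}
```

The unexported field blocks `Username{"anything"}` from outside the package, but not `Username{}`, so consumers of the type still need a runtime check like `IsZero` or a documented convention.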

randomdata
15 replies
17h27m

Then the zero value is their problem, not yours. You have no reason to be worried about that any more than you are worried about them not getting enough sleep, or eating unhealthy food. What are you doing to stop them from doing that? Nothing, of course. Not your problem.

Coq exists if you really feel you need a complete type system. But there is probably good reason why almost nobody uses it.

stouset
14 replies
16h26m

> Then the zero value is their problem, not yours.

Except for all those times you're the consumer of someone else's library and there's no way for them to indicate that creating a zero-valued struct is a bug.

Again, it's the philosophy of "Just do the right thing everywhere and you don’t have to worry!" Sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs.

randomdata
13 replies
16h0m

> Except for all those times you're the consumer of someone else's library and there's no way for them to indicate that creating a zero-valued struct is a bug.

Nonsense. Go has a built-in facility for documentation to communicate these things to other developers. Idiomatic Go strongly encourages you to use it. Consumers of the libraries expect it.

> Sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs.

Well, sure. But, like I said, almost nobody uses Coq. The vast, vast, vast majority of projects – and I expect 100% of web projects – use languages with incomplete type systems, making what you seek impossible.

And there's probably a good reason for that. While complete type systems sound nice in theory, practice isn't so kind. There are tradeoffs abound. There is no free lunch in life. Sorry.

the_gipsy
6 replies
7h46m

You don't have to go as far as Coq. Rust manages "parse, don't validate" extremely well with serde.

Go's zero-values are the problem, not any other lack of its type system.

randomdata
5 replies
7h4m

> You don't have to go as far as Coq.

No, you do. Anywhere the type system is incomplete means that the consumer can do something the library didn't intend. Rust does not have a complete type system. There was no relevance to mentioning it. But I know it is time for Rust's regularly scheduled ad break. And while you are at it, enjoy a cool, refreshing Coca-Cola.

> Go's zero-values are the problem

"Sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs." has nothing to do with zero-values. It doesn't even really have anything to do with Go specifically. My, the quality of advertising has really declined around here. Used to be the Rust ads at least tried to look like they fit in.

the_gipsy
2 replies
6h22m

Any language without zero-values (or some equally destructive quality) can do "parse, don't validate". Go cannot. Rust is just an example.

randomdata
1 replies
3h30m

Top of the hour again? Time for another Rust advertisement?

The topic at hand is about preventing library users from doing things the library author didn't intend using the type system, not "what happens if a language has zero-values". Perhaps you are not able to comprehend this because you are hungry? You're not you when you are hungry. Grab a Snickers.

the_gipsy
0 replies
1h41m

what happens if a language has zero-values, is that you can't "parse, don't validate".

Maybe it's time for you to finally try rust? Or any other language without zero-values, since rust seems to irritate you in particular.

stouset
1 replies
3h40m

This insane perspective of “nothing is totally perfect so any improvements over what go currently does are pointless” whenever you confront a gopher with some annoying quirk of the language is one of the worst design flaws in the golang community hivemind.

randomdata
0 replies
3h35m

Tell us, why do you hold that perspective? It's an odd one. Nobody else in this thread holds that perspective. You even admit it is insane, yet here you are telling us about this unique perspective you hold for some reason. Are you hoping that we will declare you insane and admit you in for care? I don't quite grasp the context you are trying to work within.

DandyDev
3 replies
6h28m

You manage to present a strawman and produce a No True Scotsman fallacy all at once in this comment thread.

Nobody is suggesting that Coq should be used, so stop bringing it up (strawman). And yes, Coq might have an even stricter and more expressive type system than Rust. But nobody is asking for a perfect type system (no true Scotsman). People are asking to be able to prevent users of your library to provide illegal values. Rust (and Haskell and Scala and Typescript and ….) lets you do this just fine whereas Golang doesn’t.

And personally I would much rather have the compiler or IDE tell me I’m doing something wrong than having to read the docs in detail to understand all the footguns.

My personal opinion is that - even though I’m very productive with Golang and I enjoy using it - Golang has a piss poor type system, even with the addition of Generics.

randomdata
2 replies
4h3m

> People are asking to be able to prevent users of your library to provide illegal values. [...] and Typescript

Typescript, you say?

    const bar: Foo = {} as Foo

Hmm. Oh, right, just don't hold it wrong. But "sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs."

Your example doesn’t even satisfy the base case, let alone the general case. Get back to us when you have actually read the thread and can provide something on topic.

DandyDev
1 replies
3h57m

But that is not an accident, is it? It’s someone very deliberately casting an object. It’s not the same and you probably know it.

randomdata
0 replies
3h40m

It might be an accident. Someone uninitiated may think that is how you are expected to initialize the value. A tool like Copilot may introduce it and go unnoticed.

But let's assume the programmer knows what they are doing and there is no code coming from any other source. When would said programmer write code that isn't deliberate? What is it about Go that you think makes them, an otherwise competent programmer, flail around haphazardly without any careful deliberation?

owl57
1 replies
14h25m

> The vast, vast, vast majority of projects – and I expect 100% of web projects – use languages with incomplete type systems, making what you seek impossible.

…where, "what GP seeks" is…

> way for [library authors] to indicate that creating a zero-valued struct is a bug

I'd say that's a really low and practical bar, you really don't need Coq for that. Good old Python is enough, even without linters and type hints.

Of course it's very easy to create an equivalent of zero struct (object without __init__ called), but do you think it's possible to do it while not noticing that you are doing something unusual?

randomdata
0 replies
8h56m

> Good old Python is enough

No, Python is not enough to "...work with a type system where designers of libraries can actually prevent you from writing bugs." Not even typed Python is going to enable that. Only a complete type system can see the types prevent you from writing those bugs. And I expect exactly nobody is writing HTTP services with a language that has a complete type system – for good reason.

> Of course it's very easy to create an equivalent of zero struct

Yes, you are quite right that you, the library consumer, can Foo.__new__(Foo) and get an object that hasn't had its members initialized just like you can in Go. But unless the library author has specifically called attention to you to initialize the value this way, that little tingling sensation should be telling you that you're doing something wrong. It is not conventional for libraries to have those semantics. Not in Python, not in Go.

Just because you can doesn't mean you should.

ttymck
0 replies
15h49m

This is where we arrive at my conclusion that go is not well-suited to implementing business logic!

bsdpufferfish
3 replies
16h49m

C++ constructors actually make the guarantee, but it comes with other pains

masklinn
2 replies
11h22m

Lots of languages handle it just fine and don’t need the mess of C++ ctors.

GP is pointing out that go specifically makes it an issue.

bsdpufferfish
1 replies
3h3m

What language do you have in mind?

masklinn
0 replies
1h27m

Any language which supports private state: smalltalk, haskell, ada, rust, …

cogman10
19 replies
20h45m

The issue is DRY often comes to wreck this sort of thing. Some devs will see "Hmm, Username is exactly the same as just a string so let's just use a string as Username is just added complexity".

I've tried it with constructs like `Data` and `ValidatedData` and it definitely works, but you do end up with duplicate fields between the two objects or worse an ever growing inheritance tree and fields unrelated to either object shared by both.

For example, consider data looking like

    type Data struct {
      value string
    }

and ValidatedData looking like

    type ValidatedData struct {
      value int
    }

There's a mighty temptation for some devs to want to apply DRY and zip these two things together. Unfortunately, that can get really messy with these sorts of type changes, and where validation needs to happen gets muddled.

xboxnolifes
9 replies
20h41m

Except Username is not exactly the same as string, and that's important. Username is a subset of string. If they were equivalent, we wouldn't need to parse/validate.

The often misinterpreted part of DRY is conflating "these are the same words, so they are the same", with "these are the same concept, so they are the same". A Username and a String are conceptually different.

cogman10
8 replies
20h31m

DRY is just "Do not repeat yourself". And a LOT of devs take that literally. It's not "Do not repeat concepts" (which is what it SHOULD be but DRC isn't a fun acronym).

Unfortunately "This is the same character string" is all a DRY purist needs to start messing up the code base.

I honestly believe that "DRY" is an anti-pattern because of how often I see this exact behavior trotted out or espoused. It's a cargo cult thing to some devs.

hombre_fatal
2 replies
19h58m

This seems less about DRY and more a story about a hypothetical junior dev making a dumb mistake masquerading as commentary about “DRY purism”.

cogman10
1 replies
19h49m

Man I wish it was just jr devs. I cut jrs a ton of slack, they don't know any better. However, it's the seniors with the quick quips that are the biggest issue I run into. Or perhaps senior devs with jr mentalities

ikiris
0 replies
19h41m

most srs are just jrs with inflated egos and titles

oorza
1 replies
20h6m

That's why I like to tell people to always remember to stay MOIST - the Most Optimal is Implicitly the Simplest Thing.

When you add complexity to DRY out your code, you're adding a readability regression. DRY matters in very few contexts beyond readability, and simplicity and low cognitive load need to be in charge. Everything else you do code-style wise should be in service of those two things.

HeavyStorm
0 replies
9h44m

DRY has nothing to do with readability. The fact that it might help with it is purely coincidental.

DRY is about maintainability - if you repeat rules (behavior) around the system and someone comes along and changes it, how can you be sure it affected all the system coherently?

I've seen this in practice: we get a demand from the PO, a more recent hire goes to make the change, the use case of interest to the PO gets accepted. A week later we have a bug on production because a different code path is still relying on the old rule.

sroussey
0 replies
19h29m

Like everything, it depends is the right answer.

piva00
0 replies
3h18m

In my experience (~20 years) with software development I developed the belief that people will go through the path of applying patterns, techniques, architectures, good practices, first as dogma, then to rejection, ending in acceptance of the knowledge that almost all of software development patterns/best practices are mostly good heuristics, which require experience to apply correctly and know when to break or bend the rules.

DRY applied as a dogma will eventually fail, because it's not a verified mathematical proof of infallible code, it's just a practice that gives good results inside its constraints, people just don't learn the constraints until it explodes in their faces a few times.

Like any wisdom, it's hard for it to be received and understood without the rite of passage of experience.

Intermernet
0 replies
2h36m

DRY vs premature optimisation is the landscape most long term devs find themselves in. You can say that FP, OO and a bunch of other paradigms affect this, but eventually you need to repeat yourself. The key is to determine when this happens without spending too much time determining when this happens.

devjab
6 replies
19h36m

One of the major issues with a lot of the outdated concepts in programming is that we still teach them to young people. I work a side gig as an external examiner for CS students. Especially in the early years they are taught the same OOP content that I was taught some decades ago, stuff that I haven’t used (also) for some decades. Because while a lot of the concepts may work well in theory, they never work out in a world where programmers have to write code on a Thursday afternoon after a terrible week.

It’s almost always better to repeat code. It’s obviously not something that is completely black and white, even if I prefer to never really do any form of inheritance or mutability, it’s not like I wouldn’t want you to create a “base” class with “created by” “updated by” and so on for your data classes and if you have some functions that do universal stuff for you and never change, then by all means use them in different places. But for the most part, repeating code will keep your code much cleaner. Maybe not today or the next month, but five years down the line nobody is going to want to touch that shared code which is now so complicated you may as well close your business before you let anyone touch it. Again, not because the theoretical concepts that lead to this are necessarily flawed, but because they require too much “correctness” to be useful.

Academia hasn’t really caught on though. I still grade first semester students who have the whole “Animal” -> “duck”, “dog”, “cat” or whatever they use into their heads as the “correct way” to do things. Similar to how they are often taught other processes than agile, but are taught that agile is the “only” way, even though we’ve seen just how wrong that is.

I’m not sure what we can really do about it. I’ve always championed strongly opinionated dev setups where I work. Some of the things we’ve done, and are going to do, aren’t going to be great, but what we try to do is to build an environment where it’s as easy as possible for every developer to build code the most maintainable way. We want to help them get there, even when it’s 15:45 on a Thursday that has been full of shit meetings in a week that’s been full of screaming children and an angry spouse and a car that exploded. And things like DRY just aren’t useful.

shrimp_emoji
3 replies
19h30m

> It’s almost always better to repeat code.

God no. Stop the copy pasta disease! It's horrible, mindless programming.

When reviewing code, I'm astonished anything was accomplished by copy pasting so much old code (complete with bugs and comment typos).

Incidentally, OOP encourages you to copy a lot. It's just an engine for generating code bloat. Want to serialize some objects? Here's your Object serializer and your overloaded Car serialize and your overloaded Boat serializer, with only a few different fields to justify the difference!

OOP is bad. Copy pasta is bad. DRY is good. All hail DRY, forever, at any cost.

zerbinxx
0 replies
19h15m

OOP and DRY are compatible! I’ve actually done the thing that the above commenter suggests - create a base object with created on/by so that I never have to think about it. Whether or not you actually care about that, if you implement a descendant of that object you’re going to get some stuff for free, and you’re gonna like it!

jddj
0 replies
19h15m

Countless man-centuries have been lost looking for the perfect abstraction to cover two (or an imagined future with two) cases which look deceptively similar, then teasing them apart again.

hooverd
0 replies
15h15m

For what it's worth, I've always had an easier time combining WET code than untangling the knot that is too-DRY code. Too little abstraction and you might have to read some extra code to understand it. Too much abstraction and no one other than the writer, and maybe not even them, may ever understand it.

kcrwfrd_
0 replies
18h0m

It’s a balancing act, but deletable code is often preferable to purely-DRY-for-the-sake-of-DRY, overly abstracted code.

HeavyStorm
0 replies
9h33m

Yeah, no. Not at all. I imagine that you are taking DRY quite literally and critiquing the most stupid use cases of it, like DRYing calls to Split with spaces to SplitBySpace.

DRY's goal is to avoid defining the same behavior in more than one place, which leaves you with multiple points in the code to change when you need to modify said behavior. Code needs to be coherent to be "good", by a number of the different quality indicators.

I'm doing a "side project" right now where I'm using a newcomer payment gateway. They certainly don't DRY stuff. The same field gets serialized with camel case and snake case in different APIs, and whole structures that represent the same concept are duplicated with slightly different fields. This probably means that Thursday 15.25 the dev checked-in her code happy because the reviewer never cared about DRY, and now I'm paying the price of maintaining four types of addresses in my code base.

taberiand
0 replies
18h41m

There's a mistake many junior devs (and sometimes mid and senior devs) make where they confuse hiding complexity with simplicity - using a string instead of a well defined domain type is a good example, there is a certain complexity of the domain expressed by the type that they don't want to think about too deeply so they replace with a string which superficially looks simpler but in fact hides all of the inherent complexity and nuance.

It causes what I call the lumpy carpet syndrome - sweeping the complexity under the carpet causes bumps to randomly appear that when squashed tend to cause other bumps to pop up rather than actually solving the problem.

Cthulhu_
0 replies
9h1m

Go now has generics, so I'm confident some smart fellow will apply DRY and make it a generic ValidatedData[type, validator] type struct, with a ValidatedDataFactory that applies the correct validator callback, and a ValidatorFactory that instantiates the validators based on a new validation rule DSL written in JSON or XML.

...Easy!

asp_hornet
8 replies
20h51m

Now what? The username is in an unexported field and unusable? I can kind of see what it's going for but it seems like a way just to add another layer of wrapping and indirection.

pants2
7 replies
20h45m

It would need a getter here. Probably good to keep it immutable, if you want guarantees that it will never be changed to something that violates the username rules.

asp_hornet
6 replies
19h38m

> need a getter

Yeah, that's what I figured. I'm not sure if I want the tradeoff of calling .GetValue in multiple places just to save calling validate in maybe 2 or 3 places.

Not to mention I can't easily marshal/unmarshal into it, and next week a valid username is a username that doesn't already exist in the database.

Maybe this approach appeals to people and I'm hesitant to say “that’s not how Go is supposed to be written” but for me this feels like “clever over clear”.
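On the marshalling concern, one possible approach is to have the wrapper implement `json.Unmarshaler` so that decoding goes through the same parser. This is a sketch under assumptions: `SignupRequest` and `DecodeSignup` are hypothetical, and the validation rules are the illustrative ones from earlier in the thread.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

type Username struct{ value string }

// NewUsername applies illustrative rules: non-empty, no angle brackets.
func NewUsername(raw string) (Username, error) {
	if raw == "" || strings.ContainsAny(raw, "<>") {
		return Username{}, errors.New("invalid username")
	}
	return Username{value: raw}, nil
}

func (u Username) String() string { return u.value }

// UnmarshalJSON implements json.Unmarshaler by routing the decoded
// string through NewUsername.
func (u *Username) UnmarshalJSON(data []byte) error {
	var raw string
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	parsed, err := NewUsername(raw)
	if err != nil {
		return err
	}
	*u = parsed
	return nil
}

// MarshalJSON implements json.Marshaler by writing the plain string.
func (u Username) MarshalJSON() ([]byte, error) {
	return json.Marshal(u.value)
}

// SignupRequest is a hypothetical request body.
type SignupRequest struct {
	Username Username `json:"username"`
}

// DecodeSignup is a hypothetical helper that decodes a request body.
func DecodeSignup(body []byte) (SignupRequest, error) {
	var req SignupRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

func main() {
	_, err := DecodeSignup([]byte(`{"username":"<script>"}`))
	fmt.Println("bad input:", err)

	req, err := DecodeSignup([]byte(`{"username":"asp_hornet"}`))
	fmt.Println("good input:", req.Username, err)
}
```

One caveat: if the `username` key is missing from the JSON, `UnmarshalJSON` is never called and the field stays at its zero value, so a required-field check is still needed. A rule like "does not already exist in the database" would also not fit in a pure parser like this one.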

klodolph
5 replies
19h13m

> Yeah, that's what I figured. I'm not sure if I want the tradeoff of calling .GetValue in multiple places just to save calling validate in maybe 2 or 3 places.

The tradeoff is not that you save calling validate, it’s that you avoid forgetting to call validate in the first place, because when you forget to validate, you get a type error.

IMO it’s a little more clear this way:

    type Ticket struct {
      requestor Username
      assignee  Username
    }

It lets you write code that is a little more obvious.

asp_hornet
4 replies
18h49m

I’m not sure I understand. In your example you’ve grouped related data in a struct and validating that it matches your system’s invariants, that feels good to me.

The original example was more “wrap a simple type in an object so it’s always validated when set” which looks beautiful when you don’t have the needed getters in the example nor show all the Get call sites opposed to the 1 or 2 New call sites. All in the name of “we don’t want to set the username without validation” but without private constructors Username{“invalid”} can be invoked, the validation circumvented and I’m not convinced the overhead we paid was worth it.

Sakos
3 replies
18h5m

The countless bugs I've had to deal with and all the time I've lost fixing these bugs caused by people who forgot to validate data in a certain place or didn't realize they had to do so proves to me that the overhead of calling a get on a wrapper type is totally worth it.

I value the hours wasted on diagnosing a bug far more than the extra keystrokes and couple of seconds required to avoid it in the first place.

asp_hornet
2 replies
15h21m

No, you’ve achieved an illusion of that, as now you're spending hours discovering where a developer forgot to call NewUsername and instead called Username{“broken”}. I can't see the value in this abstraction in Go.

iampims
1 replies
14h41m

They can’t because value is not exported. They must use the NewUsername function, which forces the validation.

In my opinion, this pattern breaks when the validation must return an error and everything becomes very verbose.

asp_hornet
0 replies
14h15m

Oh, that's true about it being unexported. I hadn’t considered that.

zelphirkalt
6 replies
17h0m

I always understood "parse don't validate" a bit differently. If you are doing the validation inside of a constructor, you are still doing validation instead of parsing. It is safer to do the validation in one place you know the execution will go through, of course, but not the idea I understand "parse don't validate" to mean. I understand it to mean: "write an actual parser, whatever passes the parser can be used in the rest of the program", where a parser is a set of grammar rules for example, or PEG.

mtlynch
5 replies
16h42m

I'm not a Haskell developer, so it's possible that I misunderstood the original "Parse, Don't Validate" post.

If you are doing the validation inside of a constructor, you are still doing validation instead of parsing.

Why would that be considered validation rather than parsing?

From the original post:

Consider: what is a parser? Really, a parser is just a function that consumes less-structured input and produces more-structured output.

That's the key idea to me.

A parser enforces checks on an input and produces an output. And if you define an output type that's distinct from the input type, you allow the type system to "preserve" the fact that the data passed a parser at some point in its life.

But again, I don't know Haskell, so I'm interested to know if I'm misunderstanding Lexi Lambda's post.

kevincox
4 replies
15h50m

Parse don't validate means that if you want a function that converts an IP address string to a struct IpAddress{ address: string } you don't validate that the input string is a valid IP address then return a struct with that string inside. Instead you parse that IP into raw integers, then join those back into an IP string.

The idea is that your parsed representation and serializer are likely to produce a much smaller and more predictable set of values than may pass the validator.

As an example there was a network control plane outage in GCP because the Java frontend validated an IP address then stored it (as a string) in the database. The C++ network control plane then crashed because the IP address actually contained non-ASCII "digits" that Java with its Unicode support accepted.

If instead the address was parsed into 4 or 8 integers and was reserialized before being written to the DB this outage wouldn't have happened. The parsing was still probably more lax than it should have been, but at least the value written to the DB was valid.

In this case it was funny Unicode, but it could be as simple as 1.2.3.04 vs 1.2.3.4. By parsing then re-serializing you are going to produce the more canonical and expected form.
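A minimal sketch of the parse-then-reserialize idea for an IPv4 address. The `IPv4` type and `ParseIPv4` function are hypothetical names made up for illustration; a real service would more likely use the standard library's address types. The parser here is deliberately lax about leading zeros to show that the serializer still emits one canonical form.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// IPv4 is a hypothetical parsed representation: four octets, no string inside.
type IPv4 [4]byte

// ParseIPv4 parses a dotted-quad string into octets. It accepts leading
// zeros ("04" parses as 4) but rejects anything that is not an ASCII
// decimal number in the range 0..255.
func ParseIPv4(s string) (IPv4, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 4 {
		return IPv4{}, fmt.Errorf("invalid IPv4 address %q: want 4 octets, got %d", s, len(parts))
	}
	var ip IPv4
	for i, p := range parts {
		// ParseUint with base 10 only accepts ASCII digits, so Unicode
		// "digits" fail here instead of reaching the database.
		n, err := strconv.ParseUint(p, 10, 8)
		if err != nil {
			return IPv4{}, fmt.Errorf("invalid IPv4 address %q: octet %d: %w", s, i+1, err)
		}
		ip[i] = byte(n)
	}
	return ip, nil
}

// String re-serializes the octets; this is the value that would be stored.
func (ip IPv4) String() string {
	return fmt.Sprintf("%d.%d.%d.%d", ip[0], ip[1], ip[2], ip[3])
}

func main() {
	ip, err := ParseIPv4("1.2.3.04")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ip) // prints 1.2.3.4
}
```

Whatever spelling the caller used, only the output of `String` is written downstream, so consumers see a much smaller set of possible values.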

mtlynch
2 replies
15h34m

Thanks for that explanation! I hadn't appreciated that aspect of "parse, don't validate," before.

But even with that understanding and from re-reading the post, that seems to be an extra safety measure rather than the essence of the idea.

Going back to my original example of parsing a Username and verifying that it doesn't contain any illegal characters, how does a parser convert a string into a more direct representation of a username without using a string internally? Or if you're parsing a uint8 into a type that logically must be between 1 and 100, what's the internal type that you parse it into that isn't a uint8?

kevincox
0 replies
5h8m

Personally I don't think I would have used the phrase "parse don't validate" for something like a username. It isn't clear to me what it would mean exactly. I generally only think of this principle for data that has some structure, not so much a username or a number from 1-100.

IP address would be about the minimum amount of structure. Something else would be like processing API requests. You can take the incoming JSON and fully parse it as much as possible, rather than just validate it is as expected (for example drop unknown fields)

eru
0 replies
14h18m

Or if you're parsing an uint8 into a type that logically must be between 1 and 100, what's the internal type that you parse it into that isn't a uint8?

Just for the sake of example, your internal representation might start from 0, and you just add 1 whenever you output it.

Your internal type might also not be a uint8. Eg in Python you would probably just use their default type for integers, which supports arbitrarily big numbers. (Not because you need arbitrarily big numbers, but just because that's the default.)
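One way to picture the zero-based internal representation, as a hedged sketch. The `Level` type and its methods are hypothetical names for illustration. A side effect of storing the value minus one is that the Go zero value of the struct is itself a legal value (1).

```go
package main

import "fmt"

// Level is a hypothetical type for values that must be between 1 and 100.
// Internally it stores value-1, so the struct's zero value means 1.
type Level struct {
	offset uint8 // always 0..99 when built through ParseLevel
}

// ParseLevel converts a raw uint8 into a Level or reports why it cannot.
func ParseLevel(n uint8) (Level, error) {
	if n < 1 || n > 100 {
		return Level{}, fmt.Errorf("level %d out of range 1..100", n)
	}
	return Level{offset: n - 1}, nil
}

// Value adds the 1 back whenever the level is output.
func (l Level) Value() uint8 { return l.offset + 1 }

func main() {
	l, err := ParseLevel(100)
	fmt.Println(l.Value(), err) // prints 100 <nil>

	var zero Level
	fmt.Println(zero.Value()) // prints 1
}
```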

xyzzy_plugh
0 replies
15h38m

Perhaps "normalize" or "canonicalize" is more appropriate. A parser can liberally interpret but I don't take it to imply some destructured form necessarily. There are countless scenarios where you want to be able to reproduce the exact input, and often preserving the input is the simplest solution.

But yes, usually you do want to split something into its elemental components, should it have any.

edflsafoiewq
4 replies
17h24m

You can use new types with validation too. In fact the approaches seem to be duals.

Parse, don't validate:

                    string          ParsedString
  untrusted source -------> parse --------------> rest of system
Validate, don't parse:

                    UnvalidatedString            string
  untrusted source ------------------> validate -------> rest of system

mtlynch
3 replies
16h47m

The problem is that pattern "fails open." If anyone on the team forgets to define an untrusted string as UnvalidatedString, the data skips validation.

If you default to treating primitive types as untrusted, it's hard for someone to accidentally convert an untrusted type to a trusted type without using the correct parse method.

edflsafoiewq
2 replies
16h3m

The dual problem would be any function which forgets to accept a ParsedString instead of a string can skip parsing.

Both cases appear to depend on there being a "checkpoint" all data must go through to cross over to the rest of the system, either at parsing or at UnvalidatedString construction.

mtlynch
1 replies
14h51m

The dual problem would be any function which forgets to accept a ParsedString instead of a string can skip parsing.

Both cases appear to depend on there being a "checkpoint" all data must go through to cross over to the rest of the system, either at parsing or at UnvalidatedString construction.

The difference is that if string is the trusted type, then it's easy to miss a spot and use the trusted string type for an untrusted value. The mistake will be subtle because the rest of your app uses a string type as well.

The converse is not true. If string is an untrusted type and ParsedString is a trusted type, if you miss a spot and forget to convert an untrusted string into a ParsedString, that function can't interact with any other part of your codebase that expects a ParsedString. The error would be much more visible and the damage more contained.

I think UnvalidatedString -> string also kind of misses the point of the type system in general. To parse a string into some other type, you're asserting something about the value it stores. It's not just a string with a blessing that says it's okay. It's a subset of the string type that can contain a more limited set of values than the built-in string type.

For example, parsing a string into a Username, I'm asserting things about the string (e.g., it's <10 characters long, it contains only a-z0-9). If I just use the string type, that's not an accurate representation of what's legal for a Username because the string type implies any legal string is a valid value.
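As a rough sketch, the Username parser described above could look like this. The concrete rules (1 to 9 characters, only a-z and 0-9) are taken from the "e.g." in the comment and are assumptions, not a real schema; `greet` is a hypothetical consumer added to show the type doing its job.

```go
package main

import (
	"errors"
	"fmt"
)

// Username can only be populated through NewUsername because its field is
// unexported. (In a real project it would live in its own package.)
type Username struct {
	value string
}

// NewUsername parses an untrusted string into a Username.
// Assumed rules: 1 to 9 characters, lowercase ASCII letters and digits only.
func NewUsername(raw string) (Username, error) {
	if len(raw) == 0 || len(raw) >= 10 {
		return Username{}, errors.New("username must be 1 to 9 characters long")
	}
	for _, r := range raw {
		if (r < 'a' || r > 'z') && (r < '0' || r > '9') {
			return Username{}, fmt.Errorf("username contains illegal character %q", r)
		}
	}
	return Username{value: raw}, nil
}

// String returns the validated value.
func (u Username) String() string { return u.value }

// greet accepts only a Username, so a raw string cannot be passed by accident.
func greet(u Username) string {
	return "hello, " + u.String()
}

func main() {
	u, err := NewUsername("alice42")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(greet(u)) // prints hello, alice42

	// greet("<hello>") would not compile: a string is not a Username.
	_, err = NewUsername("<hello>")
	fmt.Println(err != nil) // prints true
}
```

A function that forgets to parse cannot call `greet` at all, which is the "fails closed" behavior argued for above.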

eru
0 replies
13h56m

The example also assumes that everything is like a 'ParsedString' that contains a copy of the original untrusted value inside.

andrus
4 replies
20h34m

I’ve found it hard to apply this pattern in Go since, if Username is embedded in a struct, and you forget to set it, you’ll get Username’s zero value, which may violate your constraints.

mrklol
2 replies
18h35m

Why? You can easily call NewUsername inside NewAccount for example, just return the error. Or did I misunderstand?

BreakfastB0b
1 replies
16h26m

Because go doesn’t have exhaustiveness checking when initialising structs. Instead it encourages “make the zero value meaningful” which is not always possible nor desirable. I usually use a linter to catch this kind of problem https://github.com/GaijinEntertainment/go-exhaustruct

vorticalbox
0 replies
3h30m

I like this but in the examples would volume be calculated by width/length rather than being set?

Cthulhu_
0 replies
9h3m

But if you then create a constructor / factory method for that struct, not setting it would trigger an error. But this is one of the problems with Go and other languages that have nil or no "you have to set this" built into their type system: it relies on people's self-discipline, checked by the author, reviewer, and unit test, and ensuring there's not a problem like you describe takes a lot of diligence.
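The zero-value problem and the constructor workaround from this subthread can be sketched as follows. `Account`, `NewAccount`, and `IsZero` are hypothetical names, and the "non-empty" rule stands in for whatever the real username schema is.

```go
package main

import (
	"errors"
	"fmt"
)

type Username struct{ value string }

// NewUsername uses a placeholder rule (non-empty) for brevity.
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username cannot be empty")
	}
	return Username{value: raw}, nil
}

// IsZero reports whether u is the zero value, meaning it never went
// through NewUsername.
func (u Username) IsZero() bool { return u.value == "" }

type Account struct {
	Username Username
	Email    string
}

// NewAccount is the single place where an Account is assembled, so a
// missing username becomes an error instead of a silent zero value.
func NewAccount(rawUsername, email string) (Account, error) {
	u, err := NewUsername(rawUsername)
	if err != nil {
		return Account{}, fmt.Errorf("new account: %w", err)
	}
	return Account{Username: u, Email: email}, nil
}

func main() {
	// The compiler accepts this literal even though Username was never set.
	forgotten := Account{Email: "a@example.com"}
	fmt.Println(forgotten.Username.IsZero()) // prints true

	acct, err := NewAccount("alice", "a@example.com")
	fmt.Println(acct.Username.IsZero(), err) // prints false <nil>
}
```

Nothing stops a caller from still writing the struct literal, which is why a linter or review discipline is needed on top of the constructor.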

costco
3 replies
19h54m

Just do

    type Username string
And replace

      return Username{username}
with

      return Username(username)

mtlynch
1 replies
19h40m

The problem there is that you lose the guarantee that the parser validated the string value.

A caller can just say:

    // This is returning an error for some reason, so let's do it directly.
    // username, err := parsers.NewUsername(raw)
    username := parsers.Username(raw)
You also get implicit conversions in ways you probably don't want:

    var u Username
    u = "<hello>" // Implicitly converts from string to Username

costco
0 replies
19h35m

That's true I did not think of that.

daveFNbuck
0 replies
19h41m

If you do that, people outside the package can also do Username(x) conversions instead of calling NewUsername. Making value package private means that you can only set it from outside the package using provided functionality.

Scarblac
3 replies
19h50m

and a human-readable explanation of the issue is set as the value.

This is annoying to translate later. At least also include some error code string that is documented somewhere and isn't prone to change randomly.

klodolph
2 replies
19h46m

I mean, you may end up just wanting something like,

    type UsernameError struct {
      name   string
      reason string
    }
    func (e *UsernameError) Error() string { 
      return fmt.Sprintf("invalid username %q: %s", e.name, e.reason)
    }
And reason can be "username cannot be empty" or "username may not contain '<'" or something like that.

This is fine for lots of different cases, because it’s likely that your code wants to know how to handle “username is invalid”, but only humans care about why.

I have personally never seen a Go codebase where you parse error strings. I know that people keep complaining about it so it must be happening out there—but every codebase I’ve worked with either has error constants (an exported var set to some errors.New() value) or some kind of custom error type you can check. Or if it doesn’t have those things, I had no interest in parsing the errors.

Scarblac
1 replies
19h33m

I write mostly frontends. Sometimes the APIs I talk to give back beautiful English error messages - that I can't just show to the user, because they are using a different language most of the time. And I don't want to write logic that depends on that sentence, far too brittle.

klodolph
0 replies
19h20m

Right—I think the “error code” here is going to be the error type, i.e., UsernameError, or some qualified version of that.

It’s not perfect, but software evolves through many imperfect stages as it gets better, and this is one such imperfect stage that your software may evolve through.

Including a human-readable version of the error is useful because the developers / operators will want to read through the logs for it. Sometimes that is where you stop, because not all errors from all backends will need to be localized.
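A hedged sketch of checking errors by type or sentinel value rather than by message, using the standard `errors.As` and `errors.Is`. The `classify` function, the code strings, and `ErrUsernameTaken` are hypothetical; they illustrate how a backend could hand a frontend a stable code while keeping the readable reason for logs.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUsernameTaken is a sentinel error: callers compare against it with errors.Is.
var ErrUsernameTaken = errors.New("username already taken")

// UsernameError is a custom error type: callers detect it with errors.As.
type UsernameError struct {
	Name   string
	Reason string // human-readable, meant for logs
}

func (e *UsernameError) Error() string {
	return fmt.Sprintf("invalid username %q: %s", e.Name, e.Reason)
}

// checkUsername is a stand-in for real validation plus a uniqueness lookup.
func checkUsername(name string, taken map[string]bool) error {
	if name == "" {
		return &UsernameError{Name: name, Reason: "username cannot be empty"}
	}
	if taken[name] {
		return fmt.Errorf("check %q: %w", name, ErrUsernameTaken)
	}
	return nil
}

// classify maps an error to a stable code without reading its message.
func classify(err error) string {
	var ue *UsernameError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &ue):
		return "invalid_username"
	case errors.Is(err, ErrUsernameTaken):
		return "username_taken"
	default:
		return "internal"
	}
}

func main() {
	taken := map[string]bool{"bob": true}
	fmt.Println(classify(checkUsername("", taken)))      // prints invalid_username
	fmt.Println(classify(checkUsername("bob", taken)))   // prints username_taken
	fmt.Println(classify(checkUsername("alice", taken))) // prints ok
}
```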

tdeck
2 replies
16h44m

But surely this is just another way of doing validation and not fundamentally "parsing"? If at the end you've just stored the input exactly as you got it, the only parsing you're potentially doing is in the validation step and then it gets thrown away.

ezrast
0 replies
16h5m

Implementation-wise, yes, but the interface you're exposing is indistinguishable from that of a parser. For all your consumers know, you could be storing the username as a sequence of a 254-valued enum (one for each byte, except the angle brackets) and reconstructing the string on each "get" call. For more complex data you would certainly be storing it piecewise; the only reasons this example gets a pass are 1) because it is so low in surface area that a human can reasonably validate the implementation as bug-free without further aid from the type checker, and 2) because Go's type system is so inexpressive that you can't encode complex requirements with it anyway.

draven
0 replies
8h4m

The validation is not completely thrown away, since the type indicates that the data has been validated. I understand "parsing" as applying more structure to a piece of data. Going from a String to an IP or a Username fits the definition.

I push my team to use this pattern in our (mostly Scala) codebase. We have too many instances of useless validations, because the fact that a piece of data has been "parsed"/validated is not reflected in its type using simple validation.

For example using String, a function might validate the String as a Username. Lower in the call stack, a function ends up taking this String as an arg. It has no way of knowing if it has been validated or not and has to re-validate it. If the first validation gets a Username as a result, other functions down the call stack can take a Username as an argument and know for sure it's been validated / "parsed".

skybrian
2 replies
20h40m

This is a good design pattern, but be wary of doing validation too early. The design pattern allows you to do it as early or late as you like, but doesn't tell you when to do it. Often it's best to do it as part of parsing/validating some larger object.

See Steven Witten's "I is for Intent" [1] for some ideas about the use of unvalidated data in a UI context.

[1] https://acko.net/blog/i-is-for-intent/

lolinder
1 replies
16h38m

I read through that piece and strongly disagree with the premise that their insight is somehow at odds with leaning into the type system for correctness.

The legitimate insight that they have is that anchoring the state as close as possible to the user input is valuable—I think that that is a great insight with a lot of good applications.

However, there's nothing that says you can't take that user-centric state and put it in a strongly typed data structure as soon as possible, with a set of clearly defined and well-typed transitions mapping the user-centric state to the derived states.

Edit: looks like there was discussion on this the other day, with a number of people making similar observations—https://news.ycombinator.com/item?id=39269886

skybrian
0 replies
13h55m

A text file and an abstract syntax tree can both be rigorously represented using types, but one is before parsing and the other is after parsing. The question is which one is more suitable for editing?

Text has more possible states than the equivalent AST, many of which are useful when you haven't typed in all the code yet. Incomplete code usually doesn't parse.

This suggests that drafts should be represented as text, not an AST.

And maybe similarly for drafts of other things? Drafts will have some representation that follows some rules, but maybe they shouldn't have to follow all the rules. You may still want to save drafts and collaborate on them even though they break some rules.

In a system that's not an editor, though, maybe it makes sense to validate early. For a command-line utility, the editor is external, provided by the environment (a shell or the editor for a shell script) so you don't need to be concerned with that.

swah
1 replies
18h36m

My Go is rusty, do you mean not exporting the type "Username" (ie username) to avoid default constructor usage?

mtlynch
0 replies
16h50m

In Go, capitalized identifiers are exported, whereas lowercase identifiers are not.

In the example I gave above, clients outside of the package can instantiate Username, but they can't access its "value" member, so the only way they could get a populated Username instance is by calling NewUsername.

oorza
1 replies
21h31m

Conceptually equivalent to the ancient arts of private constructors and factory methods.

Cthulhu_
0 replies
9h6m

Which (in Java) were then abstracted away in... interesting annotations.

whimsicalism
0 replies
17h30m

the fact that this is some special “technique” really shows how far behind Go’s type system & community around typing is

patrickkristl
0 replies
19h26m

em ai have a problem from cars

otabdeveloper4
0 replies
10h37m

you have to remember to call the validator on 100% of code paths

But copy-pasting the same lines of code in literally every function is the Golang Way.

It makes code "simpler".

kvnhn
0 replies
12h40m

This is a variation on one of my favorite software design principles: Make illegal states unrepresentable. I first learned about it through Scott Wlaschin[1].

[1]: https://fsharpforfunandprofit.com/posts/designing-with-types...

dang
0 replies
12h57m

Related:

Parse, don't validate (2019) - https://news.ycombinator.com/item?id=35053118 - March 2023 (219 comments)

Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=27639890 - June 2021 (270 comments)

Parse, Don’t Validate - https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230 comments)

Parse, Don't Validate - https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4 comments)

WuxiFingerHold
0 replies
12h13m

So far I like the commonly used approach in the Typescript community best:

1. Create your Schema using https://zod.dev or https://github.com/sinclairzx81/typebox or one of the other many libs.

2. Generate your types from the schema. It's very simple to create partial or composite types, e.g. UpdateModel, InsertModels, Arrays of them, etc.

3. Most modern Frameworks have first class support for validation, like Fastify (with typebox). Just reuse your schema definition.

That is very easy, obvious and effective.

HeavyStorm
0 replies
10h0m

Encapsulation saves lives.

3pm
0 replies
20h31m

AKA 'Value Object' from DDD or a similar 'Quantity' accounting pattern. Another angle is that this fixes 'Primitive Obsession' code smell.

zemo
14 replies
21h35m

func NewServer(... config *Config ...) http.Handler

one of my biggest pet peeves is when people take a Config object, which represents the configuration of an entire system, and pass it around mutably. When you do that, you're coupling everything together through the config object. I've worked on systems where you had to configure the parts in a specific order in order for things to work, because someone decided to write back to the config object when it was passed to them. Or another case was where I've seen it such that you couldn't disable a portion of the system because it wrote data into the config object that was read by some other subsystem later. The pattern of "your configuration is one big value, which is mutable" is one of the more annoying patterns that I've seen before, both in Go and in other languages.

MrDarcy
5 replies
19h38m

My favorite way to prevent this is to make the config truly immutable, but still configurable with something like this:

  package config

  type options struct {
    name string
  }

  type Option func(o *options)

  func Name(name string) Option {
    return func(o *options) {
      o.name = name
    }
  }

  type Config struct {
    opts *options
  }

  func New(opts ...Option) *Config {
    o := &options{}
    for _, option := range opts {
      option(o)
    }
    return &Config{opts: o}
  }

  func (c *Config) Name() string {
    return c.opts.name
  }
Use it with:

  cfg := config.New(config.Name("Emanon"))
  fmt.Println(cfg.Name())

zemo
4 replies
8h53m

I used that pattern for a while but stopped using it. I first encountered it from this blog post: https://commandcenter.blogspot.com/2014/01/self-referential-...

It's a lot of boilerplate to create something that's not actually immutable. It also makes it harder to figure out which options are available, since now you can't just look at the documentation of the type, you have to look at the whole module package to figure out what the various options are. If one of the fields is a slice or map you can just mutate that slice or map in place, so it's not really immutable. The pattern as Pike describes it has the benefit that supplying an option returns an option that reverses the effect of supplying the option so that you can use the options somewhat like Python context objects that have enter and exit semantics, but in practice I've found that to be useful in a small portion of situations.
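The slice caveat can be shown with a small sketch. The `Config` type and the two getters below are hypothetical; the point is only that a getter returning the internal slice leaks mutability, while a getter returning a copy does not.

```go
package main

import "fmt"

type Config struct {
	hosts []string
}

// New copies the input so the caller's slice cannot alias the config's.
func New(hosts ...string) *Config {
	return &Config{hosts: append([]string(nil), hosts...)}
}

// HostsShared returns the internal slice; callers can mutate it in place.
func (c *Config) HostsShared() []string { return c.hosts }

// Hosts returns a copy, so callers cannot change the config through it.
func (c *Config) Hosts() []string { return append([]string(nil), c.hosts...) }

func main() {
	cfg := New("a.example", "b.example")

	cfg.Hosts()[0] = "changed"  // modifies only the copy
	fmt.Println(cfg.Hosts()[0]) // prints a.example

	cfg.HostsShared()[0] = "changed" // modifies the config itself
	fmt.Println(cfg.Hosts()[0])      // prints changed
}
```

Copying on every read has a cost, so this is a trade-off rather than a free fix.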

kubanczyk
3 replies
4h32m

you can't just look at the documentation of the type

Sure you can. Option func is a constructor for option type, and constructors are auto-included above methods in the docs.

PLS completion works for them as well.

zemo
2 replies
3h18m

The options for the thing being constructed are all separate types from the thing being constructed; the options aren’t a facet of the definition of the type they mutate.

kubanczyk
1 replies
2h36m

I'm saying:

- main constructor is easily available from the main type's docs,

- option type is easily available from the main constructor's docs,

- all option funcs are easily available from the option type's docs (because in fact these option funcs are constructors for the option type).

Excerpt from grpc godoc index:

    type Server

    func NewServer(opt ...ServerOption) *Server

    ...
    ...

    type ServerOption

    func ChainStreamInterceptor(interceptors ...StreamServerInterceptor) ServerOption
    func ChainUnaryInterceptor(interceptors ...UnaryServerInterceptor) ServerOption
    func ConnectionTimeout(d time.Duration) ServerOption
    func Creds(c credentials.TransportCredentials) ServerOption
    etc...
One more hop compared to a flat argument list, that's true. But if you only commonly use maybe 0-5 arguments out of 30-50 available, it does not look like a bad deal.

zemo
0 replies
1h50m

I’m not confused about what it is, I just don’t like it. It’s a lot of ceremony for very little gain.

doh
4 replies
21h34m

I think that's a valid criticism. What do you think would be a more ergonomic pattern?

Raynos
1 replies
21h29m

I wrote a static config class that reads configuration for the entire app / server from a JSON or YAML file ( https://github.com/uber/zanzibar/blob/master/runtime/static_... ).

Once you've loaded it and mutated it for testing purposes or for copying from ENV vars into the config, you can then freeze it before passing it down to all your app level code.

Having this wrapper object that can be frozen and has a `get()` method to read JSON-like data makes it effectively immutable.
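A minimal sketch of the freeze idea, assuming a flat string-to-string config. `StaticConfig` here is a hypothetical type written for illustration and is not the API of the linked zanzibar file.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrFrozen is returned by Set once the config has been frozen.
var ErrFrozen = errors.New("config is frozen")

// StaticConfig is a hypothetical freezable key/value config.
type StaticConfig struct {
	values map[string]string
	frozen bool
}

func NewStaticConfig() *StaticConfig {
	return &StaticConfig{values: make(map[string]string)}
}

// Set stores a value, or fails if Freeze has already been called.
func (c *StaticConfig) Set(key, value string) error {
	if c.frozen {
		return ErrFrozen
	}
	c.values[key] = value
	return nil
}

// Freeze makes every later Set call fail.
func (c *StaticConfig) Freeze() { c.frozen = true }

// Get reads a value; the map itself is never handed out.
func (c *StaticConfig) Get(key string) (string, bool) {
	v, ok := c.values[key]
	return v, ok
}

func main() {
	cfg := NewStaticConfig()
	_ = cfg.Set("port", "8080") // setup phase: file, env vars, test overrides
	cfg.Freeze()

	fmt.Println(cfg.Set("port", "9090")) // prints config is frozen
	port, _ := cfg.Get("port")
	fmt.Println(port) // prints 8080
}
```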

doh
0 replies
21h20m

I use similar pattern myself. Was curious if the OP is using some other, like for instance splitting the struct into two (im/mutable) and then passing them around, or what.

BTW kudos on zanzibar. Love the tech and the code.

zemo
0 replies
8h50m

I just use a struct literal, and then I have the type define a `func (t *Thing) ready() error { ... }` method and call the ready method to check that its valid. I prefer this over self-referential options, the builder pattern, supplying a secondary config object as a parameter to a constructor, etc.
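Roughly what the struct-literal-plus-`ready` approach might look like. The `Server` fields and the `Start` method are assumptions for illustration; only the shape of `ready() error` comes from the comment.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Server is configured directly with a struct literal.
type Server struct {
	Addr    string
	Timeout time.Duration
}

// ready checks that the literal was filled in sensibly.
func (s *Server) ready() error {
	if s.Addr == "" {
		return errors.New("server: Addr is required")
	}
	if s.Timeout <= 0 {
		return errors.New("server: Timeout must be positive")
	}
	return nil
}

// Start refuses to run a Server that is not ready.
func (s *Server) Start() error {
	if err := s.ready(); err != nil {
		return err
	}
	fmt.Println("would listen on", s.Addr)
	return nil
}

func main() {
	bad := &Server{Addr: ":8080"}
	fmt.Println(bad.Start()) // prints server: Timeout must be positive

	good := &Server{Addr: ":8080", Timeout: 5 * time.Second}
	fmt.Println(good.Start()) // prints would listen on :8080, then <nil>
}
```

All available options are visible as fields in the type's documentation, which is the discoverability benefit claimed for this style.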

gabesullice
0 replies
8h7m

Not the OP, but I mitigate the issue rather than use a different pattern. Like so:

    type Server struct { val bool }

    type Config struct { Val bool }

    func NewServer(... config *Config ...) http.Handler {
      if config == nil {
        config = &Config{}
      }
      return &Server{val: config.Val}
    }

It took me a long time to settle on this pattern and I admit it's tedious to copy configuration over to the server struct, but I've found that it ends up being the least verbose and most maintainable long term while making sure callers can't mutate config after the fact.

I can pass nil to NewServer to say "just the usual, please", customize everything, or surgically change a single option.

It's also useful for maintaining backwards compatibility. I'm free to refactor config on my server struct and "upgrade" deprecated config arguments inside my NewServer function.

tejinderss
0 replies
21h28m

The keyword here is “mutable” config object, not config data objects in general. I use an immutable config dataclass liberally in one of my Python projects and I pass it around in all modules. Many functions rely on multiple values, and instead of passing all of them as function parameters (which requires their own function typings), the dataclass has all variables with typing definitions in one place. It's a pretty handy design pattern.

gloryjulio
0 replies
21h3m

I agree. We ran into a sev by changing the top level config object before. You DO NOT want to modify it. The wasted man-hours are not worth it. You will never know where or how it gets used. If you make changes it's better to derive from it instead.

Update: What's funny was, in our design the config object was kinda immutable. You have to use the WARNING_DO_NOT_USE api to make modification. We did mutate the object and we caused a sev

fnordlord
0 replies
20h33m

I've tended to create a Config struct for each package and then a configs.Config struct that's just made up of each package's Config. It might not be a Go best practice but I like that I can setup the entire system's configuration on startup as one entity but then I only pass in the minimally required dependencies for each package. It also makes testing a little easier because I don't have to fake out the entire configuration for testing one package.
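A single-file sketch of the per-package config idea. In a real project `DBConfig` and `HTTPConfig` would live in their own packages (for example `db.Config`); all names here are hypothetical.

```go
package main

import "fmt"

// Per-package configs (shown in one file for brevity).
type DBConfig struct{ DSN string }
type HTTPConfig struct{ Addr string }

// Config aggregates them so startup code can load everything at once.
type Config struct {
	DB   DBConfig
	HTTP HTTPConfig
}

type Store struct{ dsn string }

// NewStore receives only DBConfig, by value: it cannot see the HTTP
// settings and cannot write anything back to the caller's config.
func NewStore(cfg DBConfig) *Store {
	return &Store{dsn: cfg.DSN}
}

func main() {
	cfg := Config{
		DB:   DBConfig{DSN: "postgres://localhost/app"},
		HTTP: HTTPConfig{Addr: ":8080"},
	}
	store := NewStore(cfg.DB)

	// Later changes to the top-level config do not reach the store.
	cfg.DB.DSN = "changed later"
	fmt.Println(store.dsn) // prints postgres://localhost/app
}
```

A test for the store only has to build a `DBConfig`, not the whole system configuration.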

mtlynch
11 replies
21h47m

I really like Mat Ryer's work, and I've applied most of the ideas in the 2018 version of this article to all of my Go projects since then.

The one weak spot for me is this aspect:

NewServer is a big constructor that takes in all dependencies as arguments... In test cases that don’t need all of the dependencies, I pass in nil as a signal that it won’t be used.

This has always felt wrong to me, but I've never been able to figure out a better solution.

It means that a huge chunk of your code has a huge amount of unnecessary shared state.

I often end up writing HTTP handlers that only need access to a tiny amount of the shared state. Like the HTTP handler needs to check if the requesting user has access to a resource, and then it needs to call one function on the datastore.

I'd love to write tests where I only mock out those two methods, but I can't write simple tests because the handler is part of this giant glob where it has access to all of the datastore and every object the parent server has access to because it's all one giant object.

Nothing against Mat Ryer, as his pattern is the best I've found, but I still feel like there's some better solution out there.
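One possible direction, offered as a sketch rather than a fix for the pattern: have each handler constructor accept a narrow interface with only the methods it calls. The `noteGetter` interface, the `X-User` header (a stand-in for real authentication), and `fakeStore` are all hypothetical. The `statusFor` helper would normally live in a `_test.go` file.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// noteGetter is the only slice of the datastore this handler needs.
type noteGetter interface {
	CanRead(user, noteID string) bool
	Note(noteID string) (string, error)
}

// handleNoteGet depends on noteGetter, not on the whole server.
func handleNoteGet(store noteGetter) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user := r.Header.Get("X-User") // placeholder for real auth
		id := r.URL.Query().Get("id")
		if !store.CanRead(user, id) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		body, err := store.Note(id)
		if err != nil {
			http.Error(w, "not found", http.StatusNotFound)
			return
		}
		fmt.Fprint(w, body)
	})
}

// fakeStore mocks just the two methods the handler uses.
type fakeStore struct{ owner, body string }

func (f fakeStore) CanRead(user, _ string) bool { return user == f.owner }
func (f fakeStore) Note(string) (string, error) { return f.body, nil }

// statusFor sends one request through the handler and returns the status code.
func statusFor(h http.Handler, user string) int {
	req := httptest.NewRequest(http.MethodGet, "/note?id=1", nil)
	req.Header.Set("X-User", user)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := handleNoteGet(fakeStore{owner: "alice", body: "hi"})
	fmt.Println(statusFor(h, "alice"), statusFor(h, "mallory")) // prints 200 403
}
```

The real datastore satisfies `noteGetter` implicitly, so the big constructor can keep passing it in while tests supply the two-method fake.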

vjerancrnjak
2 replies
21h5m

It means the object created by NewServer is dealing with too much. Probably has too many data types coupled to it and too much behavior.

Simple example is adding a logger. If you add it as a dependency to the constructor, the object starts doing a bit more than initial simple implementation. It's fine to do it, but shame to not figure out how to log without editing the implementation of a simple thing.

Higher order functions (a logger decorator) get there to allow composition, but even they have their drawbacks.
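A logger decorator of the kind mentioned above might look like the following sketch. `withLogging`, `hello`, and `demo` are hypothetical names; the `demo` helper uses `httptest` only to make the example runnable and would normally be a test.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// withLogging wraps any handler with request logging, so the wrapped
// handler's own implementation never has to know about the logger.
func withLogging(logger *log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		logger.Printf("%s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

// hello is a simple handler with no logging dependency.
func hello() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	})
}

// demo runs one request and returns the response body and the log output.
func demo() (body, logged string) {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0) // no prefix or timestamp, for stable output
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/hello", nil)
	withLogging(logger, hello()).ServeHTTP(rec, req)
	return rec.Body.String(), buf.String()
}

func main() {
	body, logged := demo()
	fmt.Printf("%q %q\n", body, logged) // prints "hello" "GET /hello\n"
}
```

The drawback hinted at above still applies: the decorator only sees the request and response, not what happens inside the wrapped handler.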

It's still some form of structure that you can deal with, not a mistake.

taberiand
0 replies
18h34m

As you say, having a logger attached is one of those pragmatic and acceptable exceptions to the rule. In a perfect world we'd have the time to go to the trouble of implementing loggable types and data flows and associated higher order functions, in practice taking the compromise means getting the real business valuable work completed while still having the necessary (but usually "low priority") non-functional requirements like logging and metrics implemented.

oogali
0 replies
20h10m

I agree that too many arguments to the constructor may have the smell of too much coupling.

But if I really feel I can't avoid the need to pass a good amount of external context, I create a dedicated "options" struct and pass that into the constructor as a pointer. The purpose of the pointer (rather than pass by value) is that if I want default arguments, I can pass nil.

    type ServerOptions struct {
        logger    *magic.Logger
        secretKey string
    }

    func NewServer(options *ServerOptions) (*Server, error) {
        ...
    }

qaq
1 replies
12h11m

You can use Dependency Injection to solve this issue but in my view the added complexity is not really worth it.

metaltyphoon
0 replies
3h18m

Is this a Go thing? In C# land this is trivial.

hayst4ck
1 replies
19h17m

It means that a huge chunk of your code has a huge amount of unnecessary shared state.

Can you explain that a little more?

Which chunk of code has what shared state, and why is it unnecessary?

strawhatguy
0 replies
18h57m

Basically, not all the handlers will use every dependency the server (which is the entire program in this pattern) has. Not every handler will use a database, for example.

While I may prefer a struct for this instead of separate arguments, I do agree it's useful to capture "the world" as the set of all dependencies, even if some handlers don't use them (yet).

zer00eyz
0 replies
21h17m

I tend to write most of my logic in packages... so a "users" package or a "comments" package (if we were building HN). These have NO http interface! They do however each have their own "main" and some sort of CLI interface: "//go:build ignore" in the comment of that file is your friend.

jxramos
0 replies
21h28m

I've become increasingly sensitive to these high afferent coupling points in the repos I work on, especially the deeper I embed into the world of bazel and how dependency management and physical design influence the code I author.

Where possible, plugins are a great strategy to lay down these code seam points that don't force all possibilities upon some body of code, because fundamentally with plugin architectures you pick and choose what you want. Plugins are opt out by default; you must explicitly opt into a plugin for it to manifest. I've been calling software that has this quality "a la carte" style.

But in general you do what you need to do to avoid "doing everything so you can do anything".

j1elo
0 replies
19h35m
gabesullice
0 replies
7h43m

I felt this way for a long time. And maybe I'm projecting my past struggles onto what you're describing. I shared my current approach in a different comment already [0]. The gist is that I use an optional config struct, whose values get validated and copied over to my server struct inside NewServer. This makes testing much easier because I can mock fewer deps.

FWIW, I really tried to make the functional option pattern work for me, as many others have suggested, but eventually abandoned it. I felt it was a little too clever and therefore difficult to read, while requiring more boilerplate than the config struct + validate and copy pattern.

[0] https://news.ycombinator.com/item?id=39320170

ballresin
11 replies
22h36m

What is the value making main.go as small as possible?

Whose dreams come true in this scenario?

pphysch
1 replies
22h23m

For bespoke internal services, I like to keep main.go as flat as reasonable, like a "script". Handlers can have their own files but the bulk of the control flow and moving parts should be apparent from reading the main file.

Abstracting things away from main makes it less readable and is generally pointless for bespoke services that will be deployed in exactly one configuration.

donio
0 replies
22h2m

That's a nice way of putting it. When exploring a new codebase for the first time it can be very helpful to have main.go give you a high level idea about the overall structure of the program.

sebastianz
0 replies
22h32m

The initialization has to be done in a separate function that you call from the setup code for your end-to-end tests.

perbu
0 replies
2h21m

I assume you mean main() and not main.go.

main() is the only place where you can't return an error. In order to keep as much of the code as idiomatic as possible you just call something like run() where you can do so.

In addition there is the testing aspect. You can't invoke main() from your tests.

mosselman
0 replies
22h27m

“Whose dreams come true in this scenario?”

I love this! I will use this as well.

There are so many situations where I have a feeling that people are solving problems that don’t exist. In code I run into at work, code and projects I see online, etc

The “whose dreams are you making come true” really applies here, because dreams are exactly what they are.

I spent quite some time writing an automatic image resizer and optimiser for my blog. Does it matter? No! Should I have spent that time writing blog posts instead? Yes! Still I was chasing some dream.

Thanks for this image

mattboardman
0 replies
22h30m

Usually your main function can't be used by any other part of your program. You should move all component implementations to modules so they can be re-used elsewhere.

jrockway
0 replies
22h17m

I've never been a fan of making main.go one line. I create the logger, parse the flags, create objects from the flags, and call Run() or something. In the tests, you aren't ever going to do those things in the same way, so there is really no point in putting them in some other file.

janosdebugs
0 replies
22h31m

If you do, you can use the application as a library and most of your code will also be easier to test.

dilyevsky
0 replies
21h43m

The idea is to keep the untestable code as small as possible but in practice you just add a layer of indirection and all of your untestable init code is in a different castle.

arccy
0 replies
22h24m

not main.go but func main. This allows your run function to return an error and you only need to deal with the abruptness of using os.Exit once

abuani
0 replies
22h17m

The author goes on to explain a few scenarios where the pattern is helpful. It's not to keep main.go as small as possible, it's so that you can test parts of your main.go file properly. In my experience, if all of my logic is stuffed into `func main() {}`, then I can't actually test it. If I have a helper function (like run in this case), I can test out specific scenarios and ensure the application handles them properly. Some of the examples Mat gave were handling context cancellations properly.

jupp0r
8 replies
20h45m

I found fx (https://github.com/uber-go/fx) to be a super simple yet versatile tool to design my application around.

All the advice in the article is still helpful, but it takes the "how do I make sure X is initialized when Y needs it" part completely out of the equation and reduces it from an N*M problem to an N problem, ie I only have to worry about how to initialize individual pieces, not about how to synchronize initialization between them.

I've used quite a few dependency injection libraries in various languages over the years (and implemented a couple myself) and the simplicity and versatility of fx makes it my favorite so far.

Philip-J-Fry
7 replies
18h5m

All the advice in the article is still helpful, but it takes the "how do I make sure X is initialized when Y needs it" part completely out of the equation and reduces it from an N*M problem to an N problem, ie I only have to worry about how to initialize individual pieces, not about how to synchronize initialization between them.

I gotta say, I hate these dependency injection frameworks.

In a well designed system this should be trivial. Making sure something is initialised when you want to use it is just a matter of it being available to pass in a constructor as a parameter.

  stockService := NewStockService()
  orderService := NewOrderService()
  orderProcessor := NewOrderProcessor(stockService, orderService)
There shouldn't be any sort of "synchronisation" of initialisation needed because your code won't compile if you do something wrong. If you add a cyclic dependency you will clearly see that because you won't be able to construct things in the right order without an obvious workaround.

jupp0r
6 replies
16h5m

If you have ever topologically sorted 100 components connected in a complex graph by hand or found the right spot to insert the 101st, you'd quickly appreciate more help than a compiler check.

williamdclt
1 replies
2h6m

I’m not against DI, but I don’t find your argument convincing: having dependencies modelled directly with the simplest language constructs (variables and arguments) and validated by the compiler makes “debugging” a ton simpler than dealing with DI errors, even in a good DI framework. Having an error just means I wrote invalid code: even a junior can easily figure it out.

DI still has other advantages, but that’s not one

jupp0r
0 replies
1h49m

I don't disagree with you, I've argued against the usage of DI frameworks plenty of times on projects I was working on. Many are not well made, are overly complicated and do much more than one single thing.

Especially in Go, where you don't have destructors to help with shutdown, having common structure in place to help tear down components has always been a net benefit for me.

therealdrag0
1 replies
9h53m

I’m sure there’s a place for them.

But when micro-services are so common, it seems like people use them (Spring) because everyone else does, not because they actually provide needed value.

jupp0r
0 replies
1h56m

Spring is an overly complicated mess. I use DI when it makes things simpler, I don't see how Spring would ever do that.

Philip-J-Fry
1 replies
7h54m

Your dependency structure should just be a tree.

It should be inserted literally right next to its first use. Your IDE will literally point it out to you with red squigglies, because the places where you've added a dependency will be missing a parameter. Go to the highest one and add it on the line above.

jupp0r
0 replies
1h45m

I've never seen a tree graph, not without lots of global mutable state to cheat around DI. Your logger is just going to be needed almost everywhere.

What do you do on shutdown? In languages with destructors, that can automatically give you a call order in reverse of the construction order, but in Go you end up manually ordering things or just not having panicless shutdowns.
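For what it's worth, the manual version of reverse-order teardown can be fairly small. The sketch below is a hand-rolled illustration, not fx's API: the `closers` type and its methods are invented names, and the components are stand-ins that only print.

```go
package main

import "fmt"

// closers collects teardown functions in construction order.
type closers []func() error

// add registers the teardown function of a component that was just built.
func (c *closers) add(f func() error) { *c = append(*c, f) }

// closeAll runs the teardown functions in reverse construction order.
// It keeps going after a failure and returns the first error it saw.
func (c closers) closeAll() error {
	var first error
	for i := len(c) - 1; i >= 0; i-- {
		if err := c[i](); err != nil && first == nil {
			first = err
		}
	}
	return first
}

func main() {
	var cs closers
	for _, name := range []string{"db", "cache", "server"} {
		name := name // per-iteration copy (needed before Go 1.22)
		fmt.Println("start", name)
		cs.add(func() error { fmt.Println("stop", name); return nil })
	}
	if err := cs.closeAll(); err != nil {
		fmt.Println("shutdown error:", err)
	}
}
```

A chain of `defer` statements in a single function gives the same last-in, first-out order; a slice like this only helps once construction is spread across several functions.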

jopsen
6 replies
21h36m

I've recently been playing with ogen: https://github.com/ogen-go/ogen

Write openapi definition, it'll do routing, definition of structs, validation of JSON schemas, etc.

All I need to do is implement the service.

Validating an integer range for a querystring parameter is just too boring. And too easy to mistype when writing it manually.

Anyways, so far only been playing, so haven't found the bad parts yet.

dilyevsky
3 replies
19h44m

The problem with this approach is that writing openapi by hand from scratch is an incredibly tedious process. Writing Protobufs, capnproto or any similar IDL feels much more productive

xyzzy123
1 replies
18h31m

It's a bit icky but LLMs / copilot can speed up the creation of openapi specs a lot.

Agree it doesn't fix the "root" problem that the overall syntax is not ergonomic.

jopsen
0 replies
4h4m

My point was that writing an openapi, or other IDL is faster than writing the code to manually do these things.

And more accurate than LLMs.

Feels like whenever an LLM could code it, you'd be better off not having the boilerplate code at all.

flowardnut
0 replies
2h5m

it lacks flexibility but i really enjoy grpc-gateway for 99% of my work

https://github.com/grpc-ecosystem/grpc-gateway

topicseed
0 replies
18h48m

Or, if you're more into publishing an Openapi spec from your Go code, I do like danielgtaylor/huma[1] and swaggest/rest[2].

[1] https://github.com/danielgtaylor/huma

[2] https://github.com/swaggest/rest

perbu
0 replies
2h33m

I've started doing the same but with oapi-codegen: github.com/deepmap/oapi-codegen

I thought it would be boring writing the spec, but it was not nearly as bad as I expected. Also, a spec is needed, so might as well write it up front.

lelandbatey
4 replies
21h56m

I want to see a greater acceptance of this idea:

My handlers used to be methods hanging off a server struct, but I no longer do this. If a handler function wants a dependency, it can bloody well ask for it as an argument. No more surprise dependencies when you’re just trying to test a single handler.

For HTTP services in any language, your handlers will usually end up with a lot of business logic, logic which probably has many dependencies. I see single handlers using all of the following on a regular basis: DB, cache, blob storage, some kind of special authz thing specific to your endpoints, maybe some fancy licensing checker, a queue or two, a specialized logger, and specialized metrics client. Many of those (metrics, request/response logging) can live in middlewares most of the time, but in every code base there will be times where you need to do something custom with one or the other. As time passes, the more I wonder "why aren't these all just function parameters?"

Yes, that would be a lot of function parameters (9+ for a single handler, before even getting into the request or custom params themselves), and we all have many rules of thumb and linter rules which try to keep us from having lots of function parameters. But it's not like we're not writing code which depends on all those dependencies, instead we're just sticking them on the "server" class/struct and pretending that because the method signature is shorter, we have fewer dependencies!

As time passes, I find myself wishing more and more for code that takes all its dependencies in the function/method signature, even if there's 20 of them; at least then we wouldn't be lying about how complex the code's getting...

topicseed
2 replies
18h40m

I've always had my handlers individually set up as structs, each with a method to handle the route/request.

    type CreateUser struct {
        store  storage.Store
        cache  caching.Cache
        logger logging.Logger
        pub    events.Publisher
        // etc
    }

    func (op CreateUser) ServeHTTP(ctx, req, rw) {}

    // or if you have custom handlers
    func (op CreateUser) ServeHTTP(ctx, input) (output, error) {}

And in my main.go, or where I set up my dependencies, I create each operation, passing it its specific dependencies. I love that because I can keep all the helper methods for that specific operation/handler on that specific struct as private methods.

It does get tedious when you have one operation needing another, as you might start passing these around or you extract that into its own package/service.

lelandbatey
1 replies
11h48m

This is kinda missing the point; each handler needs a lot of deps to do its job, and the most obvious place to put them is in the parameters of the function. That is what I want. I do not want more indirection for aesthetics; I want clarity, even if it's brutal clarity.

Whether all the deps are in the method receiver (the parent struct) or in a struct that's a param; it's all just more indirection to hide all the "stuff" that we need cause we think it's ugly. I dream of a world where we don't do that.

topicseed
0 replies
8h23m

You do have to instantiate that struct, and you can do it with.... a beautiful NewCreateUser(dep1, dep2, dep3, ..., dep20) *CreateUser {...}. This is essentially what he recommends with his "func newMiddleware() func(h http.Handler) http.Handler".

wereHamster
0 replies
21h49m

It doesn't have to be 9+ separate arguments, in some languages it can be a single 'context' or 'env' object that contains just what the handler needs, something like `handleHello({ db, cache, blobStore, authz }, req, res)`. That way, if two handlers use the exact same context you can reuse, but it's also easy enough to declare a per-handler context at the call site.

matt_callmann
3 replies
21h50m

is there a git repo with example code?

mtlynch
2 replies
21h43m

Not OP, but I design my Go projects with a very similar pattern that I learned from OP's 2018 post.

I think this is a pretty good example of a real-world implementation:

https://github.com/mtlynch/picoshare

Particularly these files:

https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...

https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...

karolist
1 replies
20h12m

out of curiosity, why no sort-of-established pkg and internal dirs? What do you think of https://github.com/photoprism/photoprism structure?

mtlynch
0 replies
20h1m

I'm not familiar with that package structure, unfortunately. It might be good, but I'm not sure what the reasons are for structuring the project that way.

Animats
2 replies
19h36m

I just run Go servers under fcgi. You get orchestration and crash recovery with a very simple interface. Fcgi will launch server processes as needed, feed them events, and shut them down when there's no traffic. Performance is good, and you can run on cheap hosting.

chubot
1 replies
17h38m

Which hosting do you use? I use fastcgi with python on Dreamhost and it works fine, but I’m sorta worried that they’ll turn it off because it seems kind of niche and under-documented

Animats
0 replies
15h18m

Dreamhost too. Dreamhost will let you run a continuously running process.

The amount of work you can get done on low-end shared hosting is really quite impressive.

liampulles
1 replies
20h40m

I agree with a lot of this, I'll add my own opinions:

* I would pass a waitgroup with the app context to service structs. This way the interrupt can trigger the app shutdown via the context and the main goroutine can wait on the waitgroup before actually killing the app.

* If writing a CLI program, then testing stdout, stdin, stderr, args, env, etc. is useful. But for an http server, this is less true. I would pass structured config to the run function to let those tests be more focused.

* I disagree with parsing templates using sync.Once in a handler because I don't think handlers should do template parsing at all. I would do this when the app starts: if the template cannot be parsed, the app should not become ready to receive any requests and should rather exit with a non-zero exit code.

hyeomans
0 replies
4h45m

I find your first point interesting, wouldn’t be that solved by context propagation and waiting for the server to shutdown? Thanks!

earthboundkid
1 replies
20h23m

The validator should return map[string][]string so that a request can have multiple problems with one field.
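A quick sketch of that return type, using a hypothetical `SignupRequest` with made-up rules. For brevity it leaves out the context parameter that the article's Valid method takes.

```go
package main

import (
	"fmt"
	"strings"
)

// SignupRequest is a hypothetical request type used only for illustration.
type SignupRequest struct {
	Username string
	Password string
}

// Valid returns every problem found, keyed by field name.
// An empty map means the request is valid.
func (r SignupRequest) Valid() map[string][]string {
	problems := make(map[string][]string)
	if r.Username == "" {
		problems["username"] = append(problems["username"], "must not be empty")
	}
	if strings.ContainsAny(r.Username, "<>") {
		problems["username"] = append(problems["username"], "must not contain angle brackets")
	}
	if len(r.Password) < 8 {
		problems["password"] = append(problems["password"], "must be at least 8 characters")
	}
	if !strings.ContainsAny(r.Password, "0123456789") {
		problems["password"] = append(problems["password"], "must contain a digit")
	}
	return problems
}

func main() {
	req := SignupRequest{Username: "a<b", Password: "short"}
	for field, msgs := range req.Valid() {
		fmt.Printf("%s: %s\n", field, strings.Join(msgs, "; "))
	}
}
```

Here the password field reports two problems at once, which a `map[string]string` could not express without concatenating messages.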

earthboundkid
0 replies
20h22m

The sync.Once should be sync.OnceValues instead.
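For reference, a minimal sketch of lazy template parsing with `sync.OnceValues`, which was added in Go 1.21. The template text and the `parseGreeting` name are made up; the article's handler is not reproduced here.

```go
package main

import (
	"fmt"
	"html/template"
	"os"
	"sync"
)

// parseGreeting parses the template on the first call only. Later calls
// return the cached template and error. Requires Go 1.21 or newer.
var parseGreeting = sync.OnceValues(func() (*template.Template, error) {
	return template.New("greeting").Parse("Hello, {{.}}!\n")
})

func main() {
	for _, name := range []string{"Ada", "Grace"} {
		tpl, err := parseGreeting()
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		if err := tpl.Execute(os.Stdout, name); err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
	}
}
```

Compared with a bare `sync.Once`, this removes the separate package-level variables for the template and its error.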

sylware
0 replies
19h28m

I did write my own HTTP stuff in C (and more generally internet stuff), on linux (sometimes without a libc, namely direct syscalls), running on ARM64 and x86_64.

And I plan to move to rv64 assembly once I can get reasonably performant hardware (it already exists, but it is extremely hard to get where I am from, given how I operate). I dunno if it will be bare metal or with a linux kernel first (because a minimal TCP stack is already a big thingy).

sethammons
0 replies
21h42m

I like a lot of what they've done here. My testing looks a bit different however.

    srv, err := newTestServer()
    require.NoError(t, err)
    defer srv.Close()

    resp, err := http.Post(
        fmt.Sprintf("http://localhost:%d/signup/json", srv.Port()),
        "application/json",
        strings.NewReader(` {"email":"test@example.com", "password": "p@55Word", "password_copy": "p@55Word"} `),
    )

In my newTestServer, I spin up a server with fakes for my dependencies. If I want to test a dependency error, I replace that property with a fake that will return an error. I can validate my error paths. I can validate my log entries. I can validate my metric emission. I can validate timeouts and graceful shutdowns.

After the server starts, I inspect to determine which port it is running on (default is :0 so I have to wait to see what it got bound to).

My "unit" tests can test at the handler level or the http level, making sure that I can fully test the code as the users of my system will see it, exercising all middleware or none. I can spin up N instances and run my tests in parallel.

sesm
0 replies
22h5m

TLDR: optimize for unit tests and do DI with explicit function arguments. Looks kind of similar to Dropwizard.

romantomjak
0 replies
22h20m

Great article with lots of interesting ideas. Can't believe I didn't know about signal.NotifyContext. Finally I'll be able to actually remember how to respond to signals instead of copy-pasting that between projects.

jurschreuder
0 replies
1h32m

This is 100% not how I write it.

Only thing I agree on is putting all the paths in one file.

In most other programming languages I've done a lot of research how to make it nice and clean.

Was hoping this was it for Go because I'm cleaning up a big project.

But my very basic no nonsense current setup seems better to me than this in many ways.

If anybody has another example that is a lot better (and I don't mean more complex, I don't have those ego issues), I am very interested.

But this I want to hide as best as I can from my dev team; it is all wrong. It's clever in a lot of ways, but it's wrong.

It does not have unit testing at all, all these tests would be duplicated in the end-to-end test.

I also like end-to-end tests better but why put them here, way better to put them in postman for example then you have the most up to date documentation always auto-generated.

Passing the config, man I had so many discussions with junior developers about this, don't do that you'll make things dependent on the config and cannot reuse them in other programs. But that was already mentioned a lot here.

There are also a lot of functions with like 10 arguments passed. If you have that many arguments, just pass a struct containing a lot of the arguments; it's always super confusing when people make functions with 12 arguments. I'm always counting them, and after 14 times counting I rewrite their function.

It's a matter of style so keep doing it this way if you like it, but it's not my style at all it makes no sense at all to me.

If anybody knows a better example please tell me.

jbmsf
0 replies
21h51m

I don't write go, but I like these patterns. Feels fairly universal for testable code.

I never want to see another (esp. Python) Quick Start guide that treats dependencies as implicit/static/untestable.

bumpa
0 replies
19h54m

The encode example contains a bug and a lint issue. Firstly, calling w.Header().Set after w.WriteHeader is likely a bug, as the w.WriteHeader method call should occur after setting the headers.

The second issue involves passing an unused *http.Request, which will likely cause the linter to flag it.

arccy
0 replies
22h23m

i really like the patterns in this post, pretty much what i've also settled on after much experimentation with different styles.