
Take a look at Traefik, even if you don't use containers

sph
95 replies
1d3h

Traefik is pretty cool, but suffers from the same terrible problem as Ansible: there is a lot of documentation, and a lot of words written, yet you can never find anything you need.

I have used it since v1 and I routinely get lost in their docs, and get immensely frustrated. I have been using Caddy for smaller projects simply because its documentation is not as terrible (though not great by any stretch).

Technical writers: documentation by example is good only for newbies skimming through. People familiar with your product need a reference and exhaustive lists, not explanations of different fields spread over 10 tutorial pages. Focus on those who use the product day in and day out, not solely on the "onboarding" procedure.

This is my pet peeve and the reason why I hate using Ansible so damn much, and Traefik to a lesser extent.

jethro_tell
17 replies
1d

One of the problems that the yaml-interpreter class of languages, or whatever you'd call them, suffers from is the fact that yaml itself is a language, and tends to be more or less undocumented in the interpreter docs.

It's sort of assumed that you are going to do extremely simple tasks on very flat data structures. That doesn't tend to be the reality that most of us live in. And to really get the most out of these languages you have to understand an entire unspoken set of rules on how to use yaml. That's never really pointed out in the docs.
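
A classic instance of those unspoken rules, as a hedged sketch (the variable name here is made up): in Ansible-style tools, a value that starts with a Jinja expression must be quoted, because a bare `{` opens a YAML flow mapping.

  # Invalid: YAML parses "{" as the start of a flow mapping
  #   port: {{ base_port }}
  # Valid: quoting makes it a plain string for the templating layer
  port: "{{ base_port }}"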

Additionally, there are docs for the unique settings of each module, but when it comes to the standard settings, it's rarely clear how to operate on the data that might be returned, or how to combine it with anything mildly complex. You are given a dozen one-stanza examples for each item, like a stack of ingredients, and then told to bake a cake.

I've had this experience with basically every one of the various yaml interpreter systems I've used.

After a few 100k lines of yaml I can get things done, but the docs are useless as anything other than a listing of settings.

cromka
12 replies
19h41m

Isn't that why toml seems to be increasingly used to replace yaml in projects?

jethro_tell
6 replies
14h53m

I hope not; toml is even worse at complex things and only slightly better at the stuff that isn't confusing. Try adding a k:v pair to a mildly complex dict.

At this point, I'm pushing into a place where I'm just going to switch to go, because it's getting to be a mess.

vundercind
5 replies
14h23m

It’s insanely better at config.

It’s about as bad at being a programming language or data structure serialization format, though.

tremon
1 replies
2h39m

But then, what is a config file if not a representation of a data structure?

vundercind
0 replies
2h1m

With the restriction that it has to be represented in human-friendly text.

JSON’s a crazy-bad serialization format, too, for that matter. It doesn’t even know what a damn integer is.

jethro_tell
1 replies
13h58m

But yaml is fine at config. It sucks at looping, conditionals and data structures; if you aren't fixing that, it's just another standard we have to learn, so thanks for that.

crabbone
0 replies
4h26m

Nope, it isn't.

There are so many things that any somewhat complex system will want that aren't expressible in TOML... it's not even a contender.

So, one problem a lot of configurations are trying to solve: modularity, i.e. how to allow different actors to change the parts of the configuration they want. Everything under /etc nowadays is of the form /etc/*.d/*.*; that is, all configurations are directories of multiple files with some ridiculous rules (like prefixing file names with digits so that they sort and apply in the "right" order, etc.). XML had a better approach with namespaces and schema, but maybe not a perfect one.

Polymorphism. Any non-trivial configuration system will have plenty of repeating parts. NetworkManager connection configurations? They are all derived from the same "template". Systemd device services? Same thing, they all come from the same "template". There are plenty more examples of this. But languages like YAML or TOML don't have a concept of polymorphism in them. It is never encoded in the configuration itself. Instead, every tool that needs to be configured and needs some degree of polymorphism rolls its own version.

Constraints. It's often impossible to describe the desired configuration by specifying exact values. Often the goal can be described as "less than" or "given the value of X, Y should be a multiple of X" and so on. Such concepts are, again, not expressible in TOML or YAML and friends.

NB. Types are a kind of constraints.

Identity. It's often necessary in configuration to distinguish between two sub-sections that merely look the same and two sub-sections that designate the same exact object. Like, when configuring VMs with e.g. disks: are they supposed to mount the same disk, or does each VM need a separate disk that just has the same physical characteristics?

mardifoufs
2 replies
15h3m

In my experience toml is worse at anything complex. It's nice as an .ini replacement, but it makes even yaml look sane in comparison if you want to use it for very complex or deeply nested stuff. But it wasn't designed to do that anyway.

vundercind
1 replies
14h22m

Am I alone in greatly preferring nesting in toml compared with yaml?

bvrmn
0 replies
8h11m

Did you try to convert any mid-complexity ansible role into toml? It was a very interesting exercise for me, and quite conclusive.

tormeh
1 replies
15h34m

Toml is great for simple use-cases. For complex ones you have the same problem that yaml has: templating a language with significant whitespace via text substitution is a horrible, horrible idea. Somehow this sad state of affairs has become the industry standard.

jethro_tell
0 replies
14h41m

It's not even the whitespace; a good linter or language server can handle that. It's not that so much as the fact that the most complex data structure is a list.

If you want to get crazy, you can push a dict into a list and operate on it, but it gets tough at the second level. And don't get me started on if/else statements.

ornornor
2 replies
23h52m

To illustrate this point, here is how to have a multi-line value in yaml: just kidding, it's so confusing that there is a whole website to help you figure it out: https://yaml-multiline.info/
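
For the record, a quick sampler of the block-scalar variants that site catalogs (keys are illustrative):

  # "|" (literal): newlines are preserved
  literal: |
    line one
    line two
  # ">" (folded): newlines become spaces
  folded: >
    line one
    line two
  # Chomping indicators: "-" strips the trailing newline, "+" keeps them all
  stripped: |-
    no trailing newline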

hombre_fatal
1 replies
2h8m

Those examples don't look confusing, except that there's more than one way.

ornornor
0 replies
2h2m

Good luck remembering which is which :)

tootie
0 replies
3h22m

I honestly wonder why not just write your web server in node or something. It would be traceable and testable and probably performant enough. There's just so much arcana inside platforms like traefik or nginx where they do all this miraculous stuff if you just add the right flags, but also when it doesn't work it's a total black box and there's no way to discover what it thinks it's doing.

lopkeny12ko
13 replies
1d1h

This take is, at best, disingenuous, and at worst, dangerous. The Traefik maintainers and community contributors (including myself) have collectively invested hundreds of man-hours writing and improving documentation, specifically in response to feedback from users that things are hard, unintuitive, or complex.

You are discounting massive amounts of unpaid labor done specifically for people like you. At this point, if you can't find what you're looking for, it's on you. Maybe do a little bit of your own homework instead of throwing your hands up after 2 minutes and crying to the maintainers.

alex_lav
7 replies
1d1h

Investing a lot of time and trying really hard is not the same as adding a lot of value. If your users don't find value in your documentation, saying "But we spent a lot of time on it!" doesn't really change anything.

And, to be clear, I have no idea whether the criticism of the person you're responding to is valid. But I also know that your response does not negate their criticism at all.

lopkeny12ko
6 replies
1d1h

How about submitting a PR to improve the documentation instead of complaining about it?

joshmanders
1 replies
1d

Customer: I can't find anything I'm looking for in your store.

Store: I spent a lot of time arranging things around in the store, if you can't find what you're looking for you can stop complaining and write signs for us.

Customer: Or I can just use your competitor who actually cares about their customers. ¯\_(ツ)_/¯

nsteel
0 replies
8h3m

Except the relationship between stores, customers and competitors all revolves around money and doesn't make sense in this context.

barfbagginus
0 replies
16h56m

A PR needs maintainer approval. And the maintainer I've seen thinks the documents are good enough already. In cases like that, a PR might not be able to solve the problem.

Complaining about it reroutes people to better projects, and pushes the project to fix the problem.

alex_lav
0 replies
1d

Your adversarialism isn't a good look. Users are allowed to have opinions; this does not mean they have to maintain the work themselves.

Thiez
0 replies
1d1h

This is why OSS looks like a cult at times. People are allowed to criticize your project and complain about it. They have no obligation to become a contributor. "Submit a PR" is such a conversation killer.

HellzStormer
0 replies
23h32m

I didn't try Traefik's documentation, but the complaints appear to be somewhat structural. Meaning a PR would possibly need to restructure at least part of the documentation, or add a whole section of documentation of a different type.

You can't expect someone not core to a project to just propose to restructure the whole documentation. Which may also mean changing the website.

And in any case, such an overhaul coming from a "nobody" would very likely be rejected as too large, incomplete, or simply not desirable.

Re-structuring needs to be pushed for by at least one person from the core team.

So yeah "Just submit a PR" in that context is not an answer, it's an excuse to avoid trying to understand the problem and actually improve the situation.

arp242
2 replies
23h39m

I never used Traefik and have no opinion on it one way or the other as such. But if this is the response to some criticism of the documentation – which you can agree or disagree with – then you've done more to turn me off Traefik than anything anyone here can write.

pipe01
1 replies
7h36m

Note that the commenter isn't associated with traefik itself

arp242
0 replies
6h53m

Right, so: the way it was written implied, at least to me, that they had spent a significant amount of time on it (I tried a quick grep in "git log" to see if I could find them, but couldn't find anyone matching the username here).

halJordan
1 replies
1d1h

Disagree that this isn't a generic problem. And I'll take the same amount of umbrage at you calling it disingenuous. There are dual needs here. Having to read a story and take in a wholly unrelated workflow just to discover only half of the switches available for the feature I'm looking up is a problem.

And when there isn't just straight documentation of what's been implemented, it becomes an unreasonable gate to usage, limiting customers to only the flows imagined by the technical writer.

Which itself breeds this sort of refusal to participate. Either the end user is ungrateful and needs to express their gratitude through silence, or there's a smug moderator who's read everything, knows which paragraph of which tutorial has the answer, and harangues anyone asking with a link and a "why didn't you read sentence 5 of paragraph 2 of a tutorial written 2 years and 3 major versions ago?"

patmorgan23
0 replies
2h28m

Yes, those are two different kinds of documents for two different audiences.

You need both a javadoc-style big ol' list of every function and what it does, and a narrative/workflow/primer-based section of your documentation.

plantain
8 replies
1d2h

My latest gripe in this category - opentelemetry. Thousands of pages. Very little about actually achieving basic common workflows.

tnolet
1 replies
1d

Oh boy, that hits home. I've been deep in the OTEL world for the last few months and the official docs are very, very undercooked.

hooverd
0 replies
15h31m

The best OTel docs I've seen have been from observability vendors. The CNCF needs some volunteer technical writers or something. I think a lot of those docs suffer from being written by people who know the spec inside and out.

silisili
1 replies
22h50m

Same experience. Otel has some of the wordiest docs I've ever come across that say very little.

Further, I found a lot of little bugs that are hard to Google, or, when Googling, open issues that are either known and being worked on, or have no response at all.

I ended up just throwing it in the garbage and using direct connectors. I like what Otel is trying to achieve, but it feels extremely opaque and half baked at the moment.

zaphirplane
0 replies
10h25m

Yes! Otel is exactly what I was thinking of. It's for people working on it, with access to tribal knowledge. For mortals working to add it to their service, it's not fit for purpose.

jackthejacky
1 replies
16h13m

Oh man, I FEEL this comment. That was one absurdly awful set of documentation, because not only does it have a lot of confusingly placed repeat content, it also follows the philosophy of giving only a top-level conceptual primer for everything, and of explaining the actual main use case of a component only 3 navigation pages deep.

So a beginner has to jump through a BUNCH of pages to get a primer, and an expert has to bookmark the couple of actually-useful pages and later give up and just look at github for specific operators/processors when they already know the basic config inside out.

hinkley
0 replies
16h3m

Several things I needed were in completely separate documents not linked together.

jdub
0 replies
9h27m

When I started using Honeycomb, I had such a wonderful integration experience with their Beeline SDKs.

Then they transitioned to OpenTelemetry – for very good, justifiable, "good community member" reasons – and yikes, everything got so much more complicated. We ended up writing our own moral equivalent to the Beeline SDK. (And Honeycomb have followed up since with their own wrappers.)

There's so much I love about Open Source, but piles and piles of wildly generic, unopinionated code... ooft. :-)

arkh
0 replies
4h28m

I feel like it is endemic to anything OPS / DevOPS: lots of uselessly verbose "documentation" but no list of whatever you really need.

All in the name of selling products which abstract those parts, consulting or courses.

linsomniac
7 replies
1d2h

Do not agree WRT ansible; I've been using it for well over 5 years and usually a google search points me right at the correct part of the documentation to answer my question. Ansible, the tool itself, can be a bit obtuse, largely IMHO because of the YAML source language, so some concepts are hard to translate into the tool, but the documentation has never bothered me.

As far as "a lot of words written, can't find what you need" goes, Fortinet is my poster child there (based on trying to use it a decade ago). Everything I looked up there had 10, 20, 30 pages of introductory material with the Fortinet stuff spread throughout it.

sph
6 replies
1d2h

Alright, please link me to an exhaustive list of Jinja filters supported by Ansible out of the box. I'll wait.

What you are given is https://docs.ansible.com/ansible/latest/playbook_guide/playb... and you basically need to read/scan each example until you find what you need [1]. Do you call that good, especially when these are basically the only way of doing anything a little complex? That's a sure way of killing my flow and productivity in its tracks. I have been through this page in anger a dozen times, and I still have no idea what Ansible filters can or cannot do.

Also, using Google to find stuff is "cheating". The goal of documentation is to be usable as a reference; if you need an external tool to find anything in it, that defeats its purpose a bit. When people wrote documentation books, they had to make sure they were usable, legible and efficient. These days, apparently, that's become a lost art.

1: these examples are not even exhaustive, because they don't list all the builtin Jinja filters; chances are that what you need isn't listed on that page, but you should instead refer to https://tedboy.github.io/jinja2/templ14.html
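
To make the complaint concrete, here is the kind of filter chain one ends up hunting for. A hedged sketch: the variable and attribute names are made up, but `selectattr`, `map`, and `join` are stock Jinja2 filters.

  - name: Build a comma-separated list of enabled host names
    ansible.builtin.set_fact:
      enabled_hosts: "{{ hosts | selectattr('enabled') | map(attribute='name') | join(',') }}"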

throwaway984393
2 replies
1d2h

I like that it forces users to read the docs to find the functionality. Users don't read the docs, and then they wander around the internet looking for a random blog post with a snippet for one problem, and they don't ever really learn how to use the program.

Users are a bit like high school students just skimming books for an answer to fill in on a test. They need to be forced to learn.

freedomben
1 replies
1d1h

This doesn't make a lot of sense in the context of the parent comments. Did you post this to the wrong parent? To accomplish what you are asking, a project needs actually good documentation. Everyone has agreed that that is good. The only real disagreement here is whether the Ansible docs have it, and regardless of whether they do, they definitely have the example-driven docs that I think you're saying shouldn't exist, so you definitely aren't supporting the Ansible status quo.

coryrc
0 replies
15h25m

The rest makes more sense if you assume parent post meant to write "read the code".

linsomniac
0 replies
1d1h

As you say, Ansible's filter list does not include the base Jinja2 filters, which I guess is a difference of opinion. I feel that's preferable to reproducing the Jinja2 documentation, especially as the Jinja2 filter list is the first (non-TOC) link on the page.

Also going to disagree about "using Google is cheating". The purpose of documentation is to help me get stuff done. The Internet is not printed on dead trees; I don't want to read through a TOC or index looking for what I want when I'm searching, I want to use a search engine. I often don't want a reference, I want to quickly find how to do something. I rarely want to read about all the filters; instead I want to find the even/odd filter, or the default or omit filter. Yes, sometimes I want to brush up on all available filters, but that's rare.

freedomben
0 replies
1d1h

I'm not GP, but I agree with both them and you so thought I'd chime in.

You're absolutely right that there are big omissions/holes in the Ansible docs, but I also think that using Google is not "cheating." My ideal of great documentation sounds exactly like what you would agree with: a complete and comprehensive "book" (it could become a physical printed book, but needn't be one, as it should be equally usable with good old-fashioned hyperlinks). It should have a logical flow, introductory sections to describe pre-requisite knowledge/concepts and things that are broadly applicable to the project as a whole. It should have a table of contents, and it should definitely have an index and comprehensive lists/tables of API details such as available fields/properties, which options are valid (for enum fields), etc. Your example of Jinja filters supported by Ansible is a great one. I really miss the 90's era here, where such manuals were common practice, even for things like PCs.

With that ideal described, though, I think it's important to recognize pragmatism and feasibility. Documentation takes time and money to produce. Search tools (including Google) already exist and can provide a valuable addition without spending time/effort on it, so I think they should be used. That said, I agree that it's not a good idea for doc writers to rely on that for things to be found! Table of contents, logical flow, and indexes should absolutely be thought through. If the documentation is just a bunch of random unorganized and uncatalogued pages that can only be found with a search engine, that is really bad and they should feel bad.

I think Ansible falls right in the middle there. It undoubtedly has some real glaring omissions/holes in it, but it is also not nearly the worst I've seen. I do dread having to go to the Ansible docs though, which is an indictment of their quality, and the more I think/write about this the more I agree with you lol.

SoftTalker
0 replies
3h40m

Maybe you use ansible in a more complex environment than I do. I use ansible to provision servers, and jinja2 filters are some of the least-used things for me. I try to keep my ansible roles short and simple, and needing obscure jinja2 filters is a clue to me that I might be getting too fancy.

That said, when I have a question about jinja2, I find that the jinja2 docs are better than the ansible docs.

johanbcn
6 replies
1d1h

Technical writers: documentation by example is good only for newbies skimming through. People familiar with your product need a reference and exhaustive lists, not explanations of different fields spread over 10 tutorial pages. Focus on those who use the product day in and day out, not solely on the "onboarding" procedure.

I agree. We all would benefit by giving more exposure to documentation frameworks such as https://diataxis.fr

FelipeCortez
0 replies
4h5m

These are "the same". Diátaxis is the fork the author made after (I presume) they left Divio.

yread
0 replies
9h41m

MSDN also follows these principles

molszanski
0 replies
5h31m

Thank you for the link <3

adolph
0 replies
1d

I'm glad to have clicked through for curiosity's sake. Diátaxis is tremendously interesting.

For folks who might recognize the author's name:

Daniele Procida: Director of Engineering at Canonical. Creator of Diátaxis and BrachioGraph. Django core developer. Fellow of the Python Software Foundation.

deepersprout
5 replies
1d2h

I usually feed all the documentation to an LLM and ask my question to the LLM, which fetches the info with RAG. For example:

how do i configure Traefik to proxy traffic to two backend services?

Configuring Traefik to proxy traffic to multiple backend services is a common use case. You can achieve this by defining multiple entry points and routers in Traefik's configuration, each pointing to a different backend service. Here's a basic example of how you can configure Traefik to proxy traffic to two backend services:

  # traefik.yaml
  
  # Enable API and Dashboard
  api:
    dashboard: true
  
  # Define entry points
  entryPoints:
    web:
      address: ":80"
    app1:
      address: ":8081"
    app2:
      address: ":8082"
[...more stuff...]

san-fran
2 replies
1d2h

This might be an additional way to quickly iterate on the content of the documentation itself. If I were the one writing, an easy test would be passing the documentation to a lay person and asking them if they have what they need to perform X by following the documentation.

Perhaps having a focused LLM generate the steps could help catch some documentation deficiencies.

freedomben
1 replies
1d1h

If I were the one writing, an easy test would be passing the documentation to a lay person and asking them if they have what they need to perform X by following the documentation.

What kind of documentation is this, though? Is this how to bake a cake or tie a necktie, or is it how to set up a reverse proxy for the services in your k8s cluster?

If it's something a lay person could do then I think this is a good strategy (though depending on the size/scope of the project/documentation it does seem like a pretty big effort to undertake without compensation), but if it's something highly technical like Traefik, I expect a lay person to not even understand half the words/vocabulary in the documentation, let alone be able to perform X by reading and following it.

all2
0 replies
1h15m

If we push this into the software development domain, my expectation would be something like "documentation should allow a software developer to do this thing without knowing the underlying tooling".

how to set up a reverse proxy for the services in your k8s cluster

Going off this specifically. I don't know how to do this. I actually have a k8s cluster on a home server waiting for me to do exactly this. Ideally there would be a doc that would start with ingress selection, and then guide a user through how to get it set up with common use-cases. Or something like that. Like others in this conversation, I've been leveraging LLMs with varying degrees of success to try and navigate this.

freedomben
0 replies
1d1h

Can you describe your process more? Which LLM are you using? Are you doing something specific to make it use RAG, or is that automagic (might be obvious depending on which LLM you are using)? How do you feed the documentation in? For example, when the documentation has more than one page, how do you get that content into the LLM? Is it part of the prompt or something you've tuned it on? Do you have to clone the docs site, turn it into plain text and feed that into the prompt, or can you pass a URL and have it crawl ahead of time or something?

This is the system I've been dreaming about but haven't had time to dig into yet. I've got ollama and openwebui set up now though, and with OpenAI getting bigger context windows it seems like it might be possible to inject the whole set of docs into the prompt, but I'm not sure how to go about that.

bsenftner
0 replies
7h7m

I did this with Tailscale, which I have endless problems getting to work reliably. Their documentation is a joke. The process is pretty simple: scrape, then concatenate into a large text file and submit.

ollien
3 replies
5h44m

Pydantic falls into this box for me. The maintainer refuses to build API reference documentation, as they feel that there should only be one source of information. It's their project, of course, but every time I need to find a method on an object, I am scouring pages of prose for it. Sometimes it's just easier to read the source.

angra_mainyu
0 replies
2h33m

Haproxy does the whole documentation side of things very well.

The docs are very straightforward and thorough.

DanielHB
3 replies
4h37m

Same reason why the Terraform AWS Provider has better documentation than the AWS documentation:

https://registry.terraform.io/providers/hashicorp/aws/latest...

If I can't find the answer to what I need there I usually resort to LLMs; they are surprisingly good at fetching the info you need out of these massive documentation sets. The failure rate is quite high though, so a lot of trial and error is required, but the LLM at least gives you some hints about where to look.

danielvaughn
2 replies
4h34m

My primary use case for LLMs so far has indeed been to avoid terrible technical documentation.

walthamstow
1 replies
4h31m

I'm on the lookout for an LLM that is relatively small and focused on technical documentation. I don't need it to write prose or haiku, just answer my questions based on documentation

danielvaughn
0 replies
3h36m

Yep. Build it into an editor and I'd be the first customer.

rezonant
2 replies
8h23m

Technical writers: documentation by example is good only for newbies skimming through. People familiar with your product need a reference and exhaustive lists, not explanations of different fields spread over 10 tutorial pages.

Well said, and this extends far beyond Traefik. Far too much documentation these days is tailored for people who have never used software of its type. This was a workable strategy during the Great Developer Boom, but that's more or less over now.

As a developer who didn't come from this Boom, I have been constantly frustrated by this trope, and I hope the changes in the industry will tip the scales back toward solid reference documentation, so that I can feel confident in deploying more of these technologies.

Putting that more general note aside, I have been a Traefik user for years and I do recommend it. But a lot of what it does is difficult to cite using solid docs.

sharperguy
1 replies
8h11m

I've often resorted to just looking through the project on github and finding whatever source file is responsible for parsing the configuration files to figure out what each option does.

acheong08
0 replies
7h29m

This applies nearly universally in open source. If documentation isn’t sufficient, jump into the source code and most things can be figured out. Reading other people’s code is also a great way to learn/experience different approaches

samuell
1 replies
8h58m

Ansible definitely requires constant lookups in the documentation.

I've found a pretty good workflow using ansible-doc though, with two or three aliases that I use constantly:

    alias adl='ansible-doc --list'
    alias adls='ansible-doc --list | less -S'
    alias ad='ansible-doc'
Then I'll:

1. Use adls to quickly search (in less with vim bindings) for relevant commands,

2. Check the docs with `ad <command>`.

3. Almost always immediately jump to the end (`G`) to see the examples, which typically provides a direct answer to what I need.

Since authoring Ansible scripts is so dependent on the docs, I think they really should make this process work better out of the box though, providing some interface to do these lookups quicker without a lot of custom aliases.

pragma_x
1 replies
3h35m

Possibly unpopular opinions to follow. This is made even worse by:

- Having documentation split between v1 and v2 that is similar yet different enough to yield half-baked configurations before you realize what you did wrong. The website itself provides only the subtlest of changes to distinguish the two. Edit: I learned all this prior to v3.

- Supporting multiple config formats (TOML and YAML), which makes it that much harder to hunt down relevant forum posts for support. That wouldn't be a huge problem if it weren't for the things you need that aren't in the documentation (above).

- Multiple configuration _modes_. You can set options on the CLI, or in config files, and they are not 100% reflected between the two; some things must be in config files no matter what. Config files themselves are split between "dynamic" and "static" configs, and you must place the correct options in the right file (see the sketch after this list).

- The one thing that Traefik does well is routing traffic to/from containers. Container labels are used to drive this. How to map those label namespaces to various internal and user provided aspects of Traefik is unclear in the docs, at best.

- Traefik exposes a dashboard web GUI by default. Yet much of the useful diagnostic and troubleshooting information is only found in the container logs.

Retiring v1 completely, picking a single configuration language/mode, and providing a rich library of example docker-compose configs would go a very long way toward securing this project's future.
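
To illustrate the static/dynamic split called out above, a minimal hedged sketch (file names and hostname are illustrative):

  # traefik.yml (static): entry points and providers, read once at startup
  entryPoints:
    web:
      address: ":80"
  providers:
    file:
      directory: /dynamic

  # /dynamic/routes.yml (dynamic): routers and services, hot-reloaded
  http:
    routers:
      app:
        rule: "Host(`app.example.com`)"
        service: app
    services:
      app:
        loadBalancer:
          servers:
            - url: "http://app:8080"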

renk
0 replies
3h5m

The documentation split is unfortunate, and the GUI really is just a status page. The other points are a strength. A pattern that works well: put all Traefik config in your Docker container definitions, as command line flags and labels, plus the dynamic config provided as a volume. That gives you all the flexibility and only one or two places to look for the config (e.g. a Docker compose file and the dynamic config file).
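
A hedged sketch of that pattern in a compose file (image tags and hostname are illustrative):

  services:
    traefik:
      image: traefik:v3.0
      command:
        - "--providers.docker=true"
        - "--providers.file.directory=/dynamic"
        - "--entrypoints.web.address=:80"
      ports:
        - "80:80"
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock:ro
        - ./dynamic:/dynamic
    whoami:
      image: traefik/whoami
      labels:
        - "traefik.http.routers.whoami.rule=Host(`whoami.example.com`)"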

mubu
1 replies
5h48m

I share the same sentiments. I dread having to go through the Ansible docs because they're so densely packed. Meanwhile Caddy's docs feel too sparse, with too many spread-out tutorials. The reference isn't well thought out either, imo.

mholt
0 replies
3h7m

We're revamping the Caddy docs this year.

mholt
1 replies
1d2h

Funny you say that, because we have hardly any examples in the Caddy docs. We're working on improving them later this year.

sph
0 replies
1d2h

Examples are good in docs. But documentation that's only made of examples and tutorials... not so much.

Thanks for Caddy btw. Neat little tool.

throwfaraway398
0 replies
1d2h

It's funny because one thing I like about ansible is how easy it is to get the reference doc for any module with `ansible-doc -t module`.

I do sometimes struggle to find the right doc when I'm searching for something about ansible core itself, but that doesn't happen too often.

scrubs
0 replies
16h15m

Oh man, are you on to something!!! One huge, bad side effect of the web is the atomization of an overall body of work into 62.9 million links.

One pdf please. The book concept works!

You know whose docs blow too? Mellanox. I hate their stuff.

And to give credit where due: intel does a damn good job.

molszanski
0 replies
5h35m

This is strange. I also don't like the docs, but for a different reason.

I would rather have more examples. And kinda _advanced_ and complex ones, rather than the trivial ones we see in the docs.

Even though I had working V1 configs and know-how about the lingo/architecture like routes/services, I still struggled for a day or two to properly configure pretty simple workflows in v2, like:

* add TLS with LetsEncrypt

* configure multiple domains

* configure multiple services

* add Basic Auth for some domains

That said, more detailed and extensive docs would be much better.

I also remember finding things in github issue comments that worked as bugfix/workaround of something from the docs.

PS. For now I've moved to Caddy for the simplicity and for the Caddy DSL, which is better than verbose yaml/label config.

lamontcg
0 replies
1d

Technical writers: documentation by example is good only for newbies skimming through. People familiar with your product need a reference and exhaustive lists, not explanations of different fields spread over 10 tutorial pages. Focus on those who use the product day in and day out, not solely on the "onboarding" procedure.

You really need at least three documentation targets:

- onboarding-the-newbies workflows/tutorials

- intermediate "focus on the important bits" workflows/tutorials

- exhaustive references

There might be other useful ones as well, but I never see those three hit at the same time adequately.

jq-r
0 replies
3h21m

Those are great points. Even the page layout of the documentation is terrible. Why do we have huge monitors and millions of pixels if I have to read content from a very narrow column that is a mile long?

Eg: https://doc.traefik.io/traefik/routing/services/

If you visit that page in your desktop browser you'll get fewer words per column than when viewing it on an iPad (works even in a dev tools window). Mind blowing.

igor_varga
0 replies
1d1h

I'm using Traefik and have the same experience with the documentation. It can be time consuming to configure it properly if you are not a power user.

I'm happy with it though; it's a great piece of software. I wonder, is there any other product out there with a similar feature set?

hinkley
0 replies
1d

Some projects need documentation, some need cookbooks. Sounds like traefik is the latter.

Hopefully as an aside (I know very little about traefik so maybe I am talking about them too and don’t know it), it seems like in the time since I abandoned Java they have weaponized that architectural strategy and I have no patience for it. I look at that sort of documentation and my eyes glaze over. Or if they don’t I feel disgust or anger and all three result in my stomping off.

Opentelemetry, particularly the stats code (vs the span code) triggered a lot of this in me. It has several sets of documentation that say different things. It took me a long time to figure out how to connect the amorphous dots, and then I didn’t entirely agree with their solution anyway.

crabbone
0 replies
4h48m

Just another one for the collection: conda. Especially the parts about conda-build, meta.yaml, etc. There are only examples, without any way to tell what's available. And the source code is frustratingly twisted, undocumented and all over the place, which makes creating conda packages an extremely frustrating experience, to the point that it's significantly easier to create the archives and write the metadata by hand than to rely on conda tooling.

cdelsolar
0 replies
22h45m

If only there were a program that had crawled bazillions of documents, including all of the traefik documentation, examples, and thousands of code files using it, and if only said program were especially designed to answer natural-language queries about said documents.

bshacklett
0 replies
23h21m

This was exactly my experience. It’s incredibly frustrating to search documentation only to be stuck with examples that are related, but don’t fit one’s exact situation, and don’t explain the underlying behavior.

arendtio
0 replies
9h28m

I agree that the documentation could be better, but it isn't that bad. I enjoyed all the gophers, and these images really helped me understand the structure.

However, I find it amusing that you wish there was a better reference. I think getting to the initial setup is quite hard. Once you have that, extending it is straightforward.

SoftTalker
0 replies
3h47m

I don't think ansible docs are that bad.

I use duckduckgo and adding !ansible to my search usually gets me what I need pretty directly.

Propelloni
0 replies
3h33m

From this point of view the Oracle RDBMS handbooks ca. 1998 were pretty good. Come to think of it, they were pretty good all around.

Fire-Dragon-DoL
0 replies
18h7m

I want both, on the same page if possible, for every possible permutation of input arguments. In theory ansible does this, but then it doesn't link to "you might use it in combination with..."; essentially, it lacks integration of multiple things in the reference docs. But I didn't find the ansible docs that bad? Most of the time I search the module name and find the reference doc.

engine_y
18 replies
1d2h

We've been using Traefik in prod for 2 years. While I used NGINX in the past, I decided to migrate to Traefik mainly because of the automatic let's encrypt integration. I am sorry for that decision. Traefik's documentation does not make sense to me or my team. It is finicky and misbehaves without proper logging. As an example, when I want to recreate the certificates, it fails sporadically, leaving prod down for an indefinite amount of time.

We're moving back to NGINX.

Gormo
4 replies
3h2m

You might be happy to know that integration between Let's Encrypt and Nginx is something that's been provided by Certbot for years. The Nginx plugin for Certbot will identify active domains from your Nginx config, create and renew certificates, automate domain validation through the web server in real time, and automatically update your config files with both certificate paths and HTTP redirects to HTTPS (if desired).

dlbucci
2 replies
2h31m

Which is what I used for years, but recently discovered that Certbot now requires snapd to be installed. I did that and snapd bricked my server: it wouldn't start until I uninstalled it. That's when I switched to Caddy.

Gormo
1 replies
2h12m

That's very definitely not true. Perhaps they're defaulting to Snap for convenience, but Certbot is a cross-platform Python program, and can just be installed via pip: https://certbot.eff.org/instructions?ws=nginx&os=pip

Non-Ubuntu distros also often have standard packages in their repos with no reference to Snap, and EFF also distributes a Docker container with Certbot pre-configured, if Docker is your thing.

dlbucci
0 replies
59m

I wasn't aware of that. It was true for my version of Ubuntu (18), according to the website: https://certbot.eff.org/instructions?ws=nginx&os=ubuntubioni...

Perhaps I had other options the website didn't make me aware of, but it seemed like enough of a hassle that I just dropped it.

engine_y
0 replies
2h14m

Tnx. That's helpful to know.

spyspy
3 replies
1d1h

I've always just used go's built-in reverse proxy if I need an API gateway. You can adapt it to meet any specific need, easily find libraries to do common tasks (CORS, rate limiting, retries, etc), and the best part: no configuration language. You just write go.

jimmyl02
1 replies
1d

curious what are the performance characteristics here? I would assume something like Nginx that has been optimized over a longer period of time / a more specific use case would have non-negligible performance benefits at scale?

spyspy
0 replies
18h23m

Not everything needs to be at "scale". I've deployed this pattern at over 10k req/sec, but it's all about your SLOs. I've (thankfully) never needed to lose sleep over a millisecond or 2 in my line of work.

hellcow
0 replies
3h31m

I did the same thing. After some bad downtime from Traefik introducing breaking changes in a point release, I decided to write my own.

My reverse proxy offered a service mesh, live config reloads, managed TLS certs, and automatically rerouted traffic around down services. The whole thing was a few hundred LOC anyone could understand in its entirety. It ran in production for years unchanged and never caused an outage.

dirkt
1 replies
10h25m

I decided to migrate to Traefik mainly because of the automatic let's encrypt integration.

You probably already know, and maybe it didn't work for you, but there are quite a few Docker companion containers that automate let's encrypt certs for an nginx Docker container.

ap-andersson
1 replies
11h19m

I have moved to Traefik from NGINX as well because of the built-in support for the DNS challenge and wildcard certs. I myself spent many hours trying to get it working for the domain I use at work. I used the same config I use at home (which works perfectly) but could never get it to actually do anything, even though the setup was identical. Same domain registrar with the same API, based on the same docker configs, etc. Had all logs enabled and still I got no information whatsoever about why my certificate could not be created. It simply defaulted back to its generated cert without trying, it seemed. After two troubleshooting sessions and several hours of searching and troubleshooting I had to admit defeat and just use my own self-signed cert files. Very frustrating when you get no information about why it doesn't work. Just a silent failure and fallback.

Overall that has been my biggest problem with traefik. It's awesome when it works, but when it does not I always seem to have problems troubleshooting and/or finding the information I need in the docs.

At work we will start using Traefik in prod towards the end of the year. I hope Traefik and I will become better friends before that :)

Gormo
0 replies
2h18m

I have moved to Traefik from NGINX as well because of the built-in support for the DNS challenge and wildcard certs. I myself spent many hours trying to get it working for the domain I use at work.

Certbot has plugins that directly support many DNS registrars, and can automate configuration of Nginx. Using, for example, the CloudFlare plugin for DNS validation combined with the Nginx plugin for local config would solve your problem readily.

timcambrant
0 replies
5h51m

I use NGINX and Traefik in prod at work, and for my personal stuff I only use NGINX. It's all just orchestrated containers, no ingress controllers or similar magic anywhere.

I agree with your comments about Traefik being finicky, and would like to add that my very basic in-house solution for automatic Let's Encrypt integration (which also works with other ACME-compatible CAs) is ~30 lines of bash, run by cron every day. It's rock solid simply by failing hard when standard return codes indicate failure. Monitoring for failed certificate renewals is as easy as handshaking with the endpoint and parsing the NotAfter field in the OpenSSL output. I run this as part of my regular HTTP endpoint monitoring solution and it tells me if any certificate will expire within 14 days.

The absolute worst failures I've experienced were new domains starting with a self-signed certificate until I reloaded nginx manually, and having 2 weeks to jump in and sort out some error because a certificate renewal failed.

So at least in my experience it turns out that LE integration isn't a strong selling point. Logging and ease of configuration are. NGINX is not perfect in those aspects either, but it is a bit more robust and well-documented at least.

sharperguy
0 replies
7h59m

This is one area where I've found nixos to be really helpful. I can set this up by just adding some lines to the configuration.nix (which uses lego(1) and letsencrypt in the backend):

  security.acme = {
    acceptTerms = true;
    defaults.email = "admin-email@provider.net";
    certs."mydomain.example.com" = {
      domain = "*.mydomain.example.com";
      dnsProvider = "cloudflare";
      environmentFile = "/path/to/cloudflare/password";
    };
  };
  
  services.caddy.enable = true;
  
  services.caddy.virtualHosts."subdomain1.mydomain.example.com" = {
    extraConfig = ''
      reverse_proxy 127.0.0.1:1234
    '';
    useACMEHost = "mydomain.example.com";
  };

Configuring with nginx is also fairly similar I think.

1. https://github.com/go-acme/lego

boesboes
0 replies
7h21m

I've considered Traefik for that too. We had 1800+ domains, so automatic TLS would be useful. But the OSS version didn't have good options for certificate storage imo.

Ended up using nginx and adding a .well-known/certbot endpoint or so, which used lua to call certbot. Some bash, rsync & nfs for config management; never had an issue with it. Not fully automated, but close enough. And very debuggable!

belthesar
0 replies
2h15m

I had much of the same issues early on in my Traefik experience. Things like using TLS-01 validation but not having DNS records set before config was applied would cause a lot of frustration. Like you, I was frustrated with the amount of logging I was getting. I eventually learned that not having DNS configured appropriately would lead validation attempts to fail after N unsuccessful attempts, and LE would refuse to do another TLS-01 validation for a while, which sounds like the kind of issue you were having.

After moving to DNS-01 validation, which comes with the added benefit of letting me cut certs for services that aren't publicly exposed with way less orchestration than TLS-01-style validation requires, my experience was suddenly much better. Assuming the DNS provider is working (and if it's not, you're hopefully getting an API error from them before LE attempts to validate the record), the failure state happens well before any check-failure backoffs happen at LE. At this point, regardless of whether I'm using Traefik, Caddy, Nginx, or any other reverse proxy, I'm pretty committed to only using DNS-01-based validation from LetsEncrypt from now on, or, if I have to do TLS-01-based validation, to making darn sure things are right the first time with the Staging API first.

Which, speaking of: if you cut a Staging cert with LE via Traefik, there's no good way to invalidate the staging cert. You have to munge the ACME JSON to remove the cert and restart Traefik (could maybe do a SIGHUP? didn't try) to get it to pick up the changes.

All said, lots of weird silent failures and behaviors, but the biggest pain is that errors from dependent services are made opaque.

2NGINXSUX
0 replies
14h35m

We've been using Nginx in prod for 3 years. While I used Traefik in the past, I decided to migrate to Nginx mainly because of its scriptability (Traefik plugins suck). I am sorry for that decision. Nginx's documentation is absolute trash full of non-explanations (far worse than Traefik's or Caddy's). It's finicky and misbehaves constantly. Lurking around every corner is a decision from 1995 sticking around in 2024; Nginx can barely function on the modern internet without _significant_ tuning.

On top of it, the OpenResty community must be the rudest, most entitled people in the entire internet. Have a question, “YOURE DOING IT WRONG IDIOT” is the response. Of course every terrible decision they’ve made they justify with “BUT THE PERFORMANCE” as that’s the only thing worth considering.

We’re moving back to Traefik, or Caddy, both still in POC.

iansinnott
14 replies
1d5h

Traefik is more comparable to HAProxy than to nginx/caddy/apache2

Aren't caddy and traefik fairly comparable? I've only used them both lightly so I may be missing the core point of each, but I thought of them as very similar.

mkesper
6 replies
1d4h

Caddy is at the same level as nginx/apache. It is able to do everything a web server is expected to (serving web sites, files and proxying services) plus handling LetsEncrypt automatically. It does not, afaik, do dynamic service discovery like traefik nor load balancing of TCP at the protocol layer, like e.g. haproxy. https://caddyserver.com/features

lmeyerov
2 replies
1d2h

We are long-time fans of Caddy, preferring it over traefik + nginx, especially for our docker-compose flows... though it's fair to distinguish 'can' from 'easy to do'.

E.g., we can imagine writing or using a plugin to figure out some upcoming fancy sticky-session routing logic based on routes/content vs just the user IP, but there are easier and more 'with the grain' solutions than what Caddy exposes today, afaict.

(Agreed tho: The reverse proxy module, for more typical cases, is awesome and we have been enjoying for years!)

francislavoie
1 replies
18h46m

Sticky sessions are supported: https://caddyserver.com/docs/caddyfile/directives/reverse_pr..., and yes it's pluggable so you could write your own LB policy. Very easy, just copy the code from Caddy's source to write your own plugin. Let us know if you need help.

Also yes, Caddy does service discovery if you use https://github.com/lucaslorentz/caddy-docker-proxy, configuration via Docker labels. Or you can use dynamic upstreams (built-in) https://caddyserver.com/docs/caddyfile/directives/reverse_pr... to use A/AAAA or SRV DNS records to load your list of upstreams.

lmeyerov
0 replies
10h13m

`query [key] ` A+++, I missed when this got added, amazing!

(this basically solves the sticky shared chatroom websocket problem for us when routing to resources with gravity, in our case, multiuser data science notebooks!)

jiehong
0 replies
8h51m

caddy-l4 still shows:

     This app is very capable and flexible, but is still in development. Please expect breaking changes.
And it does not really seem to be seeing many new commits recently, so it feels pretty much "beta" at this point.

thinkmassive
3 replies
1d5h

Caddy is primarily a web server like nginx and apache httpd. Traefik and HAproxy are primarily reverse proxies.

mholt
2 replies
1d4h

Caddy is actually used as a reverse proxy more than a static file server. It's equally excellent and proficient as both! Caddy's functionality is comparable to nginx, apache httpd, and haproxy.

indigodaddy
1 replies
1d2h

And while we're at it, it can even forward proxy as of fairly recently, I believe?

justusthane
1 replies
1d5h

The rest of the sentence you quoted explains that nginx, Caddy, and Apache are all webservers (which can also reverse proxy). Traefik and HAproxy are only reverse proxies and not webservers.

IggleSniggle
0 replies
1d3h

HAProxy can be a web server though, albeit one not designed to operate this way, and it thus requires some goofy configuration to make that happen. I only know this because it was useful for me while working on an HAProxy extension.

candiddevmike
0 replies
1d4h

Traefik can't serve static files, or interact with CGI providers like PHP.

simonw
2 replies
1d4h

If what you've got already works then no, I don't think you would see any benefit from switching.

The moment you need a feature which Traefik provides that isn't in Nginx is when I would consider the switch.

treyd
1 replies
1d

But what features does Traefik have that nginx doesn't?

simonw
0 replies
14h21m

I believe the biggest are automatic Let's Encrypt certificates and the ability to discover services and route to them based on things like Kubernetes labels.
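
For reference, a hedged sketch of the Let's Encrypt side in Traefik's static config (resolver name, email and paths are illustrative):

  certificatesResolvers:
    le:
      acme:
        email: you@example.com
        storage: /letsencrypt/acme.json
        httpChallenge:
          entryPoint: web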

aedocw
2 replies
1d3h

I think https://github.com/caddyserver is the best option here. Automatic handling of SSL certs, it's incredibly lightweight, and has super clear config syntax.

withinboredom
0 replies
12h15m

If only the caddy ingress were done. I’ve been waiting years for it.

hoistbypetard
0 replies
23h36m

That’s exactly the situation I like Caddy in also.

johnchristopher
1 replies
1d4h

I like traefik's hot reload (among other things). Want to hide a service (the proxied app), add a new route (a router in traefik terminology), or add a middleware (basic auth, https redirection, headers manipulation)? Just drop the file and it gets automatically picked up; no need to reload traefik or that vhost.

Truth is: I don't like nginx syntax and traefik is/was shiny :]. I went in for the LE renewal and containers, I stayed for the configuration style.
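
A hedged sketch of such a drop-in file, with made-up names and a placeholder hash; the file provider picks it up without a restart:

  # /dynamic/blog.yml
  http:
    middlewares:
      blog-auth:
        basicAuth:
          users:
            - "admin:$apr1$placeholder"  # htpasswd-style entry, placeholder hash
    routers:
      blog:
        rule: "Host(`blog.example.com`)"
        middlewares:
          - blog-auth
        service: blog
    services:
      blog:
        loadBalancer:
          servers:
            - url: "http://127.0.0.1:8080"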

drdaeman
0 replies
1d

It's not that nice in practice. Until 3.0 (which was released just a few days ago), Traefik wasn't able to reload TLS certificates under some circumstances: https://github.com/traefik/traefik/pull/9993

Built-in ACME support doesn’t work for me, so I still have some `systemctl restart traefik` hacks here and there.

treyd
0 replies
1d

Yeah I agree with this. Nginx config is easy and you can just set it and forget it. Most of the time you're copypasting from other configs you already have anyways. Automatic LE is kinda a strange selling point when Certbot is available everywhere and supports more scenarios. Traefik's and Caddy's selling points just don't make any sense to me because they don't make anything easier than the alternatives that are already widely supported.

navels
0 replies
1d

I also use NginxProxyManager (8 hosts) and I'm not seeing any replies to your post that would explain why caddyserver or traefik provide any benefit over NPM.

blinded
0 replies
20h36m

Metrics with non-enterprise nginx are very limited.

aaomidi
0 replies
1d4h

Traefik does certificate management for you

psYchotic
10 replies
1d5h

I'm considering moving reverse proxying to Traefik for my self-hosted stuff. Unlike the article's author, I'm running containerized workloads with Docker Compose, and currently using Caddy with the excellent caddy-docker-proxy plugin. What that gets me, currently:

- Reverse proxying, with Docker labels for configuration (sketched just below this list). New workloads are picked up automatically (but I do need to attach workloads to Caddy's network bridge).

- TLS certificates

- Automatic DNS configuration (using yet another plugin, caddy-dynamicdns), so I don't have to worry too much about losing access to my stuff if my ISP decides to hand me a different IP address (which hasn't happened yet)
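
The label-driven configuration from the first bullet looks roughly like this, as a hedged sketch (hostname and network name are illustrative):

  services:
    whoami:
      image: traefik/whoami
      labels:
        caddy: whoami.example.com
        caddy.reverse_proxy: "{{upstreams 80}}"
      networks:
        - caddy  # the Caddy network bridge mentioned above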

There are a few things I'm currently not entirely happy about my setup:

- Any new/restarting workload makes Caddy restart entirely, resulting in loss of access to my stuff (temporarily). Caddy doesn't hand off existing connections to a new instance, unfortunately.

- Using wildcard certs isn't as simple as it could/should be. As I don't want every workload to be advertised to the world through certificate transparency logs, I use wildcard certs, and that means I currently can't use the simple Caddyfile syntax I otherwise would with a cert per hostname. This is something I know is being worked on in Caddy, but still.

Anyway, I've used Traefik in k8s environments before, and it's been fairly pleasant, so I think I'll give it a go for my personal stuff too!

PS: Don't let this comment discourage you trying Caddy, it's actually really good!

remram
4 replies
1d4h

Those are giant limitations. This is the first I've heard of any reverse proxy that has to restart and drop connections to update its configuration. That is usually the first, most fundamental part of any such server's design.

mholt
1 replies
1d4h

That is absolutely not the case. Caddy config reloads are graceful and lightweight. I have no idea why this person is stopping their server instead of reloading the config.

remram
0 replies
1d

That makes more sense. Maybe something with the Docker plugin? That or GP messed up.

IggleSniggle
1 replies
1d4h

Caddy doesn't have to restart, I think it's related to the specifics of their setup. The simple/easy path that gets a lot of people into caddy has a workflow that's more like, run caddy, job done. The next level is, give caddy super simple configuration file, reload caddy with "caddy reload --config /etc/caddy/Caddyfile". After that, you use the REST API to make changes to the server while it is running, which uses a JSON configuration definition instead of a Caddyfile, so it ends up being a jump for users.

m_sahaf
0 replies
1d3h

After that, you use the REST API to make changes to the server while it is running, which uses a JSON configuration definition instead of a Caddyfile, so it ends up being a jump for users.

You can, in fact, use any configuration format with the API as long as Caddy has its adapter compiled-in; you just have to use the correct value in the `Content-Type` header. For instance, you can use Caddyfile format using the `text/caddyfile` value in `Content-Type`. This is documented[0].

[0] https://caddyserver.com/docs/api#post-load

Cyykratahk
1 replies
1d4h

I've used caddy-docker-proxy in production and it doesn't cause Caddy to drop connections when loading a new config.

I just tested it locally to check and it works fine.

psYchotic
0 replies
1d3h

Hmm, I'll have to take a better look at my setup then, because it's a daily occurrence for me. Either I'm "holding it wrong" (which is admittedly possible, perhaps even likely given the comments here), or I have a ticket to open soon-ish.

sureglymop
0 replies
1d1h

I use (rootless) docker compose + traefik, precisely because wildcard certs were really painless with it. I use my own DNS server and RFC2136 DDNS for the LE DNS challenge; no plugins needed, really. I have basically one ansible playbook to set all this up on a vm, including templating out the compose files, and another playbook that can remove everything from the server again (besides data/mounts). For backups I use restic with a custom script that can back up files, different dbs, etc. to multiple locations.

In the past I deployed k3s but I realized that was too much and too complicated for my self hosted stuff. I just want to deploy things quickly and not have to handle the certs myself.
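
The relevant bit of the traefik static config is tiny (a sketch; email and paths are hypothetical, with the RFC2136_* env vars set on the container):

    # traefik.yml
    certificatesResolvers:
      letsencrypt:
        acme:
          email: admin@example.com        # hypothetical
          storage: /letsencrypt/acme.json
          dnsChallenge:
            provider: rfc2136             # reads RFC2136_NAMESERVER, RFC2136_TSIG_KEY, etc.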

mynegation
0 replies
1d4h

I have not used Caddy, I use traefik and it discovers docker properties for configuration and TLS certificates with auto update. Not sure about dynamic DNS - I do not use it from Traefik. Adding and removing containers does not need a restart AFAIR.

eropple
0 replies
1d3h

I use Caddy for single-purpose hosts and the like, but I 100% would throw Traefik at the problems you're describing--and I do: it's my k8s cluster ingress, and it runs in my dev environments to enable using `localtest.me` with hostnames.

It's worth kicking the tires on. Both are great at different things.

wg0
9 replies
1d4h

Side question - what do people use to hide (and make accessible) internal services such as grafana, prometheus, rabbitmq (the web interface) and such?

Should they be public behind such a proxy? (seems odd) Or should they be totally internal and then setup a Wireguard VPN to reach them?

withinboredom
0 replies
12h2m

They are open to the internet but each ingress is using the “external auth” feature of nginx ingress, pointing to our internal login. There’s no vpn or magic ip addresses. Once you’re logged in, you can access whatever you need.

waldrews
0 replies
1d4h

Serve them on a firewalled port, then: 1) VPN if you need to expose them to multiple trusted users, 2) firewall rules to make them accessible to your IP range, or (probably easiest), 3) access them by ssh tunnel.
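
Option 3 is a one-liner (host and port hypothetical):

    # forward local port 3000 to a Grafana listening on the server
    ssh -N -L 3000:localhost:3000 user@server.example.com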

pyr0hu
0 replies
1d4h

We use tailscale for this exact use case and it has been working flawlessly so far. You can even set up ACLs as a firewall.

nullify88
0 replies
1d

For the purposes of some of my self hosted stuff, I wanted to see how far I could go without VPN and instead use mutual tls authentication with my stuff exposed to the internet. Client certs are issued by cert manager in my k8s cluster and traefik does my TLS Auth.

mrj
0 replies
1d1h

Cloudflare tunnels are super convenient and provide lots of auth mechanisms. If you set up a tunnel using cloudflared and proxy the IP through cloudflare, there's nothing even exposed directly to the internet. You can even have different auth requirements for urls (like /admin) or punch holes for stuff like webhooks.

I have set up quite a few as kubernetes pods that direct to private hostnames in different namespaces and pretty happy with it for internal apps.

blinded
0 replies
20h35m

zero trust, host firewalls, mtls, ssh tunnels, bastion hosts.

John23832
0 replies
1d4h

From the internet? Drop them at the ingress level (if using kubernetes). You could also do some ip filtering. Then use an internal proxy (or internal ip of some kind) to reach them.

For proof of concepts, I use cloudflare tunnels which allows you to add ACLs to particular routes.

Hrun0
0 replies
1d4h

what do people use to hide (and make accessible) internal services such as grafana, prometheus, rabbitmq (the web interface) and such?

Proxies or VPNs like you mentioned. You usually don't expose things if you don't have to.

silverquiet
9 replies
1d4h

I use Traefik in production (with containers), and my favorite aspect of it is that the configuration is carried via the labels on containers which means I rarely if ever need to make any modifications to the Traefik config itself. I'd say the biggest con is trying to figure out how to pronounce the name - I think it's just regular traffic, but I can't help wanting to call it "trey-feek" or something like that.
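
For flavor, a typical service ends up with just a couple of labels (a minimal sketch; router name, hostname, and resolver are hypothetical):

    services:
      whoami:
        image: traefik/whoami
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.whoami.rule=Host(`whoami.example.com`)"
          - "traefik.http.routers.whoami.tls.certresolver=letsencrypt"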

Projectiboga
2 replies
1d4h

ae is closest to y, or hi. So Tryfik, is my guess, otherwise is Trayfik. If it's European fik, might be feek. *Just taking a guess here.

tazjin
0 replies
1d4h

I think it's "träfik", i.e. "traffic" with a German accent.

psYchotic
0 replies
1d3h

ae is closest to y, or hi. So Tryfik, is my guess, otherwise is Trayfik. If it's European fik, might be feek. *Just taking a guess here.

I wondered how to pronounce Traefik myself, so I started googling, and came across this: https://traefik.io/blog/how-to-pronounce-traefik-d06696a3f02...

Tldr: just pronounce it as you would "traffic".

fidotron
1 replies
1d3h

Heavy +1 on the labels thing. Reduces the scope of things to keep track of massively, even if writing them the first time is slightly harder because of the escaping and verbosity.

I think a combination of traefik and docker compose is in the sweet spot for small-scale self-hosters that haven't reached the point where k8s will pay off, i.e. if you have fewer servers than a k8s HA control plane would use.

silverquiet
0 replies
1d3h

Small-scale self hoster would certainly describe my situation (though we do have some of the same infrastructure issues as larger companies). We actually use Swarm which I generally like, but if it was my call we might have looked more at a simplified Kubernetes platform like K3s just because of a safety in numbers aspect.

sureglymop
0 replies
1d1h

I would say the biggest con is that, if a container doesn't exist or isn't running, traefik is not aware of it or its labels. Otherwise you could more easily do cool stuff like maintenance pages, or bringing up containers on the first request after inactivity. So I have been thinking about creating a plugin that is aware of where I store my compose files and can look at them instead.

jq-r
0 replies
3h0m

We're also using it in production. And people might laugh, but naming something in a way that looks fine/cool/quirky on paper but is terrible in practice is a big con. The amount of frowns and laughs we got from colleagues is staggering, and a hindrance to implementing and using the product.

We had been calling it "tray-feek", which was half ok, but then we actually had a call with the official support and they told us it's pronounced the same as regular "traffic". So any discussion about that proxy goes: "so we're receiving traffic on our public load balancer, which uses traffic's native load balancing to send traffic to traffic's pods...". It sounds stupid because it is stupid.

arcanemachiner
0 replies
1d1h

I just pronounce it "traffic". I'm not playing their damn head games.

arush15june
6 replies
1d2h

I use caddy rather than traefik. It's much easier to manage the Caddyfile compared to the traefik YAML config IMO, and we just keep three separate Caddyfiles for local, production, and on-prem deployments. There are a plethora of great plugins; we use the coraza WAF plugin for caddy and it works well.

sureglymop
1 replies
1d1h

Looks interesting but I don't see the benefits really. Still looks like a lot of labels exactly like with traefik. Why should one switch?

BrandoElFollito
0 replies
22h9m

Having used traefik, caddy and now caddy proxy, I like the latter because labels are simple pointers to actual caddy features (reasonably documented).

I used to have all my docker compose files in elaborate structures but moved to portainer for simplicity. Together with caddy proxy it rocks (well, there are several things missing but I have hope)

preya2k
0 replies
21h19m

Same here. I enjoyed Traefik for being able to use docker tags for my reverse proxy configuration. The mechanism is great, however I did not like Traefiks internal config structure. Caddy is much easier for me to understand and matches my (small scale) use cases much better. Using Caddy via Docker labels through caddy-docker-proxy is about as perfect as it gets (for me).

renk
0 replies
2h41m

Yes. If you don't need all of the service discovery and auto-scaling shenanigans (or are willing to script it yourself), you can gleefully skip Traefik, Docker Swarm, Kubernetes etc. and just use Caddy! It can really do most things and it does them well.

overstay8930
0 replies
2h14m

I love Caddy, but I wish the docs were better on production deployments; there are too many unanswered questions about best practices, especially re: storage and config management. Like, how is local storage supposed to be handled when you're using external storage? Allegedly it can be treated as stateless, but maybe not?

You basically just have to pray the guy who made the module you need knows what he is doing, because there are no standards for documentation there either. Maintainers really need to put their foot down on random-ass modules with 0 documentation providing critical functionality (i.e. the S3 storage backend).

rglullis
5 replies
1d1h

For authentication, I had good luck with authentik as forward proxy.

The one thing that bothers me with traefik is that their implementation of ACME does not work if you have some sort of DNS load balancing. I had one setup with three servers responding to the same domain. It seems the first request (to start the ACME dance) would go to one server, and if the second one (for the .well-known address) is sent to a different one, it will just return a 404 and fail the whole thing. Now I either have to delegate the certificate management to the service itself or add Caddy as a secondary proxy just to get certificates from it.*

* Of course, someone smarter than me will point me to a better solution and I will be forever grateful.

jackweirdy
4 replies
1d1h

If I am not misunderstanding (sorry if I am) it sounds like you use the http challenge where your cert provider tries to GET your challenge file — if so, could the DNS challenge be better suited? There, you put the challenge in a TXT record value

rglullis
3 replies
1d

You got it, but your solution won't work because of one detail: I can not use the DNS challenge because I am running a managed service provider, and my customers are the ones who own the domain. All I can do is ask them "please add a CNAME to my gateway", and I need to figure out everything else on my side.

jspdown
0 replies
23h5m

It might not be suitable for your use case, but have you tried delegating the ACME DNS challenge to a different domain hosted by yourself?

arccy
0 replies
8h34m

ACME supports Delegated Domains for DNS01:

    _acme-challenge.customer.com IN CNAME _acme-challenge.your-automated-domain.org.

cagenut
4 replies
1d4h

In a mirror/reverse of the OP's premise - I always wondered why so many of these open source http reverse proxies sprang up in the container era. What did they offer that varnish, or a vmod to varnish, wasn't already doing or capable of? Somehow varnish almost completely missed the container era, despite seemingly being the exact type of tool a bunch of teams would go on to create.

Starlevel004
2 replies
1d4h

Devops guys are mostly incapable of using any service that isn't a) written in Go and b) configured using a YAML-based DSL.

demi56
0 replies
7h0m

and b) configured using a YAML-based DSL.

Go devops HATE YAML-based DSLs; we just put it there because there are no alternatives. JSON? Don't wanna go there. Fortunately there's CUE lang, but moving all these projects to accept cue isn't that easy either.

Devops guys are mostly incapable of using any service that isn't a) written in Go

Lol we basically rewrite it in Go if we’re using it frequently. Most Go projects are just things the founder really wanted for himself

TNorthover
0 replies
1d3h

Traefik's YAML does a particularly bad job at keeping syntax (such as it is) separate from user-defined labels, I feel.

Very difficult to just look at a file and see which bits are labels for the sake of it, and which bits are direct instructions to builtin features.

lmeyerov
0 replies
1d2h

For Caddy, LetsEncrypt: Free TLS in one line without talking to anyone
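
The one line being the site address; a full Caddyfile for a TLS-terminated reverse proxy can be as small as this (hostname and port hypothetical):

    example.com {
        reverse_proxy localhost:8080
    }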

For Traefik, afaict, something about k8s

dizhn
3 replies
1d3h

I use caddy wherever I can. That it can already handle automatic certificates is a big plus. Plus it's very easy to configure.

amne
1 replies
11h43m

I tried to get caddy to listen to both ports 80 and 443 in a cluster. I failed miserably. The documentation simply dismisses this as a possible scenario.

mholt
0 replies
3h3m

How do you mean? Many of our users do this with no issues.

jspdown
0 replies
23h11m

If you like Caddy for its ACME capabilities, then you might enjoy Traefik as well. It supports HTTP, TLS-ALPN and DNS challenges and can be configured in one line as well.
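
"One line" here meaning a resolver definition, e.g. via CLI flags (a sketch; resolver name, email, and storage path are hypothetical):

    --certificatesresolvers.le.acme.tlschallenge=true
    --certificatesresolvers.le.acme.email=you@example.com
    --certificatesresolvers.le.acme.storage=/acme.json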

znpy
2 replies
1d3h

you mount the docker socket into the traefik container and gain the ability to auto-detect other containers that you might want to expose using traefik.

Totally not a security issue. Source: trust me bro.
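
If you do go this route, a common mitigation is a filtering proxy in front of the socket so traefik only gets read-only container listings -- a sketch assuming the tecnativa/docker-socket-proxy image:

    services:
      socket-proxy:
        image: tecnativa/docker-socket-proxy
        environment:
          CONTAINERS: 1        # expose read-only container endpoints only
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
      traefik:
        image: traefik:v3.0
        command:
          - --providers.docker.endpoint=tcp://socket-proxy:2375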

1oooqooq
2 replies
10h17m

“Server Name Indication” (SNI)

into the trash it goes. anyone who supports https everywhere and even slightly tolerates SNI is a fool.

d-z-m
0 replies
7h25m

can you elaborate?

ajnin
0 replies
5h16m

I don't see why you're opposing HTTPS everywhere and SNI; HTTP already had the Host header, so it is not a new information leak.

It's pretty much mandatory if you intend to serve multiple domains with different certificates from the same host/proxy, which seems like a very, very common use case, and there is no alternative to it right now.

siva7
1 replies
1d3h

It's nice if you're running a bare metal server on Hetzner or DO, but in the age of cloud platforms like AWS or Azure there is hardly a need for traefik.

PennRobotics
0 replies
11h33m

Even on Hetzner, it's not amazing and not a one-click workflow.

Load their Photoprism image on a standard server with only IPv6 (as a v4 address costs extra) and certificates will not get generated; the logs point to Traefik, although the solution is modifying Dockerfiles. Thanks, Dockerphiles, for insisting your software is the answer to everything server...

ofrzeta
1 replies
1d

Is it any better than HAProxy? HAProxy has served me well for at least a decade and has also been modernized for the cloud age with the runtime API that allows dynamic configuration.

ljhtlajdfqasd
0 replies
23h30m

All of these proxies seem to have achieved feature parity within the last couple of years.

Where they seem to differ is licensing, enterprise model, source language, and data plane model (sidecar vs no sidecar).

notoall
1 replies
13h49m

For simple deployments, consider whether you need a reverse proxy at all.

I have IPv6 everywhere, with each service getting its own IPv6 address. Each service is managed inetd-style (via systemd-socket-proxyd), and so essentially listens directly.

For services that need to serve IPv4, I have a reverse proxy on my network edge that demuxes on TLS SNI to the corresponding IPv6 address.

The advantage here is never having to deal with complex applications, with their complex and changing configuration.
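
The per-service units are tiny (a sketch; unit names, addresses, and ports are hypothetical):

    # grafana-proxy.socket
    [Socket]
    ListenStream=[2001:db8::10]:3000
    [Install]
    WantedBy=sockets.target

    # grafana-proxy.service -- socket-activated, forwards to the local listener
    [Unit]
    Requires=grafana-proxy.socket
    After=grafana-proxy.socket
    [Service]
    ExecStart=/usr/lib/systemd/systemd-socket-proxyd localhost:3000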

notpushkin
0 replies
13h38m

I'm using a reverse proxy just to terminate TLS. Pretty sure it is possible to do that at the service level, but I don't think it's worth the trouble.

muhehe
1 replies
1d2h

In the future our company will migrate to k8s. It looks like it will be OpenShift, specifically. Do we need this on OpenShift or is there some "native" mechanism baked in?

methou
1 replies
1d3h

The only problem I'm having with it is that it doesn't support unix domain sockets[0]. In a "cloud native" environment you rarely need them, but if you are running on a single node this can be sweet.

-- [0]: https://github.com/traefik/traefik/issues/4881

meonkeys
0 replies
1d2h

Could you say more about how a non-network socket would be beneficial? I'm guessing simpler code and lower resource usage, but I'm curious what you're interested in. And by "single node", do you mean one server / one user (even if the user is, say, a single API consumer or whatever), or something else?

evtothedev
1 replies
4h6m

The 37Signals/Basecamp team has been working on a small, opinionated replacement for Traefik called Thruster: https://github.com/basecamp/thruster

Would be worth checking out, if you're currently considering options.

woopwoop24
0 replies
9h28m

I had such a hard time learning traefik and transitioning to v2. I do not fall into the standard case: I want to use traefik for containers not running on the same host (you cannot have labels annotated as the docs suggest if the container is on another host). Docs were sparse, and I also didn't want to use the env vars for the traefik config, so it took a bit of fumbling and reading. Eventually I figured it out, but I was almost on the verge of going back to haproxy.

vedmed
0 replies
23h17m

I needed a reverse proxy the other week. OPNSense is my firewall. I tried traefik, but it was too complicated. So I installed caddy, and it was easy as pie. My .02

teekert
0 replies
1d

I have used traefik a lot. But I mostly got frustrated with all the docker-compose labels and layers and so many lines just to have a rev proxy. Then I found Caddy. Never looked back.

I guess I was never the audience for Traefik. I just need an https enabled rev proxy. Or a basic-auth layer. In Caddy both are just 1 line, very concise, no layers (which I still don’t understand…)
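
For example (hostname and port hypothetical; note that newer Caddy spells the directive basic_auth, older versions basicauth):

    app.example.com {
        basic_auth {
            # password hash produced by `caddy hash-password`
            alice $2a$14$...
        }
        reverse_proxy localhost:3000
    }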

riedel
0 replies
1d4h

Funnily enough, I spent my weekend making a traefik config to serve gitlab pages from a self-hosted instance without pages enabled, using the artifact API. No code involved. I had to configure quite some rewriting logic and use three different plugins, which are mostly unmaintained. In the end something like nginx, Apache, caddy, or a bit of code probably would have worked better, because of all the layering of different middleware. But it worked, somehow. I guess traefik still shines for easy SSL termination of docker and great observability; that is why I at least have been using it for the past years.

nderjung
0 replies
5h6m

If you're looking for an alternative way to run traefik, we support this out-of-the-box on https://kraft.cloud -- a platform dedicated to running ultra-lightweight VMs based on Dockerfiles, with millisecond cold start times (96ms for Traefik), scale-to-zero, and autoscale.

Check it out in our docs: https://docs.kraft.cloud/guides/traefik/

It's also possible to start traefik and other services together using Compose files: https://docs.kraft.cloud/guides/features/compose/

mubu
0 replies
5h51m

A couple weeks ago I was deciding between reverse proxies and eventually settled on Caddy because of its simplicity. However, Traefik's auto-discovery of containers and referencing by labels is quite nice; then again, Caddy has a plugin to do the same.

I read the article but I'm still not convinced Traefik has anything over Caddy for me. Maybe someone else does and can chime in.

lakomen
0 replies
1d2h

Traefik is considerably slower and more resource hungry than nginx. There is nothing more to say.

kopadudl
0 replies
1d4h

When my company looked at different proxies for k8s, we ended up on traefik because we had experience from docker swarm and it has a dashboard.

jasoneckert
0 replies
1d3h

Another thing worthy of note is that Traefik is configured by default in K3s. This has allowed K3s to be the quickest way to spin up a K8s cluster for testing, essentially allowing you to treat your cluster like cattle too. Simply add your deployment and associated service using NodePort, and you can access your app without worrying about the ingress controller.

I use a shell script to spin up K3s clusters and test apps I specify as a positional parameter on demand (leveraging the ttl.sh ephemeral container registry). The same script tears down the cluster when finished.
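
The skeleton of that script is short (a sketch; the image name is hypothetical and assumed to be pushed to ttl.sh first):

    #!/bin/sh
    curl -sfL https://get.k3s.io | sh -                      # install k3s; Traefik comes preconfigured
    kubectl create deployment demo --image=ttl.sh/demo:1h    # ephemeral image with a 1-hour TTL
    kubectl expose deployment demo --type=NodePort --port=80
    # ... run tests against the NodePort ...
    /usr/local/bin/k3s-uninstall.sh                          # tear the cluster down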

jakubsuchy
0 replies
7h8m

Article spends a lot of time comparing Traefik to HAProxy. Might just as well use HAProxy then :-)

firesteelrain
0 replies
22h5m

We just started running Traefik in production since self-managed K8s looked too hard and complicated for what we were trying to do. We have an Ansible Docker compose service (that's what we call it) that starts up the containers and auto-registers them with Traefik. It works really well.

We are airgapped so can’t use Let’s Encrypt. We inject the certs into our containers via Ansible or Docker Compose.

dmeijboom
0 replies
10h51m

I don't get the appeal of Traefik. If you want an easy to use reverse proxy that works well, pick nginx. Want something simple for self-hosting? Take a look at caddy. For Kubernetes, try out Envoy Gateway.

djhworld
0 replies
1d4h

I've been using traefik for a few years for all my self hosted things.

I abandoned the dynamic/discovery/docker labelling functionality though; it was just too finicky and annoying to debug.

Instead I generate a static config file using a template engine, pretty much all my things are just a combination of host/target/port so very easy to generate the relevant sections - I don't really have any complicated middlewares other than handling TLS. It sounds like the author of the linked post has taken the same route.

The config gets generated through an ansible script and then gets copied to the machine where traefik is running - traefik watches the directory where that file is and auto-reloads on changes.
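
The watching part is just traefik's file provider (a sketch; the directory is hypothetical):

    # traefik.yml (static config)
    providers:
      file:
        directory: /etc/traefik/dynamic
        watch: true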

It's been working great!

cab404
0 replies
23h36m

Somehow, I find myself using Caddy everywhere I would use Træfik in the past.

btbuilder
0 replies
1d4h

When looking for a reverse proxy that is performant on Windows and Linux around 5 or 6 years ago the options were very limited. Traefik is what we ended up using.

I haven’t checked recently but at the time nginx on Windows used select() and envoy was either beta or needed a recent version of the Windows kernel that not all customers were running.

We still use it today.

brainzap
0 replies
21h4m

It would be nice if proxies were opinionated about typical URL use cases and offered an easy way to redirect www to non-www or handle paths with a missing slash.

barbazoo
0 replies
1d1h

I'd stay away from it. The magical way to set it up via docker compose tags is nice but doesn't allow for zero-downtime deployments, at least it didn't until recently.

Getting true zero downtime deployments only worked with their file provider but that’s a bit archaic these days.

Sincere6066
0 replies
1d1h

I'll stick with caddy. It's worked for me for years.

MrOxiMoron
0 replies
1d4h

I love traefik; we use it with nomad/consul and docker to set up our whole infrastructure. The plugin system is also simple yet powerful, and the dynamic configs are great for our customers' custom domains: we can quickly see if a domain points to the right IP and add it to get everything working. And if a domain no longer points to us, we get a slack notification and it is removed from traefik so it no longer tries to get SSL certificates for it.