HN comments for: Why are we templating YAML? (2019)

To me YAML seems like the CoffeeScript of JSON, and unlike CoffeeScript I don’t understand why people are still using it.

I guess XML and JSON are too verbose. But YAML is so far in the opposite direction, we get the same surprise conversions we’ve had in Excel (https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-fr...). Why is “on” a boolean literal (of course so are “true”, “false”, as well as “yes”, “no”, “y”, “n”, “off”, and all capitalized and uppercase variants)? And people are actually using this in production software?
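To make the list above concrete, here is a minimal sketch that classifies plain scalars using the boolean pattern from the YAML 1.1 "bool" type description. It is not a YAML parser, real parsers may implement only a subset of this list, and the sample values are made up for illustration.

```python
import re

# Alternatives listed for the YAML 1.1 "bool" type.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def looks_like_yaml11_bool(scalar: str) -> bool:
    """Return True if an unquoted scalar would match the YAML 1.1 bool pattern."""
    return YAML11_BOOL.fullmatch(scalar) is not None


# Hypothetical unquoted config values.
for value in ["on", "NO", "y", "Norway", "off", "true", "1"]:
    kind = "bool" if looks_like_yaml11_bool(value) else "not a bool"
    print(f"{value!r:10} -> {kind}")
```

Quoting the value ("on", "NO") is the usual way to keep it a string.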

Then when you add templating it’s no longer readable and concise anyways. So, why? In JSON, you can add templating super easily by turning it into regular JavaScript: use global variables, functions and the like. I don’t understand how anyone could prefer YAML with an ugly templating DSL over that.
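The "just use a real language" idea can be sketched like this. The comment talks about JavaScript; this is the same approach in Python, and the registry host, service names and fields are hypothetical.

```python
import json

REGISTRY = "registry.example.com"  # hypothetical shared variable


def service(name: str, replicas: int = 1) -> dict:
    """A plain function plays the role of a template."""
    return {
        "name": name,
        "image": f"{REGISTRY}/{name}:latest",
        "replicas": replicas,
    }


config = {"services": [service("web", replicas=3), service("worker")]}

# The output is always well-formed JSON because it is serialized from data,
# not assembled from pieces of text.
print(json.dumps(config, indent=2))
```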

And if you really care about conciseness, there’s TOML. Are there any advantages of YAML over TOML?

Dunno, to me YAML is the python of markup languages.

YAML is decent at handling things like nesting and arrays, while TOML sucks at it.

I don't dislike YAML that much.

That being said, we've known since the dawn of C macros that templating languages which are not aware of syntax are AWFUL.

Likewise, writing Helm charts (the place I encountered YAML templating) is just horrible, but would be so much nicer if templates respected the YAML syntax tree and expanded at the right subnode, instead of being text-replace botch jobs.
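A rough sketch of the difference, with made-up template and values. The first function pastes text the way a syntax-unaware template does; the second attaches a subtree to a node of the already-parsed document. JSON is used for output only because Python's standard library has no YAML writer; this is not how Helm or any particular tool works.

```python
import json

# Hypothetical template and multi-line value.
TEMPLATE = """\
spec:
  containers:
    - name: app
      env:
        {{ENV}}
"""
ENV_BLOCK = "- name: A\n  value: '1'\n- name: B\n  value: '2'"


def text_substitute(template: str, block: str) -> str:
    """Syntax-unaware expansion: paste the text where the placeholder is."""
    return template.replace("{{ENV}}", block)


def structural_substitute(tree, path, subtree):
    """Syntax-aware expansion: attach a subtree at the node named by path."""
    node = tree
    for key in path[:-1]:
        node = node[key]
    node[path[-1]] = subtree
    return tree


# Only the first pasted line inherits the placeholder's indentation;
# the remaining lines keep their own and fall out of the "env" list.
broken_lines = text_substitute(TEMPLATE, ENV_BLOCK).splitlines()
print("\n".join(broken_lines))

doc = {"spec": {"containers": [{"name": "app", "env": None}]}}
env = [{"name": "A", "value": "1"}, {"name": "B", "value": "2"}]
fixed = structural_substitute(doc, ["spec", "containers", 0, "env"], env)
print(json.dumps(fixed, indent=2))
```

The text version only works if every multi-line value is re-indented by hand to match the spot it lands in.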

The biggest issue I have with Yaml is that they forbid tabs.

Their argument is that tabs are shown differently in every editor which is actually something I like. When you're looking for something deeply nested you can reduce the tab distance a bit, when that's not needed you can increase it to improve visibility of nesting levels.

And forbidding it makes a one-keystroke action a two or four one.

I really don't understand the python/Yaml hate for tabs, and as a result I don't really use either.

> And forbidding it makes a one-keystroke action a two or four one.

You can’t be serious

Not everyone wants a bloated and buggy IDE to write their code for them.

Like vim, for example? Which supports replacing tab inputs with spaces...

I code practically exclusively with vim. The replacement is buggy and has many corner cases that come up constantly. As in all editors.

Tab indentation has no bugs or corner cases.

And I've been using vim exclusively for north of fifteen years with Tab replacement, never had a problem with the editor getting confused about what happens with spaces when I hit Tab.

Some detail about the corner cases you've run into would be great, if they're happening constantly I can see how it would be a bugbear.

For example with vim (debian) defaults, if you happen to have a 2-space indented Python (the first two spaces are for HN formatting, the first if should start at zero indent):

  if True:
    # Two space indent

And continue to add another if block in that, the autoindent will give you four spaces:

  if True:
    # Two space indent
    if True:
        # Four space autoindent

And if you make a new line after the last row there and hit a backspace, it'll erase one space instead of four, giving an indentation of 3 (+2) spaces. And if you start a new line after that, you'll get an indentation of 8 spaces in total. Ending up with:

  if True:
    # Two space indent
    if True:
        # Four space autoindent
      # Hitting backspace gives this
          # Hitting a tab gives this

This is just one case, but things like this tend to happen quite often when editing code. Even if it's been originally PEP-8 indented. Usually it's not what the Tab does, but what the Backspace or Autoindent does. I'm not exactly sure what exact Tab/Backspace/Autoindent rules underlie the behavior, but I can imagine there having to be quite a bit of hackery to support soft-tabs.

For me this kind of Tab/Autoindent/Backspace confusion is frequent enough that I'd be very surprised if others don't find themselves having to manually fix the number of spaces every now and then. And when watching over the shoulder I see others too occasionally having to micromanage space-indents (or accidentally ending up with three space indented blocks etc), also with other editors than vim.

As with most things in vim, it is definitely manageable with settings such as ts=2 (tab stop) and sts=2 (soft tab stop). This is why a lot of older Python files, in particular, are littered with vim modelines with settings like these.

The nice modern twist is .editorconfig files and the plugins that support them including for vim. You can use those to set such standard language-specific config concerns in a general way for an entire "workspace" for every editor that supports or has a plugin that supports .editorconfig.

Of course you can override it, but is there any excuse for that default behavior? It sounds ridiculous.

The defaults are either 4-space or 8-space soft tab stops. 8 spaces is the oldest soft tab behavior. 4-space soft tabs have been common for C code among other languages for nearly as many decades. It is only relatively recently that Python and JS and several Lisp-family derivatives have made 2-space tab stops a much more common style choice. Unfortunately there is no "perfect" default as these are as much aesthetic preferences as anything else.

(It is one of the arguments for using hard tabs instead of soft ones in the eternal tabs versus spaces debates because editors can show hard tabs as different space equivalents as a user "style choice" without affecting the underlying text format.)

Soft tabs at 4 would be fine, though worse than autodetect. But that is not the behavior described in the above post.

The behavior described above seems to me to be exactly soft tabs at 4 in a 2-space tab document with autoindent turned on (often the default).

Vim has no autodetect by default. (I'm sure there's a plugin somewhere.)

The part where the user is on a line indented by 2, hits return, and gets a line indented by 2+4=6 doesn't sound like soft tabs at 4 to me. And I wouldn't expect hitting backspace to then only remove 1 space (if it actually removed 2 that makes more sense, but is inconsistent with what it just added). At that point, hitting return and getting a line indented by 8 might make sense but is weird.

Another comment suggests it's using 2 and 4 for different settings and that's causing problems.

Well, yes. But that's one more small thing to config and manage. Not a big deal in isolation but such small things add up to significant yank.

With tabs we wouldn't have yet another papercut to tool over.

This is because ftplugin/python.vim does:

  if !exists("g:python_recommended_style") || g:python_recommended_style != 0
    " As suggested by PEP8.
    setlocal expandtab tabstop=4 softtabstop=4 shiftwidth=4
  endif

So if you use "set sw=2" then it leaves tabstop and softtabstop at 4.

You can set that g:python_recommended_style to disable it.

Also sw=0 uses the tabstop value, and softtabstop=-1 uses the shiftwidth value.

I agree Vim's behaviour there is a bit annoying and confusing, but it doesn't really have anything to do with tabs vs. spaces. I strongly prefer tabs myself as well by the way.

Even when you DO use tabs Vim will use spaces if sw/ts/sts differ by the way. Try sw=2 and using >>, or sts=2 with noexpandtab.

When looking at the code, tab-containing files are the most inconsistent ones, especially when viewed via general tools (less, diff, even web viewers).

Sure, if people would only ever use tabs for indentation and spaces for alignment, things could be good. But this almost never happens, instead:

... some lines start with spaces, some with tabs. This looks fine in someone's IDE but the moment you use "diff" or "grep" which adds a prefix, things break and lines become jagged.

... one contributor uses tabs mid-line while others use spaces. It may look fine in their editor with 6 character tabs, but all the tables are misaligned when looking in an app with a different tab size.

Given how many corner cases tabs have, I always try to avoid them. Spaces have no corner cases whatsoever and always look nice, no matter what you use to look at the code.

(the only exceptions are formatters which enforce size-8 tabs consistently everywhere. But I have not seen those outside of golang)
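The "diff adds a prefix" case is easy to reproduce. A small sketch using str.expandtabs to stand in for a viewer with fixed tab stops; the two sample lines are made up.

```python
TAB_LINE = "\treturn x"             # indented with one hard tab
SPACE_LINE = " " * 8 + "return y"   # indented with eight spaces


def render(line: str, prefix: str = "", tabsize: int = 8) -> str:
    """Show a line the way a viewer with the given tab stops would."""
    return (prefix + line).expandtabs(tabsize)


def code_column(rendered: str) -> int:
    """Column at which the code itself starts."""
    return rendered.index("return")


# In an editor with 8-column tab stops both lines look identical.
print(code_column(render(TAB_LINE)), code_column(render(SPACE_LINE)))

# A one-character prefix (as in a unified diff) is absorbed by the tab
# but pushes the space-indented line one column to the right.
print(code_column(render(TAB_LINE, "+")), code_column(render(SPACE_LINE, "+")))

# A viewer with 4-column tab stops disagrees even without a prefix.
print(code_column(render(TAB_LINE, tabsize=4)),
      code_column(render(SPACE_LINE, tabsize=4)))
```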

> Sure, if people would only ever use tabs for indentation and spaces for alignment, things could be good. But this almost never happens, instead: ... some lines start with spaces, some with tabs.

People using tabs for alignment can happen when you've got a tab-camp-person who hasn't yet realized how they're terrible for alignment.

But "some lines start with spaces, some with tabs" happens for precisely two reasons:

* you have a codebase with contributors from both camps

* people thought in-editor tooling was the solution (now you have two problems)

> Spaces have no corner cases whatsoever

This is tooling and (as you realized) stop preference dependent.

Almost every text editor has support for tabs-as-spaces.

I haven't used an IDE in years.

I don't want that though. Because then when editing I still have to mess around with spaces.

And the double nature of the spaces makes it hard to see when you have an odd number of spaces when you reach deep indenting levels, which counts as the lesser number of double spaces in Python.

IMO it would be ideal if tabs would be displayed as a block, and you could resize the width of that block on the fly <3

Is there any editor in 2024 that can't replace tabs with spaces? Is it just Notepad?

I prefer to keep my json to one line without white spaces, saves on disk space.

I prefer to write to my disk manually with a magnetized needle in a clean room.

Clean room's for tryhards. Dust adds flavor.

I recommend you use smaller fonts as well.

Ouch. The only problem with the obvious sarcastic tone of that comment is that there are plenty of people that do say exactly the same thing and mean it.

JSON formatting is less important because most apps that deal with it come with good “beautify”, “sort”, “remove all formatting white space” functions in the editor

For code I'd agree. However for configuration files, I find that I often need to edit them in places or environments where I don't have anything but the most bare-bones editor.

A quick search shows that even nano can be configured to use whatever number of spaces you want when you hit tab:

https://askubuntu.com/questions/40732/how-do-i-get-spaces-in...

I consider Nano a fully fledged editor. I'm talking Notepad or a html text box.

Oh. Windows with no ability to install anything didn't even occur to me! I'm truly sorry.

When this happens, I copy four spaces and then use Ctrl+V for Tab.

Yes, it’s not exactly the same due to alignment, and yes you have to repeat it after using the clipboard for other purposes, but it’s good enough for that occasional use.

Tabs aren't a problem

Spaces aren't a problem.

What is a problem is not picking one or the other. There's arguments for both sides but it is critical to just take a side. I'm sorry your side lost but it makes everything better to just go along with the consensus.

At one of my internships in the 90's, a developer I worked with solved the problem by never indenting. Every single line of code started at column 1.

Why even use more than one line?

Why not just make everything whitespace? Give both tabs and spaces their rightful place! https://en.wikipedia.org/wiki/Whitespace_%28programming_lang...

True, that's how real coders work!

10 IF A=1 OR Z=2 GOTO 30

20 GOTO 50

30 PRINT "HELLO WORLD"

40 GOTO 10

50 GOTO 30

Sounds like you were an indent-ured labourer.

Why did you leave an empty column at the start? :]

It sounds like this is the origin story of Python. Whoever worked with this person made it their life’s mission to enforce proper indentation.

I thought the consensus was tabs for block indentation, spaces for alignment.

No, that's what the tabs hold-outs have morphed into. Which illustrates the problem with tabs: It's very difficult to get everyone on a team to care about tabs or not care about alignment.

No, significant whitespace is the problem.

So - you're saying that mixing tabs and spaces in the same file is entirely unproblematic outside of languages with significant whitespace?

Are you sure about that?

i don't take sides, I use tabs AND spaces

"Tabs for indentation, spaces for alignment" is something I wish had caught on.

https://lists.gnu.org/archive/html/emacs-devel/2016-12/msg01...

This e-mail enters to my "favorite quotes from internet" list directly from the top.

It is a funny quip, but I wish they'd consider the reformatting. I find using an autoformatter reduces cognitive load while reading and writing.

Yeah, OP is not wrong. I also like neatly formatted code and it is way easier to read.

I always reformat all my code before all commits. It's just good hygiene.

The funny part is the fussing and the answer they get.

I'd just autoformat the area of my patch and send in the patch that way, maybe plus some autoformatted blocks here and there, slowly fixing the stuff as I go.

If something is too bothersome, first try doing something, and figure out the rest of the process as you go.

Edit: blocks became blogs without my knowledge. Maybe I should write a blog post about it. Don't know.

Us old folks remember the days when reformatting was a computationally expensive action that required a special program to “pretty print” the code. And heaven forbid your code used some language feature your pretty printer didn’t understand and mangled the output making your code uncompilable.

Well, I'm not that young myself. I was playing with computers (programming, in fact) in the early 90s, and I remember when it was expensive.

However, Eclipse has been formatting C++ code with a simple hotkey, understanding the language and without breaking it, for the last 15 years as far as I can remember. It's instant, too.

Because of that I feel a bit surprised when younger people look at it like it's black magic. It's neither new, nor unsolved in my conscious experience.

Reformat-on-change is also a valid strategy!

I think I've even seen this employed on C++ codebases with clang-format. Conceptually, it's like `git diff | clang-format`, but there are more flags and scripts involved: https://clang.llvm.org/docs/ClangFormat.html#script-for-patc...

There's deep wisdom there.

> And forbidding it makes a one-keystroke action a two or four one.

The majority of editors can be configured to use tab to insert the appropriate number of spaces. Many will automatically detect the correct configuration.

The majority isn't all, and in my experience you always end up having to use one in some random situation that doesn't have that. tap tap tap tap

Literally 100% of editors support tabs.

The horror

The attempt on my life has left me scarred and deformed.

Which editor have you run into that doesn't? Even nano supports configuring it with both nanorc or a command line flag.

HTML textareas don’t support entering tabs.

There are rich-text editors that increase the margin on Tab rather than inserting a tab.

> forbidding it makes a one-keystroke action a two or four one.

Not if your editor can be configured to interpret a Tab keypress as the appropriate number of spaces. AFAIK all common text editors, at least in the Unix world, do this.

I don't want that though, because then I still have to mess around with spaces when editing.

I actually like tabs for indenting levels especially because I can configure how far they indent on the fly.

My editor can also change the width of blocks of spaces on the fly, as well as navigating them in arbitrary chunks.

I'm pretty sure most of the "spaces" people have their editor set up to convert the 'tab' key into multiple spaces.

Now excuse me while I duck under this table.

You're safe because you're right.

Every editor I know can enter n spaces when pressing tab. That might solve your concern.

It does not. You still have to mess around with a bunch of spaces when you're editing or copy/pasting, and not having exact even numbers makes for ambiguous situations.

I agree with you about YAML's treatment of tabs. I still use YAML because there's often no other choice.

Python is actually flexible in its acceptance of both spaces and tabs for indentation.

Maybe you were thinking of Nim or Zig? Nim apparently supports an unsightly "magic" line for this (`#? replace(sub = "\t", by = " ")`), and Zig now appears to tolerate tabs as long as you don't use `zig fmt`. I haven't used either yet because of the prejudice against tabs, but Zig is starting to look more palatable.

> I agree with you about YAML's treatment of tabs. I still use YAML because there's often no other choice.

True, I'm using it too when I have no other choice.

> Python is actually flexible in its acceptance of both spaces and tabs for indentation.

True but it does give constant warnings then which is annoying. And I was worried about it dropping support in the future so I didn't want to waste time learning it.
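For what it's worth, here is a small sketch of what the Python 3 interpreter itself accepts, using the built-in compile(). The snippets are made up; the point is that consistent tabs are fine and only ambiguous mixing inside one block is rejected. Linters may still complain about tabs, which is a separate matter.

```python
def check(source: str) -> str:
    """Compile a snippet and report how the interpreter reacts to its indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return "TabError"
    return "ok"


TABS_ONLY = "if True:\n\tx = 1\n\ty = 2\n"
SPACES_ONLY = "if True:\n    x = 1\n    y = 2\n"
# Each block is internally consistent, but the two blocks differ.
DIFFERENT_BLOCKS = "if True:\n\tx = 1\nif True:\n    y = 2\n"
# One block whose lines are indented with a tab and with eight spaces.
MIXED_IN_ONE_BLOCK = "if True:\n\tx = 1\n        y = 2\n"

for name, source in [
    ("tabs only", TABS_ONLY),
    ("spaces only", SPACES_ONLY),
    ("different blocks", DIFFERENT_BLOCKS),
    ("mixed in one block", MIXED_IN_ONE_BLOCK),
]:
    print(f"{name:20} {check(source)}")
```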

Your problem, and I mean this sincerely and respectfully, is that you're not using your text editor / IDE correctly. Adding two or four spaces of indentation is done by pressing TAB! Once. Most editors will know how to do this out of the box, but if yours doesn't you need to change it.

> You still have to mess around with a bunch of spaces when you're editing or copy/pasting, and not having exact even numbers makes for ambiguous situations.

Especially if something is 5 levels deep, it's really hard to see if you have 12 or 11 spaces (so 5 levels + 1 space or 6 levels) indentation.
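The 11-versus-12 case is hard to see by eye but trivial to check mechanically. A minimal sketch, assuming a 2-space indent width and a made-up sample; it only counts leading spaces and knows nothing about YAML or Python syntax.

```python
def suspicious_indents(text: str, width: int = 2):
    """Return (line_number, leading_spaces) for lines whose indentation
    is not a whole multiple of the indent width."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        spaces = len(line) - len(line.lstrip(" "))
        if spaces % width:
            findings.append((number, spaces))
    return findings


# Hypothetical deeply nested fragment: 10, 11 and 12 leading spaces.
SAMPLE = (
    "a:\n"
    + " " * 10 + "b: 1\n"
    + " " * 11 + "c: 2\n"
    + " " * 12 + "d: 3\n"
)

print(suspicious_indents(SAMPLE))
```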

Like with Python, any competent text editor will take care of this for you, I've never encountered this issue before.

> And forbidding it makes a one-keystroke action a two or four one.

This isn't a tabs or spaces issue. This is a "your editor is bad or configured wrong" issue.

The worst thing with Helm charts is not the YAML, or even the text replace botch-jobs, but that they seem to think that a Go stacktrace is reasonable error reporting. I don't think I've ever worked with a tool with such awfully useless error messages.

But I agree, it'd be better if the template expansion was actually structural and not just text. The huge amount of "| indent 8" etc. in Helm charts is such a stench that by about the second time people encountered that they ought to have made a better template expansion mechanism top priority.

You have an error on line one. Good luck

Ah, this had me laughing.

Unlikely it will ever get better. First to market with a prototype tool, gains market share and momentum. Eventually the enthusiasm fades off and people start hating it, for good and sometimes bad reasons. Yet users are stuck because change is expensive and risky. The team is stuck because any change risks becoming the straw that broke the camel's back, possibly cascading through the user population. Story of our young industry.

And then the second layer of hell, a DSL inside the YAML for something like an Azure DevOps pipeline. Truly awful.

My personal favorite was when my company switched to configuring Jenkins in YAML, with some of the config being in YAML proper and other config being in Groovy embedded inside of multiline strings. Since it's Jenkins, the Groovy itself embeds multiline strings for scripts that need to run, so the languages end up nested three levels deep!

The only thing that saves me is IntelliJ's inject-language-in-string feature.

TOML has the inline table syntax with curlies, like JSON, and inline array syntax with brackets, also like JSON. It could support nesting pretty well.

Sadly, it doesn't support line breaks in the inline table syntax, so using inline tables for nesting is a PITA; inline tables are pretty much unusable for anything which doesn't fit within like 80-100 characters. Inline arrays can contain newlines however, so deeply nested arrays work well.

Newlines in inline tables will be coming in TOML 1.1, which will make TOML much better for deeply nested structures. Unfortunately, there will probably be many years until 1.1 is both actually released and well supported across the ecosystem.

And of course, inline tables can't be at the top level of the document, so TOML might still not be the best way to represent a single deeply nested structure.

Yeah, that's why I prefer ytt over helm syntax. It isn't great syntax, but at least it is aware of what it is doing.

Having said that, yaml has some pretty obvious mistakes. It should have been a lot more prescriptive about data types. Not doing that creates a lot of unneeded confusion and weird bugs.

People balk at XML, but its verbosity plus DTD allows it to pull tricks which you can't do on other things.

Well everything has its place, but XML is I think very well suited where you need to serialize complex things to a readable file, and verify it while it's being written and read back.

Indeed. I get a lot of value out of my strongly typed XML documents. I generally have code that validates them during writing and after reading. Those who don’t understand XML end up learning why it is verbose when they eventually add all of the features they need to whatever half-baked format they are using.

An XML document without a schema is strictly worse than JSON without a schema. JSON with a schema is strictly better than XML with a schema. XML structure does not map neatly into the data types you actually want to use. You do not want to use a tree of things with string attributes, all over your code. If you do have a schema, the first thing you will want to do is turn your data into native language data types. After that point, the serialization method does not matter anymore, and XML would have just been slower. Designing a schema for XML is also more tedious than for JSON.
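The "turn your data into native data types first" step might look like the sketch below. The Facet record and its fields are hypothetical, and the hand-written checks stand in for whatever schema validation you actually use.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:  # hypothetical record type
    id: int
    area: float


def parse_facet(raw: str) -> Facet:
    """Decode one JSON object and convert it into a typed value."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # json.loads keeps 7 and 7.0 apart (int vs float), so this can be strict.
    if type(data.get("id")) is not int:
        raise ValueError("id must be an integer")
    area = data.get("area")
    if isinstance(area, bool) or not isinstance(area, (int, float)):
        raise ValueError("area must be a number")
    return Facet(id=data["id"], area=float(area))


print(parse_facet('{"id": 7, "area": 2}'))
```

After this point the rest of the code only sees Facet values, whatever the wire format was.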

I enjoy JSON for internal stuff and where it does not matter that JSON is not very expressive. JSON Schema is a poor substitute for a proper schema. For anything where I am interfacing with another person or team, I send them a DTD or XSD, which documents the attributes and does not have nonsense like confusing integers and floating point values.

For quick and dirty, I agree about JSON. For serious data interchange, I use XML.

> JSON with a schema is strictly better than XML with a schema.

I am baffled by this assertion. XML Schema (XSD) is much more expressive than JSON Schema.

> XML structure does not map neatly into the data types you actually want to use.

> After that point, the serialization method does not matter anymore, and XML would have just been slower.

Considering I have mapped 3D objects to (a lot of) C++ objects containing thousands of facets under 12ms incl. parsing, sanity checking, object creation, initialization and cross linking of said objects on last decade's hardware, I disagree with that sentiment.

Regarding your first point, even without a schema, an XML document shows its structure and what it expects. So JSON feels like it's hacked together when compared to XML in terms of structure and expressiveness.

It's fine for serializing dark data where people won't see, but if eyes need to inspect it XML is way way more expressive by nature.

Heck, you even need to hack JSON for comments. C'mon :)

The 'XML is verbose' argument is exactly analogous to the 'static typing is verbose' argument. JSON is decent, but it quickly breaks down if you want to have any sort of static sanitisation on input data, and the weird `"$schema"` attribute is quite strange. YAML makes no sense whatsoever to me.

XML is by far the most bulletproof human-readable serialisation-deserialisation language there is.

> The 'XML is verbose' argument is exactly analogous to the 'static typing is verbose' argument.

It’s two things: the static typing analog is definitely there but I’d extend the comparison to something like the J2EE framework fetish & user-hostile tools, too. There were so many cases where understanding an XML document required understanding a dozen semi-documented “standards” and since few of the tools actually had competent implementations you were often forced to write long-form namespace references in things like selectors or repeat the same code.

I worked with multiple people who were pretty gung ho about static typing everything but the constant friction of that self-inflicted toil wore over time. I sometimes wonder whether something more in the Rust spirit where the tools are smart enough not to waste your time might be more successful.

I agree. Here in 2024, I hope everyone agrees that types are great.

Static types aren't just verbose, they're clunky. They only work in a perfect world - dynamic types provide the functionality to actually thrive.

> I sometimes wonder whether something more in the Rust spirit where the tools are smart enough not to waste your time might be more successful.

That could help, the problem being XML. You mention the J2EE framework and semi-documented "standards" - the world is rife with bad xml implementations, buggy xml implementations, and bad programmers reading 1 GB xml documents into memory (or programs needing to be re-worked to support a SAX parser).
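On the "reading 1 GB XML documents into memory" point: a streaming parse does not have to mean a full SAX rewrite. A minimal sketch with xml.etree.ElementTree.iterparse from the Python standard library; the document and the "item" tag are made up, and a small in-memory sample stands in for the large file.

```python
import io
import xml.etree.ElementTree as ET


def count_items(source, tag: str = "item") -> int:
    """Count elements with the given tag without building the whole tree first."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Discard the element's text, attributes and children once handled.
            # The emptied elements still hang off the root, so a production
            # version would prune the root as well.
            elem.clear()
    return count


# Hypothetical sample document.
SAMPLE = (
    b"<items>"
    + b"".join(b"<item n='%d'/>" % i for i in range(1000))
    + b"</items>"
)

print(count_items(io.BytesIO(SAMPLE)))
```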

There's too much baggage at the feet of XML, and the tools that maybe could have helped were always difficult to use/locked behind (absurdly expensive) proprietary paywalls.

JSON started to achieve popularity because as a format, it was relatively un-encumbered. Its biggest tie was to Javascript - if certain tools hadn't been brain-dead about rejecting JSON that wasn't strictly just JSON, it might have achieved the same level of type safety as schema-validated XML, without much of the cruft. But that's not what the tools did, and so JSON became a (sort-of) human-readable data-interchange format, with no validation.

So in 2024 we have no good data-x-change formats, just random tools in little niches that make life better in your chosen poison format. We await a rust - a good format with speed, reliability, interoperability, extensibility, and easy-to-use tools/libraries built in.

Agreed. XML is clunky, no doubt, but it's partly that the tools were just clunky.

Having said that, I do like that you can flip between YAML and JSON. If we could do that with XML (attributes vs sub-elements a problem here) it would be much more useful I think.
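The attributes-versus-sub-elements problem in a few lines: two XML encodings of the same (made-up) record come out as different shapes under a naive conversion, so there is no single obvious JSON equivalent. The to_dict mapping is just one arbitrary choice for illustration.

```python
import xml.etree.ElementTree as ET

# The same hypothetical record, encoded two ways.
AS_ATTRIBUTE = '<user name="ada"/>'
AS_ELEMENT = "<user><name>ada</name></user>"


def to_dict(xml_text: str) -> dict:
    """Naive one-level conversion that keeps attributes and children apart."""
    root = ET.fromstring(xml_text)
    return {
        "tag": root.tag,
        "attributes": dict(root.attrib),
        "children": {child.tag: child.text for child in root},
    }


print(to_dict(AS_ATTRIBUTE))
print(to_dict(AS_ELEMENT))
```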

I think PDML hits a sweet spot. The author didn't set out to recreate XML in a less verbose, more human readable syntax, but pretty much ended up doing so. I'd like to see it mature and gain more widespread adoption.

XML + DTD + XMLSchema had things we're still figuring out how to do with YAML and JSON

You could easily generate a UI based on just the DTD and Schema that could be used to fill a perfectly valid XML file.

Validating incoming XML was a breeze, just give it to the validator class along with the DTD and Schema and boom, done.

> Validating incoming XML was a breeze, just give it to the validator class along with the DTD and Schema and boom, done.

See the boom? It's boomer tech. We can't have old, boomer tech in 2024.

Jokes aside, I wish people spent the time to understand the technologies before disliking them and blindly implementing a different, inferior one.

XML is more popular today than it's ever been. It's just called JSX now.

Besides being aesthetically similar to SGML, because it maps to HTML, JSX has nothing to do with XML. It is Javascript.

It's literally shorthand for "Javascript XML" and its templating syntax is the same as XML. It has a lot to do with XML.

It just looks like JavaScript version of JSP to be honest.

All of that is doable with JSON Schema, though, so it's not something that we’re still figuring out how to do.

absolutely, 100%.

When I first encountered XSLT I seriously thought it was the most ridiculous thing I had ever seen. A frickin' programming language whose syntax was XML.

But then I learned it and I don't think I've ever seen another language that could do what XSLT could do in such a small amount of code. The trick was to treat it like a functional language (I got this advice from someone else and they were absolutely correct). Where most people got into trouble was thinking of it as an imperative language.

Pattern matching expressions is the kool kid on the block, but XSLT had that to the nth degree 20 years ago.

Indeed, XML is a decent document language because of the quality of tools available and its power/flexibility. I hate when people use it for config files and other things that are usually human edited where readability is paramount though.

> I don’t understand why people are still using it

It's a good comparator, there are indeed a lot of similarities, but I never understood why anyone ever used Coffeescript whereas I do think I have a solid understanding of why people use YAML.

It's more like Python than Coffeescript really: it's not just about simplicity & brevity, it's about terminators.

Whitespace-dependent languages are often a pain to format / parse / read in many ways - Python has survived this by the skin of its teeth by being extremely strict about indentation, both in terms of the parser & also community convention. YAML hasn't had this - it remains a mess.

However, both have that very attractive property of not requiring terminators, which can't really be overstated.

> if you really care about conciseness, there’s TOML. Are there any serious advantages of YAML over TOML?

TOML's got some good properties but its handling of structures with a depth > 1 is far from concise, and pretty terrible if I'm honest.

> I never understood why anyone ever used Coffeescript whereas I do think I have a solid understanding of why people use YAML.

When Coffeescript was invented, it was an advancement on top of the awful Javascript standards at the time. It never went anywhere because Javascript caught up, but Coffeescript had a good reason for existing.

Today, Coffeescript is a remnant of old frontends that nobody has bothered transpiling into Javascript yet, but back in the day it was a promising new development.

> it was an advancement

That was certainly the selling point. I never saw any advancements in it - the features were aesthetic syntactic sugar.

Coffeescript came with spreads and destructuring, and added string interpolation, just to name a few things. It also added classes and inheritance and the ?. operator.

I suppose you could argue those are just syntactic sugar because they compiled down to ES5, in the same way you can argue that any programming language is syntactic sugar over raw machine code.

I may disagree (_heavily_) with the Pythonesque syntax Coffeescript chose, but it took a while for ES6 to be widely available, and Coffeescript made ES6 features work on most browsers without any additional effort. It's easy to take today's Javascript for granted, but the web was very different back in 2009.

In addition to this: ruby-like classes and "sane"/expected handling of this using fat arrow functions. I've worked with a few developers at the time that considered themselves pure backend/rails developers and didn't (bother to) grok the details around the way this worked in JS.

I distinctly remember lots of var that = this; in JS code back then, which wasn't required anymore when using CoffeeScript.

Class sanity was the major reason I chose it for a project in the early 2010s. I was interacting with the classes in OpenLayers and being able to do so without all those footguns was very welcome.

javascript was never designed to be used like a classic OOP language, that's why jquery won, it was functional which meant it didn't fight you the way the other libraries did.

javascript is first and foremost functional no matter how hard MS and others have tried to hammer it into a more typical OOP language.

I'm not sure what you mean. You can put functions into objects, you have "this" when you call the functions, you even have prototypes. It seems to me like the language is designed to let you do OOP just fine, and the only thing that was awkward was organizing the code where you define all those functions and the constructor. So they added a sugar keyword for it.

right, it's awkward, so don't do that, be functional instead.

jquery vs mootools/scriptaculous/etc.

jquery won for a reason, it's just flat out a better experience in terms of code specifically because it uses a functional approach in its api rather than an OOP approach.

I would argue that fat arrow functions really are nothing more than syntactic sugar. I don't know of any place where (x,y) => {} couldn't be replaced by function(x,y){}. I prefer arrow functions myself, but it's a very minor addition.

Fixing _this_ is a good point, though.

When you didn't know how this worked, CoffeeScript's fat arrow functions became a life saver when attaching callbacks from inside some object you were writing that probably had an init() method to set up the handlers:

  // Doesn't work, <this> is <window>.
  document.body.addEventListener("click", function(event) { this.handleClick(event) })

vs.

  document.body.addEventListener "click", (event) => @handleClick(event)

You only needed a .bind(this) in the plain JS version, but it felt like surprisingly few people knew this back then.

Interestingly enough, the current version of CoffeeScript compiles this code into an ES6 arrow function itself, but I think back then they used bind() in the transpiled JS.

Fat arrow functions were adopted from CoffeeScript.

IIRC it also had some different scoping rules so you didn’t need to sprinkle `bind` all over.

by being extremely strict about indentation, both in terms of the parser & also community convention. YAML hasn't

This is why I created StrictYAML. A lot of the pain of changing YAML goes away if you strictly type it with a schema but you keep the readability.

Counterintuitively that also includes most indentation errors - it's much easier to zero in on the problem if the error was "expecting status code or content on line 334, got response", for instance.

That makes a lot of sense, though I'd guess that a lot of yaml-ops types wouldn't want to have to write schemas.

TOML's sections remind me of the directory part of a filename, and its keys of the files.

For the content that belongs in a typical configuration file this or the INI style roots are probably the most human approachable formats. For anything more complex maybe a database (such as SQLite?) is preferable past application bootstrap?

Reading YAML has the enjoyment of reading a love letter, whereas JSON has the detrimental feeling of a solicitor's email. For writing, YAML is like putting out a draft: you focus only on the meaning and don't care about the form. JSON is like finishing up your thesis with a rigidly defined structure.

why anyone ever used Coffeescript

CoffeeScript was the front runner for 'Compile to JavaScript' technology. It was the first time we could write some sane frontend code.

Of course things like TypeScript came along and now we cannot unsee what we have already seen.

Are there any serious advantages of YAML over TOML?

Probably not, but you forget YAML came out in 2001 whereas TOML came out in 2013. Neither are spring chickens but inertia is a hell of a thing. For example, Symfony supports YAML, XML and PHP definitions -- but not TOML. Symfony v2 simply predates TOML and they never got around to ditching YAML for TOML because it's not worth the bother.

TOML is just an .ini file plus some syntactic and computing sugar. I can argue that TOML is actually way older than it is.

1. I am unaware of a standardized .ini format

2. The native types in TOML are useful.

Shall we bet on what would happen if we asked 10 random people of any IT stripe to write a small sample INI file?

Come on.

The problem isn't with the small configuration files, those are just argv put into a file.

Here's an experiment actually worth doing: ask ten people to write an ini file for configuring between 3 and 6 servers where some properties are the same for several servers.

It'd generate the same set of problems in INI, YAML, TOML, XML, JSON, BICF (bayindirh's imaginary configuration format).

Because these are not related to how you write the file, but how your software operates in your mind.

How the software operates is of course dependent on the expressiveness of the configuration format, so it is clearly false in most practical senses to claim that the flat key-value format of INI and BICF will generate the same set of problems as formats that allow for lists and nesting.

If we accept the assertion that the complexity of a configuration file for the stated scenario is constant across all configuration formats, we will next be asserting that there's no difference in complexity between solutions in x86 assembly and LISP.

We're approaching from different sides.

You stated a problem: Configure ~6 servers where they share variables.

I can implement it in a plethora of ways. The most sensible one for me is to have a general or globals or defaults area where every server overrides some part of these defaults. The file format has nothing to do with the sectional organization of a configuration file, because none of the formats forces you into a particular section organization.
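As a rough sketch of that defaults-plus-overrides layout, here is one way it could look with Python's configparser, whose DEFAULT section works as the shared area. The server names, addresses and options are all made up for illustration.

```python
import configparser

# Hypothetical config: one defaults area, per-server sections override parts of it.
CONFIG_TEXT = """
[DEFAULT]
port = 22
user = deploy
datacenter = ams1

[web1]
host = 10.0.0.11

[web2]
host = 10.0.0.12
port = 2222

[db1]
host = 10.0.0.21
user = postgres
"""


def load_servers(text):
    """Return {server_name: settings_dict} with the defaults merged in."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # sections() leaves out DEFAULT; each section view falls back to it.
    return {name: dict(parser[name]) for name in parser.sections()}


servers = load_servers(CONFIG_TEXT)
print(servers["web2"])
```

Every value comes back as a string, so the application still decides what "2222" means.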

e.g.: Nesting is just a tool, I don't care about its availability. I don't guarantee that I'll use it even if it's available.

I can write an equally backwards and esoteric configuration file in any syntax. Their ultimate expressiveness doesn't change at the end of the day.

It can be

    <network iface="eno1"><ipv4_address>192.168.1.1</ipv4_address></network>

    iface_eno1_ipv4_address = 192.168.1.1

    iface.eno1.ipv4.address = 192.168.1.1

I don't care. All can do whatever I want and need. Only changes how you parse and map. It's hashmaps, parsing and string matching at the end of the day.
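To make the "it's hashmaps and string matching" point concrete, here is a minimal Python sketch that expands the two flat spellings above into the same nested mapping. The nest helper is a hypothetical name, and it does no conflict handling (a key that is both a leaf and a parent would blow up).

```python
def nest(flat, sep="."):
    """Expand {'a.b.c': value} style keys into nested dicts."""
    tree = {}
    for flat_key, value in flat.items():
        node = tree
        *parents, leaf = flat_key.split(sep)
        for part in parents:
            # Walk down, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


flat_dotted = {"iface.eno1.ipv4.address": "192.168.1.1"}
flat_underscored = {"iface_eno1_ipv4_address": "192.168.1.1"}

print(nest(flat_dotted))
print(nest(flat_underscored, sep="_"))
```

Both calls end up with the same structure, so the choice of separator only changes the parsing step.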

If you know both languages equally well, LISP becomes as complex as x86 assembly and x86 assembly becomes as easy as LISP. Depends on your perspective and priorities.

If you don't know how to use the tool you have at hand, even though it's the simplest possible, you blow your foot off.

However they want to.

One may write a single value containing a CSV, another may use a convention of namespaced keys, whatever. One may base64, one may urlencode, whatever.

The differences don't change the fact that they will all have the same things in common.

Even without a formal spec, we all know what we are free to change and not free to change, and free to assume and not free to assume. The unwritten spec specifies very little, so what? That means maybe it isn't a good choice for some particular task that wants more structure, but that was not what you said and not what I'm ridiculing.

Or was that all you meant in the first place? That without some more to it to define standardized ways to do things, it's not good for these kinds of jobs? I confess I am focusing on the literal text of the comment as though you were trying to say that the term is not meaningful because it is not defined in a recognized and ratified paper.

My point is indeed that it is not meaningful to speak of the INI culture as something directly comparable to a standardised format.

One may write a single value containing a CSV, another may use a convention of namespaced keys, whatever. One may base64, one may urlencode, whatever.

The differences don't change the fact that they will all have the same things in common.

I think this is the first time I've seen this sort of neo-romantic argument, where the representation of information is claimed to be irrelevant because, for some unspecified reason, we all know in our hearts what is being said.

Is this a mystical theory you've built on extensively, or something that came to you from the aether just now?

We should ask them instead to modify the existing INI file. I bet most would do just fine.

This is an .ini:

    [section]
    option=value it the way you want it.
    ; And these are comments. That's all.

I don't argue. I use TOML too, but it doesn't change that it's an ini++. You can treat an .ini file as a TOML file (well, maybe comments need some changing, but eh), they're not different things.

I don't think, even though TOML has some official spec, all parsers are up to it, and may have disagreements between them. It's same for INI.

You can have "native types" in .ini as well. The difference is you'll be handling them explicitly yourself, and you should do that in defensive programming anyway. A config file is a stream of input to your code, and if you don't guard it yourself, you accept what that entails.
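A small sketch of what handling the types yourself could look like, assuming Python's configparser. The section name, option names and the range check are invented for the example.

```python
import configparser

# Hypothetical settings; the section and option names are made up.
INI_TEXT = """
[server]
port = 8080
debug = off
timeout = 2.5
"""


def load_settings(text):
    """Parse the INI text and convert each option explicitly."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["server"]
    # Every raw value is a string; each conversion below is an explicit choice.
    port = section.getint("port")          # ValueError on something like "80a"
    debug = section.getboolean("debug")    # accepts 1/0, yes/no, true/false, on/off
    timeout = section.getfloat("timeout")
    # Guard the value range yourself; the format will not do it for you.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {"port": port, "debug": debug, "timeout": timeout}


settings = load_settings(INI_TEXT)
print(settings)
```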

I don't think even though TOML has some official spec

Read it on https://toml.io/ (Full spec on upper-right… with its evolutions up to final 1.0.0 version).

Oh sorry, I missed a comma. It should read: "I don't think, even though TOML has some official spec, ..."

Fixed the comment too.

I know TOML has an official spec.

Zomg how did you magically read my brain to produce a perfect example of what I was thinking even though there is no IEEE spec? It's unpossible!

I used Windows 3.1 and 3.11.

That's all I'll say.

I don't think, even though TOML has some official spec, all parsers are up to it, and may have disagreements between them.

Overall it's not that bad, see e.g. https://arp242.github.io/toml-test-matrix/

If you look at the failure details then most of them are either minor issues about where things like escape characters are/aren't allowed, or about overriding existing tables (previously the spec was ambiguous on that, and I expect that will clear up over time). Note that overview is not entirely fair because it uses the latest (unreleased) version of toml-test where I added quite a few tests.

These kinds of imperfections in implementations are of course true for any language, see e.g. YAML: https://matrix.yaml.info – I have no reason to believe it's worse in TOML vs. YAML, XML, JSON, or anywhere else. If anything, it's probably a bit better because it's fairly simple and has a pretty decent test suite.

The problem goes deeper. I can't remember who coined the term, but all "implerative" (imperative declarative) languages share the same issue. I don't care if it's JSON, XML, TOML, or YAML, we shouldn't be interpreting markup/data languages. GitHub actions are a good example of everything wrong with implerative languages.

Use a real programming language, you can always read in JSON/YAML/whatever as configuration. Google zx is a good example of this done right, as is Pulumi.

Kris Nóva said it best: "All config drifts towards Turing completion."

Oh man, i have a similar issue with NixLang. Though i know it's not "implerative". Many days i just want to write Nix in my preferred language. I wish Nix had made a simple JSON based IO for configuration, because then i could see what the output of something is - and generate the input state from some other language.

Really frustrating. Nix works.. but i just don't see the value, personally. And this is after living on NixOS for ~3 years now, with 4 active Nix deploys in my house.. i just don't like the language.

I'm currently building this (plus more) - the happy path of what you're talking about is almost complete. There are fundamental issues preventing what you're talking about being used as a complete replacement for NixLang: you'd need every possible language installed/available on the builder machine in order to build packages, and lazy evaluation would completely break (merely evaluating all of nixpkgs takes hours). So you do ultimately need a primary language. That being said, for devops-like stuff there is no reason to have that limitation.

Would you be able to link the repo? I'm curious on your impl

Nix can read JSON, there's a deserializer as one of the builtins you can call. So you can make a bridge where Nix reads your JSON and does something with it, and you can generate the JSON externally like you want. It's how things like poetry2nix work.

"Implerative" - thank you for this, this is the term I've been searching for to describe the weird blending of the two things.. I immediately Googled it and saw that it has previous uses as well, I would love to know who originated the concept. I see so many times, confusion and arguing about what is imperative and declarative, to the point where I question the value of the terms any longer.

FWIW, I have flirted with my own DSL implementations in a few cases. Certainly, language design is much more complex, but I also felt that once you understand enough of EBNF/parser generators (and some of the simpler alternatives), this is a very powerful option as well.

I'm also pretty against DSLs, although they do have the rare use case. For an example of why DSLs can be bad, look at Dockerfiles contrasted with Buildah. The former makes tons of assumptions, especially when to perform layer checkpoints. The latter is just a script in Bash or whatever your language of choice.

For the curious, this might be it: "I've cracked our marketing code, y'all! Pulumi: Implerative Appfrastructure" [1] @funcOfJoe, Joe Duffy: CEO of Pulumi

[1] https://twitter.com/funcOfJoe/status/1319667607214067712

Also an interesting post referencing the term in a previous comment on HN: https://news.ycombinator.com/item?id=31182790

I've always wondered why we seem to have implemented a whole programming language in yaml or json for so many CI/CD systems rather than just writing quick python scripts to describe the logic of a particular build step, then MAYBE using a JSON or XML file to enumerate the build steps and their order, like:

    build: src/build.py
    test: src/test.py
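A minimal sketch of what reading such a step list could look like in Python. The "name: script" manifest shape is taken from the two lines above; the function names are hypothetical, and nothing here is how any real CI system works.

```python
import sys

# Hypothetical manifest in the "name: script" shape shown above.
MANIFEST = """\
build: src/build.py
test: src/test.py
"""


def parse_steps(text):
    """Return [(step_name, script_path), ...] in file order."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, script = line.partition(":")
        steps.append((name.strip(), script.strip()))
    return steps


def command_for(script):
    """Command line that would run one step with the current interpreter."""
    return [sys.executable, script]


for name, script in parse_steps(MANIFEST):
    print(name, "->", command_for(script))
```

Actually running the steps would be a loop over subprocess.run(command_for(script), check=True), stopping at the first failure; all the build logic stays in the scripts.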

Sure, that's orchestration, though. The problem with GHA is the sheer amount of expressive power that it has. If you need to do dynamic stuff then that should be in a "pre-workflow" step, written however/in whatever you please, that emits the actual workflow.

Why shouldn't the python script be the discrete workflow step? It could be mounted on some file system which has checked out the git at a particular commit with a particular tag, then runs whatever tasks are required to validate or deploy the project

I agree with all of this.

If we take it one step further though and think about portability of configuration, I think that is one of the reasons we end up with operators.

For tools that allow configuration in either JSON or Javascript (like eslint), I prefer the JS version. The syntax is similar but has much more flexibility, like being able to use environment variables or add comments.

Pulumi was also a good tool when I was doing kubernetes deployments.

Why is “on” a boolean literal (of course so are “true”, “false”, as well as “yes”, “no”, “y”, “n”, “off”, and all capitalized and uppercase variants)?

Norway is also "False".

Or more precisely, its country code 'NO' is false. I don't think there are any YAML parsers that parse the literal string 'Norway' as false.

Be the change you wish to see in the world.

I would support a move for YAML to standardize on both "NO" and "Norway" evaluating to false. It seems an obvious win for consistency.

Surely it should accept either "Norway", "Norge" or "Noreg" depending on the locale setting.

Hmmmmmm. In that case the "nodding head" emoji should evaluate to false when the locale is set to Bulgarian...

It’s very obvious that’s what he means.

shrug it wasn't obvious to me. I'm glad someone explained.

It wasn’t obvious to me. I read it as the literal string “Norway” being parsed as false, which didn’t sound believable but I didn’t make the connection to NO at all.

The YAML 1.2 spec removed “no” as a synonym for false. That arguably just made that entire problem worse, and even though it’s been almost 15 years, YAML 1.1 is still the commonly used variant.
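To illustrate the version difference without pulling in a parser, here is a hand-rolled Python sketch of just the boolean resolution step. The word lists follow the YAML 1.1 bool type and the YAML 1.2 core schema as I understand them; real parsers differ in detail, and resolve_plain_scalar is a made-up name, not any library's API.

```python
# Spellings the YAML 1.1 bool type accepts (note: no mixed case such as "oN").
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

# The YAML 1.2 core schema keeps only the true/false spellings.
YAML12_TRUE = {"true", "True", "TRUE"}
YAML12_FALSE = {"false", "False", "FALSE"}


def resolve_plain_scalar(text, version="1.1"):
    """Return a bool if an unquoted scalar would resolve to one, else the string."""
    if version == "1.1":
        true_set, false_set = YAML11_TRUE, YAML11_FALSE
    else:
        true_set, false_set = YAML12_TRUE, YAML12_FALSE
    if text in true_set:
        return True
    if text in false_set:
        return False
    return text


for code in ["SE", "DK", "NO"]:
    print(code, resolve_plain_scalar(code, "1.1"), resolve_plain_scalar(code, "1.2"))
```

Under the 1.1 rules the third country code comes back as a boolean; under the 1.2 rules all three stay strings.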

Ah, that explains why I couldn’t find any online YAML->JSON converters that would demonstrate this flaw when it came up a few weeks ago.

So now we have the same language that parses the same document subtly differently depending on what version you use. Hooray?

Maybe one written by a Geordie?

And yet it doesn't recognize that the UK is false[0].

[0] https://en.wikipedia.org/wiki/Perfidious_Albion

In general I am always confused that it lets you use strings unquoted, which is what allows for all these issues with ambiguity of the interpreted data type, Norway problem and all that.

It also just looks odd to me, I don't see why it's necessary to allow this.

It’s great for end users who don’t understand what a string is or don’t have to play the game of finding the hanging single quote when they write the file by hand in a textarea.

On the opposite end of UX, there’s hand written JSON which is just too meticulous in some scenarios when people are writing config without editor support.

That’s probably a good thing for end users but if it’s running on something that affects the live service I’d rather not have people edit the config who don’t know what a string is

Dealing with inline quotes is annoying, but if you care about users writing things by hand, and especially in a textarea, you should not be using a format that depends on indentation.

It’s because YAML is designed first for readability.

YAML is an amazing config language for simple to mildly complex configs. It's easier to read and write than JSON, and it only really breaks apart when you're heavily deviating from nested lists/dictionaries with string values. People use it everywhere because by the time it becomes painful you're already so invested it's not really worth the hassle of switching.

I, on the other hand, find it much harder to read and write even in very simple configs. I never know what the indent is supposed to be, I just press my spacebar until my editor stops complaining. I find it really hard to tell if a line is a new entry or a subset of the parent entry.

I'm sure if I used it more it'd become easier, but my whole team doesn't understand it either. Luckily we only need it for GitHub configurations.

YAML is (vaguely) a superset of JSON, so you can just use JSON (without tabs) and get your life back.

I don’t need a config language with no fewer than 6 subtly different ways of decoding a string to remember, and certainly not one with a spec longer than C’s. Compare to JSON’s, which (famously) fits on a postcard.

https://yaml.org/spec/1.2.2/

https://yaml-multiline.info/

https://www.json.org/json-en.html

Until you find a snippet of config you want to copy into your `application.yml` in Spring or Quarkus (Java frameworks). If it doesn't paste in cleanly (and it rarely ever does) you'll need to go research the schema and find out where to put things. Meanwhile, if you're using a normal `application.properties` file, after you've finished pasting, you can go on with your life.

It’s aesthetically pleasing for simple configs. I’m so used to writing JSON by hand by now I don’t find it much easier. At least I never have to think about how a value is going to be interpreted from a JSON since it has a decent subset of types and I can visually tell what it is

Does anyone know which format Git uses? Is it YAML? Or TOML? Or something in between?

Git uses its own ini-style conf format which diverges from TOML.

TOML wasn't even invented/specified when git came in to being.

But it looks like YAML was.

I wonder if we would even be using YAML or TOML to the degree we are now if JSON had support for trailing commas and comments.

JSON5 has both of these

I can't find any JSON5 parser that isn't for JavaScript. I've started writing one in C that can then bind to other languages, but it takes time to write!

Having on and no both be Boolean literals, but of opposite values, sounds like a horrible decision: a typo doesn't result in a syntax error, but in a syntactically valid misconfiguration with the opposite meaning.

1 vs 0 is another typo of Boolean values with opposite meanings, in quite a few languages.

Worse, the ability to typo 9 in some languages for 0 and flip your boolean when it should be a type error just seems like a misdesign.

Personally I prefer INI over nearly all configuration formats.

https://github.com/madmurphy/libconfini/wiki/An-INI-critique...

I have seen this post on HN before and it wasn't received very well AFAIR.

But I can't help agreeing with its main point: so much complexity to support a few basic data types that are not sufficient for anything complex anyway.

If you haven't checked it out, NestedText is a great format that offers no handling of types beyond string/list/dict, leaving all that to the application reading in the values.

No character needs escaping.
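A sketch of what "leaving all that to the application" can look like, in Python. The raw dict is hand-written here to stand in for what a strings-only parser would return (no NestedText library is used), and the schema and key names are hypothetical.

```python
# What a strings-only format hands the application: nothing but str, list, dict.
# (Hand-written here; a real program would get this from its parser.)
raw = {
    "country": "NO",
    "port": "8080",
    "debug": "no",
    "hosts": ["10.0.0.1", "10.0.0.2"],
}


def to_bool(text):
    """Application-chosen boolean spellings; anything else is an error."""
    choices = {"yes": True, "no": False}
    if text not in choices:
        raise ValueError(f"expected yes/no, got {text!r}")
    return choices[text]


# Hypothetical schema: the application, not the format, decides each type.
SCHEMA = {"country": str, "port": int, "debug": to_bool, "hosts": list}


def apply_schema(data, schema):
    """Convert each key with its declared converter, naming the key on failure."""
    typed = {}
    for key, convert in schema.items():
        try:
            typed[key] = convert(data[key])
        except (KeyError, ValueError) as exc:
            raise ValueError(f"bad value for {key!r}: {exc}") from exc
    return typed


config = apply_schema(raw, SCHEMA)
print(config)
```

Here "NO" stays a country code and "no" becomes False only because the schema says so.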

Strongly agree. I came to that conclusion before k8s even existed, because I thought about using it as a configuration file format myself, and the second I started realizing some of the ambiguity in its syntax I walked away from it.

The only thing I disagree with is that Coffeescript is still useful. I had the same reaction to Coffeescript that I had with yaml, Coffeescript _never_ had any real point outside of a segment of people preferring to write javascript in Ruby syntax. The biggest issue Coffeescript had is that debugging meant reading through the javascript anyway so you never really got away from javascript.

I'm a fan of either using a full-blown programming language or ini files, and yes I realize that seems insane to many people but at the end of the day ini files are stupidly easy to edit and if you can get away with not needing a full-blown turing complete language then convention based ini files are vastly easier on the human than yaml or json.

I'm either a greybeard that never got with the times or I'm a rebel, probably depends on who you talk to.

I'm a fan of either using a full-blown programming language or ini files

How do you persist complex multi-object state? Think nested lists of objects with references to one another.

If your answer is still "ini files", I'm sure it can be done, but only with a lot of custom-rolled code...xml/json(even yaml) for all their issues provided a code-free way of persisting this all - either through use of marshalling (xml) or json/yaml.load().

you cut off the part of my statement that answers your question

if you can get away with not needing a full-blown turing complete language then convention based ini files are vastly easier on the human than yaml or json.

My claim isn't that ini files solve for every use case, it's that if your needs are simple enough ini files are superior to json/yaml, but that full-blown turing complete languages are superior to everything else.

Also, if you're saving complex object state you don't have a configuration format but a serialization format and definitely ini isn't good for that.

HJson https://hjson.github.io seems a nice 'in-between' between YAML and JSON without the indentation-based syntax, so closer to the JSON side but with comments and less quotes.

What I don't really get is why the cloud providers / tooling implementors have never drafted up a "YAML-light" that just throws out the rarely-used headache-inducing syntax elements.

Hjson is pretty nice.

Two YAML-light style projects are StrictYAML (a Python library), and NestedText (an alternative spec with only string, list, and dict).

StrictYAML solves most if not all of these problems https://hitchdev.com/strictyaml/features-removed/

StrictYAML is great (and the author is in these comments!), but ultimately it's one specific library, not a format spec, so to depend on it for a project you need every person/tool doing the writing/parsing to commit to use that library (and the programming language it was written for).

Again, it's a great project, but I wanted something similar that is a language-agnostic format specification, so moved on to using NestedText wherever I can.

Why is “on” a boolean literal (of course so are “true”, “false”, as well as “yes”, “no”, “y”, “n”, “off”, and all capitalized and uppercase variants)?

“on”, “off”, “yes”, “no”, “y”, and “n”, and case variants thereof, are not boolean literals in YAML since YAML 1.2 (2009).

As far as I know, not even libyaml supports 1.2. What YAML parsing libraries support 1.2?

I guess the real mystery is why so many tech types speak like an infant having a tantrum, about some esoteric trivia, and then have hordes of their kind come and vigorously head-nod it, and all involved think virtue is being done.

People started using things like YAML, obviously, because it reads closer to natural language. It's like a nested bullet list, which everyone can easily read. Readability is important to people. It's why we don't all still write C and Perl.

So it's one thing to say "I think people should be careful about prioritizing readability over precision especially for production systems". It's another to do this narcissistic dramatic faux-incomprehension implying the markup language gained the popularity it did because everyone's stupider than you.

I guess the real mystery is why so many tech types speak like an infant having a tantrum, about some esoteric trivia, and then have hordes of their kind come and vigorously head-nod it, and all involved think virtue is being done.

Ha, great line. And you caught me mid-tantrum and mid-head nod. :)

We're still using the CoffeeScript of JSON because YAML's UX improvements haven't been brought into the upstream JSON spec like CoffeeScript's UX improvements were brought into JavaScript.

Right, I also don't understand why it's considered a feature of many of these languages to introduce so many ways of doing the same thing. Like the boolean example, but also having three different ways to express a list or dictionary? It's the classic Robustness principle which makes it less robust, making reading and parsing more complicated. How about just allowing one syntax and error if it's not according to spec.

YAML is fine if you don't do weird stuff with it. (And some stupidity like the Norway problem) A good example is OpenAPI schemas, which are quite legible in the YAML.

TOML has some nasty edge cases like top level arrays, arrays of objects under a key, etc.

Obligatory: https://github.com/edn-format/edn

I've had few to no issues when using YAML for docker-compose.yml files. This isn't to say that use of YAML can't be problematic, but I don't believe it's necessarily bad at all for configuration.

So, why? In JSON, you can add templating super easily by turning it into regular JavaScript: use global variables, functions and the like. I don’t understand how anyone could prefer YAML with an ugly templating DSL over that.

That's a valid use case when the target user is the software developer themself, but access to the language runtime is not something that should be accessible to a technical but non-maintainer user. Granted, it's plausible that a "template" JSON can be defined, which would be spread over a JSON-formatted configuration, but what YAML allows the user to do is define "templates" within the configuration itself and control over where those template structures are extended.

When the user is a developer maintaining a software project, they should probably just use JavaScript for configuration, and not JSON files, except when there's a possibility that the configuration can be intercepted.

This is all because people refuse to use JSON parsers that allow comments

The answer is simple: JSON doesn’t have comments, XML isn’t human writable, and TOML isn’t well-supported by common tooling.

Oh toml is atrocious and it’s a nightmare trying to understand nesting with all those repeated keys and double brackets.

CoffeeScript is the worst thing that ever happened to the software industry.

CoffeeScript fooled developers into thinking that transpilation was free and had absolutely no downsides whatsoever. The advantages of CoffeeScript over JavaScript were so incredibly marginal. I've never heard a single good argument about why it was worth adding a transpilation step and all the complexity that came with it.

I think even TypeScript isn't worth the transpilation step and bundling complexity these days, especially not when modern browsers allow you to efficiently preload scripts as modules and bypass bundling entirely.

About YAML. It's also not worth it though it's not quite as infuriating as CoffeeScript. The advantage of JSON is that it's equally as human-friendly as it is software-friendly. YAML leans more towards human-friendliness and sacrifices software friendliness. For instance, you can't cleanly express YAML on a single line to pass to a bash command as you can with JSON. It's just one additional format to learn and think about which doesn't add much value. Its utility does not justify its existence.

why people are still using it.

If support for JSON with comments was more widely available / in use, we'd use that. But it's not, so we don't.

I'm still using CoffeeScript whenever I can. It has one of the nicest syntaxes out there, a lot of code fits to one screenful, the logic of the code is easier to see without the clutter of unnecessary syntax and it's a joy to write too.

YAML is probably used for similar reasons.

I don't understand why people want redundant verbose syntax that makes reading and writing code harder. And sadly I no longer expect anyone to really explain it based on anything tangible.

YAML, or TOML or Json, I think the more problematic issue is using templating at all instead of generating them.

I guess XML and JSON are too verbose. But YAML is so far in the opposite direction, (...)

YAML is a far better format in terms of being human readable and editable, and supports features such as node labels and repeated nodes that turn into killer features when onboarding YAML parsers into applications.

But YAML is so far in the opposite direction, we get the same surprise conversions we’ve had in Excel

This is optional. Besides using a better parser that uses the spec that's long fixed a lot of these listed in the article, another way to avoid the issue is adding more verbosity (that would still not match XML nor JSON).

You don't have this option in XML/JSON, you can't remove all that useless markup (and leave it only when it's useful)

Why is “on” a boolean literal

Because that's what humans use to denote booleans

I wish json5.org were adopted more widely: it's JSON, safe because it cannot be executed, but with comments and trailing commas!

Note: YAML is a superset of JSON, which means that any YAML reader can read JSON.

XML also has some other issues (no typing, too many ways to express maps but none seems to be the correct way, etc.)

JSON just isn't meant to be written by humans (no comments).

But YAML is just horrible; the whole accidental mistyping issue (NO => false) is not acceptable IMHO. That it's a pretty complex thing doesn't help either.

I honestly don't understand why we (e.g. GitHub Actions) still use YAML for new things even knowing all the issues, especially when there are many other well-suited, decent, but less widespread alternatives.

Are there any advantages of YAML over TOML

YAML is older and more well supported. I'll explain to you why I ended up choosing YAML for the config files for a CLI utility written in Python that I maintain.

I initially chose TOML for many of the reasons mentioned here but before my first release I ended up switching to YAML. Python added support for reading TOML to the standard library in version 3.11, however it still requires you use an external library for writing. Do I use the built in library for reading and an external library for writing? A chunk of my users are on versions of Python older than 3.11 (generally Windows users who installed Python manually at some point), do I import a separate library for THEM to read the files but use the standard library if ver >= 3.11?
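For reference, the version split being weighed there could look roughly like this. It is only a sketch: it assumes the third-party tomli backport is installed on interpreters older than 3.11, covers reading only (writing would still need a separate library), and the config contents are invented.

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib            # standard library, read-only
else:
    import tomli as tomllib   # third-party backport, assumed to be installed

# Hypothetical config contents for illustration.
CONFIG_TEXT = """
[output]
directory = "out"
overwrite = false
"""


def load_config(text):
    """Parse TOML text into plain dicts, lists and scalars."""
    return tomllib.loads(text)


config = load_config(CONFIG_TEXT)
print(config["output"])
```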

Now that I look at the state of things today I probably would add the tomlkit library to my setup file, but that wasn't very mature at the time, so I just used pyyaml. Changing it now would break compatibility with my older versions that use yaml config files, unless I maintained both paths... which I could do but it's just another source of complexity to worry about. These are relatively simple config files the user has to interact with manually so yaml works fine and I don't see any reason to change at this point.

unlike CoffeeScript I don’t understand why people are still using it

Gasp! Does this mean you know why people are still using CoffeeScript?

Well, you can replace YAML with JSON and JS templating without changing the parser. So I guess that’s an advantage over TOML?

I think YAML is for code what Markdown is for text: it is easy to read and _can_ produce the same or equivalent output as stricter and more extensive languages. Easy readability makes this tradeoff acceptable for most.

For everyone who hates YAML, we extend YAML to another use.

It’s like without rules none of you show any common sense. Who cares what the spec says? Obviously you shouldn’t use “oN” as boolean true.

Dhall is not a Turing-complete programming language, which is why Dhall’s type system can provide safety guarantees on par with non-programmable configuration file formats. Specifically, Dhall is a “total” functional programming language, which means that: You can always type-check an expression in a finite amount of time If an expression type-checks then evaluating that expression always succeeds in a finite amount of time

You shouldn't need the full complexity and power of a Turing complete programming language to do config. The point of config is to describe a state, it's just data. You don't need an application within an application to describe state.

Inevitably, the path of just using a programming language for config leads to your config becoming more and more complex until it inevitably needs its own config, etc. You wind up with a sprawling, Byzantine mess.

your config becoming more and more complex until it inevitably needs its own config, etc. You wind up with a sprawling, Byzantine mess.

We're already there with Helm.

People write YAML because it's "just data". Then they want to package it up so they put it in a helm chart. Then they add variable substitution so that the name of resources can be configured by the chart user. Then they want to do some control flow or repetitiveness, so they use ifs and loops in templates. Then it needs configuring, so they add a values.yaml configuration file to configure the YAML templating engine's behaviour. Then it gets complicated so they define helper functions in the templating language, which are saved in another template file.

So we have a YAML program being configured by a YAML configuration file, with functions written in a limited templating language.

But that's sometimes not enough, so sometimes variables are also defined in the values.yaml and referenced elsewhere in the values.yaml with templating. This then gets passed to the templating system, which then evaluates that template-within-a-template, to produce YAML.

At the end of the day, Helm's issues stem from two competing interests:

(1) I want to write something where I can visualize exactly what will be sent to Kubernetes, and visually compare it to the wealth of YAML-based documentation and tutorials out there

(2) I have a set of resources/runners/cronjobs that each require similar, but not identical, setups and environments, so I need looping control flow and/or best-in-class template inclusion utilities

People who have been working in k8s for years can dispense with (1), and thus can use various abstractions for generating YAML/JSON that don't require the user to think about {toYaml | indent 8}.

But for a team that's still skilling up on k8s, Helm is a very reasonable choice of technology in that it lets you preserve (1) even if (2) is very far from a best-in-class level.
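
To make the contrast concrete, here is a small Python sketch of the two approaches. It is not Helm; `indent_block` is a stand-in I wrote to mimic what an indent-by-N template filter does, and the manifest fields are illustrative only. The first half pastes text and has to know the nesting depth; the second half builds the structure and lets the serializer worry about layout.

```python
import json
import textwrap


def indent_block(text, spaces):
    """Text-level stand-in for piping a rendered block through an indent filter."""
    return textwrap.indent(text, " " * spaces)


env_block = "- name: LOG_LEVEL\n  value: debug"

# Text templating: the author has to know this block lands 8 columns deep.
template = "spec:\n  containers:\n    - name: app\n      env:\n{env}"
rendered = template.format(env=indent_block(env_block, 8))

# Structure-aware alternative: nest the data, serialize once at the end.
manifest = {
    "spec": {
        "containers": [
            {"name": "app", "env": [{"name": "LOG_LEVEL", "value": "debug"}]}
        ]
    }
}
as_json = json.dumps(manifest, indent=2)
```

Change the nesting in the first version and the magic number 8 is silently wrong; in the second there is no number to get wrong, but you also no longer see the final document while writing it, which is the trade-off in (1).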

We need turing completeness in the strangest of places. We can often limit these places to a smaller part of the code. But it's really hard to know beforehand where those places will occur. Whenever we think we have found a clear separation we invent a config language.

And then we realize that we need scripting so we invent a templating language. Then everybody loses their minds and invents 5 more config languages that surely will make us not need the templating language.

Let's just call it code and use clever types to separate turing and non-turing completeness?

That's not my experience after using AWS CDK since 2020 in the same company.

Most of our code is plain boring declarative stuff.

However, tooling is lightyears ahead of YAML (we have types, methods, etc...), we can encapsulate best practices and distribute as libs and, finally, escape hatches are possible when declarative code won't cut.

I agree, I think a language like dhall (https://dhall-lang.org/) strikes a good balance.

I have a recent example of rolling out IPv6 in AWS:

1. Create a new VPC, get an auto-assigned /56 prefix from AWS.

2. Create subnets within the VPC. Each subnet needs an explicitly-specified /64 prefix. (Maybe it can be auto-assigned by AWS, but you may still want to follow a specific pattern for your subnets).

3. Add those subnet prefixes to security/firewall rules.

You can do this with a sufficiently-advanced config language - perhaps it has a built-in function to generate subnets from a given prefix. But in my experience, using a general-purpose programming language makes it really easy to do this kind of automation. For reference, I did this using Pulumi with TypeScript, which works really well for this.
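
As a rough illustration of steps 2 and 3 (in Python with only the standard library, not the Pulumi/TypeScript code actually used), something like the following would do. The /56 is a made-up value from the IPv6 documentation range, and `carve_subnets` and the rule dictionaries are hypothetical shapes, not any provider's API.

```python
import ipaddress
from itertools import islice


def carve_subnets(vpc_prefix, count, new_prefix=64):
    """Return the first `count` subnets of size /new_prefix inside vpc_prefix."""
    network = ipaddress.ip_network(vpc_prefix)
    # subnets() is a generator, so islice avoids materializing all 256 /64s.
    return [str(s) for s in islice(network.subnets(new_prefix=new_prefix), count)]


# Assumed example prefix; in practice this would be the /56 assigned by AWS.
subnets = carve_subnets("2001:db8:1234:ab00::/56", 3)

# Hypothetical firewall rule shape, one rule per subnet.
rules = [{"protocol": "tcp", "port": 443, "cidr": cidr} for cidr in subnets]
```

Here `subnets` comes out as `2001:db8:1234:ab00::/64`, `2001:db8:1234:ab01::/64` and `2001:db8:1234:ab02::/64`, and the rules follow from them without anyone copying prefixes by hand.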

A really good solution here is to use a full programming language but run the config generator on every CI run and show the diff in review. This way you have a real language to make conditions as necessary but also can see the concrete results easily.

Unfortunately few review tools handle this well. Checked-in snapshot tests are the closest approximation that I have seen.
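
A minimal sketch of the snapshot idea, assuming the generated config is plain data that can be serialized deterministically. The function names are mine; a real setup would read the checked-in snapshot from the repo and fail the CI job when the diff is non-empty.

```python
import difflib
import json


def render(config):
    """Serialize deterministically so diffs only show real changes."""
    return json.dumps(config, indent=2, sort_keys=True) + "\n"


def snapshot_diff(snapshot_text, generated_text):
    """Return a unified diff between the checked-in snapshot and fresh output."""
    return "".join(
        difflib.unified_diff(
            snapshot_text.splitlines(keepends=True),
            generated_text.splitlines(keepends=True),
            fromfile="snapshot",
            tofile="generated",
        )
    )


# Illustrative values only.
checked_in = render({"name": "api", "replicas": 2})
fresh = render({"name": "api", "replicas": 3})
review_diff = snapshot_diff(checked_in, fresh)
```

The reviewer then sees the concrete one-line change in the output rather than having to evaluate the generator in their head.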

> You don't need an application within an application to describe state.

As shown in the article, you apparently do.

That kind of ignores the entire pipeline involved in computing the correct config. Nobody wants to be manually writing config for dozens of services in multiple environments.

The number of configurations you need to create is multiplicative, take the number of applications, multiply by number of environments, multiply by number of complete deploys (i.e. multiple customers running multiple envs) and very quickly end up with an unmanageable number of unique configurations.

At that point you need a something at least approaching Turing completeness to correctly compute all the unique configs. Whether you decide to achieve that by embedding that computation into your application, or into a separate system that produces pure static config, is kind of academic. The complexity exists either way, and tools are needed to make it manageable.
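
The multiplication is easy to see in a few lines of Python. The application, environment, and customer names below are placeholders, and `config_matrix` is a hypothetical helper; the point is only that the count is a product, not a sum.

```python
from itertools import product


def config_matrix(apps, environments, deployments):
    """Enumerate every (app, environment, deployment) combination."""
    return [
        {"app": app, "environment": env, "deployment": dep}
        for app, env, dep in product(apps, environments, deployments)
    ]


# Placeholder names for illustration.
matrix = config_matrix(
    apps=["web", "worker", "scheduler"],
    environments=["dev", "staging", "prod"],
    deployments=["customer-a", "customer-b"],
)
```

Three apps, three environments, and two customers already give 18 distinct configurations, and each new axis multiplies the total again.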

Yeah, YAML is good at declarative things. It’s when you start using it imperatively, e.g. for CI/CD, that it really starts to get ugly.

It happens because config is dual purpose: it's state, but it's also the text-UI for your program. It spirals out of control because people want the best of it being "just text" and being a nice clean UI.

The complexity is already there. If you only need static state like you say, then YAML/JSON/whatever is fine. But that's not what happens as software grows.

You need data that is different depending on environments, clouds, teams, etc. This complexity will still exist if you use YAML, it'll just be a ridiculous mess where you can break your scripts because you have an extra space in the YAML or added an incorrect `True` somewhere.

Complexity growth is inevitable. What is definitely avoidable is shoving concepts that in fact describe a "business" rule (maybe operational rule is a better name?) in unreadable templates.

Rules like "a deployment needs to add these things when in production, or change those when in staging", etc., exist whether they are hidden behind shitty Go templates or structured inside a class/struct, a method with a descriptive name, etc.

The only downside is that you need to understand some basics of programming. But for me that's not a downside at all, since it's a much more useful skill than only knowing how to stitch Go templates together.

Pulumi is enticing because it allows you to write in your preferred language and abandon HCL, but it is strictly worse in my opinion. IaC should be declarative in my opinion. That allows for greater predictability, reproducibility and maintainability. In general, I think wanting to use Python or Ruby or whatever language you're going to use with Pulumi is not a good basis for choosing the tool.

There are many graveyards filled with places that tried to start writing logic into their IaC back in the Chef/Puppet era and made a huge mess that was impossible to upgrade or maintain (recall that Chef is more imperative/procedural, whereas in Puppet you describe the desired end state). The Chef/Pulumi approach can work, but it requires one person who is draconian about style and maintenance. Otherwise, it turns into a pile of garbage very quickly.

Terraform/Puppet's model is a lot more maintainable for longer terms with bigger teams. It's just a better default for discouraging patterns that necessitate an outsized investment to maintain. Yes HCL can be annoying and it feels freeing to use Python/TS/whatever, but pure declarative code prevents a lot of spaghetti.

recall that Chef is more imperative/procedural, whereas in Puppet you describe the desired end state

Chef's resources and resource collection and notifications scheme is entirely declarative. And after watching users beat their heads against Chef for a decade the thing that users really like is using declarative resources that other people wrote. The thing that they hate doing is trying to think declaratively themselves and write their own declarative resources or use the resource collection properly. People really want the glue code that they need to write to be imperative and simple.

The biggest issue that Chef had was the "two-pass parsing" design (build the entire resource collection, then execute the entire resource collection) along with the way that the resource collection and attributes were two enormous global variables which were mutable across the entire collection of recipe code which was being run, and then the design encouraged you to do that. And recipes were kind of a shit design since they weren't really like procedures or methods in a real programming language, but more like this gigantic concatenated 'main context' script. Local variables didn't bleed through so you got some isolation but attributes and the resource collection flowing through all of them as god-object global variables was horrible. Along with some people getting a bit too clever with Ruby and Chef internals.

I had dreams of freezing the entire node attribute tree after attribute file processing before executing resources to force the whole model into something more like a functional programming style of "here's all your immutable description of your data fed into your functional code of how to configure your system" but that would have been so much worse than Python 2.7-vs-3.0 and blown up the world.

Just looking at imperative-vs-declarative is way too simplistic of an analysis of what went wrong with Chef.

The existence of the YAML language for Pulumi and the CDK for TF both confound this explanation, it’s just not grounded in reality.

The limitations of HCL are actually a good thing!

I have never seen Pulumi or CDKTF stuff work well. At some point you are simply writing a script and abandoning the advantages of a declarative approach.

Pulumi is declarative. The procedural code (Python, Go, etc) generates the declaration of the desired state, which Pulumi then effects on the providers.

HCL is not purely declarative code either. It can invoke non-declarative functions and can do loops based on environment variables, so in that sense there is really no difference between Pulumi and Terraform. The only real difference is that HCL is a terrible language compared to, say, Python.

I'm actually fairly sure HCL is Turing complete, it has loops and variables. But even if it is not all the way turing complete it's pretty close.

The fact that HCL has poor/nonexistent multi-language parsing support makes building tooling around terraform really annoying. I shouldn't have to install Python or a Go library to read my HCL.

I once took a job that involved managing Ansible playbooks for an absolutely massive number of servers that would run them semi-regularly for things like bootstrapping and patching. I had used Chef before for a similar task, and I loved it because it's just ruby and I could easily define any logic I wanted while using loops and proper variables.

I understand that Ansible was designed for non-programmers, but there is no worse hell for someone who is actually familiar with basic programming than being confined to the hyper-verbose nonsense that is Jinja templating of Ansible playbooks when you need to have a lot of conditional tasks and loops.

Ansible has a great module/plugin system. It's trivial to handle complex tasks or computations in a custom module or action.

So why is there this massive ecosystem around not writing modules then? RedHat invented automation controller just so they didn't have to implement proper error handling with Ansible.

I agree. And to make matters worse, the DSL on YAML has grown so large in features, it may as well be a programming language now.

Chef vs Ansible was the first example that popped into my mind. I had a very love/hate relationship with Chef when I used it, but writing cookbooks was definitely one of the good parts.

Agreed, and I almost feel silly for pointing this out, but for writing JSON (JavaScript Object Notation), I'd recommend using JavaScript...

JS is actually not that great for this IMO. You probably need an NPM package to even deal with YAML because JS has a shitty standard library.

Sticking to a scripting language with a strong standard library is way better.

Any unix system can get Ruby/Python and read/write YAML/JSON immediately without caring too much about versions.

Of course in today's upside down world most developers seem to only know JS, so it would at least be "familiar". Still a bad choice in my view.

The way this industry is going, give it a few years and we'll have React-Kubernetes for generating templates. And I wish I was joking.

Parent is talking specifically about writing JSON, not YAML.

For JSON I'd stick with Typescript to be honest. You end up executing Javascript and producing Javascript-native objects, but the typing in Typescript to ensure the objects you produce are actually valid will save a lot of debugging.

I agree that YAML templating is kind of insane, but I will never understand why we don't stop using fake languages and simply use a real language.

The problem is language nerds write languages for other language nerds.

They all want it to be whatever the current sexiness is in language design and want it to be self-hosting and be able to write fast multithreaded webservers in it and then it becomes conceptually complicated.

What we need is like a "Logo" for systems engineers / devops which is a simple toy language that can be described entirely in a book the size of the original K&R C book. It probably needs to be dynamically typed, have control structures that you can learn in a weekend, not have any threading or concurrency, not be object oriented or have inheritance and be functional/modular in design. And have a very easy to use FFI model so it can call out to / be called from other languages and frameworks.

The problem is that language nerds can't control themselves and would add stuff that would grow the language to be more complex, and then they'd use that in core libraries and style guides so that newbies would have to learn it all. I myself would tend towards adding "each/map" kinds of functions on arrays/hashmaps instead of just using for loops and having first class functions and closures, which might be mistakes. There's that immutable FP language for configuration which already exists (i can't google this morning yet) which is exactly the kind of language which will never gain any traction because >95% of the people using templated YAML don't want to learn to program that way.

I mean... Nix satisfies every single one of the things you mentioned, and people say it's too complicated. It's literally just the JSON data structure with lambdas, which really is basic knowledge for any computer scientist, and yet people complain about it.

It's fairly straightforward to 'embed' and as a bonus it generates json anyway (you can use the Nix command line to generate JSON). Me personally, I use it as my templating system (independent of nixpkgs) and it works great. It's a real language, but also restrictive enough that you don't do anything stupid (no IO really, and the IO it does have is declarative, functional and pure -- via hashing)

Completely agree, my wish is that anything that risks getting complex uses a Ruby-based DSL.

For example, I like using Capistrano, which is a wrapper around rake, which is a Ruby-based DSL. That means that if things get tricky I can just drop down to using a programming language. I can split stuff into logical parts that I load where needed and, for example, do something like YAML.load(..file..).dig('attribute name') or JSON.load from somewhere else.

Yes, you risk someone building spaghetti that way, but the flip side is that a good devops can build something much easier to maintain than dozens of YAML and JSON files, and you get all the power from your IDE and linters that are already available for the programming language, so silly syntax errors are caught without needing to run anything.

I heard you liked configuration languages, so I made this configuration language for your configuration language generation scripts. It supports templates, of course.

I agree, and I just want to highlight what you said about generating a config file. It's extremely useful to constrain the config itself to something that can go in a json file or whatever. It makes the config simpler, easier to consume, and easier to document. But when it comes to _writing_ the config file, we should all use a programming language, and preferably a statically typed language that can check for errors and give nice auto complete and inline documentation.

I think aws cdk is a good example of this. Writing plain cloudformation is a pain. CDK solves this not by extending cloudformation with programming capabilities, but by generating the cloudformation for you. And the cloudformation is still a fairly simple, stable input for aws to consume.

This. It's why things like Cloud Development Kit and Pulumi are quite interesting to me.

I argued that point in my article some time ago https://beepb00p.xyz/configs-suck.html also HN discussion at the time news.ycombinator.com/item?id=22787332

Because the security surface of "any language" is tricky and most (all?) popular languages do not have nice data literal syntax better than JSON and YAML.

I think language embedding is kind of a lost architecture in modern stacks. It used to be if you had a sufficiently complex application you'd code the guts in C/C++/Java/Whatever and then if you needed to script it, you'd embed something like a LISP/Lua/whatever on top.

But today, you have plenty of off-the-shelf JSON/TOML/YAML parsers you can just import into your app and a function called readConfig in place of where an embedded interpreter might be more appropriate.

It's just easier for developers to add complexity to a config format rather than provide a full language embedding and provide bindings into the application. So people have forgotten how to do it (or even that they can do it - I don't think it occurs to people anymore)

Relevant article (2012): http://mikehadlow.blogspot.com/2012/05/configuration-complex...

This is how config actually works in Scala.

I'm very happy using Typescript to templatize JSON. You can define a template as a class, compose them if needed, and when you are done, just write an object to a file.
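
The same shape translates to other typed-ish languages too. Below is a rough Python equivalent using dataclasses, purely as an illustration of the pattern; the `Container` and `Service` classes and their fields are invented for the example and do not correspond to any real schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Container:
    name: str
    image: str
    env: dict = field(default_factory=dict)


@dataclass
class Service:
    name: str
    replicas: int
    containers: list


def to_json(obj):
    """Turn a composed template object into the final JSON document."""
    # asdict recurses into nested dataclasses, lists, and dicts.
    return json.dumps(asdict(obj), indent=2, sort_keys=True)


# Compose templates, then serialize once at the end.
sidecar = Container(name="metrics", image="example/metrics:1.0")
service = Service(
    name="api",
    replicas=2,
    containers=[Container(name="api", image="example/api:2.3"), sidecar],
)
document = to_json(service)
```

Writing `document` to a file is the last step; everything before it is ordinary code that an editor and a type checker can help with.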

Helm would probably benefit from something like JSX for YAML/JSON. Just being able to script a chart instead of this templating hell.

In some places working with Kubernetes, people unironically use the term "YAML engineer".

I've seen memes where SREs complain they have just become YAML engineers. :(

I mean...building a data centre / PaaS with YAML is pretty cool

We used to have to shove servers in to racks ! Kids these days :D

I *loved* shoving servers in racks!

I dream of a day there's a physical component of my job, not just the staring at a screen bit.

I've been there. Not YAML specifically, but basically just configuration (XML, JSON, properties, ...) for some proprietary systems without any good documentation or support available. "It's easy, just do/insert X"; half a year and dozens of meetings and experts later, it was indeed not just X. Meanwhile I could've built everything myself from scratch or with common open-source solutions.

yamlops is a real thing :)

This criticism doesn't pass the sniff test though: your average Haskeller loves to extol the virtues of using Haskell to implement a DSL for some system which is ultimately just doing the same thing in practice (because they're still not going to write documentation for it, but hey, how hard can it be to figure out it's just...)

YAML becomes a programming language because vendors need a DSL for their system, and they need to present it in a form which every other language can mostly handle the AST for, which means it's easiest if it just lives atop a data transfer format.

maybe yaml should standardise hygienic macros. and a repl.

The lengths people go to avoid using s-expressions never ceases to amaze me.

We're talking countless centuries and a great many minds pushed to the brink of madness, just to keep the configs looking like Python or JavaScript.

I'd say it's even worse: it's a collective hallucination that complex configs are not code.

Hey now. Your average Haskeller would simply recommend you replace YAML with Dhall.

https://dhall-lang.org/

Why not "just" use an embedded DSL?

I don't know what this has to do with Haskell. I understand that they need a DSL for their system. I just don't agree that it is a good idea to use some general purpose serialization format. In the end they always evolve to a nearly full programming language with conditions and loops. Using a full programming language makes much more sense IMHO, for example like Zig build files or how we use Python to build neural networks. That way I can actually use existing tools to do what I need.

YAML is the Bradford Pear of serialization formats. It looks good at first, but as your project ages and the YAML grows, it collapses under the weight of its own branches.

I had to look up that tree. Invasive, offensive odour, cyanide-rich fruit. That's a good insult!

You should see what they look like after a 25kph breeze. Which isn't too far off from what templated YAML generates after someone commits a bad template.

YAML is also just as bad as the Linden tree.

https://www.youtube.com/watch?v=aoqlYGuZGVM

Even worse, every generation repeats this mistake. I‘m not sure S-Expressions are the answer, but Terraform HCL should never have been invented.

I was just telling a colleague today that HCL is great until you need to do a loop. A lot of parallels to this YAML discussion

My favorite pattern in HCL is the if-loop. Since there is no »only do this resource if P« in Terraform, the solution is »run this loop not at all or once«.

I'll take HCL over YAML templating any day. At least it is working with real data structures not bashing strings together.

That being said, yes, it is also an awful language.

It's pretty much repeating the mistake of early 2010s Java, where the entire application frequently was glued together by an enormous ball of XML that configured all the dependency injection.

It had the familiar properties of (despite DTDs and XML validation) often blowing up late, and providing error messages that were difficult to interpret.

At the time a lot of the frustration was aimed at XML, but the mid 2020s YAML hell shows us that the problem was never the markup language.

You have a loosely coupled bundle of modules that you need to glue together with some configuration language. So you decide to use X. Now you have two problems.

Spot on. We use ytt[0], "a slightly modified version of the Starlark programming language which is a dialect of Python". Burying logic somewhere in a yaml template is one thing I dislike with passion.

[0] https://tanzu.vmware.com/developer/guides/ytt-gs/

TBH, ytt is the only yaml templating approach that I actually like.

The downside is that it is easy to do dumb things and put a lot of loops in your yaml.

The positive is that it is pretty easy to use it like an actual templating language with business logic in starlark files that look almost just like Python. In practice this works pretty well.

The syntax is still fairly clumsy, but I like it more than helm.

In such places one frequently has to remind oneself and others to not start programming in that configuration language, if avoidable, to not create tons of headache and pain.

Yeah … for CI files (like Github workflows & such), one of the best things I think I've done is just to immediately exec out to a script or program. That is, most of our CI steps look like this:

  run: 'exec ci/some-program'

… and that's it. It really aids being able to run the (failing) CI step offline, too, since it's a single script.

Stuff like Ansible is another matter altogether. That really is programming in YAML, and it hurts.

I love the idea of keeping it simple and I do try to use kustomize or even plain yaml as installation method as much as possible.

But in practice when managing large systems you inevitably end up benefiting from templating

I've begun thinking that if you start thinking about templating you might be better off building an operator. Operators aren't as well understood and documented. But in my mind an operator is just a pod or deployment that creates on demand resources using the k8s api.

oh yeah; operators are great and sometimes they are necessary.

On the other hand, most operators I've seen are just k8s manifest templates implemented in Go.

I often end up preferring using Jsonnet to deal with that instead of doing the same stuff in Go.

Jsonnet is much closer to the underlying data model (the k8s manifest JSON/YAML document) and comes with some useful functionality out of the box, such as "overlays".

It has downsides too! It's untyped, debugging tools are lacking, people are unfamiliar with it and don't care to learn it. So I totally get why one would entertain the possibility of writing your "templates" using a better language.

However, an operator is often too much freedom. It's not just using Go or Rust or Typescript to "generate" some Json manifests, but it also contains the code to interact with the API server, setup watches, and reactions etc.

I often wish there was a better way to separate those two concerns

I'm a fan of metacontroller [1], which is a tool that allows you to write operators without actually writing a lot of imperative code that interacts with the k8s API, but instead just provide a general JSON->JSON transformer, which you could write in any language (Go, Python, Rust, Javascript, .... and also Jsonnet if you want).

I recently implemented something similar but more tailored to just "installing" stuff, called Kubit [2]. An OCI artifact contains some arbitrary tarball (generally containing some template sources) and a reference to a docker image containing an "engine", and Kubit runs the engine with your provided tarball + some parameters passed in a CRD. The OCI artifact could contain a helm chart and the template engine could contain the helm binary, or the template engine could be kubecfg and the OCI artifact could contain a bunch of jsonnet files. Or you could write your own stuff in python or typescript. The kubit operator then just runs your code, gathers the output and applies it with kubectl apply-set.

1. https://metacontroller.github.io/metacontroller/intro.html

2. https://github.com/kubecfg/kubit

I'm a fan of metacontroller [1], which is a tool that allows you to write operators without actually writing a lot of imperative code that interacts with the k8s API, but instead just provide a general JSON->JSON transformer,

That seems... surprising, to me. It's not clear to me how a JSON->JSON transformer (which is essentially a pure function on UTF-8 strings to UTF-8 strings, i.e. an operation without side effect) can actually modify the state of the world to bring your requested resources to life. If the only thing the Operator is being used for is pure computation, then I agree it's overkill.

An example use case for an Operator would be a Pod running on the cluster that is able to receive YAML documents/resource objects describing what kind of x509 certificate is desired, fulfill an ACME certificate order, and populate a Secret resource on the cluster containing the x509 certificate requested. It's not strictly JSON to JSON, from "certificate" custom resource to Secret resource - there's a bunch of side-effecting that needs to take place to, for instance, respond to DNS01 or HTTP01 challenges by actually creating a publicly accessible artifact somewhere. That's what Operators are for.

Metacontroller is actually quite easy to learn. It comes with good examples too. Including a re-implementation of the Stateful Set controller, all done with iterations of an otherwise pure computation. The trick is obviously that the state lives in the k8s api server, from which the inputs of the subsequent invocation of your pure function come.
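
To give a feel for what "pure function, state lives in the API server" means, here is a heavily simplified sketch in Python. The request/response field names (`parent`, `children`, `status`, and the `ConfigMap.v1` key) follow my recollection of the hook shape and should be treated as assumptions; check the metacontroller docs before relying on any of them.

```python
def sync(request):
    """Map the observed parent object to the children it should own.

    No API calls and no stored state: everything comes in through
    `request` and goes out through the return value.
    """
    parent = request["parent"]
    name = parent["metadata"]["name"]
    replicas = int(parent.get("spec", {}).get("replicas", 1))

    # Desired state: one ConfigMap per replica (an arbitrary example kind).
    desired = [
        {
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "metadata": {"name": f"{name}-{index}"},
            "data": {"owner": name},
        }
        for index in range(replicas)
    ]

    # Status is computed from what was observed on this call, not from memory.
    observed = request.get("children", {}).get("ConfigMap.v1", {})
    return {"status": {"observed": len(observed)}, "children": desired}
```

The side effects (creating, updating, deleting the children) are done by the controller that calls the function, and the next invocation sees the result of those side effects in its input. That is also why this model does not cover cases like the ACME example above, where the work happens outside the cluster.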

Helm is a low budget operator.

No... no, no, no. No kidding; Operators are indeed poorly understood. They are not just glorified XSLT for YAML/JSON.

https://kubernetes.io/docs/concepts/extend-kubernetes/operat...

The purpose of an Operator is to realize the resources desired/requested in a (custom) resource manifest, often as YAML or JSON.

You give the apiserver a document describing what resources you need. The Operator actually does the work of provisioning those resources in the "real world" and (should) update the status field on the API object to indicate if those resources are ready.

How would one use the json api without ending up writing a bunch of custom code?

I think custom code is to be expected, and making it maintainable is what's important.

everything should be made as simple as possible, but no simpler.

Helm et al made it simpler than it was, IMO.

Helm is another can of hot garbage. Impossible to vendor without hitting name collisions, can configure only what’s templated.

Jsonnet is the way to go with generated helm manifests transformed later. Kustomize with its post-renderer hooks is another can of even hotter garbage.

Impossible to vendor without hitting name collisions

What problem exactly are you facing? I can change the name of the chart itself in chart.yaml and if the name of the resources collide I change them with nameOverride/fullnameOverride in the values. All charts have these because they are autogenerated by `helm create`.

I vendor all charts and never had this problem.

You just made a copy of a chart. You modified your chart. What I’m missing is helm having some notion of an org in the chart name, like docker does: repo/name:tag, helm only has name and version. Hence you modify your chart.yaml and it should be preferable without having to modify anything.

This is really problematic when a chart pulls dependencies in.

Everyone hand rolling code does not seem like an improvement over tools like helm even if it’s yaml

No, obviously not, and that's not what I've suggested.

probably doesn't meet the 2nd requirement, most definitely doesn't meet the third, but:

https://cdk8s.io/docs/latest/

The second requirement is actually probably the most important: for someone who just set up ArgoCD or Flux, or has their own GitOps pipeline, how much of a headache does a new compile step present?

Lots of things are simple in isolation: want to use Cue? Just get your definitions and install the compiler and call it and boom, there are your k8s defs! Ok, but how do I integrate all of that into my existing toolchain? How do I pass config? Etc, etc.

The best, fastest tool won't win. The tool that has the most frictionless user story will.

I was able to get CDK8s working easily by simply committing the built template along with my TypeScript. Then, I just pointed ArgoCD to my repo.

We do the same thing but commit to a second git repo that we treat like the "k8s yaml release database".

The Kubernetes API is fairly straightforward and has a well-defined (JSON) schema. People learning k8s should be spending the bulk of their time understanding how to use the API, but instead they spend it working out how to use a Helm chart.

This is a general pattern in software. Instead of learning the primitives and fundamentals that your system is built on, which would be too hard, instead learn a bunch of abstractions over top of it. Sure, now you are insulated from the lower-level details of the system, but now you have to deal with a massive stack of abstractions that makes diagnosis and debugging difficult once something goes wrong. Now it's much harder to ascertain what exactly is happening in your system, since the details of what is actually going on have been abstracted away from you by design. Further, you are now dependent on that abstraction layer and must support and accommodate whatever updates may be released by the vendor, in addition to whatever else is lurking in your dependency graph.

k8s makes me miss XML

For Helm the value is that it is not a configuration management solution but a package manager. The rest are just methods of writing json/yaml.

I understand the "hate" against yaml, but I don't think it deserves it that much.

Perhaps timoni will take over with its usage of cue. At least it's a package management solution.

We're using jsonnet for our systems and they have absolutely nothing to do with k8s. I'm not sure it's true to say it has ever gained much traction. It's just a niche case for complex configuration, and isn't the most publicised tool.

It does precisely what we need with zero fuss, cross platform and cross _language_ (we've embedded it in C++, .NET, and JVM executables).

We can use the resulting json config with a vast array of tools that simply don't exist for the alternatives such as toml/yaml/hocon/ini, whatever. In fact we tried to get HOCON working for non-JVM languages but there was always some edge case.

You just pinpointed my biggest peeve with YAML. It looks like it's "human friendly" because there are no scary curly braces. But you still need to get the syntax exactly right, so that benefit is very small. And now you have to keep your finger on the screen while scrolling in order to figure out what a bullet belongs to.

Then what alternative do you recommend for content creators? Do you use the alternative in Markdown front matter?

I don't think they have a need for configuration files while filming their tiktoks.

What is the best term to use for the people who are writing content on the web team? The ones who write blog entries, documentation, and marketing pages. The ones who mainly touch Markdown files.

You should make what you do / don't do less of your identity. You're limiting yourself because you identify as "not the kind of person who does that".

Note that I am not a content creator myself. I build solutions for web teams and on those teams, some people focus solely on content and Markdown. I want to offer them an easy editing experience. So far YAML has been the easiest format for them.

Learn JavaScript. Get the fuck out of your "content creator" pigeonhole. JavaScript is content.

I don't think anyone writes their blog entries with JavaScript here.

TOML is pretty easy to grok and forgiving at the same time

But you still need to get the syntax exactly right

I think it's more that it's declarative that makes it simple. Also you just have to remember simpler rules compared to JSON.

E.g.

  - Apple
  - Orange
  - Strawberry
  - Mango

Is simpler than

  [
    "Apple",
    "Orange",
    "Strawberry",
    "Mango"
  ]

Don't forget to skip that last comma! But not all of the others!
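A quick way to see the trailing-comma rule in action is to feed both variants to a strict JSON parser. This is a small sketch using Python's standard `json` module; the helper name `parses` is made up for the example.

```python
import json


def parses(text: str) -> bool:
    """Return True if text is accepted by a strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


without_trailing = '["Apple", "Orange", "Strawberry", "Mango"]'
with_trailing = '["Apple", "Orange", "Strawberry", "Mango",]'

print(parses(without_trailing))
print(parses(with_trailing))
```

The second list differs only by the comma after the last element, and that alone is enough for the parser to reject it.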

I think it's more that it's declarative that makes it simple

..it's no more or less declarative than other configuration languages?

And yes, I get that it looks simpler. I just think that it applies as long as your file can fit in about half a page. As it grows and becomes deeply nested, IMO, that simplicity disappears.

This is the least of my worries: just use VS Code with a plugin that shows red lines on errors and formats YAML, and use yamllint in your CI.

Basically you're saying YAML is unreadable without an IDE or a text editor with advanced highlighting functionality.

YAML is anything but human-friendly. It has far too many special features and edge cases for most people. Something simple like Java properties files would solve something like markdown front matter perfectly fine.

Java properties files are a mess. They still require Latin-1 encoding (ISO-8859-1), which is incompatible with UTF-8.

Only if you're still using Java 8.

I'd personally go with TOML over YAML for that

Why something so complex for front matter? Isn't it typically just a few key/value pairs?

GitHub actions would suck whatever you "configured" them in, because you are trying to describe a program in a data structure.

Ansible makes the same mistake, as do countless other tools.

"because you are trying to describe a program in a data structure"

(cries in lisp)

I often wonder if the only reason we haven't used lisp more as a society, and certainly in the devops world, is because our brains find it easier to parse nested indentation than nested parentheses.

But in doing so, we've thrown out the other important part of lisp, which is that you can use the same syntax for data that you do for control flow. And so we're stuck in this world where a "modern-looking" program is seen as a thing that must be evaluated to make sense, not a data structure in and of itself.

https://www.reddit.com/r/lisp/comments/1pyg07/why_not_use_in... is a fascinating 10 year old discussion. And of course, there's Smalltalk, which guided others to a treasure it could not possess. But most younger programmers have never even had these conversations.

Lisp code is written with nested indentation! So that can't be it.

Non-lisp languages have parentheses, brackets and braces, using indentation to clarify the structure. Nobody can reasonably work with minified Javascript, without reformatting it first to span multiple lines, with indentation.

Lisp has great support for indentation; reformatting Lisp nicely, though not entirely trivial, is easier than other languages.

Oh, have you seen parinfer? It's an editing mode that infers indentation from nesting, and nesting from indentation (both directions) in real-time. It also infers closing parentheses. You can just delete lines and it reshuffles the closers.

The github.io site has animations:

https://shaunlebron.github.io/parinfer/

The best interpretation of weebull's comment is not that describing a program in a data structure is "bad" per se, but that doing that in a configuration language (or requiring configuration constructs to be programming constructs) might not be a hot idea.

Even Lisp software that uses Lisp for configuration does not necessarily allow programming in that configuration notation.

Yeah, I think describing a program in a data structure is fine. I honestly prefer it to any syntax that a "real" programming language has brought me. It's so consistent and you can really focus on what you care about. What is unhappy about Github Actions and similar is that your programming language has like 2 keywords; "download a container" and "run a shell script". I would have preferred starting with "func", "handle this error", and "retry this operation if the error is type Foo" ;)

Since this article is about helm, I'll point out that Go templates are very lispy. I often have things in them that look like {{ and (foo bar) (bar baz) }} and it only gets crazier as you add more parentheses ;)

you are trying to describe a program in a data structure

This describes 100% of software development, though! Programming is just designing data structures that represent some computation. Each language lends itself better to some computations than to others (and some, like YAML, are terrible for describing any kind of computation at all), but they're all just data structures describing programs.

The problem isn't that GitHub Actions tries to describe a program in a data structure, the problem is that the language that they chose to represent those programs (YAML and the meta language on top) is ill-suited to the task.

My favorite example of this is chown/chmod taking 4-5 lines, in yaml. Sure you can do it a bunch of different ways, sure it allows for repeatable commands. But, it just sucks.

At this point, I even prefer plain JSON to YAML. What pushed me over the edge is that "deno fmt" comes with a JSON formatter, but not a YAML formatter. It's a single binary that runs in milliseconds. For YAML auto-formatting you basically have to use Prettier, and Prettier depends on half of NPM and takes a good 2 seconds to startup and run. So, I literally moved every YAML file in our repository at work that could be JSON to JSON and I think everyone has been much happier. Or, at least I have been, and nobody has complained to me about it.

Various editors also support a $schema tag in the JSON. I added this feature to our product (which has a flow that invokes your editor on a JSON file), and it works great. You can just press tab and make a config file without reading the docs. Truly wonderful.

YAML has this too with the YAML language server, but you need your tab key to indent stuff, so the ergonomics are pretty un-fun. JSON isn't perfect, but at least the text "no" is true.

At work we're currently expanding to another country. Which means that many services now need a country label etc., which is fun when you're adding "no" to all our existing services. Luckily it's quick to catch, but man... why?
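For anyone who has not been bitten by this yet, here is a rough sketch of the rule involved. It is not a YAML parser: it only applies the boolean pattern from the YAML 1.1 type definitions to unquoted scalars, and the function name is made up. Real parsers differ in how much of this list they honor, and the YAML 1.2 core schema narrows it to the true/false variants.

```python
import re

# Boolean spellings from the YAML 1.1 "bool" type definition.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
TRUTHY = {"y", "yes", "true", "on"}


def resolve_plain_scalar(text: str):
    """Return a bool if an unquoted scalar matches the 1.1 bool pattern, else the string."""
    if YAML11_BOOL.fullmatch(text):
        return text.lower() in TRUTHY
    return text


countries = ["se", "dk", "no", "fi"]
resolved = [resolve_plain_scalar(code) for code in countries]
print(resolved)
```

Under this rule three of the country codes stay strings and Norway turns into a boolean, which is why the label has to be quoted.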

Yeah, I'm pretty sure there are exactly two substantive problems with JSON for (static) configuration file use cases, which are comments and multiline strings (especially with sane handling of indentation). YAML fixes these, but it adds so much complexity in the process including such a predictable footgun of unquoted strings (the no/false problem is particularly glaring/absurd, but it's also easy to forget to quote other boolean values or numbers in a long list of other strings).

If configs had well-adopted schema support, it wouldn't be so bad.

These same feelings extend to other proprietary config languages like HCL for Terraform, ASL for AWS Step Functions, etc. It's fine that you want a declarative API, but let me generate my declaration programmatically.

Yeah, I've had the same sort of opinion since the bad old AWS CloudFormation days. I wrote an experimental CloudFormation generator 4 years ago where all of the resources and Python type hints were generated from a JSON file that AWS published and it worked really well (https://github.com/weberc2/nimbus/blob/master/examples/src/n...).

Config declared in and generated by code has been a superior experience. It's one of the things that AWS CDK got absolutely right.

Is that how CDK works? I've only dabbled with it, but it was pretty far from the "generate cloudformation" experience that I had built; I guess I never "saw the light" for CDK. It felt like trading YAML/templating problems for inheritance/magic problems. I'd really like to hear from more people who have used AWS CDK, Terraform's CDK, and/or Pulumi.

An often-heard benefit of using YAML is that JSON does not have comments. What I don't understand is why we would switch to a whole new language. Just add a filter before loading the configuration, which can't be harder than switching to YAML, right?

Another reason for YAML is that it is easier to read. That I don't understand either. The endless pain of dealing with configuration does not seem to come from the few seconds spent parsing braces and brackets, but from not being able to easily figure out what went wrong, especially when what's wrong is a missing space or tab embedded in hundreds of lines of configuration.
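Such a filter could look roughly like the sketch below. It assumes `//` line comments only (block comments are left out), and the function name and sample config are invented for the example. The one subtle part is skipping `//` inside string values such as URLs.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to end of line but keep the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


CONFIG_TEXT = """{
  // port the service listens on
  "port": 8080,  // inline comments work too
  "docs": "http://example.com//not-a-comment"
}"""

config = json.loads(strip_line_comments(CONFIG_TEXT))
print(config)
```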

In the case of GitHub Actions, it's made more painful by the lack of support for YAML anchors, which would provide a bare minimum of composability.

https://github.com/actions/runner/issues/1182

That's sort of what https://cdks.io does, except the final output is YAML for better readability.

Fyi https://cdk8s.io/

I would recommend implementing a similar API to Grafana Tanka: https://tanka.dev

When you "synthesise", the returned value should be an array or an object.

1. If it's an object, check if it has an `apiVersion` and `kind` key. If it does, yield that as a kubernetes object and do not recurse.

2. If it's an array or any other object, repeat this algorithm for all array elements and object values.

This gives a lot of flexibility to users and other engineers because they can use any data structures they want inside their own libraries. TypeScript's type system improves the ergonomics, too.
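The extraction algorithm described above is short enough to sketch directly. This is written in Python rather than TypeScript for brevity, and the function name `extract_manifests` is made up; it is not Tanka's or cdk8s's actual API.

```python
from typing import Any, Iterator


def extract_manifests(value: Any) -> Iterator[dict]:
    """Walk arbitrary nested data and yield every Kubernetes-looking object."""
    if isinstance(value, dict):
        if "apiVersion" in value and "kind" in value:
            yield value          # a Kubernetes object: emit it, do not recurse
            return
        for child in value.values():
            yield from extract_manifests(child)
    elif isinstance(value, list):
        for child in value:
            yield from extract_manifests(child)
    # Scalars (strings, numbers, None) carry no manifests and are ignored.


synthesised = {
    "frontend": {
        "deployment": {"apiVersion": "apps/v1", "kind": "Deployment"},
        "services": [
            {"apiVersion": "v1", "kind": "Service"},
            {"apiVersion": "v1", "kind": "ConfigMap"},
        ],
    },
    "notes": "anything else is fine here",
}

kinds = [m["kind"] for m in extract_manifests(synthesised)]
print(kinds)
```

Library authors can then group objects however they like, since only the leaves that look like Kubernetes objects are ever emitted.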

copy and paste from the web

Hot take, this is a terrible idea, and is why so much cloud infra is monstrously expensive (and bad).

People need to stop making infra easy. It’s not supposed to be easy, because when you make a bad decision, you don’t get to revert a commit and carry on with life. You don’t understand IOPS and now your gp2 disk is causing CPU starvation from IOWAIT? Guess you’re gonna learn some things about operating within constraints while waiting for a faster disk to arrive at the DC! Buckle up, it’ll be good for you.

I’m fully aware that I sound like a grouchy gatekeeper here, and I’m fine with it. People making stupid infra decisions en masse cause me no end of headaches in my day job, and I’m tired of it.

Others have mentioned CDK, but I want to say that this is almost the exact approach I took on a project recently and it worked out fine. Node script that validates a few arguments and generates k8s manifests as JSON to be fed into `kubectl apply`.

IME, there's no need to involve anything more complicated if your deployment can be described solely as k8s manifests.
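A stripped-down version of that approach might look like the following sketch. It is in Python rather than Node, and the function name, the validation rule and the example values are all illustrative assumptions, not the script from the project mentioned above.

```python
import json


def build_deployment(name: str, image: str, replicas: int) -> dict:
    """Validate a few arguments and return a Deployment manifest as plain data."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Serialising is the last step, so the output is always well-formed JSON.
manifest_json = json.dumps(build_deployment("web", "nginx:1.25", replicas=2), indent=2)
print(manifest_json)
```

Saved as a script, the printed JSON can be piped straight into `kubectl apply -f -`.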

Take a look at cdk8s from Amazon.

https://github.com/cdk8s-team/cdk8s-examples/tree/main/types...

You can convert YAML to JSON programmatically, and JSON is valid jsonnet, so you can pretty much copy paste examples from the web into your jsonnet if you find yourself wanting to do that

Except when you need anything more complex than a string or an array of strings, when they become entirely useless.

There is not a single even slightly complex piece of software that uses exclusively env vars for configuration. Even bash or vim have config files, this is not some new idea.

Hang on there, array of strings? Environment variables can't handle that either without quirks.

Oops, you're right, even that is too advanced...

you can't really check them into source code though, right?

I swear this is how we got docker containers... some ruby dev who abused env vars and a SA who was sick of his shit breaking on every roll out and hearing "but it works for me"...

And now installable software is a fucking unicorn!

(This week I keep running into Go apps that can be installed from source or as a straight download, with Docker as well. Been a breath of fresh air.)

The article is mostly talking about things like Helm charts for kubernetes, which aren't possible to define as env vars.

How do you make sure all the right variants of all the environment variables are in the right place(s)?

How does it compare to dhall?

Dhall's lack of any form of type inference makes it very verbose and difficult to refactor in my opinion. (I'm the author of dhall-kubernetes and never ended up using it in production; funnily enough). Dhall is also extremely slow. We had kubernetes manifests that took _minutes_ to type-check. Cue is basically instant. This matters a lot to me.

I find cue very ergonomic. Also, its treating both types and values as values is very neat. You write your types and your values in the same syntax and everything unifies neatly. But I sometimes miss having functions.

Cue also being able to ingest protobuf definitions and OpenAPI schemas makes it very quick and easy to integrate with your project. Have a new Kubernetes CRD you want to have type-checked in cue? No problem, just run `cue get go k8s.io/api/myapi/v1alpha1` and off you go: you have all your type definitions imported from Go to Cue!

Especially for k8s this makes for very fast development and iteration cycle.

I've wanted to take a look at https://nickel-lang.org/, which is a "what if cue had functions" language, but to be honest Cue kind of serves my needs.

Speaking of Nickel, they've got a great document detailing the reasons for their design (for example why they chose not to embed in a general-purpose language like Pulumi) and how Nickel compares to other config languages like Dhall and CUE: https://github.com/tweag/nickel/blob/master/RATIONALE.md

Cue was designed very much with k8s in mind and developed tutorials and integrations for it early on. Dhall was designed pre-k8s and had to introduce a defaults feature; before that it was completely unusable for k8s. Dhall has functions, which are natural to programmers; for someone from an FP background in particular, Dhall would be trivial to start using. Cue's unification takes some getting used to, but there is enough documentation and integration for getting going with k8s to make up for it. Dhall also has unique features for stably importing configurations from remote locations.

we have a pipeline that ingests very concise cuelang files.

then it generates json files for each application for a tool that will create xml definitions which then are applied to an xls which the architects own, to spit out a yaml that we use to apply our helm charts. the charts deploy a k8s client which then interacts with the main cluster via json using the api.

took a while, but we are using the best tool for each job.

just throw in a kafka cluster so you can pipe each step through an event bus and you'll have an enterprise-grade deployment setup

You used JSON twice, how casual.

Your API should clearly be using protobuf.

If I’m going to use a whole language to generate my config already, why would I use anything but the language my application is written in? Everything can export JSON after all.

Different requirements, different guarantees. Principle of least power. Have a look at https://docs.dhall-lang.org/discussions/Safety-guarantees.ht....

This makes no sense to me.

You have complex enough logic to warrant a language, you should use a real language. You'll have more support, less obscure issues, a solid standard library and whatever else you want, because it's a REAL language.

If the argument is "someone in my team uses recursion to write the YAML files, so I'll disallow it", then the issue is not with the language, it's with the team.

What I have found in my career is that many Ops people sell themselves short and hesitate to dive into learning and fully using an actual language. I've yet to understand why, but I've seen it multiple times.

They then end up using pseudo-languages in configuration files to avoid this small step towards using an actual language, and then complain about how awful those pseudo-languages are.

You have complex enough logic to warrant a language, you should use a real language.

Not sure what you mean. Dhall is a real language:

Because your language might not have a nice type system. For example, Python -> JSON is going to produce worse guarantees than Dhall.

Pulumi over Terraform.

CDK over Cloudformation.

Don't hand craft configuration files, these aren't new lessons. I remember being first introduced to Troposphere, which was pretty awesome.

Haml, Pug, and JSX are not template languages even though they can output HTML.

That's nonsense, unless we go by your idiosyncratic definition of what a template language is ("fancy string interpolation").

Haml (HTML Abstraction Markup Language) is a templating system that is designed to avoid writing inline code in a web document and make the HTML cleaner.

Pug – robust, elegant, feature rich template engine for Node.js

JSX is an XML-like syntax extension to ECMAScript without any defined semantics.

OK, I'd agree that JSX is not strictly a template language.

But in the end, all of these compile down to HTML. Not by string interpolation, but as a language that is parsed into a syntax tree, then rendered into HTML properly with an internal understanding of valid structure.

YAML with templating is fancy string interpolation; it's not a template language (or at least it's a poorly implemented one).
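The difference between the two approaches shows up as soon as a value contains syntax. The sketch below uses JSON as the output format because Python ships a serializer for it, but the same failure mode applies to text-templated YAML. Both function names and the hostile input are invented for the illustration.

```python
import json


def render_by_interpolation(name: str) -> str:
    """Text templating: the value is pasted in with no knowledge of the syntax."""
    return '{"admin": false, "name": "' + name + '"}'


def render_from_structure(name: str) -> str:
    """Structure first: build the data, then let a serializer produce the text."""
    return json.dumps({"admin": False, "name": name})


hostile = 'x", "admin": true, "note": "'

interpolated = json.loads(render_by_interpolation(hostile))
structured = json.loads(render_from_structure(hostile))

print(interpolated)
print(structured)
```

With interpolation the value escapes its field and rewrites a sibling key; with the structural version it stays a single string no matter what it contains.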

I am aware that Haml and Pug call themselves template languages, but they are not. In a template language, the source is a "template" that has some special syntax to fill in some bits. I don't think that's a very idiosyncratic definition. Pretty much any programming language can output a bunch of text, but most of them are not template languages. Java has XMLBuilder, but that doesn't make it a template language for outputting XML. But PHP is a template language, even though it's not recommended to use it that way anymore.

Well, it's true that Haml calls itself a "templating system", and Pug uses the term "template engine". That's 3 out of 3, you win. ;)

PHP is a scripting language that is also a template processor, but I wouldn't call it a template language. So we disagree on several points, but no big deal. A big disadvantage of PHP, in relation to your original point about "fancy string interpolation", is that it does not natively understand the target output HTML syntactically and structurally.

This is the essence of the problem! Yaml and templates are just distractions. It just boils down to the fact that "string" is a very general type and we use it lazily.

My personal rule: Every time a value is inserted into a string it must be properly encoded.

I wrote a full blog post around this a while back: https://kevincox.ca/2022/02/08/escape-everything/. But the TL;DR is that every string has a format which needs to be respected, whether that be HTML, SQL or human-readable terminal output. Every time you put some value into a string you should be properly encoding it into that format. But we rarely do.
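As a rough illustration of that rule, here is the same untrusted value being placed into three different formats, each with that format's own encoder from the Python standard library. The sample value and table name are made up.

```python
import html
import shlex
import sqlite3

user_input = "<b>Tom & 'Jerry'</b>; rm -rf /"

# HTML: encode for the HTML format before inserting into markup.
html_fragment = f"<p>{html.escape(user_input)}</p>"

# Shell: encode as a single shell word before inserting into a command line.
shell_command = f"echo {shlex.quote(user_input)}"

# SQL: let the driver do the encoding via a placeholder instead of pasting text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", (user_input,))
stored = conn.execute("SELECT body FROM notes").fetchone()[0]

print(html_fragment)
print(shell_command)
print(stored)
```

The value is identical in all three cases; only the encoding step changes, because the target format changes.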

This is how Django templates have done it for over a decade. You have to go out of your way to tell it not to escape the values if for some reason you need that.

The fact that it's a purely functional programming language with lazy evaluation is really powerful but steepens the learning curve for devs who haven't worked with functional languages.

The stdlib is also pretty sparse, missing some commonly required functions.

does it really though? what part do they struggle with?

IME engineers struggle with folds most.

I was reading the description of Jsonnet and wondering why we don't just use JavaScript. Read a file, evaluate it, take the value of the last expression as the output, and blat it out as JSON.

The environment could be enriched with some handy functions for working with structures. They could just be normal JavaScript functions. For example, a version of Object.assign which understands that "key+" syntax in objects. Or a function which removes entries from arrays and objects if they have undefined values, making it easy to make entries conditional.

Those things are simple enough to write on demand that this might not even have to be a packaged tool. Just a thing you do with npm.
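For a sense of how little code those two helpers need, here is a sketch of both. It is in Python, so `None` stands in for JavaScript's `undefined`, and the exact semantics chosen for the `key+` suffix (merge objects, concatenate arrays, otherwise replace) are my own assumption loosely modelled on Jsonnet.

```python
def merge(base: dict, patch: dict) -> dict:
    """Return a new dict; keys ending in '+' extend the base value instead of replacing it."""
    result = dict(base)
    for key, value in patch.items():
        if key.endswith("+"):
            target = key[:-1]
            current = result.get(target)
            if isinstance(current, dict) and isinstance(value, dict):
                result[target] = merge(current, value)   # recurse into objects
            elif isinstance(current, list) and isinstance(value, list):
                result[target] = current + value          # concatenate arrays
            else:
                result[target] = value                    # nothing to extend
        else:
            result[key] = value                           # plain key: override
    return result


def prune(value):
    """Drop None entries from dicts and lists, so conditional entries are easy to write."""
    if isinstance(value, dict):
        return {k: prune(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [prune(v) for v in value if v is not None]
    return value


base = {"metadata": {"name": "web", "labels": {"app": "web"}}, "args": ["--verbose"]}
patch = {"metadata+": {"labels+": {"tier": "frontend"}}, "args+": ["--port=80"], "debug": None}

merged = prune(merge(base, patch))
print(merged)
```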

Yeah similarly I'm using Nix to template K8s templates and I've never looked back. Helm is great for deploying 3rd party applications easily but I've never seen the appeal for using it for in house services, templating YAML is gross indeed.

Preferably written in assembler, to avoid the extra complexity of a compiler, right?

Configuration files have been a common feature of software for as long as OSes have existed, basically. They serve a clear and useful purpose, even though they create some problems of their own.

For complex environments like those discussed in the article, there’s unavoidably complicated logic.

Code is a good place for logic to live.

Compared to yaml, code is more testable, readable and expressible.

I should’ve restricted my original comment to the kind of situation in the article where different configs are created for various regions and test environments with optional values. Totally agree configs are useful for defining more static values.

Restricting config to static values removes quite a bit of the value of config, in my opinion.

Yes, logic should live in code, but very often that logic needs to behave differently depending on some piece of (inherently variable, not static) configuration.

Random examples (written from the perspective of personified code):

- How many threads should I use?

- On which port should I serve metrics?

- Which retry strategy should I use?

By "more static", I meant items with only a handful of variations.

If you're using one port for dev & another for prod I reckon it's best to have it in config.

But if your port is varying by image, region, dev/test/prod status and has exceptions for customers using your app on prem, then keeping all that logic in code may be easier.

Yes.

Except you then have to censor that programming language severely. Maybe you can accept some endless loop, but you probably don't want the CI orchestrator to start mining Monero instead of bootstrapping and configuring servers and services.

A solution to that censorship might be a very limited WASM runtime: one that offers very few APIs and has severely limited resources, timeouts and such. So people can write their orchestration in Python, JavaScript, Rust or even Brainfuck if they want, but what that orchestration can do, for how long it can do it, and how much memory, space and so on it gets, is all very limited.

While that may work, it's far harder to think of than "let's make another {{templating|language}}" inside this YAML that we already have and everyone else uses.

I don't see any practical difference w.r.t. cybersecurity between "I blindly applied this pile of YAML to my production kubernetes clusters without looking at it" and "I blindly downloaded and ran this computer program on my CI runner without looking at it".

A supply chain attack on the former means that your environment is compromised. So does the latter.

I do see a difference.

GitHub actions isn't going to run your Python code on its orchestration infra. Nor is DigitalOcean or Fly.io or CircleCI. They all convened around "YAML" because it's a very limited set of instructions.

I'm quite sure you cannot write a bitcoin miner (or something that opens a backdoor) in Liquid inside YAML in the DSL that Github Actions has. I am 100% sure you can write a bitcoin miner in Python, Javascript, Lua, or any programming language that Github would use to replace their YAML config.

You can still produce JSON output from the Python code and diff it against the previous version, in a similar way to how Atlantis works.
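A minimal sketch of that review step, assuming the generator's output is a plain dict: render both the deployed and the proposed config deterministically, then show a text diff for a human to approve. The function names and the `deployed`/`proposed` labels are made up for the example.

```python
import difflib
import json


def render(config: dict) -> str:
    """Deterministic rendering: sorted keys so diffs only show real changes."""
    return json.dumps(config, indent=2, sort_keys=True) + "\n"


def config_diff(old: dict, new: dict) -> str:
    """Unified diff between two rendered configs; empty string if nothing changed."""
    return "".join(
        difflib.unified_diff(
            render(old).splitlines(keepends=True),
            render(new).splitlines(keepends=True),
            fromfile="deployed",
            tofile="proposed",
        )
    )


deployed = {"name": "web", "replicas": 2}
proposed = {"replicas": 3, "name": "web"}

print(config_diff(deployed, proposed))
```

Whatever the generating code does internally, the reviewer only has to reason about the rendered difference.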

That ugly child doesn't cut it, these are the comments people want:

key1=value1 # this is a proper key1 comment, if you move a line, it stays with key1

key2=value2 # it also doesn't break the table

# and you don't need to write a config parser

# nor modify a syntax highlighter

# nor make sure other people use your comment style

If people absolutely want those, they can use TOML, which supports single line and inline comments.

Given a choice, I'd even opt XML over YAML.

That's fine, you can pick whatever XML ugliness you like, I was just pointing out that you can't solve the basic fail of JSON with comments by making them data

The ironic thing is that, IIRC, k8s manifests were supposed to be machine-generated from the k8s's inception, you weren't supposed to write them by hand... of course, people wrote them by hand anyway, until it became unbearable ― at which point they've started templating them because that's how the things always seem to progress: manually-written text is almost never replaced by machine-generated config-serialized-to-text, it's replaced by templated-but-originally-still-manually-written text.

k8s manifests were supposed to be machine-generated from the k8s's inception,

failed spectacularly at not being inconvenient enough for their intended purpose.

one of those cases where unreadable by design would be a most welcome feature.

"YAML is a superset of JSON" only means that any JSON document is a valid YAML document. It does not mean YAML is equal to JSON.

To add to the sibling comments, after going from a jsonnet-based setup to a Typescript-based one (via pulumi), the biggest thing I missed from jsonnet was the native object merge operations which are very useful for this kind of work as it lets you say "I want one of these, but with these changes" even when the objects are highly nested, and you can specify whether to merge or override for each individual key.

But ultimately this was a minor issue and I think it's far more important that you use something like this (whether a DSL or a mainstream PL) and that you're not trying to do string templating of YAML.

They're various points along the Turing complete config generator vs declarative config spectrum. Declarative config is ideal in lots of ways for mission critical things, but hard to create lots of because of boilerplate.

A turing-complete general purpose language is entirely unconstrained in its ability to generate config, so it's difficult to understand all the possible configs it can generate. And it's difficult to write policy that forbids certain kinds of config to be generated by something like Python. And when you need to do an emergency-rollback, it can be hard to debug a Python script that generates your config.

Starlark is a little better because it's deliberately constrained not to be as powerful as Python.

Jsonnet is, IIUC, basically an open source version of the borgcfg tool they've had at Google forever. My recollection is that Borgcfg had the reputation of being an unreadable nightmare that nobody understood. In practice, of course, people did understand it but I don't think anyone loved working with it.

Brian Grant, the original lead architect of Kubernetes, wrote up his thoughts on various config approaches in this Google doc: https://docs.google.com/document/d/1cLPGweVEYrVqQvBLJg6sxV-T....

I definitely wouldn't use Python because it isn't sandboxed, and users will end up doing crazy things like network calls in your config.

Starlark is a good option though.

People will talk about Jsonnet not being Turing complete, but IMO that is completely irrelevant. Turing completeness has zero practical significance for configs.

I'm a big fan of Python as configuration for my own projects, but:

- It requires discipline for devs to keep the conf declarative. Discipline is not automatically enforceable, so it's prone to failure.

- No guarantee of reproducibility.

- You need a Python VM (or a starlark interpreter if that's what you like). It's a big constraint.

- If you are a Saas provider, accepting Python as input is really hard to secure.

a Dhall configuration file will never:

- throw an exception

- crash or segfault

- accept malformed input

- produce malformed output

- hang or time out

in https://docs.dhall-lang.org/discussions/Safety-guarantees.ht...

Still a fan of Python for configuration?

Yes, because engineering is about context.

I totally agree with you on LLM usage. I have recently switched from JSON to YAML for requests and replies from LLMs (GPT-4 specifically) and I find it much better: fewer tokens used, more readable if you are looking at the http requests and responses and you can parse it on the fly in streaming responses. The last point lets you do visual updates for the user, which is pretty important if you need to wait 1+ minutes for the full response

I'd be very curious to know what kind of previews/streaming YAML applications you are building with LLMs. I have building a v0.dev kind of thing with streaming update on my TODO list.

clarification: YAML and TOML can represent JSON, but the opposite isn't true.

I tried this, it significantly complicated documentation and support after release. Lots more logic handling conflicting cases in two otherwise identical files, etc.

Yes. The answer is a "config.d" directory, this has been known to linux package managers for a long time. It is the only way for multiple packages to contribute to configuration without fighting over ownership of the one true config file.
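The loading side of a "config.d" directory is simple, which is part of its appeal. The sketch below assumes JSON fragments, lexical ordering by file name and a shallow "later file wins" merge; the file names and keys are invented, and real tools each pick their own merge rules.

```python
import json
import tempfile
from pathlib import Path


def load_config_dir(directory: Path) -> dict:
    """Merge every *.json fragment in lexical order; later files override earlier ones."""
    merged: dict = {}
    for path in sorted(directory.glob("*.json")):
        merged.update(json.loads(path.read_text()))
    return merged


with tempfile.TemporaryDirectory() as tmp:
    conf_d = Path(tmp)
    # Each fragment can be owned by a different package or person.
    (conf_d / "10-base.json").write_text('{"port": 8080, "log_level": "info"}')
    (conf_d / "50-metrics.json").write_text('{"metrics_port": 9100}')
    (conf_d / "90-local.json").write_text('{"log_level": "debug"}')
    config = load_config_dir(conf_d)

print(config)
```

The numeric prefixes make the override order explicit, and nobody has to edit a file they do not own.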

I can second cuelang. We started using it at work and it's so nice. Some of the error messages are a little hard to decipher, but that's acceptable because it catches so many errors up front. The few times I have to write yaml directly, it now feels so tedious in comparison.

Jsonnet looks like a case of XKCD-927[0]. I fully agree with you that real programming languages are the way to go for generating anything more complex.

[0] https://xkcd.com/927/