Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Author seems to use misfeatures of a particular implementation to tar all implementations with. The round-tripping issue is not a statement about YAML as a markup language, much in the way a rendering bug in Firefox is not a statement about the web.

Stepping back a bit, YAML is good enough, and this problem has been incrementally bikeshedded since at least the 1970s, it is time to move on. Human-convenient interfaces (like YAML, bash, perl) are fundamentally messy because we are messy. They're prone to opinion and style, as if replacing some part or other will make the high level problem (that's us) go away. Fretting over perfection in UI is an utterly pointless waste of time.

I don't know what NestedText is and find it very difficulty to care, there are far more important problems in life to be concerned with than yet another incremental retake on serialization. I find it hard to consider contributions like this to be helpful or represent progress in any way.



I actually disagree it's bike shedding.

If you can write a bad YAML document because of those mis-features/edge cases, I'd say you've already lost.

Humans are messy, but at the end of the day the data has to go to a program, so a concise and super simple interface has a lot of power to it for humans.

Working at a typical software company with average skill level engineers (including myself), no one likes writing YAML. But everyone is fine with JSON.

I think it's a case of conceptual purity vs what an average engineer would actually want to use. And JSON wins that. If YAML was really better than JSON, we'd all be using that right now.

So does it really matter if YAML is superior if >80% of engineers pick JSON instead?


I would argue that you can write something poor and/or confusing in any markup language that is sufficiently powerful.

Conversely, if a markup language is strict enough to prevent every inconsistency, then it's not powerful enough or too cumbersome to use to be generally useful.


I'd say that YAML is anything but conceptually pure, with all the arbitrariness, multitude of formattin options, and parsig magic happening without warning.

If you want conceptual purity (and far fewer footguns), take Dhall.


> Stepping back a bit, YAML is good enough, and this problem has been incrementally bikeshedded since at least the 1970s, it is time to move on

Nah, in the 1970s we had Lisp S-expressions that completely solved the problem, and everything since then has been regressions on S-expressions due to parenthesis phobia.

After hearing that thing about the country code for Norway, I became convinced that YAML has to just die. Become an ex-markup language. Pine for the fjords. Be a syntax that wouldn't VOOM if you put 4 million volts through it. Join the choir invisible, etc.

This is good: https://noyaml.com/

Erik Naggum had a notoriously NSFW rant about XML (over the top even for him) that I better not link to here, but lots of it applies to YAML as well.


S-expressions don't solve the problem at all, you just get to fractally bikeshed all over again about what semantics they have and what transformations are or aren't equivalent. Does whitespace roundtrip through S-expressions? Who knows. Are numbers in S-expressions rounded to double precision on read/write? Umm, maybe. How do I escape a ) in one of my values? Hoo boy, pick any escape character you like and there's an implementation that does it.


EDN solves all these problems: https://github.com/edn-format/edn


I have to second that. Including the variant "canonical s-expressions" which is in fact a binary format.


Why of course, embedding a full-blown Lisp development environment for parsing a config file is totally sane and normal.

(Sarcasm, just in case.)


S-expressions don’t completely solve the problem: they don’t have a syntax for maps, and in practice there are at least two common incompatible conventions: alist or plist?


Obviously the application has to interpret the Lisp object resulting from reading the S-expression, just like it has to interpret any JSON, YAML, or anything else that it reads. So for maps you can, as you mention, use alists or plists. Regarding other stuff mentioned: none of the encodings are supposed to be bijective (the writer emits the exact input that the reader ingested). Otherwise, for example, they couldn't have comments, unless those ended up in the data somehow. There is ASN.1 DER if you want that, but ASN.1 is generally disastrous.

Stuff like escape chars were well specified in Lisps of the 1970s (at least the late 1970s), including in Scheme (1975). Floating point conversion is a different matter (it was even messier in the pre-IEEE 754 era than now) but I think the alternatives don't handle it well either. You probably have to use hexadecimal representation for binary floats. Maybe decimal floats will become more widely supported on future hardware.

A type-checked approach can be seen in XMonad, whose config files use Haskell's Read typeclass for the equivalent of typed S-expressions.


EDN has maps, sets, vectors and lists and is extendable.


Solutions for this problem that I've used in my own S-expression config files:

1. Use only alists for maps because they prevent off-by-one errors.

2. Allow plists because they're less verbose than alists and use reader macros to distinguish them, and allow the reader macro definitions to be in the same file.

Most of the time I use option 1 because it's simpler.


I would argue that, in a data markup language, there shouldn't be a syntax for maps. Whether a given sequence should be treated as key-value pairs, and whether keys in that sequence are ordered or unordered, is something that is best defined by the schema, just like all other value types.


It'd be bikeshedding if the status quo was good. But it isn't.


> Author seems to use misfeatures of a particular implementation to tar all implementations with.

There's no canonical YAML implementation, and YAML spec is enormous (doubly so if you need to work with stuff like non-quoted strings etc. )


> There's no canonical YAML implementation

The formal grammar counts as canonical and several implementations are derived from it: https://github.com/yaml/yaml-reference-parser


If you use YAML in situations where it may need hand editing, it means you actively hate your users.

YAML is patently unsuitable for any use case where the resulting output may require hand editing.


> YAML as a markup language

YAML ain't markup language.


>Human-convenient interfaces (like YAML, bash, perl) are fundamentally messy because we are messy

I don't know what to make of this statment, it has so much handwaving built-in. The most charitable interpretation I can find is that by 'Human-convenient' you simply meant the quick-and-dirty ideology expressed in Worse Is Better: Does job, makes users contemplate suicide only once per month, isn't too boat-rocking for current infrastructure and tooling.

Taken at face value (without special charitable parsing), this statement is trivially false. Python is often used as a paragon of 'Human-convenience', I sometimes find this trope tiring but whatever Python's merits and vices its _definitely_ NOT messy in design.

Perl is the C++ of scripting languages, it's a very [badly|un] designed language widely mocked by both language designers and users. Lua and tcl instead are languages literally created for the sole exact purpose of (non-) programmers expressing configuration inside of a fixed kernel of code created by other programmers, and look at their design : the whole of tcl's syntax and semantics is a single human-readable sentence, while lua thought it would be funny if 70% of the language involved dictionaries for some reason. These are extremely elegant and minimal designs, and they are brutally efficient and successful at their niches : tcl is EDA's and Network Administration's darling, and lua is used by game artists utterly uninterested in programming to express level design.

'Humans are messy' isn't a satisfactory way to put it. 'Humans love simple rules that get the job done' is more like it. But because the world is very complex and exception-laden, though, simple rules don't hug its contours well. There are two responses to this:

- you can declare it a free-for-all and just have people make up simple rules on the fly as situations come up, that's the Worse Is Better approach. It doesn't work for long because very soon the sheer mountain of simple rules interact and create lovecraftian horrors more complex than anything the world would have thrown at you. Remember that the world itself is animated by extremely simple rules (Maxwell's equations, Evolution by Natural Selection, etc...), it's the multitude and interaction of those simple rules that give it its gargantuan complexity and variety.

- you stop and think about The One Simple Rule To Rule All Rules, a kernel of order that can be extended and added to gradually, consistently and beautifully.

The first approach can be called the 'raster ideology', it's a way of approximating reality by dividing it into a huge part of small, simple 'pixels' and describing each one seperately by simple rules. I'm not sure it's 'easy' or 'convenient', maybe seductive. It promises you can always come up with more rules to describe new patterns and situations, and never ever throw away the old rules. This doesn't work if your problem is the sheer multitude and inconsistency of rules. The second approach is the 'vector ideology', it promises you that there is a small basis of simple rules that will describe your pattern in entirety, and can always be tweaked or added to (consistently!) when new patterns arise, the only catch is that you have to think hard about it first.


>and lua is used by game artists utterly uninterested in programming to express level design

Rather short sighted and dismissive to a successful programming language that's evolved over 20+ years. Lua is a great general purpose programming language that specializes not in "game making for non-programmers" but in ease of embedding, extension/extensability, and data description (like a config language). There's a whole section in Programming in Lua[1] to that effect. The fact that it's frequently used in games is credit to it's speed, size and great C API for embedding, not because of any particular catering to game designers.

[1]: https://www.lua.org/pil/10.1.html


You misunderstood me. I love lua and I wasn't being dismissive of it, I was using the first example that came to my mind to counter the claim that a convenient language has to be messy. Just because that was the example used doens't mean there is an implicit "and that's the only thing it's good for" clause I'm implying there: if someone said "Python is used by scientists utterly uninterested in programming to express numerical algorithms" would you understand that to be a dismissive remark against Python ?

Being used by non-programmers utterly uninterested in programming to solve problems is the highest honor any programming language can ever attain, because it means that the language is well-suited to the domain enough (or flexible enough to be made so) that describing problems in it is no different than writing thoughts or design documents in natural language. This is the single most flattering thing you can ever say about a language, not a dismissive remark.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: