Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's really sad to see the pervasiveness of JSON. For one thing its usage as a config file is disturbing. Config files need to have comments. Second, even as a data transfer format the lack of schema is even more disturbing. I really wish JSON didn't happen and now these malpractices are so widespread that it's hurting everyone.


JSONC. JSON with comments. And even if your favorite parser does not support it natively it’s not so hard to add with a very simple pre-lexer step.

JSON schemas exist and they’re ok for relatively simple things. For more complex cases I find myself wishing I could just turn Typescript into some kind of schema validation for JSON.


> For more complex cases I find myself wishing I could just turn Typescript into some kind of schema validation for JSON.

Not sure if this is what you're looking for, and whether it's powerful and expressive enough for your use case, but you can use typescript-json-schema¹ for this, and validate with eg ajv.

¹https://github.com/YousefED/typescript-json-schema


I like JSON5 for similar reasons. I specifically like the addition of comments, trailing commas, and keys without quotes.


I've struggled with this in Java recently and at first I used Jankson which supports the complete JSON5 spec, but later we figured out we could configure the standard Jackson JSON package to accept the things we actually need and actually use.


Also needed is string concatenation. One line strings are very limiting.


There's libraries that let you define a schema programmatically, and then infer the types.

https://github.com/sinclairzx81/typebox


Seems to me that YAML just needs type/schema support to be less of a hurdle.

As an alternative, the encoding/decoding roundtrip using protobuf seems reasonable to me, catches the footgun of using floating-point version numbers (it becomes a parse error), whitespace/multiline concatenation being more obvious, and allowing comments (compared to JSON):

  ( cat << EOF
  # yes, comments are allowed
  name: "Python package"
  on: "push"
  build {
    python_version: ["3.6", "3.7", "3.8", "3.9", "3.10"]
    steps: [
      {
        name: "Install dependencies"
          run:
            "python -m pip install --upgrade pip\n"
            "pip install pytest\n"
            "if [ -f 'requirements.txt' ]; then pip install -r requirements.txt; fi\n"
      },
      {
        name: "Test with pytest"
        run: "pytest\n"
      }
    ]
  }
  EOF
  ) | protoc --encode=Config config.proto  | protoc --decode=Config config.proto
  
  name: "Python package"
  on: "push"
  build {
    python_version: "3.6"
    python_version: "3.7"
    python_version: "3.8"
    python_version: "3.9"
    python_version: "3.10"
    steps {
      name: "Install dependencies"
      run: "python -m pip install --upgrade pip\npip install pytest\nif [ -f \'requirements.txt\' ]; then pip   install -r requirements.txt; fi\n"
    }
    steps {
      name: "Test with pytest"
      run: "pytest\n"
    }
  }


> Seems to me that YAML just needs type/schema support to be less of a hurdle.

JSON schemas exist and can be applied to yaml and this is supported by many editors. For example this vscode extension: https://marketplace.visualstudio.com/items?itemName=redhat.v...

It's strange to see so many complains about "missing tooling" that actually exists and is well supported.


> Seems to me that YAML just needs type/schema support to be less of a hurdle.

Unfortunately YAML already got type support, which made it easier to roundtrip, but also insecure. Creating a type calls constructors with possible insecure side effects. Which was eg used to hack Movable Type.


JSON Schema is an official thing that exists and has implementations in all major languages. Personally I’m very glad that it’s an opt-in addition rather than a requirement.

(I agree with you about comments though)


For comments just use JSONC.


I agree, but I would recommend JSON5 as the solution. Not YAML or this abomination.

JSON5 has many advantages:

* Superset of JSON without being wildly different. I know YAML is a superset of JSON but it's completely different too. Insane.

* Unambiguous grammar. YAML has way too many big structure decisions that are made by unclear and minor formatting differences. My work's YAML data is full of single-element lists that shouldn't be lists for example.

* Comments, trailing commas

* It's a subset of Javascript so basically nothing new to learn.

* It has an unambiguous extension (.json5). I think JSONC would be a reasonable option but everyone uses the same extension as JSON (.json) so you can never be sure which you are using. E.g. `tsconfig.json` is JSONC but `package.json` is just JSON (to everyone's annoyance).

* Doesn't add too much of Javascript. I wouldn't recommend JSON6 because it's just making the format too complicated for little benefit.


I would rather recommend jsonc:

- it has good editor support (VsCode) - has comments support - Support jsonschema

Only thing missing is trailing commas, but i would rather live without trailing commas than tooling support


JSONC supports trailing commas.

> - it has good editor support (VsCode)

Unfortunately it doesn't really because of the extension issue I mentioned. Certain file names (like `tsconfig.json`) are whitelisted to have JSONC support, but any random file `foo.json` will be treated as JSON and give you annoying lints if you put comments and trailing commas in.

That's a fairly recent change I think.


Tools that use JSON as configuration format could simply allow certain unused keys (e.g. all keys starting with #) and promise never to use them. Then author can write their comments with something like:

    {
      "name": "my-tool",
      "#comment-1": "Don’t change the version!",
      "version": "42.1337.0"
    }


There's a lot of JSON tooling, and it's liable to interact badly with this. For example, a formatter might re-order the fields of a dict, moving "#comment-1" away from "version". Or the software that this JSON is for might error upon receiving unexpected keys (which is actually useful behavior, as that would catch a typo in an optional field).

Also, this doesn't let you put comments at the top of the file, or before a list item, or at the end of a line.

If you're going to change your JSON tooling to handle comments of some kind, you might as well go all the way to JSONC.


I've heard and read this multiple times. Why are you trying so hard to fit into a format that doesn't just support comments out of the box? What advantages is JSON offering you that you've compelled to bend over backwards to do this? It's exactly these kinds of workarounds that is making it super difficult stop such malpractices. It's just plain ugly. Please stop doing this.


In many cases, you're using a library or service that you don't maintain, so you don't have much of a choice.


You can't comment out a large section of config easily. For me, this is a relatively common use case for config files, so I take the position that JSON should be used for serialization only.


And I am just writing a JSON de/serializer to move my config from the current system to JSON. I worked on it today and yesterday and several days some time ago.

This situation makes me feel rather silly


So you prefer the "good old" XML days? I'll take comment-less JSON over XML any day

(and it doesn't have to be comment-less... JSON with comments is a thing and VSCode has syntax highlighting for it - just strip out the comments before parsing).


> So you prefer the "good old" XML days? I'll take comment-less JSON over XML any day

Aren't we past basic false dichotomies?


Nope: basic false dichotomies and JSON are both pervasive.


There is a corporate- and government-approved standard for false dichotomies, but it works as a de-facto standard, not published.


   ifnot:
    - foo
   then_clearly:
    -
       'some bar'


XML is perfect. + With all the fancy editors now its very easy to write. Easy schema to check, comments. Perfect.


Disclaimer: this is not a defense for YAML, I'm just trying to remove the rose tinted glasses some people view XML configs through.

As someone who has used XML configs they have a few problems:

- technical: missing comments are mentioned multiple times here so I will mention that while XML has comments they cannot be nested.

- socially: for some reason (maybe because XML is structured enough that this doesn't immediately collapse?) XML tends to just grow and grow. People start programming in XML too, and not only using XSLT or other standard approaches but also in completely proprietary ways.

At one project someone even wrote an authorization framework in Apache Tiles which allowed one to create roles using somewhere between 600 and 5000 lines of XML pr role. The benefit was of course that you could update the roles without touching the Java code.

(In case it isn't immediately obvious: it would have been extremely much simpler to edit it in Java, and people who know enough Java to fix it are available at the right price, the XML system had to be learned at work.)

Personally I just want it to be kept simple:

- a settings.local.ini and default settings in settings.ini or something to that effect

- if necessary, just use a code file: config.ts works just as well, or config.js if it needs to be adjustable at runtime without transpilation.


not easy to read, it's the java of config, pages of code that express very little, by the time you find what you need, you forget the context and what level of nesting you're on already. It's also more wasteful as a transport.


> It's also more wasteful as a transport.

This is most certainly true, however with GZip thrown into the mix, it's not quite as bad as one might imagine: https://www.codeproject.com/articles/604720/json-vs-xml-some...

It compresses pretty decently and doesn't have too much of an overhead, in the example it being around 10% larger than JSON when compressed.

I'd argue that if one were to swap out JSON for XML within all the requests that an average webpage needs for some unholy reason, the overall increase in page size would be much less than that, because huge amounts of modern sites are images, as well as bits of JS that won't be executed but also won't be removed because our tree shaking isn't perfect.

Edit: as someone who writes a good deal of Java in their dayjob, i feel like commenting about the verbosity of XML might be unwelcome. I'll only say that in some cases it can be useful to have elements that have been structured and described in verbose ways, especially when you don't have the slightest idea about what API or data you're looking at when seeing it for the first time (the same way how WSDL files for SOAP could provide discoverability).

However, it all goes downhill due to everything looking like a nail once you have a hammer - most of the negative connotations with XML in my mind actually come from Java EE et al and how it tried doing dynamic code loading through XML configuration (e.g. web.xml, context.xml, server.xml and bean configuration), which was unpleasant.

On an unrelated note, XSD is the one truly redeeming factor of XML, the equivalent of which for JSON took a while to get there (JSON Schema). Similarly, WSDL was a good attempt, whereas for JSON there first was WADL which didn't gain popularity, though at least now OpenAPI seems to have a pretty stable place, even if the tooling will still take a while to get there (e.g. automatically generating method stubs for a web API with a language's HTTP client).


You mean something like https://pyotr.readthedocs.io


Thanks for the link, but not necessarily.

How WSDL and the code generation around it worked, was that you'd have a specification of the web API (much like OpenAPI attempts to do), which you could feed into any number of code generators, to get output code which has no coupling to the actual generator at runtime, whereas Pyotr is geared more towards validation and goes into the opposite direction: https://pyotr.readthedocs.io/en/latest/client/

The best analogy that i can think of is how you can also do schema first application development - you do your SQL migrations (ideally in an automated way as well) and then just run a command locally to generate all of the data access classes and/or models for your database tables within your application. That way, you save your time for 80% of the boring and repetitive stuff while minimizing the risks of human error and inconsistencies, with nothing preventing you from altering the generated code if you have specific needs (outside of needing to make it non overrideable, for example, a child class of a generated class). Of course, there's no reason why this can't be applied to server code either - write the spec first and generate stubs for endpoints that you'll just fill out.

Similarly there shouldn't be a need for a special client to generate stubs for OpenAPI, the closest that Python in particular has for now is this https://github.com/openapi-generators/openapi-python-client

However, for some reason, model driven development never really took off, outside of niche frameworks, like JHipster: https://www.jhipster.tech/

Furthermore, for whatever reason formal specs for REST APIs also never really got popular and aren't regarded as the standard, which to me seems silly: every bit of client code that you write will need a specific version to work against, which should be formalized.


> model driven development never really took off

same as to why REST is now not a hot thing anymore, the idea that your API is just a dumb wrapper around data model is poor api design.

API-driven development didn't really took off either, that is write your spec in grpc/OpenAPI and have the plumbing code generated in both ends. It's technically already there with various tools, but because of dogma like "code generation is bad", quality of code generators, or whatever reason, we're still writting "API code"


Well, in Python, code generation is an anti-pattern.


> Well, in Python, code generation is an anti-pattern.

Hmm, i don't think that i've ever heard of this. Would you care to provide any sources, since that sounds like an interesting stance to take?

So far, it seems like frameworks like Django don't have an issue with CLI tools to generate bits of code, i.e. https://docs.djangoproject.com/en/3.2/intro/tutorial01/

  If this is your first time using Django, you’ll have to take care of some initial setup. Namely, you’ll need to auto-generate some code that establishes a Django project – a collection of settings for an instance of Django, including database configuration, Django-specific options and application-specific settings.
  
  $ django-admin startproject mysite
Similarly, PyCharm doesn't seem to have an issue with offering to generate methods for classes (ALT + INSERT), such as override methods (__class__, __init__, __new__, __setattr__, __eq__, __ne__, __str__, __repr__, __hash__, __format__, __getattribute__, __delattr__, __sizeof__, __reduce__, __reduce_ex__, __dir__, __init__), implementing methods, generating tests and copyright information.

I don't see why CLI tools would be treated any differently or why code generation should be considered an anti-pattern since it's additive in nature and is entirely optional, hence asking to learn more.


First of all, just because a tool or project uses a pattern, it doesn't mean that it's a good idea. Second, code generation as part of IDE or one-time setup is something else.

I need to clarify: when I say that "code generation" is an anti-pattern, I'm talking about the traditional, two-step process where you generate some code in one process, and then execute it in another. But Python works really well with a different type of "code generation".

Someone once said that the only thing missing from Python is a macro language; but that is not true - Python has its own macro language, and it's called Python.

Python is dynamically evaluated and executed, so there is no reason why we need two separate steps when generating code dynamically; in Python, the right way is not to dynamically construct the textual representation of code, but rather to dynamically construct runtime entities (classes, functions etc), and then use them straight away, in the same process.

Unless you're dynamically building hundreds of such constructs (and if you do you have a bigger problem), any performance impact is negligible.


> Someone once said that the only thing missing from Python is a macro language

Ahh, then it feels like we're talking about different things here! The type of code generation that i was talking about was more along the lines of tools that allow you to automatically write some of the repetitive boilerplate code that's needed for one reason or another, such as objects that map to your DB structure and so on. Essentially things that a person would have to do manually otherwise, as opposed to introducing preprocessors and macros.

For a really nice example of this, have a look at the Ruby on Rails generators here: https://medium.com/@simone.catley/ruby-on-rails-generators-a...


>you forget the context

Wait, it the opposite. XML is designed to indicate context, and JSON is designed to hide context, you have a bunch of braces in place of context there, no matter where you are it's braces all the way down, like lisp.


not really, what enables you to have the context is shorter code. It's useless to have context reminders at the top and bottom of the thing, but not the middle and it's too damn long


For me XML and YAML are about the same. I think I'd also prefer comment-less JSON over both. However, XML wasn't that bad. With a decent editor and schema validation I would say there's a good chance I was more productive with XML than I am with YAML.


It's simple. For config files, choose the format that has the best tooling in your company and that supports comments. For data transfer, choose that supports schemas, backwards compatibility and good tooling (protobufs is just one e.g. that I'm most familiar with).


Actually, yes, I do. XML syntax was far from stellar, and much of the ecosystem (e.g. XML Schema) was drastically overengineered... but even so, we had gems like RELAX NG to compensate. On the whole, it was better than the current mess.


So you prefer the "good old" XML days? I'll take comment-less JSON over XML any day

Sure, why not? XML rocks. I'll take it over JSON for many purposes.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: