
As someone that's written a whole lot of code parsing both complex XML and JSON, I'd go with a more restrictive JSON format over a more idiomatically correct and elegant (from the data perspective) XML format any day. Complex XML sucks for storing structured data unless it's as restrictive as a JSON document, and then...

The simple use case for XML is always easy, but then it always ends up looking like this:

  <step>prepare fruit
    <step>prepare <ing variety="bartlet anjou comice">pear slices</ing> from a <ing state="unprepped">pear</ing> 
    <step>wash</step>
    <step>trim
       <step>remove stem</step>
       <step>peel</step>
    </step>
    <step>
      <step thickness=".25mm">slice</step>
     </step>
  </step>
  <step>...

Plain text is great and all as a display format but it sucks even more than XML to parse as a data format.

You can make JSON that's just as stupid as XML, but XML invites a lot of complexity for a little more expressiveness, especially if you have people hand-writing it. If you need to, you can always put flatter XML markup in JSON fields to avoid the large-scale recursive structural insanity when parsing.
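
To make the "flat XML in JSON fields" idea concrete, here is a minimal sketch in Python using only the standard library. The field names ("steps", "text", "substeps", "thickness") and the wrapper tag <t> are assumptions made up for illustration, not any real recipe schema. The nesting lives in JSON, and each text field holds only a flat fragment with inline <ing> tags.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical document: JSON carries the nesting, XML is only used inline.
RECIPE_JSON = """
{
  "steps": [
    {
      "text": "prepare <ing state='unprepped'>pear</ing>",
      "substeps": [
        {"text": "wash"},
        {"text": "slice", "thickness": ".25mm"}
      ]
    }
  ]
}
"""

def parse_fragment(fragment):
    """Return (plain_text, ingredients) for one flat XML fragment."""
    # Wrap the fragment in a throwaway root so mixed text/markup parses.
    root = ET.fromstring(f"<t>{fragment}</t>")
    plain = "".join(root.itertext())
    ings = [(el.text, dict(el.attrib)) for el in root.iter("ing")]
    return plain, ings

def walk(steps, depth=0):
    """Yield (depth, plain_text, ingredients) for every step, depth-first."""
    for step in steps:
        plain, ings = parse_fragment(step["text"])
        yield depth, plain, ings
        yield from walk(step.get("substeps", []), depth + 1)

recipe = json.loads(RECIPE_JSON)
rows = list(walk(recipe["steps"]))
```

The recursion only ever follows JSON lists, so the XML parser never sees anything deeper than one fragment.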



My gut feeling is that there's something going on here with dueling priorities between (A) the best editing experience with a plain text editor vs. (B) the clearest storage format. This leads to things like "too much inlining" or "too much duplication".

In contrast, imagine relaxing the everything-in-notepad requirement, imagine a renderer that can easily display cross-referenced materials in a readable way. Or a step beyond that, an editor which also gives you "jump to definition" etc.

That change permits a much more internally-consistent XML file, such as one where "materials" and "steps" are separate sections, and any step can reference a material that is being used as input or output, with something like <mat_ref id="sliced_uncooked_apples"/> .
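
A rough sketch of what resolving those references could look like, in Python with the standard library. Only <mat_ref id="..."/> comes from the description above. The <recipe>, <materials>, <material>, <steps>, <input>, <output> and <text> element names are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: materials declared once, steps point at them by id.
DOC = """
<recipe>
  <materials>
    <material id="apple">apple</material>
    <material id="sliced_uncooked_apples">sliced uncooked apples</material>
  </materials>
  <steps>
    <step>
      <input><mat_ref id="apple"/></input>
      <output><mat_ref id="sliced_uncooked_apples"/></output>
      <text>peel and slice</text>
    </step>
  </steps>
</recipe>
"""

root = ET.fromstring(DOC)

# Build the id -> display name table from the materials section.
materials = {m.get("id"): m.text for m in root.find("materials")}

def resolve(step, role):
    """Look up every mat_ref under <input> or <output> of a step."""
    # A dangling id raises KeyError, which doubles as a consistency check.
    return [materials[ref.get("id")] for ref in step.findall(f"{role}/mat_ref")]

resolved = [
    {
        "text": step.findtext("text"),
        "inputs": resolve(step, "input"),
        "outputs": resolve(step, "output"),
    }
    for step in root.find("steps")
]
```

A renderer or editor with "jump to definition" would be doing essentially this lookup, just interactively.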


I see the value in using XML for simple markup and standardized entities/references, but the flexibility makes navigating whole documents more cumbersome. I think that using it inline is a good idea, but above the paragraph level I don’t see the benefit of using it at all. Even in the supremely consistent world of open doc XML, parsing is a bear of a task. For something like this requiring a fraction of the complexity, it should either force more internal structure (XML markup in JSON fields representing ingredient lists, etc.) or just decide it’s for presentation only and go with HTML or RTF.

I probably also have a different perspective on both of these topics than most. I’ve done a lot of automated document work, and I was also a chef, so I’ve got a more structured, less prosaic approach to recipes.



