
Generated lexers are still worth it but handwritten RD won on the parsing front. No other historical debate gets this Weekend at Bernie's treatment the way parser generators do. Is it from dated undergrad + research material? There must be a reason all the references and approaches in this article are decades old and parser combinators get a single dismissive footnote. Pretty sure that Guy Steele quote even predates PEGs.

Bison and YACC are a mess. Sure, go and complicate your build by introducing code gen steps and ancient spaghetti DSLs with terrible tooling and an extra learning curve. Anything to avoid writing a readable parser in the same language as the rest of your compiler. The amount of ink spilled over parsing and parser generators is phenomenal when you consider how trivial parsing is compared to the other compilation phases. Look at source code for large compilers and you'll find the handwritten parser is some of the easiest code to understand in the project. Lack of left recursion is a trivial limitation in practice.
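
To illustrate the left recursion point, here is a minimal sketch of the usual workaround in a handwritten recursive descent parser: the left-recursive rule `expr -> expr '-' term` becomes a loop that folds operands to the left. The token set, the tuple-shaped tree and the names `tokenize`, `Parser` and `parse` are all made up for this example and not taken from any particular compiler.

```python
import re


def tokenize(src):
    # Integers, the four operators and parentheses; whitespace is skipped.
    return re.findall(r"\d+|[-+*/()]", src)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # Left-recursive grammar rule: expr -> expr ('+' | '-') term | term
        # Rewritten as iteration:      expr -> term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            # Folding into the left operand keeps the tree left-associative.
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.atom()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.atom())
        return node

    def atom(self):
        tok = self.advance()
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this sketch, `parse("1 - 2 - 3")` yields `("-", ("-", 1, 2), 3)`, so subtraction still associates to the left even though no rule in the code is left-recursive.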



I think the parsing debate is just one example of a spectrum that I've noticed a lot of developers (or indeed people in general) lie along. At one end are those who love complexity, abstraction and generality and who indulge heavily in theory; at the other end are those who want simple, practical solutions even if the opposite end would call them "hacks". The former are obviously the ones advocating for parser generators and all of the theory behind them, while the latter go straight for handwritten RD.

> Is it from dated undergrad + research material?

If one thinks that creating research material is the goal, then that's what happens in abundance.

> Lack of left recursion is a trivial limitation in practice.

The same goes for context-sensitivity.


The front end of compilers gets a lot of attention because the problem looks complex but is easy in practice, which is why so much material tries to teach it.

On that note, what resources would you suggest to become familiar with the backends of a compiler? The CS classes I had mostly had a frontend focus with the backend only being covered at a high level and without delving too deep into the implementation details.


I have a similar struggle and find you are largely limited to learning piecemeal. The best learning resources I have found are the technical docs, specifications, proposals and source code of major language projects and VMs, along with research languages and their associated papers, and conference talks by compiler authors (Rust has some good ones).


> The amount of ink spilled over parsing and parser generators is phenomenal when you consider how trivial parsing is compared to the other compilation phases.

Trying to find good resources for implementing type checking was a pain when I was doing my MSc project. Maybe it's partly due to how few projects get to serious stages like type checking or optimization passes.


> Generated lexers are still worth it but handwritten RD won on the parsing front.

Packrat parsers make a very compelling case. I'm not sure the debate is over.
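
For anyone who hasn't seen one, a packrat parser is essentially a PEG-style recursive descent parser that memoizes each (rule, position) result, so a backtracking ordered choice never re-parses the same input twice. Below is a rough sketch under made-up assumptions: a toy grammar over single-character operators with no whitespace, rule functions named `number`, `atom`, `product` and `total`, and `functools.lru_cache` standing in for the memo table. It evaluates the expression directly instead of building a tree.

```python
from functools import lru_cache


def parse(text):
    # Each rule takes a position and returns (value, next_pos) or None.
    # lru_cache keyed on the position plays the role of the packrat table.

    @lru_cache(maxsize=None)
    def number(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    @lru_cache(maxsize=None)
    def atom(pos):
        # atom <- number / '(' total ')'
        result = number(pos)
        if result is not None:
            return result
        if pos < len(text) and text[pos] == "(":
            inner = total(pos + 1)
            if inner is not None:
                value, nxt = inner
                if nxt < len(text) and text[nxt] == ")":
                    return value, nxt + 1
        return None

    @lru_cache(maxsize=None)
    def product(pos):
        # product <- atom '*' product / atom
        left = atom(pos)
        if left is not None and left[1] < len(text) and text[left[1]] == "*":
            rest = product(left[1] + 1)
            if rest is not None:
                return left[0] * rest[0], rest[1]
        # Second alternative: atom(pos) is served from the memo table.
        return atom(pos)

    @lru_cache(maxsize=None)
    def total(pos):
        # total <- product '+' total / product
        left = product(pos)
        if left is not None and left[1] < len(text) and text[left[1]] == "+":
            rest = total(left[1] + 1)
            if rest is not None:
                return left[0] + rest[0], rest[1]
        # Second alternative: product(pos) is served from the memo table.
        return product(pos)

    result = total(0)
    if result is None or result[1] != len(text):
        raise SyntaxError("could not parse input")
    return result[0]
```

The second alternative of each choice retries a rule at a position that was already parsed, which is exactly where the memo table pays off. The cost is memory proportional to input length times the number of rules.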



