Hacker News

But that doesn't conform to the "Descent Principle" described in the article.

I haven't really been following Zig, but I still felt slightly disappointed when I learnt that they were just replacing a source-based bootstrapping compiler with a binary blob that someone generated and added to the source tree.

The thing that makes me uncomfortable with that approach is that if a certain kind of bug (or virus! [0]) is found in the compiler, you may have to fix the bug in multiple versions to rebootstrap, in case the bug (or virus!) manages to persist itself into the compilation output. The Dozer article talks about the eventual goal of removing all generated files from the rustc source tree, i.e. undoing what Zig recently decided to do.

If everything is reliably built from source, you can just fix any bugs by editing the current source files.

[0] https://wiki.c2.com/?TheKenThompsonHack
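
To make the persistence worry concrete, here is a toy Python model (not how any real compiler works). A "binary" is just a string, and `MARK`, `make_compiler` and `rebootstrap` are hypothetical names invented for illustration. The assumption is a Thompson-style payload that re-adds itself whenever an infected binary compiles compiler source.

```python
MARK = "/*backdoor*/"  # hypothetical marker standing in for a hidden payload


def make_compiler(binary: str):
    """Return the compile function that a given 'binary' embodies."""
    infected = MARK in binary

    def compile_(source: str) -> str:
        out = "BIN(" + source + ")"
        # Assumption: the payload recognises compiler source and
        # reinserts itself into the output, whatever the source says.
        if infected and "compiler" in source:
            out += MARK
        return out

    return compile_


def rebootstrap(seed_binary: str, source: str, rounds: int = 3) -> str:
    """Repeatedly rebuild the compiler from `source`, starting from a seed binary."""
    binary = seed_binary
    for _ in range(rounds):
        binary = make_compiler(binary)(source)
    return binary


clean_source = "compiler v2 source"  # the bug is already fixed in source
infected_blob = "BIN(compiler v1 source)" + MARK
trusted_seed = "BIN(compiler v1 source)"

print(MARK in rebootstrap(infected_blob, clean_source))
print(MARK in rebootstrap(trusted_seed, clean_source))
```

In this model, rebuilding clean source with the infected blob never gets rid of the payload, no matter how many rounds you run. Only starting from a seed you trust does.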



I think there is too much mysticism here in believing that the bootstrapping phases will offer any particular guarantees. Without essentially a formal proof that the output of the compiler is what you expect, you will have to manually inspect every aspect of every output phase of any bootstrapping process.

OK, so you decide to use CompCert C. You now have a proof that your object code is what your C code asked for. Do you have a formal proof of your C code? Have you proved that you have not allowed any surprises? If not, what is your Rust compiler? Junk piled on top of junk, from this standpoint.

On the other hand, you could have a verified WASM (or other VM) runner. That verified runner could run the output of a formally verified compiler (which rustc is not). The trusted base is actually quite small if you have a fully specified language with a verified compiler. But you have to start with that trusted base, and something like a compiler written in C is not really enough to get you there.

Oh, and why do we trust QBE?


> Without essentially a formal proof that the output of the compiler is what you expect, you will have to manually inspect every aspect of every output phase of any bootstrapping process.

And why would it be easier to manually inspect (prove correct) the output of every phase than to manually inspect (prove correct) the source code? Compiled code often loses important information about code structure and how abstractions are used, and it has been transformed by optimisations, etc.

I usually trust my ability to understand source code better than my ability to understand the compiled code.


But you cannot trust the compiler, you said.


That's not what I said. I've implied that it's hard to trust the output of some unknown compiler (e.g. the "zig1.wasm" blob) and that it's easier to trust source code.

The Dozer article explains, under "The Descent Principle", how rustc will eventually be buildable using only source code [0] (other than a "512-byte binary seed" which implements a trivial hex interpreter). You still need to trust a computer to run everything on, though in theory it should be possible to gain trust by running it on multiple computers and checking that the result is the same (this is why any useful system bootstrapping project should also be reproducible [1]).

[0] https://github.com/fosslinux/live-bootstrap

[1] https://bootstrappable.org/best-practices.html
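
As a rough sketch of the "build on multiple computers and compare" idea, in Python: hash the artifact each machine produced and check that all digests agree. The machine names and artifact bytes below are made up for illustration, and `check_reproducible` is a hypothetical helper, not part of any bootstrapping project.

```python
from __future__ import annotations

import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a build artifact."""
    return hashlib.sha256(data).hexdigest()


def check_reproducible(builds: dict[str, bytes]) -> tuple[bool, dict[str, str]]:
    """Return (all_match, per-machine digests) for independently built artifacts."""
    digests = {machine: sha256_hex(artifact) for machine, artifact in builds.items()}
    all_match = len(set(digests.values())) == 1
    return all_match, digests


# Made-up artifacts: two machines agree, one differs.
builds = {
    "machine-a": b"\x7fELF...compiler",
    "machine-b": b"\x7fELF...compiler",
    "machine-c": b"\x7fELF...compiler-with-surprise",
}

ok, digests = check_reproducible(builds)
for machine, digest in sorted(digests.items()):
    print(machine, digest[:16])
print("reproducible:", ok)
```

A mismatch doesn't tell you which machine is wrong, only that at least one build differs, which is why this only helps if the build is bit-for-bit reproducible in the first place.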


The even more immediate objection is that a binary blob is the opposite of portable?!


In this case it is portable, because the Zig compiler source tree includes an interpreter for the blob (WASM) in portable C.

It's not objectionable to have non-portable source code anyway. I think it's fine having architecture-specific assembly code, just as long as it's hand-written.

The problems arise when you're storing generated content in the source repository, because it becomes unclear how you're meant to understand and fix the generated content. In this case it seems like the way to fix it is by rerunning the compiler, but if running the compiler involves running this incorrect blob, it's not clear that running the compiler again will produce a correct blob.

I wonder if anyone is monitoring these commits in Zig to ensure that the blobs are actually generated genuinely, since if not it seems like an easy way for someone to inject a KTH (Ken Thompson Hack): https://github.com/ziglang/zig/commits/master/stage1/zig1.wa...



