But that doesn't conform to the "Descent Principle" described in the article. I ...

projektfu · on Aug 26, 2024

I think there is too much mysticism here in believing that the bootstrapping phases will offer any particular guarantees. Without essentially a formal proof that the output of the compiler is what you expect, you will have to manually inspect every aspect of every output phase of any bootstrapping process.

OK, so you decide to use Compcert C. You now have a proof that your object code is what your C code asked for. Do you have a formal proof of your C code? Have you proved that you have not allowed any surprises? If not, what is your Rust compiler? Junk piled on top of junk, from this standpoint.

On the other hand, you could have a verified WASM (or other VM) runner. That verified runner could run the output of a formally verified compiler (which Rustc is not). The trusted base is actually quite small if you had a fully specified language with a verified compiler. But you have to start with that trusted base, and something like a compiler written in C is not really enough to get you there.

Oh, and why do we trust QBE?

maxdamantus · on Aug 27, 2024

> Without essentially a formal proof that the output of the compiler is what you expect, you will have to manually inspect every aspect of every output phase of any bootstrapping process.

And why would it be easier to manually inspect (prove correct) the output of every phase than to manually inspect (prove correct) the source code? The compiled code will often lose important information about code structure, how abstractions are used, include optimisations, etc.

I usually trust my ability to understand source code better than my ability to understand the compiled code.

projektfu · on Aug 27, 2024

But you cannot trust the compiler, you said.

maxdamantus · on Aug 27, 2024

That's not what I said. I've implied that it's hard to trust the output of some unknown compiler (eg, the "zig1.wasm" blob) and that it's easier to trust source code.

The Dozer article explains, under "The Descent Principle", how rustc will eventually be buildable using only source code [0] (other than a "512-byte binary seed" which implements a trivial hex interpreter). You still need to trust a computer to run everything on, though in theory it should be possible to gain trust by running it on multiple computers and checking that the result is the same (this is why any useful system bootstrapping project should also be reproducible [1]).

[0] https://github.com/fosslinux/live-bootstrap

[1] https://bootstrappable.org/best-practices.html

qu1j0t3 · on Aug 27, 2024

The even more immediate objection is that a binary blob is the opposite of portable?!

maxdamantus · on Aug 27, 2024

In this case it is portable, because the Zig compiler source tree includes an interpreter for the blob (WASM) in portable C.

It's not objectionable to have non-portable source code anyway. I think it's fine having architecture-specific assembly code, just as long as it's hand-written.

The problems arise when you're storing generated content in the source repository, because it becomes unclear how you're meant to understand and fix the generated content. In this case it seems like the way to fix it is by rerunning the compiler, but if running the compiler involves running this incorrect blob, it's not clear that running the compiler again will produce a correct blob.

I wonder if anyone is monitoring these commits in Zig to ensure that the blobs are actually generated genuinely, since if not it seems like an easy way for someone to inject a KTH (Ken Thompson Hack): https://github.com/ziglang/zig/commits/master/stage1/zig1.wa...