Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's a solved problem. Paul Karger, who invented the attack and concept in the 1970's, immediately worked with others to solve it with rigorous methods called high-assurance security. Far as this problem, it's mainly a problem of people you trust reviewing it, it getting distributed to you, and you verifying you got what they reviewed. With most distro's, it boils down to that since you have to trust millions of lines of code (maybe privileged) in the first place. SCM security of a trusted repo becomes the solution. Wheeler covers SCM security here:

https://www.dwheeler.com/essays/scm-security.html

Now, let's say you want to know the compiler isn't a threat. That requires you to know that (a) it does its job correctly, (b) optimizations don't screw up programs esp removing safety checks, and (c) it doesn't add any backdoors. You essentially need a compiler whose implementation can be reviewed against stated requirements to ensure it does what it says, nothing more, nothing less. That's called a verified compiler. Here's what it takes assuming multiple, small passes for easier verification:

1. A precise specification of what each pass does. This might involve its inputs, intermediate states, and its outputs. This needs to be good enough to both spot errors in the code and drive testing.

2. An implementation of each pass done in as readable a way possible in the safest, tooling-assisted language one can find.

3. Optionally, an intermediate representation of each pass side-by-side with the high-level one that talks in terms of expressions, basic control flow (i.e. while construct), stacks, heaps, and so on. The high-level decomposed into low-level operations that still aren't quite assembly.

4. The high-level or intermediate forms side by side with assembly language for them. This will be simplified, well-structured assembly designed for readability instead of performance.

5. An assembler, linker, loader, and/or anything else I'm forgetting that the compiler depends on to produce the final executable. Each of these will be done as above with focus on simplicity. May not be feature complete so much as just enough features to build the compiler. Initial ones are done by hand optionally with helper programs that are easy to do by hand.

6. Combine the ASM of compiler manually or by any trusted applications you have so far. The output must run through assembler, linker, etc. to get the initial executable. Test that and use it to compile the high-level compiler. Now, you're set. Rest of development can be done in high-level language w/ compiler extensions or more optimizations.

7. Formal specification and verification of the above for best results. Already been done with CompCert for C and CakeML for SML. Far as trust, CakeML runs on Isabelle/HOL whose proof checker is smaller than most programs. HOL/Light will make it smaller. This route puts trust mostly in the formal specs with one, small, trusted executable instead of a pile of specs and code. Vast increase in trustworthiness.

@rain1 has a site collecting as many worked examples as possible of small, verified, or otherwise bootstrapping-related work on compilers or interpreters. I contributed a bunch on there, too. I warn it looks rough since it's a work in progress that's focused more on content than presentation. Already has many, many weekends worth of reading for people interested in Trusting Trust solutions. Here it is for your enjoyment or any contributions you might have:

https://bootstrapping.miraheze.org/wiki/Main_Page



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: