Write a program to write programs.
“I’d rather write programs to write programs than write programs.”
Dick Sites [DEC], quoted by Jon Bentley in More Programming Pearls.
Context
Have a high-level specification of a program; want to be able to run it directly.
Forces
- High-level specifications can’t be executed directly.
- An interpreter might be too slow.
- A compiler is too much work.
Resolution
Rather than compiling all the way to machine code, write a program to generate code in a moderately high-level language. (C is a very common choice.) Then use an existing compiler to convert the result to machine code.
Discussion
Generating C is easier than machine code. Debugging the C code is easier than raw machine code (though generated code is not very pretty.) We have some flexibility in generated code – we can analyze the high-level language and generate different code depending on the circumstances. For example, a parser generator could special-case certain grammar constructs.
A potential problem is in dealing with compilers. For example, you could generate code on a single line, but compilers (or language definitions!) often restrict the permitted size of a line. Or there may be restrictions on the number of variables in a routine, or nesting levels, etc. Or perhaps the compiler performs poorly with similar identifiers.
There are three common approaches to code generation: table-driven, template-based, and compiled. A table-driven implementation has a basic interpreting routine, and encodes the language into a table. The template-based method replaces high-level constructs using a template that expresses the construct in a lower-level language. The compiler approach generates plain code.
You may provide facilities to help in debugging – perhaps generating different code when debugging is desired.
Some generators allow mixed expressions – both high- and low-level code. The low-level code is inserted and executed as part of the generated code. (The yacc system takes this approach.)
There is some relationship between compilers and interpreters that’s relevant here: the actions done by an interpreter can be inserted into the generated code.
Examples
- The lex and yacc tools developed on the Unix system convert regular expressions and LALR grammars (respectively) into C code.
- The Shlaer-Mellor approach to object-oriented design is sometimes described as transformational, for the way it moves from high-level to low-level specifications.
- Cleaveland [ref?] describes application generators.
[Written 9-15-98]