We will be using the ARM Cortex-A8 as the target for our compiler. You will need access to an ARM system (or emulator). Your compiler need not necessarily run on the ARM system, but you will need to compile and execute the ARM assembly output from your compiler on an ARM system.
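The steps to compilation are:
1. Write a program (e.g., foo.h) in our language.
2. Run your compiler on it to produce ARM assembly (e.g., foo.s).
3. Use gcc to produce an executable (e.g., $ gcc -o foo foo.s to produce the foo program).
4. Run the executable (e.g., $ ./foo).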
Our language is rather restricted, and so only a small portion of the ARM assembly language instructions will be relevant for our purposes. Relevant instructions include:
mov, ldr, str, add, sub, mul, cmp, b[l] <label>, and the conditional forms of the branch that follow a cmp instruction.
We also need to be able to divide integer values, but the ARM chip we're using has no integer divide instruction. So we'll resort to one of two mechanisms for division: either 1) implement it in code (i.e., add a block of ARM assembly to our program), or 2) call the glibc divide function (an ordinary library call, not a system call). See this tutorial chapter for more.
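As a rough illustration of option 1, the following Python sketch models a shift-and-subtract division that uses only operations with direct ARM counterparts (compare, subtract, shift, or). It is not the tutorial's routine; the function name `soft_divide` and the choice to truncate toward zero (as C does) are assumptions made for this example, and Python is used purely for illustration.

```python
def soft_divide(dividend, divisor):
    """Model of a software divide for 32-bit signed integers.

    Hypothetical helper: returns (quotient, remainder), truncating
    toward zero. Each step maps onto cmp/sub/shift/orr in ARM assembly.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    # Work on magnitudes and fix the signs at the end.
    negative_quotient = (dividend < 0) != (divisor < 0)
    n, d = abs(dividend), abs(divisor)

    quotient, remainder = 0, 0
    for bit in range(31, -1, -1):          # most significant bit first
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((n >> bit) & 1)
        if remainder >= d:                 # cmp remainder, d
            remainder -= d                 # sub remainder, remainder, d
            quotient |= 1 << bit           # set this bit of the quotient

    if negative_quotient:
        quotient = -quotient
    if dividend < 0:
        remainder = -remainder             # remainder takes the dividend's sign
    return quotient, remainder
```

For example, `soft_divide(17, 5)` returns `(3, 2)`. A hand-written assembly version would keep the same loop structure, with the quotient and remainder held in registers.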
Later we will be implementing subroutines using the stack (note that we will not be using the ARM convention of passing arguments via registers). So in addition to register r13 (aka sp), r11 (aka fp) will not be used as a general-purpose register, since it will be used to anchor the call frame of the current function (initially set to the value of the stack pointer).
Parameters and local variables within a function will be accessed relative to the address in fp. As local variables are allocated we will adjust the stack pointer to make room; since the stack grows "down" on the ARM, this means decrementing the stack pointer. For example, local variable 1 will be accessed as [fp, #-4].
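The frame-relative addressing above can be sketched on the compiler side as follows. This is a minimal sketch that assumes every local variable occupies one 4-byte word and that locals are numbered from 1; the names `WORD_SIZE`, `local_operand`, and `emit_allocate_locals` are hypothetical.

```python
WORD_SIZE = 4  # assumption: every local is one 32-bit word


def local_operand(index):
    """Return the memory operand for local variable `index` (1-based)."""
    if index < 1:
        raise ValueError("local variables are numbered from 1")
    return f"[fp, #-{WORD_SIZE * index}]"


def emit_allocate_locals(count):
    """Return the assembly that makes room for `count` locals.

    The stack grows down, so allocation decrements sp.
    """
    if count == 0:
        return []
    return [f"sub sp, sp, #{WORD_SIZE * count}"]
```

With these helpers, `local_operand(1)` yields `[fp, #-4]`, matching the example in the text, and `emit_allocate_locals(2)` yields a single `sub sp, sp, #8` instruction.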
Each assembly file we produce will need a template with a few pre-existing values defined. The following is a general template that you can use:
    .text
    .global main
    main:
        ldr r0, =link       /* Store the lr value for graceful return at the end */
        str lr, [r0]
        mov fp, sp          /* set the fp to the base of the stack */

        /***
         * Generated code goes here
         **/

        ldr r0, =link       /* Reset the lr */
        ldr lr, [r0]
        mov r0, #0          /* Return 0 for success */
        bx lr               /* return */

    .data
    .balign 4
    link: .word 0
    .balign 4
    string_format: .asciz "%s"
    .balign 4
    int_format: .asciz "%d"

    /***
     * Program-defined string constants here
     **/

Note that we'll generate the .data section at the end so that we can simply scan the entire "symbol table" for all constant strings and allocate them here. I recommend that you use a simple counter for the number of constant strings and create simple labels for each one as they're entered into the symbol table, e.g.:
    .balign 4
    s1: .asciz "Hello World!"
    .balign 4
    s2: .asciz "I love compilers!"
    ...
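The counter-based labelling recommended above might look something like this inside the compiler. The class name `StringTable` and its methods are hypothetical, reusing one label for a repeated string is a design choice of this sketch rather than a requirement, and escaping of quotes or backslashes inside strings is not handled.

```python
class StringTable:
    """Assigns labels s1, s2, ... to constant strings as they are seen."""

    def __init__(self):
        # Maps string text to its label; dicts preserve insertion order.
        self._labels = {}

    def intern(self, text):
        """Return the label for `text`, creating the next one if needed."""
        if text not in self._labels:
            self._labels[text] = f"s{len(self._labels) + 1}"
        return self._labels[text]

    def emit_data(self):
        """Return the .data lines for every string entered so far."""
        lines = []
        for text, label in self._labels.items():
            lines.append(".balign 4")
            # Assumption: `text` contains no characters that need escaping.
            lines.append(f'{label}: .asciz "{text}"')
        return lines
```

The compiler would call `intern` whenever it meets a string constant, use the returned label in the generated code, and append the result of `emit_data` after the template's .data section once the whole program has been scanned.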
First, we will be calling the printf function for output. Since printf requires 2 registers (the format and the data) we need to either
1. reserve r0 and r1 for printing, leaving us with registers r2 - r10 for general use, or
2. save r0 and r1 before we set up and invoke printf; as long as we save the register states for r0-r3 before we begin to print and restore them after we print, we should be able to use r0 and r1 as general-purpose registers.
printf is free to use registers r0-r3, so if our code is using registers r2 or r3 they'll need to be saved on the stack before the call and restored afterwards. So it makes sense to simply push r0-r3 before we begin to print and then restore them, leaving us free to use r0-r10.
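The save-and-restore approach can be sketched as a small emitter in the compiler. This is an illustration under assumptions: the function name `emit_print` is hypothetical, the value to print is taken to be in a register already, and the format label is one of those defined in the template (such as `int_format`). The data register is loaded before r0 is overwritten so that a value living in r0 is not lost.

```python
def emit_print(format_label, value_reg):
    """Return assembly that prints the value in `value_reg` with printf.

    r0-r3 are pushed first and popped afterwards, so the surrounding
    code can keep using them as general-purpose registers.
    """
    lines = ["push {r0-r3}"]
    # Move the data into r1 first: the value may currently be in r0.
    if value_reg != "r1":
        lines.append(f"mov r1, {value_reg}")
    lines.append(f"ldr r0, ={format_label}")  # address of the format string
    lines.append("bl printf")
    lines.append("pop {r0-r3}")
    return lines
```

Calling `emit_print("int_format", "r4")` produces the five instructions push, mov, ldr, bl, pop in that order. Printing a string constant would load the string's label into r1 with ldr instead of using mov.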
Register allocation is a matter of finding a free register to satisfy "load" operations and arithmetic operations that have a destination. The process of finding a register is often vaguely referred to as a function named getreg(). At the very least our compiler will need to keep track of what symbols are in what registers; we may also want to keep track of the inverse: given a symbol, what register might it be in. Allocation will generally follow
a few simple rules; optimizations (for example, of mov operations) will be discussed separately.
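One possible shape for the bookkeeping behind getreg() is sketched below. It assumes r0-r10 are available, hands out the lowest-numbered free register, and simply fails when none is left (a fuller allocator would spill a register to memory instead). The class name `RegisterPool` and its attributes are hypothetical.

```python
class RegisterPool:
    """Tracks which symbol lives in which register, in both directions."""

    def __init__(self, names=None):
        # Assumption: r0-r10 are general purpose (r11 is fp, r13 is sp).
        self.free = list(names) if names else [f"r{i}" for i in range(11)]
        self.symbol_in = {}    # register -> symbol
        self.register_of = {}  # symbol -> register (the inverse map)

    def getreg(self, symbol):
        """Return the register holding `symbol`, allocating one if needed."""
        if symbol in self.register_of:
            return self.register_of[symbol]
        if not self.free:
            raise RuntimeError("no free register; spilling is not modelled")
        reg = self.free.pop(0)
        self.symbol_in[reg] = symbol
        self.register_of[symbol] = reg
        return reg

    def release(self, reg):
        """Mark `reg` as free again and forget the symbol it held."""
        symbol = self.symbol_in.pop(reg)
        del self.register_of[symbol]
        self.free.insert(0, reg)
```

Asking for the same symbol twice returns the same register, which is what lets the compiler skip a redundant load when a value is already in a register.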
Last modified: , by David M. Hansen