2 years ago

#51178

Rudolf Ziegaus

code generation for a compiler based on ANTLR

I am working on a compiler for a language and I have run into some problems with operator precedence. The language is pretty simple, nothing complicated. It has functions, statements, expressions, etc. It's based on a YouTube video series called "Let's build a compiler".

There is already code generated for the Java Platform (JVM) and everything is fine there.

However, I also wanted to be able to generate code for the Intel platform, so I decided to add a code generator for the 80x86. I'm still at the very beginning, but I have a functional toolchain: I can generate assembly code, have the assembler (masm32) translate it to object files, and then have the linker create an exe file from them. So far so good. However, I have problems generating code for divisions and multiplications. In the grammar, the division alternative is listed before the multiplication alternative.

The parser uses a visitor. To be exact, there is one visitor for the JVM platform and one for the 80x86 platform (Windows).

The main problem, as I see it, is that the parser effectively creates a stack through the nested calls of the visit methods. That is very well suited to the JVM platform, since the JVM is also stack-based. The 80x86 platform, however, works with registers, so I have to load the operands into registers.

For example, an expression like "8 * 4 / 2" should evaluate to 16, and the relevant assembly code should look like this:

mov eax, 8
mov ebx, 4
imul eax, ebx
mov ebx, 2
cdq             ; sign-extend eax into edx:eax, which idiv uses as the dividend
idiv ebx

However, the generated code is:

 mov eax, 8
 mov eax, 4
 mov ebx, 2
 idiv ebx
 imul eax, ebx

Due to the stack-based approach there are two consecutive mov instructions for eax, which is of course completely wrong, since the first value in eax gets overwritten by the second one.

So my question is: how can I transform the stack-based approach into a more "linear" approach?
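One common way to keep the visitor's stack-shaped evaluation order on a register machine is to let every visit leave its result in eax and to park the left operand on the hardware stack (push/pop) while the right operand is computed. The following is only a minimal sketch of that idea, written in Python for brevity. The tuple-based tree and the names emit and generate are hypothetical stand-ins for the real ANTLR visitor methods, and the tree is built by hand as (8 * 4) / 2.

```python
# Hypothetical mini-AST: an int literal, or a tuple (op, left, right).
# In an ANTLR visitor these cases would correspond to the Number, Mul and Div visits.

def emit(node, out):
    """Append x86 instructions that leave the value of `node` in eax."""
    if isinstance(node, int):
        out.append(f"mov eax, {node}")
        return
    op, left, right = node
    emit(left, out)                 # left operand ends up in eax
    out.append("push eax")          # park it on the hardware stack
    emit(right, out)                # right operand ends up in eax
    out.append("mov ebx, eax")      # move right operand to ebx
    out.append("pop eax")           # restore left operand into eax
    if op == "*":
        out.append("imul eax, ebx")
    elif op == "/":
        out.append("cdq")           # sign-extend eax into edx:eax for idiv
        out.append("idiv ebx")
    else:
        raise ValueError(f"unsupported operator: {op}")

def generate(node):
    """Return the instruction list for a whole expression tree."""
    out = []
    emit(node, out)
    return out

# (8 * 4) / 2, built by hand instead of by a parser
print("\n".join(generate(("/", ("*", 8, 4), 2))))
```

The output is longer than the hand-written version because nothing is optimized, but no operand is overwritten: each subexpression only relies on eax holding its own result. A register allocator could later remove the redundant push/pop pairs.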

I hope that the wording of my question is comprehensible and not too complicated.

Thanks a lot for reading and answering!

Here is the relevant part of the grammar:

expression: 
            left=expression DIV   right=expression   #Div
          | left=expression MUL   right=expression   #Mul
          | left=expression MINUS right=expression   #Minus
          | left=expression PLUS  right=expression   #Plus
          | '(' expression ')'                       #Parens 
          | left=expression operator=(LT | LE | GT | GE | EQUAL | NOTEQUAL)    right=expression   #Relational
          | left=expression AND right=expression     #And
          | left=expression OR right=expression      #Or
          | number=NUMBER                            #Number   
          | text=STRING                              #String
          | varName=IDENTIFIER                       #Variable
          | functionCall                             #FuncCallExpression
      ;     
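A note on the alternative order: as far as I understand ANTLR 4, ambiguities in a left-recursive rule are resolved in favor of the alternative listed first, so with #Div above #Mul the input "8 * 4 / 2" would be grouped as 8 * (4 / 2) rather than (8 * 4) / 2. For this particular input both groupings give 16, which can hide the difference. The small sketch below (Python, with the hypothetical helper names idiv, mul_then_div and div_then_mul) compares the two groupings under truncating integer division.

```python
def idiv(a, b):
    """Integer division truncating toward zero, like the x86 idiv instruction."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def mul_then_div(a, b, c):
    """(a * b) / c: MUL and DIV at the same precedence, left-associative."""
    return idiv(a * b, c)

def div_then_mul(a, b, c):
    """a * (b / c): the grouping expected when DIV binds tighter than MUL."""
    return a * idiv(b, c)

for a, b, c in [(8, 4, 2), (7, 3, 2)]:
    print(f"{a} * {b} / {c}:", mul_then_div(a, b, c), "vs", div_then_mul(a, b, c))
```

For 8 * 4 / 2 both functions return 16, while for 7 * 3 / 2 they return 10 and 7 respectively. Grammars for such languages often put both operators into a single alternative, for example with an operator=(MUL | DIV) label, so that they share one precedence level.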

antlr

code-generation
