Writer: Gabriel Fernandez
(Check out COMPILER.BAS which is a part of this tutorial -ed)
Hi, welcome to the first part of my compiler tutorial. I made this tutorial to help the 'QB Cult Magazine', which I think is the best QBasic magazine.
Well, I think you know Gabasic. It is a BASIC compiler I made in QBasic; check http://gab_soft.tripod.com to download it. I hope this tut is going to leave you ready to create a compiler like (or maybe better than) Gabasic in QB.
I'm going to teach how to make a compiler that works in real mode, because this is the mode that everyone knows.
You must have assembler knowledge to read this tutorial.
Well, let's start the tutorial.
First, make:
- A string using DIM SHARED called Text$
- A variable using DIM SHARED called linen&
Here are the most important subs and functions (if you don't understand something, don't worry, because all the subs will be explained later):
- A sub to show program errors
  SUB ShowError (ErrorNumber)
- A lot of subs for keywords
  SUB Keyword.NAME (Parameters)
  Example: SUB Keyword.CLS ()
- Functions to return words and full sentences
  FUNCTION GetWord$ ()
  FUNCTION GetFullWord$ ()
- The math parser (more info later)
  SUB Parser.Math (Mode)
- The string parser
  SUB Parser.String () *
- A function that checks whether a text must be parsed using the math parser or the string parser. This is needed for IF, LOOP
  FUNCTION Gettexttype () *
The main program should look like this:
    DIM SHARED text$
    DIM SHARED linen&
    OPEN Inputfile$ FOR INPUT AS #1
    OPEN Outasm$ FOR OUTPUT AS #2
    DO WHILE NOT EOF(1)
        LINE INPUT #1, text$
        linen& = linen& + 1
        ' Here we will check keywords, subs, etc.
    LOOP
    CLOSE
The main program is very simple: it reads each line of Inputfile$ and adds one to linen& (which holds the number of the line just read) until we reach the end of the file. Of course, we are going to add a lot of stuff to this program.
Now, I will explain what a 'Parser' is.
Parser: A parser analyzes text and generates code (assembler here, but not necessarily assembler) for math or string operations. The parser is about 50% of the compiler.
A math parser reads a text that contains only math operations, like "1 + 2 / 3 + 4", and generates code for it.
The math parser that we will create will parse the text and leave the result in AX. So, if I use the math parser with the following text:
"5 + 4 * 2 / 6"
it will leave the result ('3') in AX.
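To see where the '3' comes from, here is a small sketch that mimics in QBasic what the generated code does to AX. It assumes strictly left-to-right evaluation with no operator precedence, which is how the simple parser built later in this part behaves. The variable name ax is only an illustration; the asm instructions in the comments are the ones that parser emits for each step.

```qbasic
' Mimics, step by step, what the generated code does with AX
' for the text "5 + 4 * 2 / 6" (strict left-to-right order).
DEFINT A-Z
ax = 5          ' MOV AX, 5
ax = ax + 4     ' ADD AX, 4   -> 9
ax = ax * 2     ' MUL BX      -> 18
ax = ax \ 6     ' DIV BX      -> 3 (integer division)
PRINT ax        ' prints 3
```

With the usual operator precedence the same text would give a different value, so keep this in mind when testing the simple parser.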
Our parser will support two working modes: the 16-bit mode (integer mode) and the 32-bit mode (long mode). When our parser works in integer mode, the asm code will use 16-bit registers (AX, BX, ...); when it works in long mode, it will use 32-bit registers (EAX, EBX, ...).
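One possible way to handle the two modes is to pick the register name from the mode before printing each instruction. This is only a sketch: the function AccName$ and the constants IntegerMode and LongMode are hypothetical names and values, not something taken from Gabasic or COMPILER.BAS.

```qbasic
' Sketch: choose the accumulator name from the parser mode.
' AccName$, IntegerMode and LongMode are hypothetical names.
DECLARE FUNCTION AccName$ (Mode AS INTEGER)
CONST IntegerMode = 16, LongMode = 32

PRINT "MOV " + AccName$(IntegerMode) + ", 5"   ' prints MOV AX, 5
PRINT "MOV " + AccName$(LongMode) + ", 5"      ' prints MOV EAX, 5

FUNCTION AccName$ (Mode AS INTEGER)
    ' Long mode uses the 32-bit register, integer mode the 16-bit one
    IF Mode = LongMode THEN AccName$ = "EAX" ELSE AccName$ = "AX"
END FUNCTION
```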
The parser is very useful for parsing keyword parameters and leaving the results of the math operations in the CPU registers (AX, BX, CX, ...), so that an assembly function can be called afterwards. So if my text is "LOCATE 12 + 1, 2 - 1", I call the parser with the first parameter ("12 + 1") and it leaves the result in AX. Then I PUSH this value on the stack and call the parser with the second parameter ("2 - 1"), which again leaves its result in AX. Now I POP the pushed value into BX and call the "LOCATE" asm function. Here's the code that will be generated for "LOCATE 12 + 1, 2 - 1":
    Mov ax, 12
    Add ax, 1
    PUSH AX
    Mov ax, 2
    Sub ax, 1
    POP BX
    CALL Locate
Here's the Keyword.LOCATE sub:
    SUB Keyword.LOCATE ()
        a$ = getfullword$
        b$ = getfullword$
        IF a$ = "eof" OR b$ = "eof" THEN ShowError 1
        Parser.Math a$
    END SUB
I'm going to explain the string parser later, once we finish the math parser (this will take a long time).
The ShowError sub will be called when an error is found in the program being compiled; this will make programming the compiler a lot easier.
    ' ShowError sub
    SUB ShowError (Errornumber)
        PRINT "An error was found in the line: ", linen&
        SELECT CASE Errornumber
            CASE 1: PRINT "Argument-count mismatch"
            CASE 2: PRINT "Unknown command"
        END SELECT
        END
    END SUB
Getword$ will be a function that returns the words of Text$, one word per call.
- Example of Getword$:
Text$ = "PRINT A$ + B$ + 'Hello'"
Each call to Getword$ will return the next word. Here is what Getword$ will return on each call:
1st call: Getword$ will return "PRINT"
2nd call: Getword$ will return "A$"
3rd call: Getword$ will return "+"
4th call: Getword$ will return "B$"
5th call: Getword$ will return "+"
6th call: Getword$ will return "'Hello'"
7th call: Getword$ will return "eof"
GetFullWord$ will return a full sentence. A sentence is all the text up to the next comma (",") or the end of the line.
So, if text$ = "LOCATE 13 + 2, 12 - col%", it will return on each call:
1st call: "LOCATE 13 + 2"
2nd call: "12 - col%"
3rd call: the text "eof"
All these functions must return the text "eof" when the end of the line is reached. Also, when one of them finds a ' (the comment character), it must return "eof" too, because that is where a comment starts.
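The real functions are in COMPILER.BAS, but the rules above can be sketched in a few lines. This is only a minimal sketch under some assumptions: Getcharpos is taken to hold the number of characters of Text$ already consumed (0 means "start of the line"), words are separated by spaces, and quoted strings are not handled, so every ' is treated as the start of a comment. Note that GetWord$ is called once first to take the keyword off the line, the way the main program will do it.

```qbasic
' Sketch of GetWord$ and GetFullWord$ (not the COMPILER.BAS versions).
' Assumption: Getcharpos = number of characters already consumed.
DECLARE FUNCTION GetWord$ ()
DECLARE FUNCTION GetFullWord$ ()
DIM SHARED Text$
DIM SHARED Getcharpos AS INTEGER

Text$ = "LOCATE 13 + 2, 12 - col% ' set the cursor"
Getcharpos = 0
PRINT GetWord$        ' LOCATE
PRINT GetFullWord$    ' 13 + 2
PRINT GetFullWord$    ' 12 - col%
PRINT GetFullWord$    ' eof
END

FUNCTION GetFullWord$
    ' Collect everything up to the next comma, comment or end of line
    s$ = ""
    DO WHILE Getcharpos < LEN(Text$)
        Getcharpos = Getcharpos + 1
        c$ = MID$(Text$, Getcharpos, 1)
        IF c$ = "," THEN EXIT DO
        ' A comment ends the line: skip the rest of Text$
        IF c$ = "'" THEN Getcharpos = LEN(Text$): EXIT DO
        s$ = s$ + c$
    LOOP
    s$ = LTRIM$(RTRIM$(s$))
    IF s$ = "" THEN s$ = "eof"
    GetFullWord$ = s$
END FUNCTION

FUNCTION GetWord$
    ' Skip the blanks in front of the next word
    DO WHILE Getcharpos < LEN(Text$)
        IF MID$(Text$, Getcharpos + 1, 1) <> " " THEN EXIT DO
        Getcharpos = Getcharpos + 1
    LOOP
    ' End of the line or start of a comment
    IF Getcharpos >= LEN(Text$) THEN GetWord$ = "eof": EXIT FUNCTION
    IF MID$(Text$, Getcharpos + 1, 1) = "'" THEN GetWord$ = "eof": EXIT FUNCTION
    ' Collect characters up to the next blank
    w$ = ""
    DO WHILE Getcharpos < LEN(Text$)
        c$ = MID$(Text$, Getcharpos + 1, 1)
        IF c$ = " " THEN EXIT DO
        w$ = w$ + c$
        Getcharpos = Getcharpos + 1
    LOOP
    GetWord$ = w$
END FUNCTION
```

In this sketch an empty parameter (as in "LOCATE , 5") also comes back as "eof", which is enough for the argument-count check but may not be what a complete compiler wants.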
Well, now we are going to add the following commands to our compiler:
- CLS (Clears the screen)
- WAITKEY (Waits until a key is pressed)
- END (Ends the program)
You must have the Getword$ and Getfullword$ functions finished; check COMPILER.BAS for these functions.
To add keywords, we will add code like this to our main program:
    keyword$ = getword$
    SELECT CASE UCASE$(Keyword$)
        CASE "ANYKEYWORD": Keyword.ANYKEYWORD
    END SELECT
Here is the main program with the above code added:
    DO WHILE NOT EOF(1)
        LINE INPUT #1, text$
        linen& = linen& + 1
        keyword$ = getword$
        SELECT CASE UCASE$(Keyword$)
            ' Keywords list
            CASE "CLS": Keyword.CLS
            CASE "END": Keyword.END
            CASE "WAITKEY": Keyword.WAITKEY
        END SELECT
    LOOP
Let's build the subs for CLS, END, WAITKEY:
    SUB Keyword.CLS ()
        PRINT #2, "CALL CLS"
    END SUB

    SUB Keyword.END ()
        PRINT #2, "MOV AX, 4C00h"
        PRINT #2, "INT 21h"
    END SUB

    SUB Keyword.WAITKEY ()
        PRINT #2, "XOR AX, AX"
        PRINT #2, "INT 16h"
    END SUB
Great! Now our compiler can do CLS, END and WAITKEY.
Of course, you have to add the CLS asm routine. A CLS routine will look like this:
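    CLS:                ; - CLS routine
        PUSH ES
        mov ax, 0B800h  ; segment of the color text-mode screen
        mov es, ax
        mov di, 0       ; start at the first character cell
        mov cx, 2000    ; 80 x 25 cells, one word each
        mov ax, 0       ; word stored in every cell
        REP STOSW
        POP ES
        RET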
Let's create a SUB called Addasmroutines. This sub will add the asm routines at the end of our Outasm$ file.
    SUB Addasmroutines ()
        ' Cls routine
        PRINT #2, "CLS:"
        PRINT #2, " PUSH ES"
        PRINT #2, " mov ax, 0B800h"
        PRINT #2, " mov es, ax"
        PRINT #2, " mov di, 0"
        PRINT #2, " mov cx, 2000"
        PRINT #2, " mov ax, 0"
        PRINT #2, " REP STOSW"
        PRINT #2, " POP ES"
        PRINT #2, "RET"
    END SUB
We will call the sub Addasmroutines when our main program ends: add 'Addasmroutines' just before the CLOSE that closes all the files in our main program.
It is very easy to add keywords that don't use parameters.
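Putting the pieces so far together, a complete mini compiler for these three keywords could look like the sketch below. Some things in it are assumptions made only to keep it self-contained: TEST.BAS and TEST.ASM are placeholder file names (TEST.BAS must already exist), and instead of calling Getword$ it simply trims the line, which is enough here because none of the three keywords takes parameters.

```qbasic
' Sketch: mini compiler for CLS, WAITKEY and END only.
' TEST.BAS and TEST.ASM are placeholder file names.
DECLARE SUB Addasmroutines ()
DECLARE SUB Keyword.CLS ()
DECLARE SUB Keyword.END ()
DECLARE SUB Keyword.WAITKEY ()
DIM SHARED Text$
DIM SHARED linen&

OPEN "TEST.BAS" FOR INPUT AS #1
OPEN "TEST.ASM" FOR OUTPUT AS #2
DO WHILE NOT EOF(1)
    LINE INPUT #1, Text$
    linen& = linen& + 1
    ' Stand-in for getword$: these keywords take no parameters,
    ' so the trimmed line is the keyword itself
    keyword$ = UCASE$(LTRIM$(RTRIM$(Text$)))
    SELECT CASE keyword$
        CASE "CLS": Keyword.CLS
        CASE "END": Keyword.END
        CASE "WAITKEY": Keyword.WAITKEY
    END SELECT
LOOP
Addasmroutines      ' append the asm routines before closing the files
CLOSE
END

SUB Addasmroutines
    ' Cls routine, the same one shown above
    PRINT #2, "CLS:"
    PRINT #2, " PUSH ES"
    PRINT #2, " mov ax, 0B800h"
    PRINT #2, " mov es, ax"
    PRINT #2, " mov di, 0"
    PRINT #2, " mov cx, 2000"
    PRINT #2, " mov ax, 0"
    PRINT #2, " REP STOSW"
    PRINT #2, " POP ES"
    PRINT #2, "RET"
END SUB

SUB Keyword.CLS
    PRINT #2, "CALL CLS"
END SUB

SUB Keyword.END
    PRINT #2, "MOV AX, 4C00h"
    PRINT #2, "INT 21h"
END SUB

SUB Keyword.WAITKEY
    PRINT #2, "XOR AX, AX"
    PRINT #2, "INT 16h"
END SUB
```

If TEST.BAS contains the three lines CLS, WAITKEY and END, then TEST.ASM gets "CALL CLS", "XOR AX, AX", "INT 16h", "MOV AX, 4C00h" and "INT 21h", followed by the CLS routine. Because the routines are added after the code of the program, the compiled program should finish with END so that execution does not run into them.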
We are going to make a very simple math parser right now. It will support +, -, *, / and numbers; we aren't going to add variable support in this part of the tutorial.
Our parser will work in the following way:
* Load the first number in the text into AX, and set CurrentOp(eration) to 1
[-- Loop
  _ Use a$ = Getword$
  _ If a$ returns 'eof', exit the sub
  * Compare CurrentOp with 1; if CurrentOp = 1 then
    - Set CurrentOp to 2
    - Get the operation type (+, -, /, *)
  * If not one
    - Get number from a$
    - Do the math operation
    - Set CurrentOp to 1
  * END the IF block
Loop --]
Now, this is the SUB Parser.Math. This parser is going to support only integer mode (for now).
    DEFINT A-Z
    SUB Parser.Math ()
        ' Operation codes: they need distinct values so that the
        ' SELECT CASE MathOp below can tell the operations apart
        CONST Add = 1, Subs = 2, Mul = 3, Div = 4
        a$ = getword$
        PRINT #2, "MOV AX," + a$
        CurrentOp = 1
        DO
            a$ = getword$
            IF a$ = "eof" THEN EXIT SUB
            IF CurrentOp = 1 THEN
                ' We expect an operator here
                CurrentOp = 2
                SELECT CASE a$
                    CASE "+": MathOp = Add: GOTO label1
                    CASE "-": MathOp = Subs: GOTO label1
                    CASE "/": MathOp = Div: GOTO label1
                    CASE "*": MathOp = Mul: GOTO label1
                END SELECT
                ShowError 3
            ELSE
                ' We expect a number here: apply the pending operation
                SELECT CASE MathOp
                    CASE Add: PRINT #2, "ADD AX, " + a$
                    CASE Subs: PRINT #2, "SUB AX, " + a$
                    CASE Mul
                        PRINT #2, "MOV DX, 0"
                        PRINT #2, "MOV BX, " + a$
                        PRINT #2, "MUL BX"
                    CASE Div
                        PRINT #2, "MOV DX, 0"
                        PRINT #2, "MOV BX, " + a$
                        PRINT #2, "DIV BX"
                END SELECT
                CurrentOp = 1
            END IF
    label1:
        LOOP
    END SUB
That was a very simple math parser, but it does the job. Now you can create keywords that use parameters.
Now, let's create the LOCATE keyword.
    SUB Keyword.LOCATE ()
        parameter1$ = getfullword$
        parameter2$ = getfullword$
        Text$ = parameter1$
        Getcharpos = 0
        Parser.Math
        PRINT #2, "MOV [Textx], ax"
        Text$ = parameter2$
        Getcharpos = 0
        Parser.Math
        PRINT #2, "MOV [Texty], ax"
    END SUB
Don't worry about the Getcharpos variable; look at COMPILER.BAS to understand it.
Here ends the first part of my tutorial. I hope you liked it. In the next part, we are going to create a great math parser, and maybe we are going to add variable support.