Assembly language programming tutorial part 2: The basics
By Petter Holmberg of Enhanced Creations
Edited version (original version posted in QB:tm)

Hello again!
The first part of this tutorial was written in a hurry, but this one wasn't,
so I hope you will find this one better. Last time I discussed the history
of assembler and told you where to use it and not. I also gave you some
information about the binary and hexadecimal system, and briefly explained
how the base memory is addressed. This background information was needed to
give you a good start in the learning process.
This time I will teach you the basics of the assembly language and show you
how to use it in QuickBASIC. Let's get to it!

How the heck can I execute assembly code in QuickBASIC?
This might be the first question you're asking yourselves. How can you make
QuickBASIC understand assembly code? Well, you can't. QuickBASIC can only
understand regular BASIC expressions. However, it is possible to make
QuickBASIC execute snippets of machine language code in a program. Machine
language is the only language the processor really understands, and the
BASIC code you usually see is translated into machine language instructions
when you run the program. But as assembly code is basically machine language
represented in a more human-readable way, all you need is a program that
translates your assembly code into machine code, and the knowledge of how to
get QuickBASIC to run that machine code.

When converting a program written in a high-level language such as QuickBASIC
to machine code, you say that you compile a program. When you do the same
with an assembly program, you say that you assemble the program. A program
that does this is called an assembler. Do not mix up the expressions here!
Now you may be thinking that you don't have an assembler on your hard drive,
but that's where you're wrong. All Microsoft operating systems, from MS-DOS
to Win98, have a program called DEBUG somewhere. This program was included in
Microsoft's operating systems as a tool for advanced users, and it can
convert raw assembly code to machine code and back. This program can
be very useful, but it's also very hard to use. Luckily, you won't need to
worry about that. I will explain this later.

There are two ways to run machine code in QuickBASIC. I will start by only
explaining the first one. QBASIC and QuickBASIC both have a built-in function
called CALL ABSOLUTE. With CALL ABSOLUTE you can execute a machine language
routine and then return to QuickBASIC. If you use the standard QBASIC, you
can use CALL ABSOLUTE directly, but in QuickBASIC it's included in the
external library QB.QLB/QB.LIB. So if you use QuickBASIC, you must start it
with the syntax QB /L, which includes this library.

To begin with, we will work with DEBUG and CALL ABSOLUTE as our tools to
learn assembler. But as I told you DEBUG is hard to use, so I have created
a program that makes everything a whole lot easier. It is called Absolute
Assembly, and it can be downloaded at the Enhanced Creations website at
http://ec.quickbasic.com. The last version was written many months ago, but
it works fine. This program relieves you from the pain of using DEBUG
manually to create a program. It takes a raw text file with assembly
code as input, and through DEBUG generates a snippet of QuickBASIC code in
a file of your choice. I'll give you the details as we continue.

The basics of assembler:
Now we are ready to begin discussing some serious stuff: The first steps into
the asm world!
First of all: When working with assembler, you mainly process a lot of numbers.
These numbers need to be stored somewhere. You can of course use the memory
to store your data, but there's another way to do it: Through registers.
Registers are a bit like variables, but they're not stored in the memory as
variables are. They are stored in the microprocessor, where they can be
accessed instantly and effectively. However, there are only a few of them, so
you'll have to use them carefully and keep track of what you're doing with
them. Many registers also have special uses, so you can, and must, use them
only in certain places. This may seem a little confusing to you right now, but
you will soon understand how it works.

There are four basic registers that can be used for almost anything. They are
called AX, BX, CX and DX. You can think of these registers as INTEGER variables
in QuickBASIC. They are small memory cells that can store a 16-bit number.
If you want to use only one of the two bytes in these registers, you can do
so by calling them AH/AL, BH/BL, CH/CL, and DH/DL. The H and L stand for
"high" and "low". So you can use only the upper 8 bits of the AX register by
calling it AH, and the lower 8 bits by calling it AL. On 386 computers and
later, you can also call these registers EAX, EBX, ECX and EDX, and use them
to store 32-bit numbers. I'd better draw this to make you understand
it:

        <--- 32 bits --->
 --------------------------------
|              EAX               |
 --------------------------------|
                |       AX       |
                 ----------------|
                |   AH   |  AL   |
                 ----------------
                 < -- 16 bits -- >

Writing data into AL doesn't affect AH, but it affects AX and EAX. The same
goes for BX, CX and DX. Remember that these are all just different names
to access different parts of the same register. As DEBUG cannot handle any
386 or above processor instructions, we won't be using 32-bit registers as
long as we're working with it.
Although these four general purpose registers can be used for almost
everything, they also have special purposes. The A, B, C and D in the registers
actually stand for Accumulator, Base, Counter and Data. I will tell you when
they should be used when we come to such situations.
There are many other registers with more special uses that you will need to
learn, but I will return to them later when we need them. Now we're going to
learn our first assembly instruction!

MOV, your key to data transfer:
The most common and important assembly instruction is called MOV. Assemblers
on most platforms have this instruction. Its purpose is to move or copy
values between memory and registers. As you probably guessed, MOV is short
for move. Most assembly instructions are three letters long. The name is a
little misleading, because moving a value would mean taking it away from the
source, but MOV actually copies the value. The general syntax for MOV is:

MOV destination, source

Beginners always tend to mix up the positions of the source and destination
with MOV. It may seem more natural to put the source first, but if you think
about it you'll see that you do the same in BASIC. (destination = source)
The source can be a direct number, a register or a value in the memory. The
destination can be a register or a memory position. Here are some examples:
If you want to put the value 8 in the AX register, you type:

MOV AX, 8

If you would like to copy the contents of the CH register into BL, you type:

MOV BL, CH

A thing that you cannot do is to move a 16-bit value into an 8-bit register.
An instruction like MOV AL, BX is therefore not possible.
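
Here is a small sketch that combines the size rule with the register halves
described earlier. It is only an illustration: the values are kept below 10 so
that they read the same in decimal and hexadecimal, and the notes after each
semicolon are annotations for the reader that you may need to leave out of a
file you feed to Absolute Assembly.

MOV AX, 0      ; AX = 0, so AH = 0 and AL = 0
MOV AL, 5      ; only the low byte changes: AX = 5
MOV AH, 2      ; only the high byte changes: AX = 2*256+5 = 517
MOV BX, AX     ; 16 bits to 16 bits: BX = 517
MOV CL, BL     ; 8 bits to 8 bits: CL = 5, the low byte of BX

The last line shows the usual way around the size rule: MOV CL, BX is not
possible, but naming the half you want (BL or BH) makes both sides 8 bits.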

But what about values in the memory? Well, then you must learn to use three
new registers: DS, SI and DI.

Accessing the memory:
In order to be able to read and write in the memory, you need the special
memory addressing registers. If you want to read data from the memory, you
must put the memory address into the two registers DS and SI. Their full names
are the Data Segment register and the Source Index register. In DS you should
put the segment address of the memory position you want to read from, and in
SI you should put the offset address. Segments and offsets were explained in
part 1 of this tutorial series.
Now, suppose you want to copy byte number 18 in the memory into AL. Then you
need the following assembly code:

MOV BX, 1
MOV DS, BX
MOV SI, 2
MOV AL, [SI]

This requires some explaining. First of all: It's impossible to move a direct
number into the DS register. Don't ask me why, but it has to do with the
Intel PC processor architecture. So what you need to do is to put the value in
one of the general purpose registers (here I use BX, which is often used in
this situation) and then copy the contents of that register into DS. That
explains the first two lines. Next we put a value into the SI register. Luckily
this can be done directly. We wanted memory position 18, and now the DS
register is 1 and the SI register is 2. Remember that the actual memory
position is the segment times 16 plus the offset. In our case this gives us
1*16+2=18.
The final line copies the byte located at this memory position into AL. The
brackets [] around SI mean that the computer should fetch the byte at the
memory position pointed out by SI (DS is automatically assumed to hold the
correct segment address) instead of the value in SI itself. Without the
brackets, the value of SI (2) would get into AL. Well, actually that wouldn't
work because AL is only 8 bits and SI is 16 bits. If it had been the whole AX
register it would work though. But with the brackets, the value from memory
position DS * 16 + SI is read.
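
As a sketch of the difference the register size makes, the same addressing can
be used to read 16 bits instead of 8. This variation is an illustration and
not one of the original examples; the notes after the semicolons are for the
reader only and may need to be left out of your source file.

MOV BX, 1
MOV DS, BX     ; DS = 1
MOV SI, 2      ; DS * 16 + SI = 1*16+2 = 18
MOV AX, [SI]   ; AL = byte at position 18, AH = byte at position 19

The processor stores the low byte of a 16-bit value first, which is why AL
gets the byte at the lower address and AH the byte right after it.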

If you want to do the opposite and write to memory, you use DI instead of SI.
DI is short for Destination Index. Writing data is done in almost the same
way as reading data. If you would like to write the value 5 into the same
memory position as we were reading from in the previous example, you type
this:

MOV BX, 1
MOV DS, BX
MOV DI, 2
MOV AL, 5
MOV [DI], AL

Easy, huh? Here the DS register is also used for the segment address. In this
example, AL, an 8-bit register half, is used, and so 8 bits will be written to
the memory. If you had changed AL to AX, 16 bits would have been written.
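
Reading and writing can be combined to copy a byte from one memory position to
another. The following is a minimal sketch, assuming segment 1 as before and
assuming that position 19 is safe to overwrite (positions this low in memory
belong to the system, so treat the numbers as illustration only). A MOV cannot
have memory as both source and destination, so the byte has to pass through a
register on the way. The notes after the semicolons are for the reader and may
need to be left out of your source file.

MOV BX, 1
MOV DS, BX     ; DS = 1, used for both the read and the write
MOV SI, 2      ; source: position 1*16+2 = 18
MOV DI, 3      ; destination: position 1*16+3 = 19
MOV AL, [SI]   ; read the byte at position 18 into AL
MOV [DI], AL   ; write that byte to position 19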

Now you should know how to read and write values from/to registers and how
to access specific memory positions. But you can't do much with it yet. Now
it's time to learn how to call asm routines from QB!

The interface:
Now you need to have both DEBUG and Absolute Assembly, the two programs I
talked about earlier. You won't have to know how to use DEBUG, because
Absolute Assembly will do all the dirty work for you. You can use your
favorite text editor when creating your assembly routines. Just make sure
the code is saved in a file of the standard ASCII .TXT format. When you want
the code to be fed into QB, you just start Absolute Assembly. You will be
asked to type in the name of the text file containing the asm source, the
QB program file to put the code in, and the name of a code string. Make sure
you have saved the BAS file in standard ASCII format (this only applies to
QuickBASIC users). The code string is a string variable name that is going to
be used in the QB program to access the code. If you're making an asm routine
to draw pixels on the screen, you should call it drawpixel$ or something like
that. You don't have to type in the $ sign in Absolute Assembly. Next, you
will be asked to answer yes or no to two questions. The first one asks you if
you want to append the code to the basic source file. If you answer no, your
BAS file will be cleared before the code is written to it. If you answer yes,
the ASM code will end up at the bottom of the file without erasing its old
contents. The other question is if you want to add CALL ABSOLUTE lines to the
program. This is a little hard to explain right now. I'll get back to it soon.

Our first assembly routine:
It's time to try writing an asm routine for QB. The first test routine won't
do anything, it's just a test.
First, start your favorite text editor and type in the following lines:

PUSH BP
MOV BP, SP
POP BP
RETF

Now you wonder what this means, but don't worry! I will explain it to you.
The first line is a new assembly instruction for you: PUSH. This introduces a
new part of assembly programming. PUSH is an instruction used to put values
on the stack. What is the stack then? Well, it's a part of the memory that
can be used to store temporary values. You will often have the need to keep
track of more numbers than there are registers, and the most convenient way
to go then is to use the stack. The stack is a place where you can shuffle
away values until you want to use them. PUSH is the instruction you use to
copy a value to the stack. When you want to get the value back, you use the
opposite of PUSH, an instruction called POP. You can see that POP is used on
the third line of the program. There's a special register that keeps track
of where in the memory the stack is located. It is called the Stack Pointer,
or SP. When you use PUSH, SP is first changed so that it points at a new
memory position in the stack, and then the value is written to the memory
address in SP. This works a little strangely. You can think of the stack as a
stack of plates. When you use PUSH, it's like putting a new plate on the
stack. When you use POP, you remove it, revealing the plate underneath. So you
must keep track of the order of the values you put on the stack. Consider this
example:

PUSH AX
PUSH BX
POP AX
POP BX

First, AX is copied to the stack, and then BX. When the first POP instruction
is called, the value that was last put on the stack, the BX value, is returned
to AX. When the second POP is called, the first value of AX gets into BX. So
this example will actually swap the values in AX and BX, using the stack. If
you want the values to get back in the right order, you must POP them in the
opposite order:

PUSH AX
PUSH BX
POP BX
POP AX

This would correctly return the values to their original registers. This is a
technique called LIFO, and it stands for Last In, First Out. Get it?
Of course, there's no reason to PUSH and POP values like in the example above,
but if you needed AX and BX for other things between the PUSH and POP calls,
you would find it very useful.
There's a strange thing about the stack that you need to know also. The image
of a stack of plates is not entirely correct. The value in the SP register is
not increased after each PUSH, it's actually decreased! So it would be more
like a stack of plates turned upside down, even though earthly physics
obviously wouldn't accept stacks of plates hanging upside down from the
ceiling. :-)
This is not something you need to care about now though.
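
The idea of saving registers on the stack while they are used for something
else can be sketched like this. The values 1 and 2 are arbitrary placeholders
for whatever work the routine would really do, and the notes after the
semicolons are for the reader only.

PUSH AX        ; save AX; SP decreases by 2
PUSH BX        ; save BX; SP decreases by 2 again
MOV AX, 1      ; AX and BX are now free for other work
MOV BX, 2
POP BX         ; last in, first out: BX comes back first
POP AX         ; AX is restored and SP is back where it started

Each PUSH and POP here moves 16 bits, which is why SP changes by 2 each time.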

Now let's return to the test program. As you can see, the PUSH instruction
uses a register called BP. This is another new thing for you. BP is short for
Base Pointer, and it is a value that points to the base of the stack. So in
the plate example, it would point at the plate in the bottom of the stack,
(if you ignore the upside down-thing for a while). It's essential that this
value stays the same before and after the call to the asm routine, because
QB uses it too. So we always PUSH it at the beginning of the routine, and
then POP it back at the end.
Now we have come to the second line. By now you should understand what it
does: It puts the value of the SP register in BP. Now the computer will think
that the bottom of the stack is at what's actually the top of the stack. In
the next part of this tutorial series I will explain why we do this.
After the first two lines, we would be free to write anything, but as we
don't want the routine to do anything yet, we'll just POP the old value of BP
back and return to QB. The last line with an instruction called RETF, which
stands for Return Far, will make sure we get back to the BASIC program.
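
Once this skeleton is in place, the memory-reading example from earlier fits
between the first two lines and the final POP BP. The following is only a
sketch of where a routine body goes: it saves DS and SI on the stack before
changing them, on the assumption that QB expects both to be unchanged when the
routine returns, and it does nothing visible with the byte it reads. The notes
after the semicolons are for the reader and may need to be left out of your
source file.

PUSH BP
MOV BP, SP
PUSH DS        ; save the registers the body is going to change
PUSH SI
MOV BX, 1
MOV DS, BX     ; DS = 1
MOV SI, 2      ; DS * 16 + SI = 18
MOV AL, [SI]   ; read the byte at position 18 into AL
POP SI         ; restore in the opposite order of the PUSH lines
POP DS
POP BP
RETF

Note how the POP instructions mirror the PUSH instructions, so that every
register gets its own value back before RETF returns to the BASIC program.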

Let's try this out:
- First, type the four lines into a text file and save it. Let's call it
  ASMTEST.TXT!
- Then: Run Absolute Assembly.
- First you will be asked to type in the name of the asm sourcefile. Type
  ASMTEST.TXT. Usually, assembly source files have the extension .ASM, but
  it doesn't matter what name you use.
- Then you type in the name of the BASIC sourcefile. Since we don't have one,
  just type ASMTEST.BAS, and this file will be created.
- Then you must type in the name of the code string. Call it test.
- The program now asks if you want to append the asm code to the BASIC file.
  Since the BAS file is empty, press the N key for no.
- Finally the program asks if you want to add CALL ABSOLUTE lines. Press Y.
- Now DEBUG should be executed by the program, and it will show you what is
  happening. If everything was done correctly, the program will ask you if
  you want to convert another file. Press N and exit the program.
- Now open the file ASMTEST.BAS in QB. You will see this:

' ------ Created with Absolute Assembly 2.1 by Petter Holmberg, -97. ------- '

test$ = ""
test$ = test$ + CHR$(&H55)              ' PUSH BP
test$ = test$ + CHR$(&H89) + CHR$(&HE5) ' MOV BP,SP
test$ = test$ + CHR$(&H5D)              ' POP BP
test$ = test$ + CHR$(&HCB)              ' RETF

offset% = SADD(test$)
DEF SEG = VARSEG(test$)
CALL ABSOLUTE(offset%)
DEF SEG

' ------ Created with Absolute Assembly 2.1 by Petter Holmberg, -97. ------- '

As you see, there's now a string variable called test$. For each line, numbers
(converted to ASCII codes) are added to the string. On the right, you can see
the assembly instructions you typed in earlier as comments. For each line, one
assembly instruction, or more correctly, one machine language equivalent to
an assembly instruction, is added to the string.
Because we answered yes to the question if we wanted to add CALL ABSOLUTE
lines to the BAS file, there are also four other lines under the test$
declaration. The offset% variable gets the offset address of the string test$,
and the DEF SEG instruction makes sure the default segment is the segment of
the test$ string. (DEF SEG in QB is almost the same as typing MOV DS, BX in
assembler.)
And now comes the CALL ABSOLUTE call. This line will execute the code located
at the start of the test$ string. As this test program doesn't really do
anything, you won't see anything happening. Finally, the second DEF SEG
resets the default QB segment. Run the program and make sure it actually
works! It won't do anything, but just the fact that it didn't crash is
enough to make an assembly programmer happy!

This is the end of the second part of my assembly tutorial. I've tried to go
slowly in the beginning so that you would understand everything, but now the
vital basics of assembler should be crystal clear to you. Next time we
can start the fun! I will teach you more assembler instructions, and we will
start writing programs that can do something useful, such as manipulating
BASIC variables and returning the answer. As last time, make sure you
understand everything I explained in this part, and I'll see you in January!

Have a Merry Christmas and a Happy new Year!

Petter Holmberg