


Assembly language programming tutorial part 1: Getting started
By Petter Holmberg of Enhanced Creations
Hello everyone!
This article is written for all of you who want to learn how to program in
assembler in order to enhance your QuickBASIC programs. I know this is a
dream for many QB programmers, but they feel it's too complicated to learn,
and they haven't found any good sources of information to get started with.
If you are one of these programmers, this article is written for you. You
will find that it's not easy to learn assembly language programming, but you
will probably also find that it's much easier than you first thought. This
article will not delve too deeply into assembly language programming, but it
will give you a solid start to work on.
So what is assembler then?
The early ancestors of today's computers, developed between about 1940 and
1960, were a real pain to program. The circuits in these computers
could perform simple arithmetic operations, they could take data as input,
write data as output, and do other operations needed to solve problems for
the people that had built them. In order to make the computers understand
what they should do, they needed to be fed with instructions. These
instructions were given to the computers as series of codes. Let's say the
number 1 was the code for adding, the number 2 was the code for subtracting,
and the number 3 was the code for outputting the result. The programmers
would figure out a program and input it into the computer by turning switches
or by punching holes in paper cards and feeding them to the computer. If the
program didn't work, the programmers had to go through each instruction again,
see where the error was, and then reprogram the computer. Not very
convenient, especially as the programs were all written as a series of ones
and zeroes. In order to make programming easier, they started writing the
programs in hexadecimal numbers instead of binary numbers. That changed 4
binary digits into one hexadecimal digit, making the programs shorter and
easier to read. But the programs were still just a sequence of numbers, hard
to remember and understand for any programmer. So someone had the great idea
to write the instructions as short words that could be translated directly
into numbers and fed to the computer. So instead of
saying 1 for an addition, the programmers could say "add", and instead of
2 for subtraction, they could say "sub". Now you could see more clearly what
the program did, and finding errors was not as hard anymore. The assembly
language was invented.
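To make the idea concrete, here is a minimal sketch of such a translation,
written in Python rather than in QuickBASIC or assembler. It uses the made-up
codes from the example above (1 for adding, 2 for subtracting, 3 for
outputting the result). The mnemonic "out" and the function name assemble are
my own assumptions for illustration, and no real processor uses these codes.

```python
# Hypothetical instruction codes from the example in the text:
# 1 = add, 2 = subtract, 3 = output the result. "out" is an invented mnemonic.
OPCODES = {"add": 1, "sub": 2, "out": 3}

def assemble(source):
    """Translate a list of mnemonics into their numeric codes."""
    return [OPCODES[word.lower()] for word in source]

program = ["add", "sub", "out"]
codes = assemble(program)
print(codes)                               # [1, 2, 3]
print([format(c, "04b") for c in codes])   # ['0001', '0010', '0011']
```

The second line of output shows the same program as the ones and zeroes the
early programmers had to write by hand.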
Later on, computer engineers found out that you could make programming a lot
easier if you replaced long sequences of assembly instructions with codes much
more like human language. These were called high-level programming languages,
and BASIC was one of the early ones.
Today's microprocessors still perform their duties as a series of simple
instructions, such as "add" and "sub", but programming languages like BASIC
make sure that we usually don't have to worry about it.
Why do I need to learn assembler?
There are many reasons to use a high-level language like BASIC instead of
assembler. A simple instruction such as PRINT could be more than 100 lines of
code in assembler. It is therefore pretty obvious that BASIC programs are
easier to write and debug, and you don't have to worry about what the
processor actually does when it writes a letter on the screen. It just works.
Another reason to use high-level languages is that you could easily convert
your BASIC program on your PC to work on an Amiga computer, using an Amiga
BASIC compiler. If you had written your program in assembler, you would find
that the Amiga wouldn't understand it, because its CPU doesn't work like a PC
processor.
There are still reasons to use assembler instead of a high-level language.
QuickBASIC cannot do everything. There are sometimes things you want to do
with the computer that no BASIC instruction can do, and you often find that
your BASIC program needs to do so many calculations that it gets slow. The
problem is that an instruction such as PRINT takes many possibilities into
account. It makes sure you have a valid string to print, it checks what
screen mode you use and what color you want to print the text in, and so on.
Usually you know all these details when you want to print the text, and you
don't need the processor to perform all these checks. The only way to remove
them is to use assembler code instead of PRINT. There's usually no point in
writing a full program in assembler, though. Only use it when you need to do
something really fast or something really low-level.
What do I need to know?
When you write a BASIC program, you don't really need to know much about how
the computer works. In assembler you work with the computer on its own level,
and therefore you need to know what you're actually doing. You don't need to
know very much to get started though, and you will learn the rest as you're
learning assembler.
The first thing that you will find useful to know is how to count in the
binary and hexadecimal systems instead of the decimal. This is pretty easy to
learn.
Usually we count in the decimal system. We then have 10 digits, ranging from
0 to 9. The lowest digit we can use is 0, and as we count upwards we use
the digits 1, 2, 3, 4, 5, 6, 7, 8 and 9. That's all the digits we have, so
in order to continue we need to use two digits. We reset the 9 to 0, and
put a 1 to the left of it. The first digit is now worth 10 times the second
one. We can now use all combinations of digits up to 99, and then we need to
reset them and add a third digit. This means that the number 1234 can be
expressed as 1*10^3 + 2*10^2 + 3*10^1 + 4*10^0. See the pattern?
What if you didn't have 10 digits to play with? Well, it works just as well
anyway. The binary system, on which computer technology is based, has only 2
possible digits, 0 and 1. You start counting from 0, and when you reach 1 you
have used all of your digits and need to add a second one, and you get the
number 10. Each new digit is worth 2 times the digit to its right. The binary
number 10110 can thus be expressed as 1*2^4 + 0*2^3 + 1*2^2 + 1*2^1 + 0*2^0,
or 22. The hexadecimal system works with 16 different digits. Since we have
only invented 10 symbols for digits, we use letters to represent the higher
ones. The hexadecimal system therefore uses the digits 0, 1, 2, 3, 4, 5, 6,
7, 8, 9, A, B, C, D, E and F. The hexadecimal number F3 can therefore be
expressed as 15*16^1 + 3*16^0, or 243. It's easier to understand if we put the
three systems in a table for comparison:
Decimal         Hexadecimal     Binary
0               0               0
1               1               1
2               2               10
3               3               11
4               4               100
5               5               101
6               6               110
7               7               111
8               8               1000
9               9               1001
10              A               1010
11              B               1011
12              C               1100
13              D               1101
14              E               1110
15              F               1111
16              10              10000
As you can see, the number F in hexadecimal is the same as the number 1111 in
binary, and this shows why the hexadecimal system is often used in assembly
language programming instead of the decimal: each hexadecimal digit stands for
exactly four binary digits. If you want the binary number 1111000011110000,
you can split it into the groups 1111 0000 1111 0000 and write it in
hexadecimal as F0F0. As you can see, it's easy to convert binary numbers to
hexadecimal and hexadecimal numbers to binary.
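The conversion between binary and hexadecimal can be sketched in a few lines
of code. This is only an illustration written in Python, not QuickBASIC, and
the function names binary_to_hex and hex_to_binary are my own. It relies on
the fact that every hexadecimal digit corresponds to exactly four binary
digits.

```python
def binary_to_hex(bits):
    """Convert a binary string to hexadecimal, four bits at a time."""
    # Pad with zeroes on the left so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)

def hex_to_binary(hex_digits):
    """Convert a hexadecimal string to binary, four bits per digit."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)

print(binary_to_hex("1111000011110000"))   # F0F0
print(hex_to_binary("F0F0"))               # 1111000011110000
```

Doing the same thing with decimal numbers would require real arithmetic,
because a decimal digit does not match a whole number of binary digits.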
The number of different digits you can use is called the base of the counting
system. You can use any whole number from 2 and up as a base. If your number
in any counting system is, say, 3 digits long, it can be expressed as:
a*base^2 + b*base^1 + c*base^0, where a, b, and c are your three digits. The
most important thing when using different systems simultaneously is to keep
track of what system you use for a certain number. For example, is the number
10 the usual decimal 10, or the binary version of the decimal number 2?
If you still haven't understood this, read it again until you do, or ask
someone who understands it to explain it to you. It's very useful to know
about this when you program in assembler.
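If you prefer to see the rule as code, here is a small sketch of the formula
digit*base^position, again in Python rather than QuickBASIC. The names
SYMBOLS and to_decimal are my own choices for this illustration. It works for
any base from 2 to 16 and reproduces the examples used earlier in this
article.

```python
SYMBOLS = "0123456789ABCDEF"

def to_decimal(digits, base):
    """Add up digit * base^position, starting from the rightmost digit."""
    total = 0
    for position, symbol in enumerate(reversed(digits.upper())):
        value = SYMBOLS.index(symbol)
        if value >= base:
            raise ValueError(f"{symbol} is not a digit in base {base}")
        total += value * base ** position
    return total

print(to_decimal("10110", 2))   # 22
print(to_decimal("F3", 16))     # 243
print(to_decimal("1234", 10))   # 1234

# The same two symbols mean different things in different bases:
for base in (2, 10, 16):
    print(base, to_decimal("10", base))   # prints 2 2, then 10 10, then 16 16
```

The last loop shows why you always have to know which system a number is
written in: "10" is worth exactly the base itself in every system.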
The second thing that is necessary to know when programming in assembler is
the PC memory architecture. I'm not going to explain this in detail, because
it's a complicated issue.
A PC has 640 kilobytes of basic memory, and additional megabytes in special
memory circuits that you can insert into the computer yourself. The terms
EMS and XMS refer to this extra memory. That is not the memory I'm going to
talk about here. The interesting thing is the basic 640 kilobytes that
every PC has. If you are going to be an assembly programmer, you need to know
how to find a certain position in this memory in order to use it.
Each position in the memory has an address, a number telling the computer
where to read or write data. It would have been easy if this address had
just been a number from 0 to 640k, but that's not the system used. A
memory position is described by two numbers, called the segment address and
the offset address. The actual memory position is a combination of the segment
and the offset address.
The segment address describes the memory in steps of 16 bytes. The first byte
in the memory, byte 0 if you like, has the segment address 0. The segment
address 1 points at byte 16 in memory, and the segment address 2 at byte 32.
The offset address is a number telling you how far from the
segment position in memory the byte you want is. So if you want to access byte
3 in memory, you use the segment address 0, and the offset address 3. Together
they form a number pointing at an exact memory position. Written as a formula
this can be expressed as: actual memory address = segment*16 + offset. If you
want to access byte 20 in the memory, you use the segment 1, giving you the
position 16, and the offset 4, adding 4 bytes to the position, for the final
number 20. But you can also use a segment address of 0, and the offset 20,
giving you the same memory position! The segment and the offset address
numbers can both range from 0 to 65535, giving you several possible
combinations when you want to use a certain memory position. This system makes
memory addressing a little complicated for beginners to understand. You can
see what segment and offset a certain BASIC variable is located at by using
the functions VARSEG and VARPTR. Try it!
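The formula actual memory address = segment*16 + offset is easy to try out in
code. The following is a sketch in Python, not QuickBASIC, and the function
names linear_address and pairs_for are my own inventions. It only does the
arithmetic and does not touch any real memory.

```python
def linear_address(segment, offset):
    """Combine a segment and an offset into one memory position."""
    return segment * 16 + offset

def pairs_for(address):
    """List every segment/offset pair (both 0 to 65535) for one position."""
    pairs = []
    for segment in range(65536):
        offset = address - segment * 16
        if 0 <= offset <= 65535:
            pairs.append((segment, offset))
    return pairs

print(linear_address(0, 3))    # 3
print(linear_address(1, 4))    # 20
print(linear_address(0, 20))   # 20
print(pairs_for(20))           # [(0, 20), (1, 4)]
```

Byte 20 can only be reached in two ways because it lies so close to the start
of the memory. Positions further up can be reached through many more pairs.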
Now you might be wondering how it is possible for both the segment and offset
addresses to be 65535. That gives you the biggest possible memory position of:
65535 * 16 + 65535 = 1114095, which is bigger than 640k. Well, this memory
certainly exists, but it is not accessible in the same way as the first 640k
of memory, and I'm not going to delve deeper into this here and now. Later on,
I will discuss memory access in more detail.
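The numbers in this comparison can be checked with a few lines of arithmetic.
This is just a Python sketch of the calculation, assuming that one kilobyte is
1024 bytes. The variable names are my own.

```python
SEGMENT_MAX = 65535
OFFSET_MAX = 65535

largest = SEGMENT_MAX * 16 + OFFSET_MAX   # biggest position the formula gives
basic_memory = 640 * 1024                 # 640 kilobytes expressed in bytes

print(largest)                  # 1114095
print(format(largest, "X"))     # 10FFEF
print(basic_memory)             # 655360
print(largest > basic_memory)   # True
```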
Again, if you didn't understand this, read it again, and if that didn't help,
ask someone to explain it to you.
This was all for the first part of this article: A very brief introduction to
what's about to come. Next time I will start describing the basics of
assembler and how you use it in QuickBASIC. Make sure you understand the
different numbering systems and the memory addressing scheme by then.
Bye for now!
(Editor's Note: Petter's asm series will continue in Issue 5. Check it out!)
This tutorial originally appeared in QBasic: The Magazine Issue 4.