QuickBasic Graphics Programming in Assembly language
By CGI Joe of Shimmer
[V1.1 15th June 2000]
Revised 11th Dec 2000
Ever used DirectQB, Future.Lib or Blast? They provide ways of using cool, fast
graphics in QBasic. I bet you've been limited by them in some way, though.
Like you wanted a cool routine to draw multiple parallax backgrounds, and the
library didn't have such a routine? Want to have the power to do anything
you want, being limited only by your imagination (and the lack of a damn fine
tutorial ;)?
Other people's libraries are great, but they don't teach you anything.
Let's learn Assembly Language!
DISCLAIMER:
"I do not accept responsibility for any effects, adverse or otherwise,
that this code may have on you, your computer, your sanity, your dog,
and anything else that you can think of. Use it at your own risk."
Blah etc. Shame we live in a world where that sort of thing
is required.
(Taken from the DJGPP Allegro docs)
You've really got to *want* to learn this stuff cos, er...it involves
a lot of theory and there's a lot to learn. I've tried to keep everything
down to a minimum. But remember, you don't have to memorise *everything*
here. Just make sure you understand a few concepts and use it as a reference
as and when you need it.
I'll be covering the following topics in this order:
1. Introduction
- How do I use assembly language in QBasic?
Well, there are two quite different ways. The first method was used by the
Blast! library by Andrew L. Ayers. It involved using CALL ABSOLUTE, a
statement which allowed programmers to use Machine Code in QBasic (Machine
Code/Language is just a series of 'Opcodes', which are just assembled assembly
language). It worked like this: you stored all of your Machine code into
memory, and then passed the address of this code to CALL ABSOLUTE. The
problem with this was you had to have an assembler capable of generating
pure machine code instead of a standard format like .OBJ and the code was
difficult to maintain in QBasic (you had to set up memory and have big ugly
Hex strings which held the code). Ok, so now you're ready to chuck the first
method. Excellent.
The second method is far more structured:
- You have all of your assembly code in a separate .ASM file
- You assemble the file using TASM or MASM or whatever. This
produces an .OBJ file.
- You build a Library (.LIB) file using the given OBJ file
- You link your QBasic program in with the library.
This is much cleaner and easier to maintain. The disadvantage is that
you must have QuickBasic Version 4.5 because this version is capable
of using libraries. There are other versions of QB like 4.0 and 7.1 but 4.5
is the most widely available. You also need an assembler capable of producing
.OBJ files. All the steps here are described in detail elsewhere in this
document.
You can find all the tools you need, including QuickBasic 4.5, on our site.
1.1 Architecture of the PC
Let me explain how all the parts interact -
RAM and High Memory
RAM, under DOS, is split into parts, I've shown two in the diagram. The
first, larger, part is called Conventional or Base memory. This is where
your EXE program is stored when you run it in DOS. It contains all of
your Code and Data. The position of all this stuff is determined by its
size. Also, your Code doesn't have to be in the same 'place' in memory as
your Data. There could be gaps of unused space in between. This is all
decided and handled by the Operating System, DOS, and shouldn't really
concern you at all. Items that are found in Base memory include:
- Your Code and Data
- Parts of the Operating System (DOS)
- Interrupt tables and Interrupt routines
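The smaller part is called the Upper Memory Area (UMA), often loosely
called high memory. (Strictly speaking, the 'High Memory Area' or HMA is
the first 64k just above the 1MB mark.) Upper memory contains some
hardware-specific stuff; the SCREEN 13 graphics area is located here, for
example. You can't just store any Code or Data you like here. Other items
that are found in upper memory include: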
- System drivers (network, mouse, sound, TSR's)
- All VGA card data for Text & Graphics modes
- Parts of DOS can be found here
Hardware I/O (Input/Output)
You can use the CPU to control Hardware ports. What are these ports,
exactly? Well, it's how you change the colour palette in Graphics
mode. You can also manipulate other things like the keyboard. It's
basically a way of reading from and writing to the hardware. You may
have used the OUT and INP statements in QB; this is what they do.
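As a rough sketch of what OUT looks like from assembly, the fragment below
changes one palette entry through the VGA's colour ports: 3C8h takes the
colour number and 3C9h then takes the red, green and blue values (0-63 each).
Picking colour 1 and making it bright red is just my choice for the example.

```asm
        mov  dx, 3C8h       ; VGA palette "which colour?" port
        mov  al, 1          ; we want to change colour number 1
        out  dx, al         ; same idea as OUT &H3C8, 1 in QB
        inc  dx             ; DX = 3C9h, the palette data port
        mov  al, 63         ; red   = 63 (the maximum)
        out  dx, al
        xor  al, al         ; AL = 0
        out  dx, al         ; green = 0
        out  dx, al         ; blue  = 0
```

The port number goes in DX and the value in AL. INP works the other way
round with the IN instruction (IN AL, DX).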
CPU
You will not *believe* the amount of work the CPU is continuously doing!
It has to handle things like:
- Redrawing and displaying the screen
- Handling thousands of interrupts a second
- Fetching your code from RAM, decoding and executing it
- Taking inputs such as keypresses and converting the data
it returns.
- Loads of other stuff I don't even *know* about!
Everything you access on your PC goes through the CPU.
If you are confused by interrupts, don't worry. I explain them all
later.
1.2 Alternative number systems
Our number system is denary or BASE 10. It gets its name simply because there
are ten digits, 0 -> 9.
There are two other important number systems you need to know about:
BASE 16 - Hexadecimal
BASE 2 - Binary
The sixteen Hexadecimal digits are:
0 1 2 3 4 5 6 7 8 9 A B C D E F
where A=10, B=11, C=12, D=13, E=14 and F=15
Example: 50 in Hex is actually 80 in denary and 2E is actually 46.
So how do you tell the difference between a Hex number and a normal number?
Well, there's usually an 'H' floating around somewhere. Like, in QBasic, a
Hex number begins with '&H', in assembly language it ends in an 'h'.
It's useful for representing addresses because memory goes up in units
of 16 so it's easier to work with.
Binary of course only has 2 digits. 0 and 1. Here's a binary number:
0 0 0 0 1 1 0 1
It represents the number 13. How do you work it out? Like this:
128 64 32 16 8 4 2 1
-------------------------------
0 0 0 0 1 1 0 1
Each binary digit stands for a number which is a power of two. So you add
up the values of all the digits which are '1'. In this case 8, 4 and 1.
8 + 4 + 1 = 13
That was an 8-bit number, or a byte, and the largest number you can hold with
8-bits is 255. Which brings us onto our next topic...
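Here's a small sketch of how the three number systems look when you type
them into an ASM file (TASM and MASM both accept these forms). The values are
the ones used above; the registers are only there to give the numbers
somewhere to go.

```asm
        mov  al, 13          ; denary 13
        mov  al, 0Dh         ; exactly the same value, written in Hex
        mov  al, 00001101b   ; and again in Binary (note the 'b' on the end)
        mov  ax, 50h         ; 50 Hex = 80 denary
        mov  ax, 2Eh         ; 2E Hex = 46 denary
        mov  ax, 0A000h      ; a Hex number that starts with a letter needs
                             ; a 0 in front, or the assembler thinks it's a name
```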
1.3 Data types
Pretty smooth, huh?
Each data type, apart from the Bit, can be Signed or Unsigned. Having an
Unsigned number doubles its maximum value. Observe:
BIT - Smallest form of data on a PC.
BYTE - 8 Bits
- Unsigned: Largest=255, Smallest=0
- Signed : Largest=127, Smallest= -128
WORD - 16 Bits (2 Bytes)
- Unsigned: Largest=65535, Smallest=0
- Signed : Largest=32767, Smallest= -32768
DOUBLE WORD - 32 Bits (2 Words or 4 Bytes)
- Unsigned: Largest=4,294,967,295 (2^32 - 1, about 4 billion), Smallest=0
- Signed : Largest=2,147,483,647, Smallest= -2,147,483,648
2. Addressing memory
Base or Conventional memory is split up into parts called Segments.
Segments are of variable size; the smallest being 16 bytes and the
largest being 65536 bytes (64k).
When you allocate memory, you are given a unique address in Base memory
so you can access it. This address is made up of two parts:
A SEGMENT address and an OFFSET in that segment.
Say we were given an address of a byte of data and the Segment of the data
was A000h and the Offset was 0005h. This means that the byte of data we want
is 5 bytes from A000h or, 5 bytes from the beginning of the segment.
Addresses are commonly written like this:
Segment:Offset
Example: A000:0005
That's just about all you need to know about segments for now. Find out
how to access them next...
3. The CPU
One of Assembly language's selling points is the fact that you can control
just about everything on your PC without some interfering High Level Language
getting in your way (are you listening Microsoft?). But (if you were paying
attention earlier) you soon realise that you must base your code entirely on
the CPU. So where do we begin? I guess the guts of the CPU would be a good
place.
The CPU contains several small banks of memory called 'Registers'. When I
say 'small' I mean they each hold 16 Bits of Data. Ok, that was a lie. Your
PC no doubt has 32-Bit Registers (if it's a 386+) but for now, we'll pretend
they're 16-Bit to make things easier.
These Registers are very fast to access. They are used to store numbers which
can be added, subtracted, multiplied and divided by other registers or just
immediate numbers. You can treat them like variables in QBasic.
Some Registers exist for a special purpose. Others can be used
for anything and are called General Purpose Registers. They are:
AX - Accumulator. This is commonly used for returning values from FUNCTIONs
BX - Base. No special reason but can be used to index memory
CX - Counter. This is used for um, counting
DX - Data. Just a register
I think Intel just liked the whole ABCD thing...
Here's another 6 registers. These all have a primary reason for existing.
They are used to address and access memory. They are Segment Registers:
CS - Code segment. This points to your CODE
DS - Data segment. This points to your DATA
ES - Extra segment. This can point to anything you want
SS - Stack segment. This points to the STACK (more later)
FS - These two are only really required for use in Protected mode
GS - but they can be used for anything, like ES. They're available on
a 386 processor or above, only.
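What's all this pointing business about? "Pointing" means that the register
holds the Segment part of the address of something in RAM. In real mode the
CPU multiplies the Segment by 16 and adds the Offset to get the actual
physical address. For example, if your Data was stored in RAM starting at
physical position 1060h, DS would equal 106h. It's that easy.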
You can't add values to Segment registers and you can't move immediate
values into them. You must use a General register first then set the
Segment register = to the General register.
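A minimal sketch of that two-step dance, using the A000:0005 address from
section 2 as the example. Reading a byte back into AL is just something to do
with the address once it's set up.

```asm
        ; mov es, 0A000h    ; NOT allowed: immediate value into a Segment register
        mov  ax, 0A000h     ; so put the number in a General register first...
        mov  es, ax         ; ...then copy it across. ES = A000h
        mov  di, 0005h      ; DI = the Offset inside that segment
        mov  al, es:[di]    ; AL = the byte at A000:0005
                            ; (physical address A000h * 16 + 5 = A0005h)
```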
Some more registers
IP - Instruction Pointer. Is used with CS
SP - Stack Pointer. Yes, used with SS
BP - Base Pointer. Used to REFERENCE the stack (SS).
SI - Used with DS
DI - Used with ES
Flags - Wow! What's this, eh? It's another 16-bit register in which each
bit stands for something important, instead of a number. Er, that's
a bit vague, I know. But it's not really vital stuff at the moment.
Do you remember the Segment:Offset stuff from earlier?
Well, IP, SP, SI, DI are all used as OFFSET registers and are paired
with Segment registers, like this:
CS:IP - I wouldn't advise playing around with these!
SS:SP - Or these!
ES:DI
DS:SI
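These aren't strict rules. You can use ES:SI if you want. But unless
you explicitly tell the CPU you want ES (by writing ES:[SI]), an address in
SI is assumed to be in DS. The destination of the string instructions
(like STOSB, coming up later) is always ES:DI.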
There's something useful about the 4 general registers AX, BX, CX and DX.
They can each be split into two parts. Each part is 8-Bit (a byte) and these
can be treated as separate registers. A High byte and a Low byte.
AX - AH, AL
BX - BH, BL
CX - CH, CL
DX - DH, DL
This is really useful because you can easily run out of registers when
writing ASM procedures.
So when you change something in AH you're also changing the overall value
of AX:
       AH                  AL
0 0 0 0 0 1 1 1     1 0 1 0 0 0 0 1
Here, the value of AL = 1+32+128 = 161
AH = 1+2+4 = 7
and...
AX = 1+32+128+256+512+1024 = 1953
Get it? Oh, c'mon it's easy...
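If it helps, here are the same numbers as a short code sketch. The only
thing added is the last line, which shows AX changing when only AH is
touched.

```asm
        mov  ah, 7          ; AH = 00000111b
        mov  al, 161        ; AL = 10100001b
                            ; AX is now 07A1h = 7 * 256 + 161 = 1953
        inc  ah             ; AH = 8, AL is untouched...
                            ; ...but AX is now 08A1h = 8 * 256 + 161 = 2209
```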
The other parts of the CPU are:
ALU - Arithmetic Logic Unit does all the adding and subtracting and stuff.
We don't care about this, and we can't touch it.
FPU - Floating Point Unit. At least 80% of modern computers have these. It
handles all of our Real numbers. You can access it directly, but you need
to know a few special opcodes. You'll probably never need to use it in
assembly language programming.
Cache - This is a small temporary store like RAM but a zillion times faster.
It's used for storing program code, like a loop. We can't access this
thing either. Good optimization would involve making sure an inner loop
would fit neatly into the cache.
Misc Memory - The CPU has its own little piece of internal memory which it
uses for various things that we need not concern ourselves with. We
can't access it.
I put those four extra bits in for the sake of completeness. Just in case
somebody tries to pick holes in my tute. Grrr.
4. Some Assembly instructions/mnemonics
A mnemonic, (pronounced new-monic) is something that helps you to remember
something else. An assembly mnemonic is a short, usually 3-letter, version of
the operation it performs. Here's a few:
DEC - decrements stuff (subtracts 1)
In assembly language, typically 2 'Operands' follow the instruction/mnemonic.
They are just the two items you want to perform some operation on. Some
instructions require only 1 operand, some none, some want three. All operands
usually have to be the same size. The first operand (the one on the left) is
the item that is going to be 'operated' on. Let's take a look:
MOV ah, al ; moves the contents of AL into AH
SUB cx, bx ; subtracts BX from CX
ADD ah, dl ; adds DL to AH
INC ah ; adds 1 to ah
A ';' is a comment in assembly language, by the way.
A closer look at MOV...
- MOV <Destination>, <Source>
This moves a value, be it a constant, a value of
a register or a memory location into the Destination.
Which can be a register or a memory location. It
can also be a position in the stack.
Example 1: MOV AX, 5
Moves the value 5 into AX. Anything that was
previously in AX will be overwritten. Although 5
is small enough to fit into AL, because AX was the
subject, AH gets overwritten as well.
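A few more forms of MOV, as a sketch of the combinations described above.
The registers and the +6 are picked arbitrarily; this isn't a routine to run
as it stands, just a list of what the assembler will and won't accept.

```asm
        mov  bx, ax         ; register to register: BX = AX
        mov  cl, 200        ; constant into an 8-bit register
        mov  [di], al       ; register to memory: store AL at DS:DI
        mov  dx, [bp+6]     ; memory to register: the word at SS:BP+6 (the stack)
        ; mov ax, bl        ; NOT allowed: the operands are different sizes
        ; mov [si], [di]    ; NOT allowed: memory straight to memory
```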
There's a list of common instructions available elsewhere in this document.
5. Plotting your first Pixel
Forget SCREEN 7, 12, 11, 10, 3, 8. They're all total crap.
The only one you should be interested in is SCREEN 13. Why? Because it's
a good resolution for high-speed graphics (320x200) and you have 256 colours
available.
So how do we plot a pixel? SCREEN 13 Memory is located at the address:
A000:0000
That's Hex, of course. And you should have worked out that memory is stored
in consecutive bytes not a 2D table so...
Say the following is the screen:
<---320--->
11111111111 |
22222222222
33333333333 200
44444444444
55555555555 |
In memory, each line is stored one after the other like this:
1111111111122222222222333333333334444444444455555555555
(We'll just pretend there's 320 in each line, there).
So to calculate the OFFSET in the A000 segment, we just multiply
the pixel's Y position by 320 and add the X:
offset = Y * 320 + X
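For example, the pixel in the middle of the screen, X=160 and Y=100, lives at
offset 100 * 320 + 160 = 32160. Done the obvious way with the MUL
instruction, that looks something like the sketch below (the fixed numbers
are just for illustration).

```asm
        mov  ax, 100        ; AX = Y
        mov  bx, 320
        mul  bx             ; DX:AX = AX * BX, so AX = 32000 (and DX = 0)
        add  ax, 160        ; AX = Y * 320 + X = 32160
        mov  di, ax         ; DI = the Offset of the pixel
```

Note that MUL always uses AX and trashes DX as well, which is another reason
to look for something better.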
There's one thing I have to mention at this point; Multiplying numbers is
S L O W! But we want a fast graphics lib, how do we get around this?
Answer: Use shifts.
5.1 Binary Shifts
Binary shifts allow us to Divide and Multiply numbers by powers of 2.
This isn't as limiting as it sounds, as the whole PC theory is built on
powers of 2.
Take a binary number: 0 0 0 0 1 1 0 1 = 13
Shift it Left once : 0 0 0 1 1 0 1 0 = 26
Shift it Left again : 0 0 1 1 0 1 0 0 = 52
And this is super quick. So, Shifting left 8 places is the same
as multiplying by 256. ok?
Shifting Right is the same concept, but if you have an odd number (odd numbers
always have the lowest bit set) then you lose some accuracy. 53 shifted
right once = 26. But, who cares?
The shift mnemonics are:
SHL <subject>, <numberofshifts>
SHR <subject>, <numberofshifts>
Example: SHL AX, 2 ; AX=AX*4
Example: SHR DX, 1 ; DX=DX/2
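The numbers from above as a runnable sketch. One assumption to flag: shifting
by more than 1 in a single instruction needs a 186 or better, which is fine
here because we tell the assembler '.386' later on. On a plain 8086 you'd put
the count in CL and write SHL AX, CL instead.

```asm
        mov  ax, 13
        shl  ax, 1          ; AX = 26
        shl  ax, 1          ; AX = 52
        shl  ax, 2          ; AX = 208  (52 * 4)
        mov  dx, 53
        shr  dx, 1          ; DX = 26, the odd bit has dropped off the end
```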
5.2 Yup, still plotting this damn pixel...
That shifting stuff *really* comes in handy, it can be used for all sorts
of stuff.
So to get Y * 320:
We split the 320 into powers of 2: 64 and 256 (256+64=320)
So (Y*64) + (Y*256) = Y*320
Some source!
MOV ax, 0A000h ; Screen 13 Segment
MOV es, ax ; Can't set ES directly
MOV dx, Y ; DX = the Y pos
MOV bx, dx ; Copy it to BX
SHL dx, 8 ; DX = DX * 256
SHL bx, 6 ; BX = BX * 64
ADD dx, bx ; DX = DX + BX (DX = Y * 320)
ADD dx, X ; DX = DX + X
MOV di, dx ; DI = DX
MOV al, Colour ; Get the colour
STOSB ; STORE String Byte (AL) at ES:DI
The STOSB is an example of an instruction which has no operands.
We could have used:
MOV es:[di], al
The [] brackets tell the CPU to use the address pointed to by DI.
STOSB stores the byte in AL at position ES:[DI] and INCREMENTS DI
(as long as the CPU's Direction flag is clear, which is the normal state).
You also have STOSW (STOre String Word) which gets the value from AX and
stores a 16-Bit number.
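Because STOSB moves DI along for you, it's handy for drawing a run of pixels.
Here's a sketch of a horizontal line. It uses the REP prefix, which repeats
the instruction CX times, and CLD, which clears the Direction flag so DI
counts upwards. The position (50,100), the length and the colour are made up
for the example.

```asm
        mov  ax, 0A000h         ; Screen 13 Segment
        mov  es, ax
        mov  di, 100*320 + 50   ; start at X=50, Y=100 (worked out by the assembler)
        mov  al, 15             ; colour 15
        mov  cx, 100            ; 100 pixels long
        cld                     ; make sure DI goes upwards
        rep  stosb              ; store AL at ES:DI, 100 times
```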
6. The Stack
Ok, we're just about to start getting down to business, there's just
one more thing you need to know about; the stack. It is through this, your
QB program and your Assembly routine communicate.
The stack is our friend. Remember that. It is basically a temporary storage
area we can use in our programs. At the top of your QBasic program you
will have noticed something like:
DECLARE SUB DrawBox (x1%, y1%, x2%, y2%, col%)
This DECLARE statement is telling QBASIC to put all of the values passed on
to it (inside the brackets) on the STACK so your routine can access them.
Of course you don't have to do any of this yourself; QBasic handles it all
for you.
Anyway, what I have explained above is probably the most complicated use
of the STACK so if you understood that, consider yourself well and truly
done, sir!
Another way we can use the STACK is inside our ASM code. We can PUSH values
onto it and then POP them back off again. The rule is: LAST ON, FIRST OFF.
We use PUSH to place a Word or Dword onto the stack and POP to remove
a Word or Dword (you can't PUSH a single Byte). Study the following example:
MOV AX, 5 ; Move 5 into AX
PUSH AX ; Place the value of AX onto the stack
MOV AX, 0 ; Set AX to zero
POP AX ; Remove the last value on the stack
; and put it into AX. AX now equals 5!
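To see the LAST ON, FIRST OFF rule with more than one value, here's a sketch
that pushes two registers and pops them back in the 'wrong' order, which swaps
them. The values 1 and 2 are arbitrary.

```asm
        mov  ax, 1
        mov  bx, 2
        push ax             ; stack now holds: 1
        push bx             ; stack now holds: 2, 1  (2 is on top)
        pop  ax             ; AX = 2, the last value pushed comes off first
        pop  bx             ; BX = 1
```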
Understand? Good, cos it gets a little bit more complicated.
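The stack has a pointer, SP, which always points at the last value put on.
The stack actually grows downwards in memory: SP is decreased by 2 when you
PUSH a word on and increased by 2 when you POP one off. The positions in the
diagrams below are measured from SP, so the newest value is always at
position 0. I think of the stack as something you PUSH a value UP TOWARDS,
as the diagram shows: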
Values on stack: Position: (Assuming every value is a WORD)
ValueA 6
ValueB 4
ValueC 2
ValueD 0
Now when we PUSH another value on....
Values on stack: Position: (Assuming every value is a WORD)
ValueA 8
ValueB 6
ValueC 4
ValueD 2
New value --> ValueE 0
Each value's position is increased by 2 (assuming that they're all WORDs
which are, as you should know by now, 2 bytes) because they have all been
'shoved' upwards.
And when we POP back off again:
Values on stack: Position: (Assuming every value is a WORD)
ValueA 6
ValueB 4
ValueC 2
ValueD 0
Everything is returned back to normal. Smashing. If you understand all of
this then give yourself a pat on the back and a hot mug of cocoa, well done!
Say we have a pixel plot routine in assembly and we want to call it
from QBasic. We'll do something like,
PixelPlot 160, 100, 15
Plots a colour 15 pixel at 160,100. So how does our ASM routine get at these
values? Answer: They've been pushed onto the Stack like this:
Values on stack: Position:
--> 160 8
--> 100 6
--> 15 4
QBasic Return SEG 2
QBasic Return OFF 0
The Basic Return SEGment and OFFset are pushed on by QBasic last. It's just
a pointer to the next instruction in your QBasic code. You cannot touch
these and they must always be in this position when your ASM routine is
finished, so remember and clear up when you're done using the stack!
Right, we access the stack using the BP register not the SP because SP
changes in the routine and we want a static value! So the first thing we
do is: save the contents of BP and then MOV BP, SP
You can PUSH BP onto the stack if you like, but remember all your variables
will be 2 positions higher! Remember to restore BP before you exit
your routine!!
Some code to get the X position from the Stack:
MOV cx, bp ; Save bp in cx
MOV bp, sp ; Get the stack pointer
MOV dx, [bp+8] ; Store the X value
Notice we can add an index (8) to the BP register?
Also, notice the square brackets, they mean get the value at the ADDRESS
of bp+8.
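Here's the other way of doing it, sketched for the PixelPlot 160, 100, 15 call
above: PUSH BP to save it, which moves every position up by 2, then read all
three values. Which registers the values land in is my own choice.

```asm
        push bp             ; save BP on the stack; everything is now 2 higher
        mov  bp, sp         ; BP = stack pointer
        mov  dx, [bp+10]    ; X      = 160  (was position 8)
        mov  bx, [bp+8]     ; Y      = 100  (was position 6)
        mov  al, [bp+6]     ; Colour = 15   (was position 4), low byte only
        ; ... plot the pixel here ...
        pop  bp             ; restore BP before returning to QB
```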
We are ready.
7. Creating/Editing Libraries
So you have the knowledge and a plethora of ideas...let's do something
with them.
To create and use a library you NEED the following:
- QuickBasic 4.5 Compiler
- An assembler such as TASM or MASM
- A text editor
- A copy of Microsoft Link (it comes with QB 4.5)
- A copy of Microsoft Library manager (also with QB 4.5)
1. Paste, or type the following into a file:
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, basic
.stack 200h
.386
.code
public AddFour
; Our stack:
; Number 6
; QB Seg 4
; QB Off 2
; BP 0
AddFour proc
push bp
mov bp, sp
mov ax, [bp+6]
add ax, 4
pop bp
ret 2
AddFour endp
end
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
2. Save the file as 'ADDFOUR.ASM'
3. Run TASM by typing:
TASM addfour
If there are errors, check you copied the code exactly.
4. Check there is a file called ADDFOUR.OBJ in your directory.
5. Now run Microsoft Library Manager by typing:
LIB
You should see the following:
Microsoft (R) Library Manager Version 3.14
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.
Library name:
Enter 'addfour' as the library name, then you'll see...
Library does not exist. Create? (y/n)
'y' of course, then...
Operations:
This means "What OBJ files?". Type 'addfour.obj'. If you want to use more
than one OBJ then separate each file with a '+'. Next...
List file:
Just type 'null'. A list file is a listing of all the symbols you use
in your library (symbols: variables, procedure names).
You got yourself a library! But wait, you want to use it in QuickBasic.
Then you have to build *another* library (yes, it's stupid, I know)
which is a .QLB file. This is required only if you want to use your routines
while inside QuickBasic.
You need a copy of Microsoft LINK.EXE
Type the following at the DOS prompt:
LINK/QU addfour.lib, addfour.QLB, NULL, BQLB45.LIB
Make sure BQLB45.LIB is available (it's a QB 4.5 support library that needs
to go into your QLB).
Don't get me *started* on how annoying that *stupid* command is!
My advice is to write a small .BAT batch file to do all that stuff for you.
Hey-ho. Library built!
We should have the following files:
ADDFOUR.ASM
ADDFOUR.OBJ
ADDFOUR.LIB
ADDFOUR.QLB
I know, it gets messier, ;)
Let's start QuickBasic 4.5 with the 'l' switch. It tells QB to load our
library:
qb/l addfour
At the top of our empty .BAS file, type the following:
DECLARE FUNCTION AddFour% (BYVAL x%)
It lets QB know how to call our function AddFour. And because it's
a FUNCTION, it has to know what value to return. In this case, it's an
integer (%). The BYVAL part tells QB to pass the value of x% onto the
stack. If you leave it out, the default is to pass the address of the
variable (a near Offset into QB's data segment) onto the stack instead.
Just use BYVAL for passing everything.
Now type the following code:
PRINT AddFour%(10)
Run or compile the program and you should see 14!
OK, not stunning. But it works.
8. Constructing an ASM file
So what was that stuff at the top of the asm file?
.model medium, basic
.stack 200h
.386
.code
These are assembly 'Directives'. Directives are used to tell the
assembler how to construct the format of the output OBJ file.
.model medium, basic
We're calling from BASIC and QB uses a Medium memory model. This is sort of
the format of the EXE file QB produces. It's not important, just make sure
you select MEDIUM all the time. Other models include; TINY, SMALL, LARGE
and HUGE. Look them up sometime.
.stack 200h
Sets up a stack of 512 bytes. You can select any size you want but make
sure you leave enough space for all the return addresses and your variables.
200h is plenty of space.
.386
We're using 386 (32-bit) instructions. You can select 286, 486 and 586 also.
But use 386 for compatibility reasons.
.code
Everything after this point is our file's code.
There is also another: .data This is a Data Segment where you can put
internal variables and tables local to your ASM file, very useful.
public
We use this to make our symbols available to external processes, namely
QBasic. If you don't put this in, QB will moan. Guaranteed.
addfour proc
.
.
.
addfour endp
No prizes for this one. Same as SUB xxx . . . END SUB
TASM also requires you to put an 'end' at the end of your file.
Let's go over the code in the file...
8.1 Review of the code
Just gonna clear up some stuff here.
AddFour proc
push bp ; Save BP
mov bp, sp ; Get the stack pointer
mov ax, [bp+6] ; Get the number from the stack
add ax, 4 ; Add 4 to it
pop bp ; Restore BP
ret 2 ; Return to QB
AddFour endp
The 'ret 2' is used to cleanup the stack and return to QB. The '2' is
used to remove 2 bytes from the top of the stack, which is the number we
pushed on. Use 'ret' on its own if you don't pass anything on.
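So the number after 'ret' is always 2 bytes for every word that QB pushed on.
As a sketch, here's a made-up routine called AddTwo (not part of any library,
I've invented it for this example) that takes two integers, so it needs
'ret 4'. QB pushes the parameters left to right, so the first one ends up
highest on the stack.

```asm
.model medium, basic
.stack 200h
.386
.code

public AddTwo

; Our stack:
;       A       8
;       B       6
;       QB Seg  4
;       QB Off  2
;       BP      0

AddTwo proc
        push bp
        mov  bp, sp
        mov  ax, [bp+8]     ; Get the first number (A)
        add  ax, [bp+6]     ; Add the second number (B)
        pop  bp
        ret  4              ; 2 parameters * 2 bytes each
AddTwo endp

end
```

The matching QB line would be DECLARE FUNCTION AddTwo% (BYVAL a%, BYVAL b%).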
9. So, what are we gonna do?
So we have the knowledge and the technology to do pretty much anything we
like. Sprite-scaling, rotation, translucency, texture-mapping...fantastic.
First we've got to do some basic, foundation coding. If you want to write
some seriously cool games we have to use double-buffering.
9.1 Using double buffering
Double-buffering is a technique used to stop those horrible flickering
graphics you might have witnessed in SCREEN 13.
Y'know how you can PCOPY in SCREEN 7 but you can't do it in SCREEN 13?
Doesn't that make you mad?
Let's remedy the situation. All we have to do is have a chunk of memory
which we can use to draw our graphics onto and then copy it onto SCREEN 13
VGA memory at address A000h at the end of our rendering loop.
Where does this magical 'chunk of memory' come from? Answer: QBasic.
DIM buffer%(31999)
This allocates all the memory we need for a double buffer. It's exactly
64000 bytes. Which is 320*200 and because each pixel is 1 byte that's 64k
To get the address of this array, we use a QB function called VARSEG.
VARSEG(buffer%(0)) returns the Segment of buffer%
Paste, or type the following into a file
and save it as 'OURLIB.ASM':
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, basic
.stack 200h
.386
.code
public fillbuffer, buffercopy, plotpixel
; Stack
; FromSeg 10
; ToSeg 8
; QB Return SEG 6
; QB Return OFF 4
; DS 2
; BP 0
;
buffercopy proc
push ds
push bp
mov bp, sp
mov ds, [bp+10]
mov es, [bp+8]
xor si, si
xor di, di
mov cx, 32000
rep movsw
pop bp
pop ds
ret 4
buffercopy endp
; Stack
; ToSeg 8
; Colour 6
; QB Return SEG 4
; QB Return OFF 2
; BP 0
;
fillbuffer proc
push bp
mov bp, sp
mov es, [bp+8]
xor di, di
mov al, [bp+6]
mov ah, al
mov cx, 32000
rep stosw
pop bp
ret 4
fillbuffer endp
; Stack
; ToSeg 12
; X 10
; Y 8
; Colour 6
; QB Return SEG 4
; QB Return OFF 2
; BP 0
;
plotpixel proc
push bp
mov bp, sp
mov es, [bp+12]
mov dx, [bp+8]
mov bx, dx
shl dx, 8
shl bx, 6
add dx, bx
add dx, [bp+10]
mov di, dx
mov al, [bp+6]
mov es:[di], al
pop bp
ret 8
plotpixel endp
end
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
Assemble it, build a LIB file and then build a QLB file, naming them
all 'OURLIB'.
start QB: qb/l ourlib
Now copy the following into a BAS file:
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
DECLARE SUB FillBuffer (BYVAL toseg%, BYVAL colour%)
DECLARE SUB BufferCopy (BYVAL fromseg%, BYVAL toseg%)
DECLARE SUB PlotPixel (BYVAL toseg%, BYVAL x%, BYVAL y%, BYVAL colour%)
DIM buffer%(32001)
SCREEN 13
FillBuffer VARSEG(buffer%(0)), 1
FOR t% = 1 TO 10000
PlotPixel VARSEG(buffer%(0)), RND * 319, RND * 199, RND * 255
NEXT t%
PRINT "Press a key...": SLEEP
BufferCopy VARSEG(buffer%(0)), &HA000
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
Run it! It plots 10000 pixels on our double buffer, randomly.
The buffer is also filled with blue.
Notice this time we pass on VARSEG(buffer%(0)). Having the option to
specify our buffer's location gives us flexibility. Try passing on the
constant value '&HA000'. It'll change the screen directly. This will be
the standard throughout your library. If you want.
9.2 Review of the code
buffercopy proc
push ds ; Save DS (must always save DS)
push bp ; Save BP
mov bp, sp ; Get stack pointer
mov ds, [bp+10] ; Get source segment
mov es, [bp+8] ; Get to segment
xor si, si ; Zero out si
xor di, di ; Zero out di
mov cx, 32000 ; Let's move 32000 words (64000 bytes)
rep movsw ; Do it
pop bp ; Restore BP
pop ds ; Restore DS
ret 4 ; Return to QB, removing the 2 words
buffercopy endp
Note that you must *always* save DS, but ES isn't important.
XOR is a logic gate. Check it out on the QB help. If you xor a number
with itself, it has the effect of setting it to zero. Believe it or
not, this is faster than doing MOV SI, 0
We set SI and DI to zero. We do this because our Segments are 64k big and
the pointer is always Zero. Don't rely on this for everything else, though.
MOVSW copies one word from DS:SI to ES:DI. The REP prefix repeats the
MOVSW, CX times. We have 32000 in CX so we copy 32000 words, or 64000 bytes.
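One thing the routine quietly assumes is that the CPU's Direction flag is
clear, because that's what makes MOVSW move SI and DI upwards. That is
normally the case when QB calls you, but if you'd rather not rely on it, a
more defensive sketch of the copy loop adds a CLD first:

```asm
        xor  si, si         ; Zero out si
        xor  di, di         ; Zero out di
        mov  cx, 32000      ; 32000 words (64000 bytes)
        cld                 ; Direction flag = 0: SI and DI count upwards
        rep  movsw          ; Copy CX words from DS:SI to ES:DI
```

The same goes for the REP STOSW in fillbuffer.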
fillbuffer proc
push bp ; Save BP
mov bp, sp ; Get stack pointer
mov es, [bp+8] ; Get destination Segment
xor di, di ; Set its pointer to Zero
mov al, [bp+6] ; Get the fill colour
mov ah, al ; Copy it to AH
mov cx, 32000 ; 32000 words to store
rep stosw ; Store them
pop bp ; Restore BP
ret 4 ; Return to QB, cleanup stack
fillbuffer endp
We copy AL to AH so we can write the two bytes at the same time.
The REP STOSW works in the same way as REP MOVSW but instead it simply
stores data from AX to ES:DI
10. Conditions and looping
What if you wanted to write a routine to copy one segment to another
transparently, to give a parallax effect? You have to go through each
pixel one at a time and check to see if it's colour zero (which is commonly
used for transparency). For this you'll need some new instructions.
- CMP <subject>, <value>
Compares the subject with another value.
- JMP linelabel
Unconditional JUMP to another part in the code
- JNZ linelabel
Jumps if the CMP was Not Zero
- JZ linelabel
Jumps if the CMP was Zero
- JNE linelabel
Jumps if the CMP operands were Not Equal
- JE linelabel
Jumps if the CMP operands were Equal
- JG linelabel
Jumps if the first CMP operand was Greater than the second
- JL linelabel
Jumps if the first CMP operand was Less than the second
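There's about 30 jump instructions so look them up somewhere. But the
above should suffice for a while. JG and JL treat the numbers as signed;
for unsigned numbers use JA (Above) and JB (Below) instead. A Jump
instruction tests the flags left by the last instruction that changed them,
so put it straight after the compare.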
Example1:
MOV bx, 3
MOV ax, bx
SUB ax, 3
CMP ax, 0
JZ ax_was_zero
.
. more instructions...
.
ax_was_zero:
.
. more instructions...
.
.
etc
Example2:
MOV ax, 15
MOV dx, 16
CMP dx, ax
JG dx_was_greater
.
. do something else
.
.
dx_was_greater:
.
. more instructions
.
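Jumping backwards gives you a loop. Here's a sketch that adds up the numbers
10 down to 1, with CX doing the counting. It doesn't even need a CMP, because
DEC sets the Zero flag itself when the result hits zero. The label name is
made up.

```asm
        xor  ax, ax         ; AX = running total, starts at 0
        mov  cx, 10         ; CX = counter
add_next:
        add  ax, cx         ; total = total + counter
        dec  cx             ; counter = counter - 1
        jnz  add_next       ; not zero yet? go round again
                            ; falls out here with AX = 10+9+...+1 = 55
```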
Let's write that routine I was talking about
; Stack
; Fromseg 10
; ToSeg 8
; QB Return SEG 6
; QB Return OFF 4
; DS 2
; BP 0
;
transcopy proc
push ds
push bp
mov bp, sp
mov es, [bp+8]
mov ds, [bp+10]
xor si, si
xor di, di
next_pixel:
mov al, ds:[si]
cmp al, 0
jz skip_plot
mov es:[di], al
skip_plot:
inc si
inc di
cmp di, 64000 ; 63999 is the end of the buffer
jne next_pixel
pop bp
pop ds
ret 4
transcopy endp
Here's a job for you: Copy this into an asm file, assemble it, build
a library and write a demo program to test it.
The QB declaration is:
DECLARE SUB TransCopy (BYVAL fromseg%, BYVAL toseg%)
Remember to make it 'Public'!
11. Interrupts
I actually finished this document then realised that I hadn't actually said
anything about interrupts. Ooops.
- So what are they?
They're mini programs stored in memory. We can treat them like SUB's or
FUNCTION's. The name 'interrupt' comes from the fact you stop the CPU from
doing whatever it's doing and make it run the requested interrupt program.
Each interrupt has its own unique number in the range 0-255.
- Some examples of interrupts
The DOS interrupt is a good one. Using this interrupt, you can allocate
memory, open/close and read/write files, get information about the current
system setup, free disk space that sort of thing, plus tons of other stuff.
- Ok, so how do you get all those different things from one interrupt??
The DOS interrupt number is 21h, but we choose what we want to do by passing
another number in the AX register (the function number goes in AH, the high
byte). So opening a file is a subfunction of the DOS interrupt and we use it
by setting AX to 3D00h (AH = 3Dh, AL = 00h for read-only) and then calling
interrupt number 21h.
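A sketch of that file-open call, to show the pattern: load the registers the
subfunction wants, then INT 21h. Function 3Dh expects DS:DX to point at the
file name, which must end in a zero byte, and hands back a file handle in AX,
or sets the Carry flag if it failed. The file name TEST.DAT and the labels
are invented for the example, and it assumes DS points at our .data segment.

```asm
.data
FileName db 'TEST.DAT', 0        ; made-up file name, zero on the end

.code
        mov  dx, offset FileName ; DS:DX -> the file name
        mov  ax, 3D00h           ; AH = 3Dh (open file), AL = 00h (read only)
        int  21h                 ; Call DOS
        jc   open_done           ; Carry set: it failed, AX = error code
        mov  bx, ax              ; Worked: AX is the file handle, keep it in BX
        ; ... read from the file here ...
        mov  ah, 3Eh             ; AH = 3Eh (close file), BX = handle
        int  21h
open_done:
```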
Interrupts - Hardware and Software
These are the two types of interrupt. A good example of a hardware
interrupt is pressing a key on your keyboard. The keyboard tells the CPU
something happened and the CPU passes control to an Interrupt Service Routine
(ISR) which handles the keypress. Hardware interrupts occur automatically.
A software interrupt is usually something that is caused by a program. Say,
we generate an interrupt to allocate 32k of memory for us.
- How do we call them?
That's easy:
MOV ax, 0013h ; Mode 13h
INT 10h ; Set the mode
The above example sets the video mode to 13h or SCREEN 13.
INT 10h is the Video interrupt.
Hooking interrupts
We can write small programs to handle the event of an interrupt ourselves.
A popular interrupt replacement is the keyboard interrupt. The keyboard
interrupt is number 09h.
There is a table at the very bottom of memory (starting at 0000:0000) called
the Interrupt Vector Table. When any interrupt occurs, its number is
looked up in this table and, at its entry, there is a SEGMENT:OFFSET
address which points to a piece of code in memory that will handle the
interrupt.
We can simply change this address to point to our ASM code and we can
replace the crappy standard keyboard handler with our own!
Funnily enough, the DOS interrupt 21h has routines to do this for us:
function 35h reads the current address and function 25h sets a new one. We
simply pass on the SEG:OFF.
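To picture what 'hooking' does to that table, here is a small Python model of it. Each entry is 4 bytes (the offset word first, then the segment word), so interrupt n lives at address n * 4. The function names and the handler addresses are made up for the example; on a real machine you would go through DOS rather than poke the table yourself.

```python
import struct

def vector_address(int_number):
    """Address of an interrupt's entry: 4 bytes each, starting at 0000:0000."""
    return int_number * 4

def read_vector(memory, int_number):
    """Return (segment, offset); the offset word is stored first."""
    offset, segment = struct.unpack_from("<HH", memory, vector_address(int_number))
    return segment, offset

def write_vector(memory, int_number, segment, offset):
    """Point an interrupt at a new SEGMENT:OFFSET handler."""
    struct.pack_into("<HH", memory, vector_address(int_number), offset, segment)

memory = bytearray(1024)                      # 256 vectors * 4 bytes
write_vector(memory, 0x09, 0xF000, 0xE987)    # made-up "old" keyboard handler
old = read_vector(memory, 0x09)               # save it so we can restore later
write_vector(memory, 0x09, 0x1234, 0x0010)    # made-up address of our handler
```

Saving the old address before overwriting it is the important part: without it you cannot hand the keyboard back when your program ends.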
A complete keyboard handler is given to you in section 13 of this document.
There is a list of interrupts in a utility program on our site.
12. ASM Data and Arrays
In section 8, Constructing an ASM file, I mentioned something called
the Data Segment. This is a part in our ASM file where we can store
variables, arrays and structures.
There are three common types of data element we can declare:
Byte, Word and Dword (double word).
In ASM they are DB, DW and DD respectively. The 'D' stands for 'Define'.
Example:
--------------------------------------------------------------------
.data
Oldpointer DD 0 ; Define Double Word
Number_of_sprites DW ? ; Define Word
GameSpeed DB 10 ; Define Byte
--------------------------------------------------------------------
The '?' means that we don't want any particular initial value of the
variable whereas the other two are set.
We can also declare a string constant, like this:
ourstring DB 'Hello I am a string of characters'
We can define multiple values in one go like this:
OurWordData DW 1,3,5,7,9,11,13,15,17 ; separated by commas ','
OurByteData DB 0,1,2,4,8,16,32,64,128 ;
OurDwordData DD -14871, 56004, -24576 ;
The way we access all this data is so easy. Example:
--------------------------------------------------------------------
.data
numbaddudes DW 500
.code
plotenemies proc
;
MOV ax, numbaddudes ; easy
;
plotenemies endp
--------------------------------------------------------------------
*** Important !! ***
The only thing you must remember is that the DS register must
be pointing to the Data segment. When you enter your procedure, DS
is pointing to the right place, but if you change it, remember to
restore it! You can do so at any time by doing this:
MOV cx, @DATA
MOV ds, cx
@DATA is the value of the current data segment.
Another thing to note is that your variable values will not be reset when
you exit the routine. E.g., if while in your ASM procedure you set the
variable 'NumWeapons' to 20, then exit to QBasic, then enter the ASM again,
its value will still be 20.
12.1 Arrays
We can use arrays in our ASM as well. It's super easy:
our_array DB 512 dup (0)
Creates an array of 512 bytes and sets each element to zero.
mr_array2 DD 100 dup (?)
Creates an array of 100 Dwords. Each element is uninitialised.
To access them we treat them like normal arrays and use square brackets:
MOV al, our_array[15] ; accesses 16th element
Or we could use a register like this:
MOV al, our_array[BX] ; accesses the element whose offset is in BX
BX is called the Index Register. You can only use certain registers
as an index. They are: BX, BP, SI and DI. Note that the index is a byte
offset, so for a DW array you must multiply the element number by 2 (and by
4 for a DD array).
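One thing that catches people out is that the number in the square brackets counts bytes, not elements. Here is a little Python illustration using the same values as OurWordData; the helper name read_word is just something I made up for the example.

```python
import struct

# Nine 16-bit words laid out in memory the way DW would store them.
word_array = struct.pack("<9H", 1, 3, 5, 7, 9, 11, 13, 15, 17)

def read_word(data, index):
    """Element `index` of a DW array lives at byte offset index * 2."""
    (value,) = struct.unpack_from("<H", data, index * 2)
    return value

fourth = read_word(word_array, 3)                      # element 3, byte offset 6
(wrong,) = struct.unpack_from("<H", word_array, 3)     # byte offset 3: straddles two words
```

Reading at offset 3 picks up half of one word and half of the next, which is the sort of garbage you get in ASM if you forget to scale the index.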
13. Writing your routines
Before you jump in and start writing your super-optimized sprite routine
in ASM, try writing it in QBasic, in an assembly style. Then use it as a
reference when translating the algorithm to ASM.
Also, if you haven't done so already, check out the source code from some
other libraries around. Our library contains source code and I know that
DirectQB and Dash come with source code so go through it and see if you can
understand it. Don't copy it directly! You won't learn anything that way...
13.1 A good example: A keyboard handler
Here is a good example that covers a lot of the ASM programming
techniques discussed in this document.
It is a fully operational keyboard handler for taking multiple keypresses.
The QBasic declarations are:
DECLARE SUB KeyboardOn()
DECLARE SUB KeyboardOff()
DECLARE FUNCTION GetKey%(BYVAL Scancodenumber%)
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, BASIC
.stack 200h
.data
keybflags DB ? ; original flags
keymatrix DB 128 dup(0) ; holds the key map
keybon DB 0 ; tells us if the keyb is on
dos_int_seg DW ? ; old key handler seg
dos_int_off DW ? ; old key handler off
.code
.386
public KeyboardOn
public KeyboardOff
public GetKey
KeyboardOn proc ; Switch our keyboard on
cmp keybon,1
jne keybon_cont
ret
keybon_cont:
mov ax, 40h ;
mov es, ax ; Stores the current keyboard
mov di, 17h ; flags state
mov al, es:[di] ;
and al, 70h
mov keybflags, al
push ds
mov ax,3509h ; Function 35h Get Int Vector
int 21h
pop ds
mov dos_int_off, bx ; Store the old address
mov dos_int_seg, es ;
mov ax, SEG new_key_int ; Get the SEG & OFF of our
mov dx, OFFSET new_key_int ; procedure
push ds
mov ds, ax
mov ax, 2509h ; Function 25h Set Int Vector
int 21h
pop ds
mov keybon,1
ret
KeyboardOn endp
new_key_int proc ; This is the new interrupt ISR
push ax ; The CPU will call this every time
push bx ; you press or release a key
push si
push ds
in al, 60h ; Read the pressed key from port 60h
xor ah, ah ;
mov si, ax ; The rest of this stuff
in al, 61h ; just resets the keyboard
or al, 82h ; flip-flop and tells the
out 61h, al ; keyboard that we received
and al, 127 ; the key
out 61h, al ;
mov al, 20h ;
out 20h, al ;
mov bl, 1 ; Assume it's a Make code
test si, 128 ; is it >= 128 ?
jz store_key ;
and si, 127 ;
xor bl, bl ;
store_key: ;
mov ax, @DATA ;
mov ds, ax ;
mov keymatrix[si], bl ; Store the new state in our array
pop ds ;
pop si
pop bx
pop ax
iret ; IRET = interrupt return
new_key_int endp
Keyboardoff proc
cmp keybon,0
jne keyboff_cont
ret
keyboff_cont:
push ds
mov dx, dos_int_off ; Get the old SEG:OFF
mov ax, dos_int_seg ;
mov ds, ax ;
mov ax, 2509h ; 25h = Set Int Vector
int 21h
pop ds
mov ax, 40h ; Restore the Key flags
mov es, ax ;
mov di, 17h ;
mov al, keybflags ;
mov es:[di], al ;
mov ax, SEG keymatrix ;
mov di, OFFSET keymatrix ; Set our array to zeros
mov es, ax ;
xor ax, ax ;
mov cx, 128 ;
rep stosb ;
mov keybon, al
ret
Keyboardoff endp
; stack
;
; code 4
; ret seg 2
; ret off 0
;
Getkey proc ; All this does is return
mov cx, bp ; the key state of a
mov bp,sp ; given key scancode
mov bx,[bp+4] ; BX is used because QB expects SI
and bx, 127 ; to be preserved; the AND keeps
mov al, keymatrix[bx] ; the index inside the array
xor ah, ah ; A list of scancodes is available
mov bp, cx ; in the QB help
ret 2
Getkey endp
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
The whole keyboard thing works like this: When you press a key, that key
number is sent from the keyboard. When you release the key, the key number
plus 128 is sent. These are called the 'Make' and 'Break' codes respectively.
You are free to use the above code in any way you want. It's a standard
procedure so do what you want!
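The Make/Break rule is easy to try out before touching any ports. This Python sketch mimics what the ISR does to keymatrix for each byte it reads; handle_scancode and key_matrix are names I picked for the model, and nothing here talks to real hardware.

```python
key_matrix = [0] * 128          # one slot per scancode, like keymatrix

def handle_scancode(code):
    """Update key_matrix from one byte read from port 60h."""
    pressed = 0 if code & 0x80 else 1   # bit 7 set means Break (released)
    key_matrix[code & 0x7F] = pressed   # low 7 bits are the key number

ESC, LEFT = 0x01, 0x4B          # scancodes for Esc and the left arrow
handle_scancode(ESC)            # Esc goes down
handle_scancode(LEFT)           # left arrow goes down, Esc still held
handle_scancode(ESC + 128)      # Esc is released
```

Because every key has its own slot, several keys can be 'down' at the same time, which is the whole point of replacing the standard handler.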
13.2 A Super Fast Pixel Plot
Thought someone might like this. It's the fastest pixel plotting routine
I have ever seen. I mean, the method it uses to get the offset is pretty
cool - it only involves one shift right. And who wrote it? Me! Oh, I *love*
blowing my own horn...;)
RSPset proc
mov ax, bp ; (1)
mov bp, sp ; (1) get stack pointer
mov es, [bp+10] ; (1) get seg
mov bh, [bp+6] ; (1) bh = y, so bx = y * 256 (bl is junk)
shr bx, 2 ; (3) bx = y * 64 (plus junk in the low 6 bits)
and bl, 11000000b ; (3) clear the junk
add bh, [bp+6] ; (2) y is always less than 200, so this adds y*256: bx = y*320
add bx, [bp+8] ; (2) add x
mov cl, [bp+4] ; (1) get colour
mov es:[bx], cl ; (1)
mov bp, ax ; (1)
ret 8 ; (5)
; = 22 clock ticks!
RSPset endp
Someone might have to correct me on the clock ticks though. It's probably
something like, 30-35...who knows. Let's see if someone can make it
faster!!!
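If the shifting trick looks like magic, you can follow the registers by hand. This Python sketch walks through the same steps as RSPset and lets you compare the result with y * 320 + x. The function name and the junk_bl parameter (whatever happened to be in BL beforehand) are assumptions of the model.

```python
def rspset_offset(x, y, junk_bl=0xFF):
    """Follow RSPset's register steps and return the offset left in BX."""
    bx = (y << 8) | junk_bl      # mov bh, y   (BL holds whatever was there)
    bx >>= 2                     # shr bx, 2   -> y*64 plus leftovers in BL
    bx &= 0xFFC0                 # and bl, 11000000b -> exactly y*64
    bh = ((bx >> 8) + y) & 0xFF  # add bh, y   (same as adding y*256)
    bx = (bh << 8) | (bx & 0xFF)
    return (bx + x) & 0xFFFF     # add bx, x

corner = rspset_offset(319, 199)  # bottom-right pixel of the screen
```

The trick works because 320 = 256 + 64: one shift gives y*64 and loading y straight into BH gives y*256 for free.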
13.3 Algorithms
To help you on your way, I've included a few standard-ish algorithms
you can use to write your ASM routines.
Lines
----------
You can forget about me giving you the algorithm for drawing a line between
any two points. For that you need the Bresenham algorithm, which is a little
complex. Search for it on the internet, you'll find what you need.
For now, here's a Vertical line routine:
Vertical:
Get the destination segment in ES
Get Y1
Get Y2
Is Y1 > Y2? If yes, swap them
Calculate DI by Y1 * 320 + X
Subtract (Y1 from Y2) +1 and store this value (it's the counter)
Y_LOOP: Get the colour
Plot it to ES:DI
Add 320 to DI
Decrement the counter
Are we at zero? If no jump to Y_LOOP
The Horizontal one's even easier. You can use a REP STOSB.
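Following the advice about writing things in a high level language first, here is the vertical line algorithm as a Python sketch, with the horizontal one thrown in. The function names are mine, the buffer is a bytearray standing in for the destination segment, and there is no clipping, so keep the coordinates on-screen.

```python
SCREEN_WIDTH, SCREEN_HEIGHT = 320, 200

def vline(buffer, x, y1, y2, colour):
    """Draw a vertical line from (x, y1) to (x, y2) inclusive."""
    if y1 > y2:
        y1, y2 = y2, y1                # swap so we always draw downwards
    di = y1 * SCREEN_WIDTH + x         # starting offset
    counter = y2 - y1 + 1              # number of pixels to plot
    while counter:
        buffer[di] = colour            # plot to ES:DI
        di += SCREEN_WIDTH             # next row, same column
        counter -= 1

def hline(buffer, x1, x2, y, colour):
    """Horizontal version: one block fill, which is what REP STOSB does."""
    if x1 > x2:
        x1, x2 = x2, x1
    start = y * SCREEN_WIDTH + x1
    length = x2 - x1 + 1
    buffer[start:start + length] = bytes([colour]) * length

screen = bytearray(SCREEN_WIDTH * SCREEN_HEIGHT)
vline(screen, 10, 5, 2, 14)            # y1 > y2 on purpose to test the swap
hline(screen, 0, 319, 199, 9)          # the bottom row of the screen
```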
I think this sprite algorithm is the one used in DirectQB's sprite plotting
routine. It's definitely not the fastest way of doing it because it performs
a lot of checking in the X and Y loops. A much faster version would
calculate what does and doesn't need plotting beforehand. (Hey, those guys
at Shimmer/Ribbonsoft did one of those! ;)
See if you can improve it!
Sprite Plot
-------------
Set DS to the sprite segment
Set ES to the buffer segment
Set SI to the sprite offset
Point DI to x,y-1
Get the Sprite width & height by reading the first two
words from DS:SI
Get the Y value and subtract 1
Add the sprite width to DI
Y_LOOP:
Add the sprite width to SI
Add the screen width to DI
Move to the next line
Decrement the Height counter
Is it zero? If yes, jump to END_PLOT
Make sure the Y position is within screen boundaries.
Subtract the sprite width from SI and DI
Store the sprite's X value
Store the sprite's width
X_LOOP:
Get a pixel from the sprite buffer
Is it zero? If yes jump to SKIP_PLOT
Make sure the X position is within screen boundaries
If yes, draw it on-screen
SKIP_PLOT:
Increment the sprite's X value
Decrement the sprite's width
Jump if not zero to X_LOOP
Jump to Y_LOOP
END_PLOT:
Cleanup stack, exit routine
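Here is a simplified Python model of the same idea: a Y loop, an X loop, skip colour 0 and check the screen boundaries for every pixel. It is not a line-by-line translation of the steps above, and passing the sprite as a flat run of bytes with the width and height given separately is my own simplification.

```python
SCREEN_W, SCREEN_H = 320, 200

def plot_sprite(screen, sprite, width, height, x, y):
    """Plot a width*height sprite at (x, y); colour 0 is transparent."""
    for row in range(height):                    # Y_LOOP
        sy = y + row
        if not 0 <= sy < SCREEN_H:               # whole row is off-screen
            continue
        for col in range(width):                 # X_LOOP
            pixel = sprite[row * width + col]
            sx = x + col
            if pixel != 0 and 0 <= sx < SCREEN_W:
                screen[sy * SCREEN_W + sx] = pixel

screen = bytearray(SCREEN_W * SCREEN_H)
sprite = bytes([1, 0,
                2, 3])                           # 2x2 test sprite
plot_sprite(screen, sprite, 2, 2, 319, 199)      # hangs off the bottom-right corner
plot_sprite(screen, sprite, 2, 2, -1, -1)        # hangs off the top-left corner
```

The per-pixel boundary test inside the X loop is exactly the slow part mentioned earlier; working out the visible rectangle before the loops start is the obvious improvement.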
Rotation
-------------
Doing sprite rotation is really a general algorithms exercise. I recommend
that you, again, write the rotation in QBasic first, limiting yourself
by writing the code in an assembly style.
If you know anything about rotation, you'll know you need a load
of SIN and COS values, 360 of each to be exact, for each angle.
So how do you access these tables in ASM? Well, you could create an
array in QBasic and store the values there and then pass the address
of this array onto the routine...or...you could precompute all the values
and write them into the ASM file as a series of DW words. They would be
much easier to access and handle. And it'll give you more space in QBasic.
Check out the sources of DirectQB and our RibbonSoft VGA library to see
how to implement this.
If you don't know how to rotate points, there is a file called QBRotate.zip
somewhere on our site. It gives all the info you need to know.
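Typing 360 values in by hand is no fun, so it makes sense to let a script write the DW lines for you. This Python sketch is one way of doing it; the scale factor of 256, the label sintable and the eight-values-per-line layout are all assumptions, so change them to suit your routine.

```python
import math

SCALE = 256   # assumed scale factor; use 128 for 9.7 fixed point

def sine_table(scale=SCALE):
    """360 sine values, one per degree, scaled up to integers."""
    return [round(math.sin(math.radians(a)) * scale) for a in range(360)]

def as_dw_lines(name, values, per_line=8):
    """Format the values as DW lines ready to paste into the .data section."""
    lines = []
    for i in range(0, len(values), per_line):
        label = name if i == 0 else ""
        row = ", ".join(str(v) for v in values[i:i + per_line])
        lines.append(f"{label:<10} DW {row}")
    return "\n".join(lines)

print(as_dw_lines("sintable", sine_table()))
```

You don't strictly need a second table for COS, because COS(a) is the same as SIN(a + 90), wrapping round at 360.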
Scaling
-------------
For scaling, try writing a version using the X-Step, Y-Step technique. These
two values contain a fractional increment for the X,Y plotting position. This
is very easy to write in a High Level Language, but it's also very slow.
This is because of the smelly floating point. What we need is fixed
point mathematics which is covered in Section 14: Optimization.
Translucency
--------------
Here's a lovely effect you can produce with minimum effort. For this
you'll need a gradient palette, arranged in gradients of 32 shades each
(that's 8 gradients in all, since 8 x 32 = 256), each running from dark to
light.
In the inner loop of your sprite plotting routine, or whatever, single out
the part where you get the pixel colour from your sprite buffer
GET pixel from sprite buffer
Calculate the base of its colour gradient (pixel AND 224) **
MOD the pixel with 32 (or AND it with 31)
GET pixel from destination screen buffer (where you're gonna put it)
MOD it with 32 (or AND it with 31)
Add the two pixel numbers together
Divide by 2 to get an average (SHR 1)
Add this average to the base
Plot the pixel
** The colour gradient base is the colour which is the first in the grade.
Ok, We have a grade from Black to white from colours 0 -> 31
and we have a Dark red to Light red grade from colours 32 -> 63
The base colour for a colour between 0 and 31 is 0
The base colour for a colour between 32 and 63 is 32
You see what I mean?
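To make the numbers concrete, here is the blend written as a Python sketch. It assumes the 8 x 32 gradient palette described above and, as my reading of the steps, uses the sprite pixel's gradient as the base. The function names are made up for the example.

```python
def gradient_base(colour):
    """First colour of the 32-shade gradient that colour belongs to."""
    return colour & 0xE0            # same as colour - (colour MOD 32)

def blend(sprite_pixel, screen_pixel):
    """Average the two shades and place the result in the sprite's gradient."""
    shade = ((sprite_pixel & 31) + (screen_pixel & 31)) >> 1   # SHR 1 = divide by 2
    return gradient_base(sprite_pixel) + shade

light_red_on_black = blend(63, 0)   # brightest red over black
```

Brightest red (63) over black (0) comes out as colour 47, which is the half-bright red in the middle of the 32 -> 63 gradient.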
With a gradient palette, you can do all kinds of cool effects:
- Darken a patch of the screen for an overlayed menu effect
- Plot a sprite in greyscale/greenscale/pinkscale/anything
- Even realtime lighting effects are possible!
14. Optimization
This section gives some ways to make your procedures even faster.
14.1 32-Bit programming
How do we make our code faster? We can use 32-Bit for a start.
You can replace:
MOV CX, 32000
REP MOVSW
in the buffer copy routine with:
MOV CX, 16000 ; move 16000 dwords (64000 bytes)
REP MOVSD
All the string instructions that end with a 'B' or 'W' can now end with a
'D'. Ace.
Remember this only works on 386+ machines.
The other prize we are given for using 32-bit is double-sized registers.
These registers are an extension of the existing 16-bit ones and have the
prefix 'E'.
EAX -> AX
EBX -> BX
ECX -> CX
EDX -> DX
EDI -> DI
ESI -> SI
ESP -> SP
EBP -> BP
Each 'E' register is 32-Bit and affects its 16-bit version in the same way
AX affects AL: the 16-bit register is simply the lower half of the 'E' one.
There are no EES, EDS, ESS and ECS, though. Why? Well you'll find out once
you start doing protected mode programming...and, yes...you *will* code in
protected mode one day. Soon you will grow tired of QB's primitive ways and
discover the joy of DJGPP...soon..*ahem*..
14.2 Fixed Point Math
It's a little strange to me calling it 'Math' as we call it 'Maths' in
the UK, but since 80% of the readers of this document will be from the US
I might as well conform.
We use Fixed Point to represent numbers with a decimal point that
doesn't move. This is much faster than using Floating point numbers
which are encoded in a very complex way. With fixed point, we can add
subtract, multiply and divide in the same way we would standard integers.
This is how it works. We take a 16-bit number (a standard sized register)
and split it into two parts. The higher (left) part will be our whole number
The lower part will be our fractional part. We can put an imaginary
decimal point between them. Observe:
                 Whole part            |                 Decimal part
Bit     16  15  14  13  12  11  10   9 |     8     7     6     5     4     3     2     1
-----------------------------------------------------------------------------------------
Value  128  64  32  16   8   4   2   1 |   1/2   1/4   1/8  1/16  1/32  1/64 1/128 1/256
So we use the upper 8 bits as the whole number and the lower 8 bits as
the decimal part.
This is called 8.8 fixed point. The disadvantage of this is, we can only
represent numbers in the range 0-255, but we do have 8 bits of accuracy.
Another method is 9.7 fixed point. This way you have the range 0-511
but slightly less accuracy. Let's look at an example using 9.7:
Imagine we have the number 100 and we want to put it in a loop and increment
it by 0.5 every cycle. Here is some assembly language to do it:
MOV dx, 100 ; Start with a standard 100
SHL dx, 7 ; Multiply it by 128
next:
ADD dx, 64 ; Add 64. Same as adding 0.5
.
.
JNZ next
Adding 128 would give the effect of adding 1 each time. Adding 1 would
have the effect of adding 1/128 each time. So now you've got this fixed
point number, what do you do with it? You Shift RIGHT 7 places and then you
have the whole number part.
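Here is the same 9.7 loop as a Python sketch, so you can watch the raw register value and the whole number part side by side. The helper names are mine, and the '& 0xFFFF' is only there to imitate a 16-bit register.

```python
FRAC_BITS = 7                      # 9.7 fixed point
ONE = 1 << FRAC_BITS               # 128 represents 1.0

def to_fixed(n):
    """SHL by 7 inside a 16-bit register."""
    return (n << FRAC_BITS) & 0xFFFF

def whole_part(f):
    """SHR by 7 throws the fraction away."""
    return f >> FRAC_BITS

dx = to_fixed(100)                 # 100.0
steps = []
for _ in range(4):
    dx = (dx + ONE // 2) & 0xFFFF  # ADD dx, 64 adds 0.5
    steps.append((dx, whole_part(dx)))
```

After four passes the whole part has only gone up by 2, because each ADD is worth half a step. Also note that to_fixed(512) wraps round to 0: that is the 0-511 range limit of 9.7 biting.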
Assigning a fixed point number
------------------------------
We set the variable or register to the value we want to scale and then
multiply it by our scale factor. If we're using 9.7 we multiply by 128 or
in assembly language Shift Left 7 places.
Let's look at how we use them. Remember that these fixed-point numbers
are imaginary and the CPU will add and subtract like they were normal
16-bit numbers. Example:
MOV ax, 54
SHL ax, 7 ; Creates 54.0
MOV bx, 17
SHL bx, 7 ; Creates 17.0
ADD ax, bx ; Result: 71.0
SUB ax, bx ; Result: 54.0
Adding and subtracting is pretty straightforward, but multiplying and dividing
are a little tougher. When we multiply two fixed point numbers, we must
remember that they have BOTH been scaled by a given factor, in our case 128.
So when we multiply them we must shift the result back down 7 places to
correct it. The result of a 16-bit multiply is 32 bits wide and lands in
DX:AX, so the shift has to take in DX as well. Example:
MOV ax, 20 ;
SHL ax, 7 ; Creates 20.0
MOV bx, 5 ;
SHL bx, 7 ; Creates 5.0
IMUL bx ; Multiplies AX by BX and stores it in DX:AX
SHRD ax, dx, 7 ; Divides DX:AX by 128, AX = 100.0 (386+)
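It is worth checking the multiply arithmetic by hand, because the full product is bigger than 16 bits: IMUL leaves it spread across DX:AX. This Python sketch (names are mine, positive numbers only) shifts the whole product and also shows what is left in the low word on its own.

```python
FRAC_BITS = 7                                # 9.7 fixed point

def fixed_mul(a, b):
    """Multiply two 9.7 numbers; the shift is applied to the full product."""
    product = a * b                          # what IMUL leaves in DX:AX
    return (product >> FRAC_BITS) & 0xFFFF   # drop the extra factor of 128

a = 20 << FRAC_BITS        # 20.0 -> 2560
b = 5 << FRAC_BITS         # 5.0  -> 640
product = a * b            # 1638400, too big for 16 bits
low_word = product & 0xFFFF
result = fixed_mul(a, b)
```

Here low_word comes out as 0, so shifting AX alone would lose the answer completely. Shifting the full 32-bit product gives 12800, which is 100.0 in 9.7.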
Dividing is a bit trickier still. When we divide, the two factors of 128
cancel each other out and we are left with a plain whole number. We fix this
by shifting the dividend to the LEFT 7 places first. Then we can divide.
The only problem with this is if we shift left another 7 places, we'll
lose the significant part of our number! This problem is resolved
by using 32-bit fixed point numbers. We can use EAX instead of AX and have
great accuracy and range by using 16.16 fixed point.
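A quick Python sketch of 16.16 division shows why the dividend has to be shifted up first. The helper names are assumptions for the example and it only deals with positive numbers.

```python
FRAC_BITS = 16                       # 16.16 fixed point in a 32-bit register
ONE = 1 << FRAC_BITS                 # 65536 represents 1.0

def to_fixed(n):
    """Scale a number up into 16.16."""
    return int(n * ONE)

def fixed_div(a, b):
    """Divide two 16.16 numbers: pre-shift the dividend, then divide."""
    return (a << FRAC_BITS) // b     # the dividend briefly needs 64 bits

def to_float(f):
    """Scale a 16.16 number back down for printing."""
    return f / ONE

quarter = fixed_div(to_fixed(1), to_fixed(4))
no_shift = to_fixed(20) // to_fixed(5)   # forgetting the shift
```

Without the shift, 20.0 / 5.0 gives a raw value of 4, which in 16.16 is a tiny fraction rather than 4.0. With it, 1.0 / 4.0 gives 16384, which is 0.25.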
Applications of Fixed Point
---------------------------
Here's a good one: rotation. We have all our SIN and COS values scaled
up by 128 or 256 or whatever, then in our assembler we do our rotation
calculations and scale the result back down to get our new X,Y position.
Check out our RibbonSoft VGA library to see this in action!
14.3 Inner loops
The inner loop is the bottleneck to high performance software. At least
that's what it says in the VESA SVGA specification...
The inner loop or more commonly the 'X' loop must be made as fast as
possible as this is the part that'll be executed the most.
We can do this by making sure we use registers for all our values in the
loop, and that there are no PUSHes, POPs or memory references, as these are
much slower to access than a register. Also all required calculations must
be done outside the loop, if possible.
So that means no MULs, IMULs, DIVs or IDIVs (unless you really HAVE to).
Another technique is to Unroll our loops. Consider this example which
calculates the sum of 100 16-bit word integers
; assume ES:DI points to a list of Word integers
MOV cx, 100
XOR ax, ax
nextnumber:
ADD ax, es:[di]
ADD di, 2 ; move 2 bytes (onto the next word)
DEC cx
JNZ nextnumber
This seems innocent enough but it can be made faster:
; assume ES:DI points to a list of Word integers
MOV cx, 100
XOR ax, ax
nextnumber:
ADD ax, es:[di]
ADD ax, es:[di+2]
ADD ax, es:[di+4]
ADD ax, es:[di+6]
ADD ax, es:[di+8]
ADD di, 10 ; move 10 bytes
SUB cx, 5 ; we've added 5 words...
JNZ nextnumber ; SUB sets the zero flag, so no CMP is needed
This has the same effect as the previous example, but it is much faster
as the loop only runs 20 times instead of 100.
14.4 Other tips
This is just a few little useful pieces I didn't know where to put.
- Make sure you always use XOR ax, ax instead of MOV ax, 0
- You can use the stack for temporary storage by pushing a load of
blank registers and referencing the 'slots' with BP
- If you have a lot of items on the stack, which you need to remove,
instead of doing:
pop bx
pop bx
pop bx
pop bx
You can simply do: ADD sp, 8
- If you have a number which needs a Shift-Left 8 places, instead
of
SHL dx, 8
use XCHG dh, dl
XOR dl, dl
This moves DL into DH which has the same effect as a shift but is
much faster!
- The format of a sprite stored by QB's GET is as follows:
Element Size Content
Array(0) 2 Bytes Contains the Width of the sprite * 8
Array(1) 2 Bytes Contains the Height
Array(x) ?? All pixel data in a linear format
from left to right, top to bottom
QB multiplies the width by 8 because it stores the width in bits, and
SCREEN 13 uses 8 bits per pixel. To get the width in pixels, just SHR 3 times.
- As of the 486 onwards ADD and SUB are the same speed as INC and DEC.
It used to be that they were slightly slower, not anymore.
Also these instructions... MOV ax, [bp]
MOV [di], cx
MOV ax, bx
MOV ax, 1
...are all the same speed.
Some people have been reading very out of date, old, smelly ASM guides.
These guides have been for old 8086 and 286 processors. They are no good!
An example would be the Peter Norton Guide to Assembly language. It's
waaaaay out of date. The most recent ASM guide I could find is
HelpPC which is a nice TSR that has lots of other cool stuff in it.
It's available on our site.
Anyway, some people thought that all those memory MOV's were really
slow. They're only 1 clock tick. OK??!!!! Sorry.
15. Closing words
Um. I wrote this entire document over the weekend and sacrificed my social
life.