Shimmer QuickBasic ASM Tutorial

V1.2 11th Dec 2000

QuickBasic Graphics Programming in Assembly language

By CGI Joe of Shimmer

[V1.1 15th June 2000]
Revised 11th Dec 2000

Ever used DirectQB, Future.Lib or Blast? They provide ways of using cool, fast graphics in QBasic. I bet you've been limited by them in some way, though. Like you wanted a cool routine to draw multiple parallax backgrounds, and the library didn't have such a routine? Want to have the power to do anything you want, being limited only by your imagination (and the lack of a damn fine tutorial ;)? Other people's libraries are great, but they don't teach you anything. Let's learn Assembly Language!

DISCLAIMER: "I do not accept responsibility for any effects, adverse or otherwise, that this code may have on you, your computer, your sanity, your dog, and anything else that you can think of. Use it at your own risk." Blah etc. Shame we live in a world where that sort of thing is required. (Taken from the DJGPP Allegro docs)

You've really got to *want* to learn this stuff cos, er...it involves a lot of theory and there's a lot to learn. I've tried to keep everything down to a minimum. But remember, you don't have to memorise *everything* here. Just make sure you understand a few concepts and use it as a reference as and when you need it. I'll be covering the following topics in this order:

1. Introduction
2. Addressing memory
3. The CPU
4. Some Assembly instructions/mnemonics
5. Plotting your first pixel
- 5.1 Binary Shifts
- 5.2 Yup, still plotting this damn pixel...
6. The Stack
7. Creating/Editing Libraries - your first library
8. Constructing an ASM file
- 8.1 Review of the code
9. So, what are we gonna do?
- 9.1 Using double buffering
- 9.2 Review of the code
10. Conditions and looping
11. Interrupts
12. ASM Data and Arrays
13. Writing routines
14. Optimization
15. Closing words

1. Introduction

- How do I use assembly language in QBasic?

Well, there are two quite different ways. The first method was used by the Blast! library by Andrew L. Ayers. It involved using CALL ABSOLUTE, a statement which allows programmers to run Machine Code from QBasic (Machine Code/Language is just a series of 'Opcodes', which is what assembly language becomes once it has been assembled). It worked like this: you stored all of your Machine Code in memory, and then passed the address of this code to CALL ABSOLUTE. The problem with this was that you had to have an assembler capable of generating pure machine code instead of a standard format like .OBJ, and the code was difficult to maintain in QBasic (you had to set up memory and have big ugly Hex strings which held the code).

Ok, so now you're ready to chuck the first method. Excellent. The second method is far more structured:

- You have all of your assembly code in a separate .ASM file
- You assemble the file using TASM or MASM or whatever. This produces an .OBJ file.
- You build a Library (.LIB) file using the given OBJ file
- You link your QBasic program in with the library.

This is much cleaner and easier to maintain. The disadvantage is that you must have QuickBasic Version 4.5, because this version is capable of using libraries. There are other versions of QB like 4.0 and 7.1, but 4.5 is the most widely available. You also need an assembler capable of producing .OBJ files. All the steps here are described in detail elsewhere in this document. You can find all the tools you need, including QuickBasic 4.5, on our site.

1.1 Architecture of the PC

Figure 1: Simple plan of the PC workings

Let me explain how all the parts interact.

- RAM and High Memory

RAM, under DOS, is split into parts; I've shown two in the diagram. The first, larger, part is called Conventional or Base memory. This is where your EXE program is stored when you run it in DOS. It contains all of your Code and Data. The position of all this stuff is determined by its size. Also, your Code doesn't have to be in the same 'place' in memory as your Data. There could be gaps of unused space in between. This is all decided and handled by the Operating System, DOS, and shouldn't really concern you at all. Items that are found in Base memory include:

- Your Code and Data
- Parts of the Operating System (DOS)
- Interrupt tables and Interrupt routines

The smaller part is called the High Memory Area (HMA). (Strictly speaking, DOS manuals call the region from A000h up to the 1 MB mark the Upper Memory Area and keep the name HMA for the first 64k above 1 MB, but I'll keep calling it High Memory here.) This contains hardware-specific stuff; for example, the SCREEN 13 graphics area is located here. You can't just store any Code or Data you like here. Other items that are found in the HMA include:

- System drivers (network, mouse, sound, TSR's)
- All VGA card data for Text & Graphics modes
- Parts of DOS can be found here

- Hardware I/O (Input/Output)

You can use the CPU to control Hardware ports. What are these ports, exactly? Well, it's how you change the colour palette in Graphics mode. You can also manipulate other things like the keyboard. It's basically a way of reading from and writing to the hardware. You may have used the OUT and INP statements in QB; this is what they do.

- CPU

You will not *believe* the amount of work the CPU is continuously doing! It has to handle things like:

- Redrawing and displaying the screen
- Handling thousands of interrupts a second
- Fetching your code from RAM, decoding and executing it
- Taking inputs such as keypresses and converting the data they return
- Loads of other stuff I don't even *know* about!

Everything you access on your PC goes through the CPU. If you are confused by interrupts, don't worry. I explain them all later.

1.2 Alternative number systems

Our number system is denary or BASE 10. Its name is simply because there are ten digits, 0 -> 9. There are two other important number systems you need to know about:

BASE 16 - Hexadecimal
BASE 2 - Binary

The sixteen Hexadecimal digits are:

0 1 2 3 4 5 6 7 8 9 A B C D E F

where A=10, B=11, C=12, D=13, E=14 and F=15

Example: 50 in Hex is actually 80 (5*16 + 0) and 2E is actually 46 (2*16 + 14). So how do you tell the difference between a Hex number and a normal number? Well, there's usually an 'H' floating around somewhere. Like, in QBasic, a Hex number begins with '&H'; in assembly language it ends in an 'h'. It's useful for representing addresses because memory goes up in units of 16, so it's easier to work with.

Binary of course only has 2 digits: 0 and 1. Here's a binary number:

0 0 0 0 1 1 0 1

It represents the number 13. How do you work it out? Like this:

128  64  32  16   8   4   2   1
-------------------------------
  0   0   0   0   1   1   0   1

Each binary digit stands for a number which is a power of two. So you add up the values of all the digits which are '1'. In this case 8, 4 and 1.

8 + 4 + 1 = 13

That was an 8-bit number, or a byte, and the largest number you can hold with 8 bits is 255. Which brings us onto our next topic...
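To see the three notations side by side, here's a minimal sketch in the same TASM/MASM syntax used for the rest of this tutorial. The procedure name NumberDemo is made up for illustration, and the directives at the top (plus how you call the thing from QB) are explained in sections 7 and 8. One thing worth knowing early: a Hex constant that starts with a letter needs a leading zero (0A000h, not A000h), otherwise the assembler takes it for a name.

```asm
.model medium, basic
.stack 200h
.386
.code

public NumberDemo           ; hypothetical name, for illustration only

NumberDemo proc
    mov ax, 80              ; denary 80
    mov ax, 50h             ; the same value written in Hex (5*16 + 0)
    mov ax, 01010000b       ; and again in Binary (64 + 16)
    mov bx, 0A000h          ; Hex starting with a letter needs a leading 0
    mov ax, 2Eh             ; AX = 46 (2*16 + 14), handed back to the caller
    ret
NumberDemo endp

end
```

Declared in QB as DECLARE FUNCTION NumberDemo% (), PRINT NumberDemo% should show 46, because AX is where a FUNCTION's return value goes.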

1.3 Data types

Pretty smooth, huh? Each data type, apart from the Bit, can be Signed or Unsigned. Having an Unsigned number doubles its maximum value. Observe:

BIT - Smallest form of data on a PC.

BYTE - 8 Bits
- Unsigned: Largest=255, Smallest=0
- Signed : Largest=127, Smallest= -128

WORD - 16 Bits (2 Bytes)
- Unsigned: Largest=65535, Smallest=0
- Signed : Largest=32767, Smallest= -32768

DOUBLE WORD - 32 Bits (2 Words or 4 Bytes)
- Unsigned: Largest=4,294,967,295 (2^32 - 1, about 4 billion), Smallest=0
- Signed : Largest=2,147,483,647, Smallest= -2,147,483,648

2. Addressing memory

Base or Conventional memory is split up into parts called Segments. Segments are of variable size; the smallest being 16 bytes and the largest being 65536 bytes (64k). When you allocate memory, you are given a unique address in Base memory so you can access it. This address is made up of two parts: a SEGMENT address and an OFFSET in that segment.

Say we were given an address of a byte of data and the Segment of the data was A000h and the Offset was 0005h. This means that the byte of data we want is 5 bytes from the beginning of segment A000h. Addresses are commonly written like this:

Segment:Offset

Example: A000:0005

To find the real position in memory, the CPU multiplies the Segment by 16 and adds the Offset, so A000:0005 is byte number A0005h. That's just about all you need to know about segments for now. Find out how to access them next...
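Here's a rough sketch of what using a Segment:Offset address looks like in practice. It reads the byte at A000:0005 and hands it back to QB. PeekScreen is a made-up name, and the instructions and registers used are covered properly in sections 3 and 4, so treat this as a preview.

```asm
.model medium, basic
.stack 200h
.386
.code

public PeekScreen           ; hypothetical name, for illustration only

PeekScreen proc
    mov ax, 0A000h          ; the Segment part of A000:0005
    mov es, ax              ; segment registers are loaded via a general register
    mov bx, 0005h           ; the Offset part
    xor ax, ax              ; clear AX so the top half of the result is zero
    mov al, es:[bx]         ; read the byte at A000:0005 (A000h*16 + 5 = A0005h)
    ret                     ; the byte is returned in AX
PeekScreen endp

end
```

With DECLARE FUNCTION PeekScreen% () in QB, this does roughly what DEF SEG = &HA000 followed by PEEK(5) would do.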

3. The CPU

One of Assembly language's selling points is the fact that you can control just about everything on your PC without some interfering High Level Language getting in your way (are you listening Microsoft?). But (if you were paying attention earlier) you soon realise that you must base your code entirely on the CPU. So where do we begin? I guess the guts of the CPU would be a good place.

The CPU contains several small banks of memory called 'Registers'. When I say 'small' I mean they each hold 16 Bits of Data. Ok, that was a lie. Your PC no doubt has 32-Bit Registers (if it's a 386+) but for now, we'll pretend they're 16-Bit to make things easier. These Registers are very fast to access. They are used to store numbers which can be added, subtracted, multiplied and divided by other registers or just immediate numbers. You can treat them like variables in QBasic.

Some Registers exist for a special purpose. Others can be used for anything and are called General Purpose Registers. They are:

AX - Accumulator. This is commonly used for returning values from FUNCTION's
BX - Base. No special reason but can be used to index memory
CX - Counter. This is used for um, counting
DX - Data. Just a register

I think Intel just liked the whole ABCD thing...

Here's another 6 registers. These all have a primary reason for existing. They are used to address and access memory. They are Segment Registers:

CS - Code segment. This points to your CODE
DS - Data segment. This points to your DATA
ES - Extra segment. This can point to anything you want
SS - Stack segment. This points to the STACK (more later)
FS - These two are only really required for use in Protected mode
GS - but they can be used for anything, like ES. They're available on a 386 processor or above, only.

What's all this pointing business about? "Pointing" means that the register holds the address of something in RAM. A Segment register holds the Segment part of that address, which is the physical address divided by 16. For example, if your Data was stored in RAM starting at physical address 1060h, DS would equal 0106h. It's that easy.

You can't add values to Segment registers and you can't move immediate values into them. You must use a General register first, then set the Segment register equal to the General register.

Some more registers:

IP - Instruction Pointer. Is used with CS
SP - Stack Pointer. Yes, used with SS
BP - Base Pointer. Used to REFERENCE the stack (SS).
SI - Source Index. Used with DS
DI - Destination Index. Used with ES
Flags - Wow! What's this, eh? It's another 16-bit register in which each bit stands for something important, instead of a number. Er, that's a bit vague, I know. But it's not really vital stuff at the moment.

Do you remember the Segment:Offset stuff from earlier? Well, IP, SP, SI, DI are all used as OFFSET registers and are paired with Segment registers, like this:

CS:IP - I wouldn't advise playing around with these!
SS:SP - Or these!
ES:DI
DS:SI

These aren't strict rules. You can use ES:SI if you want. But unless you explicitly write the segment (as in es:[si]), a memory access through SI, DI or BX defaults to DS, and one through BP defaults to SS. The string instructions (like STOSB, which you'll meet soon) are the exception: they always write to ES:DI.

Figure 2: CPU (Actual size...hehe ;)

There's something useful about the 4 general registers AX, BX, CX and DX. They can each be split into two parts. Each part is 8-Bit (a byte) and these can be treated as separate registers: a High byte and a Low byte.

AX - AH, AL
BX - BH, BL
CX - CH, CL
DX - DH, DL

This is really useful because you can easily run out of registers when writing ASM procedures. Just remember that when you change something in AH you're also changing the overall value of AX:

       AH                AL
0 0 0 0 0 1 1 1   1 0 1 0 0 0 0 1

Here, the value of

AL = 1+32+128 = 161
AH = 1+2+4 = 7

and...

AX = 1+32+128+256+512+1024 = 1953

Get it? Oh, c'mon it's easy...

The other parts of the CPU are:

ALU - Arithmetic Logic Unit. Does all the adding and subtracting and stuff. We don't care about this, and we can't touch it.

FPU - Floating Point Unit. At least 80% of modern computers have these. It handles all of our Real numbers. You can access it directly, but you need to know a few special opcodes. You'll probably never need to use it in assembly language programming.

Cache - This is a small temporary store like RAM but a zillion times faster. It's used for storing program code, like a loop. We can't access this thing either. Good optimization would involve making sure an inner loop would fit neatly into the cache.

Misc Memory - The CPU has its own little piece of internal memory which it uses for various things that we need not concern ourselves with. We can't access it.

I put those four extra bits in for the sake of completeness. Just in case somebody tries to pick holes in my tute. Grrr.
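The AH/AL sums above can be checked with a tiny routine. This is only a sketch, HalvesDemo is a made-up name, and the MOV instruction it leans on is explained in the next section.

```asm
.model medium, basic
.stack 200h
.386
.code

public HalvesDemo           ; hypothetical name, for illustration only

HalvesDemo proc
    xor ax, ax              ; AX = 0
    mov al, 161             ; AL = 10100001b, so AX is now 161
    mov ah, 7               ; AH = 00000111b, so AX is now 7*256 + 161 = 1953
    ret                     ; AX goes back to the caller
HalvesDemo endp

end
```

Declared as DECLARE FUNCTION HalvesDemo% (), PRINT HalvesDemo% should show 1953, even though the code never wrote to AX as a whole after clearing it.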

4. Some Assembly instructions/mnemonics

A mnemonic, (pronounced new-monic) is something that helps you to remember something else. An assembly mnemonic is a short, usually 3-letter, version of the operation it performs. Here's a few:

MOV - moves/copies data

ADD - adds stuff

SUB - subtracts stuff

MUL - multiplies stuff

INC - increments stuff (adds 1)

DEC - decrements stuff (subtracts 1)

In assembly language, typically 2 'Operands' follow the instruction/mnemonic. They are just the two items you want to perform some operation on. Some instructions require only 1 operand, some none, some want three. All operands usually have to be the same size. The first operand (the one on the left) is the item that is going to be 'operated' on. Let's take a look:

MOV ah, al    ; moves the contents of AL into AH
SUB cx, bx    ; subtracts BX from CX
ADD ah, dl    ; adds DL to AH
INC ah        ; adds 1 to ah

A ';' is a comment in assembly language, by the way.

A closer look at MOV...

- MOV <Destination>, <Source>

This moves a value, be it a constant, the value of a register or the contents of a memory location, into the Destination, which can be a register or a memory location. It can also be a position in the stack.

Example 1: MOV AX, 5

Moves the value 5 into AX. Anything that was previously in AX will be overwritten. Although 5 is small enough to fit into AL, because AX was the subject, AH gets overwritten as well (with zero).

There's a list of common instructions available elsewhere in this document.
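Here's a small sketch that strings a few of these mnemonics together, with the register contents tracked in the comments. ArithDemo is a made-up name, and the line that has been commented out shows what the "same size" rule means in practice.

```asm
.model medium, basic
.stack 200h
.386
.code

public ArithDemo            ; hypothetical name, for illustration only

ArithDemo proc
    mov ax, 5               ; AX = 5
    mov bx, 10              ; BX = 10
    add ax, bx              ; AX = 15
    sub ax, 3               ; AX = 12
    inc ax                  ; AX = 13
    dec bx                  ; BX = 9
    mov cl, bl              ; CL = 9; fine, both operands are 8-bit
    ; mov cl, bx            ; would NOT assemble: 8-bit destination, 16-bit source
    ret                     ; AX = 13 goes back to the caller
ArithDemo endp

end
```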

5. Plotting your first Pixel

Forget SCREEN 7, 12, 11, 10, 3, 8. They're all total crap. The only one you should be interested in is SCREEN 13. Why? Because it's a good resolution for high-speed graphics (320x200) and you have 256 colours available. So how do we plot a pixel?

SCREEN 13 Memory is located at the address:

A000:0000

That's Hex, of course. And you should have worked out that memory is stored in consecutive bytes, not a 2D table, so... Say the following is the screen:

<---320--->
11111111111   |
22222222222
33333333333  200
44444444444
55555555555   |

In memory, each line is stored one after the other like this:

1111111111122222222222333333333334444444444455555555555

(We'll just pretend there's 320 in each line, there). So to calculate the OFFSET in the A000 segment, we just multiply the pixel's Y position by 320 and add the X:

offset = Y * 320 + X

For example, the pixel at X=160, Y=100 lives at offset 100 * 320 + 160 = 32160.

There's one thing I have to mention at this point: Multiplying numbers is S L O W! But we want a fast graphics lib, so how do we get around this? Answer: Use shifts.

5.1 Binary Shifts

Binary shifts allow us to Divide and Multiply numbers by powers of 2. This isn't as limiting as it sounds, as the whole PC theory is built on powers of 2. Take a binary number:

                      0 0 0 0 1 1 0 1 = 13
Shift it Left once  : 0 0 0 1 1 0 1 0 = 26
Shift it Left again : 0 0 1 1 0 1 0 0 = 52

And this is super quick. So, Shifting left 8 places is the same as multiplying by 256. Ok? Shifting Right is the same concept, but if you have an odd number (odd numbers always have the lowest bit, the one worth 1, set) then you lose some accuracy. 53 shifted right once = 26. But, who cares?

The shift mnemonics are:

SHL <subject>, <numberofshifts>
SHR <subject>, <numberofshifts>

Example: SHL AX, 2 ; AX=AX*4
Example: SHR DX, 1 ; DX=DX/2
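The shifts above, written out as a sketch you could assemble. ShiftDemo is a made-up name. The last few lines multiply 100 by 320 using nothing but two shifts and an add, which is exactly the trick the next section uses to find a pixel.

```asm
.model medium, basic
.stack 200h
.386
.code

public ShiftDemo            ; hypothetical name, for illustration only

ShiftDemo proc
    mov ax, 13              ; 00001101b
    shl ax, 1               ; 00011010b = 26
    shl ax, 1               ; 00110100b = 52
    mov bx, 53
    shr bx, 1               ; BX = 26, the odd bit is lost
    mov dx, 100
    mov cx, dx              ; keep a second copy of the 100
    shl dx, 8               ; DX = 100 * 256 = 25600
    shl cx, 6               ; CX = 100 * 64  = 6400
    add dx, cx              ; DX = 32000 = 100 * 320
    mov ax, dx              ; return it in AX
    ret
ShiftDemo endp

end
```

Shifting by more than 1 with a constant count needs a 286 or better, which is one reason the .386 line is there.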

5.2 Yup, still plotting this damn pixel...

That shifting stuff *really* comes in handy; it can be used for all sorts of stuff. So to get Y * 320, we split the 320 into powers of 2: 64 and 256 (256+64=320). So

(Y*64) + (Y*256) = Y*320

Some source! (X, Y and Colour stand for wherever you're keeping those values.)

MOV ax, 0A000h    ; Screen 13 Segment
MOV es, ax        ; Can't set ES directly
MOV dx, Y         ; DX = the Y pos
MOV bx, dx        ; Copy it to BX
SHL dx, 8         ; DX = DX * 256
SHL bx, 6         ; BX = BX * 64
ADD dx, bx        ; DX = DX + BX (DX = Y * 320)
ADD dx, X         ; DX = DX + X
MOV di, dx        ; DI = DX
MOV al, Colour    ; Get the colour
STOSB             ; STORE String Byte (AL) at ES:DI

The STOSB is an example of an instruction which has no operands. We could have used:

MOV es:[di], al

The [] brackets tell the CPU to use the address pointed to by DI. STOSB stores the byte in AL at position ES:[DI] and INCREMENTS DI (as long as the direction flag is clear, which it normally is). You also have STOSW (STOre String Word) which gets the value from AX and stores a 16-Bit number.

6. The Stack

Ok, we're just about to start getting down to business, there's just one more thing you need to know about: the stack. It is through this that your QB program and your Assembly routine communicate. The stack is our friend. Remember that. It is basically a temporary storage area we can use in our programs.

At the top of your QBasic program you will have noticed something like:

DECLARE SUB DrawBox (x1%, y1%, x2%, y2%, col%)

This DECLARE statement is telling QBASIC to put all of the values passed on to it (inside the brackets) on the STACK so your routine can access them. Of course you don't have to do any of this yourself; QBasic handles it all for you. Anyway, what I have explained above is probably the most complicated use of the STACK so if you understood that, consider yourself well and truly done, sir!

Another way we can use the STACK is inside our ASM code. We can PUSH values onto it and then POP them back off again. The rule is: LAST ON, FIRST OFF. We use PUSH to place a Word or Dword onto the stack and POP to remove a Word or Dword (you can't push a single Byte; push the whole 16-bit register instead). Study the following example:

MOV AX, 5    ; Move 5 into AX
PUSH AX      ; Place the value of AX onto the stack
MOV AX, 0    ; Set AX to zero
POP AX       ; Remove the last value on the stack
             ; and put it into AX. AX now equals 5!

Understand? Good, cos it gets a little bit more complicated. The stack has a pointer, SP, which always points at the last value pushed on. The stack grows downwards in memory, so SP is decreased when you PUSH something on and increased when you POP something off. The positions in the diagrams below are measured from SP, so position 0 is always the most recent value. I think of it as the older values being shoved UP, away from SP, as the diagram shows:

Values on stack:     Position: (Assuming every value is a WORD)
ValueA               6
ValueB               4
ValueC               2
ValueD               0

Now when we PUSH another value on....

Values on stack:     Position: (Assuming every value is a WORD)
ValueA               8
ValueB               6
ValueC               4
ValueD               2
New value --> ValueE 0

Each value's position is increased by 2 (assuming that they're all WORDs which are, as you should know by now, 2 bytes) because they have all been 'shoved' upwards. And when we POP back off again:

Values on stack:     Position: (Assuming every value is a WORD)
ValueA               6
ValueB               4
ValueC               2
ValueD               0

Everything is returned back to normal. Smashing. If you understand all of this then give yourself a pat on the back and a hot mug of cocoa, well done!

Say we have a pixel plot routine in assembly and we want to call it from QBasic. We'll do something like,

PixelPlot 160, 100, 15

Plots a colour 15 pixel at 160,100. So how does our ASM routine get at these values? Answer: They've been pushed onto the Stack like this:

Values on stack:     Position:
--> 160              8
--> 100              6
--> 15               4
QBasic Return SEG    2
QBasic Return OFF    0

The Basic Return SEGment and OFFset are pushed on by QBasic last. It's just a pointer to the next instruction in your QBasic code. You cannot touch these and they must always be in this position when your ASM routine is finished, so remember to clear up when you're done using the stack!

Right, we access the stack using the BP register, not SP, because SP changes in the routine and we want a static value! So the first thing we do is save the contents of BP and then

MOV BP, SP

You can PUSH BP onto the stack if you like, but remember all your variables will be 2 positions higher! Remember to restore BP before you exit your routine!! Some code to get the X position from the Stack:

MOV cx, bp        ; Save bp in cx
MOV bp, sp        ; Get the stack pointer
MOV dx, [bp+8]    ; Store the X value

Notice we can add an index (8) to the BP register? Also, notice the square brackets; they mean get the value at the ADDRESS of bp+8. We are ready.
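LAST ON, FIRST OFF is easiest to see by pushing two values and popping them into the 'wrong' registers. A minimal sketch, with SwapDemo being a made-up name:

```asm
.model medium, basic
.stack 200h
.386
.code

public SwapDemo             ; hypothetical name, for illustration only

SwapDemo proc
    mov ax, 1
    mov bx, 2
    push ax                 ; stack, newest first: 1, return address
    push bx                 ; stack, newest first: 2, 1, return address
    pop ax                  ; AX = 2, the last value on comes off first
    pop bx                  ; BX = 1
    ret                     ; two pushes, two pops: the return address is on top again
SwapDemo endp

end
```

Every PUSH here is matched by a POP before the RET. Leave one out and RET would pick up the wrong return address, which usually ends in a crash.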

7. Creating/Editing Libraries

So you have the knowledge and a plethora of ideas...let's do something with them. To create and use a library you NEED the following:

- QuickBasic 4.5 Compiler
- An assembler such as TASM or MASM
- A text editor
- A copy of Microsoft Link (it comes with QB 4.5)
- A copy of Microsoft Library manager (also with QB 4.5)

1. Paste, or type the following into a file:

-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, basic
.stack 200h
.386
.code

public AddFour

; Our stack:
;  Number   6
;  QB Seg   4
;  QB Off   2
;  BP       0

AddFour proc
  push bp
  mov bp, sp
  mov ax, [bp+6]
  add ax, 4
  pop bp
  ret 2
AddFour endp

end
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -

2. Save the file as 'ADDFOUR.ASM'

3. Run TASM by typing:

TASM addfour

If there are errors, check you copied the code exactly.

4. Check there is a file called ADDFOUR.OBJ in your directory.

5. Now run Microsoft Library Manager by typing:

LIB

You should see the following:

Microsoft (R) Library Manager Version 3.14
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.

Library name:

Enter 'addfour' as the library name, then you'll see...

Library does not exist. Create? (y/n)

'y' of course, then...

Operations:

This means "What OBJ files?". Type 'addfour.obj'. If you want to use more than one OBJ then separate each file with a '+'. Next...

List file:

Just type 'nul'. A list file is a listing of all the symbols you use in your library (symbols: variables, procedure names).

You got yourself a library! But wait, you want to use it in QuickBasic. Then you have to build *another* library (yes, it's stupid, I know) which is a .QLB file. This is required only if you want to use your routines while inside QuickBasic. You need a copy of Microsoft LINK.EXE. Type the following at the DOS prompt:

LINK /QU addfour.lib, addfour.qlb, NUL, BQLB45.LIB

Make sure BQLB45.LIB is available (it's a QB 4.5 support library that needs to go into your QLB). Don't get me *started* on how annoying that *stupid* command is! My advice is to write a small .BAT batch file to do all that stuff for you.

Hey-ho. Library built! We should have the following files:

ADDFOUR.ASM
ADDFOUR.OBJ
ADDFOUR.LIB
ADDFOUR.QLB

I know, it gets messier, ;) Let's start QuickBasic 4.5 with the 'l' switch. It tells QB to load our library:

qb/l addfour

At the top of our empty .BAS file, type the following:

DECLARE FUNCTION AddFour% (BYVAL x%)

It lets QB know how to call our function AddFour. And because it's a FUNCTION, it has to know what type of value to return. In this case, it's an integer (%). The BYVAL part tells QB to pass the value of x% onto the stack. If you leave it out, the default is passing the address of the variable onto the stack instead of its value. Just use BYVAL for passing everything. Now type the following code:

PRINT AddFour%(10)

Run or compile the program and you should see 14! OK, not stunning. But it works.

8. Constructing an ASM file

So what was that stuff at the top of the asm file?

.model medium, basic
.stack 200h
.386
.code

These are assembly 'Directives'. Directives are used to tell the assembler how to construct the format of the output OBJ file.

.model medium, basic

We're calling from BASIC and QB uses a Medium memory model. This is sort of the format of the EXE file QB produces. It's not important, just make sure you select MEDIUM all the time. Other models include TINY, SMALL, LARGE and HUGE. Look them up sometime.

.stack 200h

Sets up a stack of 512 bytes. You can select any size you want but make sure you leave enough space for all the return addresses and your variables. 200h is plenty of space.

.386

We're using 386 (32-bit) instructions. You can select 286, 486 and 586 also. But use 386 for compatibility reasons.

.code

Everything after this point is our file's code. There is also another:

.data

This is a Data Segment where you can put internal variables and tables local to your ASM file, very useful.

public

We use this to make our symbols available to external processes, namely QBasic. If you don't put this in, QB will moan. Guaranteed.

addfour proc
.
.
.
addfour endp

No prizes for this one. Same as

SUB xxx
.
.
.
END SUB

TASM also requires you to put an 'end' at the end of your file. Let's go over the code in the file...

8.1 Review of the code

Just gonna clear up some stuff here.

AddFour proc
  push bp           ; Save BP
  mov bp, sp        ; Get the stack pointer
  mov ax, [bp+6]    ; Get the number from the stack
  add ax, 4         ; Add 4 to it
  pop bp            ; Restore BP
  ret 2             ; Return to QB
AddFour endp

The 'ret 2' is used to clean up the stack and return to QB. The '2' removes 2 bytes from the stack after the return address has been taken off, which is the number QB pushed on for us. Use 'ret' on its own if you don't pass anything on.
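With more than one parameter the same pattern holds: QB pushes the parameters left to right, so the first one ends up furthest from BP, and the number after 'ret' is 2 bytes per WORD parameter. Here's a sketch of a two-parameter version; AddTwo is a made-up name used only to show the layout.

```asm
.model medium, basic
.stack 200h
.386
.code

public AddTwo               ; hypothetical name, for illustration only

; Our stack (after 'push bp'):
;  a        8
;  b        6
;  QB Seg   4
;  QB Off   2
;  BP       0

AddTwo proc
    push bp                 ; Save BP
    mov bp, sp              ; Get the stack pointer
    mov ax, [bp+8]          ; First parameter (pushed first, so furthest away)
    add ax, [bp+6]          ; Add the second parameter
    pop bp                  ; Restore BP
    ret 4                   ; Two WORD parameters = 4 bytes to remove
AddTwo endp

end
```

The matching declaration would be DECLARE FUNCTION AddTwo% (BYVAL a%, BYVAL b%), and PRINT AddTwo%(10, 4) should show 14.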

9. So, what are we gonna do?

So we have the knowledge and the technology to do pretty much anything we like. Sprite-scaling, rotation, translucency, texture-mapping...fantastic. First we've got to do some basic, foundation coding. If we want to write some seriously cool games, we have to use double-buffering.

9.1 Using double buffering

Double-buffering is a technique used to stop those horrible flickering graphics you might have witnessed in SCREEN 13. Y'know how you can PCOPY in SCREEN 7 but you can't do it in SCREEN 13? Doesn't that make you mad? Let's remedy the situation.

All we have to do is have a chunk of memory which we can use to draw our graphics onto and then copy it onto SCREEN 13 VGA memory at segment A000h at the end of our rendering loop. Where does this magical 'chunk of memory' come from? Answer: QBasic.

DIM buffer%(31999)

This allocates all the memory we need for a double buffer. It's exactly 64000 bytes (32000 integers of 2 bytes each), which is 320*200, and because each pixel is 1 byte that's the whole screen. To get the address of this array, we use a QB function called VARSEG.

VARSEG(buffer%(0)) returns the Segment of buffer%

Paste, or type the following into a file and save it as 'OURLIB.ASM':

-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, basic
.stack 200h
.386
.code

public fillbuffer, buffercopy, plotpixel

; Stack
;  FromSeg         10
;  ToSeg           8
;  QB Return SEG   6
;  QB Return OFF   4
;  DS              2
;  BP              0
;
buffercopy proc
  push ds
  push bp
  mov bp, sp
  mov ds, [bp+10]
  mov es, [bp+8]
  xor si, si
  xor di, di
  mov cx, 32000
  rep movsw
  pop bp
  pop ds
  ret 4
buffercopy endp

; Stack
;  ToSeg           8
;  Colour          6
;  QB Return SEG   4
;  QB Return OFF   2
;  BP              0
;
fillbuffer proc
  push bp
  mov bp, sp
  mov es, [bp+8]
  xor di, di
  mov al, [bp+6]
  mov ah, al
  mov cx, 32000
  rep stosw
  pop bp
  ret 4
fillbuffer endp

; Stack
;  ToSeg           12
;  X               10
;  Y               8
;  Colour          6
;  QB Return SEG   4
;  QB Return OFF   2
;  BP              0
;
plotpixel proc
  push bp
  mov bp, sp
  mov es, [bp+12]
  mov dx, [bp+8]
  mov bx, dx
  shl dx, 8
  shl bx, 6
  add dx, bx
  add dx, [bp+10]
  mov di, dx
  mov al, [bp+6]
  mov es:[di], al
  pop bp
  ret 8
plotpixel endp

end
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -

Assemble it, build a LIB file and then build a QLB file, naming them all 'OURLIB'. Start QB:

qb/l ourlib

Now copy the following into a BAS file:

-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
DECLARE SUB FillBuffer (BYVAL toseg%, BYVAL colour%)
DECLARE SUB BufferCopy (BYVAL fromseg%, BYVAL toseg%)
DECLARE SUB PlotPixel (BYVAL toseg%, BYVAL x%, BYVAL y%, BYVAL colour%)

DIM buffer%(32001)

SCREEN 13

FillBuffer VARSEG(buffer%(0)), 1

FOR t% = 1 TO 10000
  PlotPixel VARSEG(buffer%(0)), RND * 319, RND * 199, RND * 255
NEXT t%

PRINT "Press a key...": SLEEP

BufferCopy VARSEG(buffer%(0)), &HA000
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -

Run it! It fills our double buffer with blue, plots 10000 pixels on it at random, and then copies the lot to the screen. Notice that we pass VARSEG(buffer%(0)) to every routine. Having the option to specify our buffer's location gives us flexibility. Try passing on the constant value '&HA000' instead. It'll change the screen directly. This will be the standard throughout your library. If you want.

9.2 Review of the code

buffercopy proc
  push ds            ; Save DS (must always save DS)
  push bp            ; Save BP
  mov bp, sp         ; Get stack pointer
  mov ds, [bp+10]    ; Get source segment
  mov es, [bp+8]     ; Get destination segment
  xor si, si         ; Zero out si
  xor di, di         ; Zero out di
  mov cx, 32000      ; Let's move 32000 words (64000 bytes)
  rep movsw          ; Do it
  pop bp             ; Restore BP
  pop ds             ; Restore DS
  ret 4              ; Return to QB, removing the 2 words
buffercopy endp

Note that you must *always* save DS, but ES isn't important.

XOR is a logical operation. Check it out on the QB help. If you xor a number with itself, it has the effect of setting it to zero. Believe it or not, this is faster than doing MOV SI, 0

We set SI and DI to zero because we are copying from the very start of one segment to the very start of the other, so both offsets begin at zero. Don't rely on this for everything else, though.

MOVSW copies one word from DS:SI to ES:DI. The REP prefix repeats the MOVSW, CX times. We have 32000 in CX so we copy 32000 words, or 64000 bytes.

fillbuffer proc
  push bp            ; Save BP
  mov bp, sp         ; Get stack pointer
  mov es, [bp+8]     ; Get destination Segment
  xor di, di         ; Set its pointer to Zero
  mov al, [bp+6]     ; Get the fill colour
  mov ah, al         ; Copy it to AH
  mov cx, 32000      ; 32000 words to store
  rep stosw          ; Store them
  pop bp             ; Restore BP
  ret 4              ; Return to QB, cleanup stack
fillbuffer endp

We copy AL to AH so we can write two bytes at the same time. The REP STOSW works in the same way as REP MOVSW but instead it simply stores the value of AX at ES:DI.

10. Conditions and looping

What if you wanted to write a routine to copy one segment to another transparently, to give a parallax effect? You have to go through each pixel one at a time and check to see if it's colour zero (which is commonly used for transparency). For this you'll need some new instructions.

- CMP <subject>, <value>  Compares the subject with another value.
- JMP linelabel  Unconditional JUMP to another part in the code
- JNZ linelabel  Jumps if the CMP was Not Zero
- JZ linelabel   Jumps if the CMP was Zero
- JNE linelabel  Jumps if the CMP operands were Not Equal
- JE linelabel   Jumps if the CMP operands were Equal
- JG linelabel   Jumps if the first CMP operand was Greater than the second
- JL linelabel   Jumps if the first CMP operand was Less than the second

There's about 30 jump instructions so look them up somewhere. But the above should suffice for a while. Note that JG and JL treat the operands as Signed numbers; for Unsigned values the equivalents are JA (above) and JB (below). A conditional jump acts on the flags left behind by the CMP, so put it straight after the compare, before any other instruction that changes the flags.

Example1:

  MOV bx, 3
  MOV ax, bx
  SUB ax, 3
  CMP ax, 0
  JZ ax_was_zero
  .
  . more instructions...
  .
ax_was_zero:
  .
  . more instructions...
  .
  . etc

Example2:

  MOV ax, 15
  MOV dx, 16
  CMP dx, ax
  JG dx_was_greater
  .
  . do something else
  .
dx_was_greater:
  .
  . more instructions
  .

Let's write that routine I was talking about

; Stack
;  Fromseg         10
;  ToSeg           8
;  QB Return SEG   6
;  QB Return OFF   4
;  DS              2
;  BP              0
;
transcopy proc
  push ds
  push bp
  mov bp, sp
  mov es, [bp+8]
  mov ds, [bp+10]
  xor si, si
  xor di, di
next_pixel:
  mov al, ds:[si]
  cmp al, 0
  jz skip_plot
  mov es:[di], al
skip_plot:
  inc si
  inc di
  cmp di, 64000      ; 63999 is the end of the buffer
  jne next_pixel
  pop bp
  pop ds
  ret 4
transcopy endp

Here's a job for you: Copy this into an asm file, assemble it, build a library and write a demo program to test it. The QB declaration is:

DECLARE SUB TransCopy (BYVAL fromseg%, BYVAL toseg%)

Remember to make it 'Public'!
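The same compare-and-jump loop works for any per-pixel test. As a further sketch in the same style, here is a routine that swaps one colour for another throughout a 64000-byte buffer. ReplaceColour and its label names are made up for this example and aren't part of the library built earlier.

```asm
.model medium, basic
.stack 200h
.386
.code

public replacecolour        ; hypothetical routine, for illustration only

; Stack
;  ToSeg           10
;  OldColour       8
;  NewColour       6
;  QB Return SEG   4
;  QB Return OFF   2
;  BP              0
;
replacecolour proc
    push bp
    mov bp, sp
    mov es, [bp+10]         ; Buffer segment
    mov bl, [bp+8]          ; Colour to look for
    mov bh, [bp+6]          ; Colour to write in its place
    xor di, di              ; Start at offset 0
check_pixel:
    cmp es:[di], bl         ; Is this pixel the old colour?
    jne leave_pixel         ; No: skip the write
    mov es:[di], bh         ; Yes: overwrite it
leave_pixel:
    inc di                  ; Next pixel
    cmp di, 64000           ; Done all 64000 bytes?
    jne check_pixel         ; No: go round again
    pop bp
    ret 6                   ; Three WORD parameters = 6 bytes
replacecolour endp

end
```

The QB side would be DECLARE SUB ReplaceColour (BYVAL toseg%, BYVAL oldcol%, BYVAL newcol%). The loop ends on JNE rather than JL because 64000 doesn't fit in a Signed word, so a signed jump would get the comparison wrong.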

11. Interrupts

I actually finished this document then realised that I hadn't actually said anything about interrupts. Ooops.

- So what are they?
They're mini programs stored in memory. We can treat them like SUBs or FUNCTIONs. The name 'interrupt' comes from the fact you stop the CPU from doing whatever it's doing and make it run the requested interrupt program. Each interrupt has its own unique number in the range 0-255.

- Some examples of interrupts
The DOS interrupt is a good one. Using this interrupt, you can allocate memory, open/close and read/write files, get information about the current system setup, free disk space, that sort of thing, plus tons of other stuff.

- Ok, so how do you get all those different things from one interrupt??
The DOS interrupt number is 21h, but we choose what we want to do by passing a function number in AH, the top half of the AX register. So opening a file is a subfunction of the DOS interrupt (function 3Dh). We use it by setting AX to 3D00h (AH = 3Dh, AL = 00h for read-only access), pointing DS:DX at the file name and then calling interrupt number 21h.

Interrupts - Hardware and Software

These are the two types of interrupt. A good example of a hardware interrupt is pressing a key on your keyboard. The keyboard tells the CPU something happened and the CPU passes control to an Interrupt Service Routine (ISR) which handles the keypress. Hardware interrupts occur automatically. A software interrupt is one that is caused by a program. Say, we generate an interrupt to allocate 32k of memory for us.

- How do we call them?
That's easy:

    MOV ax, 0013h   ; Mode 13h
    INT 10h         ; Set the mode

The above example sets the video mode to 13h or SCREEN 13. INT 10h is the Video interrupt.

Hooking interrupts

We can write small programs to handle the event of an interrupt ourselves. A popular interrupt replacement is the keyboard interrupt. The keyboard interrupt is number 09h. There is a table at the very bottom of memory (starting at address 0000:0000) called the Interrupt Vector Table. When any interrupt occurs, its number is looked up in this table and, at its entry, there is a SEGMENT:OFFSET address which points to a piece of code in memory that will handle the interrupt. We can simply change this address to point to our ASM code and we can replace the crappy standard keyboard handler with our own! Funnily enough, the DOS interrupt 21h has a function to do this for us (25h, Set Interrupt Vector); we simply pass it the SEG:OFF. A complete keyboard handler is given to you in section 13 of this document. There is a list of interrupts in a utility program on our site.
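To make the table a little less abstract, here is a rough sketch of reading the keyboard vector straight out of it. It assumes the standard real-mode layout: each entry is 4 bytes, offset word first and segment word second, so the entry for interrupt n lives at 0000:(n * 4). In real code you'd normally let DOS function 35h do this for you, as the keyboard handler in section 13 does.

```asm
; Read the current INT 09h handler address directly from the table.
    xor ax, ax
    mov es, ax              ; ES = 0000h, the segment where the table lives
    mov bx, 09h * 4         ; entry for interrupt 09h starts at offset 0024h
    mov dx, es:[bx]         ; DX = OFFSET of the current handler
    mov ax, es:[bx+2]       ; AX = SEGMENT of the current handler
```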

12. ASM Data and Arrays

In section 8, Constructing an ASM file, I mentioned something called the Data Segment. This is a part in our ASM file where we can store variables, arrays and structures. There are three common types of data element we can declare: Byte, Word and Dword (double word). In ASM they are DB, DW and DD respectively. The 'D' stands for 'Define'.

Example:
--------------------------------------------------------------------
.data
Oldpointer          DD 0    ; Define Double Word
Number_of_sprites   DW ?    ; Define Word
GameSpeed           DB 10   ; Define Byte
--------------------------------------------------------------------

The '?' means that we don't want any particular initial value for the variable, whereas the other two are set. We can also declare a string constant, like this:

    ourstring DB 'Hello I am a string of characters'

We can define several values at once like this:

    OurWordData  DW 1,3,5,7,9,11,13,15,17   ; separated by commas ','
    OurByteData  DB 0,1,2,4,8,16,32,64,128  ;
    OurDwordData DD -14871, 56004, -24576   ;

The way we access all this data is so easy.

Example:
--------------------------------------------------------------------
.data
numbaddudes DW 500

.code
plotenemies proc
    ;
    MOV ax, numbaddudes     ; easy
    ;
plotenemies endp
--------------------------------------------------------------------

*** Important !! ***
The only thing you must remember is that the DS register must be pointing to the Data segment. When you enter your procedure, DS is pointing to the right place, but if you change it, remember to restore it! You can do so at any time by doing this:

    MOV cx, @DATA
    MOV ds, cx

@DATA is the segment address of our data segment.

Another thing to note is that your variable values will not be reset when you exit the routine. Eg, if your ASM procedure sets the variable 'NumWeapons' to 20, then you exit to QBasic, then you enter the ASM again, its value will still be 20.
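Here's a minimal sketch of that last point, a variable keeping its value between calls. The names CountCalls and callcount are made up for this example, and it assumes the same .model setup used elsewhere in this document. From QB you would declare it as DECLARE FUNCTION CountCalls% ().

```asm
.model medium, BASIC

.data
callcount DW 0              ; hypothetical variable, starts at zero

.code
public CountCalls

CountCalls proc
    inc callcount           ; DS still points at our data segment on entry
    mov ax, callcount       ; the integer result goes back to QB in AX
    ret                     ; no parameters, so nothing to clean off the stack
CountCalls endp

end
```

Each call returns one more than the last (1, 2, 3...) because callcount is never reset between calls.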

12.1 Arrays

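We can use arrays in our ASM as well. It's super easy:

    our_array DB 512 dup (0)

Creates an array of 512 bytes and sets each element to zero.

    mr_array2 DD 100 dup (?)

Creates an array of 100 Dwords. Each element is uninitialised.

To access them we treat them like normal arrays and use square brackets:

    MOV al, our_array[15]   ; accesses 16th element

Or we could use a register like this:

    MOV al, our_array[BX]   ; accesses the element whose index is stored in BX

Here BX is acting as the index. Note that the number in the brackets is a BYTE offset, not an element number. For a byte array they are the same thing, but for a Word array you must multiply the element number by 2, and for a Dword array by 4. You can only use certain registers as an index. They are: BX, BP, SI and DI. Be careful with BP: unless you say otherwise, it addresses the stack segment (SS) rather than DS.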
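As a quick sketch of indexing something bigger than bytes, here is a Word array being read by element number. The array name scores is made up for the example; the only real trick is the shift that turns the element number into a byte offset.

```asm
.data
scores DW 10 dup (0)        ; hypothetical array of 10 words

.code
    mov bx, 3               ; we want element number 3 (the 4th element)
    shl bx, 1               ; words are 2 bytes each, so byte offset = 3 * 2
    mov ax, scores[bx]      ; AX = the 4th word in the array
```

For a Dword array you would shift left by 2 instead (multiply by 4).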

13. Writing your routines

Before you jump in and start writing your super-optimized sprite routine in ASM, try writing it in QBasic, in an assembly style. Then use it as a reference when translating the algorithm to ASM. Also, if you haven't done so already, check out the source code from some other libraries around. Our library contains source code and I know that DirectQB and Dash come with source code so go through it and see if you can understand it. Don't copy it directly! You won't learn anything that way...

13.1 A good example: A keyboard handler

Here is a good example that covers a lot of the ASM programming techniques discussed in this document. It is a fully operational keyboard handler for taking multiple keypresses. The QBasic declarations are:

    DECLARE SUB KeyboardOn()
    DECLARE SUB KeyboardOff()
    DECLARE FUNCTION GetKey%(BYVAL Scancodenumber%)

-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -
.model medium, BASIC
.stack 200h

.data
keybflags    DB ?               ; original flags
keymatrix    DB 128 dup(0)      ; holds the key map
keybon       DB 0               ; tells us if the keyb is on
dos_int_seg  DW ?               ; old key handler seg
dos_int_off  DW ?               ; old key handler off

.code
.386

public KeyboardOn
public KeyboardOff
public GetKey

KeyboardOn proc                 ; Switch our keyboard on
    cmp keybon, 1               ; already on?
    jne keybon_cont
    ret
keybon_cont:
    mov ax, 40h                 ;
    mov es, ax                  ; Stores the current keyboard
    mov di, 17h                 ; flags state
    mov al, es:[di]             ;
    and al, 70h
    mov keybflags, al
    push ds
    mov ax, 3509h               ; Function 35h Get Int Vector
    int 21h                     ; (returns the old handler in ES:BX)
    pop ds
    mov dos_int_off, bx         ; Store the old address
    mov dos_int_seg, es         ;
    mov ax, SEG new_key_int     ; Get the SEG & OFF of our
    mov dx, OFFSET new_key_int  ; procedure
    push ds
    mov ds, ax                  ; DS:DX = our new handler
    mov ax, 2509h               ; Function 25h Set Int Vector
    int 21h
    pop ds
    mov keybon, 1
    ret
KeyboardOn endp

new_key_int proc                ; This is the new interrupt ISR
    push ax                     ; The CPU will call this everytime
    push bx                     ; you press or release a key
    push si
    push ds
    in al, 60h                  ; Read the pressed key from port 60h
    xor ah, ah                  ;
    mov si, ax                  ; The rest of this stuff
    in al, 61h                  ; just resets the keyboard
    or al, 82h                  ; flip-flop and tells the
    out 61h, al                 ; keyboard that we received
    and al, 127                 ; the key
    out 61h, al                 ;
    mov al, 20h                 ;
    out 20h, al                 ;
    mov bl, 1                   ; Assume it's a Make code
    test si, 128                ; is it >= 128 ?
    jz store_key                ; no: it really is a Make code
    and si, 127                 ; yes: it's a Break code, strip the top bit
    xor bl, bl                  ; and mark the key as released
store_key:                      ;
    mov ax, @DATA               ; DS could be anything inside an ISR,
    mov ds, ax                  ; so point it at our data first
    mov keymatrix[si], bl       ; Store the new state in our array
    pop ds                      ;
    pop si
    pop bx
    pop ax
    iret                        ; IRET = interrupt return
new_key_int endp

KeyboardOff proc
    cmp keybon, 0               ; already off?
    jne keyboff_cont
    ret
keyboff_cont:
    push ds
    mov dx, dos_int_off         ; Get the old SEG:OFF
    mov ax, dos_int_seg         ;
    mov ds, ax                  ;
    mov ax, 2509h               ; 25h = Set Int Vector
    int 21h
    pop ds
    mov ax, 40h                 ; Restore the Key flags
    mov es, ax                  ;
    mov di, 17h                 ;
    mov al, keybflags           ;
    mov es:[di], al             ;
    mov ax, SEG keymatrix       ;
    mov di, OFFSET keymatrix    ; Set our array to zeros
    mov es, ax                  ;
    xor ax, ax                  ;
    mov cx, 128                 ;
    rep stosb                   ;
    mov keybon, al              ; AL is still zero here
    ret
KeyboardOff endp

; stack
;
; code       4
; ret seg    2
; ret off    0
;
GetKey proc                     ; All this does is return
    mov cx, bp                  ; the key state of a
    mov bp, sp                  ; given key scancode
    mov si, [bp+4]              ;
    mov al, keymatrix[si]       ; A list of scancodes is available
    xor ah, ah                  ; in the QB help
    mov bp, cx
    ret 2
GetKey endp

end
-Cut here - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - - - - -

The whole keyboard thing works like this: when you press a key, that key's number (its scancode) is sent from the keyboard; when you release the key, the key number plus 128 is sent. These are called the 'Make' and 'Break' codes respectively. The handler stores a 1 in keymatrix for a Make and a 0 for a Break, so GetKey returns 1 for as long as the key is held down. You are free to use the above code in any way you want. It's a standard procedure so do what you want!

13.2 A Super Fast Pixel Plot

Thought someone might like this. It's the fastest pixel plotting routine I have ever seen. I mean, the method it uses to get the offset is pretty cool - it only involves one shift right. And who wrote it? Me! Oh, I *love* blowing my own horn...;)

    ; stack
    ;
    ; seg       10
    ; x          8
    ; y          6
    ; colour     4
    ; ret seg    2
    ; ret off    0
    ;
    RSPset proc
        mov ax, bp              ; (1)
        mov bp, sp              ; (1) get stack pointer
        mov es, [bp+10]         ; (1) get seg
        mov bh, [bp+6]          ; (1) bh = y, so bx = y * 256 (plus junk in bl)
        shr bx, 2               ; (3) bx = y * 64 (plus junk in the low 6 bits)
        and bl, 11000000b       ; (3) clear the junk
        add bh, [bp+6]          ; (2) y is never more than 199 so this can't
                                ;     overflow, and it's = to + y*256
        add bx, [bp+8]          ; (2) add x
        mov cl, [bp+4]          ; (1) get colour
        mov es:[bx], cl         ; (1)
        mov bp, ax              ; (1)
        ret 8                   ; (5)
                                ; = 22 clock ticks!
    RSPset endp

Someone might have to correct me on the clock ticks though. It's probably something like 30-35...who knows. Let's see if someone can make it faster!!!

13.3 Algorithms

To help you on your way, I've included a few standard-ish algorithms you can use to write your ASM routines.

Lines
----------
You can forget about me giving you the algorithm for drawing a line between any two points. For that you need the Bresenham algorithm, which is a little complex. Search for it on the internet, you'll find what you need. For now, here's a Vertical line routine:

    Vertical:
        Get the destination segment in ES
        Get Y1
        Get Y2
        Is Y1 > Y2? If yes, swap them
        Calculate DI by Y1 * 320 + X
        Subtract Y1 from Y2, add 1 and store this value (it's the counter)
    Y_LOOP:
        Get the colour
        Plot it to ES:DI
        Add 320 to DI
        Decrement the counter
        Are we at zero? If no jump to Y_LOOP

The Horizontal one's even easier. You can use a REP STOSB.

Sprite Plot
-------------
I think this sprite algorithm is the one used in DirectQB's sprite plotting routine. It's definitely not the fastest way of doing it because it performs a lot of checking in the X and Y loops. A much faster version would calculate what does and doesn't need plotting beforehand. (Hey, those guys at Shimmer/Ribbonsoft did one of those! ;) See if you can improve it!

        Set DS to the sprite segment
        Set ES to the buffer segment
        Set SI to the sprite offset
        Point DI to x,y-1
        Get the Sprite width & height by reading the first two words from DS:SI
        Get the Y value and subtract 1
        Add the sprite width to DI
    Y_LOOP:
        Add the sprite width to SI
        Add the screen width to DI (move to the next line)
        Decrement the Height counter
        Is it zero? If yes, jump to END_PLOT
        Make sure the Y position is within screen boundaries
        Subtract the sprite width from SI and DI
        Store the sprite's X value
        Store the sprite's width
    X_LOOP:
        Get a pixel from the sprite buffer
        Is it zero? If yes jump to SKIP_PLOT
        Make sure the X position is within screen boundaries
        If yes, draw it on-screen
    SKIP_PLOT:
        Increment the sprite's X value
        Decrement the sprite's width
        Jump if not zero to X_LOOP
        Jump to Y_LOOP
    END_PLOT:
        Cleanup stack, exit routine

Rotation
-------------
Doing sprite rotation is really a general algorithms exercise. I recommend that you, again, write the rotation in QBasic first, limiting yourself by writing the code in an assembly style. If you know anything about rotation, you'll know you need a load of SIN and COS values, 360 of each to be exact, one for each degree. So how do you access these tables in ASM? Well, you could create an array in QBasic and store the values there and then pass the address of this array on to the routine...or...you could precompute all the values and write them into the ASM file as a series of DW words. They would be much easier to access and handle. And it'll give you more space in QBasic. Check out the sources of DirectQB and our RibbonSoft VGA library to see how to implement this. If you don't know how to rotate points, there is a file called QBRotate.zip somewhere on our site. It gives all the info you need to know.

Scaling
-------------
For scaling, try writing a version using X-Step, Y-Step technology. These two values contain a decimal increment for the X,Y plotting position. This is very easy to write in a High Level Language, but it's also very slow. This is because of the smelly floating point. What we need is fixed point mathematics which is covered in Section 14: Optimization.

Translucency
--------------
Here's a lovely effect you can produce with minimum effort. For this you'll need a gradient palette, arranged in gradients of 32 shades each (that's 8 gradients in all) from dark to light. In the inner loop of your sprite plotting routine, or whatever, single out the part where you get the pixel colour from your sprite buffer:

        GET pixel from sprite buffer
        MOD it with 32 (or AND it with 31)
        Calculate the base of the colour gradient **
        GET pixel from destination screen buffer (where you're gonna put it)
        MOD it with 32 (or AND it with 31)
        Add the two pixel numbers together
        Divide by 2 to get an average (SHR 1)
        Add this average to the base
        Plot the pixel

** The colour gradient base is the colour which is the first in the grade. Ok, we have a grade from Black to White from colours 0 -> 31 and we have a Dark red to Light red grade from colours 32 -> 63.

    The base colour for a colour between 0 and 31 is 0
    The base colour for a colour between 32 and 63 is 32

You see what I mean? With a gradient palette, you can do all kinds of cool effects:

- Darken a patch of the screen for an overlayed menu effect
- Plot a sprite in greyscale/greenscale/pinkscale/anything
- Even realtime lighting effects are possible!
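The translucency steps above can be sketched for a single pixel like this. It's only a rough sketch under a couple of assumptions: AL already holds the pixel fetched from the sprite, ES:DI already points at the destination pixel, and the palette is laid out in gradients of 32 colours as described. The "base" is taken from the sprite pixel, so the result stays in the sprite's colour.

```asm
; AL = sprite pixel, ES:DI -> destination pixel (set up by the caller)
    mov bl, al
    and bl, 11100000b       ; BL = base of the sprite's gradient (0, 32, 64...)
    and al, 31              ; AL = sprite shade, 0-31
    mov ah, es:[di]         ; get the pixel that's already on the buffer
    and ah, 31              ; AH = destination shade, 0-31
    add al, ah              ; sum is at most 62, so a byte is plenty
    shr al, 1               ; divide by 2 to get the average shade
    add al, bl              ; put the average back into the sprite's gradient
    mov es:[di], al         ; plot the blended pixel
```

Note that every value lives in a register and the only memory accesses are the one read and one write of the destination pixel, which is what you want inside an inner loop.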

14. Optimization

This section gives some ways to make your procedures even faster.

14.1 32-Bit programming

How do we make our code faster? We can use 32-Bit for a start. You can replace:

    MOV CX, 32000
    REP MOVSW

in the buffer copy routine with:

    MOV CX, 16000   ; move 16000 dwords (64000 bytes)
    REP MOVSD

All the string instructions that end with a 'B' or 'W' (MOVS, STOS, LODS and friends) can now end with a 'D'. Ace. Remember this only works on 386+ machines, and you need the .386 directive in your ASM file.

The other prize we are given for using 32-bit is double sized registers. These registers are an extension of the existing 16-bit ones and have the prefix 'E'.

    EAX -> AX
    EBX -> BX
    ECX -> CX
    EDX -> DX
    EDI -> DI
    ESI -> SI
    ESP -> SP
    EBP -> BP

Each 'E' register is 32-Bit and its 16-bit version is simply its lower half, in the same way that AL is the lower half of AX. There are no EES, EDS, ESS and ECS, though. Why? Well you'll find out once you start doing protected mode programming...and, yes...you *will* code in protected mode one day. Soon you will grow tired of QB's primitive ways and discover the joy of DJGPP...soon..*ahem*..
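A tiny sketch of how the 16-bit and 32-bit versions of a register overlap (the values are arbitrary, picked so you can see which half changes):

```asm
.386                        ; enable 386 instructions and 32-bit registers
    mov eax, 12345678h      ; EAX = 12345678h, so AX = 5678h
    mov ax, 0FFFFh          ; only the lower half changes: EAX = 1234FFFFh
    shr eax, 16             ; bring the upper half down: AX = 1234h
```

The upper half of an 'E' register has no name of its own, so shifting (or rotating) is the way to get at it.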

14.2 Fixed Point Math

It's a little strange to me calling it 'Math' as we call it 'Maths' in the UK, but since 80% of the readers of this document will be from the US I might as well conform.

We use Fixed Point to represent numbers with a decimal point that doesn't move. This is much faster than using Floating point numbers which are encoded in a very complex way. With fixed point, we can add, subtract, multiply and divide in the same way we would standard integers.

This is how it works. We take a 16-bit number (a standard sized register) and split it into two parts. The higher (left) part will be our whole number. The lower part will be our fractional part. We can put an imaginary decimal point between them. Observe:

                   Whole part               |        Decimal part
    Bit:    16  15  14  13  12  11  10   9  |   8    7    6  . . .    2      1
    ----------------------------------------------------------------------------
    Value: 128  64  32  16   8   4   2   1  | 1/2  1/4  1/8  . . .  1/128  1/256

Sorry, I couldn't fit the rest of the values in the decimal part.

So we use the upper 8 bits as the whole number and the lower 8 bits as the decimal part. This is called 8.8 fixed point. The disadvantage of this is that we can only represent numbers in the range 0-255, but we do have 8 bits of accuracy. Another method is 9.7 fixed point. This way you have the range 0-511 but slightly less accuracy.

Let's look at an example using 9.7: Imagine we have the number 100 and we want to put it in a loop and increment it by 0.5 every cycle. Here is some assembly language to do it..

        MOV dx, 100     ; Start with a standard 100
        SHL dx, 7       ; Multiply it by 128
    next:
        ADD dx, 64      ; Add 64. Same as adding 0.5
        .
        .
        JNZ next

Adding 128 would give the effect of adding 1 each time. Adding 1 would have the effect of adding 1/128 each time. So now you've got this fixed point number, what do you do with it? You Shift RIGHT 7 places and then you have the whole number part.

Assigning a fixed point number
------------------------------
We set the variable or register to the value we want to scale and then multiply it by our scale factor. If we're using 9.7 we multiply by 128 or, in assembly language, Shift Left 7 places.

Let's look at how we use them. Remember that these fixed-point numbers are imaginary and the CPU will add and subtract like they were normal 16-bit numbers.

Example:

    MOV ax, 54
    SHL ax, 7       ; Creates 54.0
    MOV bx, 17
    SHL bx, 7       ; Creates 17.0
    ADD ax, bx      ; Result: 71.0
    SUB ax, bx      ; Result: 54.0

Adding and subtracting is pretty straightforward, but Multiplying and dividing are a little tougher. When we multiply two fixed point numbers, we must remember that they have BOTH been scaled by a given factor, in our case 128, so the product is scaled by 128 * 128. We must shift it back down 7 places to correct it. The product is also too big for one 16-bit register: IMUL leaves the full 32-bit result in DX:AX, so we shift the whole DX:AX pair down with SHRD (a 386 instruction).

Example:

    MOV ax, 20      ;
    SHL ax, 7       ; Creates 20.0
    MOV bx, 5       ;
    SHL bx, 7       ; Creates 5.0
    IMUL bx         ; Multiplies AX by BX and stores the result in DX:AX
    SHRD ax, dx, 7  ; Divides DX:AX by 128, AX = 100.0

Dividing is a bit trickier still. When we divide, the two scale factors cancel out and we lose our factor of 128 altogether. We must put it back by shifting the dividend to the LEFT 7 places first. Then we can divide. The only problem with this is if we shift left another 7 places, we'll lose the significant part of our number! This problem is resolved by using 32-bit fixed point numbers. We can use EAX instead of AX and have great accuracy and range by using 16.16 fixed point.

Applications of Fixed Point
---------------------------
Here's a good one: rotation. We have all our SIN and COS values scaled up by 128 or 256 or whatever, then in our assembler we do our rotation calculations and scale the result back down to get our new X,Y position. Check out our RibbonSoft VGA library to see this in action!
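Since division is the one operation without an example, here is a rough sketch of dividing two 9.7 numbers by borrowing a 32-bit register for the pre-shift, as described above. It is just one way of doing it (the values are picked for illustration) and it assumes the quotient fits back into 16 bits; if it doesn't, IDIV will raise a divide error.

```asm
.386
    mov ax, 20
    shl ax, 7               ; AX = 20.0 in 9.7 fixed point (2560)
    mov bx, 5
    shl bx, 7               ; BX = 5.0 in 9.7 fixed point (640)

    movsx eax, ax           ; widen the dividend to 32 bits, keeping the sign
    shl eax, 7              ; scale it up by 128 again; nothing is lost now
    mov edx, eax
    shr edx, 16             ; 16-bit IDIV wants the dividend in DX:AX
    idiv bx                 ; AX = 327680 / 640 = 512, which is 4.0 in 9.7
```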

14.3 Inner loops

The inner loop is the bottleneck to high performance software. At least that's what it says in the VESA SVGA specification... The inner loop, or more commonly the 'X' loop, must be made as fast as possible as this is the part that'll be executed the most. We can do this by making sure we use registers for all our values in the loop, and that there are no PUSHes or POPs and no memory references, as these are much slower to access than a register. Also all required calculations must be done outside the loop, if possible. So that means no MULs, IMULs, DIVs or IDIVs (unless you really HAVE to).

Another technique is to Unroll our loops. Consider this example which calculates the sum of 100 16-bit word integers:

    ; assume ES:DI points to a list of Word integers
        MOV cx, 100
        XOR ax, ax
    nextnumber:
        ADD ax, es:[di]
        ADD di, 2       ; move 2 bytes (onto the next word)
        DEC cx
        JNZ nextnumber

This seems innocent enough but it can be made faster:

    ; assume ES:DI points to a list of Word integers
        MOV cx, 100
        XOR ax, ax
    nextnumber:
        ADD ax, es:[di]
        ADD ax, es:[di+2]
        ADD ax, es:[di+4]
        ADD ax, es:[di+6]
        ADD ax, es:[di+8]
        ADD di, 10      ; move 10 bytes
        SUB cx, 5       ; we've added 5 words... SUB sets the zero flag
                        ; itself, so we don't need a CMP
        JNZ nextnumber

This has the same effect as the previous example, but it is faster because the loop overhead (updating DI and CX and doing the jump) is only paid 20 times instead of 100. It works because 100 divides evenly by 5.

14.4 Other tips

These are just a few little useful pieces I didn't know where to put.

- Make sure you always use XOR ax, ax instead of MOV ax, 0 (just remember that XOR changes the flags and MOV doesn't)

- You can use the stack for temporary storage by pushing a load of blank registers and referencing the 'slots' with BP

- If you have a lot of items on the stack which you need to remove, instead of doing:

      pop bx
      pop bx
      pop bx
      pop bx

  You can simply do:

      ADD sp, 8

- If you have a number which needs a Shift-Left 8 places, instead of

      SHL dx, 8

  use

      XCHG dh, dl
      XOR dl, dl

  This moves DL into DH which has the same effect as a shift but is much faster!

- The format of a sprite stored by QB's GET is as follows:

      Element    Size      Content
      Array(0)   2 Bytes   Contains the Width of the sprite * 8
      Array(1)   2 Bytes   Contains the Height
      Array(x)   ??        All pixel data in a linear format from
                           left to right, top to bottom

  I'm not sure why QB multiplies the width by 8. I think I knew once but I forget...pih. (Most likely it's because the width is stored in bits rather than pixels, and SCREEN 13 uses 8 bits per pixel.) Oh, to get the correct size, just SHR 3 times.

- As of the 486 onwards ADD and SUB are the same speed as INC and DEC. It used to be that they were slightly slower, not anymore. Also these instructions...

      MOV ax, [bp]
      MOV [di], cx
      MOV ax, bx
      MOV ax, 1

  ...are all the same speed. Some people have been reading very out of date, old, smelly ASM guides. These guides have been for old 8086 and 286 processors. They are no good! An example would be the Peter Norton Guide to Assembly language. It's waaaaay out of date. The most recent ASM guide I could find is HelpPC which is a nice TSR that has lots of other cool stuff in it. It's available on our site. Anyway, some people thought that all those memory MOV's were really slow. They're only 1 clock tick. OK??!!!! Sorry.
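To tie the GET format tip to something concrete, here's a small sketch of reading the header of a sprite. It assumes DS:SI already points at the start of an array filled by GET in SCREEN 13, and the choice of CX and DX for the results is just for illustration.

```asm
; DS:SI -> start of a sprite array stored by GET (assumed set up already)
    mov cx, ds:[si]         ; first word: the width * 8
    shr cx, 3               ; CX = width in pixels
    mov dx, ds:[si+2]       ; second word: DX = height in pixels
    add si, 4               ; DS:SI now points at the first pixel
```

This is the same "read the first two words from DS:SI" step that opens the sprite plot algorithm in section 13.3.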

15. Closing words

Um. I wrote this entire document over the weekend and sacrificed my social life.

Feel free to point out my mistakes, I just love that sort of thing. Also, if you'd like to contribute to this document, then that's great! Send me your stuff and I'll release a 1.5 version. The same applies to anything you'd like to see included in a future version.

I hope this document has helped in some way. If it hasn't, then I've successfully wasted 18 hours of my life.

If however you were enlightened in one way or another by this document then I'd love a bit of email...

Thanks for listening!
Hi to Duncan!
- CGI Joe 2000