The QBNews                                                     Page  8
Volume  3, Number  1                                    March 29, 1992

----------------------------------------------------------------------
         M i c r o s o f t   B A S I C   P D S   7 . 1
----------------------------------------------------------------------

Converting ASM String Functions to BASIC 7.X by Jay Munro

Life was easy when QuickBASIC only had one kind of variable length
string and it was always in DGroup. QuickBASIC used Near Strings with
descriptors in a simple format: two bytes for the length, and two
bytes for the offset into the String Segment, or DGroup. That all
changed with BASIC 7.x PDS.

While the Near String format still lives on with BASIC 7.x as a
compiler option, the QBX environment requires your assembler functions
to use Far Strings. Far String data is located in segments other than
DGroup, with several to choose from. The Main module strings and
string arrays have one 64k segment, procedure level simple strings and
temporary strings share another 64k segment, and Common strings are in
yet a third 64k segment. That's a lot of space, but it does complicate
working with strings.

To begin with, the string and array descriptors for all strings are
kept in DGroup, regardless of where the data is stored. This is
important to keep in mind when working with Far Strings, since it is
still possible to run out of DGroup before you run out of Far String
space. Having the descriptors in DGroup lets you pass the string to an
assembler routine as a Near Pointer, just as in earlier versions of
QuickBASIC.

The real differences between Near and Far Strings become apparent once
inside the assembler routine. To get the length of a Near String, you
read the first two bytes of the descriptor. With Far Strings, the
pointer (or address) to the descriptor is pushed and the StringLength
function is called. StringLength returns the length of the string in
AX. In our Trim routine, we move AX into CX for use in the loop later.
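
As a minimal sketch of the difference just described, the two routines
below each return the length of a string passed from BASIC. The names
NearLen, FarLen and Desc are hypothetical and do not come from the
article's source code. NearLen assumes the program uses Near Strings;
FarLen assumes it is compiled with /FS or run inside QBX.

```asm
;--- BASIC declarations (hypothetical names)
;    Declare Function NearLen%(Work$)
;    Declare Function FarLen%(Work$)

.Model Medium,BASIC
    Extrn StringLength:Proc

.Code

;Near Strings only: the descriptor is LENGTH:OFFSET in DGroup
NearLen Proc, Desc:Word
    Mov  BX,Desc            ;BX = near pointer to the descriptor
    Mov  AX,[BX]            ;first word of the descriptor is the length
    Ret                     ;length is returned in AX
NearLen EndP

;Far Strings: ask BASIC for the length instead of reading it
FarLen Proc, Desc:Word
    Mov  BX,Desc            ;BX = near pointer to the descriptor
    Push BX                 ;pass the descriptor pointer
    Call StringLength       ;BASIC returns the length in AX
    Ret                     ;length is returned in AX
FarLen EndP
    End
```

In both cases BASIC passes the same thing, a near pointer to the
descriptor in DGroup; only the way the routine interprets the
descriptor changes.
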

For a Near String descriptor, the second two bytes of the descriptor
point to the string data that is stored in DGroup. To get the address
of Far String data, the descriptor is again pushed and the
StringAddress function is called. The StringAddress function returns
the address of the string as a Far Pointer in DX:AX.

An interesting undocumented side effect of StringAddress is that CX
contains the length when it returns, along with the address in DX:AX.
Microsoft will not confirm this, and cautions against depending on
undocumented features because they are subject to change, but it does
work in both BASIC 7.0 and 7.1. You can decide for yourself if you
want to drop StringLength.

When StringAddress and StringLength are called, the SS, BP, DS, SI,
and DI registers are preserved, but ES is not. You can see when
running a program under CodeView that ES returns with the segment of
the far string after the StringAddress call. Normally this is
harmless, but when your routine is juggling several incoming strings,
ES may be pointing to another segment. The trashing of ES may go
unnoticed when all strings the routine is processing are in the same
segment. However, if the strings are from different segments, like
Common and Procedure level strings, it becomes apparent.

The Far String descriptor format also applies to variable length
string arrays. To access strings in an array, the VarPtr of the first
element of the array is passed as shown in the following code snippet,
and the pointer is incremented by 4 to get each successive string. The
function AddEll adds up the lengths of the strings in Array$().

'--- BASIC code
Declare Function AddEll&(Byval Address%, Byval NumEls%)
Dim Array$(1 to 200)
.... (fill array)....
TotalSize& = AddEll&(VARPTR(Array$(1)), 200)

;---- assembler code
.Model Medium,BASIC
    Extrn StringLength:Proc

.code
AddEll Proc Uses SI DI, Array:Ptr, NumEls:Word
    Xor  DI,DI              ;use DI to track total lengths
    Mov  CX,NumEls          ;get numels (byval remember)
    Jcxz Exit               ;jump out if 0 els
    Mov  SI,Array           ;get pointer to first descriptor
CountLoop:
    Push CX                 ;preserve CX through count loop
    Push SI                 ;push pointer
    Call StringLength       ;get length
    Add  DI,AX              ;add length to counter
    Pop  CX                 ;restore loop counter
    Add  SI,4               ;point to next 4 byte descriptor
    Loop CountLoop          ;keep going 'til done
Exit:
    Xor  DX,DX              ;clear DX to return long
    Mov  AX,DI              ;and return total count in AX
    Ret
AddEll EndP
    End

For QB45 or Near Strings, the code can be much more compact, and
thereby faster.

;near String version
AddEll Proc, Array:Ptr, NumEls:Word
    Xor  AX,AX              ;use AX to track total lengths
    Mov  CX,NumEls          ;get numels (byval remember)
    Jcxz Exit               ;jump out if 0 els
    Mov  BX,Array           ;get pointer to first descriptor
CountLoop:
    Add  AX,[BX]            ;first word of descriptor is length
    Add  BX,4               ;point to next 4 byte descriptor
    Loop CountLoop          ;keep going 'til done
Exit:
    Xor  DX,DX              ;clear DX to return long
    Ret
AddEll EndP
    End

Now that we've discussed retrieving Far String information, let's move
on to our last subject: Far String Functions.

STRING FUNCTIONS

Far String Functions are only a little more complex than their Near
String counterparts. In a Near String function, the four byte
descriptor (LENGTH:ADDRESS) is created in DGroup, as is the data that
the address points to. With Far Strings, the four byte descriptor must
still be in DGroup, but the data can be in any segment. Unlike Near
String descriptors, which are created by the programmer, the Far
String descriptor is created by calling BASIC's StringAssign function.
StringAssign is called with the Length, Segment and Address of the
data you want to assign, and the Length, Segment and Address of the
destination.
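
A call to StringAssign can be sketched as follows. This is not the
article's ZeroTrim program; the names Hello, Msg, MsgLen and Result
are hypothetical. The push order (source segment, offset, length, then
destination segment, offset, length) is an assumption based on the
StringAssign interface in the BASIC PDS documentation, so check it
against your manual before relying on it.

```asm
;--- BASIC declaration (hypothetical name)
;    Declare Function Hello$()

.Model Medium,BASIC
    Extrn StringAssign:Proc

.Data
    Msg     DB  'Hello'         ;source text, here kept in DGroup
    MsgLen  EQU $ - Msg         ;length of the source text
    Result  DD  0               ;descriptor, null before the first call

.Code
Hello Proc
    Push DS                     ;segment of the source data
    Mov  AX,Offset Msg
    Push AX                     ;offset of the source data
    Mov  AX,MsgLen
    Push AX                     ;length of the source data
    Push DS                     ;segment of the destination descriptor
    Mov  AX,Offset Result
    Push AX                     ;offset of the destination descriptor
    Xor  AX,AX
    Push AX                     ;0 = destination is variable length
    Call StringAssign           ;BASIC copies the data, fills Result
    Mov  AX,Offset Result       ;return pointer to descriptor in AX
    Ret
Hello EndP
    End
```

The descriptor lives in the routine's own data area and starts out as
zero, so the first call sees no old string attached to it.
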

The StringAssign function can assign both variable length and fixed
length strings. With fixed length strings, StringAssign uses the
length parameter of the destination. When assigning a variable length
string, the destination length is set to zero so StringAssign knows to
create a descriptor and store the data. The accompanying program,
ZeroTrim, illustrates the technique.

There are two caveats to be aware of when using Far Strings and
StringAssign. First, when StringAssign is called with the address of
the descriptor (or destination), it checks to see if the descriptor
contains a value or not. If it sees a value, StringAssign assumes it
is an old descriptor and will try to deallocate the string to make
room for the new string. On the first entry into your assembler
function, the descriptor should be null so a new descriptor is
allocated. However, if your function clears the descriptor between
calls, BASIC will end up with orphan data from old strings not being
deallocated.

On the other side, if your program does a CLEAR command after a call
to a string function, variables on the BASIC side will be cleared, but
the descriptor stored in your assembler routine's data area will still
exist. If the assembler routine then calls StringAssign, BASIC will
try to deallocate the data and you'll end up with a String Space
Corrupt message. The rule is: don't call CLEAR once the program
starts, and don't clear the descriptor between calls.

If your program needs to free up the space, you may call the
StringRelease function to deallocate the descriptor. This is only
recommended if the function is not going to be called again, since
StringAssign will always release the old string before assigning a new
one. The syntax for StringRelease is as follows.

.Data
    Descriptor  DD  ?

.Code
Some Proc
    ....
    ..
    ..
    Push DS
    Push Offset Descriptor
    Call StringRelease

Another problem programmers may encounter with Far Strings happens
when assigning a zero length string. Under Near Strings, the length
portion of the descriptor is just zeroed out and BASIC understands.
Far Strings are not quite so simple, but the concept is similar.
Though there are several ways to do it, the most reliable is to point
to an actual data area and tell StringAssign that the length is zero.
StringAssign will then deallocate the old descriptor if there is one,
and allocate a new one that BASIC recognizes as a zero length string.
Again, it is important to let BASIC go through the motions so it
deallocates the old strings.

As you can see, the conversion of assembler routines from Near Strings
to Far Strings is not too difficult. By using the Far String functions
and watching your registers, the transition is almost painless. Once a
routine is converted to handle Far Strings, it may be used with either
BC 7.x compiler option. The Far String functions are automatically
replaced with near string code when you leave off the /FS switch,
eliminating the need for two versions of the routines for BASIC 7
support. If you need to support QuickBASIC 4.x, you will need a second
version for near strings only.

SOURCE CODE FOR THIS ARTICLE CAN BE FOUND IN FARSTRNG.ZIP

======================================================================
Jay Munro is a programmer at Crescent Software and also writes
utilities for PC Magazine in his spare time. He was given the task of
converting Crescent's QuickPak Pro to use Far Strings, and more
recently, converting QuickPak Pro for use in Visual Basic. Jay can be
contacted through The Crescent Software Support BBS at 203-426-5958,
or in care of this newsletter.
======================================================================