The QBNews                                                     Page  8
     Volume  3, Number  1                                    March 29, 1992


     ----------------------------------------------------------------------
                  M i c r o s o f t   B A S I C   P D S   7 . 1
     ----------------------------------------------------------------------

     Converting ASM String Functions to BASIC 7.X by Jay Munro
     
     Life was easy when QuickBASIC only had one kind of variable length
     string and it was always in DGroup.  QuickBASIC used Near Strings with
     descriptors in a simple format--two bytes as the length, and two bytes
     as the offset into the String Segment, or DGroup.  That all changed
     with BASIC 7.x PDS.
     
     While the Near String format still lives on with BASIC 7.x as a
     compiler option, the QBX environment requires your assembler functions
     to use Far Strings. The Far Strings data is located in segments other
     than DGroup, with several to choose from. The Main module strings and
     string arrays have one 64k segment, procedure level simple strings and
     temporary strings share another 64k segment, and Common strings are in
     yet a third 64k segment.  That's a lot of space, but it does
     complicate working with strings.
     
     To begin with, the string and array descriptors for all strings are
     kept in DGroup, regardless of where the data is stored.  This is
     important to keep in mind when working with Far Strings, since it is
     still possible to run out of DGroup before you run out of Far String
     space.  Having the descriptors in DGroup lets you pass the string to an
     assembler routine as a Near Pointer, just as in earlier versions of
     QuickBASIC.   
     
     The real differences between Near and Far Strings become apparent once
     inside the assembler routine.  With a Near String, the length is
     simply the first two bytes of the descriptor.  With Far Strings, the
     pointer (or address) to the descriptor is pushed and the StringLength
     function is called.  StringLength returns the length of the string in
     AX.  In our Trim routine, we move AX into CX for use in the loop
     later.
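     
     As a rough sketch of the difference, the two approaches might look
     like the following.  FarLen and Strng are made-up names for
     illustration and are not taken from the Trim routine.
     
     '--- BASIC code
     Declare Function FarLen%(Strng$)
     
     ;---- assembler code (near strings: read the descriptor directly)
          Mov  BX,Strng            ;BX = near pointer to descriptor
          Mov  CX,[BX]             ;first word of descriptor is length
     
     ;---- assembler code (far strings: ask BASIC for the length)
     .Model Medium,BASIC
     Extrn StringLength:Proc
     
     .code
     FarLen Proc Strng:Word
          Push Strng               ;push near pointer to descriptor
          Call StringLength        ;length comes back in AX
          Mov  CX,AX               ;keep a copy in CX for a later loop
          Ret                      ;AX still holds the length for BASIC
     FarLen EndP
     End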
     
     For a Near String descriptor, the second two bytes of the descriptor
     point to the string data that is stored in DGroup.  To get the address
     of Far String Data, the descriptor is again pushed and the
     StringAddress function is called.  The StringAddress function returns
     the address of the string as a Far Pointer in DX:AX.  An interesting
     undocumented side effect of StringAddress is that CX contains the
     length when it returns, along with the address in DX:AX.  Microsoft
     will not confirm this, and cautions against depending on undocumented
     features because they are subject to change, but it does work in both
     BASIC 7.0 and 7.1.  You can decide for yourself if you want to drop
     StringLength.
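     
     Here is a minimal sketch of the documented approach, which calls
     StringLength rather than relying on CX.  FirstChar is a hypothetical
     function (not part of the accompanying source) that returns the
     ASCII value of the first character of a string, or 0 for a null
     string.
     
     '--- BASIC code
     Declare Function FirstChar%(Strng$)
     
     ;---- assembler code
     .Model Medium,BASIC
     Extrn StringLength:Proc
     Extrn StringAddress:Proc
     
     .code
     FirstChar Proc Uses SI, Strng:Word
          Mov  SI,Strng            ;SI = near pointer to descriptor
          Push SI
          Call StringLength        ;AX = length of string
          Or   AX,AX               ;is it a null string?
          Jz   Exit                ;yes, return with AX = 0
          Push SI                  ;SI survived the call
          Call StringAddress       ;DX:AX = far pointer to string data
          Mov  ES,DX               ;ES:BX = first character
          Mov  BX,AX
          Mov  AL,ES:[BX]          ;fetch it
          Xor  AH,AH               ;and return it as an integer
     Exit:
          Ret
     FirstChar EndP
     End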
     
     When StringAddress and StringLength are called, SS, BP, DS, SI, and DI
     registers are preserved, but ES is not.  You can see when running a
     program under CodeView that ES returns with the segment of the far
     string after the StringAddress call. Normally this is harmless, but
     when your routine is juggling several incoming strings, ES may be
     pointing to another segment.  The trashing of ES may go unnoticed when
     all strings the routine is processing are in the same segment.
     However, if the strings are from different segments, like Common and
     Procedure level strings, it becomes apparent.  
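     
     One way to stay out of trouble is to make all of the BASIC calls
     first and load ES only after the last one returns.  The sketch below
     does that.  SameStart is a hypothetical function that returns -1 if
     two strings begin with the same character.  It assumes an address
     from StringAddress stays valid across a second StringAddress call,
     since nothing is assigned in between.
     
     '--- BASIC code
     Declare Function SameStart%(Str1$, Str2$)
     
     ;---- assembler code
     .Model Medium,BASIC
     Extrn StringLength:Proc
     Extrn StringAddress:Proc
     
     .code
     SameStart Proc Uses SI DI, Str1:Word, Str2:Word
          Push Str1
          Call StringLength        ;AX = length of first string
          Or   AX,AX
          Jz   NoMatch             ;a null string never matches
          Push Str2
          Call StringLength        ;AX = length of second string
          Or   AX,AX
          Jz   NoMatch
          Push Str1
          Call StringAddress       ;DX:AX = first string, ES changed
          Mov  SI,AX               ;SI is preserved, park the offset
          Push DX                  ;DX is not, save segment on stack
          Push Str2
          Call StringAddress       ;DX:AX = second string, ES changed
          Mov  DI,AX
          Mov  ES,DX               ;no more calls, so ES is ours now
          Mov  AL,ES:[DI]          ;first character of second string
          Pop  ES                  ;ES:SI = first string
          Cmp  AL,ES:[SI]          ;same first character?
          Jne  NoMatch
          Mov  AX,-1               ;return true
          Jmp  Short Done
     NoMatch:
          Xor  AX,AX               ;return false
     Done:
          Ret
     SameStart EndP
     End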
     
     The Far String descriptor format also applies to variable length
     string arrays.  To access strings in an array, the VarPtr of the first
     element of the array is passed as shown in the following code
     snippet, and the pointer is incremented by 4 to get each successive
     string.  The function AddEll adds up the lengths of the strings in
     Array$().
     
     '--- BASIC code
     Declare Function AddEll&(Byval Address%,Byval NumEls%)  
     Dim Array$(1 to 200)
     
     '.... (fill array) ....
     
     TotalSize& = AddEll&(VARPTR(Array$(1)), 200)
     
     ;---- assembler code 
     .Model Medium,BASIC
     Extrn StringLength:Proc
     
     .code
     AddEll Proc Uses SI DI, Array:Ptr, NumEls:Word
          Xor  DI,DI               ;use DI to track total lengths
          Mov  CX,NumEls           ;get numels (byval remember)
          Jcxz Exit                ;jump out if 0 els
          Mov  SI,Array            ;get pointer to first descriptor
     CountLoop:
          Push CX                  ;preserve CX through count loop
          Push SI                  ;push pointer 
          Call StringLength        ;get length
          Add  DI,AX               ;add length to counter
          Pop  CX
          Add  SI,4                ;point to next 4 byte descriptor
          Loop CountLoop           ;keep going til' done
     Exit:
          Xor  DX,DX               ;clear DX to return long
          Mov  AX,DI               ;and return total count in AX
          Ret
     AddEll EndP
     End
     
     For QB45 or Near Strings, the code can be much more compact, and
     thereby faster.
     
     ;near String version
     AddEll Proc, Array:Ptr, NumEls:Word
          Xor  AX,AX               ;use AX to track total lengths
          Mov  CX,NumEls           ;get numels (byval remember)
          Jcxz Exit                ;jump out if 0 els
          Mov  BX,Array            ;get pointer to first descriptor
     CountLoop:
          Add  AX,[BX]             ;first word of descriptor is length
          Add  BX,4                ;point to next 4 byte descriptor
          Loop CountLoop           ;keep going til' done
     Exit:
          Xor  DX,DX               ;clear DX to return long
          Ret
     AddEll EndP
     End
     
     Now that we've discussed retrieving Far String information, let's move
     on to our last subject--Far String Functions. 
     
     
     STRING FUNCTIONS
     
     Far String Functions are only a little more complex than their Near
     String counterparts.  In a Near String function, the four byte
     descriptor (LENGTH:ADDRESS) is created in DGroup, as is the data that
     the address points to.  With Far Strings, the four byte descriptor
     must still be in DGroup, but the data can be in any segment.  Unlike
     Near String descriptors, which are created by the programmer, the Far
     String descriptor is created by calling BASIC's StringAssign function.
     
     StringAssign is called with the Length, Segment and Address of the
     data you want to assign, and the Length, Segment and address of the
     destination.  The StringAssign function can assign both variable
     length and fixed length strings.  With fixed length strings,
     StringAssign uses the length parameter on the destination.  When
     assigning a variable length string, the destination length is set to
     zero so StringAssign knows to create a descriptor and store the data.
     The accompanying program, ZeroTrim, illustrates the technique.
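     
     For readers without the source handy, here is a bare-bones sketch
     of a string function built on StringAssign.  It is not ZeroTrim.
     Hello, Message, MsgLen and Descriptor are made-up names.  The push
     order (source pointer, source length, destination pointer,
     destination length) and the descriptor offset returned in AX are
     assumptions about the calling sequence, so check them against the
     ZeroTrim source or the BASIC 7.x manuals before relying on them.
     
     '--- BASIC code
     Declare Function Hello$()
     Print Hello$
     
     ;---- assembler code
     .Model Medium,BASIC
     Extrn StringAssign:Proc
     
     .Data
       Message    DB "Hello"       ;the data to assign
       MsgLen     Equ $ - Message  ;its length, 5 bytes
       Descriptor DD 0             ;null on first entry
     
     .code
     Hello Proc
          Push DS                  ;segment of source data
          Lea  AX,Message          ;offset of source data
          Push AX
          Mov  AX,MsgLen           ;length of source data
          Push AX
          Push DS                  ;segment of destination descriptor
          Lea  AX,Descriptor       ;offset of destination descriptor
          Push AX
          Xor  AX,AX               ;0 means variable length destination
          Push AX
          Call StringAssign        ;BASIC copies data, fills descriptor
          Lea  AX,Descriptor       ;return near pointer to descriptor
          Ret
     Hello EndP
     End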
     
     There are two caveats to be aware of when using Far Strings and
     StringAssign.  First, when StringAssign is called with the address of
     the descriptor (or destination), it checks to see if the descriptor
     contains a value or not.  If it sees a value, StringAssign assumes it
     is an old descriptor and will try to deallocate the string to make
     room for the new string. On the first entry into your assembler
     function, the descriptor should be null so a new descriptor is
     allocated. However, if your function clears the descriptor between
     calls, BASIC will end up with orphan data from old strings not being
     deallocated.  On the other side, if your program does a CLEAR command
     after a call to a string function, variables on the BASIC side will be
     cleared, but the descriptor stored in your assembler's Data area will
     still exist.  If the assembler routine then calls StringAssign, BASIC
     will try to deallocate the data and you'll end up with a String Space
     Corrupt message.  The rule is: don't call CLEAR once the program
     starts, and don't clear the descriptor between calls. 
     
     If your program needs to free up the space, you may call the
     StringRelease function to deallocate the descriptor.  This is only
     recommended if the function is not going to be called again, since
     StringAssign will always release the old string before assigning a new
     one.  The syntax for StringRelease is as follows.
     
     .Data
       Descriptor DD ?
     
     .Code 
     Some Proc ....
     .. 
     .. 
          Push DS
          Push Offset Descriptor
          Call StringRelease
     
          
     Another problem programmers may encounter with Far Strings happens
     when assigning a zero length string.  Under Near Strings, the length
     portion of the descriptor was just zeroed out and BASIC understood.
     Far Strings are not quite so simple, but the concept is similar.
     Though there are several ways to do it, the most reliable is to point
     to an actual data area and tell StringAssign that the length is zero.
     StringAssign will then deallocate the old descriptor if there is one,
     and allocate a new one that BASIC recognizes as a zero length string.
     Again, it is important to let BASIC go through the motions so it
     deallocates the old strings.
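     
     A minimal sketch of that approach follows.  NullStr, Dummy and
     Descriptor are hypothetical names, and the push order is the same
     assumed sequence of source pointer, source length, destination
     pointer and destination length.
     
     '--- BASIC code
     Declare Function NullStr$()
     
     ;---- assembler code
     .Model Medium,BASIC
     Extrn StringAssign:Proc
     
     .Data
       Dummy      DB ?             ;a real data area to point at
       Descriptor DD 0             ;null on first entry
     
     .code
     NullStr Proc
          Push DS                  ;segment of the dummy data
          Lea  AX,Dummy            ;offset of the dummy data
          Push AX
          Xor  AX,AX               ;source length is zero
          Push AX
          Push DS                  ;segment of destination descriptor
          Lea  AX,Descriptor       ;offset of destination descriptor
          Push AX
          Xor  AX,AX               ;0 means variable length destination
          Push AX
          Call StringAssign        ;BASIC frees any old string first
          Lea  AX,Descriptor       ;return near pointer to descriptor
          Ret
     NullStr EndP
     End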
     
     As you can see, the conversion of assembler routines from Near Strings
     to Far Strings is not too difficult.  By using the Far String functions
     and watching your registers, the transition is almost painless.  Once
     a routine is converted to handle Far Strings, it may be used with
     either BC 7.x compiler option. The Far String functions are
     automatically replaced with near string code when you leave off the
     /FS switch, eliminating the need for two versions of the routines for
     BASIC 7 support.  If you need to support QuickBASIC 4.x, you will need
     a second version for near strings only.  
     
     SOURCE CODE FOR THIS ARTICLE CAN BE FOUND IN FARSTRNG.ZIP
     
     ======================================================================
     Jay Munro is a programmer at Crescent Software and also writes
     utilities for PC Magazine in his spare time. He was given the task of
     converting Crescent's QuickPak Pro to use Far Strings, and more
     recently, converting QuickPak Pro for use in Visual Basic.  Jay can be
     contacted through The Crescent Software Support BBS at 203-426-5958,
     or in care of this newsletter.
     ======================================================================