QBCM 3

                   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                   QB CULT MAGAZINE - Issue 3 - May 2000             
                   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
             OBJECT ORIENTED BASIC - Possibility or Pipe Dream?
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Writer: Unknown

TABLE OF CONTENTS

1.0 Introduction
1.1 Key Terminology and Concepts
2.0 BASIC-Specific Considerations of Object Paradigm Implementation
2.1 Standardization of Terms in Object Oriented BASIC
2.2 An Introduction to Advanced Topics in OOP
3.0 Closing Notes

1.0     Introduction

BASIC has evolved from the time-sharing "Beast of Dartmouth" into a
powerful, structured language fit for the programming needs of the
nineties.  Despite this evolution, however, major software compiler
developers have failed to introduce object oriented extensions into
the language.

This article will explore some possible extensions to modern
BASIC that would truly expand the language.  Since the approach
is necessarily speculative, the reader should bear in mind that
no particular implementation is being suggested as the "best"
way to bring object-orientation to BASIC.  Moreover, some BASIC
programmers may feel that certain low level features, such as
in-line assembler and more diverse data types, should be
introduced into the BASIC language before object-orientation is
even considered.  These readers should remember the theoretical
nature of this discussion, and leave all such preferences out of
the exploration at hand.

1.1     Key Terminology and Concepts

First, I must define some key terms and concepts.  My use of the generic
term BASIC (Beginner's All-purpose Symbolic Instruction Code) will,
unless otherwise stated, refer to the Microsoft QuickBASIC v4.5 dialect
of BASIC, since this represents a widely accepted implementation of
modern, structured BASIC.  The term OOP (Object Oriented Programming)
will be used to refer to those programming practices that rely on the
object paradigm.  Although the terminology differs from compiler to
compiler, the object oriented paradigm is considered by modern usage to
intrinsically encompass the following concepts, to be defined later:

    1.  Encapsulation
    2.  Inheritance
    3.  Polymorphism
    4.  Overloading

Therefore, when I say that a given concept is "object oriented" I
specifically mean that it involves the above four concepts.
Other important terms that cannot be ignored in any discussion of
OOP, because they come up again and again, are:

    5.  Class
    6.  Method (or Member Function)
    7.  Object (or Class Instance)
    8.  Information or Data Hiding

Not able to decide which term to define first, I will begin with a
general overview of the underlying philosophy of OOP.

In classical structured programming, data and code are considered
separate entities.  Code manipulates data.  Data fuels code.  For
example, wanting to implement a graphics font engine, a classical
BASIC programmer might read a list of DATA statements into a globally
accessible array, and then have a series of globally accessible
SUBPROGRAMS manipulate those raster data on the screen in such a
way as to produce the desired visual effect.  The problem with this
approach is that both the data and the related code are equally
accessible, and they are only loosely cohesive.  Wanting to enhance
code written by a colleague, a second programmer will encounter
data structures that he should neither modify nor even poll, but
nothing in the language prevents him from doing either.  Having
modified an essential data structure, the second programmer may
introduce errors into the whole underlying logic of the system.

For instance, suppose the original programmer had defined the
font data structure thus:

    TYPE FontDataType
        FontName AS STRING * 12
        FontPointSize AS INTEGER
        RasterDataPtr AS LONG
    END TYPE

Now, looking at this, Programmer Two decides that he can avoid a
FUNCTION call to funGetFontPointSize() by just reading the value of
Font.FontPointSize directly.  Programmer Two alters his code to
access this variable directly, and in doing so, avoids what he
considers costly calls to funGetFontPointSize().  He is
promoted to another department (presumably for having sped up the
code of his predecessor).  Enter Programmer Three.  Quite within
his bounds, he discovers that point size is never greater than 255,
and so redefines the whole structure to account for this, thereby
reducing overall memory consumption by one byte:

    TYPE FontDataType
        FontName AS STRING * 12
        FontPointSize AS STRING * 1
        RasterDataPtr AS LONG
    END TYPE

Of course, being conscientious, he modifies funGetFontPointSize()
to account for this innovation.  He compiles the program.  The
compiler halts with a type mismatch.  Why?  Because this is now
an illegal statement:

    Font.FontPointSize = 12

What to do?  He must use his search and replace to go through the
entire program and change all such instances to:

    Font.FontPointSize = CHR$(12)

Or, he can forget his alterations altogether.

In BASIC, there is no INFORMATION HIDING that would prevent such
problems from occurring.  Since FontPointSize is a public member
of FontDataType, Programmer Two was well within his rights to do
as he saw fit as far as accessing it.  Had the original programmer
had an object oriented BASIC, however, he could have prevented the
entire problem with little difficulty by making FontPointSize a
PRIVATE data member of the CLASS font.  This might have looked
similar to this:

    CLASS FontClass
        FontName AS PUBLIC STRING * 12
        FontPointSize AS PRIVATE INTEGER
        RasterDataPtr AS PRIVATE LONG
        funGetFontPointSize AS PUBLIC FUNCTION
        subSetFontPointSize AS PUBLIC SUB
    END CLASS

    DIM Font AS FontClass

[Please bear with the strange new syntax, since it will be covered
in more detail in section 2.0.]

Now, the only way to access Font.FontPointSize is indirectly.  This
would NOT work, since this data member is now PRIVATE:

    Font.FontPointSize = 12

This, then, would be the ONLY way to achieve such a thing:

    Font.subSetFontPointSize 12

In the above example, the item Font is what is called a CLASS INSTANCE.
That is to say, Font is "an instance of the class FontClass." This is
what is commonly called an OBJECT, and it is from this that we arrive at
the phrase "object oriented" programming.

Now, when Programmer Two comes along, he CANNOT pull off his stunt,
and he is not promoted to another department.  Programmer Three comes
along, and sees room for improvement and redefines the class thus:

    CLASS FontClass
        FontName AS PUBLIC STRING * 12
        FontPointSize AS PRIVATE STRING * 1
        RasterDataPtr AS PRIVATE LONG
        funGetFontPointSize AS PUBLIC FUNCTION
        subSetFontPointSize AS PUBLIC SUB
    END CLASS

Since all calls to change FontPointSize are through the centralized
subSetFontPointSize, Programmer Three just modifies that a bit, and
earns himself a nice raise in salary for shaving a byte off the
memory requirements of the structure.

Consider the above example.  The data are:

    1. FontName
    2. FontPointSize
    3. RasterDataPtr

The code portions (called MEMBER FUNCTIONS or METHODS, since they
are "methods of acting upon or accessing" the data) are:

    1. funGetFontPointSize
    2. subSetFontPointSize

Since it is unlikely that subSetFontPointSize will ever be needed for
anything other than the setting of FontPointSize, it makes sense to
bind the code to the data it works with.  This binding is called
ENCAPSULATION.
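
For readers who know C++, the same idea can be sketched in a language
that already has PRIVATE members.  This is only an illustration, not
part of the BASIC syntax proposed here.  The names getFontPointSize,
setFontPointSize and pointSize_, and the choice of unsigned char for
the one-byte storage, are assumptions made for the sketch, and
RasterDataPtr is left out for brevity.

```cpp
#include <cassert>
#include <string>

// Hypothetical C++ counterpart of the article's FontClass.
class FontClass {
public:
    std::string fontName;   // PUBLIC, as in the BASIC sketch

    // The two methods are the only route to the private point size.
    int getFontPointSize() const { return pointSize_; }

    void setFontPointSize(int pointSize) {
        // The one-byte storage can only hold 0 through 255.
        assert(pointSize >= 0 && pointSize <= 255);
        pointSize_ = static_cast<unsigned char>(pointSize);
    }

private:
    // One byte of storage, as Programmer Three wanted.  Callers never
    // see this type, so changing it affects only the methods above.
    unsigned char pointSize_ = 0;
};
```

Code outside the class can write font.setFontPointSize(12), but a
direct font.pointSize_ = 12 is rejected by the compiler, so the
storage type can change without touching any caller.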

Having examined these more essential terms, there is the issue of
OVERLOADING.  Although not object oriented in the strictest sense,
it does aid in generalizing classes to an extent that they can
operate upon different types of data.

Consider the following:

    subQuickSort A%()

Now, in classical BASIC programming, if we wanted to sort anything
other than INTEGER arrays, we would have to write another SUBPROGRAM
and modify the algorithm to account for this new data type.  This
SUBPROGRAM would have to be named something other than subQuickSort.
For example:

    subQuickSortSTR A$()

might be used for STRING arrays, and

    subQuickSortLONG A&()

might be used for LONG INTEGER arrays.  And, of course, should a
programmer ever want to sort a user-defined TYPE array:

    subQuickSortUserTYPE UserArray()

would be the only way to do it.

But, consider the above.  All of these routines do the same thing.  It
seems a waste to have a separate name for each when they all amount to
the same thing: sorting arrays.  The answer is to "overload" a single
SUBPROGRAM name with one piece of code per data type.  Once subQuickSort
is overloaded, it can do triple-time thus:

    subQuickSort A%()
    subQuickSort A$()
    subQuickSort UserArray()

Of course, each call invokes DIFFERENT CODE to do the actual sorting,
but this detail is handled by the compiler in a transparent fashion.
The programmer's only responsibility would be to provide the code for
each instance of subQuickSort, in the following manner:

    SUB subQuickSort (Array AS INTEGER)
        |
        |
        code to sort INTEGER arrays goes here
        |
    END SUB

    SUB subQuickSort (Array AS STRING)
        |
        |
        code to sort STRING arrays goes here
        |
        |
    END SUB

    SUB subQuickSort (Array AS UserDefinedType)
        |
        |
        code to sort arrays of UserDefinedType goes here
        |
        |
    END SUB

Upon seeing the second instance of subQuickSort in the source listing,
the object oriented BASIC compiler would know that it is dealing with
an overloaded SUBPROGRAM.
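
C++ already resolves overloads in this way, so a rough sketch of the
idea can be given in that language.  The type Employee and the name
quickSort are hypothetical, and std::sort merely stands in for
whatever sorting code each version would really contain.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical user-defined type, standing in for UserDefinedType.
struct Employee {
    std::string name;
    int id;
};

// Three functions share one name; the compiler picks the right one
// from the type of the argument at each call.
void quickSort(std::vector<int>& a) {
    std::sort(a.begin(), a.end());
    assert(std::is_sorted(a.begin(), a.end()));   // sanity check
}

void quickSort(std::vector<std::string>& a) {
    std::sort(a.begin(), a.end());
}

void quickSort(std::vector<Employee>& a) {
    // A user-defined type needs its own ordering rule; here, by id.
    std::sort(a.begin(), a.end(),
              [](const Employee& x, const Employee& y) {
                  return x.id < y.id;
              });
}
```

A call such as quickSort(names), where names is a vector of strings,
reaches the second version without the programmer saying so.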

Overloading is already done by BASIC compilers, but it is done at a
level not within the control of the programmer.  Consider:

    PRINT a
    PRINT a$

Each case of PRINT prints a different data type.  The PRINT statement,
we could say, then, is overloaded.  Also to consider is the overloading
of operators such as occurs already in BASIC:

    A$ = B$ + C$
    A% = B% + C%

The addition operator is serving two masters here.  In the first case,
it is being used to concatenate strings.  In the second, it is being
used to add two numbers.  The processes are internally dissimilar.
How, then, does the BASIC compiler contend with these cases?  The
addition operator is overloaded at an internal level.  If a programmer
using an object oriented BASIC were to step into the scene, however,
we very well might see this type of overloading of the addition and
assignment operators:

    OVERLOAD "+" FOR ArrayOne(), ArrayTwo()
        ' Arrays are assumed to start at element 1.
        FirstCount = UBOUND(ArrayOne)
        TotalElements = FirstCount + UBOUND(ArrayTwo)
        DIM ReturnArray(TotalElements)
        FOR i = 1 TO FirstCount
            ReturnArray(i) = ArrayOne(i)
        NEXT i
        ' Append ArrayTwo directly after the last element of ArrayOne.
        FOR q = 1 TO UBOUND(ArrayTwo)
            ReturnArray(FirstCount + q) = ArrayTwo(q)
        NEXT q
        REDIM ArrayOne(TotalElements)

        ' The following uses an overloaded assignment operator
        ' whose overload definition follows.
        ArrayOne() = ReturnArray()
    END OVERLOAD

    OVERLOAD "=" FOR ArrayOne(), ArrayTwo()
        FOR i = 1 TO UBOUND(ArrayOne)
            ArrayOne(i) = ArrayTwo(i)
        NEXT i
    END OVERLOAD

This bit of sophisticated operator overloading would allow the
programmers to add entire arrays to one another as follows:

    NewList() = ListOne() + ListTwo()
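
By way of comparison, here is how an overloaded "+" that joins two
lists might look in C++.  It is a sketch only.  The wrapper type
IntList is hypothetical, and unlike the BASIC proposal above, the
result is returned as a new value rather than stored back into the
left argument.  C++ supplies the assignment operator for such a type
automatically, so only "+" needs to be written.

```cpp
#include <cassert>
#include <vector>

// Hypothetical wrapper type; the name IntList is not from the article.
struct IntList {
    std::vector<int> items;
};

// Overloaded "+": a new list holding a's items followed by b's.
IntList operator+(const IntList& a, const IntList& b) {
    IntList result;
    result.items.reserve(a.items.size() + b.items.size());
    result.items.insert(result.items.end(), a.items.begin(), a.items.end());
    result.items.insert(result.items.end(), b.items.begin(), b.items.end());
    assert(result.items.size() == a.items.size() + b.items.size());
    return result;
}
```

With this in place, IntList newList = listOne + listTwo; reads much
like the BASIC line above, and neither operand is changed.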

For some readers, all this may be a new concept in programming.  If
it seems hard to understand, please take time to reread this section
before continuing, since the next part of this discussion relies on
the reader's comprehension of all eight terms pertinent to the object
oriented programming paradigm, which are, again:

    1.  Encapsulation,
    2.  Inheritance,
    3.  Polymorphism,
    4.  Overloading,
    5.  Class,
    6.  Method (or Member Function),
    7.  Object (or Class Instance),
    8.  Information or Data Hiding.

[Polymorphism has been purposely left out of this discussion, due
to its rather esoteric nature.]

2.0     BASIC-Specific Considerations of Object Paradigm Implementation

When considering whether BASIC in its present form could
be expanded to include object oriented extensions, we must first look
at what is already possible in standard BASIC.  For example, the
following code resembles inheritance, at least in part:

    TYPE ColorType
        R AS INTEGER
        G AS INTEGER
        B AS INTEGER
    END TYPE

    TYPE CoordinateType
        X AS INTEGER
        Y AS INTEGER
    END TYPE

    TYPE CircleType
        Point AS CoordinateType
        Color AS ColorType
        Radius AS INTEGER
    END TYPE

This is not classical inheritance, but the analogy suffices.  Looking
at the syntactical elements of the above code, we see that a similar
structure could easily be adopted for use with CLASS definitions:

    CLASS CircleClass
        Point AS CoordinateType
        Color AS ColorType
        Radius AS INTEGER
    END CLASS

A question arises, however.  The above definition of the CircleClass
CLASS is not executable code, but merely a definition template.  It
defines CircleClass, but does not assign a "class instance."  That is
to say, there are not yet any objects of CircleClass defined in the
program.  Consider this standard BASIC:

    TYPE AddressType
        Street AS STRING * 10
        City AS STRING * 32
        State AS STRING * 2
        ZIP AS STRING * 12
    END TYPE

    DIM Envelope AS AddressType

The DIM statement is used to create an instance of a variable
called Envelope that is of the user defined type AddressType.  It
makes perfect sense, then, that the DIM statement could be used
in this manner:

    CLASS CircleClass
        Point AS CoordinateType
        Color AS ColorType
        Radius AS INTEGER
    END CLASS

    DIM Orb AS CircleClass

(Remember, having DIM serve this double purpose is known as
overloading the DIM statement.)  This syntax serves our purposes
wonderfully, since it does not involve the introduction of completely
foreign operators and follows the present syntactical structure of
standard BASIC.
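
Two points from this section can be sketched in C++ for comparison:
the difference between a TYPE nested inside a TYPE and classical
inheritance, and the difference between a definition and an object
created from it.  The names Coordinate, Circle, Disc and demo are
hypothetical, and the color fields are omitted to keep the sketch
short.

```cpp
#include <cassert>

struct Coordinate {
    int x = 0;
    int y = 0;
};

// Composition: Circle "has a" Coordinate, like a TYPE inside a TYPE.
struct Circle {
    Coordinate point;
    int radius = 0;
};

// Inheritance: Disc "is a" Coordinate with a radius added to it.
struct Disc : Coordinate {
    int radius = 0;
};

// The definitions above create nothing by themselves.  Objects only
// exist once they are declared, as DIM would do in the BASIC sketch.
void demo() {
    Circle orb;          // analogous to DIM Orb AS CircleClass
    orb.point.x = 10;    // composition: x is reached through point

    Disc disc;
    disc.x = 10;         // inheritance: x belongs to Disc itself

    assert(orb.point.x == disc.x);
}
```

The nested TYPE approach always needs the extra step through the
contained member, which is why it is only an analogy for inheritance.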

Another consideration in the creation of classes is the fact that
classes may contain both variables and methods in their definitions,
as shown in the introduction:

    CLASS FontClass
        FontName AS PUBLIC STRING * 12
        FontPointSize AS PRIVATE INTEGER
        RasterDataPtr AS PRIVATE LONG
        funGetFontPointSize AS PUBLIC FUNCTION
        subSetFontPointSize AS PUBLIC SUB
    END CLASS

This shows a suggested means of expressing both the scope and the
type of each part of the definition.  Note, however, that, although
subSetFontPointSize is defined in this template, there is, as yet,
no code attached to the definition.  It is said, in OOP parlance, that
"the scope of the member function is unresolved."  The method is
prototyped, but that is all.  In C++, what is known as the "scope
resolution operator" is used to resolve a method, that is, assign
executable code to it.  This is done as follows:

    void FontClass::subSetFontPointSize (int PointSize)
    {
    |
    code to achieve this end goes here
    |
    }

Essentially, this translates into the English statement:

    "Define subSetFontPointSize of the class FontClass as follows...."

In an attempt to avoid convoluted syntactical introductions into the
BASIC language, what follows is a possible solution:

    SUB FontClass.subSetFontPointSize (PointSize AS INTEGER)
        |
        |
        code that assigns the point size goes here
        |
        |
    END SUB

Since the compiler would presumably recognize FontClass as being a
class from the earlier CLASS ... END CLASS block, this should suffice
as a means of resolving the scope of the method subSetFontPointSize,
while avoiding the introduction of :: as a new BASIC operator.
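
For reference, a complete C++ version of the prototype-then-define
pattern might look like the sketch below.  The method names follow
the ones used in this article, while the member fontPointSize and the
check that a point size is positive are assumptions added for the
example.

```cpp
#include <cassert>

class FontClass {
public:
    void subSetFontPointSize(int pointSize);   // prototype only
    int funGetFontPointSize() const;           // prototype only

private:
    int fontPointSize = 0;
};

// The code is attached outside the class.  "FontClass::" names the
// class each method belongs to, which is the job "FontClass." does
// in the BASIC proposal.
void FontClass::subSetFontPointSize(int pointSize) {
    assert(pointSize > 0);   // assumed rule: point sizes are positive
    fontPointSize = pointSize;
}

int FontClass::funGetFontPointSize() const {
    return fontPointSize;
}
```

Leaving out either definition produces a link error at the first call,
which is the C++ compiler's way of reporting an unresolved method.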

Next comes the issue of overloading both keywords and operators.  A
simple extension of BASIC would allow this to be sufficient in the
case of SUBPROGRAMS and FUNCTIONS:

    SUB subQuickSort (Array AS STRING)
        |
        |
    END SUB

    SUB subQuickSort (Array AS INTEGER)
        |
        |
    END SUB

The second SUB definition would imply overloading. This would be
prototyped at the beginning of the source listing thus:

    DECLARE SUB subQuickSort (Array AS STRING)
    DECLARE SUB subQuickSort (Array AS INTEGER)

Operators, however, are completely different in that BASIC has
no way of referring to them explicitly.  A proposed extension:

    OVERLOAD "=" FOR LeftArgument, RightArgument
        |
        |
        definition code goes here
        |
        |
        result returned in LeftArgument
        |
        |
    END OVERLOAD

Of course, the "=" could be any ASCII character or even multiple
ASCII characters.  This would allow the object oriented BASIC program
to do this, for example:

    OVERLOAD "**" FOR LeftArgument, RightArgument

        ' Some languages use ** for raising to a power
        LeftArgument = LeftArgument ^ RightArgument

    END OVERLOAD

The following, however, would not be possible, since it would involve
late binding and interpreted evaluation at run-time:

    OVERLOAD Operator$ FOR LeftArgument, RightArgument
        SELECT CASE Operator$
            CASE "**"
                LeftArgument = LeftArgument ^ RightArgument
            |
            |
            etc.
            |
            |
        END SELECT
    END OVERLOAD

2.1     Standardization of Terms in Object Oriented BASIC

Before the discussion continues, perhaps it would be wise to step
aside to establish a set of standard terms.  Since certain
OOP concepts carry many different names (e.g., "member function" is
also "method") a standard way of referring to any particular device
should be adopted.  But, really, this could become quite involved;
what is more appropriate, the term "method" or "member function?"
Perhaps, rather than debate too long and hard on the subject,
Microsoft's terminology as used for Visual Basic should be adopted:

    1.  OBJECT rather than "class instance"
    2.  METHOD rather than "member function"
    3.  PROPERTY rather than "member variable"

For terms not used by Visual Basic, I suggest the following use by
object oriented BASIC:

    1.  DATA HIDING rather than "information hiding"
    2.  METHOD DECLARATION rather than "scope resolution"
    3.  METHOD DECLARATOR rather than "scope resolution operator"
    4.  OBJECT BINDING rather than "encapsulation"
    5.  OVERLOADING remains unchanged
    6.  CLASS remains unchanged

I use these substitutes for the other terms because they have a
BASIC sound to them, whereas the other terms, like "scope resolution
operator" may sound odd to BASIC programmers.  DECLARATOR rings of
BASIC's DECLARE statement, thereby reducing the foreignness of the
term METHOD DECLARATOR.  (In case you have forgotten, the :: is
the scope resolution operator in C++, whereas the . is used in this
theoretical object oriented BASIC of ours.)

Using this terminology, we have this model:

      / CLASS VectorClass  ' This is a CLASS DECLARATION
      |     X AS PRIVATE INTEGER   ' This is a PROPERTY of VectorClass
 O B  |     Y AS PRIVATE INTEGER   ' As is this
 B I  |     '    ^^^^^^
 J N  |     ' Use of PRIVATE demonstrates DATA HIDING
 E D  |     ' Whereas use of PUBLIC demonstrates the oposite--\
 C I  |     '                                                 |
 T N  |     '                 /-------------------------------/
   G  |     '               VVVVVV
      |     subSetVector AS PUBLIC SUB ' This is a METHOD
      \ END CLASS

        '  This operator is the METHOD DECLARATOR in this context
        '              |
        '              V
    D / SUB VectorClass.subSetVector ( X AS INTEGER, Y AS INTEGER )
    E |
 M  C |
 E  L |
 T  A |
 H  R |
 O  A |
 D  T |
    I |
    O |
    N \ END SUB

2.2     An Introduction to Advanced Topics in OOP

To this point, most fundamental concepts of the object oriented
paradigm have been examined.  The reader should have a concept of
class, object binding, method declaration, overloading, and
data hiding, and should also understand the essence of how these
object oriented extensions may be added to BASIC.

There are other considerations, however.  When an object is created,
for instance, how is it initialized?  That is to say, how are its
properties set to appropriate starting values?  A typical standard
BASIC program might accomplish this thus:

    CALL subFontInit

This is fine, but remember that there can be more than one OBJECT of
the same CLASS as in this case:

    DIM Helvetica AS FontClass
    DIM Courier AS FontClass
    DIM TimesRoman AS FontClass

Now, to initialize the data for each of these, we must do something
like this:

    CALL subFontHelveticaInit
    CALL subFontCourierInit
    etc.

In C++, there is a way around this that we can adopt for BASIC use.
In every class in C++ there is an implied "constructor."  This is
a new term.  Essentially, the constructor is a method within the
class definition that is executed whenever an object is created.
For an example of this, consider this method declaration:

    SUB FontClass.FontClass
        |
        |
        code to initialize object goes here
        |
        |
    END SUB

(Visual Basic programmers will recognize this as being analogous to
the Form_Load event.)  Note that the method declaration uses FontClass
twice.  This informs the compiler that it is dealing with the explicit
definition of a CONSTRUCTOR.

In the actual binding declaration of the class, this syntax is
suitable:

    CLASS FontClass
        |
        etc.
        |
        FontClass AS CONSTRUCTOR
        |
        etc.
        |
    END CLASS

The CONSTRUCTOR type, then, signifies that this template will be
followed by a method declaration for a constructor.  Now, when the
programmer includes this code:

    DIM Helvetica AS FontClass

the compiler will include appropriate initialization routines.
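
The C++ original of this idea can be sketched as follows.  Note one
difference from the BASIC proposal: the constructor here takes
arguments, so that each object can be given its own starting values.
That choice, and the names getName and getPointSize, are assumptions
made for the example.

```cpp
#include <cassert>
#include <string>

class FontClass {
public:
    // The constructor has the same name as the class and runs
    // automatically whenever a FontClass object is created.
    FontClass(const std::string& name, int pointSize)
        : fontName(name), fontPointSize(pointSize) {
        assert(pointSize > 0);   // assumed rule: sizes are positive
    }

    const std::string& getName() const { return fontName; }
    int getPointSize() const { return fontPointSize; }

private:
    std::string fontName;
    int fontPointSize;
};
```

A declaration such as FontClass helvetica("Helvetica", 12); then does
the work of both the DIM statement and a separate call to an
initialization SUBPROGRAM.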

Another aspect of this, the "destructor," is exactly the same, except
that it operates after the object falls from scope.  (Visual Basic
programmers again will note the analogous use of the Form_Unload event.)
Destructors deinitialize data, cleaning up things when the program ends
execution, for instance.  In C++, a special operator is used to indicate
the destructor: ~FontClass.  This use of the tilde is foreign to
BASIC, however, so perhaps it would be better to introduce another
keyword rather than a new operator:

    CLASS FontClass
        |
        etc.
        |
        FontClass AS CONSTRUCTOR
        FontClass AS DESTRUCTOR
        |
        etc.
        |
    END CLASS

Now, the method would simply be declared:

    SUB FontClass.FontClass DESTRUCTOR
        |
        |
        code to deinitialize data structures goes here
        |
        |
    END SUB

This is syntactically familiar to a BASIC programmer in another form:

    SUB subPrintToScreen (InText AS STRING) STATIC
        |
        |
    END SUB

The STATIC keyword modifies the nature of the SUBPROGRAM.  Consequently,
I have suggested the DESTRUCTOR keyword be used in a similar syntactical
fashion.
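
To show when a destructor actually runs, here is a small C++ sketch.
The counter liveObjects, the raster buffer and the function demo are
hypothetical and exist only to make the automatic calls visible.

```cpp
#include <cassert>
#include <cstddef>

class FontClass {
public:
    static int liveObjects;   // hypothetical counter for the demo

    // Constructor: acquires a raster buffer for the new object.
    explicit FontClass(std::size_t rasterBytes)
        : rasterData(new unsigned char[rasterBytes]()) {
        ++liveObjects;
    }

    // Destructor: runs automatically when the object falls from
    // scope, and releases what the constructor acquired.
    ~FontClass() {
        delete[] rasterData;
        --liveObjects;
    }

    // Copying is disabled so two objects never free the same buffer.
    FontClass(const FontClass&) = delete;
    FontClass& operator=(const FontClass&) = delete;

private:
    unsigned char* rasterData;
};

int FontClass::liveObjects = 0;

void demo() {
    {
        FontClass helvetica(1024);
        assert(FontClass::liveObjects == 1);
    }   // helvetica falls from scope here, so ~FontClass runs
    assert(FontClass::liveObjects == 0);
}
```

The programmer never calls the destructor by name.  The compiler
inserts the call at the closing brace, which is the behavior the
DESTRUCTOR keyword would ask of an object oriented BASIC compiler.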

3.0     Closing Notes

Indeed, BASIC has evolved from the time-sharing days of Dartmouth.
Despite this evolution, however, major software compiler developers
have failed to introduce object oriented extensions into the language.
Perhaps this article has introduced some new concepts to the reader,
perhaps not.  At the very least, it has explored some ways
an object oriented paradigm might be introduced successfully into
BASIC programming with as little pain as possible.  Programmers tend to
maintain their old programming habits despite the innovations that
come into their languages, and consequently, any major changes to
the way BASIC operates may prove to be obstacles rather than useful
tools.  I feel that my suggestions involve minimal relearning of the
syntax of BASIC, since they adopt the flavor of existing structures.
In the end, though, the question is not what is the better method
or terminology to use, really, but rather:

    "Object Oriented BASIC, possibility or pipe dream?"