-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- QB CULT MAGAZINE - Issue 3 - May 2000 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
OBJECT ORIENTED BASIC - Possibility or Pipe Dream?
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Writer: Unknown
TABLE OF CONTENTS
1.0 Introduction
    1.1 Key Terminology and Concepts
2.0 BASIC-Specific Considerations of Object Paradigm Implementation
    2.1 Standardization of Terms in Object Oriented BASIC
    2.2 An Introduction to Advanced Topics in OOP
3.0 Closing Notes
1.0 Introduction
BASIC has evolved from the time-sharing "Beast of Dartmouth" into a powerful, structured language fit for the programming needs of the nineties. Despite this evolution, however, major software compiler developers have failed to introduce object oriented extensions into the language.
This article will explore some possible extensions to modern BASIC that would truly expand the language. Because the subject is speculative by nature, the reader should bear in mind that no particular implementation is being suggested as the "best" way to bring object orientation to BASIC. Moreover, some BASIC programmers may feel that certain low-level features, such as in-line assembler and more diverse data types, should be introduced into the BASIC language before object orientation is even considered. These readers should remember the theoretical nature of this discussion, and leave all such preferences out of the exploration at hand.
1.1 Key Terminology and Concepts
First, I must define some key terms and concepts. My use of the generic term BASIC (Beginner's All-purpose Symbolic Instruction Code) will, unless otherwise stated, refer to the Microsoft QuickBASIC v4.5 dialect of BASIC, since this represents a widely accepted implementation of modern, structured BASIC. The term OOP (Object Oriented Programming) will be used to refer to those programming practices that rely on the object paradigm. Although the terminology differs from compiler to compiler, the object oriented paradigm is considered by modern usage to intrinsically encompass the following concepts, to be defined later:
1. Encapsulation
2. Inheritance
3. Polymorphism
4. Overloading
Therefore, when I say that a given concept is "object oriented" I specifically mean that it involves the above four concepts. Other important terms that cannot be ignored in any discussion of OOP are:
5. Class
6. Method (or Member Function)
7. Object (or Class Instance)
8. Information or Data Hiding
Unable to decide which term to define first, I will begin with a general overview of the underlying philosophy of OOP.
In classical structured programming, data and code are considered separate entities. Code manipulates data. Data fuels code. For example, wanting to implement a graphics font engine, a classical BASIC programmer might read a list of DATA statements into a globally accessible array, and then have a series of globally accessible SUBPROGRAMS manipulate those raster data on the screen in such a way as to produce the desired visual effect. The problem with this approach is that both the data and the related code are equally accessible, and they are only loosely cohesive. Wanting to enhance code written by a colleague, a second programmer will encounter data structures that he should neither modify nor even poll, but that may not always be possible. Having modified an essential data structure, the second programmer may introduce errors into the whole underlying logic of the system.
For instance, suppose the original programmer had defined the font data structure thus:
TYPE FontDataType
   FontName AS STRING * 12
   FontPointSize AS INTEGER
   RasterDataPtr AS LONG
END TYPE
Now, looking at this, Programmer Two decides that he can avoid a FUNCTION call to funGetFontPointSize() by just reading the value of Font.FontPointSize directly. Programmer Two alters his code to access this variable directly, and in doing so, avoids what he considers costly calls to funGetFontPointSize(). He is promoted to another department (presumably for having sped up the code of his predecessor). Enter Programmer Three. Quite within his bounds, he discovers that point size is never greater than 255, and so redefines the whole structure to account for this, thereby reducing overall memory consumption by one byte:
TYPE FontDataType
   FontName AS STRING * 12
   FontPointSize AS STRING * 1
   RasterDataPtr AS LONG
END TYPE
Of course, being conscientious, he modifies funGetFontPointSize() to account for this innovation. He recompiles the program, and the compiler rejects it with a type mismatch. Why? Because this is now an illegal statement:
Font.FontPointSize = 12
What to do? He must use his search and replace to go through the entire program and change all such instances to:
Font.FontPointSize = CHR$(12)
Or, he can forget his alterations altogether.
In BASIC, there is no INFORMATION HIDING that would prevent such problems from occurring. Since FontPointSize is a public member of FontDataType, Programmer Two was well within his rights to access it as he saw fit. Had the original programmer had an object oriented BASIC, however, he could have prevented the entire problem with little difficulty by making FontPointSize a PRIVATE data member of the CLASS font. This might have looked similar to this:
CLASS FontClass
   FontName AS PUBLIC STRING * 12
   FontPointSize AS PRIVATE INTEGER
   RasterDataPtr AS PRIVATE LONG
   funGetFontPointSize AS PUBLIC FUNCTION
   subSetFontPointSize AS PUBLIC SUB
END CLASS
DIM Font AS FontClass
[Please bear with the strange new syntax, since it will be covered in more detail in section 2.0.]
Now, the only way to access Font.FontPointSize is indirectly. This would NOT work, since this data member is now PRIVATE:
Font.FontPointSize = 12
This, then, would be the ONLY way to achieve such a thing:
Font.subSetFontPointSize 12
In the above example, the item Font is what is called a CLASS INSTANCE. That is to say, Font is "an instance of the class FontClass." This is what is commonly called an OBJECT, and it is from this that we arrive at the phrase "object oriented" programming.
Now, when Programmer Two comes along, he CANNOT pull off his stunt, and he is not promoted to another department. Programmer Three comes along, and sees room for improvement and redefines the class thus:
CLASS FontClass
   FontName AS PUBLIC STRING * 12
   FontPointSize AS PRIVATE STRING * 1
   RasterDataPtr AS PRIVATE LONG
   funGetFontPointSize AS PUBLIC FUNCTION
   subSetFontPointSize AS PUBLIC SUB
END CLASS
Since all calls to change FontPointSize are through the centralized subSetFontPointSize, Programmer Three just modifies that a bit, and earns himself a nice raise in salary for shaving a byte off the memory requirements of the structure.
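Since C++ already offers the PRIVATE and PUBLIC distinction proposed here, the same story can be sketched in that language. What follows is only an illustration under my own assumptions: the member names getPointSize, setPointSize and pointSize_ are hypothetical stand-ins for the article's funGetFontPointSize, subSetFontPointSize and FontPointSize, and the one-byte storage plays the part of Programmer Three's STRING * 1.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical C++ counterpart of the article's FontClass.
class FontClass {
public:
    std::string fontName;   // public, like FontName

    // The accessors are the only route to the point size.
    int getPointSize() const { return pointSize_; }

    void setPointSize(int size) {
        // The one-byte field cannot hold anything outside 0..255.
        assert(size >= 0 && size <= 255);
        pointSize_ = static_cast<std::uint8_t>(size);
    }

private:
    // Once an int, now a single byte. Callers never see the difference
    // because they can only reach it through the two methods above.
    std::uint8_t pointSize_ = 0;
};
```

A caller writes font.setPointSize(12) and reads font.getPointSize(); an attempt to touch font.pointSize_ directly is refused by the compiler, which is exactly the protection Programmer Two lacked.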
Consider the above example. The data are:
1. FontName
2. FontPointSize
The code portions (called MEMBER FUNCTIONS or METHODS, since they are "methods of acting upon or accessing" the data) are:
1. funGetFontPointSize
2. subSetFontPointSize
Since it is unlikely that subSetFontPointSize will ever be needed for anything other than the setting of FontPointSize, it makes sense to bind the code to the data it works with. This binding is called ENCAPSULATION.
With these more essential terms examined, we come to the issue of OVERLOADING. Although not object oriented in the strictest sense, it does aid in generalizing classes to an extent that they can operate upon different types of data.
Consider the following:
subQuickSort A%()
Now, in classical BASIC programming, if we wanted to sort anything other than INTEGER arrays, we would have to write another SUBPROGRAM and modify the algorithm to account for this new data type. This SUBPROGRAM would have to be named something other than subQuickSort. For example:
subQuickSortSTR A$()
might be used for STRING arrays, and
subQuickSortLONG A&()
might be used for LONG INTEGER arrays. And, of course, should a programmer ever want to sort a user-defined TYPE array:
subQuickSortUserTYPE UserArray()
would be the only way to do it.
But, consider the above. All of these routines do the same thing. It seems a waste to have a separate name for each when they all amount to the same thing: sorting arrays. The answer is to "overload" a SUBPROGRAM name with three corresponding pieces of code. Once subQuickSort is overloaded, it can do triple duty thus:
subQuickSort A%()
subQuickSort A$()
subQuickSort UserArray()
Of course, each call invokes DIFFERENT CODE to do the actual sorting, but this detail is handled by the compiler in a transparent fashion. The programmer's only responsibility would be to provide the code for each instance of subQuickSort, in the following manner:
SUB subQuickSort (Array() AS INTEGER)
   |
   | code to sort INTEGER arrays goes here
   |
END SUB
SUB subQuickSort (Array() AS LONG)
   |
   | code to sort LONG INTEGER arrays goes here
   |
END SUB
SUB subQuickSort (Array() AS UserDefinedType)
   |
   | code to sort arrays of UserDefinedType goes here
   |
END SUB
Upon seeing the second instance of subQuickSort in the source listing, the object oriented BASIC compiler would know that it is dealing with an overloaded SUBPROGRAM.
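C++ resolves overloaded names in just this way, so a short sketch may make the idea concrete. The names quickSort and Record are hypothetical, and the standard library's std::sort stands in for a hand-written quicksort so that the example stays short; the point is only that one name carries three bodies and the compiler chooses among them by argument type.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical user-defined type, standing in for UserDefinedType.
struct Record {
    int key;
    std::string label;
};

// One name, three bodies. The compiler picks a body from the argument type.
void quickSort(std::vector<int>& items) {
    std::sort(items.begin(), items.end());
}

void quickSort(std::vector<std::string>& items) {
    std::sort(items.begin(), items.end());
}

void quickSort(std::vector<Record>& items) {
    // Records have no natural order, so this body sorts them by key.
    std::sort(items.begin(), items.end(),
              [](const Record& a, const Record& b) { return a.key < b.key; });
}
```

The caller simply writes quickSort(numbers), quickSort(names) or quickSort(records) and never needs to know that three different routines exist.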
Overloading is already done by BASIC compilers, but it is done at a level not within the control of the programmer. Consider:
PRINT a
PRINT a$
Each case of PRINT prints a different data type. The PRINT statement, we could say, then, is overloaded. Also to consider is the overloading of operators such as occurs already in BASIC:
A$ = B$ + C$
A% = B% + C%
The addition operator is serving two masters here. In the first case, it is being used to concatenate strings. In the second, it is being used to add two numbers. The processes are internally dissimilar. How, then, does the BASIC compiler contend with these cases? The addition operator is overloaded at an internal level. If a programmer using an object oriented BASIC were to step into the scene, however, we very well might see this type of overloading of the addition and assignment operators:
OVERLOAD "+" FOR ArrayOne(), ArrayTwo()
   TotalElements = UBOUND(ArrayOne) + UBOUND(ArrayTwo)
   DIM ReturnArray(TotalElements)

   ' Copy the left-hand array into the front of the result.
   FOR i = 1 TO UBOUND(ArrayOne)
      ReturnArray(i) = ArrayOne(i)
   NEXT i

   ' Append the right-hand array. Offset marks where ArrayOne ended.
   Offset = UBOUND(ArrayOne)
   FOR q = 1 TO UBOUND(ArrayTwo)
      ReturnArray(Offset + q) = ArrayTwo(q)
   NEXT q

   REDIM ArrayOne(TotalElements)

   ' The following uses an overloaded assignment operator
   ' whose overload definition follows.
   ArrayOne() = ReturnArray()
END OVERLOAD

OVERLOAD "=" FOR ArrayOne(), ArrayTwo()
   FOR i = 1 TO UBOUND(ArrayOne)
      ArrayOne(i) = ArrayTwo(i)
   NEXT i
END OVERLOAD
This bit of sophisticated operator overloading would allow the programmers to add entire arrays to one another as follows:
NewList() = ListOne() + ListTwo()
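For comparison, here is roughly how the same array addition looks in C++, where operator overloading is already available. This is a sketch under my own assumptions: IntList is a hypothetical wrapper type, and unlike the BASIC proposal above, the operator returns a new list instead of storing the result in its left argument.

```cpp
#include <vector>

// Hypothetical wrapper type, so that "+" is defined for our own class
// and not for every std::vector in the program.
struct IntList {
    std::vector<int> items;
};

// Overloaded "+": returns a new list holding left's items, then right's.
IntList operator+(const IntList& left, const IntList& right) {
    IntList result;
    result.items.reserve(left.items.size() + right.items.size());
    result.items.insert(result.items.end(), left.items.begin(), left.items.end());
    result.items.insert(result.items.end(), right.items.begin(), right.items.end());
    return result;
}
```

With this in place, newList = listOne + listTwo reads much like the BASIC line above. The "=" needs no overload of its own here, since the compiler supplies assignment for IntList automatically.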
For some readers, all this may be a new concept in programming. If it seems hard to understand, please take time to reread this section before continuing, since the next part of this discussion relies on the reader's comprehension of all eight terms pertinent to the object oriented programming paradigm, which are, again:
1. Encapsulation
2. Inheritance
3. Polymorphism
4. Overloading
5. Class
6. Method (or Member Function)
7. Object (or Class Instance)
8. Information or Data Hiding
[Polymorphism has been deliberately left out of this discussion, due to its rather esoteric nature.]
2.0 BASIC-Specific Considerations of Object Paradigm Implementation
When considering whether BASIC in its present form could be expanded to include object oriented extensions, we must first look at what is already possible in standard BASIC. For example, the following code resembles inheritance, at least in part:
TYPE ColorType
   R AS INTEGER
   G AS INTEGER
   B AS INTEGER
END TYPE
TYPE CoordinateType
   X AS INTEGER
   Y AS INTEGER
END TYPE
TYPE CircleType
   Point AS CoordinateType
   Color AS ColorType
   Radius AS INTEGER
END TYPE
This is not classical inheritance, but the analogy suffices. Looking at the syntactical elements of the above code, we see that a similar structure could easily be adopted for use with CLASS definitions:
CLASS CircleClass
   Point AS CoordinateType
   Color AS ColorType
   Radius AS INTEGER
END CLASS
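To show why nesting one TYPE inside another only resembles inheritance, the two can be set side by side in C++. This is a sketch with hypothetical names: CircleByNesting mirrors the CircleType above, while Circle inherits from Coordinate and so receives X and Y as members of its own.

```cpp
struct Color {
    int r, g, b;
};

struct Coordinate {
    int x, y;
};

// Nesting, as in the BASIC TYPE above: a circle HAS a point and a color.
// Its coordinates are reached through the nested member: c.point.x
struct CircleByNesting {
    Coordinate point;
    Color color;
    int radius;
};

// Inheritance: a circle IS a coordinate with something added.
// The inherited members are reached directly: k.x
struct Circle : Coordinate {
    int radius;
};
```

The practical difference shows in use. Code written to accept a Coordinate will also accept a Circle, but it will not accept a CircleByNesting unless it is handed the nested point member explicitly.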
A question arises, however. The above definition of the CircleClass CLASS is not executable code, but merely a definition template. It defines CircleClass, but does not assign a "class instance." That is to say, there are not yet any objects of CircleClass defined in the program. Consider this standard BASIC:
TYPE AddressType
   Street AS STRING * 10
   City AS STRING * 32
   State AS STRING * 2
   ZIP AS STRING * 12
END TYPE
DIM Envelope AS AddressType
The DIM statement is used to create an instance of a variable called Envelope that is of the user defined type AddressType. It makes perfect sense, then, that the DIM statement could be used in this manner:
CLASS CircleClass
   Point AS CoordinateType
   Color AS ColorType
   Radius AS INTEGER
END CLASS
DIM Orb AS CircleClass
(Remember, having DIM serve this double purpose is known as overloading the DIM statement.) This syntax serves our purposes wonderfully, since it does not involve the introduction of completely foreign operators and follows the present syntactical structure of standard BASIC.
Another consideration in the creation of classes is the fact that classes may contain both variables and methods in their definitions, as shown in the introduction:
CLASS FontClass
   FontName AS PUBLIC STRING * 12
   FontPointSize AS PRIVATE INTEGER
   RasterDataPtr AS PRIVATE LONG
   funGetFontPointSize AS PUBLIC FUNCTION
   subSetFontPointSize AS PUBLIC SUB
END CLASS
This shows a suggested means of expressing both the scope and the type of each part of the definition. Note, however, that although subSetFontPointSize is defined in this template, there is, as yet, no code attached to the definition. It is said, in OOP parlance, that "the scope of the member function is unresolved." The method is prototyped, but that is all. In C++, what is known as the "scope resolution operator" is used to resolve a method, that is, assign executable code to it. This is done as follows:
void FontClass::subSetFontPointSize (int PointSize)
{
   // code to achieve this end goes here
}
Essentially, this translates into the English statement:
"Define subSetFontPointSize of the class FontClass as follows...."
In an attempt to avoid convoluted syntactical introductions into the BASIC language, what follows is a possible solution:
SUB FontClass.subSetFontPointSize (PointSize AS INTEGER)
   |
   | code that assigns the point size goes here
   |
END SUB
Since the compiler would presumably recognize FontClass as being a class from the earlier CLASS ... END CLASS block, this should suffice as a means of resolving the scope of the method subSetFontPointSize, while avoiding the introduction of :: as a new BASIC operator.
Next comes the issue of overloading both keywords and operators. A simple extension of BASIC would allow this to be sufficient in the case of SUBPROGRAMS and FUNCTIONS:
SUB subQuickSort (Array() AS STRING)
   |
   |
END SUB
SUB subQuickSort (Array() AS INTEGER)
   |
   |
END SUB
The second SUB definition would imply overloading. This would be prototyped at the beginning of the source listing thus:
DECLARE SUB subQuickSort (Array() AS STRING)
DECLARE SUB subQuickSort (Array() AS INTEGER)
Operators, however, are completely different in that BASIC has no way of referring to them explicitly. A proposed extension:
OVERLOAD "=" FOR LeftArgument, RightArgument
   |
   | definition code goes here
   |
   | result returned in LeftArgument
   |
END OVERLOAD
Of course, the "=" could be any ASCII character or even multiple ASCII characters. This would allow the object oriented BASIC program to do this, for example:
OVERLOAD "**" FOR LeftArgument, RightArgument
   ' Some languages use ** for raising to a power
   LeftArgument = LeftArgument ^ RightArgument
END OVERLOAD
The following, however, would not be possible, since it would involve late binding and interpreted evaluation at run-time:
OVERLOAD Operator$ FOR LeftArgument, RightArgument
   SELECT CASE Operator$
      CASE "**"
         LeftArgument = LeftArgument ^ RightArgument
      |
      | etc.
      |
   END SELECT
END OVERLOAD
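To see what such late binding would involve, consider a sketch of it in C++. Everything here is hypothetical: applyOperator and its lookup table are my own illustration of choosing an operation from a string while the program runs, which is the work a compiler normally finishes before the program ever starts.

```cpp
#include <cmath>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Chooses an operation by name at run time. The compiler cannot know
// which entry will be used, so a table lookup precedes every call.
double applyOperator(const std::string& op, double left, double right) {
    static const std::map<std::string, std::function<double(double, double)>> table = {
        {"**", [](double a, double b) { return std::pow(a, b); }},
        {"+",  [](double a, double b) { return a + b; }},
    };

    auto entry = table.find(op);
    if (entry == table.end()) {
        throw std::invalid_argument("unknown operator: " + op);
    }
    return entry->second(left, right);
}
```

A misspelled operator is discovered here only when the offending line executes, whereas an OVERLOAD "**" block fixed at compile time would let the compiler reject the mistake outright.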
2.1 Standardization of Terms in Object Oriented BASIC
Before the discussion continues, perhaps it would be wise to step aside to establish a set of standard terms. Since certain OOP concepts carry many different names (e.g. "member function" is also "method"), a standard way of referring to any particular device should be adopted. But, really, this could become quite involved; which is more appropriate, the term "method" or "member function"? Perhaps, rather than debate too long and hard on the subject, Microsoft's terminology as used for Visual Basic should be adopted:
1. OBJECT rather than "class instance"
2. METHOD rather than "member function"
3. PROPERTY rather than "member variable"
For terms not used by Visual Basic, I suggest the following use by object oriented BASIC:
1. DATA HIDING rather than "information hiding"
2. METHOD DECLARATION rather than "scope resolution"
3. METHOD DECLARATOR rather than "scope resolution operator"
4. OBJECT BINDING rather than "encapsulation"
5. OVERLOADING remains unchanged
6. CLASS remains unchanged
I use these substitutes for the other terms because they have a BASIC sound to them, whereas the other terms, like "scope resolution operator," may sound odd to BASIC programmers. DECLARATOR rings of BASIC's DECLARE statement, thereby reducing the foreignness of the term METHOD DECLARATOR. (In case you have forgotten, the :: is the scope resolution operator in C++, whereas the . is used in this theoretical object oriented BASIC of ours.)
Using this terminology, we have this model:
     /  CLASS VectorClass              ' This is a CLASS DECLARATION
     |     X AS PRIVATE INTEGER        ' This is a PROPERTY of VectorClass
 O B |     Y AS PRIVATE INTEGER        ' As is this
 B I |        ' ^^^^^^^
 J N |        ' Use of PRIVATE demonstrates DATA HIDING
 E D |        ' Whereas use of PUBLIC demonstrates the opposite --\
 C I |        '                                                    |
 T N |        '              /-------------------------------------/
   G |        '            VVVVVV
     |     subSetVector AS PUBLIC SUB  ' This is a METHOD
     \  END CLASS

        ' This operator is the METHOD DECLARATOR in this context
        '              |
        '              V
   D /  SUB VectorClass.subSetVector ( X AS INTEGER, Y AS INTEGER )
   E |
 M C |
 E L |
 T A |
 H R |
 O A |
 D T |
   I |
   O |
   N \  END SUB
2.2 An Introduction to Advanced Topics in OOP
To this point, most fundamental concepts of the object oriented paradigm have been examined. The reader should have a concept of class, object binding, method declaration, overloading, and data hiding, and should also understand the essence of how these object oriented extensions may be added to BASIC.
There are other considerations, however. When an object is created, for instance, how is it initialized? That is to say, how are its properties set to appropriate starting values? A typical standard BASIC program might accomplish this thus:
CALL subFontInit()
This is fine, but remember that there can be more than one OBJECT of the same CLASS as in this case:
DIM Helvetica AS FontClass
DIM Courier AS FontClass
DIM TimesRoman AS FontClass
Now, to initialize the data for each of these, we must do something like this:
CALL subFontHelveticaInit
CALL subFontCourierInit
etc.
In C++, there is a way around this that we can adopt for BASIC use. In every class in C++ there is an implied "constructor." This is a new term. Essentially, the constructor is a method within the class definition that is executed whenever an object is created. For an example of this, consider this method declaration:
SUB FontClass.FontClass
   |
   | code to initialize object goes here
   |
END SUB
(Visual Basic programmers will recognize this as being analogous to the Form_Load event.) Note that the method declaration uses FontClass twice. This informs the compiler that it is dealing with the explicit definition of a CONSTRUCTOR.
In the actual binding declaration of the class, this syntax is suitable:
CLASS FontType
   |
   etc.
   |
   FontType AS CONSTRUCTOR
   |
   etc.
   |
END CLASS
The CONSTRUCTOR type, then, signifies that this template will be followed by a method declaration for a constructor. Now, when the programmer includes this code:
DIM Helvetica AS FontType
the compiler will include the appropriate initialization routines.
Another aspect of this, the "destructor," is exactly the same, except that it operates after the object falls from scope. (Visual Basic programmers again will note the analogous use of the Form_Unload event.) Destructors deinitialize data, cleaning up things when the program ends execution, for instance. In C++, a special operator is used to indicate the destructor: ~FontClass. This use of the tilde is foreign to BASIC, however, so perhaps it would be better to introduce another keyword rather than a new operator:
CLASS FontType
   |
   etc.
   |
   FontType AS CONSTRUCTOR
   FontType AS DESTRUCTOR
   |
   etc.
   |
END CLASS
Now, the method would simply be declared:
SUB FontType.FontType DESTRUCTOR
   |
   | code to deinitialize data structures goes here
   |
END SUB
This is syntactically familiar to a BASIC programmer in another form:
SUB subPrintToScreen (InText AS STRING) STATIC
   |
   |
END SUB
The STATIC keyword modifies the nature of the SUBPROGRAM. Consequently, I have suggested the DESTRUCTOR keyword be used in a similar syntactical fashion.
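The life cycle described in this section can be watched at work in C++, from which the idea is borrowed. In this sketch the events log, the name_ member and the useFonts routine are hypothetical additions of mine, there only to record the order in which the constructor and destructor run.

```cpp
#include <string>
#include <vector>

// Hypothetical log, so that the order of the calls can be inspected.
std::vector<std::string> events;

class FontClass {
public:
    // Constructor: runs automatically whenever an object is created.
    explicit FontClass(const std::string& name) : name_(name) {
        events.push_back("init " + name_);
    }

    // Destructor: runs automatically when the object falls from scope.
    ~FontClass() {
        events.push_back("cleanup " + name_);
    }

private:
    std::string name_;
};

void useFonts() {
    FontClass helvetica("Helvetica");
    FontClass courier("Courier");
}   // both objects fall from scope here, last created first
```

Calling useFonts() leaves four entries in the log: the two initializations in the order the objects were declared, then the two cleanups in reverse order. At no point does the programmer call an init or cleanup routine by hand, which is precisely the convenience the CONSTRUCTOR and DESTRUCTOR keywords are meant to bring to BASIC.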
3.0 Closing Notes
Indeed, BASIC has evolved from the time-sharing days of Dartmouth. Despite this evolution, however, major software compiler developers have failed to introduce object oriented extensions into the language. Perhaps this article has introduced some new concepts to the reader, perhaps not. At the very least, it has explored some ways an object oriented paradigm might be introduced successfully into BASIC programming with as little pain as possible. Programmers tend to maintain their old programming habits despite the innovations that come into their languages, and consequently, any major changes to the way BASIC operates may prove to be obstacles rather than useful tools. I feel that my suggestions involve minimal relearning of the syntax of BASIC, since they adopt the flavor of existing structures. In the end, though, the question is not what is the better method or terminology to use, really, but rather:
"Object Oriented BASIC, possibility or pipedream?"