Unions
By Richard D. Clark

This isn't an article about the AFL-CIO, but about
those strange data structures called unions. Unions
are an efficient and powerful way to organize data,
but strangely enough, I don't see many people using
them. This article defines a union, describes its
benefits and shows how to implement one in your own
programs.

You can think of a union as a type-def whose elements
all share a single data segment. The size of a union
is the size of the largest data item in the union
definition. If you define a union with a 4-byte
element and an 8-byte element, the size of the union
is 8 bytes. Unlike a type-def though, a union can only
hold a single data item at any given time. This may
not sound very efficient or useful, but it is a
powerful way to organize different types of data using
a single data object.

A union is efficient. Using the above definition, if
you use a union to hold a 4-byte data element, it may
sound like you are wasting 4 bytes, but in fact you
are saving 4 bytes. Without a union, you would have
to allocate both a 4-byte and an 8-byte data item,
using 12 bytes total. With a union, the 4-byte data
element occupies the same 8-byte data segment as the
8-byte data element, so no extra 4 bytes need to be
allocated for it.
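The size claim is easy to check with SizeOf. The following is a minimal sketch, assuming a recent FreeBASIC compiler in which Long is 4 bytes and Double is 8 bytes; the names Number and Pair are illustrative only.

```freebasic
' Illustrative names only; Long is 4 bytes and Double is 8 bytes.
Union Number
    l As Long
    d As Double
End Union

' The same two elements as separate fields of a type-def.
Type Pair
    l As Long
    d As Double
End Type

Print "Union size: "; SizeOf(Number)   ' 8 bytes, the size of the Double
Print "Type size:  "; SizeOf(Pair)     ' at least 12 bytes; padding may add more

Dim n As Number
n.d = 3.5        ' the union now holds a Double
n.l = 42         ' storing the Long reuses the same bytes
Print n.l
```

After the second assignment only n.l is meaningful; the Double stored earlier has been partly overwritten.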

While efficiency in a program is always welcome, the
real power of a union derives from the fact that it
can only hold one data item at a time. This may sound
contradictory, but it is a way to manage complexity in
a program by grouping disparate data items within a
single container. A real-world example will illustrate
this concept.

In my program Deep Deadly Dungeons (DDD), a rogue-like
game, I needed a way to contain and manipulate a
player character's inventory. The inventory items
include rations, weapons, ammo, armor, potions and
scrolls. Each inventory item is unique; each has
different properties that must be managed. At a first
approximation, it would appear that a lot of coding
would be required to manage all of these properties
because the items are so different. Handling each one
as a separate data structure would require routines
for each item, increasing the complexity of the
program.

It would be better to have a single data structure
that would represent a general inventory object, and a
single set of routines to manage that general
inventory object. You can create such an object by
using a union, and once you have this general-purpose
object, you can create a set of routines to manipulate
that single object, thereby reducing the complexity of
the program. Here is how I defined my inventory object
in DDD, showing only the weapon and armor type-def
elements. The other elements are defined in a similar
manner.

type armortype
   id as integer
   eval As Integer
   ac As Integer
   defmod As Integer
   defmagic As Integer
   defmagicmod As Integer
   strg As Integer
   cursed As Integer
   dr As Integer
   noise As Integer
End Type

type weapontype
   id as integer
   eval as integer
   hands as integer
   strg As Integer
   tohitmod As Integer
   damage as integer
   dammod as integer
   cursed as integer
   dr As Integer
   skill As Integer
   noise As Integer
End Type

type invtype
   typeid As Integer
   desc As String * 20
   union
       light as lighttype
       supply as supplytype
       ammo As ammotype
       necklace as necklacetype
       ring as ringtype
       wand as wandtype
       potion as potiontype
       scroll as scrolltype
       weapon as weapontype
       armor as armortype
       shield as shieldtype
   end union
end type

Notice that the union is inside a type-def. The union
part of this type describes a single inventory item.
Since only a single instance of an inventory item
resides in a union, how do you determine what is in
the union? My method was to use a typeid, which is an
integer value that indicates what the union is
holding. The id is defined within the type-def, and
not the union, so it is available to each inventory
item. The following defines describe the different
inventory item classes.

'item class ids
#Define supplies 1
#Define necklaces 2
#Define potions 3
#Define rings 4
#Define wands 5
#Define weapons 6
#Define armors 7
#Define lights 8
#Define ammos 9
#Define shields 10
#Define scrolls 11

If the typeid is 6, then I know that the union holds a
weapon inventory item. If the typeid is 7 then I know
that the inventory item is armor. I can now write a
set of routines to manipulate the inventory object.
Since each routine operates on a single object, I can
create a high-level set of inventory methods that will
work on any type of inventory item. For example, to
generate an inventory item, I use the following
subroutine.

sub SetItem(item as integer, inv as invtype)
   dim cursed as integer

   cursed = GetPercentage(10)
   with inv
       select case .typeid
           case supplies
               .supply.id = item
               GenSupply inv
           case necklaces
               .necklace.id = item
               GenNecklace inv
           case potions
               .potion.id = item
               GenPotion inv
           case rings
               .ring.id = item
               GenRing inv
           case wands
               .wand.id = item
               GenWand inv
           case weapons
               .weapon.id = item
               GenWeapon inv, cursed
           Case armors
               .armor.id = item
               GenArmor inv, cursed
           case lights
               .light.id = item
               GenLight inv
           Case ammos
               .ammo.id = item
               GenAmmo inv
           case shields
               .shield.id = item
               GenShield inv, cursed
           case scrolls
               .scroll.id = item
               GenSCroll inv
       end select
   end with
end sub

This subroutine sets the passed inventory object (inv
as invtype) to the appropriate inventory item based on
the passed item id and the typeid of the inventory
object. Here I have a single method to handle any type
of inventory item. If my inventory items were separate
type-defs, I would have to create a SetItem for each
inventory item type. By using a union I can create a
single method that will handle whatever type I send
it.
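The same pattern can be tried out in isolation. The following is a minimal, self-contained sketch, not the DDD code: MiniItem, MiniWeapon, MiniArmor, Describe and the two class ids are hypothetical names chosen for illustration, and it assumes a recent FreeBASIC compiler in the default dialect.

```freebasic
' Hypothetical, cut-down item classes (not the DDD definitions)
#Define kWeapon 1
#Define kArmor 2

Type MiniWeapon
    id As Long
    damage As Long
End Type

Type MiniArmor
    id As Long
    ac As Long
End Type

Type MiniItem
    typeid As Long          ' tells us which union element is valid
    Union
        weapon As MiniWeapon
        armor As MiniArmor
    End Union
End Type

' One routine handles every item class by checking typeid first.
Function Describe(ByRef item As MiniItem) As String
    Select Case item.typeid
        Case kWeapon
            Return "weapon, damage " & item.weapon.damage
        Case kArmor
            Return "armor, ac " & item.armor.ac
        Case Else
            Return "unknown item"
    End Select
End Function

Dim sword As MiniItem
sword.typeid = kWeapon
sword.weapon.id = 1
sword.weapon.damage = 8
Print Describe(sword)

Dim plate As MiniItem
plate.typeid = kArmor
plate.armor.id = 2
plate.armor.ac = 5
Print Describe(plate)
```

Adding a new item class means adding one element to the union, one id and one Case branch, rather than a new set of routines.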

The preceding code shows an example of the use of a
union in a real-world program. To illustrate how to
create and use a union, let's create a small,
non-trivial program that implements a simple variant
data type. Our variant can either be a string or an
integer, and we will add methods to set the value of
the variant, to add two variants together and to print
a variant.

First, let's define the union that will be our variant
type.

'define our data type classes
#Define isnull 0
#Define isstring 1
#Define isinteger 2

'define our union type
Type vtype
   id As Integer
   Union
       sdata As String
       idata As Integer
   End Union
End Type

The #defines indicate what type of value is in the
variant. Notice that there are three defines: one for
NULL, or no value, one for string data and one for
integer data. This covers all the states that our
union may be in at any given instant. In our vtype, we
have defined an id field that we will use to indicate
what data type we are holding: isnull, isstring or
isinteger. All of our routines will examine this id
field to determine what actions to take. The actual
data is stored in the union, and can either be a
string, sdata, or an integer, idata. Notice that we
don't have a NULL entry in our union. If our variant
is NULL, that is id = isnull, then we simply ignore
any values that are contained within the union, since
NULL means no meaningful value.
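Compiler versions differ in what they accept inside a union. As far as I am aware, recent FreeBASIC releases do not allow a variable-length String as a union element, because such a string is managed through a descriptor rather than stored in place. If your compiler rejects the definition above, a fixed-length ZString is one possible substitute. The following is a sketch under that assumption; the name vtype2 and the 32-byte limit are arbitrary choices for illustration.

```freebasic
#Define isnull 0
#Define isstring 1
#Define isinteger 2

' Assumed alternative layout: the text is stored in place.
Type vtype2
    id As Long
    Union
        sdata As ZString * 32   ' up to 31 characters plus terminator
        idata As Long
    End Union
End Type

Dim v As vtype2

v.id = isstring
v.sdata = "10"
Print v.sdata

v.id = isinteger
v.idata = 10        ' overwrites the first bytes of sdata
Print v.idata
```

The trade-off is a fixed maximum string length, and every variant now occupies the full 32 bytes of the union whatever it holds.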

To use our new variant type, we simply create one or
more variables of type vtype.

'create some variant data
Dim As vtype v1, v2, v3, v4

In order to use a variant, we must initialize it with
data. This means that not only do we have to load the
actual data item into the union portion of the
type-def, we have to indicate what type of data is
being stored in the union. The easiest way to do this
is to create a set of methods that will set the id
flag and load the data. In FreeBasic we can use the
overload keyword to make the job of setting our
variant very easy.

'define our set function using overload
Declare Sub SetV Overload (idata As Integer, v As vtype)
Declare Sub SetV (sdata As String, v As vtype)

'create the integer set
Sub SetV(idata As Integer, v As vtype)
   v.id = isinteger
   v.idata = idata
End Sub

'create the string set
Sub SetV(sdata As String, v As vtype)
   v.id = isstring
   v.sdata = sdata
End Sub

Here we declare our set method, SetV, as an overloaded
subroutine that can take either an integer or a
string. We then write the actual subroutines to
initialize the variant for each type of data being
stored. If we are passing an integer to SetV, then the
id is set to isinteger and the data is stored in the
idata field. If the passed value is a string, then the
id is set to isstring and the data is stored in the
sdata field of the union. We simply call SetV with the
appropriate arguments. Since we have overloaded the
subroutine, the compiler will use the correct
subroutine based on the data being passed.

'create an integer type
SetV 10, v1

After this call, v1.id = isinteger and v1.idata = 10.

'create a string type
SetV "10", v2

After this call, v2.id = isstring and v2.sdata = "10".

But how do we set the variant to be NULL? Since a NULL
means no meaningful value, simply setting the id to
isnull is enough to indicate that the value of the
variant is NULL.

'create a null type
v3.id = isnull

Next we need to display the variant. Our PrintV method
prints out the value of the variant based on the id
value.

Sub PrintV(v As vtype)
   'print out data based on type
   If v.id = isstring Then
       Print v.sdata
   ElseIf v.id = isinteger Then
       Print v.idata
   Else
       'just print a null string for null
       Print
   End If
End Sub

Here we simply examine the id and print out the
appropriate data field in the union. So to print out
v1 to the screen, we would use the following code
snippet.

PrintV v1

Finally, we need a way to add two variants together.
This is where it gets a bit tricky, because we
potentially are dealing with two different data types
at the same time. That is, we may need to add a string
and integer together; how do we define what the result
will be? In order for this to work, we will need to
come up with some rules for handling the different
data types.

1. If both values are of the same type, simply add
them together. If both are strings, concatenate them;
if both are integers, add them.
2. If one is a string and the other is an integer,
then:
2.1 If the string can be converted to an integer, then
convert the string to an integer and add the two
together.
2.2 If the string cannot be converted to an integer,
then convert the integer to a string and concatenate
them together.
3. If one or both are NULL, return a NULL.

That covers all the possible outcomes using our
variants. Keep in mind that these rules are completely
arbitrary. This is simply how I define our variants to
behave. You may want to define the behavior
differently. With these rules in mind, we can code the
AddV method.

Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
   Dim vtmp As Integer

   'init our return value to null
    vret.id = isnull
    'check to see if both values are strings
    If v1.id = isstring And v2.id = isstring Then
       vret.id = isstring 'set the id type
       vret.sdata = v1.sdata + v2.sdata
    End If
    'check to see if both values are integers
    If v1.id = isinteger And v2.id = isinteger Then
       vret.id = isinteger 'set the id type
       vret.idata = v1.idata + v2.idata
    End If
    'check for string - integer combination
    If v1.id = isstring And v2.id = isinteger Then
       'check to see if string can be converted to integer
       vtmp = Val(v1.sdata)
       'successful conversion so add as integers
       If vtmp > 0 Then
           vret.id = isinteger 'set the id type
           vret.idata = vtmp + v2.idata
       Else
           'can't convert to integer so convert integer to string
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + Str$(v2.idata)
       End If
    End If
    'check for integer - string combo
    If v1.id = isinteger And v2.id = isstring Then
       'check to see if string can be converted to integer
       vtmp = Val(v2.sdata)
       'successful conversion so add as integers
       If vtmp > 0 Then
           vret.id = isinteger 'set the id type
           vret.idata = vtmp + v1.idata
       Else
           'can't convert to integer so convert integer to string
           vret.id = isstring 'set the id type
           vret.sdata = Str$(v1.idata) + v2.sdata
       End If
    End If
    'if one or both values are null, return null.
    If v1.id = isnull Or v2.id = isnull Then
       vret.id = isnull
    End If
End Sub

This subroutine checks each id of the first and second
parameters and then chooses the appropriate operation,
returning the result in the third parameter. I coded
this the long way to make it clear that you have to
check each parameter combination; that is,
string-string, integer-integer, string-integer or
integer-string, and finally if either value is NULL,
we simply return a NULL. Let's examine the
string-string combo in detail.

    'check to see if both values are strings
    If v1.id = isstring And v2.id = isstring Then
       vret.id = isstring 'set the id type
       vret.sdata = v1.sdata + v2.sdata
    End If

Here we check the ids of v1 and v2. Since both are
strings, we are going to do a concatenate operation,
saving the string data in vret.sdata and indicating
that we have a string value with vret.id = isstring.
Let's look at the string-integer combination.

    'check for string - integer combination
    If v1.id = isstring And v2.id = isinteger Then
       'check to see if string can be converted to integer
       vtmp = Val(v1.sdata)
       'successful conversion so add as integers
       If vtmp > 0 Then
           vret.id = isinteger 'set the id type
           vret.idata = vtmp + v2.idata
       Else
           'can't convert to integer so convert integer to string
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + Str$(v2.idata)
       End If
    End If

Here we have a string and an integer. We check to see
if the string can be converted to a number using
Val(v1.sdata). If vtmp is greater than 0, then we
add the value in vtmp to v2.idata and set vret as an
integer type. If vtmp is not greater than 0, we treat
the string as non-numeric, convert v2.idata to a
string with Str$(v2.idata), and set vret to a string
type. Note that Val() also returns 0 for the string
"0", so this simple test treats "0" and negative
numbers as text and concatenates them.
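The Val() test is deliberately simple. A stricter alternative is to examine the characters of the string directly. The following is a minimal sketch, assuming a recent FreeBASIC compiler in the default dialect (one that supports the Boolean type); IsIntegerText is a hypothetical helper and is not part of the program in this article.

```freebasic
' Hypothetical helper (not part of the article's program):
' returns True when s is an optional sign followed only by digits.
Function IsIntegerText(ByRef s As Const String) As Boolean
    Dim As Long i, first = 1

    ' skip one leading sign character, if present
    If Left(s, 1) = "-" OrElse Left(s, 1) = "+" Then first = 2
    ' reject an empty string or a lone sign
    If first > Len(s) Then Return False

    For i = first To Len(s)
        ' every remaining character must be a decimal digit
        If InStr("0123456789", Mid(s, i, 1)) = 0 Then Return False
    Next
    Return True
End Function

Print IsIntegerText("10")      ' digits only: accepted
Print IsIntegerText("0")       ' zero is accepted too
Print IsIntegerText("-5")      ' signed values are accepted
Print IsIntegerText("10abc")   ' trailing text: rejected
```

With such a helper, AddV could call IsIntegerText on the string first and use Val() only when the check succeeds.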

To call AddV we simply pass the two operands and the
variable that receives the result.

Addv v2, v1, v4

The two operands v2 and v1 are added together and
returned in v4.

Here is the complete program.

Option Explicit
'compiled using Freebasic .14b

'define our data type classes
#Define isnull 0
#Define isstring 1
#Define isinteger 2

'define our union type
Type vtype
   id As Integer
   Union
       sdata As String
       idata As Integer
   End Union
End Type

'define our set function using overload
Declare Sub SetV Overload (idata As Integer, v As vtype)
Declare Sub SetV (sdata As String, v As vtype)

'create the integer set
Sub SetV(idata As Integer, v As vtype)
   v.id = isinteger
   v.idata = idata
End Sub

'create the string set
Sub SetV(sdata As String, v As vtype)
   v.id = isstring
   v.sdata = sdata
End Sub

'Define our method to add two variants using the
'following rules:
'1 If both are the same type, add them and return new
'  value.
'2 If one is an integer and one is a string, convert
'  string to integer if it can be represented as an
'  integer, otherwise convert integer to string and
'  append v2 to v1.
'3 If one or both values are null return null.
Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
   Dim vtmp As Integer

   'init our return value to null
    vret.id = isnull
    'check to see if both values are strings
    If v1.id = isstring And v2.id = isstring Then
       vret.id = isstring 'set the id type
       vret.sdata = v1.sdata + v2.sdata
    End If
    'check to see if both values are integers
    If v1.id = isinteger And v2.id = isinteger Then
       vret.id = isinteger 'set the id type
       vret.idata = v1.idata + v2.idata
    End If
    'check for string - integer combination
    If v1.id = isstring And v2.id = isinteger Then
       'check to see if string can be converted to integer
       vtmp = Val(v1.sdata)
       'successful conversion so add as integers
       If vtmp > 0 Then
           vret.id = isinteger 'set the id type
           vret.idata = vtmp + v2.idata
       Else
           'can't convert to integer so convert integer to string
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + Str$(v2.idata)
       End If
    End If
    'check for integer - string combo
    If v1.id = isinteger And v2.id = isstring Then
       'check to see if string can be converted to integer
       vtmp = Val(v2.sdata)
       'successful conversion so add as integers
       If vtmp > 0 Then
           vret.id = isinteger 'set the id type
           vret.idata = vtmp + v1.idata
       Else
           'can't convert to integer so convert integer to string
           vret.id = isstring 'set the id type
           vret.sdata = Str$(v1.idata) + v2.sdata
       End If
    End If
    'if one or both values are null, return null.
    If v1.id = isnull Or v2.id = isnull Then
       vret.id = isnull
    End If
End Sub

'define our print routine
Sub PrintV(v As vtype)
   'print out data based on type
   If v.id = isstring Then
       Print v.sdata
   ElseIf v.id = isinteger Then
       Print v.idata
   Else
       'just print a null string for null
       Print
   End If
End Sub

'create some variant data type
Dim As vtype v1, v2, v3, v4

'create an integer type
SetV 10, v1
Print "Set integer: ",
PrintV v1
'create a string type
SetV "10", v2
Print "Set string: ",
PrintV v2
'create a null type
v3.id = isnull
Print "Set null: ",
PrintV v3
Print

'add two integers together
Addv v1, v1, v4
Print "Adding 2 integers:",
PrintV v4

'add two strings together
Addv v2, v2, v4
Print "Adding 2 strings:",
PrintV v4

'add string and integer (v2 holds the string, v1 the integer)
Addv v2, v1, v4
Print "Adding string, integer:",
PrintV v4

'add integer and string
Addv v1, v2, v4
Print "Adding integer, string:",
PrintV v4
Print

'adding null to integer
Addv v3, v1, v4
Print "Adding null, integer:",
PrintV v4

'adding integer to null
Addv v1, v3, v4
Print "Adding integer, null:",
PrintV v4

'adding null to string
Addv v3, v2, v4
Print "Adding null, string:",
PrintV v4

'adding string to null
Addv v2, v3, v4
Print "Adding string, null:",
PrintV v4

'adding null to null
Addv v3, v3, v4
Print "Adding null, null:",
PrintV v4
Print

'create a new string type
SetV "This is a string", v2
Print "Set string: ";
PrintV v2
Print

'add string and integer
Addv v2, v1, v4
Print "Adding new string, integer:",
PrintV v4

'add integer and string
Addv v1, v2, v4
Print "Adding integer, new string:",
PrintV v4


Print
Print
Print "Press any key"
Sleep



And here is the output.

Set integer:    10
Set string:   10
Set null:

Adding 2 integers:           20
Adding 2 strings:           1010
Adding string, integer:      20
Adding integer, string:      20

Adding null, integer:
Adding integer, null:
Adding null, string:
Adding string, null:
Adding null, null:

Set string: This is a string

Adding new string, integer: This is a string10
Adding integer, new string: 10This is a string


Press any key

Unions are a way to reduce complexity in a program.
They let the programmer package different types of
data within a single object and create a single set
of methods to manage the data contained within that
object.



Rick Clark