Unions

By Richard D. Clark

This isn’t an article about the AFL-CIO, but about those strange data structures called unions. Unions are an efficient and powerful way to organize data, but strangely enough, I don’t see many people using them. This article defines a union, describes its benefits and shows how to implement one in your own programs.

You can think of a union as a type-def that can hold different data within a single data segment. The size of a union is the size of the largest data item in the union definition. If you define a union with a 4-byte element and an 8-byte element, the size of the union is 8 bytes. Unlike a type-def though, a union can only hold a single data item at any given time. This may not sound very efficient or useful, but it is a powerful way to organize different types of data using a single data object.

A union is efficient. Using the above definition, if you use a union to hold a 4-byte data element, it may sound like you are wasting 4 bytes, but in fact you are saving 4 bytes. Without a union, you would have to allocate a 4-byte and an 8-byte data item, using 12 bytes total. By using a union, the 4-byte data element fits within the 8-byte data segment of the union; in other words, the 4-byte element and the 8-byte element share the same space.

While efficiency in a program is always welcome, the real power of a union derives from the fact that it can hold only one data item at a time. This may sound contradictory, but it is a way to manage complexity in a program by grouping disparate data items within a single container.

A real-world example will illustrate this concept. In my program Deep Deadly Dungeons (DDD), a rogue-like game, I needed a way to contain and manipulate a player character’s inventory.
The inventory items include rations, weapons, ammo, armor, potions and scrolls. Each inventory item is unique; each has different properties that must be managed. At a first approximation, it would appear that a lot of coding would be required to manage all of these properties because the items are so different. Handling each one as a separate data structure would require routines for each item, increasing the complexity of the program. It would be better to have a single data structure that would represent a general inventory object, and a single set of routines to manage that general inventory object.

You can create such an object by using a union, and once you have this general-purpose object, you can create a set of routines to manipulate that single object, thereby reducing the complexity of the program. Here is how I defined my inventory object in DDD, showing only the weapon and armor type-def elements. The other elements are defined in a similar manner.

    Type armortype
        id As Integer
        eval As Integer
        ac As Integer
        defmod As Integer
        defmagic As Integer
        defmagicmod As Integer
        strg As Integer
        cursed As Integer
        dr As Integer
        noise As Integer
    End Type

    Type weapontype
        id As Integer
        eval As Integer
        hands As Integer
        strg As Integer
        tohitmod As Integer
        damage As Integer
        dammod As Integer
        cursed As Integer
        dr As Integer
        skill As Integer
        noise As Integer
    End Type

    Type invtype
        typeid As Integer
        desc As String * 20
        Union
            light As lighttype
            supply As supplytype
            ammo As ammotype
            necklace As necklacetype
            ring As ringtype
            wand As wandtype
            potion As potiontype
            scroll As scrolltype
            weapon As weapontype
            armor As armortype
            shield As shieldtype
        End Union
    End Type

Notice that the union is inside a type-def. The union part of this type describes a single inventory item. Since only a single instance of an inventory item resides in a union, how do you determine what is in the union? My method was to use a typeid, which is an integer value that indicates what the union is holding.
The typeid is defined within the type-def, and not the union, so it is available whichever inventory item the union holds. The following defines describe the different inventory item classes.

    'item class ids
    #Define supplies 1
    #Define necklaces 2
    #Define potions 3
    #Define rings 4
    #Define wands 5
    #Define weapons 6
    #Define armors 7
    #Define lights 8
    #Define ammos 9
    #Define shields 10
    #Define scrolls 11

If the typeid is 6, then I know that the union holds a weapon inventory item. If the typeid is 7, then I know that the inventory item is armor. I can now write a set of routines to manipulate the inventory object. Since each routine operates on a single object, I can create a high-level set of inventory methods that will work on any type of inventory item. For example, to generate an inventory item, I use the following subroutine.

    Sub SetItem(item As Integer, inv As invtype)
        Dim cursed As Integer

        cursed = GetPercentage(10)
        With inv
            Select Case .typeid
            Case supplies
                .supply.id = item
                GenSupply inv
            Case necklaces
                .necklace.id = item
                GenNecklace inv
            Case potions
                .potion.id = item
                GenPotion inv
            Case rings
                .ring.id = item
                GenRing inv
            Case wands
                .wand.id = item
                GenWand inv
            Case weapons
                .weapon.id = item
                GenWeapon inv, cursed
            Case armors
                .armor.id = item
                GenArmor inv, cursed
            Case lights
                .light.id = item
                GenLight inv
            Case ammos
                .ammo.id = item
                GenAmmo inv
            Case shields
                .shield.id = item
                GenShield inv, cursed
            Case scrolls
                .scroll.id = item
                GenScroll inv
            End Select
        End With
    End Sub

This subroutine sets the passed inventory object (inv As invtype) to the appropriate inventory item based on the passed item id and the typeid of the inventory object. Here I have a single method to handle any type of inventory item. If my inventory items were separate type-defs, I would have to create a SetItem for each inventory item type. By using a union I can create a single method that will handle whatever type I send it.

The preceding code shows an example of the use of a union in a real-world program.
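The DDD listing depends on item types and Gen routines that are not shown here, so it will not compile on its own. As a rough, self-contained sketch of the same typeid-plus-union pattern, the following uses two cut-down item types. The names miniweapon, miniarmor, miniinv and DescribeItem are hypothetical and are not part of DDD; only the two class ids are taken from the defines above.

```freebasic
' Sketch of the typeid-plus-union pattern with hypothetical, cut-down types.
' Only the class ids 6 and 7 come from the article; every other name is
' an assumption made for illustration.
#Define weapons 6
#Define armors 7

Type miniweapon
    id As Integer
    damage As Integer
End Type

Type miniarmor
    id As Integer
    ac As Integer
End Type

Type miniinv
    typeid As Integer          ' records which union member is valid
    Union
        weapon As miniweapon   ' both members share the same storage
        armor As miniarmor
    End Union
End Type

' One routine serves every item class by checking typeid first.
Sub DescribeItem(ByRef inv As miniinv)
    Select Case inv.typeid
    Case weapons
        Print "Weapon with damage"; inv.weapon.damage
    Case armors
        Print "Armor with ac"; inv.armor.ac
    Case Else
        Print "Unknown item class"
    End Select
End Sub

Dim sword As miniinv
sword.typeid = weapons
sword.weapon.id = 1
sword.weapon.damage = 8
DescribeItem sword

Dim mail As miniinv
mail.typeid = armors
mail.armor.id = 2
mail.armor.ac = 5
DescribeItem mail
```

The point of the sketch is that DescribeItem never reads a union member without checking typeid first. Reading inv.armor while typeid says weapons would simply reinterpret the weapon's bytes.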
To illustrate how to create and use a union, let’s create a small, non-trivial program that implements a simple variant data type. Our variant can either be a string or an integer, and we will add methods to set the value of the variant, to add two variants together and to print a variant. First, let’s define the union that will be our variant type.

    'define our data type classes
    #Define isnull 0
    #Define isstring 1
    #Define isinteger 2

    'define our union type
    Type vtype
        id As Integer
        Union
            sdata As String
            idata As Integer
        End Union
    End Type

The #defines indicate what type of value is in the variant. Notice that there are three defines: one for NULL, or no value, one for string data and one for integer data. This completely covers all the states that our union may be in at any given instance.

In our vtype, we have defined an id field that we will use to indicate what data type we are holding: isnull, isstring or isinteger. All of our routines will examine this id field to determine what actions to take. The actual data is stored in the union, and can either be a string, sdata, or an integer, idata. Notice that we don’t have a NULL entry in our union. If our variant is NULL, that is id = isnull, then we simply ignore any values that are contained within the union, since NULL means “no meaningful value”.

One caution: this listing was compiled with FreeBASIC .14b (see the comment in the complete program). Later FreeBASIC documentation states that a Union cannot contain a variable-length String, so on a newer compiler sdata may need to be declared as a fixed-length string instead.

To use our new variant type, we simply create one or more variables of type vtype.

    'create some variant data
    Dim As vtype v1, v2, v3, v4

In order to use a variant, we must initialize it with data. This means that not only do we have to load the actual data item into the union portion of the type-def, we also have to indicate what type of data is being stored in the union. The easiest way to do this is to create a set of methods that will set the id flag and load the data. In FreeBasic we can use the Overload keyword to make the job of setting our variant very easy.

    'define our set function using overload
    Declare Sub SetV Overload (idata As Integer, v As vtype)
    Declare Sub SetV (sdata As String, v As vtype)

    'create the integer set
    Sub SetV(idata As Integer, v As vtype)
        v.id = isinteger
        v.idata = idata
    End Sub

    'create the string set
    Sub SetV(sdata As String, v As vtype)
        v.id = isstring
        v.sdata = sdata
    End Sub

Here we declare our set method, SetV, as an overloaded subroutine that can take either an integer or a string. We then write the actual subroutines to initialize the variant for each type of data being stored. If we are passing an integer to SetV, then the id is set to isinteger and the data is stored in the idata field. If the passed value is a string, then the id is set to isstring and the data is stored in the sdata field of the union.

We simply call SetV with the appropriate arguments. Since we have overloaded the subroutine, the compiler will use the correct subroutine based on the data being passed.

    'create an integer type
    SetV 10, v1

After this call, v1.id = isinteger and v1.idata = 10.

    'create a string type
    SetV "10", v2

After this call, v2.id = isstring and v2.sdata = "10".

But how do we set the variant to be NULL? Since a NULL means no meaningful value, simply setting the id to isnull is enough to indicate that the value of the variant is NULL.

    'create a null type
    v3.id = isnull

Next we need to display the variant. Our PrintV method prints out the value of the variant based on the id value.

    Sub PrintV(v As vtype)
        'print out data based on type
        If v.id = isstring Then
            Print v.sdata
        ElseIf v.id = isinteger Then
            Print v.idata
        Else
            'just print a null string for null
            Print
        End If
    End Sub

Here we simply examine the id and print out the appropriate data field in the union. So to print out v1 to the screen, we would use the following code snippet.

    PrintV v1

Finally, we need a way to add two variants together. This is where it gets a bit tricky, because we are potentially dealing with two different data types at the same time.
That is, we may need to add a string and an integer together; how do we define what the result will be? In order for this to work, we will need to come up with some rules for handling the different data types.

1. If both values are of the same type, simply add them together. If both are strings, concatenate them; if both are integers, add them.
2. If one is a string and the other is an integer then:
   2.1 If the string can be converted to an integer, then convert the string to an integer and add the two together.
   2.2 If the string cannot be converted to an integer, then convert the integer to a string and concatenate them together.
3. If one or both are NULL, return a NULL.

That covers all the possible outcomes using our variants. Keep in mind that these rules are completely arbitrary. This is simply how I define our variants to behave. You may want to define the behavior differently. With these rules in mind, we can code the AddV method.

    Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
        Dim vtmp As Integer

        'init our return value to null
        vret.id = isnull
        'check to see if both values are strings
        If v1.id = isstring And v2.id = isstring Then
            vret.id = isstring 'set the id type
            vret.sdata = v1.sdata + v2.sdata
        End If
        'check to see if both values are integers
        If v1.id = isinteger And v2.id = isinteger Then
            vret.id = isinteger 'set the id type
            vret.idata = v1.idata + v2.idata
        End If
        'check for string - integer combination
        If v1.id = isstring And v2.id = isinteger Then
            'check to see if string can be converted to integer
            vtmp = Val(v1.sdata)
            'successful conversion so add as integers
            If vtmp > 0 Then
                vret.id = isinteger 'set the id type
                vret.idata = vtmp + v2.idata
            Else
                'can't convert to integer so convert integer to string
                vret.id = isstring 'set the id type
                vret.sdata = v1.sdata + Str$(v2.idata)
            End If
        End If
        'check for integer - string combo
        If v1.id = isinteger And v2.id = isstring Then
            'check to see if string can be converted to integer
            vtmp = Val(v2.sdata)
            'successful conversion so add as integers
            If vtmp > 0 Then
                vret.id = isinteger 'set the id type
                vret.idata = vtmp + v1.idata
            Else
                'can't convert to integer so convert integer to string
                vret.id = isstring 'set the id type
                vret.sdata = Str$(v1.idata) + v2.sdata
            End If
        End If
        'if one or both values are null, return null.
        If v1.id = isnull Or v2.id = isnull Then
            vret.id = isnull
        End If
    End Sub

This subroutine checks the id of the first and second parameters and then chooses the appropriate operation, returning the result in the third parameter. I coded this the “long way” to make it clear that you have to check each parameter combination; that is, string-string, integer-integer, string-integer or integer-string, and finally if either value is NULL, we simply return a NULL.

Let’s examine the string-string combo in detail.

    'check to see if both values are strings
    If v1.id = isstring And v2.id = isstring Then
        vret.id = isstring 'set the id type
        vret.sdata = v1.sdata + v2.sdata
    End If

Here we check the ids of v1 and v2. Since both are strings, we are going to do a concatenate operation, saving the string data in vret.sdata and indicating that we have a string value with vret.id = isstring.

Let’s look at the string-integer combination.

    'check for string - integer combination
    If v1.id = isstring And v2.id = isinteger Then
        'check to see if string can be converted to integer
        vtmp = Val(v1.sdata)
        'successful conversion so add as integers
        If vtmp > 0 Then
            vret.id = isinteger 'set the id type
            vret.idata = vtmp + v2.idata
        Else
            'can't convert to integer so convert integer to string
            vret.id = isstring 'set the id type
            vret.sdata = v1.sdata + Str$(v2.idata)
        End If
    End If

Here we have a string and an integer. We check to see if the string can be converted to a number using Val(v1.sdata). If vtmp is greater than 0, then we will add the value in vtmp to v2.idata and set vret as an integer type.
If vtmp is 0, meaning Val() could not convert the string to a number, then we convert v2.idata to a string, Str$(v2.idata), and set vret to a string type. Because the test is vtmp > 0, a string such as "0" or "-5" is also treated as non-numeric and is concatenated rather than added.

To call AddV we simply pass the two data items and the return value.

    AddV v2, v1, v4

The two operands v2 and v1 are added together and returned in v4.

Here is the complete program.

    Option Explicit
    'compiled using Freebasic .14b

    'define our data type classes
    #Define isnull 0
    #Define isstring 1
    #Define isinteger 2

    'define our union type
    Type vtype
        id As Integer
        Union
            sdata As String
            idata As Integer
        End Union
    End Type

    'define our set function using overload
    Declare Sub SetV Overload (idata As Integer, v As vtype)
    Declare Sub SetV (sdata As String, v As vtype)

    'create the integer set
    Sub SetV(idata As Integer, v As vtype)
        v.id = isinteger
        v.idata = idata
    End Sub

    'create the string set
    Sub SetV(sdata As String, v As vtype)
        v.id = isstring
        v.sdata = sdata
    End Sub

    'Define our method to add two variants using the following rules:
    '1 If both are the same type, add them and return new value.
    '2 If one is an integer and one is a string, convert string
    '  to integer if it can be represented as an integer, otherwise
    '  convert integer to string and append v2 to v1.
    '3 If one or both values are null return null.
    Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
        Dim vtmp As Integer

        'init our return value to null
        vret.id = isnull
        'check to see if both values are strings
        If v1.id = isstring And v2.id = isstring Then
            vret.id = isstring 'set the id type
            vret.sdata = v1.sdata + v2.sdata
        End If
        'check to see if both values are integers
        If v1.id = isinteger And v2.id = isinteger Then
            vret.id = isinteger 'set the id type
            vret.idata = v1.idata + v2.idata
        End If
        'check for string - integer combination
        If v1.id = isstring And v2.id = isinteger Then
            'check to see if string can be converted to integer
            vtmp = Val(v1.sdata)
            'successful conversion so add as integers
            If vtmp > 0 Then
                vret.id = isinteger 'set the id type
                vret.idata = vtmp + v2.idata
            Else
                'can't convert to integer so convert integer to string
                vret.id = isstring 'set the id type
                vret.sdata = v1.sdata + Str$(v2.idata)
            End If
        End If
        'check for integer - string combo
        If v1.id = isinteger And v2.id = isstring Then
            'check to see if string can be converted to integer
            vtmp = Val(v2.sdata)
            'successful conversion so add as integers
            If vtmp > 0 Then
                vret.id = isinteger 'set the id type
                vret.idata = vtmp + v1.idata
            Else
                'can't convert to integer so convert integer to string
                vret.id = isstring 'set the id type
                vret.sdata = Str$(v1.idata) + v2.sdata
            End If
        End If
        'if one or both values are null, return null.
        If v1.id = isnull Or v2.id = isnull Then
            vret.id = isnull
        End If
    End Sub

    'define our print routine
    Sub PrintV(v As vtype)
        'print out data based on type
        If v.id = isstring Then
            Print v.sdata
        ElseIf v.id = isinteger Then
            Print v.idata
        Else
            'just print a null string for null
            Print
        End If
    End Sub

    'create some variant data type
    Dim As vtype v1, v2, v3, v4

    'create an integer type
    SetV 10, v1
    Print "Set integer: ",
    PrintV v1

    'create a string type
    SetV "10", v2
    Print "Set string: ",
    PrintV v2

    'create a null type
    v3.id = isnull
    Print "Set null: ",
    PrintV v3
    Print

    'add two integers together
    AddV v1, v1, v4
    Print "Adding 2 integers:",
    PrintV v4

    'add two strings together
    AddV v2, v2, v4
    Print "Adding 2 strings:",
    PrintV v4

    'add string and integer
    AddV v1, v2, v4
    Print "Adding string, integer:",
    PrintV v4

    'add integer and string
    AddV v2, v1, v4
    Print "Adding integer, string:",
    PrintV v4
    Print

    'adding null to integer
    AddV v3, v1, v4
    Print "Adding null, integer:",
    PrintV v4

    'adding integer to null
    AddV v1, v3, v4
    Print "Adding integer, null:",
    PrintV v4

    'adding null to string
    AddV v3, v2, v4
    Print "Adding null, string:",
    PrintV v4

    'adding string to null
    AddV v2, v3, v4
    Print "Adding string, null:",
    PrintV v4

    'adding null to null
    AddV v3, v3, v4
    Print "Adding null, null:",
    PrintV v4
    Print

    'create a new string type
    SetV "This is a string", v2
    Print "Set string: ";
    PrintV v2
    Print

    'add string and integer
    AddV v2, v1, v4
    Print "Adding new string, integer:",
    PrintV v4

    'add integer and string
    AddV v1, v2, v4
    Print "Adding integer, new string:",
    PrintV v4
    Print
    Print
    Print "Press any key"
    Sleep

And here is the output.

    Set integer: 10
    Set string: 10
    Set null:

    Adding 2 integers: 20
    Adding 2 strings: 1010
    Adding string, integer: 20
    Adding integer, string: 20

    Adding null, integer:
    Adding integer, null:
    Adding null, string:
    Adding string, null:
    Adding null, null:

    Set string: This is a string

    Adding new string, integer: This is a string10
    Adding integer, new string: 10This is a string


    Press any key

Unions are a way to reduce complexity in a program, by enabling the programmer to package different types of data within a single object, and create a single set of methods to manage the data contained within the object.

--end--

Rick Clark