by Jim Karabatsos - GUI Computing
With the release of version 4 of Visual Basic, Microsoft made some significant enhancements to the language.
Many of these enhancements have received little coverage in the media because of the problems caused by some of the more, umm, challenging changes made in the same version, changes such as UniMess(TM) By Design, Evil Type Coercion and forced alignment of type elements. Just for a change, I thought it would be good to write a positive article focusing on the good and forgetting (for a while at least) the bad and the ugly.
One of the coolest new features of the language itself is the implementation of collections. Collections are essentially linked lists that are managed for you by Visual Basic. They provide the VB programmer with an easy mechanism for storing and accessing in-memory information in a convenient way, as well as being the underlying mechanism for creating object hierarchies in Visual Basic.
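Collections are declared much like other objects, using the Dim statement inside a procedure or the Private or Public statements at the module level (and, as with any other object, the collection itself must then be created with New, either by declaring the variable As New Collection or with a later Set MyCollection = New Collection):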
Public MyCollection As Collection
Private MyCollection As Collection
Dim MyCollection As Collection
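A declaration on its own only gives you a variable that can refer to a collection. The following is a minimal sketch of the two usual ways of actually creating one; the names Currencies, Scratch and InitCurrencies are assumptions made for illustration only.

```vb
' Module level: declare the variable now, create the collection later.
Public Currencies As Collection

' "As New" asks VB to create the collection automatically
' the first time the variable is used.
Private Scratch As New Collection

Public Sub InitCurrencies()
    ' Explicit creation; until this runs, Currencies is Nothing.
    Set Currencies = New Collection
End Sub
```

The explicit Set form makes the moment of creation obvious, which is why I would lean towards it for module-level collections.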
Once you have a collection, you can add things to it. Collection members are stored as Variants, so you can add almost any kind of thing: object references, and also simple values such as integers or strings. The one thing you cannot add is a user-defined type. In this article we will concentrate on objects, meaning instances of classes that you have defined in class modules, or objects that you have obtained a reference to through OLE.
Objects typically contain some smarts. You can code property procedures to perform validation checks whenever a value is set into or read from an object's data field, or to implement side effects like storing data to a database or updating an on-screen representation. However, there is no requirement that objects contain code; it is quite possible to create a VB object that is nothing but a collection of data. Let's do just that.
The following class module defines an object type that could be used to store data about currency conversion factors:
Public Code As String
Public Name As String
Public NumDecimals As Integer
Public ConversionFactor As Double
If we set the name of the class to TCurrencyInfo, then we could create an instance of it using the following code:
Dim CurrencyInfo As TCurrencyInfo
Set CurrencyInfo = New TCurrencyInfo
CurrencyInfo.Code = "AUD"
CurrencyInfo.Name = "Australian Dollar"
CurrencyInfo.NumDecimals = 2
CurrencyInfo.ConversionFactor = 1.25
O.K., we now have an instance of TCurrencyInfo and we need to store it somewhere. For this article, we are discussing in-memory storage, not file system storage. Before version 4, we would have needed to store the data in an array. This is quite acceptable (and indeed quite efficient) but there is just one, tiny problem: how big do we want the array to be?
Sometimes we know ahead of time how large to make the array. Many times, however, it is not possible to know ahead of time just how many elements you need - in which case you need to either create very large arrays, which unnecessarily consume memory (and hope that you never exceed the limit), or handle the dynamic growth of the array using the ReDim Preserve statement. The latter option is quite workable but can also be quite slow if you try to grow the array by one element at a time. You really need to get a bit smarter than that and grow by some reasonable increment; unfortunately this complicates the logic somewhat - so much so that I have seen programmers resort to using a database rather than grapple with the issues involved.
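To give a feel for the bookkeeping involved, here is a rough sketch of growing an array in chunks rather than one element at a time. The chunk size of 16 and the names Rates, RateCount, RateCapacity and AppendRate are all assumptions made for the example.

```vb
' Hypothetical module-level state for a growable array of rates.
Private Const GROW_BY = 16      ' assumed chunk size
Private Rates() As Double       ' storage, grown a chunk at a time
Private RateCount As Long       ' slots actually in use
Private RateCapacity As Long    ' slots currently allocated

Public Sub AppendRate(ByVal Rate As Double)
    ' Grow only when every allocated slot is already in use.
    If RateCount = RateCapacity Then
        RateCapacity = RateCapacity + GROW_BY
        If RateCount = 0 Then
            ReDim Rates(1 To RateCapacity)            ' first allocation
        Else
            ReDim Preserve Rates(1 To RateCapacity)   ' keep existing data
        End If
    End If
    RateCount = RateCount + 1
    Rates(RateCount) = Rate
End Sub
```

Note that every piece of code that reads the array now has to respect RateCount rather than UBound(Rates), which is exactly the sort of complication described above.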
Collections are just the answer we are looking for. We can add an object to a collection using the Add method of the collection object. The Add method typically looks like this (assuming we have a public collection called "Currencies" somewhere):
Currencies.Add Item := CurrencyInfo, Key := CurrencyInfo.Code
First things first. Notice how we are using the ":=" operator to assign values to parameters by name. This is another of the neat new facilities in version 4, allowing us to specify parameters in any convenient order, and to omit them altogether if we would like to accept defaults. Indeed, there are two other parameters, Before and After, that we have omitted because we want the object added at the default position (the end of the list).
The two parameters you will almost always code are Item, the object being added, and Key, a string that uniquely identifies it within the collection.
In this example we have used the currency code as the key, allowing us to retrieve the item easily. You do not need to specify a Key if you don't want to, and indeed it is sometimes quite convenient not to. Most times, however, you will find that the ability to refer to the items in a collection by their key is one of the main advantages of collections over arrays.
Note that the collection itself now holds a reference to the object that has been stored in it. This object continues to exist, even after the original reference to it goes out of scope. In our example above, the CurrencyInfo object was DIMmed and instantiated inside some procedure, the values were set into the fields of the object and then it was added to the collection. Behind the scenes, it is as if the collection was an array of objects and you had used Set to bind one of the elements to the CurrencyInfo object. Even when the CurrencyInfo object goes out of scope (when the procedure terminates) the object continues to exist because its reference count is not yet zero — remember that the collection holds a reference to it. If this sounds just a little bit mystical, then welcome to the brave new world of OLE (grin). It might be a good idea to invest some time in reading a good OLE book, my current favourite being "Inside OLE" from Microsoft Press.
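The lifetime rule is easier to believe once you see it run. This sketch assumes the TCurrencyInfo class defined earlier; the procedure names AddNZD and ShowNZD, and the New Zealand Dollar entry, are made up for the illustration.

```vb
' Assumes the TCurrencyInfo class module shown earlier in the article.
Public Currencies As New Collection

Public Sub AddNZD()
    Dim Info As TCurrencyInfo       ' local reference only
    Set Info = New TCurrencyInfo
    Info.Code = "NZD"
    Info.Name = "New Zealand Dollar"
    Currencies.Add Item:=Info, Key:=Info.Code
End Sub                             ' Info goes out of scope here

Public Sub ShowNZD()
    AddNZD
    ' The object is still alive, because the collection
    ' is holding a reference to it.
    Debug.Print Currencies("NZD").Name
End Sub
```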
To retrieve a particular object from the collection, we use the Item method of the collection. Because Item is the default member, we do not need to actually code it and can use a syntax that makes the collection look like an array:
AustDollarRate = Currencies("AUD").ConversionFactor
We can also create another reference to that object, as in:
Dim AUDInfo As TCurrencyInfo
Set AUDInfo = Currencies("AUD")
The object now has two references: the Currencies collection holds one and AUDInfo holds the other.
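Retrieving by key assumes that the key is actually there; asking for one that is not raises a run-time error. One common way of coping is a small probe function like the sketch below. KeyExists is a hypothetical helper, not part of VB, and it assumes the collection holds objects.

```vb
' Hypothetical helper: True if Coll holds an object stored under Key.
Public Function KeyExists(Coll As Collection, ByVal Key As String) As Boolean
    Dim Probe As Object
    On Error Resume Next
    Set Probe = Coll.Item(Key)      ' raises an error if Key is absent
    KeyExists = (Err.Number = 0)
    Err.Clear
    On Error GoTo 0
End Function
```

With this in place you might write If KeyExists(Currencies, "AUD") Then ... before touching Currencies("AUD").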
You can determine how many objects are in a collection using the Count property, and you can index the objects using an integer ranging from 1 to Count, as in:
For I = 1 To Currencies.Count
    Print "One USD buys"; Currencies(I).ConversionFactor; _
          " "; Currencies(I).Name
Next I
Notice how we are using an integer value as a pseudo-subscript which indicates that we are selecting the object positionally rather than by its key (which is always a string).
Another way to iterate over all the objects in a collection is using the For Each statement:
Dim C As TCurrencyInfo
For Each C In Currencies
    Print "One USD buys"; C.ConversionFactor; _
          " "; C.Name
Next C
This is basically a more convenient (and probably more readable) way to do exactly the same thing.
Finally, to delete an object from a collection you use the Remove method, passing the key of the object you want removed, as in Currencies.Remove "AUD".
You can also use a numeric index to specify the object to remove by its ordinal position, if that is more convenient in your application.
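Removing by position also gives a simple way of emptying a collection altogether. The following is a sketch only; ClearCollection is a hypothetical helper name, and the calls in the trailing comments assume the Currencies collection used throughout this article.

```vb
' Hypothetical helper: empty any collection, one item at a time.
Public Sub ClearCollection(Coll As Collection)
    ' Each Remove shifts the later items down by one position,
    ' so repeatedly removing item 1 eventually removes them all.
    Do While Coll.Count > 0
        Coll.Remove 1
    Loop
End Sub

' Typical calls:
'   Currencies.Remove "AUD"         ' by key
'   Currencies.Remove 1             ' by ordinal position
'   ClearCollection Currencies      ' everything
```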
That's pretty much all there is to collections; if you understand this, you have a solid framework to start using collections in your own programs. There are, however, a few interesting things you might want to consider.
First, it is possible to control the exact placement of objects when they are added to a collection using the Before or After parameters. This can be useful if you need to maintain some sort of sequence independent of the order in which the objects are being added to the collection; in most cases, however, you will just add them with a suitable Key value as we have done here.
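As a rough illustration of Before and After, the sketch below keeps four currencies in alphabetical order even though they are added out of order. It assumes the TCurrencyInfo class from earlier; NewCurrency and BuildSorted are hypothetical helpers invented for the example.

```vb
' Hypothetical helper that builds a TCurrencyInfo (class defined earlier).
Private Function NewCurrency(ByVal Code As String) As TCurrencyInfo
    Dim Info As New TCurrencyInfo
    Info.Code = Code
    Set NewCurrency = Info
End Function

Public Function BuildSorted() As Collection
    Dim Result As New Collection
    Result.Add Item:=NewCurrency("AUD"), Key:="AUD"
    Result.Add Item:=NewCurrency("USD"), Key:="USD"
    ' Insert GBP ahead of USD, identifying USD by its key.
    Result.Add Item:=NewCurrency("GBP"), Key:="GBP", Before:="USD"
    ' Insert CAD after the first item, identified by position.
    Result.Add Item:=NewCurrency("CAD"), Key:="CAD", After:=1
    Set BuildSorted = Result        ' order is now AUD, CAD, GBP, USD
End Function
```

Before and After accept either a key or an ordinal position, but you can only supply one of the two on any given call.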
Collections are "single-dimensional" (although that is a stretch). The "subscript" is a single integer object number or a single string key. It is, however, possible to simulate multi-dimensional collections. One way is to create a concatenated string key, where (say) the first three characters are the first "subscript", the next three are the second, and so on. Another way (that I consider much more interesting) is to make use of the fact that, being an object, a collection can itself be stored in a collection. You can create collections of collections to any reasonable depth, opening up some very useful data structures.
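Here is a minimal sketch of the collection-of-collections idea, with an outer collection keyed by region and inner collections keyed by currency code. The region name "Pacific" and the function names BuildRegions and ShowNested are assumptions for the example; TCurrencyInfo is the class defined earlier.

```vb
Public Function BuildRegions() As Collection
    Dim Regions As New Collection   ' outer level, keyed by region
    Dim Pacific As New Collection   ' inner level, keyed by currency code
    Dim Info As New TCurrencyInfo

    Info.Code = "AUD"
    Info.Name = "Australian Dollar"
    Pacific.Add Item:=Info, Key:=Info.Code
    Regions.Add Item:=Pacific, Key:="Pacific"
    Set BuildRegions = Regions
End Function

Public Sub ShowNested()
    Dim R As Collection
    Set R = BuildRegions()
    ' Two "subscripts", one for each level of the structure.
    Debug.Print R("Pacific")("AUD").Name
End Sub
```

Each inner collection can be a different size, which is something a true two-dimensional array cannot offer.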
Finally, you need to use the right tool for the job. Collections and arrays have a lot of overlap in the types of situations that they cover. In general, array processing is faster than collection processing, can handle multiple dimensions, and can be created for any type, whether simple, record or object. Collections, on the other hand, cannot hold records (user-defined types), but each element in a collection can be of a different type (i.e. collections are polymorphic). Collections dynamically resize themselves to make efficient use of memory and are great for "sparse array" handling. Lastly, collections are a type of associative array, meaning that we can reference items using an associated key, rather than being restricted to a numeric subscript.
Like most things, this is not an either/or issue. Collections add to the rather formidable array of programmer tools in VB (pun absolutely intended).