More for C++

Introduction

This introduction will show you some of the library's basic concepts and conventions. Like any good programming tutorial, we'd like to start with a program that prints out the message "Hello, world". But this is not a C++ tutorial, so for the introduction to "More for C++" it has been made a little more complicated than you may expect:

#include <iostream.hpp>
#include <more/glue.hpp>
#include <more/string.hpp>

using namespace std;

int main( int nArgC, char* pcArgV[] )
{
  more::Glue glue;
  more::String sMessage = "Hello, world";

  cout << sMessage << endl;

  return 0;
}

The first things to mention are the include statements and the use of the namespace "more". Every class in "More for C++" resides in the namespace "more" or in one of its nested namespaces (e.g. "more::os"). The include statements for the classes correspond to their namespaces: the class "more::String" is included with "#include <more/string.hpp>", and the class "more::os::Thread" is included with "#include <more/os/thread.hpp>". For reasons of portability, the names of files and directories must always be written in lower case, and the separator must always be a slash ('/'), never a backslash!

Next, a few words about "more::Glue". Before any class or feature of "More for C++" can be used, the library has to be initialized. This is done in the constructor of the class "more::Glue", so a glue object has to be created at the very beginning of your program. The function "main" is the obvious place to do this, so that the glue's constructor is one of the first things called during the execution of your program. As you may have guessed, the library is shut down in the glue's destructor. Since the glue object was created at the beginning of "main", its destructor is virtually the last function called before leaving "main".

Hello, real world

The following code examples are provided with the source code distribution of the library. The source files can be found in the directory ".../more/samples/cookbook". In addition, the library comes with project files and makefiles for both Microsoft Visual C++ 6.0 and the C++ compiler and make utility of the GNU project. These files are located in the directories ".../more/build/msvc60" and ".../more/build/gnu", respectively.

For Visual C++, there is a workspace file called "more.dsw". Open it, select the project "cookbook (Sample)", and build it. The library itself is a separate project called "more", which is built as a dependency of the cookbook project.

If you are using the GNU development tools, open a shell, change to the directory ".../more/build/gnu", and start the build with the command "make". So far, this has only been tested under various flavors of GNU/Linux.

Now we will enhance the example by introducing the class "Console" (as in the file "console.hpp"):

#include <more/pointer.hpp>
#include <more/string.hpp>

class Console
{
  public:
    void print( size_t );
    void print( const more::String& );

  protected:
    virtual ~Console( );
};

typedef more::p<Console> ConsolePtr;

The next convention worth mentioning at this point is: you must not use "using" declarations or "using namespace" directives in a header file! This is to prevent name clashes with identifiers from other libraries. In this example, "using more::String;" or "using namespace more;" could lead to a name clash if a second class called "String" (e.g. from another library) is used in another header file included together with "console.hpp".

Another distinctive convention is the protected destructor. The "More for C++" library comes along with its own garbage collector ("GC"). The impact of using a GC in C++ is that object destruction as we know it is no longer used. This means that a class destructor is never called! By declaring every destructor "protected", no one can "delete" an object by mistake because this is rejected by the compiler.

The Garbage Collector

Before we carry on with the implementation of the class "Console", the typedef "ConsolePtr" should be explained. The GC can only manage the objects if they have been created dynamically on the heap. As we will see in a moment, every garbage-collectable object has to be created with a special "new" operator (preferably called with the macro "CREATE"). Such an object must not be referenced by a normal C++ pointer, as the GC would have no chance to keep track of the object's lifetime. A so-called "smart pointer" has to be used instead which will act very similarly to the built-in C++ pointers. This smart pointer is implemented by the template class "more::p". For reasons of readability, you may use the typedef instead of "more::p<Console>", but that is purely a matter of taste.

The implementation of "Console" in "console.cpp":

#include <iostream.hpp>
#include "console.hpp"

using namespace more;
using namespace std;

void Console::print( size_t nNumber )
{
  cout << nNumber << endl;
}

void Console::print( const String& sMessage )
{
  cout << sMessage << endl;
}

Console::~Console( )
{
}

The implementation is very straightforward, and we will not explain it in detail. Let's have a look at some further conventions instead, e.g. the inclusion of a file called "iostream.hpp". In standard C++, you include the iostream classes with "iostream" or (as was common in the early years) "iostream.h". With "More for C++", these standard header files should be included indirectly by using so-called "wrapper files" like "iostream.hpp". This gives us the possibility to avoid some compiler problems that may occur when including the standard headers directly along with the header files for the "More for C++" classes. Other examples of this technique are "corba.hpp", "python.hpp" or "more/stl.hpp".

The (protected) destructor has to be implemented only to satisfy the linker; it is never called. If there is really a need for a kind of "destruction", your class can be derived from the interface "more::core::Finalizable", in which case you have to implement a method "finalize". This method will be called by the garbage collector when an object of your class is no longer referenced and the collector wants to "free" the object. This is very low-level, and you should use this "finalization" only for very good reasons, e.g. releasing a system resource like a file handle. However, in most cases it is better to shut down an object explicitly so that you have control of what is going on!

Creating objects

Now we will take a look at the function main in the file "main.cpp" where some of the basic concepts of the "More for C++" garbage collector in conjunction with smart pointers, strings, and arrays will be demonstrated.

Before we start, just a few words about all these include statements at the beginning of the module. The architecture of "More for C++" is based upon very small components with as few dependencies between them as possible. This corresponds to very small C++ header files, with the benefit that they can be included very cheaply (in comparison to "monsters" like "windows.h").

01 #include <iostream.hpp>
02 #include <more/array.hpp>
03 #include <more/create.hpp>
04 #include <more/gc.hpp>
05 #include <more/glue.hpp>
06 #include <more/pointer.hpp>
07 #include <more/string.hpp>
08 #include "console.hpp"
09
10 using namespace more;
11 using namespace std;
12
13 int main( int nArgC, const char* sArgV[] )
14 {
15   Glue glue;
16   String sMessage;
17   ConsolePtr pConsole;
18   Array<String> sMessages;
19
20   pConsole = CREATE Console( );
21   pConsole -> print( GC::getNoOfObjects( ) );
22
23   sMessage = "Hello";
24   pConsole -> print( GC::getNoOfObjects( ) );
25
[...]

As we have already seen in the first example, the very first object that has to be created by an application using the "More for C++" library is the "glue object", which initializes the library (i.e. the GC).

To make things a little more complicated, we declare three variables to work with: a string object (sMessage), the pointer to a Console object (pConsole), and an array of strings called sMessages. After creating a Console object using the macro CREATE (line 20), we use the object to print the number of objects to the console (line 21). As you can see when running the program, the garbage collector knows of exactly one object, namely the Console object.

In line 23, we create a second object by assigning the value "Hello" to the string variable sMessage. As in the Java language, a string variable is used in both ways: as a kind of reference to a string object, and as the string object itself. The empty string is the NIL reference, which is why there is only one object on the GC's heap before the assignment is made. As a result of the assignment in line 23, a string object is created, and in line 24 the number of objects printed to the console is two.

Assignment and references

[...]
26   sMessages.setLength( 2 );
27   sMessages[0] = sMessage;
28   sMessages[1] = "world";
29   pConsole -> print( GC::getNoOfObjects( ) );
30
31   sMessage = sMessages[0] + ", " + sMessages[1];
32   pConsole -> print( GC::getNoOfObjects( ) );
33
[...]

In line 26 the length of the array of strings (initially with a size of zero) is set to two. While setting the length of an array, the items of the array are initialized implicitly. In the case of an array of strings, two empty strings are created. Then, in line 27 the string variable sMessage is assigned to the first item of the array and a new string "world" is generated while assigning it to the second item.

The number of objects that is printed to the console in line 29 of our example is - you may have guessed it - four! Let's count: after setting the array sMessages to a length of two, the number of objects is three (the Console object, the string "Hello" assigned to the variable sMessage, and the array sMessages itself, holding two empty strings which are not counted as objects). The fourth object is the string "world" which is assigned to the second item of sMessages in line 28.

The assignment in line 27 does not increase the number of objects, because here the reference nature of strings comes into play. When one string variable is assigned to another, the string is not(!) copied. Only the reference to the string object is copied, so in this example both the array's first item and the string variable sMessage hold a reference to the same string object "Hello" living on the heap.

String objects in "More for C++" are immutable. Regardless of how many string variables are holding a reference to the same string object, manipulating one of the string variables virtually creates a new string object.

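This is demonstrated in line 31, where sMessage is changed to a new string as the result of two string operations. While evaluating these operations, one temporary string is generated in addition to the resulting string. Neither of them replaces an existing object (the string "Hello" is still referenced by the first item of the array), so the number of objects printed in line 32 rises from four to six.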

The same rules of assignment, references and immutability that apply to string variables are also true for pointer variables, but NOT for arrays. Accessing an item of an array with the "[]" operator will change this item in place. This means that any array variable that holds a reference to the same array object on the heap will see this change immediately!

Sweeping objects

As you can see, the theory of using strings, smart pointers, and arrays is not quite as simple as it may seem at first. In practice, however, it is very straightforward, and you will get used to it very quickly.

With that in mind, we will see how the rest of the objects in this Cookbook example are created and - even more interesting - how the garbage collector treats them:

[...]
34   sMessage = "";
35   pConsole -> print( GC::getNoOfObjects( ) );
36
37   sMessages.setLength( 0 );
38   pConsole -> print( GC::getNoOfObjects( ) );
39
40   GC::collectObjects( );
41   pConsole -> print( GC::getNoOfObjects( ) );
42
[...]

In lines 34 and 37, every single reference to any string object that was created in our example is eliminated. Since there is no way of explicit object destruction, the string objects reside on the GC's heap until the garbage collector sweeps them away (line 40). From that moment, only those objects remain on the heap that are referenced by at least one string variable, smart pointer, or array. So in line 41, the number of objects is "one" because only the smart pointer pConsole is referencing our Console object.

Smart pointers

As mentioned before, the use of the "More for C++" smart pointers is very straightforward and similar to the built-in pointer types in C++. Nevertheless, one feature of these smart pointers is different: checking for NULL.

As a demonstration of this feature, the last part of the example is embedded in a try-catch block. In line 56 you can see the reason why: an exception of the type NullPointerException is thrown, and should be caught, whenever someone tries to use the "->" operator of a smart pointer that has a value of NULL, as happens in line 54.

Run the program and see what is happening in the last part of the main function.

[...]
43   try
44   {
45     ConsolePtr pSecondConsolePtr;
46
47     pSecondConsolePtr = pConsole;
48     pConsole -> print( GC::getNoOfObjects( ) );
49
50     pSecondConsolePtr = 0;
51     pConsole -> print( GC::getNoOfObjects( ) );
52
53     pConsole = 0;
54     pConsole -> print( GC::getNoOfObjects( ) );
55   }
56   catch( const NullPointerException& )
57   {
58     cerr << "Null pointer exception!" << endl;
59   }
60
61   GC::collectObjects( );
62   cout << GC::getNoOfObjects( ) << endl;
63
64   return 0;
65 }

The dangers of dereferencing a pointer are very similar to those of accessing the items of an array. For this reason, "More for C++" defines another exception called IndexOutOfRange. Such an exception is raised if someone tries to access an item that lies outside an array's current range of items. Therefore, it may be a good idea to embed such an operation in a try-catch block that catches any IndexOutOfRange exception.

The GC and Multithreading

The "More for C++" library was designed for multi-threaded applications. For this reason, the string type, smart pointer, and array class were developed for thread safety wherever possible. Hence you can safely use string variables and pointers referencing the same objects in different threads. The one important exception is the "[]" operator of an array. Since it gives you direct access to an item of the array, using it from different threads may be very dangerous. Thus, it is always a good idea to manipulate an array (and, as a rule of thumb, every other type of container) in a synchronized manner using a mutual exclusion (as shown in the "threads sample").

The garbage collector itself is thread-safe too, so you may call GC::collectObjects at any time from any thread context. For your convenience, a separate "collector thread" is created by calling "GC::startCollectorThread()". This thread is active until your program terminates, and it marks and sweeps objects in the background, so you no longer have to call the collector explicitly.

Conclusion

This introduction to "More for C++" should have given you a first impression of the basic concepts of the library. If you want to learn more about the library, take a look at the other examples in the directory "more/samples", particularly in the sub-directories "pointer", "stl", "test", and "threads".

The presence of a garbage collector in C++ may be a little confusing in a language where explicit object destruction is a very commonly used feature. The reward is a kind of "C doubleplusgood", where the advantages of a well-established programming language are combined with the robust automatic memory management you may expect today.

Enjoy!

© 1999 - 2003  Thorsten Görtz