C++ static initialization is powerful but dangerous

Static initialization is the name given to events before the program's main() function begins execution. Static initialization is meant to enable the construction of C++ objects -- that is, to set them to a known state before main() begins.

Normally, this means that variables are set to some known value and complex class objects, like strings, get initialized properly. But it is also possible to use static initialization to construct tables.
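As a minimal sketch of table construction during static initialization, the following builds a bit-count lookup table before the program's real work starts. The names build_popcount_table, popcount_table and popcount32 are made up for illustration; nothing here comes from a particular library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical example: a 256-entry lookup table holding the number of
// set bits in each possible byte value.
static std::array<unsigned char, 256> build_popcount_table() {
    std::array<unsigned char, 256> table{};  // every entry starts at 0
    for (std::size_t i = 1; i < table.size(); ++i) {
        // bits(i) = bits(i / 2) + lowest bit of i
        table[i] = static_cast<unsigned char>(table[i / 2] + (i & 1));
    }
    return table;
}

// Namespace-scope object: in practice its initializer runs before
// main() begins, so the table is ready by the time it is needed.
static const std::array<unsigned char, 256> popcount_table =
    build_popcount_table();

// Counts the set bits in the low 32 bits of value, one byte at a time.
unsigned popcount32(unsigned long value) {
    unsigned count = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        count += popcount_table[(value >> shift) & 0xFF];
    }
    return count;
}
```

The table costs 255 loop iterations at startup, which is negligible. The same pattern with a large or I/O-bound builder function is where startup time starts to suffer.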

Doing so can greatly simplify the program's startup logic, but it can also lead to very slow program startup times if the initializers do complex work. Further, it can lead to very complex debugging sessions.

The C++ language standard defines the order of static initialization only within a single translation unit, where variables are initialized in the order in which they are defined. It leaves the order between translation units unspecified. In practice, most toolchains run all the static initializations for a given object file together and take the object files in the order in which the linker processes them. This means that static initializations in early modules cannot rely on the proper initialization of external variables defined in modules linked later in the process.

Basically, this means that static initialization is heavily dependent on something outside the control of the programmer -- the decisions made by the linker (which may change between compiler/OS/linker revision levels and differ between operating systems). Debugging in a multi-platform environment can be frustrating (to say the least) if the application developers do not understand the possible failure modes.

It is possible to control the order in which object modules are linked, but then you lose the advantage of object module libraries. It is better, if you must use static initialization, to come up with strategies that don't require precise placement of object modules -- except, perhaps, for the first module linked (or the last, depending on the linker -- some versions of the GNU development kit run the object module initializations in the reverse of their link order).

Normally it is safe and harmless to use static initialization for class objects that do not depend on one another. Initializing strings, or user-defined classes that don't attempt to make use of global variables, is generally not an issue. The problem arises when the constructors or initialization expressions access global variables or class static methods.
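The failure mode can be reproduced in a single source file, where the initialization order is fixed by the order of the definitions. The sketch below is only an illustration: runtime_seed, load_setting, setting and doubled_setting are hypothetical names, and the extern declaration stands in for a global that really lives in another object file.

```cpp
#include <cassert>

// Hypothetical stand-in for a value that is only known at run time.
// The volatile qualifier keeps the compiler from folding it away.
volatile int runtime_seed = 21;

int load_setting() { return runtime_seed; }

// Declared here, defined further down: this mimics a global that lives
// in an object file whose initializers run later.
extern int setting;

// Initialized first. At this point setting has only been
// zero-initialized, so this computes 0 * 2.
int doubled_setting = setting * 2;

// Initialized second, too late to help doubled_setting.
int setting = load_setting();
```

By the time main() runs, setting holds 21 but doubled_setting holds 0 rather than 42. Across object files the same thing happens, except that whether it happens depends on the link order.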

An individual developer is not likely to make the mistake of having the constructors of two different global variables each require the other to have completed before it runs. But this kind of thing can easily happen in a large scale application development activity where the users of a piece of code have no idea how it works. And isn't this the ultimate goal of programming: to be able to use other people's code without understanding all its details and nuances?

To make code re-use less of a long term pain, it behooves developers to follow the basic rule that class object constructors and static variable initializer expressions should not refer to class static methods or external variables. When you must break this rule, at least arrange the accesses to external variables so that link order does not matter (except perhaps for a single program specific module that defines "known good" global variables on which you can operate).
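One widely used way to make link order irrelevant, not named in the text above, is the "construct on first use" idiom: the shared object is hidden behind a function and held in a function-local static, so it is constructed the first time anyone asks for it. The sketch below assumes a hypothetical registry() function and Plugin class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry that other globals want to use while they are
// themselves being constructed.
std::vector<std::string>& registry() {
    // Function-local static: constructed the first time control passes
    // through this declaration (thread-safe since C++11).
    static std::vector<std::string> names;
    return names;
}

// Hypothetical class whose constructor touches shared state.
struct Plugin {
    explicit Plugin(const std::string& name) { registry().push_back(name); }
};

// These globals could live in any object file. Each constructor forces
// the registry into existence before using it.
Plugin alpha("alpha");
Plugin beta("beta");
```

The trade-off is that the order of destruction at program exit needs the same care: a global destructor that calls registry() after the vector has been destroyed has the mirror-image problem.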

For example, an application architect might specify that the main() function will always be in the first module that gets linked (again, or the "last", depending on the OS). This module can define all the global variables that are "known" to be involved in cross-module static initialization use.
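A related way to obtain a "known good" global, offered here as a sketch rather than as the approach the paragraph above prescribes, is to give its class a constexpr constructor. Such a global is constant-initialized, which happens before any initializer that runs code, whatever the link order. Counter, instance_counter and Widget are hypothetical names.

```cpp
#include <cassert>

// Hypothetical counter type. The constexpr constructor lets a global
// instance be constant-initialized, before any dynamic initializer runs.
class Counter {
public:
    constexpr Counter() : value_(0) {}
    int next() { return ++value_; }
    int value() const { return value_; }
private:
    int value_;
};

// "Known good" global: safe to use from initializers in other modules.
Counter instance_counter;

// Hypothetical client type that could be defined in some other module.
struct Widget {
    int id;
    Widget() : id(instance_counter.next()) {}
};

Widget first_widget;
Widget second_widget;
```

In C++20 the definition can be written as constinit Counter instance_counter; so that the compiler rejects it if constant initialization is not possible. The sketch leaves that out to stay compatible with C++11.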

Note: most operating systems provide a way of analyzing object modules to determine which static/external variables they reference (the nm utility on Unix-like systems, for example). Scripts can be written to help sort out such issues if you are facing an existing large scale application with a static initialization problem.