Last Updated Mar-17-2016

This page discusses key techniques for developing multi-threaded C++ applications with the high quality and high performance required in telecom, financial, medical, and military applications.

The above market segments have one thing in common: failure is not an option.

Most software being developed today takes the attitude that not seeming to crash is the goal. But in the markets targeted by this page, far more is required: doing the right thing, all the time -- either because your life, or at bare minimum your financial future, is on the line.

Telecom, Financial, Medical, and Military Applications

Unlike many commercial applications, telecom, financial, military, and medical applications require high quality. Many commercial applications can die and automatically restart with impunity:
If you are listening to a song on a music service, it is of little consequence if the music hangs: you can just press "next" and let the service play the next song in your playlist with little or no impact on your party. Even if you have to kill your phone app and restart it, it's no big deal.
But if a program controlling a surgical robot hangs, you may bleed to death before it can be rebooted. Your plane might crash during a reboot sequence. Hackers might have a window in which to steal all your money if your bank's system hangs. Phone calls might stay connected for hours -- resulting in big telephone bills.

This document discusses strategies for developing multi-threaded applications that reduce the risk of hangs, thread-related malfunctions, and outages.

Overview of the Rules

This section lists the rules at a high level and subsequent sections discuss them in detail:
  1. Programs have several execution phases. Put the right code in the right phase:
    • static initialization, which occurs before main() begins
    • after main() begins and before multi-threading starts
    • while the threads are running normally
    • while the threads should be shutting themselves down
    • after all threads are shut down
  2. Don't code up a lot of calls to low-level threads functions; instead, write or use an existing threads library:
    • boost::thread is a good example
    • See my light-version at, directory cxx/include/cxxtls:
      • threads.h
        • scoped mutexes
        • condition variables
        • logging functions
      • threadPool.h ( a collection of command handler threads serving a single input queue )
      • threadGate.h
      • threadMonitoredSet.h ( a generic set that is thread safe )
      • threadSafeHandle.h ( the handle is safe but not the thing it points to)
      • threadSafeInt.h ( sluggish compared to atomic_int in the standard library)
  3. Use scoped mutex lock class objects to lock mutexes in order to automate the unlock.
  4. Never write code that has Sleep()/usleep()/nanosleep() calls as part of your thread logic.

    Instead do one of the following:
    The New Way

    Use condition variables to wait instead.

    The Old Way
    The old-fashioned way was to use the select() system call to wait for a single-byte write to a pipe. If a reader thread is waiting on a pipe for a "flag byte" to arrive, then one or more writer threads can write a single byte to the pipe to wake the reader up. The reader thread should consume all the bytes in the pipe, do something based on the arrival of that data, and then go back to reading.

    Of course, a pipe buffer's capacity is limited to some size such as 4096 bytes -- so the algorithm will fail if too many flag bytes are fed into the pipe. But if the reader is designed to be fast, the probability of this can be made vanishingly small.

    This approach required thread-safe data structures to hold the rest of the data (other than the 1-byte flag which signaled its availability). An atomically updated pointer to some circular buffer was the standard solution.

    This alternative is described only for historical context -- use condition variables if they are available.

  5. Never unlock a mutex in a different thread than the one which locked it
    • it is a violation of the POSIX threading definition
    • and leads to nasty race conditions
    • Use condition variables alongside mutexes when you need to simulate having multiple unlocker threads
  6. Make thorough use of exceptions in your threaded code
  7. Only code written to be either re-entrant or thread-safe should be used in a threaded application

    But there's more to it than that. Threads are not like little programs. They can't be killed consistently and safely, but they can be forced to handle exceptions and exit early when an exception is thrown. An exit can mean that the thread terminates, or it can mean that a pooled-command-handler-thread simply goes back to servicing the command queue.

    But how does such an exception get thrown? Obviously, there are many ways to do that, but the most straightforward way is to write the read and write function calls so they check for errors that mean a pipe has terminated, or that I/O cancellation has occurred. Some sort of thread-safe global flag can be set external to the thread, but the thread will have to check it periodically to determine that an exception needs to be thrown.

    For example:

    • signal handlers can be written to throw exceptions.
    • signals can be sent to specific threads
    • On Windows, the CancelIoEx function can be used to cancel all I/O in a specific thread.

    Code written to run in a multithreaded application should be written such that file reads and writes are checked for the needed signals or I/O cancellations and throw the appropriate exceptions.

  8. Don't create and destroy threads dynamically as your program runs
    • The process is so slow that it cancels out the value of using threads
    • Do use thread command pools
    • Pre-starting non-terminating task-specific threads up front is good too.
  9. Don't unload and reload a shared library
  10. Avoid Recursive Mutexes
    • They are not the default POSIX mutex type, and some implementations lack them entirely
    • They can be simulated, but the simulation is slow
    • They usually represent work-arounds for bad design choices
  11. Don't make all the code in an application multithread aware
    • all code should still avoid static variables and shared variables
    • but putting mutex locks in libraries will just slow down the application and perhaps require recursive mutexes
    • design applications to know about threading only at high levels -- like top level input queues.
  12. Don't share variables between threads
    • "static", "global", and heap variables shared between threads should be viewed as immediately suspect.
    • function static variables of class object type should be avoided at all costs when using threads. Depending on the implementation, the initialization may occur the first time the function is called. If two threads call the function for the first time simultaneously, the object can be constructed twice or used while half-constructed -- a likely crash. (C++11 requires compilers to make this initialization thread-safe, but older compilers do not guarantee it.)
    • Mutexes let you share variables but they are SLOW
    • Thread Safe Constants can be shared without locks
      • but they must be fully initialized before threading begins.
    • Thread Safe Queues et al. can be used to communicate data between threads
  13. Be CAREFUL passing std::strings between threads

    std::string variables require great care because some implementations share hidden state between copies (copy-on-write buffers). These hidden linkages are not thread safe. However, there is a straightforward rule for working with such strings:

    • never copy-construct or assign a string variable into a data structure to be used by another thread.
    • instead, default construct an empty string in the shared data structure.
    • then use the assign() function of std::basic_string to copy the character data into the shared object.
  14. Use only thread-safe logging functions
    • File handles tend to be thread safe
    • But STDOUT and STDERR become unintelligible due to the intermingling of output from multiple threads
    • Use thread safe log macros/functions
    • Time stamp log messages to nano-second accuracy
      • so that the "sort" command can be used to merge log files from multiple processes onto the same time line.
      • the log messages should look like this:
        YYYY-MM-DD:H24:mm:ss:nnnnnnnnn ThreadTypeID: functionName: Message ....
    Where nnnnnnnnn is the nano-second portion of the current second -- or as close to nano-seconds as is convenient. Your OS may only really provide milliseconds.
  15. If you must use a non-thread-safe library in a threaded application, make all calls to it from a single thread. Other threads should use a thread-safe command queue to issue commands to that thread. Some examples:
    • Windowing system library (X-Windows, Microsoft Graphical Routines)
    • CURSES
  16. Create a mutex-guarded "create process" function that ensures that no more than one thread at a time can launch sub-processes.
    • If you don't, inherited file handles will get passed to the wrong child processes
    • This leads to program hangs: something waits for an inherited file to be closed, but no one ever closes it
  17. Learn to use the multi-threaded debug features of your debugger. For instance:
    • Find out how to see the existing threads and look at their stack frames
    • Learn how to freeze/thaw threads (pause and resume individual threads while the rest can be allowed to run).

Scoped Mutex Unlocks

Remembering to unlock a mutex is very difficult to get right all the time -- especially when exceptions are involved. The consequences of forgetting can be terrible, and the mistake is often terribly hard to find. To avoid all this pain, take advantage of language features that automate the unlocking for you:

Here is an example of doing this in C++ using boost:

// at file scope or in a class object shared by all affected threads
boost::mutex lockvariable;
int protectedVariable;

...

// in some function
{
    boost::mutex::scoped_lock locker(lockvariable);
    protectedVariable = 10;
    SomeFunc(); // might throw
} // unlock occurs here automatically -- even if an exception is thrown.

Java and C# destructors are not guaranteed to run at the end of scope, so they cannot be used for this purpose. However, the using and finally statements can serve a similar purpose.

After Main Begins and Before the First Thread Is Created

Threads should not share variables, if at all possible -- and it isn't always possible. But sharing static constants is fine -- so long as they really are constants.

Static cache variables are a bad idea -- move the static caches into thread-specific variables and you overcome the problem.

If there are any static constants to be shared, they should be fully constructed before multi-threading begins. Basically: in main(), populate the global shared constants, then create your first thread.

Highly Stylized Exception Code Sequences

Inserting pointers into a container can result in exceptions being thrown by the container.

When pushing normal objects into the container, the compiler and the container work together to prevent memory leaks. But when pushing pointers into a container, much of that burden falls on the programmer. Here's an example of how to do it.

std::list<SomeClass*> container;

std::auto_ptr<SomeClass> ap(new SomeClass());  // always use auto_ptr when exceptions must be handled

// now append ap into container in an exception safe manner
{
    // Do not separate the following lines of code:
    container.push_back(ap.get());
    ap.release();
}
Note that this technique is discussed in several places on the internet. And despite the horrifying situation where both the auto_ptr and the list temporarily own the same object, this is a fairly good trade-off of risks. It is advisable to put comments around blocks of code, like this, which must not be separated during later maintenance activities.

My fellow programmer, Stephen, once gave me a ration of grief for separating such lines of code in a program that he wrote and which I was maintaining. I did not understand that the two statements were related, and so inserted other code between them that used the auto_ptr. My code might have thrown an exception, and without the protection of the above logic, both the auto_ptr and the list would have called the destructor on the same object -- likely causing a crash.

My take-away from this was two things: understand why adjacent statements exist before inserting code between them, and put comments around blocks of code that must not be separated.

Threads Can't Be Killed

Some operating systems don't even permit threads to be killed at all.

Some do -- but even in those cases, killing them is a bad idea: any mutexes the dead thread held stay locked forever, and any shared data it was updating is left in an unknown state.

Static Initialization

The static initialization of constant values defined outside function bodies is generally not a problem unless the data is not truly constant. In that case, either change the code not to use static memory, or protect it with a mutex. The alternative is to make a thread-specific copy of the static data -- presumably in a class object somewhere.

Static initializations inside function bodies are very dangerous because they give the false impression that the compiler is somehow protecting the data for you -- but before C++11 it is not required to.

Reloading DLLs and Shared Libraries: baaaad

Reloading previously unloaded DLLs and shared libraries is risky in most cases, and definitely bad if threaded code calls functions in a library that has been reloaded.

When you load or reload shared code (a DLL or .so file), the static constructors run on the data stored in the library. This data may already be in use by other threads and will be corrupted by the reloading process.

What happens to shared data when you unload shared code is not guaranteed.

Re-entrant or Thread Safe Code

Technically, thread-safe code is code that works correctly when called from multiple threads at the same time, and re-entrant code is code that can safely be invoked again before an earlier invocation has finished. If you must use unsafe code in a multithreaded application, assign one thread to own that code and all its data -- and have it read a thread-safe command queue so that other threads can make calls to the unsafe code.

A typical example of this is a graphical drawing library, or the CURSES library, but the logic applies to any third party tool.

Race Conditions

The name implies that something is going very fast, but the typical signature of a race condition is that the program either crashes or simply hangs and does nothing.

The problem results from two threads trying to modify unprotected memory at the same time.

If both mistakenly gain access, a crash may result -- for example, both threads deleting the same memory.

Alternatively, if the trashed memory belongs to a mutex, neither thread gains access and both are locked out permanently -- a hang.


Condition Variables

Condition variables are an oddly-named companion to mutexes. They exist for two primary purposes:
  1. reducing CPU utilization by allowing threads to "sleep" instead of polling to see if a mutex is unlocked.
  2. allowing control over the number of sleeping threads that get to wake up when the mutex is unlocked (and thus ready to be relocked)
Condition variables are used in conjunction with mutexes and are the heart and soul of thread safe queues. It is forbidden for one thread to unlock a mutex which was locked by a different thread. A condition variable can seemingly break that rule in a structured, thread-safe way.

Use google to get the details of the condition variable implementation nearest you, but here is the basic idea:

  1. A condition variable is paired with a mutex and a predicate (some testable condition on shared data).
  2. A waiting thread locks the mutex, checks the predicate, and if it is false calls wait() -- which atomically unlocks the mutex and puts the thread to sleep.
  3. A notifying thread locks the mutex, changes the shared data, unlocks the mutex, and calls notify. Each awakened thread re-acquires the mutex and re-checks the predicate before proceeding.

Given the above explanation, it should be clear that wait() is roughly like a mutex lock and notify is roughly like an unlock -- but in a thread-safe manner.

So why not just use thread-safe queues? Most of the time, they work fine for most algorithms. But if an algorithm relies on pushing data into a thread-safe queue just so that the threads waiting on it can wake up and do something, that is a sign that a condition variable should be used directly -- particularly if the data pushed into the queue is not actually used by the function waiting for it.

Here are a couple of examples of other uses of condition variables:

As a Starting Gate
Since the notifyAll method wakes up all the threads waiting on the condition variable, let multiple worker threads perform the wait call to queue themselves up at the starting gate, like horses in a race, then have another thread fire the starting pistol by calling notifyAll().
As a Finish Line
Suppose a master thread wants to know when all worker threads have completed a task. Condition variables allow this to work without the master thread having to poll for final completion.

In this case, let each worker thread lock the mutex that the condition variable controls, increment a counter protected by that mutex, call notifyOne(), and then go on about its business.

The master thread does the following:

  1. lock the mutex
  2. repeatedly does the following:
    1. call the wait method
    2. when it returns, check whether the count of completed worker threads equals the number of worker threads: if so, break out of this loop; if not, call wait() again and repeat.
As a Bulletin Board
Imagine that there is a list of tasks which are being worked on. One or more threads may wish to wait until a particular task is finished.

Let the list of running tasks be guarded by a mutex which is paired with a condition variable.

A worker thread completing a task should lock that mutex, remove the task's identifier from the list, unlock it, and then call notifyAll().

The threads waiting for notification of the completion of that particular task will be awakened. They can then check the list for the interesting task id. If it is gone from the list, they continue about their work. If it is still in the list, they go back to waiting on the condition variable.

Yes, this approach means extraneous wake-up calls occur, but the code immediately goes back to sleep when that happens.

The End