Storage duration and Non-local Objects in C++

C++ allows us to declare various forms of non-local objects, which usually live for the entire execution of the program. In this article, we’ll look at global variables, dynamic objects, and thread-local objects. We’ll also consider the new C++20 features for safe initialization.

This text comes from my book “C++ Initialization Story”.

Get the book @Leanpub or @Amazon.

Storage duration and linkage

To start, we need to understand two key properties of an object in C++: storage duration and linkage. Let’s begin with the definition of storage duration, from [basic.stc#general]:

The storage duration is the property of an object that defines the minimum potential lifetime of the storage containing the object. The storage duration is determined by the construct used to create the object.

An object in C++ has one of the following storage duration options:

Storage duration	Explanation
automatic	The storage is allocated when the enclosing block is entered and released when the block exits. Most local variables have automatic storage duration (except those declared as `static`, `extern`, or `thread_local`).
static	The storage for an object is allocated when the program begins (usually before the `main()` function starts) and deallocated when the program ends. There’s only one instance of such an object in the whole program.
thread	The storage for an object is tied to a thread: it’s allocated when the thread begins and deallocated when the thread ends. Each thread has its own “copy” of that object.
dynamic	The storage for an object is allocated and deallocated using explicit dynamic memory allocation functions. For example, by the call to `new`/`delete`.

And the definition for the second property: linkage, extracted from [basic.link]:

A name is said to have linkage when it can denote the same object, reference, function, type, template, namespace, or value as a name introduced by a declaration in another scope.

We have several linkage types:

Linkage	Explanation
external linkage	External means that the name can be referred to from the scopes in the same or other translation units. Non-const global variables have external linkage by default.
module linkage	Available since C++20. The name can be referred to from scopes in the same module unit or in other translation units of the same named module.
internal linkage	The name can be referred to only from scopes in the same translation unit. For example, `static`, `const`, and `constexpr` global variables have internal linkage.
no linkage	The name can be referred to only from the scope in which it is declared.
language linkage	Allows interoperability between different programming languages, usually with C. For example, by declaring `extern "C"`.

If we work with regular variables declared in a function’s scope, the storage is automatic, and there’s no linkage, but those properties matter for objects in a global or thread scope. In the following sections, we’ll try experiments with global objects to understand the meaning of those definitions.

Static duration and external linkage

Consider the following code:

#include <cstdio>   // puts
#include <iostream>

struct Value {
    Value(int x) : v(x) { std::cout << "Value(" << v << ")\n"; }
    ~Value() noexcept { std::cout << "~Value(" << v << ")\n"; }

    int v {0};
};

Value v{42};

int main() {
    puts("main starts...");
    Value x { 100 };
    puts("main ends...");
}

Run @Compiler Explorer

If we run the example, you’ll see the following output:

Value(42)
main starts...
Value(100)
main ends...
~Value(100)
~Value(42)

In the example, there’s a structure called Value, and I declare and define a global variable v. As you can see from the output, the object is initialized before the main() function starts and is destroyed after main() ends.

The global variable v has a static storage duration and external linkage. On the other hand, the second variable, x, has no linkage and automatic storage duration (as it’s a local variable).

If we have two translation units: main.cpp and other.cpp, we can point to the same global variable by declaring and defining an object in one place and then using the extern keyword to provide the declaration in the other translation unit. This is illustrated by the following example:

// main.cpp
#include <iostream>
#include "value.h"

Value v{42};
void foo();

int main() {
    std::cout << "in main(): " << &v << '\n';
    foo();
    std::cout << "main ends...\n";
}

// other.cpp
#include <iostream>
#include "value.h"

extern Value v; // declaration only!

void foo() {
    std::cout << "in foo(): " << &v << '\n';
}

Run @Wandbox

If we run the code, you’ll see that the address of v is the same in both lines. For instance:

Value(42)
in main(): 0x404194
in foo(): 0x404194
main ends...
~Value(42)

Internal linkage

If you want two global variables visible as separate objects in each translation unit, you need to define them as static. This will change their linkage from external to internal.

// main.cpp
#include <iostream>
#include "value.h"

static Value v{42};
void foo();

int main() {
    std::cout << "in main(): " << &v << '\n';
    foo();
    std::cout << "main ends...\n";
}

// other.cpp
#include <iostream>
#include "value.h"

static Value v { 100 };

void foo() {
    std::cout << "in foo(): " << &v << '\n';
}

Run @Wandbox

Now, you have two different objects which live in the static storage (outside main()):

Value(42)
Value(100)
in main(): 0x404198
in foo(): 0x4041a0
main ends...
~Value(100)
~Value(42)

You can also achieve this by wrapping objects in an anonymous namespace:

namespace {
    Value v{42};
}

Additionally, if you declare const Value v{42}; at global scope in one translation unit, then const implies internal linkage. If you want to have a const object with external linkage, you need to add the extern keyword:

// main.cpp:
extern const Value v { 42 }; // declaration and definition!

// other.cpp:
extern const Value v; // declaration

While constant global variables might be useful, try to avoid mutable global objects. They complicate the program’s state and may introduce subtle bugs or data races, especially in multithreaded programs. In this chapter, we cover all global variables so that you can understand how they work, but use them carefully. See this C++ Core Guideline: I.2: Avoid non-const global variables.

Thread local storage duration

Since C++11, you can use a new keyword, thread_local, to indicate the special storage of a variable. A thread_local object can be declared at a local scope or at a global scope. In both cases, its initialization is tied to a thread, and the storage is located in the Thread Local Storage space. Each thread that uses this object creates a copy of it.

#include <iostream>
#include <thread>
#include <mutex>

std::mutex mutPrint;
thread_local int x = 0;

void foo() {
    thread_local int y = 0;
    std::lock_guard guard(mutPrint);
    std::cout << "in thread\t" << std::this_thread::get_id() << " ";
    std::cout << "&x " << &x << ", ";
    std::cout << "&y " << &y << '\n';
}

int main() {
    std::cout << "main\t" << std::this_thread::get_id() << " &x " << &x << '\n';

    std::jthread worker1 { foo };
    foo();
    std::jthread worker2 { foo };
    foo();
}

Run @Compiler Explorer

And here’s a possible output:

main        4154632640 &x 0xf7a2a9b8
in thread   4154632640 &x 0xf7a2a9b8, &y 0xf7a2a9bc
in thread   4154628928 &x 0xf7a29b38, &y 0xf7a29b3c
in thread   4154632640 &x 0xf7a2a9b8, &y 0xf7a2a9bc
in thread   4146236224 &x 0xf7228b38, &y 0xf7228b3c

The example uses a mutex mutPrint to synchronize printing to the output. First, inside main(), you can see the ID of the main thread and the address of the x variable. Later in the output, you can see that foo() was called, and it’s done in the main thread (compare the IDs). As you can see, the addresses of x are the same because it’s the same thread. On the other hand, later in the output, we can see an invocation from two different threads; in both cases, the addresses of x and y are different. In summary, we have three distinct copies of x and three of y.

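From the example above, we can also spot that within a single thread, thread_local in a function scope behaves like a static local variable. What’s more, at function scope the following two declarations are equivalent, because thread_local implies static there:

// function (block) scope...
static thread_local int x; 
thread_local int y;        // means the same as above

At namespace scope, both forms still give thread storage duration, but adding static also changes the linkage of the name from external to internal.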

The code uses std::jthread from C++20, which automatically joins to the caller thread when the jthread object goes out of scope. When you use std::thread you need to call join() manually.

Thread local variables might be used when you want a shared global state, but keep it only for a given thread and thus avoid synchronization issues. To simulate such behavior and understand those types of variables, we can create a map of variables:

std::map<std::thread::id, Object> objects;

And each time you access a global variable, you need to access it via the current thread id, something like:

objects[std::this_thread::get_id()] = x; // modify the global object...

Of course, the above code is just a simplification, and thanks to thread_local, all details are hidden by the compiler, and we can safely access and modify objects.

In another example, we can observe when each copy is created. Have a look:

#include <iostream>
#include <thread>
#include "value.h"

thread_local Value x { 42 };

void foo() {
    std::cout << "foo()\n";
    x.v = 100;
}

int main() {
    std::cout << "main " << std::this_thread::get_id() << '\n';
    {
        std::jthread worker1 { foo };
        std::jthread worker2 { foo };
    }
    std::cout << "end main()\n";
}

Run @Compiler Explorer

Possible output:

main 4154399168
foo()
Value(42)
foo()
Value(42)
~Value(~Value(100)
100)
end main()

This time the variable x prints a message from its constructor and destructor, and thus we can see some details. Only two foo thread workers use this variable, and we have two copies, not three (the main thread doesn’t use the variable). In this output, each copy is created when its thread first uses x (notice that foo() is printed before Value(42)), and it is destroyed when that thread ends, before the thread is joined into the main thread.

As an experiment, you can try commenting out the line with x.v = 100. After the compilation, you won’t see any Value constructor or destructor calls. It’s because the object is not used by any thread, and thus no object is created.

Possible use cases:

Having a random number generator, one per thread
One thread processes a server connection and stores some state across
Keeping some statistics per thread, for example, to measure load in a thread pool.

Dynamic storage duration

For completeness, we also have to mention dynamic storage duration. In short, by requesting memory through explicit calls to memory management routines, you have full control over when the object is created and destroyed. In the most basic scenario, you can call new and then delete:

#include <memory>   // std::make_unique, std::unique_ptr

auto pInt = new int{42}; // only for illustration...
auto pSmartInt = std::make_unique<int>(42);
int main() {
    auto pDouble = new double { 42.2 }; // only for illustration...
    // use pInt...
    // use pDouble
    delete pInt;
    delete pDouble;
}

The above artificial example showed three options for dynamic storage:

pInt is a non-local pointer to an object created with the new expression. We have to destroy that object manually; in this case, it’s at the end of the main() function.
pDouble is a local pointer to another dynamically allocated object; we also have to delete it manually.
On the other hand, pSmartInt is a smart pointer, a std::unique_ptr that owns a dynamically allocated int. Thanks to the RAII pattern, there’s no need to manually delete the memory, as the smart pointer will automatically do it when it is destroyed. In our case, that happens after main() returns.

Dynamic memory management is very tricky, so it’s best to rely on RAII and smart pointers to clean up the memory. The example above used raw new and delete only to show the basic usage, but in production code, try to avoid them. See more in these resources: 6 Ways to Refactor new/delete into unique ptr - C++ Stories and 5 ways how unique_ptr enhances resource safety in your code - C++ Stories.

Initialization of non-local static objects

Non-local objects with static storage duration are initialized before main() starts or, if the implementation defers the initialization, before their first “use”. But there’s more to it.

Consider the following code:

#include <cstdio>   // puts
#include <iostream>

struct Value { /*as before*/ };

double z = 100.0;
int x;
Value v{42};

int main() {
    puts("main starts...");
    std::cout << x << '\n';
    puts("main ends...");
}

Run @Compiler Explorer

All global objects z, x, and v are initialized during the program startup and before the main() starts. We can divide the initialization into two distinct types: static initialization and dynamic initialization.

The static initialization occurs in two forms:

constant initialization - this happens for the z variable, which is initialized from a constant expression.
zero initialization - the x object looks uninitialized, but for non-local static objects, the compiler performs zero initialization, which means they will take the value of zero (and then it’s converted to the appropriate type). Pointers are set to nullptr; arrays, trivial structs, and unions have their members initialized to a zero value.

Don’t rely on zero initialization for static objects. Always try to assign some value to be sure of the outcome. In the book, I only showed it so you could see the whole picture.

Now, the global object v is initialized during the so-called “dynamic initialization of non-local variables”. It happens for objects that cannot be constant-initialized or zero-initialized during static initialization at the program startup.

In a single translation unit, the order of dynamic initialization of global variables (including static data members) is well defined. If you have multiple translation units, the relative order between them is unspecified. When a global object A defined in one translation unit depends on another global object B defined in a different translation unit, A might use B before B is dynamically initialized, which can lead to undefined behavior. Such a problem is called the “static initialization order fiasco”; read more in the C++ Super FAQ.

In short, each static non-local object has to be initialized at the program startup. However, the compiler tries to optimize this process and, if possible, do as much work as possible at compile time. For example, for built-in types initialized from constant expressions, the value of the variable might be stored as a part of the binary and then only loaded during the program startup. If that’s not possible, then a dynamic initialization must happen, meaning that the value is computed once before main() starts. Additionally, the compiler might even defer the dynamic initialization until the first use of the variable but must guarantee the program’s correctness. Since C++11, we can try to move dynamic initialization to the compile-time stage thanks to constexpr (which also works for custom types). Since C++20, we can use constinit to guarantee constant initialization.

For more information, have a look at this blog post: C++ - Initialization of Static Variables by Pablo Arias, and also a presentation by Matt Godbolt: CppCon 2018 “The Bits Between the Bits: How We Get to main()”.

`constinit` in C++20

As discussed in the previous section, it’s best to rely on constant initialization if you really need a global variable. In the case of dynamic initialization, the order of initialization might be hard to guess and might cause issues. Consider the following example:

// point.h
struct Point {
    double x, y;  
};

The first translation unit:

// a.cpp
#include <iostream>
#include "point.h"

extern Point center;
Point offset = { center.x + 100, center.y + 200};

void foo() {
    std::cout << offset.x << ", " << offset.y << '\n';
}

The second translation unit:

// b.cpp
#include "point.h"

Point createPoint(double x, double y) {
    return Point { x, y };
}

Point center = createPoint(100, 200); //dynamic

And the main:

void foo();

int main() {
    foo();
}

Run all @Wandbox

If we compile this code using the following command and order:

$ g++ prog.cc -Wall -Wextra -std=c++2a -pedantic a.cpp b.cpp

We’ll get the following:

100, 200

But if you compile b.cpp first and then a.cpp:

$ g++ prog.cc -Wall -Wextra -std=c++2a -pedantic b.cpp a.cpp

You’ll get:

200, 400

There’s a dependency between global variables: offset depends on center. If the translation unit with center is listed first, its dynamic initialization runs first on this toolchain, so center already holds 100, 200 when offset is computed, and we get 200, 400. Otherwise, center is only zero-initialized at that point, and thus offset has the value of 100, 200. The standard doesn’t specify this order, so other compilers and linkers may behave differently.

(This is only a toy example, but imagine production code! In that case, you might have a hard-to-find bug that comes not from some incorrect computation logic but from the order of files in the build!)

To mitigate the issue, you can apply constinit to the center global variable. This new keyword in C++20 enforces constant initialization: if the initializer cannot be evaluated at compile time, the program fails to compile. In our case, it ensures that no matter the order of compilation, the value will already be present. What’s more, as opposed to constexpr, we only force the initialization, and the variable itself is not constant, so you can change it later.

// b.cpp:
#include "point.h"

constexpr Point createPoint(double x, double y) {
    return Point { x, y };
}
constinit Point center = createPoint(100, 200); // constant

Run @Wandbox

Please notice that createPoint has to be constexpr now. The main requirement for constinit is that it requires the initializer expression to be evaluated at compile-time, so not all code can be converted that way.

Here’s another example that summarizes how to use constinit:

#include <iostream>
#include <utility>

constinit std::pair<int, double> global { 42, 42.2 };
constexpr std::pair<int, double> constG { 42, 42.2 };

int main() {
    std::cout << global.first << ", " << global.second << '\n';
    // but allow to change later...
    global = { 10, 10.1 };
    std::cout << global.first << ", " << global.second << '\n';
    // constG = { 10, 10.1 }; // not allowed, const
}

Run @Compiler Explorer

In the above example, I create a global std::pair object and force it to use constant initialization. I can do that on all types with constexpr constructors or trivial types. Notice that inside main(), I can change the value of my object, so it’s not const. For comparison, I also included the constG object, which is a constexpr variable. In that case, we’ll also force the compiler to use constant initialization, but this time the object cannot be changed later.

While a constinit variable will be constant initialized, it cannot later be used in the initializer of another constinit variable, because its value is not a constant expression. A constinit object is not constexpr.

Static variables in a function scope

As you may know, C++ also offers another type of static variable: those defined in a function scope:

void foo() { 
    static int counter = 0;
    ++counter;
}

Above, the counter variable is initialized no later than the first time foo() is invoked; static local objects that need dynamic initialization are initialized exactly at that first call. In other words, a static local variable is initialized lazily. The counter is kept “outside” the function’s stack space. This allows us, for example, to keep state between calls while limiting the visibility of the object to the function.

#include <iostream>

int foo() { 
    static int counter = 0;
    return ++counter;
}

int main() {
    foo();
    foo();
    foo();
    auto finalCounter = foo();
    std::cout << finalCounter;
}

Run @Compiler Explorer

If you run the program, you’ll get 4 as the output.

Static local variables, since C++11, are guaranteed to be initialized in a thread-safe way. The object will be initialized only once if multiple threads enter a function with such a variable. Have a look below:

#include <iostream>
#include <thread>

struct Value {
    Value(int x) : v(x) { std::cout << "Value(" << v << ")\n"; }
    ~Value() noexcept { std::cout << "~Value(" << v << ")\n"; }

    int v { 0 };
};

void foo() {
    static Value x { 10 };
}

int main() {
    std::jthread worker1 { foo };
    std::jthread worker2 { foo };
    std::jthread worker3 { foo };
}

Run @Compiler Explorer

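The example creates three threads that call the simple foo() function. With the default settings, the static object is constructed only once, so you’ll see a single Value(10) line and, at the end of the program, a single ~Value(10) line.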

However, on GCC, you can also try compiling with the following flags:

-std=c++20 -lpthread -fno-threadsafe-statics

And then the output might be as follows:

Value(Value(1010)
)
Value(10)
~Value(10)
~Value(10)
~Value(10)

The constructor and the destructor ran three times now, even though there should be only one static object!

Quiz

Try answering the following questions:

1. How does thread storage duration work in C++?

Each thread has its own "copy" of an object with thread storage duration.
Thread storage duration objects are allocated when the program begins and deallocated when the program ends.
Thread storage duration objects are automatically synchronized across all threads in a multi-threaded program.

2. What is the difference between static initialization and dynamic initialization in C++?

Static initialization happens at compile time, while dynamic initialization happens at runtime.
Static initialization initializes non-local objects with constant expressions and constexpr constructors, while dynamic initialization initializes non-local objects that cannot be statically initialized.
Static initialization is used for automatic storage duration objects, while dynamic initialization is used for static storage duration objects.

3. What is dynamic storage duration in C++?

Dynamic storage duration means that a variable is allocated on the stack and its lifetime is limited to the scope in which it was declared.
Dynamic storage duration means that a variable is allocated in global memory and its lifetime lasts for the entire program.
Dynamic storage duration means that a variable is allocated and deallocated using explicit dynamic memory allocation functions, such as new and delete.

More in the book:

Get the full content and much more in my book:

Print version @Amazon
C++ Initialization Story @Leanpub
