Storage duration and Non-local Objects in C++
C++ allows us to declare various forms of non-local objects: they usually live throughout the execution of the whole program. In this article, we’ll look at global variables, dynamically allocated objects, and thread-local objects. We’ll also consider new C++20 features for safe initialization.
This text comes from my book “C++ Initialization Story”.
Storage duration and linkage
To start, we need to understand two key properties of an object in C++: storage and linkage. Let’s begin with the definition of storage, from [basic.stc#general]:
The storage duration is the property of an object that defines the minimum potential lifetime of the storage containing the object. The storage duration is determined by the construct used to create the object.
An object in C++ has one of the following storage duration options:
Storage duration | Explanation |
---|---|
automatic | Automatic means that the storage is allocated at the start of the enclosing scope and deallocated at its end. Most local variables have automatic storage duration (except those declared as static, extern, or thread_local). |
static | The storage for an object is allocated when the program begins (usually before the main() function starts) and deallocated when the program ends. There’s only one instance of such an object in the whole program. |
thread | The storage for an object is tied to a thread: it’s allocated when a thread begins and is deallocated when the thread ends. Each thread has its own “copy” of that object. |
dynamic | The storage for an object is allocated and deallocated using explicit dynamic memory allocation functions. For example, by the call to new/delete. |
And the definition for the second property: linkage, extracted from [basic.link]:
A name is said to have linkage when it can denote the same object, reference, function, type, template, namespace, or value as a name introduced by a declaration in another scope.
We have several linkage types:
Linkage | Explanation |
---|---|
external linkage | External means that the name can be referred to from the scopes in the same or other translation units. Non-const global variables have external linkage by default. |
module linkage | Available since C++20. A name can be referred to from other scopes of the same module unit or from scopes of other translation units of the same module. |
internal linkage | A name can be referred to from other scopes in the same translation unit. For example, static, const, and constexpr global variables have internal linkage. |
no linkage | Cannot be referred to from other scopes. |
language linkage | Allows interoperability between different programming languages, usually with C. For example, by declaring extern "C" . |
If we work with regular variables declared in a function’s scope, the storage is automatic, and there’s no linkage, but those properties matter for objects in a global or thread scope. In the following sections, we’ll try experiments with global objects to understand the meaning of those definitions.
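To make the two tables more concrete, here is a minimal sketch that puts one object of each storage duration side by side. The names (globalCounter, fileLocal, perThread, sumAllKinds) are made up for this illustration and are not part of the original examples.

```cpp
#include <memory>

int globalCounter = 1;           // static storage duration, external linkage
static int fileLocal = 2;        // static storage duration, internal linkage
thread_local int perThread = 3;  // thread storage duration

// Hypothetical helper that touches one object of each kind.
int sumAllKinds() {
    int local = 4;                         // automatic storage duration, no linkage
    auto heap = std::make_unique<int>(5);  // the int itself has dynamic storage duration
    return globalCounter + fileLocal + perThread + local + *heap;
}   // local and heap are gone here; the other three objects live on
```

Calling sumAllKinds() returns 15, and only the three non-local objects survive the call.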
Static duration and external linkage
Consider the following code:
#include <cstdio>   // for puts
#include <iostream>
struct Value {
Value(int x) : v(x) { std::cout << "Value(" << v << ")\n"; }
~Value() noexcept { std::cout << "~Value(" << v << ")\n"; }
int v {0};
};
Value v{42};
int main() {
puts("main starts...");
Value x { 100 };
puts("main ends...");
}
If we run the example, we’ll see the following output:
Value(42)
main starts...
Value(100)
main ends...
~Value(100)
~Value(42)
In the example, there’s a structure called Value, and I declare and define a global variable v. As you can see from the output, the object is initialized before the main() function starts and is destroyed after main() ends.
The global variable v has static storage duration and external linkage. On the other hand, the second variable, x, has no linkage and automatic storage duration (as it’s a local variable).
If we have two translation units, main.cpp and other.cpp, we can point to the same global variable by declaring and defining an object in one place and then using the extern keyword to provide the declaration in the other translation unit. This is illustrated by the following example:
// main.cpp
#include <iostream>
#include "value.h"
Value v{42};
void foo();
int main() {
std::cout << "in main(): " << &v << '\n';
foo();
std::cout << "main ends...\n";
}
// other.cpp
#include "value.h"
extern Value v; // declaration only!
void foo() {
std::cout << "in foo(): " << &v << '\n';
}
If we run the code, we’ll see that the address of v is the same in both lines. For instance:
Value(42)
in main(): 0x404194
in foo(): 0x404194
main ends...
~Value(42)
Internal linkage
If you want two global variables visible as separate objects in each translation unit, you need to define them as static. This will change their linkage from external to internal.
// main.cpp
#include <iostream>
#include "value.h"
static Value v{42};
void foo();
int main() {
std::cout << "in main(): " << &v << '\n';
foo();
std::cout << "main ends...\n";
}
// other.cpp
#include "value.h"
static Value v { 100 };
void foo() {
std::cout << "in foo(): " << &v << '\n';
}
Now, you have two different objects which live in the static storage (outside main()):
Value(42)
Value(100)
in main(): 0x404198
in foo(): 0x4041a0
main ends...
~Value(100)
~Value(42)
You can also achieve this by wrapping objects in an anonymous namespace:
namespace {
Value v{42};
}
Additionally, if you declare const Value v{42}; in one translation unit, then const implies internal linkage. If you want to have a const object with external linkage, you need to add the extern keyword:
// main.cpp:
extern const Value v { 42 }; // declaration and definition!
// other.cpp:
extern const Value v; // declaration
While constant global variables might be useful, try to avoid mutable global objects. They complicate the program’s state and may introduce subtle bugs or data races, especially in multithreaded programs. In this chapter, we cover all global variables so that you can understand how they work, but use them carefully. See this C++ Core Guideline: I.2: Avoid non-const global variables.
Thread local storage duration
Since C++11, you can use a new keyword, thread_local, to indicate the special storage of a variable. A thread_local object can be declared at a local scope or at a global scope. In both cases, its initialization is tied to a thread, and the storage is located in the Thread Local Storage space. Each thread that uses this object creates a copy of it.
#include <iostream>
#include <thread>
#include <mutex>
std::mutex mutPrint;
thread_local int x = 0;
void foo() {
thread_local int y = 0;
std::lock_guard guard(mutPrint);
std::cout << "in thread\t" << std::this_thread::get_id() << " ";
std::cout << "&x " << &x << ", ";
std::cout << "&y " << &y << '\n';
}
int main() {
std::cout << "main\t" << std::this_thread::get_id() << " &x " << &x << '\n';
std::jthread worker1 { foo };
foo();
std::jthread worker2 { foo };
foo();
}
And here’s a possible output:
main 4154632640 &x 0xf7a2a9b8
in thread 4154632640 &x 0xf7a2a9b8, &y 0xf7a2a9bc
in thread 4154628928 &x 0xf7a29b38, &y 0xf7a29b3c
in thread 4154632640 &x 0xf7a2a9b8, &y 0xf7a2a9bc
in thread 4146236224 &x 0xf7228b38, &y 0xf7228b3c
The example uses a mutex mutPrint to synchronize printing to the output. First, inside main(), you can see the ID of the main thread and the address of the x variable. Later in the output, you can see that foo() was called in the main thread (compare the IDs). The address of x is the same because it’s the same thread. On the other hand, the output also shows invocations from two different threads; in both cases, the addresses of x and y are different. In summary, we have three distinct copies of x and three of y.
From the example above, we can also spot that within a single thread, thread_local in a function scope behaves like a static local variable. What’s more, the two lines are equivalent:
// local or global scope...
static thread_local int x;
thread_local int y; // means the same as above
The code uses std::jthread from C++20, which automatically joins to the caller thread when the jthread object goes out of scope. When you use std::thread you need to call join() manually.
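The observation that a function-scope thread_local acts like a per-thread static can be checked with a small sketch. The helpers nextCallNumber and firstCallOnNewThread are hypothetical names chosen for this illustration, and the sketch uses std::thread with an explicit join() so it also builds before C++20.

```cpp
#include <thread>

// Hypothetical helper: counts calls separately for each thread.
int nextCallNumber() {
    thread_local int calls = 0;  // same meaning as: static thread_local int calls = 0;
    return ++calls;
}

// Calls the helper once on a fresh thread and reports what that thread saw.
int firstCallOnNewThread() {
    int result = 0;
    std::thread worker([&result] { result = nextCallNumber(); });
    worker.join();  // std::thread needs an explicit join
    return result;
}
```

On the calling thread the counter keeps growing across calls (1, 2, 3, ...), while every new thread starts again from 1 because it gets its own copy of calls.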
Thread local variables might be used when you want a shared global state, but keep it only for a given thread and thus avoid synchronization issues. To simulate such behavior and understand those types of variables, we can create a map of variables:
std::map<std::thread::id, Object> objects;
And each time you access a global variable, you need to access it via the current thread id, something like:
objects[std::this_thread::get_id()] = x; // modify the global object...
Of course, the above code is just a simplification, and thanks to thread_local, all details are hidden by the compiler, and we can safely access and modify objects.
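For the curious, the map-based simulation could be fleshed out roughly as follows. This is only a sketch of the mental model, not how compilers implement thread_local; the PerThreadInt class and the workerSeesFreshCopy function are invented for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>

// Hypothetical stand-in for "thread_local int": one int per thread id.
class PerThreadInt {
public:
    // Returns the calling thread's own copy, creating it (as 0) on first access.
    int& get() {
        std::lock_guard<std::mutex> lock(mut_);
        return values_[std::this_thread::get_id()];
    }
    // Number of per-thread copies created so far.
    std::size_t copies() {
        std::lock_guard<std::mutex> lock(mut_);
        return values_.size();
    }
private:
    std::mutex mut_;                         // protects the map itself
    std::map<std::thread::id, int> values_;  // node-based, so references stay valid
};

// The worker gets a fresh copy and cannot disturb the caller's copy.
bool workerSeesFreshCopy() {
    PerThreadInt counter;
    counter.get() = 1;  // the calling thread's copy
    int seenByWorker = -1;
    std::thread worker([&] {
        seenByWorker = counter.get();  // first access on this thread
        counter.get() = 42;            // modifies only the worker's copy
    });
    worker.join();
    return seenByWorker == 0 && counter.get() == 1 && counter.copies() == 2;
}
```

Unlike a real thread_local object, the entries in this sketch are not destroyed when their thread ends, and every access pays for a lock, which is part of why the language feature is the better tool.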
In another example, we can observe when each copy is created, have a look:
#include <iostream>
#include <thread>
#include "value.h"
thread_local Value x { 42 };
void foo() {
std::cout << "foo()\n";
x.v = 100;
}
int main() {
std::cout << "main " << std::this_thread::get_id() << '\n';
{
std::jthread worker1 { foo };
std::jthread worker2 { foo };
}
std::cout << "end main()\n";
}
Possible output:
main 4154399168
foo()
Value(42)
foo()
Value(42)
~Value(~Value(100)
100)
end main()
This time the variable x prints a message from its constructor and destructor, and thus we can see some details. Only the two worker threads running foo() use this variable, and we have two copies, not three (the main thread doesn’t use the variable). Each copy starts its lifetime when its parent thread starts and ends when the thread joins into the main thread.
As an experiment, you can try commenting out the line with x.v = 100. After the compilation, you won’t see any Value constructor or destructor calls. It’s because the object is not used by any thread, and thus no object is created.
Possible use cases:
- Having a random number generator, one per thread
- One thread processes a server connection and stores some state across
- Keeping some statistics per thread, for example, to measure load in a thread pool.
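The first use case from the list, one random number generator per thread, might look like the sketch below. The function name randomInRange is an assumption made for this example; the engine and distribution come from the standard <random> header.

```cpp
#include <random>

// Hypothetical helper: each thread lazily creates and seeds its own engine,
// so no locking is needed around the generator state.
int randomInRange(int lo, int hi) {
    thread_local std::mt19937 engine{ std::random_device{}() };
    std::uniform_int_distribution<int> dist(lo, hi);  // inclusive range [lo, hi]
    return dist(engine);
}
```

Any thread can call randomInRange(1, 6) without synchronization, because the engine state is never shared between threads.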
Dynamic storage duration
For completeness, we also have to mention dynamic storage duration. In short, by requesting memory through explicit calls to memory management routines, you have full control over when the object is created and destroyed. In the most basic scenario, you can call new and then delete:
#include <memory>   // for std::make_unique
auto pInt = new int{42}; // only for illustration...
auto pSmartInt = std::make_unique<int>(42);
int main() {
auto pDouble = new double { 42.2 }; // only for illustration...
// use pInt...
// use pDouble
delete pInt;
delete pDouble;
}
The above artificial example showed three options for dynamic storage:
- pInt is a non-local object initialized with the new expression. We have to destroy it manually; in this case, it’s at the end of the main() function.
- pDouble is a local variable that is also dynamically initialized; we also have to delete it manually.
- On the other hand, pSmartInt is a smart pointer, a std::unique_ptr that is dynamically initialized. Thanks to the RAII pattern, there’s no need to manually delete the memory, as the smart pointer will automatically do it when it goes out of scope. In our case, it will be destroyed after main() shuts down.
Dynamic memory management is very tricky, so it’s best to rely on RAII and smart pointers to clean up the memory. The example above used raw new and delete only to show the basic usage, but in production code, try to avoid it. See more in these resources: 6 Ways to Refactor new/delete into unique ptr - C++ Stories and 5 ways how unique_ptr enhances resource safety in your code - C++ Stories.
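To see the RAII cleanup in action, here is a small sketch that tracks how many objects are alive. The Tracked type, the aliveCount counter, and the aliveInsideScope function are hypothetical and exist only for this illustration.

```cpp
#include <memory>

int aliveCount = 0;  // number of Tracked objects currently alive (illustration only)

struct Tracked {
    Tracked() { ++aliveCount; }
    ~Tracked() { --aliveCount; }
};

// Returns how many Tracked objects were alive while the smart pointer was in scope.
int aliveInsideScope() {
    int inside = 0;
    {
        auto p = std::make_unique<Tracked>();  // dynamic storage, owned by p
        inside = aliveCount;
    }  // p goes out of scope here and deletes the object for us
    return inside;
}
```

The function returns 1, and afterwards aliveCount is back to 0 even though the code never calls delete.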
Initialization of non-local static objects
All non-local objects are initialized before main() starts and before their first “use”. But there’s more to that.
Consider the following code:
#include <cstdio>   // for puts
#include <iostream>
struct Value { /*as before*/ };
double z = 100.0;
int x;
Value v{42};
int main() {
puts("main starts...");
std::cout << x << '\n';
puts("main ends...");
}
All global objects z, x, and v are initialized during the program startup and before main() starts. We can divide the initialization into two distinct types: static initialization and dynamic initialization.
The static initialization occurs in two forms:
- constant initialization - this happens for the z variable, which is initialized from a constant expression.
- zero initialization - the x object looks uninitialized, but for non-local static objects, the compiler performs zero initialization, which means they will take the value of zero (converted to the appropriate type). Pointers are set to nullptr, and arrays, trivial structs, and unions have their members initialized to a zero value.
Don’t rely on zero initialization for static objects. Always try to assign some value to be sure of the outcome. In the book, I only showed it so you could see the whole picture.
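Purely as an illustration of zero initialization (the advice above still stands), the following sketch declares a few globals without initializers and checks their values. All names here are invented for the example.

```cpp
struct Trivial { int a; double b; };

// No initializers: for objects with static storage duration
// zero initialization still takes place.
int gInt;
int* gPtr;
double gArr[3];
Trivial gTrivial;

// Hypothetical check that every global above starts out as zero.
bool globalsAreZero() {
    return gInt == 0 && gPtr == nullptr
        && gArr[0] == 0.0 && gArr[1] == 0.0 && gArr[2] == 0.0
        && gTrivial.a == 0 && gTrivial.b == 0.0;
}
```

The same declarations inside a function would leave the values indeterminate, which is one more reason to write the initializer explicitly.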
Now, the v global object is initialized during the so-called “dynamic initialization of non-local variables”. It happens for objects that cannot be constant initialized or zero-initialized during static initialization at the program startup.
In a single translation unit, the order of dynamic initialization of global variables (including static data members) is well defined. If you have multiple compilation units, then the order across them is unspecified. When a global object A defined in one compilation unit depends on another global object B defined in a different translation unit, A might read B before B’s dynamic initialization has run, and you’ll have undefined behavior. Such a problem is called the “static initialization order fiasco”; read more in the C++ Super FAQ.
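One workaround described in the C++ Super FAQ is the “construct on first use” idiom: wrap each global in a function that holds it as a static local. The sketch below assumes two hypothetical objects named after A and B from the paragraph above; in a real project the two functions could live in different translation units.

```cpp
struct Point { double x, y; };

// B: created the first time someone asks for it.
Point& objectB() {
    static Point instance{100, 200};
    return instance;
}

// A depends on B; calling objectB() guarantees B is initialized first,
// no matter which translation unit gets initialized earlier.
Point& objectA() {
    static Point instance{objectB().x + 100, objectB().y + 200};
    return instance;
}
```

With this arrangement objectA() always yields 200, 400. The idiom trades the ordering problem for a function call on each access, so treat it as one option rather than a universal fix.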
In short, each static non-local object has to be initialized at the program startup. However, the compiler tries to optimize this process and, if possible, do as much work as possible at compile time. For example, for built-in types initialized from constant expressions, the value of the variable might be stored as a part of the binary and then only loaded during the program startup. If it’s not possible, then a dynamic initialization must happen, meaning that the value is computed once before main() starts. Additionally, the compiler might even defer the dynamic initialization until the first use of the variable but must guarantee the program’s correctness. Since C++11, we can try to move dynamic initialization to the compile-time stage thanks to constexpr (allowing us to write custom types). Since C++20, we can use constinit to guarantee constant initialization.
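As a rough illustration of how constexpr lets a custom type take part in constant initialization, consider the sketch below. The Rgb type and the names kOrange and currentColor are assumptions made for this example.

```cpp
// A hypothetical literal type: its constexpr constructor allows
// objects to be created at compile time.
struct Rgb {
    int r, g, b;
    constexpr Rgb(int red, int green, int blue) : r(red), g(green), b(blue) {}
    constexpr int sum() const { return r + g + b; }
};

constexpr Rgb kOrange{255, 165, 0};
static_assert(kOrange.sum() == 420, "evaluated at compile time");

// A mutable global whose initializer is a constant expression,
// so it is eligible for constant initialization.
Rgb currentColor = kOrange;
```

The static_assert proves that the constructor and sum() ran during compilation; without the constexpr constructor, kOrange could not be declared constexpr at all.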
For more information, have a look at this good blog post: C++ - Initialization of Static Variables by Pablo Arias, and also a presentation by Matt Godbolt: CppCon 2018 “The Bits Between the Bits: How We Get to main()”.
constinit in C++20
As discussed in the previous section, it’s best to rely on constant initialization if you really need a global variable. In the case of dynamic initialization, the order of initialization might be hard to guess and might cause issues. Consider the following example:
// point.h
struct Point {
double x, y;
};
one:
// a.cpp
#include <iostream>
#include "point.h"
extern Point center;
Point offset = { center.x + 100, center.y + 200};
void foo() {
std::cout << offset.x << ", " << offset.y << '\n';
}
two:
// b.cpp
#include "point.h"
Point createPoint(double x, double y) {
return Point { x, y };
}
Point center = createPoint(100, 200); //dynamic
And the main:
void foo();
int main() {
foo();
}
If we compile this code using the following command and order:
$ g++ prog.cc -Wall -Wextra -std=c++2a -pedantic a.cpp b.cpp
We’ll get the following:
100, 200
But if you compile b.cpp first and then a.cpp:
$ g++ prog.cc -Wall -Wextra -std=c++2a -pedantic b.cpp a.cpp
You’ll get:
200, 400
There’s a dependency of global variables: offset depends on center. If the compilation unit with center were compiled first, the dynamic initialization would be performed, and center would have 100, 200 assigned, so offset ends up as 200, 400. Otherwise, center is only zero-initialized, and thus offset has the value of 100, 200.
(This is only a toy example, but imagine a production code! In that case, you might have a hard-to-find bug that comes not from some incorrect computation logic but from the compilation order in the project!)
To mitigate the issue, you can apply constinit on the center global variable. This new keyword for C++20 forces constant initialization. In our case, it will ensure that no matter the order of compilation, the value will already be present. What’s more, as opposed to constexpr, we only force initialization, and the variable itself is not constant. So you can change it later.
// b.cpp:
#include "point.h"
constexpr Point createPoint(double x, double y) {
return Point { x, y };
}
constinit Point center = createPoint(100, 200); // constant
Please notice that createPoint has to be constexpr now. The main requirement for constinit is that the initializer expression must be evaluated at compile-time, so not all code can be converted that way.
Here’s another example that summarizes how to use constinit:
#include <iostream>
#include <utility>
constinit std::pair<int, double> global { 42, 42.2 };
constexpr std::pair<int, double> constG { 42, 42.2 };
int main() {
std::cout << global.first << ", " << global.second << '\n';
// but allow to change later...
global = { 10, 10.1 };
std::cout << global.first << ", " << global.second << '\n';
// constG = { 10, 10.1 }; // not allowed, const
}
In the above example, I create a global std::pair object and force it to use constant initialization. I can do that on all types with constexpr constructors or trivial types. Notice that inside main(), I can change the value of my object, so it’s not const. For comparison, I also included the constG object, which is a constexpr variable. In that case, we’ll also force the compiler to use constant initialization, but this time the object cannot be changed later.
While a constinit variable will be constant initialized, it cannot be later used in the initializer of another constinit variable. A constinit object is not constexpr.
Static variables in a function scope
As you may know, C++ also offers another type of static variable: those defined in a function scope:
void foo() {
static int counter = 0;
++counter;
}
Above, the counter variable will be initialized and created when foo() is invoked for the first time. In other words, a static local variable is initialized lazily. The counter is kept “outside” the function’s stack space. This allows us, for example, to keep the state but limit the visibility of the global object.
#include <iostream>
int foo() {
static int counter = 0;
return ++counter;
}
int main() {
foo();
foo();
foo();
auto finalCounter = foo();
std::cout << finalCounter;
}
If you run the program, you’ll get 4 as the output.
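The lazy part is easier to observe when the initializer is a function call. In the sketch below, expensiveInit, initCalls, and cachedValue are hypothetical names used only for this illustration.

```cpp
int initCalls = 0;  // how many times the initializer has run

// Hypothetical stand-in for some costly setup work.
int expensiveInit() {
    ++initCalls;
    return 42;
}

int cachedValue() {
    static int value = expensiveInit();  // runs only on the first call
    return value;
}
```

Before the first call to cachedValue(), initCalls is still 0; after any number of calls it stays at 1.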
Static local variables, since C++11, are guaranteed to be initialized in a thread-safe way. The object will be initialized only once if multiple threads enter a function with such a variable. Have a look below:
#include <iostream>
#include <thread>
struct Value {
Value(int x) : v(x) { std::cout << "Value(" << v << ")\n"; }
~Value() noexcept { std::cout << "~Value(" << v << ")\n"; }
int v { 0 };
};
void foo() {
static Value x { 10 };
}
int main() {
std::jthread worker1 { foo };
std::jthread worker2 { foo };
std::jthread worker3 { foo };
}
The example creates three threads that call the simple foo() function. Because the initialization is thread-safe, the Value(10) constructor message is printed only once.
However, on GCC, you can also try compiling with the following flags:
-std=c++20 -lpthread -fno-threadsafe-statics
And then the output might be as follows:
Value(Value(1010)
)
Value(10)
~Value(10)
~Value(10)
~Value(10)
The static object is now constructed three times!
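Going back to the default, thread-safe behavior, the guarantee can also be checked by counting constructor calls instead of reading interleaved output. This is a sketch under the assumption that the code is built with the compiler’s default settings; Counted, touchStatic, and constructionsAfterThreads are hypothetical names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};  // counts constructor calls across threads

struct Counted {
    Counted() { ++constructions; }
};

void touchStatic() {
    static Counted instance;  // initialized exactly once, even under contention
    (void)instance;
}

// Starts the given number of threads that all race to initialize the static.
int constructionsAfterThreads(int threadCount) {
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back(touchStatic);
    for (auto& t : threads)
        t.join();
    return constructions.load();
}
```

With the default settings, constructionsAfterThreads(8) reports a single construction regardless of how many threads enter touchStatic() at once.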