How C++17 Benefits from the Boost Libraries
In today’s article, I’ll show you battle-tested features from the well-known Boost libraries that were adapted into C++17.
With the growing number of elements in the Standard Library, backed by experience from Boost, you can write even more fluent C++ code.
Read on and learn about the cool things in C++.
Note: This post was initially published at the fluentcpp blog in two parts: the first and the second.
Intro
Some time ago I saw a collection of articles at Fluent C++ about Boost algorithms:
- The BooSTL Algorithms: Boost Algorithms That Extend the STL (1/3)
- The BooSTL Algorithms: Boost Algorithms That Extend the STL (2/3)
- The BooSTL Algorithms: Boost Algorithms That Extend the STL (3/3)
In the series, Jonathan described various sorting algorithms, extended partitioning, pattern searching and a few others. I realised that a lot of elements from Boost are now part of the Standard Library, so that inspired me to tackle this topic.
As you know, Boost libraries give us a vast set of handy algorithms, types and features that we don’t have in the Standard Library. Many functionalities were “ported” into core C++. For example, in C++11 we got std::regex, threading and smart pointers.
In that context, we can treat Boost as a testing battleground before moving to the Standard Library.
When I was writing my book about C++17, I noticed that a large number of elements were “moved” from Boost into the new Standard.
For example:
- vocabulary types: std::variant, std::any, std::optional
- std::string_view
- searchers - Boyer Moore and Boyer Moore Horspool
- std::filesystem
- special math functions
- template enhancements
The good news is that if you used only small parts of Boost like boost::variant or boost::optional, now you can use almost the same code and convert to the Standard Library types (std::variant and std::optional).
Let’s have a look at those areas, and the first topic is “vocabulary types”.
Vocabulary Types
Being able to write expressive code is a compelling capability. Sometimes using only built-in types doesn’t provide those options. For example, you can set up some number and assign it as “NOT_NUMBER” or treat values of -1 as null entries. As an “ultimate solution” you could even use a pointer and treat nullptr as null… but wouldn’t it be better to have an explicit type from the Standard?
Alternatively, how about storing several alternative types in a single object? You can try C-style unions, but they are hard to use, very low-level and error-prone. How about having a type that can store several alternatives… or an object that can store any type?
If you use Boost, then you probably stumbled upon types like boost::optional, boost::variant and boost::any.
Rather than treating -1 as a “null number” you can leverage optional<int>: if the optional is “empty” then you don’t have a number. As simple as that.
Alternatively, variant<string, int, float> is the type that allows you to store three possible types and switch between them at runtime.
Finally, there’s any that is like a var type in dynamic languages; it can store any type and dynamically change them. It might be int, and later you can switch it to string.
Let’s have a look at some code:
std::optional
The first one is std::optional:
template <typename Map, typename Key>
std::optional<typename Map::value_type::second_type> TryFind(const Map& m, const Key& k) {
    auto it = m.find(k);
    if (it != m.end())
        return std::make_optional(it->second);
    return std::nullopt;
}
TryFind returns optional of the value stored in the map, or nullopt. See demo @Wandbox.
You can use it in the following way:
std::map<std::string, int> mm { {"hello", 10}, { "super", 42 }};
auto ov = TryFind(mm, "hello");
// one:
std::cout << ov.value_or(0) << '\n';
// two:
if (ov)
std::cout << *ov << '\n';
If the optional ov contains a value, we can access it through the .value() member function or operator*. In the above code, we used another alternative, the value_or() function, which returns the value if present or the passed parameter otherwise.
std::variant
std::optional stores one value or nothing, so how about storing more types in a safe union type?
Here’s an example:
#include <charconv>    // std::from_chars
#include <string>
#include <string_view>
#include <variant>

std::variant<int, float, std::string> TryParseString(std::string_view sv) {
    // try with float first
    float fResult = 0.0f;
    const auto last = sv.data() + sv.size();
    const auto res = std::from_chars(sv.data(), last, fResult);
    if (res.ec != std::errc{} || res.ptr != last) {
        // if not possible, then just assume it's a string
        return std::string{sv};
    }
    // no fraction part? then just cast to integer
    if (static_cast<int>(fResult) == fResult)
        return static_cast<int>(fResult);
    return fResult;
}
std::variant can be used to store different types as a parsing result. One common use case is parsing a command line or some configuration file. The function TryParseString takes a string view and then tries to parse it into a float, int or string. If the floating-point value has no fractional part, then we store it as an integer. Otherwise, it’s a float. If the numerical conversion cannot be performed, then the function copies the string.
To access the value stored in a variant, you first have to know the active type. Here’s some code that shows how to do it and use the return value from TryParseString:
const auto var = TryParseString("12345.98");
try {
    if (std::holds_alternative<int>(var))
        std::cout << "parsed as int: " << std::get<int>(var) << '\n';
    else if (std::holds_alternative<float>(var))
        std::cout << "parsed as float: " << std::get<float>(var) << '\n';
    else if (std::holds_alternative<std::string>(var))
        std::cout << "parsed as string: " << std::get<std::string>(var) << '\n';
}
catch (std::bad_variant_access&) {
    std::cout << "bad variant access...\n";
}
The main idea is to use std::holds_alternative(), which allows us to check what type is present. variant also offers the .index() member function, which returns the zero-based index of the active alternative (from 0 to the number of alternatives minus one).
But one of the coolest uses is a thing called std::visit().
With this new functionality, you can pass a variant and visit the type that is actively stored. To do it, you need to provide a functor that has a call operator for all possible types in the given variant:
struct PrintInfo {
    void operator()(const int& i) const { std::cout << "parsed as int: " << i << '\n'; }
    void operator()(const float& f) const { std::cout << "parsed as float: " << f << '\n'; }
    void operator()(const std::string& s) const { std::cout << "parsed as str: " << s << '\n'; }
};

auto PrintVisitorAuto = [](const auto& t) { std::cout << t << '\n'; };

const auto var = TryParseString("Hello World");
std::visit(PrintVisitorAuto, var);
std::visit(PrintInfo{}, var);
In the above example, we used two “types” of visitors. The first one, PrintInfo, is a structure that provides all overloads of the call operator. We can use it to show more information about the given type and perform unique implementations. The other version, PrintVisitorAuto, leverages generic lambdas, which is convenient if the implementation for all of the types is the same.
You can also read about the overload pattern in a separate blog post. This allows you to write all lambdas locally in a place where std::visit() is called: Bartek’s coding blog: 2 Lines Of Code and 3 C++17 Features - The overload Pattern
std::any
std::any is probably the least known vocabulary type, and I think there are not many use cases for such a flexible type. It’s almost like var from JavaScript, as it can hold anything.
A little demo of std::any (comes from the proposal N1939):
struct property {
    property();
    property(const std::string &, const std::any &);
    std::string name;
    std::any value;
};
typedef std::vector<property> properties;
With such a property class, you can store any type. Still, if you can restrict the number of possible types, then it’s better to use std::variant, as it performs faster than std::any (no extra dynamic memory allocation needed).
More About std::optional, std::variant and std::any
If you want to know more about the vocabulary types, you can read the separate articles:
- using std::optional,
- and also recent posts at fluentcpp about expressive nullable types: here and here,
- using std::variant,
- using std::any.
std::string_view - non-owning string
std::string_view is a non-owning view of a contiguous sequence of characters. It has been available in Boost for several years now (see boost utils string_view). As far as I know, their interfaces were a bit different, but now the Boost version is conformant with C++17.
Conceptually string_view consists of a pointer to the character sequence and the size:
struct BasicCharStringView {
    const char* dataptr; // the view never modifies the characters
    size_t size;
};
You may wonder what’s unique about std::string_view?
First of all, string_view is a natural replacement for char* arguments. If your function takes const char* and then performs some operation on that, then you can also use a view and benefit from a nice string-like API.
For example:
size_t CStyle(const char* str, char ch) {
    auto chptr = strchr(str, ch);
    if (chptr != nullptr)
        return strlen(str) + (chptr - str);
    return strlen(str);
}

size_t CppStyle(std::string_view sv, char ch) {
    auto pos = sv.find(ch);
    if (pos != std::string_view::npos)
        return sv.length() + pos;
    return sv.length();
}

// use:
std::cout << CStyle("Hello World", 'X') << '\n';
std::cout << CppStyle("Hello World", 'X') << '\n';
See the code @Wandbox
Going further, as you might know, there are many string-like class implementations. CString, QString, etc… and if your code needs to handle many types, string_view might help. Those other types can provide access to the data pointer and the size, and then you can create a string_view object.
Views might also be helpful when doing some work on large strings and when you slice and cut smaller sections. For example, in the parsing of files: You can load file content into a single std::string object and then use views to perform the processing. This might show a nice performance boost as there won’t be any extra copies of strings needed.
It’s also important to remember that since string_view doesn’t own the data, and also might not be null-terminated, there are some risks associated with using it:
- Taking care of (non-)null-terminated strings - string_view may not contain NULL at the end of the string, so you have to be prepared for such a case.
  - This is problematic when calling functions like atoi or printf that accept null-terminated strings.
- References and temporary objects - string_view doesn’t own the memory, so you have to be very careful when working with temporary objects:
  - when returning string_view from a function,
  - when storing string_view in objects or containers.
A good summary of string views can be found at Marco Arena’s blog post: string_view odi et amo.
starts_with/ends_with New Algorithms
C++20 info: More good news is that the starts_with()/ends_with() algorithms from Boost are now part of C++20 as member functions, and many compilers have already implemented them. They are available for both string_view and std::string.
Searchers
As Jonathan wrote in his second part of the searchers series, Boost offers three pattern searching algorithms:
- the Knuth-Morris-Pratt algorithm,
- the Boyer-Moore algorithm,
- the Boyer-Moore-Horspool algorithm.
All of the algorithms beat the naive pattern searching for large strings by using a preprocessing step. They build additional tables based on the input pattern, and the search is more efficient.
The last two of those algorithms were ported into C++17, and they are available as an additional searcher object for the std::search function.
Right now, C++17 provides a new overload for std::search:
template<class ForwardIterator, class Searcher>
ForwardIterator search( ForwardIterator first, ForwardIterator last,
                        const Searcher& searcher );
The Searcher is a template parameter (so you can even come up with your own implementation!), and the library offers three types:
- default_searcher
- boyer_moore_searcher
- boyer_moore_horspool_searcher
All in all you can use it like:
// requires <algorithm> (std::search) and <functional> (the searchers)
std::string testString = "Hello Super World";
std::string needle = "Super";
auto it = std::search(testString.begin(), testString.end(),
                      std::boyer_moore_searcher(needle.begin(), needle.end()));
if (it == testString.end())
    std::cout << "The string " << needle << " not found\n";
The searcher object is created once for each pattern. If you want to search for the same pattern in different containers, then you can reuse it and save a bit of preprocessing time.
On my blog, I did some performance experiments, and it looks like for larger patterns and boyer_moore we can achieve much better performance than with a default searcher. For example, when scanning inside text with 547412 characters, and looking for a 200-letter pattern, I got 8x perf speedup over the default searcher. And even 3x perf over optimised std::string::find.
If you want to read more about the searchers, including some basic benchmarks, you can have a look here: Speeding up Pattern Searches with Boyer-Moore Algorithm from C++17.
Filesystem
This is a massive addition to C++17 and the Standard Library. The committee took years of experience with boost::filesystem, improved it, proposed a technical specification and later merged it into the Standard.
As the canonical example, let’s have a look at the directory iteration from Boost:
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
fs::path inputPath = GetInputPath();
for (const auto& entry : fs::directory_iterator(inputPath))
std::cout << entry.path() << '\n';
And now, the C++17’s version:
#include <filesystem>
namespace fs = std::filesystem;

fs::path inputPath = GetInputPath();
for (const auto& entry : fs::directory_iterator(inputPath))
    std::cout << entry.path() << '\n';
Do you see any difference? :) The code is almost the same as in Boost!
We can even extend it a bit and add more logging:
#include <filesystem>
namespace fs = std::filesystem;

for (const auto& entry : fs::directory_iterator(inputPath)) {
    const auto filenameStr = entry.path().filename().string();
    if (entry.is_directory())
        std::cout << "dir: " << filenameStr << '\n';
    else if (entry.is_regular_file())
        std::cout << "file: " << filenameStr << '\n';
    else
        std::cout << "?? " << filenameStr << '\n';
}
As you can see, in the above code we can efficiently work with path objects, run the iteration over a directory (recursive or not) and print various information about the given directory entry.
The filesystem library is composed of four main parts:
- The path object - a type that represents a path in the system, with various methods to extract the path parts, compose it, convert between formats and even from string to wide string.
- directory_entry - holds information about the path that is inside some directory, plus a cache.
- Directory iterators - two classes that allow you to scan a directory: just once or recursively.
- Plus many supportive non-member functions:
  - getting information about the path
  - file manipulation: copy, move, create, symlinks
  - last write time
  - permissions
  - space/filesize
  - …
The library is enormous, and I hope it will be beneficial for applications that rely on file access (and which app doesn’t have to work with files?).
On my blog, I published one article by a guest author who described his process of moving from boost::filesystem into std::filesystem. Check it out if you also need to convert some of your file handling code.
Bartek’s coding blog: Converting from Boost to std::filesystem
Special Math Functions: clamp, gcd and More
The Boost libraries offer lots of algorithms and functions that help with even advanced math calculations.
For example, there’s a whole Math Toolkit module (version 2.9.0 in Boost 1.70.0) with almost everything you can expect from a math library.
The C++17 Standard extended the library with a few extra functions.
We have simple functions like clamp, gcd and lcm:
#include <iostream>
#include <algorithm> // clamp
#include <numeric>   // for gcd, lcm

int main() {
    std::cout << std::clamp(300, 0, 255) << ", ";
    std::cout << std::clamp(-10, 0, 255) << '\n';
    std::cout << std::gcd(24, 60) << ", ";
    std::cout << std::lcm(15, 50) << '\n';
}
There’s also a set of special math functions: assoc_laguerre, beta, comp_ellint_1/_2/_3, hermite, laguerre, riemann_zeta and a few others.
The full list of those special math functions can be found at Mathematical special functions - @cppreference.
Template Enhancements - and, or, not
P0013 proposes to add the metafunctions and_, or_ and not_ to the standard library and cites Boost.MPL as one of the standard libraries having implemented such features for a long time. The paper was adopted in C++17 as std::conjunction, std::disjunction and std::negation.
Here’s an example, based on the code from the proposal:
template<typename... Ts>
std::enable_if_t<std::conjunction_v<std::is_same<int, Ts>...> >
PrintIntegers(Ts ... args) {
    (std::cout << ... << args) << '\n';
}
The above function PrintIntegers works with a variable number of arguments, but they all have to be of type int.
A Glimpse of C++20
As you might already know, in C++20 we’ll get Ranges and Concepts… but did you know that an earlier version was also available in Boost?
Here’s a link to the Ranges library: Boost Range 2.0.
And while Concepts in C++20 are part of the language, you can simulate them with The Boost Concept Check Library.
The library is heavily based on macros, but you can get some outline of generic programming and what we might want to achieve with real concepts.
Summary
I hope with this blog post I gave you more incentives to start using C++17 :). The last C++ standard offers not only many language features (like if constexpr, structured bindings, fold expressions…), but also a broad set of utilities from the Standard Library. You can now use many vocabulary types: variant, optional, any. Use string views and even a significant component: std::filesystem. All without the need to reference some external library.
Your Turn
- What are your favourite features from Boost that you use?
- Maybe they will also be merged into the Standard?
- Have you ported some boost code into C++17 (and its corresponding feature-set)?
Share your experience in comments.