Last Update:
Performance of std::string_view vs std::string from C++17
Table of Contents
In this text we’ll have a look at a few examples that compare std::string_view
against std::string
. How much is std::string_view
faster than standard std::string
operations?
Intro
I was looking for some examples of string_view
, and after a while, I got curious about the performance gain we might get.
string_view
is conceptually only a view of the string: usually implemented as[ptr, length]
. When a string_view
is created there’s no need to copy the data (as opposite when you create a copy of a string). What’s more string_view
is smaller than std::string
- regarding the size on the stack/heap.
For example when we look at a possible (pseudo) implementation:
string_view {
size_t _len;
const CharT* _str;
}
Depending on the architecture the total size is 8 or 16 bytes.
Due to Small String Optimizations std::string
is usually 24 or 32 bytes so doubles or triple the size of string_view
. In that form, such a string can hold between 15 (GCC, MSVC) and 22 characters (Clang) without having to allocate memory on the heap. Of course larger string will use more memory, but 24/32 bytes is the minimal size of the std::string
.
You can read more details about SSO in this excellent post Exploring std::string.
Or here: SSO-23 (as suggested in a comment)
Obviously returning string views, creating string views, using substr
is definitely much faster than deep copies of std::string
. However, the initial performance tests showed that std::string
is usually highly optimized and sometimes string_view
doesn’t win that much.
The C++17 Series
This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:
- Refactoring with
std::optional
- Using
std::optional
- Error handling and
std::optional
- About
std::variant
- About
std::any
- In place construction for
std::optional
,std::variant
andstd::any
std::string_view
Performance (this post)- C++17 string searchers & conversion utilities
- Working with
std::filesystem
- Something more?
Resources about C++17 STL:
- C++17 In Detail by Bartek!
- C++17 - The Complete Guide by Nicolai Josuttis
- C++ Fundamentals Including C++ 17 by Kate Gregory
- Practical C++14 and C++17 Features - by Giovanni Dicanio
- C++17 STL Cookbook by Jacek Galowicz
string_view
Operations
string_view
is modelled to be very similar to std::string
. Yet, the view is non-owning, so any operation that modifies the data cannot go into the API. Here’s a brief list of methods that you can use with this new type:
operator[]
at
front
back
data
size
/length
max_size
empty
remove_prefix
remove_suffix
swap
copy
(notconstexpr
)substr
- complexityO(1)
and notO(n)
as instd::string
compare
find
rfind
find_first_of
find_last_of
find_first_not_of
find_last_not_of
- operators for lexicography compare:
==, !=, <=, >=, <, >
operator <<
One important note is that all of the above methods (except for copy
and operator <<
) are also constexpr
! With this capability, you might now work with strings in constant expressions.
What’s more for C++20 we’ll get at least two new methods:
starts_with
ends_with
That are implemented both for std::string_view
and std::string
. As of now (July 2018) Clang 6.0 supports those functions. So you can experiment with them.
A Basic Test - substr
substr
gives probably the best advantage over the standard string substr
. It has the complexity of O(1)
and not O(n)
as with regular strings.
I’ve created a basic test using Quick C++ Benchmark and got the following results:
Using Clang 6.0.0, -O3, libc++
The code:
static void StringSubStr(benchmark::State& state) {
std::string s = "Hello Super Extra Programming World";
for (auto _ : state) {
auto oneStr = s.substr(0, 5);
auto twoStr = s.substr(6, 5);
auto threeStr = s.substr(12, 5);
auto fourStr = s.substr(18, 11);
auto fiveStr = s.substr(30, 5);
// Make sure the variable is not optimized away by compiler
benchmark::DoNotOptimize(oneStr);
benchmark::DoNotOptimize(twoStr);
benchmark::DoNotOptimize(threeStr);
benchmark::DoNotOptimize(fourStr);
benchmark::DoNotOptimize(fiveStr);
}
}
And for string_view
:
static void StringViewSubStr(benchmark::State& state) {
// Code before the loop is not measured
std::string s = "Hello Super Extra Programming World";
for (auto _ : state) {
std::string_view sv = s;
auto oneSv = sv.substr(0, 5);
auto twoSv = sv.substr(6, 5);
auto threeSv = sv.substr(12, 5);
auto fourSv = sv.substr(18, 11);
auto fiveSv = sv.substr(30, 5);
benchmark::DoNotOptimize(oneSv);
benchmark::DoNotOptimize(twoSv);
benchmark::DoNotOptimize(threeSv);
benchmark::DoNotOptimize(fourSv);
benchmark::DoNotOptimize(fiveSv);
}
}
Here’s the full experiment: @Quick C++ Bench
For this test, we have 10x speed-up!
Can we achieve similar results in other cases?
String Split
After the basic tests we can do one more step and try to compose a more complicated algorithm: let’s take string splitting.
For this experiment I’ve gathered code from this resources:
Here are the two versions, one for std::string
and the second one for std::string_view
:
std::vector<std::string>
split(const std::string& str, const std::string& delims = " ")
{
std::vector<std::string> output;
auto first = std::cbegin(str);
while (first != std::cend(str))
{
const auto second = std::find_first_of(first, std::cend(str),
std::cbegin(delims), std::cend(delims));
if (first != second)
output.emplace_back(first, second);
if (second == std::cend(str))
break;
first = std::next(second);
}
return output;
}
Now the string_view
version:
std::vector<std::string_view>
splitSV(std::string_view strv, std::string_view delims = " ")
{
std::vector<std::string_view> output;
size_t first = 0;
while (first < strv.size())
{
const auto second = strv.find_first_of(delims, first);
if (first != second)
output.emplace_back(strv.substr(first, second-first));
if (second == std::string_view::npos)
break;
first = second + 1;
}
return output;
}
And here’s the benchmark:
const std::string_view LoremIpsumStrv{
/*one paragraph of lorem ipsum */
};
static void StringSplit(benchmark::State& state) {
std::string str { LoremIpsumStrv };
for (auto _ : state) {
auto v = split(str);
benchmark::DoNotOptimize(v);
}
}
// Register the function as a benchmark
BENCHMARK(StringSplit);
static void StringViewSplit(benchmark::State& state) {
for (auto _ : state) {
auto v = splitSV(LoremIpsumStrv);
benchmark::DoNotOptimize(v);
}
}
BENCHMARK(StringViewSplit);
Will we get the same 10X perf speed as in the previous benchmark… hmmm:
This is GCC 8.1, -O3
A bit better with Clang 6.0.0, -O3:
A slightly better result when I run it locally in MSVC 2017:
string length: 486
test iterations: 10000
string split: 36.7115 ms
string_view split: 30.2734 ms
Here’s the benchmark @Quick C++ Bench
Do you have any ideas why we do not see 10X speed up as with the initial experiment?
Of course, we cannot assume 10X is realistic in this case.
First of all, we have a container - std::vector
- that the algorithm uses to output the results. The memory allocations inside std::vector
will affect the overall speed.
If we run the iteration once, and when I override operator new
I can see the following numbers (MSVC):
string length: 486
test iterations: 1
string split: 0.011448 ms, Allocation count: 15, size 6912
string_view split: 0.006316 ms, Allocation count: 12, size 2272
We have 69 words in that string, the string
version generated 15 memory allocations (both for strings and to increase the vector
space), and in total it allocated 6912 bytes.
The strng_view
version used 12 memory allocations (only for vector
as there’s no need to allocate memory for string_view
) and in total it used 2272 bytes (3x less than thestd::string
version)
Some ideas to improve
See the comment by JFT where he reimplemented the split algorithms using raw pointers rather than iterators, and he got much more performance improvements.
Another possibility is to reserve some space up-front in the vector (and later we can use shrink_to_fit
- that way we save a lot of memory allocations.
Comparing with boost::split
:
For completeness, I also run the benchmark against boost::split
(1.67), and both of our versions are much faster:
Running on WandBox, GCC 8.1
string length: 489
test iterations: 10000
string split: 42.8627 ms, Allocation count: 110000, size 82330000
string_view split: 45.6841 ms, Allocation count: 80000, size 40800000
boost split: 117.521 ms, Allocation count: 160000, size 83930000
So the hand-crafted version is almost 3x faster than the boost.split
algorithm!
String Split and Loading From a File
You might notice that my test string is just one paragraph of “lorem ipsum”. Such a simple test case might cause some additional optimisations in the compiler and produce unrealistic results.
I’ve found a nice post from Rainer Grimm: C++17 - Avoid Copying with std::string_view - ModernesCpp.com
In the article, he used TXT files to process strings. It’s a much better idea to work on some real and large text files, rather than simple strings.
Instead of my lorem ipsum paragraph, I’m just loading a file, for example, ~540kb of text (Gutenberg project)
Here’s a result from a test run over that file:
string length: 547412
test iterations: 100
string split: 564.215 ms, Allocation count: 191800, size 669900000
string_view split: 363.506 ms, Allocation count: 2900, size 221262300
The test is run 100 times, so for one iteration we have 191800/100 = 1918
memory allocations (in total we use 669900000/100 = 6699000 bytes
per iteration) for std::string
.
For string_view
we have only 2900/100 = 29
memory allocations and 221262300/100 = 2212623 bytes
used per iteration.
While it’s still not 10x gain, we have 3x less memory used and around 1.5x perf boost.
Risks With Using string_view
I think that every article about string_view
should also mention the potential risks involved with this new type:
- Taking care of the (non)null-terminated strings -
string_view
may not contain NULL at the end of the string. So you have to be prepared for such a case.- Problematic when calling functions like
atoi
,printf
that accepts null-terminated strings - Conversion into strings
- Problematic when calling functions like
- References and Temporary objects -
string_view
doesn’t own the memory, so you have to be very careful when working with temporary objects.- When returning
string_view
from a function - Storing
string_view
in objects or container.
- When returning
Wrap Up
By leveraging string_view
, you can achieve a lot of performance boost in many use cases. However, it’s important to know that there are caveats and sometimes the perf might be even slower as compare to std::string
!
The first thing is that string_view
doesn’t own the data - thus you need to be careful, so you don’t end up with references to deleted memory!
The second thing is that compilers are very smart when handling strings, especially when strings are short (so they work nicely with SSO - Small String Optimization), and in that case, the perf boost might not be that visible.
Back to you
- What is your experience with
string_view
performance? - Can you share some results and examples?
I've prepared a valuable bonus for you!
Learn all major features of recent C++ Standards on my Reference Cards!
Check it out here: