Integer Conversions and Safe Comparisons in C++20
If you mix different integer types in an expression, you might end up with tricky cases. For example, comparing long with size_t might give a different result than comparing long with unsigned short. C++20 brings some help, and there’s no need to learn all the complex rules :)
Conversion and Ranks
Let’s have a look at two comparisons:
#include <iostream>

int main() {
    long a = -100;
    unsigned short b = 100;
    std::cout << (a < b); // 1
    size_t c = 100;
    std::cout << (a < c); // 2
}
If you run the code at Compiler Explorer (GCC 12, x86-64, default flags), you’ll see:
10
Why? Why not 11?
(By the way, I asked that question on Twitter, see https://twitter.com/fenbf/status/1568566458333990914 - thank you for all the answers and hints)
If we run C++Insights, we’ll see the following transformation:
long a = static_cast<long>(-100);
unsigned short b = 100;
std::cout.operator<<((a < static_cast<long>(b)));
size_t c = 100;
std::cout.operator<<((static_cast<unsigned long>(a) < c));
As you can see, in the first case, the compiler converted unsigned short to long, and then comparing -100 to 100 made sense. But in the second case, long was converted to unsigned long, and thus -100 wrapped around modulo 2^N (where N is the width of unsigned long, 64 bits on this platform). The result is std::numeric_limits<size_t>::max() - 99, which is a very large positive number.
In general, a binary operation needs both operands to have the same type; if the types differ, the compiler must perform some conversion. See the notes from C++ Reference:
For the binary operators (except shifts), if the promoted operands have different types, additional set of implicit conversions is applied, known as usual arithmetic conversions with the goal to produce the common type (also accessible via the std::common_type type trait)…
As for integral types:
- If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank.
- Otherwise, if the unsigned operand’s conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand’s type.
- Otherwise, if the signed operand’s type can represent all values of the unsigned operand, the unsigned operand is converted to the signed operand’s type.
- Otherwise, both operands are converted to the unsigned counterpart of the signed operand’s type.
And the conversion rank:
The conversion rank above increases in order bool, signed char, short, int, long, long long (since C++11). The rank of any unsigned type is equal to the rank of the corresponding signed type. The rank of char is equal to the rank of signed char and unsigned char. The ranks of char8_t (since C++20), char16_t, char32_t (since C++11), and wchar_t are equal to the ranks of their corresponding underlying types.
For our use case, the rank of unsigned short is smaller than the rank of long, and thus it was converted to long. In the second case, the rank of size_t (which can be unsigned long) is greater than or equal to the rank of long, so the signed operand was converted to unsigned long.
If you compare signed with unsigned, make sure the signed value is non-negative to avoid unexpected conversions.
Use Cases
In general, we should aim to use the same integral types to avoid various conversion warnings and bugs. For example, the following code:
std::vector numbers {42, 76, 2, 21, 98, 100 };
for (int i = 0; i < numbers.size(); ++i)
    std::cout << i << "(" << numbers[i] << "), ";
It will generate a GCC warning with -Wall. However, it can be easily fixed by using unsigned int or size_t as the type for the loop counter.
What’s more, such code might also be improved by various C++ features, for example:
std::vector numbers {42, 76, 2, 21, 98, 100 };
for (int i = 0; auto &num : numbers)
    std::cout << "i: " << i++ << " - " << num << '\n';
The above example uses a range-based-for loop with an initializer (C++20). That way, there’s no need to compare the counter against the container size.
On the other hand, there are situations where you get integral numbers of different types:
long id = -1;
if (id >= 0 && id < container.size()) {
}
In the above sample, I used id, which can have some negative value (to indicate some other properties), and when it’s valid (in range), I can access elements of some container.
In this case, I don’t want to change the type of the id object, so I have to put static_cast<size_t>(id) to avoid warnings.
Putting casts here and there might not be the best idea, not to mention the code style.
Additionally, we should also follow the C++ Core Guideline Rule:
ES.100: Don’t mix signed and unsigned arithmetic:
Reason Avoid wrong results.
Fortunately, in C++20, we have a utility to handle such situations.
It’s called “Safe Integral Comparisons” - P0586 by Federico Kircheis.
Safe integral comparisons functions
In the Standard Library we’ll have the following new functions that compare with the “mathematical” meaning:
// <utility> header:
template <class T, class U>
constexpr bool cmp_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_not_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_less(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_greater(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_less_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_greater_equal(T t, U u) noexcept;
template <class R, class T>
constexpr bool in_range(T t) noexcept;
T and U are required to be standard (or extended) integer types, and so those functions cannot be used to compare std::byte, char, char8_t, char16_t, char32_t, wchar_t, and bool.
You can find those functions in the <utility> header file.
Examples
We can rewrite our initial example into:
#include <iostream>
#include <utility>

int main() {
    long a = -100;
    unsigned short b = 100;
    std::cout << std::cmp_less(a, b);
    size_t c = 100;
    std::cout << std::cmp_less(a, c);
}
See the code at @Compiler Explorer
And here’s another snippet:
#include <cstdint>
#include <iostream>
#include <utility>

int main() {
    std::cout << std::boolalpha;
    std::cout << 256 << "\tin uint8_t:\t" << std::in_range<uint8_t>(256) << '\n';
    std::cout << 256 << "\tin long:\t" << std::in_range<long>(256) << '\n';
    std::cout << -1 << "\tin unsigned:\t" << std::in_range<unsigned>(-1) << '\n';
}
Real code
I also looked at some open-source code using codesearch.isocpp.org. I searched for static_cast<int> to see some loop patterns or conditions. Here’s one interesting result:
// actcd19/main/c/chromium/chromium_72.0.3626.121-1/chrome/browser/media/webrtc/window_icon_util_x11.cc:49:
int start = 0;
int i = 0;
while (i + 1 < static_cast<int>(size)) {
if ((i == 0 || static_cast<int>(data[i] * data[i + 1]) > width * height) &&
(i + 1 + data[i] * data[i + 1] < static_cast<int>(size))) {
size is probably unsigned, so they always have to convert it and compare it against int.
And searching for static_cast<size_t> on codesearch.isocpp.org shows:
// actcd19/main/c/chromium/chromium_72.0.3626.121-
// 1/third_party/libwebm/source/common/vp9_level_stats_tests.cc:92:
for (int i = 0; i < frame_count; ++i) {
const mkvparser::Block::Frame& frame = block->GetFrame(i);
if (static_cast<size_t>(frame.len) > data.size()) {
data.resize(frame.len);
data_len = static_cast<size_t>(frame.len);
// ...
This time frame.len has to be converted to size_t to allow safe comparisons.
Implementation Notes
Since the MSVC STL is on GitHub, you can quickly see how the feature was developed: see this pull request and the code in STL/utility at master · Microsoft/STL.
Here’s the code for cmp_equal():
template <class _Ty1, class _Ty2>
_NODISCARD constexpr bool cmp_equal(const _Ty1 _Left, const _Ty2 _Right) noexcept {
    static_assert(_Is_standard_integer<_Ty1> && _Is_standard_integer<_Ty2>,
        "The integer comparison functions only "
        "accept standard and extended integer types.");
    if constexpr (is_signed_v<_Ty1> == is_signed_v<_Ty2>) {
        return _Left == _Right;
    } else if constexpr (is_signed_v<_Ty2>) {
        return _Left == static_cast<make_unsigned_t<_Ty2>>(_Right) && _Right >= 0;
    } else {
        return static_cast<make_unsigned_t<_Ty1>>(_Left) == _Right && _Left >= 0;
    }
}
And here’s similar code for cmp_less():
template <class _Ty1, class _Ty2>
_NODISCARD constexpr bool cmp_less(const _Ty1 _Left, const _Ty2 _Right) noexcept {
    static_assert(_Is_standard_integer<_Ty1> && _Is_standard_integer<_Ty2>, "same...");
    if constexpr (is_signed_v<_Ty1> == is_signed_v<_Ty2>) {
        return _Left < _Right;
    } else if constexpr (is_signed_v<_Ty2>) {
        return _Right > 0 && _Left < static_cast<make_unsigned_t<_Ty2>>(_Right);
    } else {
        return _Left < 0 || static_cast<make_unsigned_t<_Ty1>>(_Left) < _Right;
    }
}
Notes:
- The std:: namespace is omitted here, so is_signed_v is the standard type trait std::is_signed_v, and make_unsigned_t is std::make_unsigned_t.
- Notice the excellent and expressive use of if constexpr; it makes metaprogramming code very easy to read.
The code fragments present cmp_equal() and cmp_less(). In both cases, the main idea is to work with the same sign. There are three cases to cover:
- If both types have the same signedness, then we can compare them directly.
- But when the signedness differs (two remaining cases), then the code uses make_unsigned_t to convert the _Right or _Left part and separately checks whether the signed value is smaller than 0.
Help from the compiler
When I asked the question on Twitter, I also got a helpful answer:
Funny, I get none of the above. My output was:
> error: comparison of integer expressions of different signedness: 'long int' and 'size_t' {aka 'long unsigned int'} [-Werror=sign-compare] https://t.co/xge7A3F4Ic
— John McFarlane (@JSAMcFarlane) September 11, 2022
My example used only default GCC settings, but it’s best to turn on handy compiler warnings and avoid such conversion bugs at compile time.
Just adding -Wall generates the following warning:
<source>:8:21: warning: comparison of integer expressions of different signedness: 'long int' and 'size_t' {aka 'long unsigned int'} [-Wsign-compare]
8 | std::cout << (a < c);
| ~~^~~
See at Compiler Explorer
You can also compile with -Werror -Wall -Wextra, and then the compiler will reject code that compares signed and unsigned integers.
Compiler Support
As of September 2022, the feature is implemented in GCC 10.0, Clang 13.0, and MSVC 16.7.
Summary
This post discussed some fundamental issues with integer conversions and comparisons. In short, a binary arithmetic operation needs operands of the same type. Because of the usual arithmetic conversions, some values might be converted from signed to unsigned and thus yield problematic results. C++20 offers a new set of comparison functions, cmp_*, which ensure the sign is correctly handled.
If you want to read more about integer conversions, look at this excellent blog post: The Usual Arithmetic Confusions by Shafik Yaghmour, and also Summary of C/C++ integer rules by Nayuki.
Back to you
- What’s your approach for working with different integer types?
- How do you avoid conversion errors?
Share your feedback in the comments below.