nullprogram.com/blog/2024/05/24/
Occasionally we need to compute the size of an object that does not yet
exist, or a subscript that may fall out of bounds. It’s easy to miss
the edge cases where results overflow, creating a nasty, subtle bug, even
in the presence of type safety. Ideally such computations happen in
specialized code, such as inside an allocator (calloc
, reallocarray
)
and not outside by the allocatee (i.e. malloc
). Mitigations exist with
different trade-offs: arbitrary precision, or using a wider fixed integer
— i.e. 128-bit integers on 64-bit hosts. In the typical case, working only
with fixed size-type integers, I’ve come up with a set of guidelines to
avoid overflows in the edge cases.
- Range check before computing a result. No exceptions.
- Do not cast unless you know a priori the operand is in range.
- Never mix unsigned and signed operands. Prefer signed. If you
need to convert an operand, see (2).
- Do not add unless you know a priori the result is in range.
- Do not multiply unless you know a priori the result is in range.
- Do not subtract unless you know a priori both signed operands
are non-negative. For unsigned, that the second operand is not larger
than the first (treat it like (4)).
- Do not divide unless you know a prior the denominator is positive.
- Make it correct first. Make it fast later, if needed.
These guidelines are also useful when reviewing code, tracking in your
mind whether the invariants are held at each step. If not, you’ve likely
found a bug. If in doubt, use assertions to document and check invariants.
I compiled this list during code review, so for me that’s where it’s most
useful.
Range check, then compute
Not strictly necessary when overflow is well-defined, i.e. wraparound, but
it’s like defensive driving. It’s simpler and clearer to check with basic
arithmetic rather than reason from a wraparound, i.e. a negative result.
Checked math functions are fine, too, if you check the overflow boolean
before accessing the result.
// bad
len++;
if (len <= 0) error();
// good
if (len == MAX) error();
len++;
Casting
Casting from signed to unsigned, it’s as simple as knowing the value is
non-negative, which is likely if you’re following (1). If a negative size
has appeared, there’s already been a bug earlier in the program, and the
only reasonable course of action is to abort, not handle it like an error.
Addition
To check if addition will overflow, subtract one of the operands from the
maximum value.
if (b > MAX - a) error();
r = a + b;
In pointer arithmetic addition, it’s a common mistake to compute the
result pointer then compare it to the bounds. If the check failed, then
the pointer already overflowed, i.e. undefined behavior. Major pieces
software, like glibc, are riddled with such pointer overflows.
(Now that you’re aware of it, you’ll start noticing it everywhere. Sorry.)
// bad: never do this
beg += size;
if (beg > end) error();
To do this correctly, check integers not pointers. Like before,
subtract before adding.
available = end - beg;
if (size > available) error();
beg += size;
Mind mixing signed and unsigned operands for the comparison operator (3),
e.g. an unsigned size on the left and signed difference on the right.
Multiplication and division
If you’re working this out on your own, multiplication seems tricky until
you’ve internalized a simple pattern. Just as we subtracted before adding,
we need to divide before multiplying. Divide the maximum value by one of
the operands:
if (a>0 && b>MAX/a) error();
r = a * b;
It’s often permitted for one or both to be zero, so mind divide-by-zero,
which is handled above by the first condition. Sometimes size must be
positive, e.g. the result of the sizeof
operator in C, in which case we
should prefer it as the denominator.
assert(size > 0);
assert(count >= 0);
if (count > MAX/size) error();
total = count * size;
With arena allocation there are usually two concerns. First, will
it overflow when computing the total size, i.e. count * size
? Second, is
the total size within the arena capacity. Naively that’s two checks, but
we can kill two birds with one stone: Check both at once by using the
current arena capacity as the maximum value when considering overflow.
if (count > (end - beg)/size) error();
total = count * size;
One condition pulling double duty.
Subtraction
With signed sizes, the negative range is a long “runway” allowing a single
unchecked subtraction before overflow might occur. In essence, we were
exploiting this in order to check addition. The most common mistake with
unsigned subtraction is not accounting for overflow when going below zero.
// note: signed "i" only
for (i = end - stride; i >= beg; i -= stride) ...
This loop will go awry if i
is unsigned and beg <= stride
.
In special cases we can get away with a second subtraction without an
overflow check if we know some properties of our operands. For example, my
arena allocators look like this:
padding = -beg & (align - 1);
if (count >= (end - beg - padding)/size) error();
That’s two subtractions in a row. However, end - beg
describes the size
of a realized object, and align
is a small constant (e.g. 2^(0–6)). It
could only overflow if the entirety of memory was occupied by the arena.
Bonus, advanced note: This check is actually pulling triple duty. Notice
that I used >=
instead of >
. The arena can’t fill exactly to the brim,
but it handles the extreme edge case where count
is zero, the arena is
nearly full, but the bump pointer is unaligned. The result of subtracting
padding
is negative, which rounds to zero by integer division, and would
pass a >
check. That wouldn’t be a problem except that aligning the bump
pointer would break the invariant beg <= end
.
Try it for yourself
Next time you’re reviewing code that computes sizes or subscripts, bring
the list up and see how well it follows the guidelines. If it misses one,
try to contrive an input that causes an overflow. If it follows guidelines
and you can still contrive such an input, then perhaps the list could use
another item!