## Guidelines for computing sizes and subscripts

Occasionally we need to compute the size of an object that does not yet
exist, or a subscript that may fall out of bounds. It’s easy to miss
the edge cases where results overflow, creating a nasty, subtle bug, even
in the presence of type safety. Ideally such computations happen in
specialized code, such as *inside* an allocator (`calloc`

, `reallocarray`

)
and not *outside* by the allocatee (i.e. `malloc`

). Mitigations exist with
different trade-offs: arbitrary precision, or using a wider fixed integer
— i.e. 128-bit integers on 64-bit hosts. In the typical case, working only
with fixed size-type integers, I’ve come up with a set of guidelines to
avoid overflows in the edge cases.

- Range check
*before*computing a result. No exceptions. - Do not cast unless you know
*a priori*the operand is in range. - Never mix unsigned and signed operands. Prefer signed. If you need to convert an operand, see (2).
- Do not add unless you know
*a priori*the result is in range. - Do not multiply unless you know
*a priori*the result is in range. - Do not subtract unless you know
*a priori*both signed operands are non-negative. For unsigned, that the second operand is not larger than the first (treat it like (4)). - Do not divide unless you know
*a prior*the denominator is positive. - Make it correct first. Make it fast later, if needed.

These guidelines are also useful when *reviewing* code, tracking in your
mind whether the invariants are held at each step. If not, you’ve likely
found a bug. If in doubt, use assertions to document and check invariants.
I compiled this list during code review, so for me that’s where it’s most
useful.

### Range check, then compute

Not strictly necessary when overflow is well-defined, i.e. wraparound, but it’s like defensive driving. It’s simpler and clearer to check with basic arithmetic rather than reason from a wraparound, i.e. a negative result. Checked math functions are fine, too, if you check the overflow boolean before accessing the result.

```
// bad
len++;
if (len <= 0) error();
// good
if (len == MAX) error();
len++;
```

### Casting

Casting from signed to unsigned, it’s as simple as knowing the value is non-negative, which is likely if you’re following (1). If a negative size has appeared, there’s already been a bug earlier in the program, and the only reasonable course of action is to abort, not handle it like an error.

### Addition

To check if addition will overflow, subtract one of the operands from the maximum value.

```
if (b > MAX - a) error();
r = a + b;
```

In pointer arithmetic addition, it’s a common mistake to compute the
result pointer then compare it to the bounds. If the check failed, then
the pointer *already* overflowed, i.e. undefined behavior. Major pieces
software, like glibc, are riddled with such pointer overflows.
(Now that you’re aware of it, you’ll start noticing it everywhere. Sorry.)

```
// bad: never do this
beg += size;
if (beg > end) error();
```

To do this correctly, **check integers not pointers**. Like before,
subtract before adding.

```
available = end - beg;
if (size > end - beg) error();
beg += size;
```

Mind mixing signed and unsigned operands for the comparison operator (3), e.g. an unsigned size on the left and signed difference on the right.

### Multiplication and division

If you’re working this out on your own, multiplication seems tricky until you’ve internalized a simple pattern. Just as we subtracted before adding, we need to divide before multiplying. Divide the maximum value by one of the operands:

```
if (a>0 && b>MAX/a) error();
r = a * b;
```

It’s often permitted for one or both to be zero, so mind divide-by-zero,
which is handled above by the first condition. Sometimes size must be
positive, e.g. the result of the `sizeof`

operator in C, in which case we
should prefer it as the denominator.

```
assert(size > 0);
assert(count >= 0);
if (count > MAX/size) error();
total = count * size;
```

With arena allocation there are usually two concerns. First, will
it overflow when computing the total size, i.e. `count * size`

? Second, is
the total size within the arena capacity. Naively that’s two checks, but
we can kill two birds with one stone: Check both at once by using the
current arena capacity as the maximum value when considering overflow.

```
if (count > (end - beg)/size) error();
total = count * size;
```

One condition pulling double duty.

### Subtraction

With signed sizes, the negative range is a long “runway” allowing a single unchecked subtraction before overflow might occur. In essence, we were exploiting this in order to check addition. The most common mistake with unsigned subtraction is not accounting for overflow when going below zero.

```
// note: signed "i" only
for (i = end - stride; i >= beg; i -= stride) ...
```

This loop will go awry if `i`

is unsigned and `beg <= stride`

.

In special cases we can get away with a second subtraction without an overflow check if we know some properties of our operands. For example, my arena allocators look like this:

```
padding = -beg & (align - 1);
if (count >= (end - beg - padding)/size) error();
```

That’s two subtractions in a row. However, `end - beg`

describes the size
of a realized object, and `align`

is a small constant (e.g. 2^(0–6)). It
could only overflow if the entirety of memory was occupied by the arena.

Bonus, advanced note: This check is actually pulling *triple duty*. Notice
that I used `>=`

instead of `>`

. The arena can’t fill exactly to the brim,
but it handles the extreme edge case where `count`

is zero, the arena is
nearly full, but the bump pointer is unaligned. The result of subtracting
`padding`

is negative, which rounds to zero by integer division, and would
pass a `>`

check. That wouldn’t be a problem except that aligning the bump
pointer would break the invariant `beg <= end`

.

### Try it for yourself

Next time you’re reviewing code that computes sizes or subscripts, bring the list up and see how well it follows the guidelines. If it misses one, try to contrive an input that causes an overflow. If it follows guidelines and you can still contrive such an input, then perhaps the list could use another item!

Have a comment on this article? Start a discussion in my public inbox by sending an email to ~skeeto/public-inbox@lists.sr.ht [mailing list etiquette] , or see existing discussions.