Rules to avoid common extended inline assembly mistakes

GCC and Clang inline assembly is an interface between high and low level programming languages. It is subtle and treacherous. Many are ensnared in its traps, usually unknowingly. As such, the asm keyword is essentially the unsafe keyword of C and C++. Nearly every inline assembly tutorial, including the awful ibiblio page at the top of search engines for decades, propagates fundamental, serious mistakes, and most examples are incorrect. The dangerous part is that the examples usually produce the expected results! The situation is dire. This article isn’t a tutorial, but basic rules to avoid the most common mistakes, or to spot them in code review.

The focus is entirely extended assembly, and not basic assembly, which has different rules. The former is any inline assembly statement with constraints or clobbers. That is, there’s a colon : token between the asm parentheses. Basic assembly is blunt and has fewer uses, mostly at the top level or in “naked” functions, making misuse less likely.

(1) Avoid inline assembly if possible

Because it’s so treacherous, the first rule is to avoid it if at all possible. Modern compilers are loaded with intrinsics and built-ins that replace nearly all the old inline assembly use cases. They allow access to low level features from the high level language. No need to bridge the gap between low and high yourself when there’s an intrinsic.

Compilers do not have built-ins for system calls, and occasionally lack a useful intrinsic. Other times you might be building foundational infrastructure. These remaining cases are mostly about interacting with external interfaces, not optimization or performance.
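As a minimal sketch of what "use the intrinsic instead" looks like, here are three operations that old code often implemented with inline assembly, written with GCC/Clang built-ins. The wrapper names (popcount64, byteswap32, trailing_zeros32) are hypothetical, chosen for illustration; only the __builtin_* functions come from the compilers.

```c
#include <stdint.h>

// Population count: the compiler picks the instruction (e.g. popcnt)
// or a software fallback depending on the target.
static int popcount64(uint64_t x)
{
    return __builtin_popcountll(x);
}

// Byte swap, typically a single bswap/rev instruction.
static uint32_t byteswap32(uint32_t x)
{
    return __builtin_bswap32(x);
}

// Count trailing zero bits. __builtin_ctz is undefined for zero,
// so that case is handled explicitly.
static int trailing_zeros32(uint32_t x)
{
    return x ? __builtin_ctz(x) : 32;
}
```

No constraints, no clobbers, and the optimizer can still constant-fold and schedule these freely, which it cannot do with an opaque assembly block.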

(2) It should nearly always be volatile

Falling right out of rule (1), the remaining inline assembly cases nearly always have side effects beyond output constraints. That includes memory accesses, and it certainly includes system calls. Because of this, inline assembly should usually have the volatile qualifier.

asm volatile ( ... );

This prevents compilers from eliding or re-ordering the assembly. As a special rule, inline assembly lacking output constraints is implicitly volatile. Despite this, please use volatile anyway! When I do not see volatile it’s likely a defect. Stopping to consider if it’s this special case slows understanding and impedes code review.

Tutorials often use __volatile__. Do not do this. It is an ancient alias keyword to support pre-standard compilers lacking the volatile keyword. This is not your situation. When I see __volatile__ it likely means you copy-pasted the inline assembly from somewhere without understanding it. It’s a red flag that draws my attention for even more careful review.

Side note: __asm or __asm__ is fine, and even required in some cases (e.g. -std=cXX). I usually write it asm.

(3) It probably needs a memory clobber

The "memory" clobber is orthogonal to volatile, each serving different purposes. It’s less often needed than volatile, but typical remaining inline assembly cases require it. If memory is accessed in any way while executing the assembly, you need a memory clobber. This includes most system calls, and definitely a generic syscall wrapper.

    asm volatile (... : "memory");

In code review, if you do not see a "memory" clobber, give it extra scrutiny. It’s probably missing. If it’s truly unnecessary, I suggest documenting such in a comment so that reviewers know the omission is considered and intentional.

This clobber prevents compilers from re-ordering loads and stores around the assembly. It would be disastrous, for example, if a write(2) system call occurred before the program populated the output buffer! In this case, volatile would prevent a follow-up write(2) from being optimized out while "memory" forces memory stores to occur before the system call.
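To make rules (2) and (3) concrete, here is a sketch of a raw write(2) wrapper. It assumes x86-64 Linux, where the system call number goes in rax (1 for write), the arguments in rdi, rsi, and rdx, and the syscall instruction destroys rcx and r11. The function name raw_write is hypothetical. I spell the keyword __asm__ here only so the block also compiles under -std=cXX.

```c
#include <stddef.h>

#if !defined(__x86_64__) || !defined(__linux__)
#  error "this sketch assumes x86-64 Linux"
#endif

// Hypothetical helper: write(2) without libc.
// Returns the byte count, or a negative errno value on failure.
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile (          // volatile: the call has side effects
        "syscall"
        : "=a"(ret)             // result comes back in rax
        : "a"(1L),              // assumption: SYS_write is 1 on x86-64 Linux
          "D"((long)fd),        // rdi
          "S"(buf),             // rsi
          "d"(len)              // rdx
        : "rcx", "r11",         // destroyed by the syscall instruction
          "memory"              // the kernel reads *buf
    );
    return ret;
}
```

Drop the "memory" clobber and the compiler is free to delay the stores that fill buf until after the system call. Drop volatile and a call whose result is unused may be deleted entirely.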

(4) Never modify input constraints

It’s easy not to modify inputs, so this is mostly about ignorance, but this rule is broken with shocking frequency. Most of the time you can get away with it, right up until certain configurations have a heisenbug. In most cases this can be fixed by changing an input into a read-write output constraint with "+":

asm volatile ("..." :: "r"(x) : ...);  // before
asm volatile ("..." : "+r"(x) : ...);  // after

If you hadn’t been using volatile (in violation of rule 2) then now suddenly you’d need it because there’s an output constraint. This happens often.
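A classic instance, sketched here for x86-64: rep stosb advances rdi and counts rcx down to zero, so both registers are modified by the instruction and must not be listed as plain inputs. The function name zero_bytes is hypothetical, and in real code rule (1) applies: just call memset. The sketch also assumes the direction flag is clear, as the ABI guarantees at function boundaries.

```c
#include <stddef.h>

#if !defined(__x86_64__)
#  error "this sketch assumes x86-64"
#endif

// Hypothetical helper: zero len bytes at dst using rep stosb.
static void zero_bytes(void *dst, size_t len)
{
    __asm__ volatile (
        "rep stosb"
        : "+D"(dst),    // rdi is advanced by the instruction: read-write
          "+c"(len)     // rcx is counted down to zero: read-write
        : "a"(0)        // al holds the fill byte and is not modified
        : "memory"      // stores through dst
    );
}
```

Written with "D"(dst) and "c"(len) as inputs, the compiler would assume both registers still hold their original values afterward and might reuse them, which is exactly the kind of bug that only shows up in certain configurations.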

(5) Never call functions from inline assembly

Many things can go wrong because the semantics cannot be expressed using inline assembly constraints. The stack may not be aligned, and you’ll clobber the redzone. (Yes, there’s a "redzone" clobber, but it’s insufficient to actually make a function call.) Do not do it. Tutorials like to show it because it makes for a simple demonstration, but all those examples are littered with defects.

System calls are fine. Basic assembly may call functions when used at the top level or in naked functions, that is, outside of ordinary function bodies. The goto qualifier, used correctly, allows jumps to be safely expressed to the compiler. Just don’t use call in extended assembly.
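For the goto qualifier, a small sketch, again assuming x86-64: the fourth colon-separated section lists the C labels the assembly may jump to, and %l[name] refers to one of them. The function name is_zero is hypothetical, and the job itself is trivial C; the point is only the shape of the statement.

```c
#if !defined(__x86_64__)
#  error "this sketch assumes x86-64"
#endif

// Hypothetical helper: returns 1 if x is zero, otherwise 0.
static int is_zero(long x)
{
    __asm__ volatile goto (
        "test %0, %0\n\t"
        "jz %l[zero]"       // jump to the C label listed below
        : // no outputs
        : "r"(x)            // only read, so a plain input is correct
        : "cc"              // test modifies the flags
                            // no "memory": nothing here touches memory
        : zero              // every possible jump target must be listed
    );
    return 0;
zero:
    return 1;
}
```

Because the target is declared, the compiler knows about both exits from the block and keeps the control flow graph honest. A jump to a label it was never told about has no such guarantee.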

(6) Do not define absolute assembly labels

That is, if you need to jump within your assembly block, such as for a loop, do not write a named label:

myloop:
    ...
    jz myloop

Your inline assembly is part of a function, and that function may be cloned or inlined, in which case there will be multiple copies of your assembly block in the translation unit. The assembler will see duplicate label names and reject the program. Until that function is inlined, perhaps at a high optimization level, this will likely work as expected. On the plus side it’s a loud compile-time error when it doesn’t work.

In inline assembly you can have the compiler generate a unique label with %=, but my preferred solution is the local labels feature of the assembler:

0:
    ...
    jz 0b

In this case the assembler generates unique labels, and the number 0 isn’t the literal label name. 0b (“backward”) refers to the previous 0 label, and 0f (“forward”) would refer to the next 0 label. Perfectly unambiguous.
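Here is a sketch of a complete loop using local labels, assuming x86-64 and AT&T syntax. The function names sum_down and sum_two are hypothetical. The counter is modified inside the loop, so per rule (4) it is a "+r" operand, and sum_two exists only to create two inlinable copies of the block in one translation unit.

```c
#if !defined(__x86_64__)
#  error "this sketch assumes x86-64"
#endif

// Hypothetical helper: computes n + (n-1) + ... + 1 with a loop.
static inline unsigned long sum_down(unsigned long n)
{
    unsigned long total = 0;
    __asm__ volatile (
        "test %1, %1\n\t"
        "jz 1f\n"           // n == 0: skip forward to the next "1" label
        "0:\n\t"
        "add %1, %0\n\t"    // total += n
        "dec %1\n\t"        // n -= 1
        "jnz 0b\n"          // loop back to the previous "0" label
        "1:"
        : "+r"(total), "+r"(n)  // both are read and written
        :
        : "cc"              // flags are modified
                            // no "memory": registers only
    );
    return total;
}

// Two call sites, so the block may appear twice after inlining.
static unsigned long sum_two(unsigned long a, unsigned long b)
{
    return sum_down(a) + sum_down(b);
}
```

With a named label such as myloop in place of 0, this assembles only as long as the compiler emits a single copy of the block.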

Naturally occurring practice problems

Now that you’ve made it this far, here’s an exercise for practice: Search online for “inline assembly tutorial” and count the defects you find by applying my 6 rules. You’ll likely find at least one per result that isn’t official compiler documentation. Besides tutorials and reviewing real programs, you could ask an LLM to generate inline assembly, as they’ve been trained to produce these common defects.

null program

Chris Wellons