QOI is now my favorite asset format

This article was discussed on Hacker News.

The Quite OK Image (QOI) format was announced late last year and finalized into a specification a month later. I was initially dismissive, but a second look left me impressed. The format hits a sweet spot in the trade-off space between complexity, speed, and compression ratio. Given its alpha channel support as well, QOI has become my default choice for embedded image assets. It’s not perfect, but at the very least it’s a solid foundation.

Since I’m now working with QOI images, I need a good QOI viewer, and so I added support to my ill-named pbmview tool, which I wrote to serve the same purpose for Netpbm. I will continue to use Netpbm as an output format, especially for raw video output, but no longer will I use it for an embedded asset (nor re-invent yet another RLE over Netpbm).

I was dismissive because the website claimed, and still claims today, QOI images are “a similar size” to PNG. However, for the typical images where I would use PNG, QOI is around 3x larger, and some outliers are far worse. The 745 PNGs on my blog — a perfect test corpus for my own needs — convert to QOIs 2.8x larger on average. The official QOI benchmark has much better results, 1.3x larger, but that’s because it includes a lot of photography where PNG and QOI both do poorly, making QOI seem more comparable.

However, as I said, QOI’s strength is its trade-off sweet spot. The specification is one page, and an experienced developer can write a complete implementation from scratch in a single sitting. My own implementation is about 100 lines of libc-free C for each of the encoder and decoder. With error checking removed, my decoder is ~600 bytes of x86 object code — a great story for embedding alongside assets. It’s more complex than Netpbm or farbfeld, but it’s far simpler than BMP. I’ve already begun experimenting with converting assets to QOI, and the results have so far exceeded my expectations.

To my surprise, the encoder was easier to write than the decoder. The format is so straightforward that two different encoders will produce identical files. There’s little room for specialized optimization, and no meaningful “compression level” knob.


There are a lot of dimensions on which QOI could be improved, but most cases involve trade-offs, e.g. more complexity for better compression. The areas where QOI could have been strictly better, the dimensions on which it is not on the Pareto frontier, are more meaningful criticisms — missed opportunities. My criticisms of this kind:

More subjective criticisms that might count as having trade-offs:

Of course, you’re not obligated to follow QOI exactly to spec for your own assets, so you could always use a modified QOI with one or more of these tweaks. That’s what I meant about it being a solid foundation: You don’t have to start from scratch with some custom RLE. Since the format is so simple, you can easily build your own tools — as I’ve already begun doing myself — so you don’t need to rely on tools supporting your QOI fork.

Minimalist API

I’m really happy with my QOI implementation, particularly since it’s another example of a minimalist C API: no allocating, no input or output, and no standard library use. As usual, the expectation is that it’s in the same translation unit where it’s used, so it’s likely inlined into callers.

The encoder is streaming — it accepts and returns only a little bit of input and output at a time. It has three functions and one struct with no “public” fields:

struct qoiencoder qoiencoder(void *buf, int w, int h, const char *flags);
int qoiencode(struct qoiencoder *, void *buf, unsigned color);
int qoifinish(struct qoiencoder *, void *buf);

The first function initializes an encoder and writes a fixed-length header into the QOI buffer. The flags field is a mode string, like fopen. I would normally use bit flags, but this is a little experiment. The second function encodes a single pixel into the QOI buffer, returning the number of bytes written (possibly zero). The last flushes any encoding state and writes the end-of-stream marker. There are no errors. My typical use so far looks like:

char buf[16];
struct qoiencoder q = qoiencoder(buf, width, height, "a");
fwrite(buf, QOIHDRLEN, 1, file);
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        // ... compute 32-bit ABGR sample at (x, y) ...
        fwrite(buf, qoiencode(&q, buf, abgr), 1, file);
    }
}
fwrite(buf, qoifinish(&q, buf), 1, file);
return ferror(file);

This appends encoder outputs to a buffered stream, but it could just as well accumulate directly into a larger buffer, advancing the write pointer a little after each call.
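As a rough sketch of that second approach: the encoder itself isn't reproduced here, so `toyencode` below is a hypothetical stand-in that only mimics the call shape of `qoiencode` (write a few bytes at the given pointer, return the count, possibly zero). It is not real QOI. The part that matters is `encodeall`, which advances the write pointer by whatever each call returns.

```c
#include <stddef.h>

// Hypothetical stand-in with the same call shape as qoiencode(): writes
// zero or four bytes at buf and returns the count. This is NOT real QOI.
struct toyenc { unsigned prev; int started; };

static int toyencode(struct toyenc *e, void *buf, unsigned abgr)
{
    unsigned char *p = buf;
    if (e->started && abgr == e->prev) {
        return 0;  // repeated pixel: this toy emits nothing
    }
    e->started = 1;
    e->prev = abgr;
    p[0] = (unsigned char)(abgr >>  0);
    p[1] = (unsigned char)(abgr >>  8);
    p[2] = (unsigned char)(abgr >> 16);
    p[3] = (unsigned char)(abgr >> 24);
    return 4;
}

// Encode npixels samples straight into dst, advancing the write pointer
// by however many bytes each call produced. Returns the total written.
// dst must have room for the worst case (4 bytes per pixel in this toy).
static size_t encodeall(unsigned char *dst, const unsigned *pixels,
                        size_t npixels)
{
    struct toyenc e = {0, 0};
    unsigned char *w = dst;
    for (size_t i = 0; i < npixels; i++) {
        w += toyencode(&e, w, pixels[i]);
    }
    return (size_t)(w - dst);
}
```

With a real encoder the destination would be sized for that encoder's own worst case, plus its header and end-of-stream marker.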

The decoder is two functions, but its struct has some “public” fields.

struct qoidecoder {
    int width, height;
    _Bool alpha, srgb, error;
    // ...
};
struct qoidecoder qoidecoder(const void *buf, int len);
static unsigned qoidecode(struct qoidecoder *);

The input is not streamed and the entire buffer must be loaded into memory at once — not too bad since it’s compressed, and perhaps even already loaded as part of the executable image — but the output is streamed, delivering one packed 32-bit ABGR sample per call. The decoder makes no assumptions about the output format, and the caller unpacks samples and stores them in whatever format is appropriate (shader texture, etc.).
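To make the unpacking step concrete, here is a small sketch of splitting a packed sample into channels and packing it again. The bit layout is an assumption on my part, taken from the name "ABGR": alpha in the top byte, then blue, green, and red in the lowest byte. The `struct rgba` type and both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

// Hypothetical channel struct, for illustration only.
struct rgba { uint8_t r, g, b, a; };

// Assumed layout: alpha in bits 24..31, blue in 16..23, green in 8..15,
// red in 0..7.
static struct rgba unpackabgr(uint32_t abgr)
{
    struct rgba c;
    c.r = (uint8_t)(abgr >>  0);
    c.g = (uint8_t)(abgr >>  8);
    c.b = (uint8_t)(abgr >> 16);
    c.a = (uint8_t)(abgr >> 24);
    return c;
}

// Inverse of unpackabgr() under the same assumed layout.
static uint32_t packabgr(struct rgba c)
{
    return (uint32_t)c.a << 24 | (uint32_t)c.b << 16 |
           (uint32_t)c.g <<  8 | (uint32_t)c.r;
}
```

A caller targeting a different layout would store the four channels in whatever order its texture or framebuffer expects.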

To make it easier to use, my decoder range checks to guarantee that width and height can be multiplied without overflow. Unlike encoding, there may be errors due to invalid input, including that failed range check. The decoder error flag is “sticky” and the decoder returns zero samples when in an error state, so callers can wait to check for errors until the end. (Though if you’re only decoding embedded assets, then there are no practical errors, and checks can be removed/ignored.)
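One way such a range check might look, as a sketch rather than the decoder's actual code: divide instead of multiply, so the test itself cannot overflow. The function name `checkdims` is hypothetical, and rejecting zero or negative dimensions is my assumption, not something the format description above dictates.

```c
#include <limits.h>

// Returns 1 and stores w*h in *count when both dimensions are positive
// and the product fits in an int. Returns 0 otherwise, leaving *count
// untouched. Rejecting non-positive dimensions is an assumption.
static int checkdims(int w, int h, int *count)
{
    if (w <= 0 || h <= 0) {
        return 0;
    }
    if (w > INT_MAX / h) {
        return 0;  // w*h would overflow int
    }
    *count = w * h;
    return 1;
}
```

After a check like this passes, a plain `width * height` in the caller is safe.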

Example usage, copied almost verbatim from a real program:

int loadimage(Image *image, const uint8_t *qoi, int len)
{
    struct qoidecoder q = qoidecoder(qoi, len);
    if (/* image dimensions too large */) {
        return 0;
    }

    image->width  = q.width;
    image->height = q.height;
    int count = q.width * q.height;
    for (int i = 0; i < count; i++) {
        unsigned abgr = qoidecode(&q);
        image->data[4*i+0] = abgr >> 16;
        image->data[4*i+1] = abgr >>  8;
        image->data[4*i+2] = abgr >>  0;
        image->data[4*i+3] = abgr >> 24;
    }
    return !q.error;
}

Note the aforementioned awkward RGB shuffle.

It’s safe to say that I’m excited about QOI, and that it now has a permanent slot on my developer toolbelt.


null program

Chris Wellons
