An easy-to-implement, arena-friendly hash map

My last article had tips for for arena allocation. This next article demonstrates a technique for building bespoke hash maps that compose nicely with arena allocation. In addition, they’re fast, simple, and automatically scale to any problem that could reasonably be solved with an in-memory hash map. To avoid resizing — both to better support arenas and to simplify implementation — they have slightly above average memory requirements. The design, which we’re calling a hash-trie, is the result of fruitful collaboration with NRK, whose sibling article includes benchmarks. It’s my new favorite data structure, and has proven incredibly useful. With a couple well-placed acquire/release atomics, we can even turn it into a lock-free concurrent hash map.

[]

Arena allocator tips and tricks

This article was discussed on Hacker News.

Over the past year I’ve refined my approach to arena allocation. With practice, it’s effective, simple, and fast; typically as easy to use as garbage collection but without the costs. Depending on need, an allocator can weigh just 7–25 lines of code — perfect when lacking a runtime. With the core details of my own technique settled, now is a good time to document and share lessons learned. This is certainly not the only way to approach arena allocation, but these are practices I’ve worked out to simplify programs and reduce mistakes.

[]

How to link identical function names from different DLLs

For the typical DLL function call you declare the function prototype (via header file), you inform the link editor (ld, link) that the DLL exports a symbol with that name (import library), it matches the declared name with this export, and it becomes an import in your program’s import table. What happens when two different DLLs export the same symbol? The link editor will pick the first found. But what if you want to use both exports? If they have the same name, how could program or link editor distinguish them? In this article I’ll demonstrate a technique to resolve this by creating a program which links with and directly uses two different C runtimes (CRTs) simultaneously.

[]

Everything you never wanted to know about Win32 environment blocks

In an effort to avoid programming by superstition, I did a deep dive into the Win32 “environment block,” the data structure holding a process’s environment variables, in order to better understand it. Along the way I discovered implied and undocumented behaviors. (The environment block must not to be confused with the Process Environment Block (PEB) which is different.) Because I cannot possibly retain all the quirky details in my head for long, I’m writing them down for future reference. I ran my tests on different Windows versions as far back as Windows XP SP3 in order to fill in gaps where documentation is ambiguous, incomplete, or wrong. Overall conclusion: Correct, direct manipulation of an environment block is impossible in the general case due to under-specified and incorrect documentation. This has important consequences mainly for programming language runtimes.

[]

"Once" one-time concurrent initialization with an integer

We’ve previously discussed integer barriers, integer queues, and integer wait groups as tiny concurrency utilities. Next let’s tackle “once” initialization, i.e. pthread_once, using an integer. We’ll need only three basic atomic operations — store, load, and increment — and futex wait/wake. It will be zero-initialized and the entire source small enough to fit on an old-fashioned terminal display. The interface will also get an overhaul, more to my own tastes.

[]

Solving "Two Sum" in C with a tiny hash table

I came across a question: How does one efficiently solve Two Sum in C? There’s a naive quadratic time solution, but also an amortized linear time solution using a hash table. Without a built-in or standard library hash table, the latter sounds onerous. However, a mask-step-index table, a hash table construction suitable for many problems, requires only a few lines of code. This approach is useful even when a standard hash table is available, because by exploiting the known problem constraints, it beats typical generic hash table performance by an order of magnitude (demo).

[]

My ranking of every Shakespeare play

This article was discussed on Hacker News.

A few years ago I set out on a personal journey to study and watch a performance of each of Shakespeare’s 37 plays. I’ve reached my goal and, though it’s not a usual topic around here, I wanted to get my thoughts down while fresh. I absolutely loved some of these plays and performances, and so I’d like to highlight them, especially because my favorites are, with one exception, not “popular” plays. Per tradition, I begin with my least enjoyed plays and work my way up. All performances were either a recording of a live stage or an adaptation, so they’re also available to you if you’re interested, though in most cases not for free. I’ll mention notable performances when applicable. The availability of a great performance certainly influenced my play rankings.

[]

Hand-written Windows API prototypes: fast, flexible, and tedious

I love fast builds, and for years I’ve been bothered by the build penalty for translation units including windows.h. This header has an enormous number of definitions and declarations and so, for C programs, it tends to dominate the build time of those translation units. Most programs, especially systems software, only needs a tiny portion of it. For example, when compiling u-config with GCC, two thirds of the debug build was spent processing windows.h just for 4 types, 16 definitions, and 16 prototypes.

[]

My favorite C compiler flags during development

This article was discussed on Hacker News and on reddit.

The major compilers have an enormous number of knobs. Most are highly specialized, but others are generally useful even if uncommon. For warnings, the venerable -Wall -Wextra is a good start, but circumstances improve by tweaking this warning set. This article covers high-hitting development-time options in GCC, Clang, and MSVC that ought to get more consideration.

[]

Practical libc-free threading on Linux

Suppose you’re not using a C runtime on Linux, and instead you’re programming against its system call API. It’s long-term and stable after all. Memory management and buffered I/O are easily solved, but a lot of software benefits from concurrency. It would be nice to also have thread spawning capability. This article will demonstrate a simple, practical, and robust approach to spawning and managing threads using only raw system calls. It only takes about a dozen lines of C, including a few inline assembly instructions.

[]

null program

Chris Wellons

wellons@nullprogram.com (PGP)
~skeeto/public-inbox@lists.sr.ht (view)