null program

Speculations on arenas and custom strings in C++

April 14, 2024

nullprogram.com/blog/2024/04/14/

My techniques with arena allocation and strings are oriented around C. I’m always looking for a better way, and lately I’ve been experimenting with building them using C++ features. What are the trade-offs? Are the benefits worth the costs? In this article I lay out my goals, review implementation possibilities, and discuss my findings. Following along will require familiarity with those previous two articles.

[…]

c
cpp

Protecting paths in macro expansions by extending UTF-8

March 05, 2024

nullprogram.com/blog/2024/03/05/

After a year I’ve finally came up with an elegant solution to a vexing u-config problem. The pkg-config format uses macros to generate build flags through recursive expansion. Some flags embed file system paths, but to the macro system it’s all strings. The output is also ultimately just one big string, which the receiving shell splits into fields. If a path contains spaces, or shell metacharacters, u-config must escape them so that shells treat them as part of a token. But how can u-config itself distinguish incidental spaces in paths from deliberate spaces between flags? What about other shell metacharacters in paths? My solution is to extend UTF-8 to encode metadata that survives macro expansion.

[…]

c
trick

An improved chkstk function on Windows

February 05, 2024

nullprogram.com/blog/2024/02/05/

If you’ve spent much time developing with Mingw-w64 you’ve likely seen the symbol ___chkstk_ms, perhaps in an error message. It’s a little piece of runtime provided by GCC via libgcc which ensures enough of the stack is committed for the caller’s stack frame. The “function” uses a custom ABI and is implemented in assembly. So is the subject of this article, a slightly improved implementation soon to be included in w64devkit as libchkstk (-lchkstk).

[…]

Two handy GDB breakpoint tricks

January 28, 2024

nullprogram.com/blog/2024/01/28/

Over the past couple months I’ve discovered a couple of handy tricks for working with GDB breakpoints. I figured these out on my own, and I’ve not seen either discussed elsewhere, so I really ought to share them.

[…]

c
cpp

So you want custom allocator support in your C library

December 17, 2023

nullprogram.com/blog/2023/12/17/

This article was discussed on Hacker News and on reddit.

Users of mature C libraries conventionally get to choose how memory is allocated — that is, when it cannot be avoided entirely. The C standard never laid down a convention — perhaps for the better — so each library re-invents an allocator interface. Not all are created equal, and most repeat a few fundamental mistakes. Often the interface is merely a token effort, to check off that it’s “supported” without actual consideration to its use. This article describes the critical features of a practical allocator interface, and demonstrates why they’re important.

[…]

c

My personal C coding style as of late 2023

October 08, 2023

nullprogram.com/blog/2023/10/08/

This article was discussed on Hacker News and on reddit.

This has been a ground-breaking year for my C skills, and paradigm shifts in my technique has provoked me to reconsider my habits and coding style. It’s been my largest personal style change in years, so I’ve decided to take a snapshot of its current state and my reasoning. These changes have produced significant productive and organizational benefits, so while most is certainly subjective, it likely includes a few objective improvements. I’m not saying everyone should write C this way, and when I contribute code to a project I follow their local style. This is about what works well for me.

[…]

c

A simple, arena-backed, generic dynamic array for C

October 05, 2023

nullprogram.com/blog/2023/10/05/

Previously I presented an arena-friendly hash map applicable to any programming language where one might use arena allocation. In this third article I present a generic, arena-backed dynamic array. The details are specific to C, as the most appropriate mechanism depends on the language (e.g. templates, generics). Just as in the previous two articles, the goal is to demonstrate an idea so simple that a full implementation fits on one terminal pager screen — a concept rather than a library.

[…]

c

An easy-to-implement, arena-friendly hash map

September 30, 2023

nullprogram.com/blog/2023/09/30/

My last article had tips for for arena allocation. This next article demonstrates a technique for building bespoke hash maps that compose nicely with arena allocation. In addition, they’re fast, simple, and automatically scale to any problem that could reasonably be solved with an in-memory hash map. To avoid resizing — both to better support arenas and to simplify implementation — they have slightly above average memory requirements. The design, which we’re calling a hash-trie, is the result of fruitful collaboration with NRK, whose sibling article includes benchmarks. It’s my new favorite data structure, and has proven incredibly useful. With a couple well-placed acquire/release atomics, we can even turn it into a lock-free concurrent hash map.

[…]

Arena allocator tips and tricks

September 27, 2023

nullprogram.com/blog/2023/09/27/

This article was discussed on Hacker News.

Over the past year I’ve refined my approach to arena allocation. With practice, it’s effective, simple, and fast; typically as easy to use as garbage collection but without the costs. Depending on need, an allocator can weigh just 7–25 lines of code — perfect when lacking a runtime. With the core details of my own technique settled, now is a good time to document and share lessons learned. This is certainly not the only way to approach arena allocation, but these are practices I’ve worked out to simplify programs and reduce mistakes.

[…]

c

How to link identical function names from different DLLs

August 27, 2023

nullprogram.com/blog/2023/08/27/

For the typical DLL function call you declare the function prototype (via header file), you inform the link editor (ld, link) that the DLL exports a symbol with that name (import library), it matches the declared name with this export, and it becomes an import in your program’s import table. What happens when two different DLLs export the same symbol? The link editor will pick the first found. But what if you want to use both exports? If they have the same name, how could program or link editor distinguish them? In this article I’ll demonstrate a technique to resolve this by creating a program which links with and directly uses two different C runtimes (CRTs) simultaneously.

[…]

c
win32