Today I got an e-mail asking about a previous article on creating
threads on Linux using raw system calls (specifically x86-64).
The questioner was looking to use threads in a program without any
libc dependency. However, he was concerned about checking for mmap(2)
errors when allocating the thread’s stack. The mmap(2) man
page says it returns -1 (a.k.a. MAP_FAILED) on error and sets
errno. But how do you check errno without libc?
As a reminder, here’s what the (unoptimized) assembly looks like.
As usual, the system call return value is in rax, which becomes the
return value for stack_create(). Again, its C prototype would look
like this:
If you were to, say, intentionally botch the arguments to force an
error, you might notice that the system call isn’t returning -1, but
other negative values. What gives?
The trick is that errno is a C concept. That’s why it’s documented
as errno(3) — manual section 3 is for library functions, not system
calls. Just think about
how messy this thing is: it’s a thread-local value living in the
application’s address space. The kernel rightfully has nothing to do
with it. Instead, the mmap(2) wrapper in libc assigns errno (if
needed) after the system call returns. This is how all system calls
through libc work, even when made via the syscall(2) wrapper function.
So how does the kernel report the error? It’s an old-fashioned return
value. If you have any doubts, take it straight from the horse’s
mouth: mm/mmap.c:do_mmap(). Here’s a sample of its return statements:
if (!len)
        return -EINVAL;

/* Careful about overflows.. */
len = PAGE_ALIGN(len);
if (!len)
        return -ENOMEM;

/* offset overflow? */
if ((pgoff + (len >> PAGE_SHIFT)) < pgoff)
        return -EOVERFLOW;

/* Too many mappings? */
if (mm->map_count > sysctl_max_map_count)
        return -ENOMEM;
It’s returning the negated error number. Simple enough.
If you think about it a moment, you might notice a complication: This
is a form of in-band signaling. On success, mmap(2) returns a memory
address. All those negative error numbers are potentially addresses
that a caller might want to map. How can we tell the difference?
1) None of the possible error numbers align on a page boundary, so
they’re not actually valid return values. NULL does lie on a page
boundary, which is one reason why it’s not used as an error return
value for mmap(2). The other is that you might actually want to map
NULL, for better or worse.
2) Those low negative values lie in a region of virtual memory
reserved exclusively for the kernel (sometimes called “low
memory”). On x86-64, any address with the most significant
bit set (i.e. the sign bit of a signed integer) is one of these
addresses. Processes aren’t allowed to map these addresses, and so
mmap(2) will never return such a value on success.
So what’s a clean, safe way to go about checking for error values?
It’s a lot easier to read musl than glibc, so let’s take a
peek at how musl does it in its own mmap: src/mman/mmap.c.
Bingo. As documented, if the value falls within that “high” (unsigned)
range of negative values for any system call, it’s an error number.
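In C, the same check can be sketched like this (a minimal illustration of the musl-style range test, not musl’s actual code; the function name is my own):

```c
#include <stdint.h>

/* Classify a raw x86-64 syscall return value. Error returns are
 * -1 through -4095, which as unsigned values occupy the very top of
 * the address space -- a range the kernel never hands out to a
 * process. Returns the positive errno value, or 0 on success. */
static long
syscall_errno(unsigned long ret)
{
    if (ret > -4096UL)
        return -(long)ret;
    return 0;
}
```

For the anonymous mmap(2) in question, a branch comparing the result against -12 (ENOMEM) is just a special case of this range test.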
Getting back to the original question, we could employ this same check
in the assembly code. However, since this is an anonymous memory map
with a kernel-selected address, there’s only one possible error:
ENOMEM (12). This error happens if the maximum number of memory maps
has been reached, or if there’s no contiguous region available for the
4MB stack. The check will only need to test the result against -12.
One of my favorite data processing tools is jq, a command line
JSON processor. It’s essentially awk for JSON. You supply a small
script composed of transformations and filters, and jq evaluates
the filters on each input JSON object, producing zero or more outputs
per input. My most common use case is converting JSON data into CSV
with jq’s @csv filter, which is then fed into SQLite (another
favorite) for analysis.
Setting that aside for a moment, I said before that an input could
produce zero or more outputs. The zero is when it gets filtered out,
and one output is the obvious case. Some filters produce multiple
outputs from a single input. There are a number of situations when
this happens, but the important one is the range filter. For example:
$ echo 6 | jq 'range(1; .)'
The . is the input object, and range is producing one output for
every number between 1 and . (exclusive). If an expression has
multiple filters producing multiple outputs, under some circumstances
jq will produce a Cartesian product: every combination is generated.
So if my goal is the Mandelbrot set, I can use this to generate the
complex plane, over which I will run the recurrence. For input, I’ll
use a single object with the keys x, dx, y, and dy, defining
the domain and range of the image. A reasonable input might be:
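The original listing doesn’t survive in this excerpt; an object of that shape, with illustrative values of my own choosing (framing the usual view of the set over an 80×40 character grid), would look something like:

```json
{"x": -2.5, "dx": 0.0375, "y": -1.0, "dy": 0.05}
```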
It iterates to a maximum depth of 94: the number of printable ASCII
characters, except space. The final two filters convert the output into
ASCII characters, and the -j and -r options tell jq to produce
joined, raw output. So, if you have jq installed and an exactly
80-character wide terminal, go ahead and run that script. You should
see something like this:
Tweaking the input parameters, it scales up nicely:
As demonstrated by the GIF, it’s very slow compared to more
reasonable implementations, but I wouldn’t expect otherwise. It
could be turned into a zoom animation just by feeding it more
input objects with different parameters. It will produce one full
“image” per input. Capturing an animation is left as an exercise for
the reader.
I recently ran into a problem where I needed to modify bytes at the
beginning of an existing zlib stream. My program creates a
file in a format I do not control, and the file format has a header
indicating the total, uncompressed data size, followed immediately by
the data. The tricky part is that the header and data are zlib
compressed together, and I don’t know how much data there will be
until I’ve collected it all. Sometimes it’s many gigabytes. I don’t
know how to fill out the header when I start, and I can’t rewrite it
when I’m done since it’s compressed in the zlib stream … or so I
thought.
My original solution was to not compress anything until it had gathered
the entirety of the data. The input would get concatenated into a huge
buffer, then finally compressed and written out. It’s not ideal,
because the program uses a lot more memory than it theoretically could,
especially if the data is highly compressible. It would be far better
to compress the data as it arrives and somehow update the header
afterward.
My first thought was to ask zlib to leave the header uncompressed,
then enable compression (deflateParams()) for the data. I’d work out
the magic offset and overwrite the uncompressed header bytes once I’m
done. There are two major issues with this, and I’ll address each:
zlib includes a checksum (adler32) at the end of the
data, and editing the stream would cause a mismatch. This is fairly
easy to correct thanks to adler32’s properties.
zlib is an LZ77-family compressor and compression comes from
back-references into past (and sometimes future) bytes of
decompressed output. Up to 32kB of data following the header could
reference bytes in the header as a dictionary. I would need to ask
zlib not to reference these bytes. Fortunately the zlib API is
intentionally designed for this, though for different purposes.
Fixing the checksum
Ignoring the second problem for a moment, I could fix the checksum by
computing it myself. When I overwrite my uncompressed header bytes, I
could also overwrite the checksum at the end of the compressed stream.
For illustration, here’s a simple example implementation of adler32:
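The original listing isn’t shown in this excerpt, but a straightforward version looks like this (a sketch mirroring the definition; zlib’s real adler32() differs only in performance tricks like deferred modulo):

```c
#include <stddef.h>
#include <stdint.h>

#define BASE 65521UL /* largest prime smaller than 65536 */

/* Two running sums over the input: a (plus each byte) and b (plus
 * each intermediate a), packed into one 32-bit value as (b << 16 | a).
 * The conventional starting value is 1. */
static uint32_t
example_adler32(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffff;
    uint32_t b = (adler >> 16) & 0xffff;
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % BASE;
        b = (b + a) % BASE;
    }
    return (b << 16) | a;
}
```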
If you think about this for a moment, you may notice this puts me back
at square one. If I don’t know the header, then I don’t know the
checksum value at the end of the header, going into the data buffer.
I’d need to buffer all the data to compute the checksum. Fortunately
adler32 has the nice property that two checksums can be concatenated
as if they were one long stream. In a malicious context this is
known as a length extension attack, but it’s a real benefit here.
It’s like the zlib authors anticipated my needs, because the zlib
library has a function exactly for this:
I just have to keep track of the data checksum, data_adler, and I can
compute the proper checksum later.
uint64_t total = 0;
uint32_t data_adler = adler32(0, 0, 0); // initial value
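zlib’s actual entry point for this is adler32_combine(adler1, adler2, len2). Its underlying arithmetic is small enough to sketch from scratch (my own illustration, not zlib’s implementation): the concatenated checksum needs only the two partial checksums and the length of the second stream.

```c
#include <stdint.h>

#define ADLER_BASE 65521UL /* largest prime smaller than 65536 */

/* Checksum of stream1 followed by stream2, given adler1 (over
 * stream1), adler2 (over stream2, started from the usual initial
 * value of 1), and len2 (length of stream2). Derivation: stream2's
 * low sum began at 1 instead of a1, and each of its len2 steps fed
 * that offset into the high sum. */
static uint32_t
example_adler32_combine(uint32_t adler1, uint32_t adler2, uint64_t len2)
{
    uint64_t a1 = adler1 & 0xffff;
    uint64_t b1 = (adler1 >> 16) & 0xffff;
    uint64_t a2 = adler2 & 0xffff;
    uint64_t b2 = (adler2 >> 16) & 0xffff;
    uint64_t rem = len2 % ADLER_BASE;
    uint64_t a = (a1 + a2 + ADLER_BASE - 1) % ADLER_BASE;
    uint64_t b = (b1 + b2 + rem * ((a1 + ADLER_BASE - 1) % ADLER_BASE))
                 % ADLER_BASE;
    return (uint32_t)((b << 16) | a);
}
```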
This part is more complicated and it helps to have some familiarity
with zlib. Every time zlib is asked to compress data, it’s given a
flush parameter. Under normal operation, this value is always
Z_NO_FLUSH until the end of the stream, in which case it’s finalized
with Z_FINISH. Other flushing options force it to emit data sooner
at the cost of reduced compression ratio. This would primarily be used
to eliminate output latency on interactive streams (e.g. a compressed
network protocol).
The necessary flush option for this situation is Z_FULL_FLUSH. It
forces out all output data and resets the dictionary: a fence.
Future inputs cannot reference anything before a full flush. Since
the header is uncompressed, it will not reference itself either.
Ignoring the checksum problem, I can safely modify these bytes.
Putting it all together
To fully demonstrate all of this, I’ve put together an example using
one of my favorite image formats, Netpbm P6.
In the P6 format, the image header is an ASCII description of the
image’s dimensions followed immediately by raw pixel data.
It’s a bit contrived, but it’s the project I used to work it all out.
The demo reads arbitrary raw byte data on standard input and uses it
to produce a zlib-compressed PPM file on standard output. It doesn’t
know the size of the input ahead of time, nor does it naively buffer
it all. There’s no dynamic allocation (except for what zlib does
internally), but the program can process arbitrarily large input. The
only requirement is that standard output is seekable. Using the
technique described above, it patches the header within the zlib
stream with the final image dimensions after the input has been fully
consumed.
If you’re on a Debian system, you can use zlib-flate to decompress
raw zlib streams (gzip wraps zlib, but can’t handle raw zlib streams).
Alternatively
your system’s openssl program may have zlib support. Here’s running
it on itself as input. Remember, you can’t pipe it into zlib-flate
because the output needs to be seekable in order to write the header.
Unfortunately due to the efficiency-mindedness of zlib, its use
requires careful bookkeeping that’s easy to get wrong. It’s a little
machine that at each step needs to be either fed more input or its
output buffer cleared. Even with all the error checking stripped away,
it’s still too much to go over in full here, but I’ll summarize the
key steps.
First I process an empty buffer with compression disabled. The output
buffer will be discarded, so the input buffer could be left uninitialized,
but I don’t want to upset anyone. All I need is the output
size, which I use to seek over the to-be-written header. I use
Z_FULL_FLUSH as described, and there’s no loop because I presume my
output buffer is easily big enough for this.
I won’t include it in this article, but what follows is a standard
zlib compression loop, consuming all the input data. There’s one key
difference compared to a normal zlib compression loop: when the input
is exhausted, instead of Z_FINISH I use Z_SYNC_FLUSH to force
everything out. The problem with Z_FINISH is that it will write the
checksum, but we’re not ready for that.
With all the input processed, it’s time to go back to rewrite the
header. Rather than mess around with magic byte offsets, I start a
second, temporary zlib stream and do the Z_FULL_FLUSH like before,
but this time with the real header. In deciding the header size, I
reserved 6 characters for the width and 10 characters for the height.
The C standard library includes a qsort() function for sorting
arbitrary buffers given a comparator function. The name comes from its
original Unix implementation, “quicker sort,” a variation of
the well-known quicksort algorithm. The C standard doesn’t specify an
algorithm, except to say that it may be unstable (C99 §7.20.5¶4) —
equal elements have an unspecified order. As such, different C
libraries use different algorithms, and even when using the same
algorithm they make different implementation tradeoffs.
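For reference, the interface takes the buffer, element count, element size, and a comparator (standard C; nothing here is specific to any implementation):

```c
#include <stdlib.h>

/* Order ints ascending. The (a > b) - (a < b) idiom avoids the
 * signed overflow that a plain "a - b" can cause. */
static int
intcmp(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}
```

A call then looks like qsort(values, n, sizeof(values[0]), intcmp).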
I added a drawing routine to a comparison function to see what the
sort function was doing for different C libraries. Every time it’s
called for a comparison, it writes out a snapshot of the array as a
Netpbm PPM image. It’s easy to turn concatenated PPMs into a GIF or
video. Here’s my code if you want to try it yourself:
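That code isn’t reproduced in this excerpt, but the core trick is small enough to sketch (the names, image layout, and sizes here are my own simplifications):

```c
#include <stdio.h>
#include <stdlib.h>

#define N 16  /* array length; the real visualization uses many more */

static int array[N];
static FILE *frames;  /* destination for concatenated P6 images */

/* Write one N x 1 grayscale P6 frame: brighter = smaller value.
 * Assumes array values are in [0, N). */
static void
snapshot(void)
{
    fprintf(frames, "P6\n%d 1\n255\n", N);
    for (int i = 0; i < N; i++) {
        unsigned char s = (unsigned char)(255 - array[i] * 255 / N);
        unsigned char rgb[3] = {s, s, s};
        fwrite(rgb, sizeof(rgb), 1, frames);
    }
}

/* A normal ascending comparator, plus one frame per invocation. */
static int
cmp(const void *pa, const void *pb)
{
    snapshot();
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}
```

Shuffle array, point frames at a file, and qsort(array, N, sizeof(array[0]), cmp) leaves behind one PPM per comparison, ready for conversion to a GIF or video.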
Adjust the parameters at the top to taste. Rather than call rand() in
the standard library, I included xorshift64star() with a hardcoded
seed so that the array will be shuffled exactly the same across all
platforms. This makes for a better comparison.
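For reference, xorshift64* is only a few lines (this is the standard formulation; the exact seeding and state handling in my program may differ):

```c
#include <stdint.h>

/* xorshift64*: a tiny, fast PRNG. Seed with any nonzero value; the
 * same seed yields the same sequence on every platform, which is
 * exactly what a cross-libc comparison needs. */
static uint64_t
xorshift64star(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *state = x;
    return x * UINT64_C(2685821657736338717);
}
```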
To get an optimized GIF on unix-like systems, run it like so.
(Microsoft’s UCRT currently has serious bugs with pipes, so it
was run differently in that case.)
The number of animation frames reflects the efficiency of the sort,
but this isn’t really a benchmark. The input array is fully shuffled,
and real data often is not. For a benchmark, have a look at a libc
qsort() shootout of sorts instead.
To help you follow along, clicking on any animation will restart it.
Sorted in 307 frames. glibc prefers to use mergesort, which,
unlike quicksort, isn’t an in-place algorithm, so it has to allocate
memory. That allocation could fail for huge arrays, and, since qsort()
can’t fail, it uses quicksort as a backup. You can really see the
mergesort in action: changes are made that we cannot see until later,
when it’s copied back into the original array.
Sorted in 503 frames. diet libc is an alternative C
standard library for Linux. It’s optimized for size, which shows
through its slower performance. It looks like a quicksort that always
chooses the last element as the pivot.
Sorted in 637 frames. musl libc is another alternative C
standard library for Linux. It’s my personal preference when I
statically link Linux binaries. Its qsort() looks a lot like a
heapsort, and with some research I see it’s actually smoothsort, a
heapsort variant.
Sorted in 354 frames. I ran it on both OpenBSD and FreeBSD with
identical results, so, unsurprisingly, they share an implementation.
It’s quicksort, and what’s neat about it is at the beginning you can
see it searching for a median for use as the pivot. This helps avoid
the O(n^2) worst case.
BSD also includes a mergesort() with the same prototype, except with
an int return for reporting failures. This one sorted in 247
frames. Like glibc before, there’s some behind-the-scenes work that isn’t
captured. But even more, notice how the markers disappear during the
merge? It’s running the comparator against copies, stored outside the
original array. Sneaky!
Again, BSD also includes heapsort(), so I ran that too. It sorted in
418 frames. It definitely looks like a heapsort, and the worse
performance is similar to musl. It seems heapsort is a poor fit for
this task.
It turns out Cygwin borrowed its qsort() from BSD. It’s pixel
identical to the above. I hadn’t noticed until I looked at the frame
counts.
MSVCRT.DLL (MinGW) and UCRT (Visual Studio)
MinGW builds against MSVCRT.DLL, found on every Windows system despite
its unofficial status. Until recently Microsoft didn’t
include a C standard library as part of the OS, but that changed with
their Universal CRT (UCRT) announcement. I thought I’d try both.
Turns out they borrowed their old qsort() for the UCRT, and the result
is the same: sorted in 417 frames. It chooses a pivot from the
median of the ends and the middle, swaps the pivot to the middle, then
partitions. Looking to the middle for the pivot makes sorting
pre-sorted arrays much more efficient.
Finally I ran it against Pelles C, a C compiler for
Windows. It sorted in 463 frames. I can’t find any information
about it, but it looks like some sort of hybrid between quicksort and
insertion sort. Like BSD qsort(), it finds a good median for the
pivot, partitions the elements, and if a partition is small enough, it
switches to insertion sort. This should behave well on mostly-sorted
arrays, but poorly on well-shuffled arrays (like this one).
That’s everything that was readily accessible to me. If you can run it
against something new, I’m certainly interested in seeing more results.
I recently put together a little game memory cheat tool called
MemDig. It can find the address of a particular game value
(score, lives, gold, etc.) after being given that value at different
points in time. With the address, it can then modify that value to
whatever is desired.
I’ve been using tools like this going back 20 years, but I never tried
to write one myself until now. There are many memory cheat tools to
pick from these days, the most prominent being Cheat Engine.
These tools use the platform’s debugging API, so of course any good
debugger could do the same thing, though a debugger won’t be
specialized appropriately (e.g. locating the particular address and
locking its value).
My motivation was bypassing an in-app purchase in a single player
Windows game. I wanted to convince the game I had made the purchase
when, in fact, I hadn’t. Once I had it working successfully, I ported
MemDig to Linux since I thought it would be interesting to compare.
I’ll start with Windows for this article.
Only three Win32 functions are needed, and you could almost guess at
how it works.
It’s very straightforward and, for this purpose, is probably the
simplest API for any platform (see update).
As you probably guessed, you first need to open the process, given its
process ID (integer). You’ll need to select the desired access rights
as a bit set. To read memory, you need the PROCESS_VM_READ and
PROCESS_QUERY_INFORMATION rights. To write memory, you need the
PROCESS_VM_WRITE and PROCESS_VM_OPERATION rights. Alternatively
you could just ask for all rights with PROCESS_ALL_ACCESS, but I
prefer to be precise.
void *addr; // target process address
Don’t forget to check the return value and verify written. Finally,
don’t forget to close it when you’re done.
That’s all there is to it. For the full cheat tool you’d need to find
the mapped regions of memory, via VirtualQueryEx. It’s not
as simple, but I’ll leave that for another article.
Unfortunately there’s no standard, cross-platform debugging API for
unix-like systems. Most have a ptrace() system call, though each works
a little differently. Note that ptrace() is not part of POSIX, but
appeared in System V Release 4 (SVr4) and BSD, then copied elsewhere.
The following will all be specific to Linux, though the procedure is
similar on other unix-likes.
In typical Linux fashion, if it involves other processes, you use the
standard file API on the /proc filesystem. Each process has a
directory under /proc named as its process ID. In this directory is a
virtual file called “mem”, which is a file view of that process’
entire address space, including unmapped regions.
The catch is that while you can open this file, you can’t actually
read or write that file without attaching to the process as a
debugger. You’ll just get EIO errors. To attach, use ptrace() with
PTRACE_ATTACH. This asynchronously delivers a SIGSTOP signal to
the target, which has to be waited on with waitpid().
You could select the target address with lseek(), but it’s
cleaner and more efficient just to do it all in one system call with
pread() and pwrite(). I’ve left out the error
checking, but the return value of each function should be checked:
ptrace(PTRACE_ATTACH, pid, 0, 0);
waitpid(pid, NULL, 0);
off_t addr = ...; // target process address
The process will (and must) be stopped during this procedure, so do
your reads/writes quickly and get out. The kernel will deliver the
writes to the other process’ virtual memory.
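The ptrace() attach is mandatory for other processes, but the pread()/pwrite() mechanics can be demonstrated self-contained: a process may open its own /proc/self/mem without attaching. This sketch (names are mine) pokes a new value into an address the way the cheat tool would for a target:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Overwrite len bytes at addr in this process's own address space
 * through /proc/self/mem. For a foreign pid you'd open
 * "/proc/<pid>/mem" instead and PTRACE_ATTACH first. The target
 * address becomes the file offset. Returns 0 on success. */
static int
poke(void *addr, const void *buf, size_t len)
{
    int fd = open("/proc/self/mem", O_RDWR);
    if (fd == -1)
        return -1;
    ssize_t r = pwrite(fd, buf, len, (off_t)(uintptr_t)addr);
    close(fd);
    return r == (ssize_t)len ? 0 : -1;
}
```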
Like before, don’t forget to close.
To find the mapped regions in the real cheat tool, you would read and
parse the virtual text file /proc/pid/maps. I don’t know if I’d call
this stringly-typed method elegant — the kernel converts the data into
string form and the caller immediately converts it right back — but
that’s the official API.
Update: Konstantin Khlebnikov has pointed out the
process_vm_readv() and process_vm_writev()
system calls, available since Linux 3.2 (January 2012) and glibc 2.15
(March 2012). These system calls do not require ptrace(), nor does the
remote process need to be stopped. They’re equivalent to
ReadProcessMemory() and WriteProcessMemory(), except there’s no
requirement to first “open” the process.
Over the past two years I’ve been mentoring a high school student, Dan
Kelly, in software development through my workplace mentoring
program. It’s been one of the most rewarding experiences in
my life, far beyond my initial expectations. It didn’t just go well,
it went extraordinarily well. One of my co-workers described the
results as, “the single most successful technical investment I’ve ever
seen made in a person.” Last week the mentorship effectively ended as
he became a college freshman. While it’s fresh in my mind, I’d like to
reflect on how it went and why this arrangement worked so particularly
well for both of us.
Students come into work for a few hours at a time a few days each week
throughout the school year, typically after school. They’re supplied
with their own computer in a mentor-arranged work space. In my case,
Dan sat at a smaller makeshift desk in my office. The schedule is
informal and we were frequently making adjustments as needed.
An indispensable feature is that the students are unpaid. Based on
my observations, the primary reason many other mentorships aren’t
nearly as effective is a failure to correctly leverage this. Some
mentors put students directly to work on an existing, funded project.
This is a mistake. Real world, practical projects are poor learning
environments for beginners. The complexity inhibits development of
the fundamentals, and, from a position of no experience, the
student will have little ability to influence the project, stifling
motivation.
If being unpaid sounds unfair, remember that these students are coming
in with essentially no programming experience. For several months,
students are a time sink. Maybe they fiddled around with Python on their
own, but it probably doesn’t really amount to anything meaningful yet.
Being unpaid means students can be put to work on anything, even if
unfunded — even if it’s only for the student’s own benefit. The very
first thing I did with Dan was to walk him through an install of
Debian (wiping the near-useless, corporate Windows image). I wanted
him fully in control of, and responsible for, his own development
environment. I never sat at his keyboard, and all exercises took place
at his computer.
From here I took a bit of an unconventional approach: I decided he was
going to learn from the bottom up, starting with C. There would be no
Fisher-Price-esque graphical programming language (yes, some mentors
actually do this), not even an easier, education-friendly, but
production-ready language like Python. I wanted him to internalize
pointers before trusting him with real work. I showed him how to
invoke gcc, the various types of files involved, how to invoke the
linker, and lent him my hard copy of first edition K&R.
Similarly, I wasn’t going to start him off with some toy text editor.
Like with C, he was going to learn real production tools from the
start. I gave the Emacs run-down and started him on the tutorial
(C-h t). If he changed his mind later after getting more experience,
wanting to use something different, that would be fine.
Once he got the hang of everything so far, I taught him Git and
Magit. Finally we could collaborate.
The first three months were spent on small exercises. He’d learn a
little bit from K&R, describe to me what he learned, and I’d come up
with related exercises, not unlike those on DailyProgrammer, to
cement the concepts. I’d also translate concepts into modern C (e.g.
C99 idioms).
With K&R complete, he was ready to go beyond simple exercises.
Unfortunately, I wasn’t able to involve him in my own work at the
time. His primary interests included graphics, so we decided on a
multi-threaded, networked multi-processor raytracer.
It was a lot more educational for us both than I expected. I spent a
significant amount of personal time learning graphics well enough to
keep ahead: color spaces, blending, anti-aliasing, brushing up on
linear algebra, optics, lighting models, modeling formats, Blender,
etc. In a few cases I simply learned it from him. It reminds me of a
passage from The Moon is a Harsh Mistress (the namesake of this
I liked Prof. He would teach anything. Wouldn’t matter that he knew
nothing about it; if pupil wanted it, he would smile and set a
price, locate materials, stay a few lessons ahead. Or barely even with,
if
he found it tough—never pretended to know more than he did. Took
algebra from him and by time we reached cubics I corrected his probs
as often as he did mine—but he charged into each lesson gaily.
I started electronics under him, soon was teaching him. So he
stopped charging and we went along together until he dug up an
engineer willing to daylight for extra money—whereupon we both paid
new teacher and Prof tried to stick with me, thumb-fingered and
slow, but happy to be stretching his mind.
Since this was first and foremost an educational project, I decided we
would use no libraries other than the (POSIX) C standard library. He
would know how nearly every aspect of the program worked. The input
(textures) and output formats were Netpbm PPMs, BMP files, and
YUV4MPEG2 (video), each of which is very easy to read and
write. It loaded 3D models in the easy-to-parse Wavefront .obj
format. It also supported a text rendering overlay, initially
using our own font format and later with fonts in the Glyph Bitmap
Distribution Format — again, easy to parse and use. The text overlay
made it possible to produce demo videos without any additional editing
(see the above video).
After a year on the raytracer, he had implemented most of what he
wanted and I had an opportunity to involve him in my funded research.
By this point he was very capable, quickly paying back the entire
investment.
Together we made rapid progress — much more than I could alone. I
can’t go into the specifics, but much of his work was built on lessons
learned from the raytracer, including an OpenGL display and
SIMD-optimized physics engine. I also taught him x86 assembly,
which he applied to co-author a paper, ROP Gadget Prevalence and
Survival under Compiler-based Binary Diversification Schemes (soon to
be published, wherein this will become a link).
To reiterate, an important part of this entire journey was the
influence he had over his own work. He had say on the direction of
each project. Until he started earning college intern pay
(post-graduation), I had no ability to make him do anything he didn’t
want to do. I could only rely on his motivation. Fortunately what
motivates him is what also motivates me, so to find the framing for
that motivation I only had to imagine myself in his shoes.
A Comment on Respect
An important aspect I hadn’t noticed until the second year was
respect. Most of Dan’s interactions with other co-workers were very
respectful. They listened to what he had to say and treated it with
the same respect as they would a regular, full-time co-worker. I
can’t emphasize enough how valuable this is for a student.
I bring this up because there were a handful of interactions that
weren’t so respectful. A few individuals, when discovering his status,
made jokes (“Hey, where’s my coffee?”) or wouldn’t take his comments
seriously. Please don’t be one of these people.
What comes next?
In many ways, Dan is starting college in a stronger position than I
was finishing college. The mentorship provided a ton of vocational,
practical experience that doesn’t normally happen in a college course.
However, there’s plenty of computer science theory that I’m not so
great at teaching. For example, he got hands-on experience with
practical data structures, but only picked up a shallow understanding
of their analysis (Big O, etc.). College courses will fill those gaps,
and they will be learned more easily with a thorough practical
intuition already in place.
As he moves on, I’ll be starting all over again with a new student.
The end of the month marks Elfeed’s third birthday. Surprising
to nobody, it’s also been three years of heavy, daily use by me. While
I’ve used Elfeed concurrently on a number of different machines over
this period, I’ve managed to keep an Elfeed database index
with a lineage going all the way back to the initial development
stages, before the announcement. It’s a large, organically-grown
database that serves as a daily performance stress test. Hopefully
this means I’m one of the first people to have trouble if an invisible
threshold is ever exceeded.
I’m also the sort of person who gets excited when I come across an
interesting dataset, and I have this gem sitting right in front of me.
So a couple of days ago I pushed a new Elfeed function,
elfeed-csv-export, which exports a database index into three CSV
files. These are intended to serve as three tables in a SQL database,
exposing the database to interesting relational queries and joins.
Entry content (HTML, etc.) has always been considered volatile, so
this is not exported. The export function isn’t interactive (yet?), so
if you want to generate your own you’ll need to (require
'elfeed-csv) and evaluate it yourself.
All the source code for performing the analysis below on your own
database can be found here:
Web authors are notoriously awful at picking actually-unique entry
IDs, even when using the smarter option, Atom. I still simply
don’t trust that entry IDs are unique, so, as usual, I’ve qualified
them by their source feed URL, hence the primary key on both columns.
At this point I wish I had collected a lot more information. If I were
to start fresh today, Elfeed’s database schema would not only fully
match Atom’s schema, but also exceed it with additional logging:
When was each entry actually fetched?
How did each entry change since the last fetch?
When and for what reason did a feed fetch fail?
When did an entry stop appearing in a feed?
How long did fetching take?
How long did parsing take?
Which computer (hostname) performed the fetch?
What interesting HTTP headers were included?
Even if not kept for archival, how large was the content?
I may start tracking some of these. If I don’t, I’ll be kicking myself
three years from now when I look at this again.
A look at my index
So just how big is my index? It’s 25MB uncompressed, 2.5MB
compressed. I currently follow 117 feeds, but my index includes
43,821 entries from 309 feeds. These entries are marked with
53,360 tags from a set of 35 unique tags. Some of these datapoints
are the result of temporarily debugging Elfeed issues and don’t
represent content that I actually follow. I’m more careful these days
to test in a temporary database so as to avoid contamination. Some are
duplicates due to feeds changing URLs over the years. Some are
artifacts from old bugs. This all represents a bit of noise, but
should be negligible. During my analysis I noticed some of these
anomalies and took a moment to clean up obviously bogus data (weird
dates, etc.), all by adjusting tags.
The first thing I wanted to know is the weekday frequency. A number of
times I’ve blown entire Sundays working on Elfeed, and, as if to
frustrate my testing, it’s not unusual for several hours to pass
between new entries on Sundays. Is this just my perception or are
Sundays really that slow?
Here’s my query. I’m using SQLite’s strftime to shift the
result into my local time zone, Eastern Time. This time zone is the
source, or close to the source, of a large amount of the content. This
also automatically accounts for daylight savings time, which can’t be
done with a simple divide and subtract.
The most frequent tag (13,666 appearances) is “youtube”, which marks
every YouTube video, and I’ll use gnuplot to visualize it. The input
“file” is actually a command since gnuplot is poor at filtering data
itself, especially for histograms.
plot '< grep ^youtube, weekdays.csv' using 2:3 with boxes
Wow, things do quiet down dramatically on weekends! From the
glass-half-full perspective, this gives me a chance to catch up when I
inevitably fall behind on these videos during the week.
The same is basically true for other types of content, including
“comic” (12,465 entries) and “blog” (7,505 entries).
However, “emacs” (2,404 entries) is a different story. It doesn’t slow
down on the weekend, but Emacs users sure love to talk about Emacs on
Mondays. In my own index, this spike largely comes from Planet
Emacsen. Initially I thought maybe this was an artifact of
Planet Emacsen’s date handling — i.e. perhaps it does a big fetch on
Mondays and groups up the dates — but I double checked: they pass the
date directly through from the original articles.
Conclusion: Emacs users love Mondays. Or maybe they hate Mondays and
talk about Emacs as an escape.
I can reuse the same query to look at different time scales. When
during the day do entries appear? Adjusting the time zone here becomes
a lot more important.
Emacs bloggers tend to follow a nice Eastern Time sleeping schedule.
(I wonder how Vim bloggers compare, since, as an Emacs user, I
naturally assume Vim users’ schedules are as undisciplined as their
bathing habits.) However, this might also be the prolific
Irreal breaking the curve.
The YouTube channels I follow are a bit more erratic, but there’s
still a big drop in the early morning and a spike in the early
afternoon. It’s unclear if the timestamp published in the feed is the
upload time or the publication time. This would make a difference in
the result (e.g. overnight video uploads).
December is a big drop across all tags, probably for the holidays.
Both “comic” and “blog” also have an interesting drop in August. For
brevity, I’ll only show one. This might be partially due to my not
waiting until the end of this month for this analysis, since there are
only 2.5 Augusts in my 3-year dataset.
Unfortunately the timestamp is the only direct numerical quantity in
the data. So far I’ve been binning data points and counting to get a
second numerical quantity. Everything else is text, so I’ll need to
get more creative to find other interesting relationships.
So let’s have a look at the lengths of entry titles.
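With hypothetical table and column names, one way to bin them:

```sql
-- Hypothetical sketch: histogram of entry title lengths.
SELECT length(title) AS length, count(*) AS count
FROM entries
GROUP BY length
ORDER BY length;
```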
Emacs article titles follow a nice distribution. You can tell these
are programmers because so many titles are exactly 32 characters long.
Picking this number is such a natural instinct that we aren’t even
aware of it. Or maybe all their database schemas have VARCHAR(32) columns.
This is the title length versus time of day (not binned). Each point
is one of the 53,360 posts.
set style fill transparent solid 0.25 noborder
set style circle radius 0.04
plot 'length-vs-daytime.csv' using 1:2 with circles
(This is a good one to follow through to the full size image.)
Again, all Eastern Time since I’m self-centered like that. Vertical
lines are authors rounding their post dates to the hour. Horizontal
lines are the length spikes from above, such as the line of entries at
title length 10 in the evening (Dwarf Fortress blog). There’s a
mid-day cloud of entries of various title lengths, with the shortest
title cloud around mid-morning. That’s probably when many of the
webcomics come out.
Additional analysis could look further at textual content, beyond
simply length, in some quantitative way (n-grams? soundex?). But
mostly I really need to keep track of more data!
Conventionally, a program that creates an output file will delete its
incomplete output should an error occur while writing the file. It’s
risky to leave behind a file that the user may reasonably mistake
for a valid one. They might not have noticed the error.
For example, compression programs such as gzip, bzip2, and xz when
given a compressed file as an argument will create a new file with the
compression extension removed. They write to this file as the
compressed input is being processed. If the compressed stream contains
an error in the middle, the partially-completed output is removed.
There are exceptions of course, such as programs that download files
over a network. The partial result has value, especially if the
transfer can be continued from where it left off. The
convention is to append another extension, such as “.part”, to
indicate a partial output.
The straightforward solution is to always delete the file as part of
error handling. A non-interactive program would report the error on
standard error, delete the file, and exit with an error code. However,
there are at least two situations where error handling would be unable
to operate: unhandled signals (usually including a segmentation fault)
and power failures. A partial or corrupted output file will be left
behind, possibly looking like a valid file.
A common, more complex approach is to name the file differently from
its final name while being written. If written successfully, the
completed file is renamed into place. This is already required for
durable replacement, so it’s basically free for many
applications. In the worst case, where the program is unable to clean
up, the obviously incomplete file is left behind, only wasting space.
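A minimal sketch of that rename-into-place pattern (function and file names here are illustrative, and the fsync() calls needed for real durable replacement are omitted for brevity):

```c
#include <stdio.h>

/* Write data under a temporary name, then rename it into place on
 * success. On any failure, remove the temporary file so no partial
 * output is left behind under the final name. */
int
save_file(const char *name, const char *tmpname, const char *data)
{
    FILE *f = fopen(tmpname, "w");
    if (!f)
        return -1;
    int ok = fputs(data, f) != EOF;
    ok &= fclose(f) == 0;           /* fclose() can fail, too */
    if (!ok || rename(tmpname, name) != 0) {
        remove(tmpname);            /* never leave partial output */
        return -1;
    }
    return 0;
}
```

Until the rename, the only thing a crash can leave behind is the obviously-named temporary file.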
Looking to be more robust, I had the following misguided idea: Rely
completely on the operating system to perform cleanup in the case of a
failure. Initially the file would be configured to be automatically
deleted when the final handle is closed. This takes care of all
abnormal exits, and possibly even power failures. The program can just
exit on error without deleting the file. Once written successfully,
the automatic-delete indicator is cleared so that the file survives.
The target application for this technique supports both Linux and
Windows, so I would need to figure it out for both systems. On
Windows, there’s the flag FILE_FLAG_DELETE_ON_CLOSE. I’d just need
to find a way to clear it. On POSIX, the file would be unlinked while
being written, and linked into the filesystem on success. The latter
turns out to be a lot harder than I expected.
Solution for Windows
I’ll start with Windows since the technique actually works fairly well
here — ignoring the usual, dumb Win32 filesystem caveats. This is a
little surprising, since it’s usually Win32 that makes these things
far more difficult than they should be.
The primary Win32 function for opening and creating files is
CreateFile. There are many options, but the key is
FILE_FLAG_DELETE_ON_CLOSE. Here’s how an application might typically
open a file for output.
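A representative call might be sketched as follows; the exact sharing and security arguments are my assumptions, not necessarily the original ones.

```c
#include <windows.h>

/* Sketch: open "out.tmp" for writing, asking Windows to delete it
 * automatically when the last handle to the file object is closed. */
HANDLE
output_create(void)
{
    return CreateFileA("out.tmp",
                       GENERIC_WRITE,
                       FILE_SHARE_READ,
                       NULL,               /* default security */
                       CREATE_ALWAYS,
                       FILE_ATTRIBUTE_NORMAL | FILE_FLAG_DELETE_ON_CLOSE,
                       NULL);
}
```

A crash, an unhandled error, or a plain exit leaves no "out.tmp" behind.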
This special flag asks Windows to delete the file as soon as the last
handle to the file object is closed. Notice I said file object, not
file, since these are different things. The catch: This flag
is a property of the file object, not the file, and cannot be removed.
However, the solution is simple. Create a new link to the file so that
it survives deletion. This even works for files residing on a network share.
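On success, the finalizing step might be sketched like this (names illustrative):

```c
#include <windows.h>

/* Sketch: give the completed file its permanent name. The new "out"
 * hard link survives when delete-on-close later removes "out.tmp". */
BOOL
output_finalize(HANDLE h)
{
    if (!CreateHardLinkA("out", "out.tmp", NULL))
        return FALSE;       /* "out.tmp" still dies on close */
    return CloseHandle(h);
}
```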
The gotcha is that the underlying filesystem must be NTFS. FAT32
doesn’t support hard links. Unfortunately, since FAT32 remains the
least common denominator and is still widely used for removable media,
depending on the application, your users may expect support for saving
files to FAT32. A workaround is probably required.
Solution for Linux
This is where things really fall apart. It’s just barely possible on
Linux, it’s messy, and it’s not portable anywhere else. There’s no way
to do this for POSIX in general.
My initial thought was to create a file then unlink it. Unlike the
situation on Windows, files can be unlinked while they’re currently
open by a process. These files are finally deleted when the last file
descriptor (the last reference) is closed. Unfortunately, using
unlink(2) to remove the last link to a file prevents that file from
being linked again.
Instead, the solution is to use the relatively new (since Linux 3.11),
Linux-specific O_TMPFILE flag when creating the file. Instead of a
filename, this variation of open(2) takes a directory and creates an
unnamed, temporary file in it. These files are special in that they’re
permitted to be given a name in the filesystem at some future point.
For this example, I’ll assume the output is relative to the current
working directory. If it’s not, you’ll need to open an additional file
descriptor for the parent directory, and also use openat(2) to avoid
possible race conditions (since paths can change from under you). The
number of ways this can fail is already rapidly multiplying.
The catch is that only a handful of filesystems support O_TMPFILE.
It’s like the FAT32 problem above, but worse. You could easily end up
in a situation where it’s not supported, and will almost certainly
require a workaround.
Linking a file from a file descriptor is where things get messier. The
file descriptor must be linked with linkat(2) from its name on the
/proc virtual filesystem, constructed as a string. The following
snippet comes straight from the Linux open(2) manpage.
Even on Linux, /proc isn’t always available, such as within a chroot
or a container, so this part can fail as well. In theory there’s a way
to do this with the Linux-specific AT_EMPTY_PATH and avoid /proc, but
I couldn’t get it to work: linkat(2) requires the CAP_DAC_READ_SEARCH
capability to use that flag, which ordinary processes don’t have.
Given the poor portability (even within Linux), the number of ways
this can go wrong, and that a workaround is definitely needed anyway,
I’d say this technique is worthless. I’m going to stick with the
tried-and-true approach for this one.