<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Articles tagged bsd at null program</title>
  <link rel="alternate" type="text/html"
        href="https://nullprogram.com/tags/bsd/"/>
  <link rel="self" type="application/atom+xml"
        href="https://nullprogram.com/tags/bsd/feed/"/>
  <updated>2026-03-30T21:58:42Z</updated>
  <id>urn:uuid:5e43aa24-aef5-4f63-b37e-42b2d7859c01</id>

  <author>
    <name>Christopher Wellons</name>
    <uri>https://nullprogram.com</uri>
    <email>wellons@nullprogram.com</email>
  </author>

  <entry>
    <title>A more robust raw OpenBSD syscall demo</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2025/03/06/"/>
    <id>urn:uuid:f7101ee1-a2e6-4895-b763-bd7b2a842280</id>
    <updated>2025-03-06T02:43:20Z</updated>
    <category term="c"/><category term="bsd"/><category term="x86"/>
    <content type="html">
      <![CDATA[<p>Ted Unangst published <a href="https://flak.tedunangst.com/post/dude-where-are-your-syscalls"><em>dude, where are your syscalls?</em></a> on flak
yesterday, with a neat demonstration of OpenBSD’s <a href="https://undeadly.org/cgi?action=article;sid=20230222064027">pinsyscall</a>
security feature, whereby only pre-registered addresses are allowed to
make system calls. Whether it strengthens or weakens security is <a href="https://isopenbsdsecu.re/mitigations/pinsyscall/">up for
debate</a>, but regardless it’s an interesting, low-level programming
challenge. The original demo is fragile for multiple reasons, and requires
manually locating and entering addresses for each build. In this article I
show how to fix it. To prove that it’s robust, I ported an entire, real
application to use raw system calls on OpenBSD.</p>

<p>The original program uses ARM64 assembly. I’m a lot more comfortable with
x86-64 assembly, plus that’s the hardware I have readily on hand. So the
assembly language will be different, but all the concepts apply to both
these architectures. Almost none of these OpenBSD system interfaces are
formally documented (or stable for that matter), and I had to dig around
the OpenBSD source tree to figure it out (along with a <a href="https://news.ycombinator.com/item?id=26290723">helpful jart
nudge</a>). So don’t be afraid to get your hands dirty.</p>

<p>There are lots of subtle problems in the original demo, so let’s go
through the program piece by piece, starting with the entry point:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span>
<span class="nf">start</span><span class="p">()</span>
<span class="p">{</span>
        <span class="n">w</span><span class="p">(</span><span class="s">"hello</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="mi">6</span><span class="p">);</span>
        <span class="n">x</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This function is registered as the entry point in the ELF image, so it has
no caller. <del>That means no return address on the stack, so the stack is
not aligned for a function.</del> (<strong>Correction</strong>: The stack alignment issue is
true for x86, but not ARM, so the original demo is fine.) In toy programs
that goes unnoticed, but compilers generate code assuming the stack is
aligned. In a real application this is likely to crash deep on the first
SIMD register spill.</p>

<p>We could fix this with a <a href="https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html#index-force_005falign_005farg_005fpointer-function-attribute_002c-x86"><code class="language-plaintext highlighter-rouge">force_align_arg_pointer</code></a> attribute, at
least for architectures that support it, but I prefer to write the entry
point in assembly. Especially so we can access the command line arguments
and environment variables, which is necessary in a real application. That
happens to work the same as it does on Linux, so here’s my old, familiar
entry point:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">asm</span> <span class="p">(</span>
    <span class="s">"        .globl _start</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">"_start: mov   %rsp, %rdi</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">"        call  start</span><span class="se">\n</span><span class="s">"</span>
<span class="p">);</span>
</code></pre></div></div>

<p>Per the ABI, the first argument passes through <code class="language-plaintext highlighter-rouge">rdi</code>, so I pass a copy of
the stack pointer, <code class="language-plaintext highlighter-rouge">rsp</code>, as it appeared on entry. Entry point arguments
<code class="language-plaintext highlighter-rouge">argc</code>, <code class="language-plaintext highlighter-rouge">argv</code>, and <code class="language-plaintext highlighter-rouge">envp</code> are all pushed on the stack at <code class="language-plaintext highlighter-rouge">rsp</code>, so the
first real function can retrieve it all from just the stack pointer. The
original demo won’t use it, though. Using <code class="language-plaintext highlighter-rouge">call</code> to pass control pushes a
return address, which will never be used, and aligns the stack for the
first real function. I name it <code class="language-plaintext highlighter-rouge">_start</code> because that’s the entry symbol the
linker expects by default, so things go a little smoother. Conveniently,
the original didn’t use this name.</p>
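
<p>As a rough model of what the first real function sees through that stack
pointer, here is a small Python sketch. It assumes the conventional layout
(<code class="language-plaintext highlighter-rouge">argc</code>, then the <code class="language-plaintext highlighter-rouge">argv</code> pointers, a null word, the <code class="language-plaintext highlighter-rouge">envp</code> pointers, and
another null word) and treats the stack as a plain list of integers. The
function name and the pointer values are made up for illustration.</p>

```python
# Sketch only: the initial stack is modeled as a list of machine words.
# Assumed layout: argc, argv[0..argc-1], 0, envp[0..], 0.

def parse_entry_stack(words):
    argc = words[0]
    argv = words[1:1 + argc]
    # Skip argc, the argv entries, and the null word that ends argv.
    env_start = 1 + argc + 1
    env_end = words.index(0, env_start)
    envp = words[env_start:env_end]
    return argc, argv, envp


# Hypothetical pointer values standing in for real addresses.
stack = [2, 0x7000, 0x7008, 0, 0x7100, 0x7108, 0x7110, 0]
argc, argv, envp = parse_entry_stack(stack)
print(argc, [hex(p) for p in argv], [hex(p) for p in envp])
```

<p>A C entry function would do the same walk with pointer arithmetic on the
value it receives in <code class="language-plaintext highlighter-rouge">rdi</code>.</p>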

<p>Next up, the “write” function:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span>
<span class="nf">w</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">what</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">len</span><span class="p">)</span> <span class="p">{</span>
        <span class="kr">__asm</span><span class="p">(</span>
<span class="s">"       mov x2, x1;"</span>
<span class="s">"       mov x1, x0;"</span>
<span class="s">"       mov w0, #1;"</span>
<span class="s">"       mov x8, #4;"</span>
<span class="s">"       svc #0;"</span>
        <span class="p">);</span>
        <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There are two <a href="/blog/2024/12/20/">serious problems with this assembly block</a>. First, the
function arguments are not necessarily in those registers by the time
control reaches the basic assembly block. The function prologue could move
them around. Even more so if this function were inlined. This is exactly
the problem <em>extended</em> inline assembly is intended to solve. Second, it
clobbers a number of registers without telling the compiler, which assumes
no such thing happens when generating its own code. This sort of assembly
falls apart the moment it comes into contact with a non-zero optimization
level.</p>

<p>Solving this is just a matter of using inline assembly properly:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="nf">w</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">what</span><span class="p">,</span> <span class="kt">long</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">char</span> <span class="n">err</span><span class="p">;</span>
    <span class="kt">long</span> <span class="n">rax</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>  <span class="c1">// SYS_write</span>
    <span class="n">asm</span> <span class="k">volatile</span> <span class="p">(</span>
        <span class="s">"syscall"</span>
        <span class="o">:</span> <span class="s">"+a"</span><span class="p">(</span><span class="n">rax</span><span class="p">),</span> <span class="s">"+d"</span><span class="p">(</span><span class="n">len</span><span class="p">),</span> <span class="s">"=@ccc"</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
        <span class="o">:</span> <span class="s">"D"</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="s">"S"</span><span class="p">(</span><span class="n">what</span><span class="p">)</span>
        <span class="o">:</span> <span class="s">"rcx"</span><span class="p">,</span> <span class="s">"r11"</span><span class="p">,</span> <span class="s">"memory"</span>
    <span class="p">);</span>
    <span class="k">return</span> <span class="n">err</span> <span class="o">?</span> <span class="o">-</span><span class="n">rax</span> <span class="o">:</span> <span class="n">rax</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>I’ve enhanced it a bit, returning a <a href="/blog/2016/09/23/">Linux-style negative errno</a> on
error. In the BSD ecosystem, syscall errors are indicated using the carry
flag, which here is output into <code class="language-plaintext highlighter-rouge">err</code> via <code class="language-plaintext highlighter-rouge">=@ccc</code>. When set, the return
value is an errno. Further, the OpenBSD kernel uses both <code class="language-plaintext highlighter-rouge">rax</code> and <code class="language-plaintext highlighter-rouge">rdx</code>
for return values, so I’ve also listed <code class="language-plaintext highlighter-rouge">rdx</code> as an input+output even
though I don’t consume the result. Despite all these changes, this function
is not yet complete! We’ll get back to it later.</p>
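
<p>The return convention is easy to model outside of assembly. This Python
sketch folds the BSD-style pair (a value plus a carry flag) into a single
signed result the way the wrapper’s last line does. The names <code class="language-plaintext highlighter-rouge">linux_style</code>
and <code class="language-plaintext highlighter-rouge">describe</code> are illustrative and appear nowhere in the original program.</p>

```python
import errno


def linux_style(rax, carry):
    """Fold the BSD (value, carry flag) pair into one signed result."""
    return -rax if carry else rax


def describe(result):
    # Non-negative results are successes; negative ones are negated errnos.
    if result >= 0:
        return f"ok: {result}"
    return f"error: {errno.errorcode.get(-result, 'unknown')}"


print(describe(linux_style(6, False)))           # a 6-byte write succeeded
print(describe(linux_style(errno.EBADF, True)))  # carry set, rax holds errno
```

<p>Callers then need only one comparison against zero to distinguish success
from failure.</p>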

<p>The “exit” function, <code class="language-plaintext highlighter-rouge">x</code>, is just fine:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span>
<span class="nf">x</span><span class="p">()</span> <span class="p">{</span>
        <span class="kr">__asm</span><span class="p">(</span>
<span class="s">"       mov x8, #1;"</span>
<span class="s">"       svc #0;"</span>
        <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It doesn’t set an exit status, so it passes garbage instead, but otherwise
this works. No inputs, plus clobbers and outputs don’t matter when control
never returns. In a real application I might write it:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">__attribute</span><span class="p">((</span><span class="n">noreturn</span><span class="p">))</span>
<span class="kt">void</span> <span class="nf">x</span><span class="p">(</span><span class="kt">int</span> <span class="n">status</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">asm</span> <span class="k">volatile</span> <span class="p">(</span><span class="s">"syscall"</span> <span class="o">::</span> <span class="s">"a"</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="s">"D"</span><span class="p">(</span><span class="n">status</span><span class="p">));</span>
    <span class="n">__builtin_unreachable</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This function will need a little additional work later, too.</p>

<p>The <code class="language-plaintext highlighter-rouge">ident</code> section is basically fine as-is:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">__asm</span><span class="p">(</span><span class="s">" .section </span><span class="se">\"</span><span class="s">.note.openbsd.ident</span><span class="se">\"</span><span class="s">, </span><span class="se">\"</span><span class="s">a</span><span class="se">\"\n</span><span class="s">"</span>
<span class="s">"       .p2align 2</span><span class="se">\n</span><span class="s">"</span>
<span class="s">"       .long   8</span><span class="se">\n</span><span class="s">"</span>
<span class="s">"       .long   4</span><span class="se">\n</span><span class="s">"</span>
<span class="s">"       .long   1</span><span class="se">\n</span><span class="s">"</span>
<span class="s">"       .ascii </span><span class="se">\"</span><span class="s">OpenBSD</span><span class="se">\\</span><span class="s">0</span><span class="se">\"\n</span><span class="s">"</span>
<span class="s">"       .long   0</span><span class="se">\n</span><span class="s">"</span>
<span class="s">"       .previous</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
</code></pre></div></div>

<p>The compiler assumes the current section remains the same at the end of
the assembly block, which here is accomplished with <code class="language-plaintext highlighter-rouge">.previous</code>. Though it
clobbers the assembler’s remembered “other” section and so may interfere
with surrounding code using <code class="language-plaintext highlighter-rouge">.previous</code>. Better to use <code class="language-plaintext highlighter-rouge">.pushsection</code> and
<code class="language-plaintext highlighter-rouge">.popsection</code> for good stack discipline. There are many such examples in
the OpenBSD source tree.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">asm</span> <span class="p">(</span>
    <span class="s">".pushsection .note.openbsd.ident, </span><span class="se">\"</span><span class="s">a</span><span class="se">\"\n</span><span class="s">"</span>
    <span class="s">".long  8, 4, 1, 0x6e65704f, 0x00445342, 0</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">".popsection</span><span class="se">\n</span><span class="s">"</span>
<span class="p">);</span>
</code></pre></div></div>
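
<p>The two hexadecimal words in that version are just the original’s
<code class="language-plaintext highlighter-rouge">"OpenBSD\0"</code> string written as 32-bit integers. A quick Python check,
assuming a little-endian target such as x86-64 (the helper <code class="language-plaintext highlighter-rouge">word</code> is only for
this sketch):</p>

```python
def word(n):
    """Encode n the way .long does on a little-endian target."""
    return n.to_bytes(4, "little")


name = word(0x6E65704F) + word(0x00445342)
# Fields in order: name size, descriptor size, type, name, descriptor.
note = word(8) + word(4) + word(1) + name + word(0)

print(name)
print(len(note))
```

<p>The first field, 8, matches the length of the name including its
terminating null byte, and the second, 4, matches the single trailing word.</p>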

<p>Now the trickiest part, the pinsyscall table:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">whats</span> <span class="p">{</span>
        <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">offset</span><span class="p">;</span>
        <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">sysno</span><span class="p">;</span>
<span class="p">}</span> <span class="n">happening</span><span class="p">[]</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">section</span><span class="p">(</span><span class="s">".openbsd.syscalls"</span><span class="p">)))</span> <span class="o">=</span> <span class="p">{</span>
        <span class="p">{</span> <span class="mh">0x104f4</span><span class="p">,</span> <span class="mi">4</span> <span class="p">},</span>
        <span class="p">{</span> <span class="mh">0x10530</span><span class="p">,</span> <span class="mi">1</span> <span class="p">},</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Those offsets — offsets from the beginning of the ELF image — were entered
manually, and it kind of ruins the whole demo. We don’t have a good way to
get at those offsets from C, or any high level language. However, we can
solve that by tweaking the inline assembly with some labels:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">__attribute</span><span class="p">((</span><span class="n">noinline</span><span class="p">))</span>
<span class="kt">long</span> <span class="nf">w</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">what</span><span class="p">,</span> <span class="kt">long</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
    <span class="c1">// ...</span>
    <span class="n">asm</span> <span class="k">volatile</span> <span class="p">(</span>
        <span class="s">"_w: syscall"</span>
        <span class="c1">// ...</span>
    <span class="p">);</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">__attribute</span><span class="p">((</span><span class="n">noinline</span><span class="p">,</span><span class="n">noreturn</span><span class="p">))</span>
<span class="kt">void</span> <span class="nf">x</span><span class="p">(</span><span class="kt">int</span> <span class="n">status</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">asm</span> <span class="k">volatile</span> <span class="p">(</span>
        <span class="s">"_x: syscall"</span>
        <span class="c1">// ...</span>
    <span class="p">);</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Very importantly I’ve added <code class="language-plaintext highlighter-rouge">noinline</code> to prevent these functions from
being inlined, which would create additional copies of the <code class="language-plaintext highlighter-rouge">syscall</code>
instruction that of course won’t be registered. This also prevents duplicate
labels causing assembler errors. Once we have the labels, we can use them in
an assembly block listing the allowed syscall instructions:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">asm</span> <span class="p">(</span>
    <span class="s">".pushsection .openbsd.syscalls</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">".long  _x, 1</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">".long  _w, 4</span><span class="se">\n</span><span class="s">"</span>
    <span class="s">".popsection</span><span class="se">\n</span><span class="s">"</span>
<span class="p">);</span>
</code></pre></div></div>

<p>That lets the linker solve the offsets problem, which is its main job
after all. With these changes the demo works reliably, even under high
optimization levels. I suggest these flags:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cc -static -nostdlib -no-pie -o where where.c
</code></pre></div></div>

<p>Disabling PIE with <code class="language-plaintext highlighter-rouge">-no-pie</code> is necessary in real applications or else
strings won’t work. You can apply more flags to strip it down further, but
these are the flags generally necessary to compile these sorts of programs
on at least OpenBSD 7.6.</p>
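
<p>For a picture of what ends up in the <code class="language-plaintext highlighter-rouge">.openbsd.syscalls</code> section, this
Python sketch packs a table by hand. It assumes the layout implied by the
original <code class="language-plaintext highlighter-rouge">struct whats</code> (two 32-bit unsigned integers per entry) on a
little-endian target, and it reuses the hand-entered offsets from the
original demo purely as sample values. In the fixed program the linker
supplies the real ones. The function name is made up.</p>

```python
def syscall_table(entries):
    """Pack (address, syscall number) pairs as little-endian 32-bit words."""
    out = b""
    for address, sysno in entries:
        out += address.to_bytes(4, "little") + sysno.to_bytes(4, "little")
    return out


# Sample values copied from the original demo's hand-written table.
table = syscall_table([(0x104F4, 4), (0x10530, 1)])
print(len(table), table.hex())
```

<p>Each registered <code class="language-plaintext highlighter-rouge">syscall</code> instruction costs eight bytes of table.</p>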

<p>So, how do I know this stuff works in general? Because I ported <a href="/blog/2023/01/18/">my ultra
portable pkg-config clone, u-config</a>, to use raw OpenBSD syscalls:
<strong><a href="https://github.com/skeeto/u-config/blob/openbsd/openbsd_main.c"><code class="language-plaintext highlighter-rouge">openbsd_main.c</code></a></strong>. Everything still works at high optimization
levels.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cc -static -nostartfiles -no-pie -o pkg-config openbsd_main.c libmemory.a
$ ./pkg-config --cflags --libs libcurl
-I/usr/local/include -L/usr/local/lib -lcurl
</code></pre></div></div>

<p>Because the new syscall wrappers behave just like Linux system calls, it
leverages the <code class="language-plaintext highlighter-rouge">linux_noarch.c</code> platform, and the whole port is ~70 lines
of code. A few more flags (<code class="language-plaintext highlighter-rouge">-fno-stack-protector</code>, <code class="language-plaintext highlighter-rouge">-Oz</code>, <code class="language-plaintext highlighter-rouge">-s</code>, etc.), and
it squeezes into a slim 21.6K static binary.</p>

<p>Despite making no libc calls, it’s not possible to stop compilers from
fabricating (<a href="/blog/2024/11/10/">hallucinating?</a>) string function calls, so the build
above depends on external definitions. In the command above, <code class="language-plaintext highlighter-rouge">libmemory.a</code>
comes from <a href="https://github.com/skeeto/w64devkit/blob/master/src/libmemory.c"><code class="language-plaintext highlighter-rouge">libmemory.c</code></a> found <a href="/blog/2024/02/05/">in w64devkit</a>. Alternatively,
<a href="https://flak.tedunangst.com/post/you-dont-link-all-of-libc">and on topic</a>, you could link the OpenBSD libc string functions by
omitting <code class="language-plaintext highlighter-rouge">libmemory.a</code> from the build.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cc -static -nostartfiles -no-pie -o pkg-config openbsd_main.c
</code></pre></div></div>

<p>Though it pulls in a lot of bloat (~8x size increase), and teasing out the
necessary objects isn’t trivial.</p>

]]>
    </content>
  </entry>
  <entry>
    <title>OpenBSD's pledge and unveil from Python</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2021/09/15/"/>
    <id>urn:uuid:cd3857dd-270c-430e-824d-6512688687a3</id>
    <updated>2021-09-15T02:46:56Z</updated>
    <category term="bsd"/><category term="c"/><category term="python"/>
    <content type="html">
      <![CDATA[<p><em>This article was discussed <a href="https://news.ycombinator.com/item?id=28535255">on Hacker News</a>.</em></p>

<p>Years ago, OpenBSD gained two new security system calls, <a href="https://man.openbsd.org/pledge.2"><code class="language-plaintext highlighter-rouge">pledge(2)</code></a>
(originally <a href="https://www.openbsd.org/papers/tame-fsec2015/mgp00001.html"><code class="language-plaintext highlighter-rouge">tame(2)</code></a>) and <a href="https://man.openbsd.org/unveil.2"><code class="language-plaintext highlighter-rouge">unveil(2)</code></a>. In both, an application
surrenders capabilities at run-time. The idea is to perform initialization
like usual, then drop capabilities before handling untrusted input,
limiting unwanted side effects. This feature is applicable even where type
safety isn’t an issue, such as Python, where a program might still get
tricked into accessing sensitive files or making network connections when
it shouldn’t. So how can a Python program access these system calls?</p>

<p>As <a href="/blog/2021/06/29/">discussed previously</a>, it’s quite easy to access C APIs from
Python through its <a href="https://docs.python.org/3/library/ctypes.html"><code class="language-plaintext highlighter-rouge">ctypes</code></a> package, and this is no exception.
In this article I show how to do it. Here’s the full source if you want to
dive in: <a href="https://github.com/skeeto/scratch/tree/master/misc/openbsd.py"><strong><code class="language-plaintext highlighter-rouge">openbsd.py</code></strong></a>.</p>

<!--more-->

<p>I’ve chosen these extra constraints:</p>

<ul>
  <li>
    <p>These are extra safety features, unnecessary for correctness, so attempts
to call these functions on systems where they don’t exist will silently do
nothing, as though they succeeded. They’re provided as a best effort.</p>
  </li>
  <li>
    <p>Systems other than OpenBSD may support these functions, now or in the
future, and it would be nice to automatically make use of them when
available. This means no checking for OpenBSD specifically but instead
<em>feature sniffing</em> for their presence.</p>
  </li>
  <li>
    <p>The interfaces should be Pythonic as though they were implemented in
Python itself. Raise exceptions for errors, and accept strings since
they’re more convenient than bytes.</p>
  </li>
</ul>

<p>For reference, here are the function prototypes:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">pledge</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">promises</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">execpromises</span><span class="p">);</span>
<span class="kt">int</span> <span class="nf">unveil</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">permissions</span><span class="p">);</span>
</code></pre></div></div>

<p>The <a href="https://flak.tedunangst.com/post/string-interfaces">string-oriented interface of <code class="language-plaintext highlighter-rouge">pledge</code></a> will make this a whole
lot easier to implement.</p>

<h3 id="finding-the-functions">Finding the functions</h3>

<p>The first step is to grab functions through <code class="language-plaintext highlighter-rouge">ctypes</code>. Like a lot of Python
documentation, this area is frustratingly imprecise and under-documented.
I want to grab a handle to the already-linked libc and search for either
function. However, getting that handle is a little different on each
platform, and in the process I saw four different exceptions, only one of
which is documented.</p>

<p>I came up with passing None to <code class="language-plaintext highlighter-rouge">ctypes.CDLL</code>, which ultimately just passes
<code class="language-plaintext highlighter-rouge">NULL</code> to <a href="https://man.openbsd.org/dlopen.3"><code class="language-plaintext highlighter-rouge">dlopen(3)</code></a>. That’s really all I wanted. Currently on
Windows this is a TypeError. Once the handle is in hand, try to access the
<code class="language-plaintext highlighter-rouge">pledge</code> attribute, which will fail with AttributeError if it doesn’t
exist. In the event of any exception, just assume the behavior isn’t
available. If found, I also define the function prototype for <code class="language-plaintext highlighter-rouge">ctypes</code>.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_pledge</span> <span class="o">=</span> <span class="bp">None</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">_pledge</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">use_errno</span><span class="o">=</span><span class="bp">True</span><span class="p">).</span><span class="n">pledge</span>
    <span class="n">_pledge</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_int</span>
    <span class="n">_pledge</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span><span class="p">,</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span>
<span class="k">except</span> <span class="nb">Exception</span><span class="p">:</span>
    <span class="n">_pledge</span> <span class="o">=</span> <span class="bp">None</span>
</code></pre></div></div>

<p>Catching a broad Exception isn’t great, but it’s the best we can do since
the documentation is incomplete. From this block I’ve seen TypeError,
AttributeError, FileNotFoundError, and OSError. I wouldn’t be surprised if
there are more possibilities, and I don’t want to risk missing them.</p>

<p>Note that I’m catching Exception rather than using a bare <code class="language-plaintext highlighter-rouge">except</code>. My
code will not catch KeyboardInterrupt nor SystemExit. This is deliberate,
and I never want to catch these.</p>

<p>The same story for <code class="language-plaintext highlighter-rouge">unveil</code>:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_unveil</span> <span class="o">=</span> <span class="bp">None</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">_unveil</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">use_errno</span><span class="o">=</span><span class="bp">True</span><span class="p">).</span><span class="n">unveil</span>
    <span class="n">_unveil</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_int</span>
    <span class="n">_unveil</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span><span class="p">,</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span>
<span class="k">except</span> <span class="nb">Exception</span><span class="p">:</span>
    <span class="n">_unveil</span> <span class="o">=</span> <span class="bp">None</span>
</code></pre></div></div>

<h3 id="pythonic-wrappers">Pythonic wrappers</h3>

<p>The next and final step is to wrap the low-level calls in interfaces that
hide their C and <code class="language-plaintext highlighter-rouge">ctypes</code> nature.</p>

<p>Python strings must be encoded to bytes before they can be passed to C
functions. Rather than make the caller worry about this, we’ll let them
pass friendly strings and have the wrapper do the conversion. Either may
also be <code class="language-plaintext highlighter-rouge">NULL</code>, so None is allowed.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">pledge</span><span class="p">(</span><span class="n">promises</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span> <span class="n">execpromises</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]):</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">_pledge</span><span class="p">:</span>
        <span class="k">return</span>  <span class="c1"># unimplemented
</span>
    <span class="n">r</span> <span class="o">=</span> <span class="n">_pledge</span><span class="p">(</span><span class="bp">None</span> <span class="k">if</span> <span class="n">promises</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">promises</span><span class="p">.</span><span class="n">encode</span><span class="p">(),</span>
                <span class="bp">None</span> <span class="k">if</span> <span class="n">execpromises</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">execpromises</span><span class="p">.</span><span class="n">encode</span><span class="p">())</span>
    <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span>
        <span class="n">errno</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">get_errno</span><span class="p">()</span>
        <span class="k">raise</span> <span class="nb">OSError</span><span class="p">(</span><span class="n">errno</span><span class="p">,</span> <span class="n">os</span><span class="p">.</span><span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">))</span>
</code></pre></div></div>

<p>As usual, a return of -1 means there was an error, in which case we fetch
<code class="language-plaintext highlighter-rouge">errno</code> and raise the appropriate OSError.</p>
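
<p>Both wrappers repeat the same error check, so it can be pulled into one helper. The following is a minimal sketch rather than part of the module above: the name <code class="language-plaintext highlighter-rouge">check</code> is hypothetical, and it assumes the library was loaded with <code class="language-plaintext highlighter-rouge">use_errno=True</code> so that <code class="language-plaintext highlighter-rouge">ctypes.get_errno()</code> reflects the failed call. A failure is simulated by setting the ctypes copy of errno by hand.</p>

```python
import ctypes
import errno
import os


def check(result):
    """Raise OSError from the ctypes errno copy when a C call returned -1."""
    if result == -1:
        code = ctypes.get_errno()
        raise OSError(code, os.strerror(code))
    return result


# Simulate a failed call: set the ctypes-private errno, then report -1.
ctypes.set_errno(errno.EINVAL)
try:
    check(-1)
except OSError as exc:
    print(exc.errno == errno.EINVAL)  # True
```

<p>With such a helper each wrapper body would end in a single <code class="language-plaintext highlighter-rouge">check(...)</code> call around the C function.</p>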

<p><code class="language-plaintext highlighter-rouge">unveil</code> works a little differently since the first argument is a path.
Python functions that accept paths, such as <code class="language-plaintext highlighter-rouge">open</code>, generally accept
either strings or bytes. On unix-like systems, <a href="https://simonsapin.github.io/wtf-8/">paths are fundamentally
bytestrings</a> and not necessarily Unicode, so it’s necessary to accept
bytes. Since strings are nearly always more convenient, they take both.
The <code class="language-plaintext highlighter-rouge">unveil</code> wrapper here will do the same. If it’s a string, encode it,
otherwise pass it straight through.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">unveil</span><span class="p">(</span><span class="n">path</span><span class="p">:</span> <span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">bytes</span><span class="p">,</span> <span class="bp">None</span><span class="p">],</span> <span class="n">permissions</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]):</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">_unveil</span><span class="p">:</span>
        <span class="k">return</span>  <span class="c1"># unimplemented
</span>
    <span class="n">r</span> <span class="o">=</span> <span class="n">_unveil</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">encode</span><span class="p">()</span> <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="nb">str</span><span class="p">)</span> <span class="k">else</span> <span class="n">path</span><span class="p">,</span>
                <span class="bp">None</span> <span class="k">if</span> <span class="n">permissions</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">permissions</span><span class="p">.</span><span class="n">encode</span><span class="p">())</span>
    <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span>
        <span class="n">errno</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">get_errno</span><span class="p">()</span>
        <span class="k">raise</span> <span class="nb">OSError</span><span class="p">(</span><span class="n">errno</span><span class="p">,</span> <span class="n">os</span><span class="p">.</span><span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">))</span>
</code></pre></div></div>
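
<p>As an alternative to the <code class="language-plaintext highlighter-rouge">isinstance</code> test, the standard library's <code class="language-plaintext highlighter-rouge">os.fsencode</code> performs the same kind of conversion and also accepts <code class="language-plaintext highlighter-rouge">pathlib</code> objects. This is a sketch of that option, not what the wrapper above does; the helper name <code class="language-plaintext highlighter-rouge">encode_path</code> is made up for illustration.</p>

```python
import os
import pathlib


def encode_path(path):
    """Return path as bytes for a C function, keeping None so it maps to NULL."""
    if path is None:
        return None
    return os.fsencode(path)  # accepts str, bytes, and os.PathLike objects


print(encode_path("/usr/share"))                         # b'/usr/share'
print(encode_path(b"/usr/share"))                        # b'/usr/share'
print(encode_path(pathlib.PurePosixPath("/usr/share")))  # b'/usr/share'
print(encode_path(None))                                 # None
```

<p>Unlike a plain <code class="language-plaintext highlighter-rouge">str.encode()</code>, <code class="language-plaintext highlighter-rouge">os.fsencode</code> uses the file system encoding and error handler, so file names that Python decoded from non-Unicode bytes survive the round trip.</p>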

<p>That’s it!</p>

<h3 id="trying-it-out">Trying it out</h3>

<p>Let’s start with <code class="language-plaintext highlighter-rouge">unveil</code>. Initially a process has access to the whole
file system with the usual restrictions. On the first call to <code class="language-plaintext highlighter-rouge">unveil</code>
it’s immediately restricted to some subset of the tree. Each call reveals
a little more until a final <code class="language-plaintext highlighter-rouge">NULL</code> which locks it in place for the rest of
the process’s existence.</p>

<p>Suppose a program has been tricked into accessing your shell history,
perhaps by mishandling a path:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">hackme</span><span class="p">():</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">pathlib</span><span class="p">.</span><span class="n">Path</span><span class="p">.</span><span class="n">home</span><span class="p">()</span> <span class="o">/</span> <span class="s">".bash_history"</span><span class="p">):</span>
            <span class="k">print</span><span class="p">(</span><span class="s">"You've been hacked!"</span><span class="p">)</span>
    <span class="k">except</span> <span class="nb">FileNotFoundError</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"Blocked by unveil."</span><span class="p">)</span>

<span class="n">hackme</span><span class="p">()</span>
</code></pre></div></div>

<p>If you’re a Bash user, this prints:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>You've been hacked!
</code></pre></div></div>

<p>Using our new feature to restrict the program’s access first:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># restrict access to static program data
</span><span class="n">unveil</span><span class="p">(</span><span class="s">"/usr/share"</span><span class="p">,</span> <span class="s">"r"</span><span class="p">)</span>
<span class="n">unveil</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>

<span class="n">hackme</span><span class="p">()</span>
</code></pre></div></div>

<p>On OpenBSD this now prints:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Blocked by unveil.
</code></pre></div></div>

<p>Working just as it should!</p>

<p>With <code class="language-plaintext highlighter-rouge">pledge</code> we declare what abilities we’d like to keep by supplying a
list of promises, <em>pledging</em> to use only those abilities afterward. A
common case is the <code class="language-plaintext highlighter-rouge">stdio</code> promise which allows reading and writing of
open files, but not <em>opening</em> files. A program might open its log file,
then drop the ability to open files while retaining the ability to write
to its log.</p>

<p>An invalid or unknown promise is an error. Does that work?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; pledge("doesntexist", None)
OSError: [Errno 22] Invalid argument
</code></pre></div></div>

<p>So far so good. How about the functionality itself?</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pledge</span><span class="p">(</span><span class="s">"stdio"</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
<span class="n">hackme</span><span class="p">()</span>
</code></pre></div></div>

<p>The program is instantly killed when making the disallowed system call:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Abort trap (core dumped)
</code></pre></div></div>

<p>If you want something a little softer, include the <code class="language-plaintext highlighter-rouge">error</code> promise:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pledge</span><span class="p">(</span><span class="s">"stdio error"</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
<span class="n">hackme</span><span class="p">()</span>
</code></pre></div></div>

<p>Instead of killing the process, the disallowed system call now fails and
Python raises an exception:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>OSError: [Errno 78] Function not implemented
</code></pre></div></div>

<p>The core dump isn’t going to be much help to a Python program, so you
probably always want to use this promise. In general you need to be extra
careful about <code class="language-plaintext highlighter-rouge">pledge</code> in complex runtimes like Python’s which may
reasonably need to do many arbitrary, undocumented things at any time.</p>

]]>
    </content>
  </entry>
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  <entry>
    <title>A Survey of $RANDOM</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2018/12/25/"/>
    <id>urn:uuid:071e3ec5-fe1d-309a-3e66-3b590a96ac2c</id>
    <updated>2018-12-25T00:05:38Z</updated>
    <category term="linux"/><category term="bsd"/><category term="c"/>
    <content type="html">
      <![CDATA[<p>Most Bourne shell clones support a special <code class="language-plaintext highlighter-rouge">RANDOM</code> environment
variable that evaluates to a random value between 0 and 32,767 (i.e.
15 bits). Assignment to the variable seeds the generator. This variable
is an extension and <a href="http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/utilities/V3_chap02.html">did not appear</a> in the original Unix Bourne
shell. Despite this, the different Bourne-like shells that implement
it have converged to the same interface, but <em>only</em> the interface.
Each implementation differs in interesting ways. In this article we’ll
explore how <code class="language-plaintext highlighter-rouge">$RANDOM</code> is implemented in various Bourne-like shells.</p>

<p><del>Unfortunately I was unable to determine the origin of <code class="language-plaintext highlighter-rouge">$RANDOM</code>.</del>
Nobody was doing a good job tracking source code changes before the
mid-1990s, so that history appears to be lost. Bash was first released
in 1989, but the earliest version I could find was 1.14.7, released in 1996.
KornShell was first released in 1983, but the earliest source I could
find <a href="https://web.archive.org/web/20120613182836/http://www.research.att.com/sw/download/man/man1/ksh.html">was from 1993</a>. In both cases <code class="language-plaintext highlighter-rouge">$RANDOM</code> already existed. My
guess is that it first appeared in one of these two shells, probably
KornShell.</p>

<p><strong>Update</strong>: Quentin Barnes has informed me that his 1986 copy of
KornShell (a.k.a. ksh86) implements <code class="language-plaintext highlighter-rouge">$RANDOM</code>. This predates Bash and
makes it likely that this feature originated in KornShell.</p>

<h3 id="bash">Bash</h3>

<p>Of all the shells I’m going to discuss, Bash has the most interesting
history. It never made use of <code class="language-plaintext highlighter-rouge">srand(3)</code> / <code class="language-plaintext highlighter-rouge">rand(3)</code> and instead
uses its own generator — which is generally <a href="/blog/2017/09/21/">what I prefer</a>. Prior
to Bash 4.0, it used the crummy linear congruential generator (LCG)
<a href="http://port70.net/~nsz/c/c89/c89-draft.html#4.10.2.2">found in the C89 standard</a>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">rseed</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

<span class="k">static</span> <span class="kt">int</span>
<span class="nf">brand</span> <span class="p">()</span>
<span class="p">{</span>
  <span class="n">rseed</span> <span class="o">=</span> <span class="n">rseed</span> <span class="o">*</span> <span class="mi">1103515245</span> <span class="o">+</span> <span class="mi">12345</span><span class="p">;</span>
  <span class="k">return</span> <span class="p">((</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)((</span><span class="n">rseed</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">32767</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For some reason it was naïvely decided that <code class="language-plaintext highlighter-rouge">$RANDOM</code> should never
produce the same value twice in a row. The caller of <code class="language-plaintext highlighter-rouge">brand()</code> filters
the output and discards repeats before returning to the shell script.
This actually <em>reduces</em> the quality of the generator further since it
increases correlation between separate outputs.</p>
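
<p>To experiment with this behavior outside of Bash, here is a rough Python model of the generator and of the repeat filter. It is a sketch under assumptions: the state is truncated to 32 bits, and the helper names <code class="language-plaintext highlighter-rouge">make_brand</code> and <code class="language-plaintext highlighter-rouge">make_filtered</code> are invented here, not taken from the Bash source.</p>

```python
def make_brand(seed=1):
    """C89-style LCG as quoted above, with the state kept to 32 bits."""
    state = seed & 0xFFFFFFFF

    def brand():
        nonlocal state
        state = (state * 1103515245 + 12345) & 0xFFFFFFFF
        return (state >> 16) & 32767

    return brand


def make_filtered(source):
    """Wrap a generator so that it never returns the previous value again."""
    last = None

    def next_value():
        nonlocal last
        value = source()
        while value == last:  # discard repeats, as the caller of brand() does
            value = source()
        last = value
        return value

    return next_value


raw = make_brand(1)
print(raw(), raw())  # 16838 5758
```

<p>With the filter in place, each output rules out one of the 32,768 possibilities for the next output, which is exactly the kind of dependence between outputs a generator should not have.</p>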

<p>When the shell starts up, <code class="language-plaintext highlighter-rouge">rseed</code> is seeded from the PID and the current
time in seconds. These values are literally summed and used as the seed.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Note: not the literal code, but equivalent. */</span>
<span class="n">rseed</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">()</span> <span class="o">+</span> <span class="n">time</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
</code></pre></div></div>

<p>Subshells, which fork and initially share an <code class="language-plaintext highlighter-rouge">rseed</code>, are given similar
treatment:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rseed</span> <span class="o">=</span> <span class="n">rseed</span> <span class="o">+</span> <span class="n">getpid</span><span class="p">()</span> <span class="o">+</span> <span class="n">time</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
</code></pre></div></div>

<p>Notice there’s no <a href="/blog/2018/07/31/">hashing</a> or <a href="http://www.pcg-random.org/posts/developing-a-seed_seq-alternative.html">mixing</a> of these values, so
there’s no avalanche effect. That would have prevented shells that start
around the same time from having related initial random sequences.</p>
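
<p>A small Python sketch makes the relationship concrete. The PID and timestamp values are hypothetical, the state is assumed to be 32 bits wide, and <code class="language-plaintext highlighter-rouge">first_output</code> models one step of the C89-style generator shown earlier.</p>

```python
def seed_at_startup(pid, now):
    """Equivalent of rseed = getpid() + time(0)."""
    return pid + now


def first_output(seed):
    """First value produced by the C89-style LCG for a given seed."""
    state = (seed * 1103515245 + 12345) & 0xFFFFFFFF
    return (state >> 16) & 32767


now = 1545696000  # hypothetical startup time in seconds

# A different PID one second later yields exactly the same seed.
print(seed_at_startup(4243, now) == seed_at_startup(4242, now + 1))  # True

# Seeds that differ by one give first outputs a fixed distance apart.
a = seed_at_startup(4242, now)
b = seed_at_startup(4243, now)
print((first_output(b) - first_output(a)) % 32768)  # 16838 or 16839, whatever the seed
```

<p>The fixed distance follows from the multiplier: adding one to the seed adds 1103515245 to the next state, and the upper half of that constant is 16838.</p>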

<p>With Bash 4.0, released in 2009, the algorithm was changed to a
<a href="http://www.firstpr.com.au/dsp/rand31/p1192-park.pdf">Park–Miller multiplicative LCG</a> from 1988:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span>
<span class="nf">brand</span> <span class="p">()</span>
<span class="p">{</span>
  <span class="kt">long</span> <span class="n">h</span><span class="p">,</span> <span class="n">l</span><span class="p">;</span>

  <span class="cm">/* can't seed with 0. */</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">rseed</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
    <span class="n">rseed</span> <span class="o">=</span> <span class="mi">123459876</span><span class="p">;</span>
  <span class="n">h</span> <span class="o">=</span> <span class="n">rseed</span> <span class="o">/</span> <span class="mi">127773</span><span class="p">;</span>
  <span class="n">l</span> <span class="o">=</span> <span class="n">rseed</span> <span class="o">%</span> <span class="mi">127773</span><span class="p">;</span>
  <span class="n">rseed</span> <span class="o">=</span> <span class="mi">16807</span> <span class="o">*</span> <span class="n">l</span> <span class="o">-</span> <span class="mi">2836</span> <span class="o">*</span> <span class="n">h</span><span class="p">;</span>
  <span class="k">return</span> <span class="p">((</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)(</span><span class="n">rseed</span> <span class="o">&amp;</span> <span class="mi">32767</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There’s actually a subtle mistake in this implementation compared to the
generator described in the paper. This function will generate different
numbers than the paper, and it will generate different numbers on
different hosts! More on that later.</p>

<p>This algorithm is a <a href="http://www.pcg-random.org/posts/does-it-beat-the-minimal-standard.html">much better choice</a> than the previous LCG.
There were many more options available in 2009 compared to 1989, but,
honestly, this generator is pretty reasonable for this application.
Bash is <em>so slow</em> that you’re never practically going to generate
enough numbers for the small state to matter. Since the Park–Miller
algorithm is older than Bash, they could have used this in the first
place.</p>

<p>I considered submitting a patch to switch to something more modern.
However, given Bash’s constraints, it’s easier said than done.
Portability to weird systems is still a concern, and I expect they’d
reject a patch that started making use of <code class="language-plaintext highlighter-rouge">long long</code> in the PRNG.
They still support pre-ANSI C compilers that don’t have 64-bit
arithmetic.</p>

<p>However, what still really <em>could</em> be improved is seeding. In Bash 4.x
here’s what it looks like:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">void</span>
<span class="nf">seedrand</span> <span class="p">()</span>
<span class="p">{</span>
  <span class="k">struct</span> <span class="n">timeval</span> <span class="n">tv</span><span class="p">;</span>

  <span class="n">gettimeofday</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">tv</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
  <span class="n">sbrand</span> <span class="p">(</span><span class="n">tv</span><span class="p">.</span><span class="n">tv_sec</span> <span class="o">^</span> <span class="n">tv</span><span class="p">.</span><span class="n">tv_usec</span> <span class="o">^</span> <span class="n">getpid</span> <span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Seeding is both better and worse. It’s better that it’s seeded from a
higher resolution clock (microseconds), so two shells started close in
time have more variation. However, it’s “mixed” with XOR, which, in
this case, is worse than addition.</p>

<p>For example, imagine two Bash shells started one microsecond apart. Both
<code class="language-plaintext highlighter-rouge">tv_usec</code> and <code class="language-plaintext highlighter-rouge">getpid()</code> are incremented by one. Those increments are
likely to cancel each other out by an XOR, and they end up with the same
seed.</p>
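
<p>The cancellation is easy to reproduce. In this sketch the timestamp and PID are hypothetical values, chosen so that both the microsecond field and the PID are even; <code class="language-plaintext highlighter-rouge">seed_sum</code> is included only for comparison with plain addition.</p>

```python
def seed_xor(tv_sec, tv_usec, pid):
    """Seed as computed by seedrand() in Bash 4.x."""
    return tv_sec ^ tv_usec ^ pid


def seed_sum(tv_sec, tv_usec, pid):
    """The same inputs combined by addition, for comparison."""
    return tv_sec + tv_usec + pid


sec, usec, pid = 1545696000, 500000, 4242  # hypothetical; usec and pid are even

# Incrementing both values flips the low bit of each, and the flips cancel.
print(seed_xor(sec, usec, pid) == seed_xor(sec, usec + 1, pid + 1))  # True

# With addition the two increments accumulate instead.
print(seed_sum(sec, usec + 1, pid + 1) - seed_sum(sec, usec, pid))   # 2
```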

<p>Instead, each of those quantities should be hashed before mixing. Here’s
a rough example using my <a href="https://github.com/skeeto/hash-prospector#three-round-functions"><code class="language-plaintext highlighter-rouge">triple32()</code> hash</a> (adapted to glorious
GNU-style pre-ANSI C):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">long</span>
<span class="n">hash32</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span>
     <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">x</span><span class="p">;</span>
<span class="p">{</span>
  <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">17</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">*=</span> <span class="mh">0xed5ad4bbUL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">&amp;=</span> <span class="mh">0xffffffffUL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">11</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">*=</span> <span class="mh">0xac4c1b51UL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">&amp;=</span> <span class="mh">0xffffffffUL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">15</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">*=</span> <span class="mh">0x31848babUL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">&amp;=</span> <span class="mh">0xffffffffUL</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">14</span><span class="p">;</span>
  <span class="k">return</span> <span class="n">x</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">void</span>
<span class="nf">seedrand</span> <span class="p">()</span>
<span class="p">{</span>
  <span class="k">struct</span> <span class="n">timeval</span> <span class="n">tv</span><span class="p">;</span>

  <span class="n">gettimeofday</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">tv</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
  <span class="n">sbrand</span> <span class="p">(</span><span class="n">hash32</span> <span class="p">(</span><span class="n">tv</span><span class="p">.</span><span class="n">tv_sec</span><span class="p">)</span> <span class="o">^</span>
          <span class="n">hash32</span> <span class="p">(</span><span class="n">hash32</span> <span class="p">(</span><span class="n">tv</span><span class="p">.</span><span class="n">tv_usec</span><span class="p">)</span> <span class="o">^</span> <span class="n">getpid</span> <span class="p">()));</span>
<span class="p">}</span>
</code></pre></div></div>
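
<p>For experimenting without rebuilding Bash, the same mixing can be modeled in Python. This is a direct port of the C sketch above with explicit 32-bit masking; the function name <code class="language-plaintext highlighter-rouge">hashed_seed</code> and the sample inputs are illustrative assumptions.</p>

```python
MASK32 = 0xFFFFFFFF


def hash32(x):
    """Python port of the three-round integer hash shown above."""
    x &= MASK32
    x ^= x >> 17
    x = (x * 0xED5AD4BB) & MASK32
    x ^= x >> 11
    x = (x * 0xAC4C1B51) & MASK32
    x ^= x >> 15
    x = (x * 0x31848BAB) & MASK32
    x ^= x >> 14
    return x


def hashed_seed(tv_sec, tv_usec, pid):
    """Hash each input before combining, mirroring the proposed seedrand()."""
    return hash32(tv_sec) ^ hash32(hash32(tv_usec) ^ (pid & MASK32))


sec = 1545696000  # hypothetical startup time
print(hex(hashed_seed(sec, 500000, 4242)))
print(hex(hashed_seed(sec, 500001, 4243)))
```

<p>Every step is invertible (an xorshift or a multiplication by an odd constant), so distinct 32-bit inputs always hash to distinct outputs.</p>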

<p>I had said there’s a mistake in the Bash implementation of
Park–Miller. Take a closer look at the types and the assignment to
<code class="language-plaintext highlighter-rouge">rseed</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="cm">/* The variables */</span>
  <span class="kt">long</span> <span class="n">h</span><span class="p">,</span> <span class="n">l</span><span class="p">;</span>
  <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">rseed</span><span class="p">;</span>

  <span class="cm">/* The assignment */</span>
  <span class="n">rseed</span> <span class="o">=</span> <span class="mi">16807</span> <span class="o">*</span> <span class="n">l</span> <span class="o">-</span> <span class="mi">2836</span> <span class="o">*</span> <span class="n">h</span><span class="p">;</span>
</code></pre></div></div>

<p>The result of the subtraction can be negative, and that negative
value is converted to <code class="language-plaintext highlighter-rouge">unsigned long</code>. The C standard says
<code class="language-plaintext highlighter-rouge">ULONG_MAX + 1</code> is added to make the value positive. <code class="language-plaintext highlighter-rouge">ULONG_MAX</code>
varies by platform (typically <code class="language-plaintext highlighter-rouge">long</code> is either 32 bits or 64 bits),
so the results also vary. Here’s how the paper defined it:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="kt">long</span> <span class="n">test</span><span class="p">;</span>

  <span class="n">test</span> <span class="o">=</span> <span class="mi">16807</span> <span class="o">*</span> <span class="n">l</span> <span class="o">-</span> <span class="mi">2836</span> <span class="o">*</span> <span class="n">h</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">test</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
    <span class="n">rseed</span> <span class="o">=</span> <span class="n">test</span><span class="p">;</span>
  <span class="k">else</span>
    <span class="n">rseed</span> <span class="o">=</span> <span class="n">test</span> <span class="o">+</span> <span class="mi">2147483647</span><span class="p">;</span>
</code></pre></div></div>

<p>As far as I can tell, this mistake doesn’t hurt the quality of the
generator, but the same seed does produce different sequences on 32-bit
and 64-bit builds:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ 32/bash -c 'RANDOM=127773; echo $RANDOM $RANDOM'
29932 13634

$ 64/bash -c 'RANDOM=127773; echo $RANDOM $RANDOM'
29932 29115
</code></pre></div></div>
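
<p>The divergence can be reproduced without two Bash builds by modeling the width of <code class="language-plaintext highlighter-rouge">unsigned long</code> explicitly. This Python sketch follows the <code class="language-plaintext highlighter-rouge">brand()</code> code quoted earlier; the function name and the <code class="language-plaintext highlighter-rouge">long_bits</code> parameter are assumptions of the sketch, and the repeat filter is left out because it does not trigger for this seed.</p>

```python
def bash4_random(seed, count, long_bits):
    """Model Bash 4.x brand() for a given width of unsigned long."""
    mask = (1 << long_bits) - 1  # emulates conversion to unsigned long
    rseed = seed & mask
    outputs = []
    for _ in range(count):
        if rseed == 0:  # can't seed with 0
            rseed = 123459876
        h = rseed // 127773
        l = rseed % 127773
        # The difference may be negative; masking adds ULONG_MAX + 1.
        rseed = (16807 * l - 2836 * h) & mask
        outputs.append(rseed & 32767)
    return outputs


print(bash4_random(127773, 2, 32))  # [29932, 13634]
print(bash4_random(127773, 2, 64))  # [29932, 29115]
```

<p>The seed 127773 makes the very first subtraction negative, so the wrapped state differs between the two widths from that point on, even though its low 15 bits still agree for the first output.</p>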

<h3 id="zsh">Zsh</h3>

<p>In contrast to Bash, Zsh is the most straightforward: defer to
<code class="language-plaintext highlighter-rouge">rand(3)</code>. Its <code class="language-plaintext highlighter-rouge">$RANDOM</code> can return the same value twice in a row,
assuming that <code class="language-plaintext highlighter-rouge">rand(3)</code> does.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">zlong</span>
<span class="nf">randomgetfn</span><span class="p">(</span><span class="n">UNUSED</span><span class="p">(</span><span class="n">Param</span> <span class="n">pm</span><span class="p">))</span>
<span class="p">{</span>
    <span class="k">return</span> <span class="n">rand</span><span class="p">()</span> <span class="o">&amp;</span> <span class="mh">0x7fff</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span>
<span class="nf">randomsetfn</span><span class="p">(</span><span class="n">UNUSED</span><span class="p">(</span><span class="n">Param</span> <span class="n">pm</span><span class="p">),</span> <span class="n">zlong</span> <span class="n">v</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">srand</span><span class="p">((</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)</span><span class="n">v</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>A cool side effect is that you could override it, if you wanted, with <a href="https://xkcd.com/221/">a
custom generator</a>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span>
<span class="nf">rand</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">return</span> <span class="mi">4</span><span class="p">;</span> <span class="c1">// chosen by fair dice roll.</span>
              <span class="c1">// guaranteed to be random.</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Usage:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ gcc -shared -fPIC -o rand.so rand.c
$ LD_PRELOAD=./rand.so zsh -c 'echo $RANDOM $RANDOM $RANDOM'
4 4 4
</code></pre></div></div>

<p>This trick also applies to the rest of the shells below.</p>

<h3 id="kornshell-ksh">KornShell (ksh)</h3>

<p>KornShell originated in 1983, but it was finally released under an open
source license in 2005. There’s a clone of KornShell called Public
Domain Korn Shell (pdksh) that’s been forked a dozen different ways, but
I’ll get to that next.</p>

<p>KornShell defers to <code class="language-plaintext highlighter-rouge">rand(3)</code>, but it does some additional naïve
filtering on the output. When the shell starts up, it generates 10
values from <code class="language-plaintext highlighter-rouge">rand()</code>. If any of them is larger than 32,767 then it will
shift all generated numbers right by three bits.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define RANDMASK 0x7fff
</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">n</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">n</span> <span class="o">&lt;</span> <span class="mi">10</span><span class="p">;</span> <span class="n">n</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// Don't use lower bits when rand() generates large numbers.</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">rand</span><span class="p">()</span> <span class="o">&gt;</span> <span class="n">RANDMASK</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">rand_shift</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span>
            <span class="k">break</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>Why not just look at <code class="language-plaintext highlighter-rouge">RAND_MAX</code>? I guess they didn’t think of it.</p>

<p><strong>Update</strong>: Quentin Barnes pointed out that <code class="language-plaintext highlighter-rouge">RAND_MAX</code> didn’t exist
until POSIX standardization in 1988. The constant <a href="https://github.com/dspinellis/unix-history-repo/commit/1cc1b02a4361">first appeared in
Unix in 1990</a>. This KornShell code either predates the standard
or needed to work on systems that predate the standard.</p>

<p>Like Bash, repeated values are not allowed. I suspect one shell got this
idea from the other.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">do</span> <span class="p">{</span>
        <span class="n">cur</span> <span class="o">=</span> <span class="p">(</span><span class="n">rand</span><span class="p">()</span> <span class="o">&gt;&gt;</span> <span class="n">rand_shift</span><span class="p">)</span> <span class="o">&amp;</span> <span class="n">RANDMASK</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">cur</span> <span class="o">==</span> <span class="n">last</span><span class="p">);</span>
</code></pre></div></div>

<p>Who came up with this strange idea first?</p>
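
<p>Putting the two KornShell fragments together, the overall logic can be modeled roughly as follows. This is a Python sketch, not the ksh source: <code class="language-plaintext highlighter-rouge">rand</code> is any callable standing in for <code class="language-plaintext highlighter-rouge">rand(3)</code>, the helper names are invented, and the initial value of <code class="language-plaintext highlighter-rouge">last</code> is an assumption.</p>

```python
RANDMASK = 0x7FFF


def probe_shift(rand, tries=10):
    """Sample rand() up to ten times; use a shift of 3 if any value is large."""
    for _ in range(tries):
        if rand() > RANDMASK:
            return 3
    return 0


def make_ksh_random(rand):
    """Return a $RANDOM-like function built on top of rand()."""
    shift = probe_shift(rand)
    last = None  # assumption: no previous value at startup

    def next_value():
        nonlocal last
        cur = (rand() >> shift) & RANDMASK
        while cur == last:  # repeated values are not allowed
            cur = (rand() >> shift) & RANDMASK
        last = cur
        return cur

    return next_value


# A stand-in rand() with a small range: ten probe values, then the outputs.
small = iter([1] * 10 + [8, 8, 16])
ksh_random = make_ksh_random(lambda: next(small))
print(ksh_random(), ksh_random())  # 8 16
```

<p>Note that the probe consumes up to ten values from the underlying generator before the first <code class="language-plaintext highlighter-rouge">$RANDOM</code> is ever produced.</p>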

<h3 id="openbsds-public-domain-korn-shell-pdksh">OpenBSD’s Public Domain Korn Shell (pdksh)</h3>

<p>I picked the OpenBSD variant of pdksh since it’s the only pdksh fork I
ever touch in practice, and its <code class="language-plaintext highlighter-rouge">$RANDOM</code> is the most interesting of the
pdksh forks — at least since 2014.</p>

<p>Like Zsh, pdksh simply defers to <code class="language-plaintext highlighter-rouge">rand(3)</code>. However, OpenBSD’s <code class="language-plaintext highlighter-rouge">rand(3)</code>
is <a href="https://marc.info/?l=openbsd-tech&amp;m=141807224826859&amp;w=2">infamously and proudly non-standard</a>. By default it returns
<em>non-deterministic</em>, cryptographic-quality results seeded from system
entropy (via the misnamed <a href="https://man.openbsd.org/arc4random.3"><code class="language-plaintext highlighter-rouge">arc4random(3)</code></a>), à la <code class="language-plaintext highlighter-rouge">/dev/urandom</code>.
Its <code class="language-plaintext highlighter-rouge">$RANDOM</code> inherits this behavior.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">setint</span><span class="p">(</span><span class="n">vp</span><span class="p">,</span> <span class="p">(</span><span class="kt">int64_t</span><span class="p">)</span> <span class="p">(</span><span class="n">rand</span><span class="p">()</span> <span class="o">&amp;</span> <span class="mh">0x7fff</span><span class="p">));</span>
</code></pre></div></div>

<p>However, if a value is assigned to <code class="language-plaintext highlighter-rouge">$RANDOM</code> in order to seed it, it
reverts to its old pre-2014 deterministic generation via
<a href="https://man.openbsd.org/rand"><code class="language-plaintext highlighter-rouge">srand_deterministic(3)</code></a>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">srand_deterministic</span><span class="p">((</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)</span><span class="n">intval</span><span class="p">(</span><span class="n">vp</span><span class="p">));</span>
</code></pre></div></div>

<p>OpenBSD’s deterministic <code class="language-plaintext highlighter-rouge">rand(3)</code> is the crummy LCG from the C89
standard, just like Bash 3.x. So if you assign to <code class="language-plaintext highlighter-rouge">$RANDOM</code>, you’ll get
nearly the same results as Bash 3.x and earlier — the only difference
being that it can repeat numbers.</p>

<p>That’s a slick upgrade to the old interface without breaking anything,
making it my favorite version of <code class="language-plaintext highlighter-rouge">$RANDOM</code> for any shell.</p>
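
<p>For reference, here is a minimal sketch of the sample generator printed in the C standard, followed by the same 15-bit mask the pdksh source applies. The names <code class="language-plaintext highlighter-rouge">c89_srand</code>, <code class="language-plaintext highlighter-rouge">c89_rand</code>, and <code class="language-plaintext highlighter-rouge">random_var</code> are hypothetical, and this is not a claim that OpenBSD’s deterministic <code class="language-plaintext highlighter-rouge">rand(3)</code> matches it bit for bit, only that it is the same kind of LCG.</p>

```c
#include <stdint.h>

/* State for the sample LCG from the C standard (hypothetical names). */
static uint32_t c89_state = 1;

/* Seed the generator, as srand() does in the standard's sample. */
void c89_srand(unsigned seed)
{
    c89_state = seed;
}

/* Advance the LCG and return 15 bits, as in the standard's sample. */
int c89_rand(void)
{
    c89_state = c89_state * 1103515245u + 12345u;
    return (int)((c89_state / 65536u) % 32768u);
}

/* Apply the same mask that pdksh applies to rand(). */
int random_var(void)
{
    return c89_rand() & 0x7fff;
}
```

<p>Seeding with the same value always replays the same sequence, which is the property that assigning to <code class="language-plaintext highlighter-rouge">$RANDOM</code> restores.</p>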

]]>
    </content>
  </entry>
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  <entry>
    <title>Intercepting and Emulating Linux System Calls with Ptrace</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2018/06/23/"/>
    <id>urn:uuid:a39b7709-d0a6-3b12-159f-7445d9524594</id>
    <updated>2018-06-23T20:41:08Z</updated>
    <category term="linux"/><category term="x86"/><category term="c"/><category term="bsd"/>
    <content type="html">
      <![CDATA[<p>The <code class="language-plaintext highlighter-rouge">ptrace(2)</code> (“process trace”) system call is usually associated with
debugging. It’s the primary mechanism through which native debuggers
monitor debuggees on unix-like systems. It’s also the usual approach for
implementing <a href="https://blog.plover.com/Unix/strace-groff.html">strace</a> — system call trace. With Ptrace, tracers
can pause tracees, <a href="/blog/2016/09/03/">inspect and set registers and memory</a>, monitor
system calls, or even <em>intercept</em> system calls.</p>

<p>By intercept, I mean that the tracer can mutate system call arguments,
mutate the system call return value, or even block certain system calls.
Reading between the lines, this means a tracer can fully service system
calls itself. This is particularly interesting because it also means <strong>a
tracer can emulate an entire foreign operating system</strong>. This is done
without any special help from the kernel beyond Ptrace.</p>

<p>The catch is that a process can only have one tracer attached at a time,
so it’s not possible to emulate a foreign operating system while also
debugging that process with, say, GDB. The other issue is that emulated
system calls will have higher overhead.</p>

<p>For this article I’m going to focus on <a href="http://man7.org/linux/man-pages/man2/ptrace.2.html">Linux’s Ptrace</a> on
x86-64, and I’ll be taking advantage of a few Linux-specific extensions.
For the article I’ll also be omitting error checks, but the full source
code listings will have them.</p>

<p>You can find runnable code for the examples in this article here:</p>

<p><strong><a href="https://github.com/skeeto/ptrace-examples">https://github.com/skeeto/ptrace-examples</a></strong></p>

<h3 id="strace">strace</h3>

<p>Before getting into the really interesting stuff, let’s start by
reviewing a bare bones implementation of strace. It’s <a href="/blog/2018/01/17/">no
DTrace</a>, but strace is still incredibly useful.</p>

<p>Ptrace has never been standardized. Its interface is similar across
different operating systems, especially in its core functionality, but
it’s still subtly different from system to system. The <code class="language-plaintext highlighter-rouge">ptrace(2)</code>
prototype generally looks something like this, though the specific
types may be different.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="nf">ptrace</span><span class="p">(</span><span class="kt">int</span> <span class="n">request</span><span class="p">,</span> <span class="n">pid_t</span> <span class="n">pid</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">addr</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">data</span><span class="p">);</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">pid</code> is the tracee’s process ID. While a tracee can have only one
tracer attached at a time, a tracer can be attached to many tracees.</p>

<p>The <code class="language-plaintext highlighter-rouge">request</code> field selects a specific Ptrace function, just like the
<code class="language-plaintext highlighter-rouge">ioctl(2)</code> interface. For strace, only three are needed:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">PTRACE_TRACEME</code>: This process is to be traced by its parent.</li>
  <li><code class="language-plaintext highlighter-rouge">PTRACE_SYSCALL</code>: Continue, but stop at the next system call
entrance or exit.</li>
  <li><code class="language-plaintext highlighter-rouge">PTRACE_GETREGS</code>: Get a copy of the tracee’s registers.</li>
</ul>

<p>The other two fields, <code class="language-plaintext highlighter-rouge">addr</code> and <code class="language-plaintext highlighter-rouge">data</code>, serve as generic arguments for
the selected Ptrace function. One or both are often ignored, in which
case I pass zero.</p>

<p>The strace interface is essentially a prefix to another command.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ strace [strace options] program [arguments]
</code></pre></div></div>

<p>My minimal strace doesn’t have any options, so the first thing to do —
assuming it has at least one argument — is <code class="language-plaintext highlighter-rouge">fork(2)</code> and <code class="language-plaintext highlighter-rouge">exec(2)</code> the
tracee process on the tail of <code class="language-plaintext highlighter-rouge">argv</code>. But before loading the target
program, the new process will inform the kernel that it’s going to be
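traced by its parent. The request itself doesn’t pause the tracee. On Linux
the kernel stops it with a <code class="language-plaintext highlighter-rouge">SIGTRAP</code> once the following <code class="language-plaintext highlighter-rouge">exec(2)</code> succeeds.</p>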

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pid_t</span> <span class="n">pid</span> <span class="o">=</span> <span class="n">fork</span><span class="p">();</span>
<span class="k">switch</span> <span class="p">(</span><span class="n">pid</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span> <span class="cm">/* error */</span>
        <span class="n">FATAL</span><span class="p">(</span><span class="s">"%s"</span><span class="p">,</span> <span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">));</span>
    <span class="k">case</span> <span class="mi">0</span><span class="p">:</span>  <span class="cm">/* child */</span>
        <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_TRACEME</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
        <span class="n">execvp</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">argv</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
        <span class="n">FATAL</span><span class="p">(</span><span class="s">"%s"</span><span class="p">,</span> <span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The parent waits for the traced child to stop using <code class="language-plaintext highlighter-rouge">wait(2)</code>. When
<code class="language-plaintext highlighter-rouge">wait(2)</code> returns, the child will be paused.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">waitpid</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</code></pre></div></div>

<p>Before allowing the child to continue, we tell the operating system that
the tracee should be terminated along with its parent. A real strace
implementation may want to set other options, such as
<code class="language-plaintext highlighter-rouge">PTRACE_O_TRACEFORK</code>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SETOPTIONS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">PTRACE_O_EXITKILL</span><span class="p">);</span>
</code></pre></div></div>

<p>All that’s left is a simple, endless loop that catches system calls
one at a time. The body of the loop has four steps:</p>

<ol>
  <li>Wait for the process to enter the next system call.</li>
  <li>Print a representation of the system call.</li>
  <li>Allow the system call to execute and wait for the return.</li>
  <li>Print the system call return value.</li>
</ol>

<p>The <code class="language-plaintext highlighter-rouge">PTRACE_SYSCALL</code> request is used in both waiting for the next system
call to begin, and waiting for that system call to exit. As before, a
<code class="language-plaintext highlighter-rouge">wait(2)</code> is needed to wait for the tracee to enter the desired state.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SYSCALL</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="n">waitpid</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</code></pre></div></div>

<p>When <code class="language-plaintext highlighter-rouge">wait(2)</code> returns, the registers for the thread that made the
system call are filled with the system call number and its arguments.
However, <em>the operating system has not yet serviced this system call</em>.
This detail will be important later.</p>

<p>The next step is to gather the system call information. This is where
it gets architecture specific. On x86-64, <a href="/blog/2015/05/15/">the system call number is
passed in <code class="language-plaintext highlighter-rouge">rax</code></a>, and the arguments (up to 6) are passed in
<code class="language-plaintext highlighter-rouge">rdi</code>, <code class="language-plaintext highlighter-rouge">rsi</code>, <code class="language-plaintext highlighter-rouge">rdx</code>, <code class="language-plaintext highlighter-rouge">r10</code>, <code class="language-plaintext highlighter-rouge">r8</code>, and <code class="language-plaintext highlighter-rouge">r9</code>. Reading the registers is
another Ptrace call, though there’s no need to <code class="language-plaintext highlighter-rouge">wait(2)</code> since the
tracee isn’t changing state.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">user_regs_struct</span> <span class="n">regs</span><span class="p">;</span>
<span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_GETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>
<span class="kt">long</span> <span class="n">syscall</span> <span class="o">=</span> <span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span><span class="p">;</span>

<span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"%ld(%ld, %ld, %ld, %ld, %ld, %ld)"</span><span class="p">,</span>
        <span class="n">syscall</span><span class="p">,</span>
        <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">rdi</span><span class="p">,</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">rsi</span><span class="p">,</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">rdx</span><span class="p">,</span>
        <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">r10</span><span class="p">,</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">r8</span><span class="p">,</span>  <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">r9</span><span class="p">);</span>
</code></pre></div></div>

<p>There’s one caveat. For <a href="https://web.archive.org/web/20190323050358/https://stackoverflow.com/a/6469069">internal kernel purposes</a>, the system
call number is stored in <code class="language-plaintext highlighter-rouge">orig_rax</code> rather than <code class="language-plaintext highlighter-rouge">rax</code>. All the other
system call arguments are straightforward.</p>

<p>Next it’s another <code class="language-plaintext highlighter-rouge">PTRACE_SYSCALL</code> and <code class="language-plaintext highlighter-rouge">wait(2)</code>, then another
<code class="language-plaintext highlighter-rouge">PTRACE_GETREGS</code> to fetch the result. The result is stored in <code class="language-plaintext highlighter-rouge">rax</code>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_GETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>
<span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">" = %ld</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">regs</span><span class="p">.</span><span class="n">rax</span><span class="p">);</span>
</code></pre></div></div>

<p>The output from this simple program is <em>very</em> crude. There is no
symbolic name for the system call and every argument is printed
numerically, even if it’s a pointer to a buffer. A more complete strace
would know which arguments are pointers and use <code class="language-plaintext highlighter-rouge">process_vm_readv(2)</code> to
read those buffers from the tracee in order to print them appropriately.</p>

<p>However, this does lay the groundwork for system call interception.</p>
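
<p>On the missing symbolic names mentioned above: a small lookup table is enough for a first pass. This is only a sketch. <code class="language-plaintext highlighter-rouge">syscall_name</code> is a hypothetical helper covering a handful of calls, using the <code class="language-plaintext highlighter-rouge">SYS_*</code> constants from <code class="language-plaintext highlighter-rouge">&lt;sys/syscall.h&gt;</code>. A real strace carries a complete table per architecture.</p>

```c
#include <stddef.h>
#include <sys/syscall.h>

/* Hypothetical helper: map a few system call numbers to names. */
const char *syscall_name(long nr)
{
    static const struct {
        long nr;
        const char *name;
    } table[] = {
        {SYS_read,       "read"},
        {SYS_write,      "write"},
        {SYS_close,      "close"},
        {SYS_brk,        "brk"},
        {SYS_exit_group, "exit_group"},
    };
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].nr == nr) {
            return table[i].name;
        }
    }
    return NULL;  /* unknown: the caller falls back to the number */
}
```

<p>The print step would try <code class="language-plaintext highlighter-rouge">syscall_name(regs.orig_rax)</code> first and fall back to the raw number when it returns <code class="language-plaintext highlighter-rouge">NULL</code>.</p>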

<h3 id="system-call-interception">System call interception</h3>

<p>Suppose we want to use Ptrace to implement something like OpenBSD’s
<a href="https://man.openbsd.org/pledge.2"><code class="language-plaintext highlighter-rouge">pledge(2)</code></a>, in which <a href="http://www.openbsd.org/papers/hackfest2015-pledge/mgp00001.html">a process <em>pledges</em> to use only a
restricted set of system calls</a>. The idea is that many
programs typically have an initialization phase where they need lots
of system access (opening files, binding sockets, etc.). After
initialization they enter a main loop in which they process input
and only a small set of system calls are needed.</p>

<p>Before entering this main loop, a process can limit itself to the few
operations that it needs. If <a href="/blog/2017/07/19/">the program has a flaw</a> allowing it
to be exploited by bad input, the pledge significantly limits what the
exploit can accomplish.</p>

<p>Using the same strace model, rather than print out all system calls,
we could either block certain system calls or simply terminate the
tracee when it misbehaves. Termination is easy: just call <code class="language-plaintext highlighter-rouge">exit(2)</code> in
the tracer, since it’s configured to also terminate the tracee.
Blocking the system call and allowing the child to continue is a
little trickier.</p>

<p>The tricky part is that <strong>there’s no way to abort a system call once
it’s started</strong>. When the tracer returns from <code class="language-plaintext highlighter-rouge">wait(2)</code> on the entrance to
the system call, the only way to stop a system call from happening is
to terminate the tracee.</p>

<p>However, not only can we mess with the system call arguments, we can
change the system call number itself, converting it to a system call
that doesn’t exist. On return we can report a “friendly” <code class="language-plaintext highlighter-rouge">EPERM</code> error
in <code class="language-plaintext highlighter-rouge">errno</code> <a href="/blog/2016/09/23/">via the normal in-band signaling</a>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
    <span class="cm">/* Enter next system call */</span>
    <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SYSCALL</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
    <span class="n">waitpid</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>

    <span class="k">struct</span> <span class="n">user_regs_struct</span> <span class="n">regs</span><span class="p">;</span>
    <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_GETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>

    <span class="cm">/* Is this system call permitted? */</span>
    <span class="kt">int</span> <span class="n">blocked</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">is_syscall_blocked</span><span class="p">(</span><span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span><span class="p">))</span> <span class="p">{</span>
        <span class="n">blocked</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
        <span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="c1">// set to invalid syscall</span>
        <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="cm">/* Run system call and stop on exit */</span>
    <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SYSCALL</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
    <span class="n">waitpid</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">blocked</span><span class="p">)</span> <span class="p">{</span>
        <span class="cm">/* errno = EPERM */</span>
        <span class="n">regs</span><span class="p">.</span><span class="n">rax</span> <span class="o">=</span> <span class="o">-</span><span class="n">EPERM</span><span class="p">;</span> <span class="c1">// Operation not permitted</span>
        <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
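
<p>Writing <code class="language-plaintext highlighter-rouge">-EPERM</code> into <code class="language-plaintext highlighter-rouge">rax</code> works because of the in-band convention on Linux: a raw system call reports failure by returning a value from -4095 to -1, and the libc wrapper turns that into a -1 return with <code class="language-plaintext highlighter-rouge">errno</code> set. Here is a sketch of that conversion, where <code class="language-plaintext highlighter-rouge">decode_result</code> is a hypothetical name rather than anything from libc:</p>

```c
#include <errno.h>

/* Hypothetical helper mirroring what a libc system call wrapper does
 * with the raw value the kernel leaves in rax. */
long decode_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;  /* in-band error: the value is -errno */
        return -1;
    }
    return raw;             /* success: the value passes through */
}
```

<p>So from the tracee’s point of view, the blocked call simply returns -1 with <code class="language-plaintext highlighter-rouge">errno</code> equal to <code class="language-plaintext highlighter-rouge">EPERM</code>.</p>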

<p>This simple example only checks against a whitelist or blacklist of
system calls. And there’s no nuance, such as allowing files to be
opened (<code class="language-plaintext highlighter-rouge">open(2)</code>) read-only but not as writable, allowing anonymous
memory maps but not non-anonymous mappings, etc. There’s also no way
for the tracee to dynamically drop privileges.</p>

<p>How <em>could</em> the tracee communicate to the tracer? Use an artificial
system call!</p>

<h3 id="creating-an-artificial-system-call">Creating an artificial system call</h3>

<p>For my new pledge-like system call — which I call <code class="language-plaintext highlighter-rouge">xpledge()</code> to
distinguish it from the real thing — I picked system call number 10000,
a nice high number that’s unlikely to ever be used for a real system
call.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define SYS_xpledge 10000
</span></code></pre></div></div>

<p>Just for demonstration purposes, I put together a minuscule interface
that’s not good for much in practice. It has little in common with
OpenBSD’s <code class="language-plaintext highlighter-rouge">pledge(2)</code>, which uses a <a href="https://www.tedunangst.com/flak/post/string-interfaces">string interface</a>.
<em>Actually</em> designing robust and secure sets of privileges is really
complicated, as the <code class="language-plaintext highlighter-rouge">pledge(2)</code> manpage shows. Here’s the entire
interface <em>and</em> implementation of the system call for the tracee:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define _GNU_SOURCE
#include</span> <span class="cpf">&lt;unistd.h&gt;</span><span class="cp">
</span>
<span class="cp">#define XPLEDGE_RDWR  (1 &lt;&lt; 0)
#define XPLEDGE_OPEN  (1 &lt;&lt; 1)
</span>
<span class="cp">#define xpledge(arg) syscall(SYS_xpledge, arg)
</span></code></pre></div></div>

<p>If the tracee passes zero for the argument, only a few basic system calls are
allowed, including those used to allocate memory (e.g. <code class="language-plaintext highlighter-rouge">brk(2)</code>). The
<code class="language-plaintext highlighter-rouge">XPLEDGE_RDWR</code> bit allows <a href="/blog/2017/03/01/">various</a> read and write system calls
(<code class="language-plaintext highlighter-rouge">read(2)</code>, <code class="language-plaintext highlighter-rouge">readv(2)</code>, <code class="language-plaintext highlighter-rouge">pread(2)</code>, <code class="language-plaintext highlighter-rouge">preadv(2)</code>, etc.). The
<code class="language-plaintext highlighter-rouge">XPLEDGE_OPEN</code> bit allows <code class="language-plaintext highlighter-rouge">open(2)</code>.</p>

<p>To prevent privileges from being escalated back, <code class="language-plaintext highlighter-rouge">xpledge()</code> blocks
itself — though this also prevents dropping more privileges later down
the line.</p>
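
<p>The policy described above could be tracked in the tracer along these lines. This is a sketch under my own assumptions, not the code from the repository: the exact baseline of always-allowed calls, the inclusion of <code class="language-plaintext highlighter-rouge">openat(2)</code> alongside <code class="language-plaintext highlighter-rouge">open(2)</code>, and the internals of <code class="language-plaintext highlighter-rouge">register_pledge</code> and <code class="language-plaintext highlighter-rouge">is_syscall_blocked</code> are all guesses for illustration.</p>

```c
#include <sys/syscall.h>

#define SYS_xpledge   10000
#define XPLEDGE_RDWR  (1 << 0)
#define XPLEDGE_OPEN  (1 << 1)

static int pledged = 0;          /* has the tracee pledged yet? */
static unsigned long allow = 0;  /* XPLEDGE_* bits it kept */

/* Record the tracee's pledge; only the first call takes effect. */
void register_pledge(unsigned long flags)
{
    if (!pledged) {
        pledged = 1;
        allow = flags;
    }
}

/* Return nonzero if the tracer should block this system call. */
int is_syscall_blocked(long nr)
{
    if (!pledged) {
        return 0;  /* nothing is restricted before the pledge */
    }
    switch (nr) {
    /* Assumed always-allowed baseline */
    case SYS_brk:
    case SYS_exit:
    case SYS_exit_group:
        return 0;
    case SYS_read:
    case SYS_readv:
    case SYS_pread64:
    case SYS_preadv:
    case SYS_write:
    case SYS_writev:
    case SYS_pwrite64:
    case SYS_pwritev:
        return !(allow & XPLEDGE_RDWR);
#ifdef SYS_open
    case SYS_open:
#endif
    case SYS_openat:
        return !(allow & XPLEDGE_OPEN);
    default:
        return 1;  /* everything else, including xpledge itself */
    }
}
```

<p>Because <code class="language-plaintext highlighter-rouge">SYS_xpledge</code> falls into the default case once a pledge is registered, a second pledge is rejected like any other forbidden call.</p>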

<p>In the xpledge tracer, I just need to check for this system call:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Handle entrance */</span>
<span class="k">switch</span> <span class="p">(</span><span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="n">SYS_xpledge</span><span class="p">:</span>
        <span class="n">register_pledge</span><span class="p">(</span><span class="n">regs</span><span class="p">.</span><span class="n">rdi</span><span class="p">);</span>
        <span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The operating system will return <code class="language-plaintext highlighter-rouge">ENOSYS</code> (Function not implemented)
since this isn’t a <em>real</em> system call. So on the way out I overwrite
this with a success (0).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Handle exit */</span>
<span class="k">switch</span> <span class="p">(</span><span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">case</span> <span class="n">SYS_xpledge</span><span class="p">:</span>
        <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_POKEUSER</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="n">RAX</span> <span class="o">*</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
        <span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
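
<p>The <code class="language-plaintext highlighter-rouge">RAX * 8</code> offset deserves a note. <code class="language-plaintext highlighter-rouge">PTRACE_POKEUSER</code> addresses the tracee’s <code class="language-plaintext highlighter-rouge">struct user</code> by byte offset, the general-purpose registers sit at its start, and <code class="language-plaintext highlighter-rouge">RAX</code> from <code class="language-plaintext highlighter-rouge">&lt;sys/reg.h&gt;</code> is the index of the 8-byte <code class="language-plaintext highlighter-rouge">rax</code> slot. The sketch below checks that relationship at compile time on x86-64. The function name <code class="language-plaintext highlighter-rouge">poke_rax_offset</code> is hypothetical.</p>

```c
#include <stddef.h>
#include <sys/user.h>
#if defined(__x86_64__)
#include <sys/reg.h>
#endif

/* Byte offset of rax within struct user, or -1 when not on x86-64. */
long poke_rax_offset(void)
{
#if defined(__x86_64__)
    _Static_assert(offsetof(struct user, regs) == 0,
                   "registers come first in struct user");
    _Static_assert(RAX * 8 == offsetof(struct user_regs_struct, rax),
                   "RAX indexes 8-byte register slots");
    return (long)offsetof(struct user, regs.rax);
#else
    return -1;
#endif
}
```

<p>Poking a single word this way avoids a full <code class="language-plaintext highlighter-rouge">PTRACE_GETREGS</code> and <code class="language-plaintext highlighter-rouge">PTRACE_SETREGS</code> round trip when only the return value changes.</p>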

<p>I wrote a little test program that opens <code class="language-plaintext highlighter-rouge">/dev/urandom</code>, makes a read,
tries to pledge, then tries to open <code class="language-plaintext highlighter-rouge">/dev/urandom</code> a second time, then
confirms it can read from the original <code class="language-plaintext highlighter-rouge">/dev/urandom</code> file descriptor.
Running without a pledge tracer, the output looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ./example
fread("/dev/urandom")[1] = 0xcd2508c7
XPledging...
XPledge failed: Function not implemented
fread("/dev/urandom")[2] = 0x0be4a986
fread("/dev/urandom")[1] = 0x03147604
</code></pre></div></div>

<p>Making an invalid system call doesn’t crash an application. It just
fails, which is a rather convenient fallback. When run under the
tracer, it looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ./xpledge ./example
fread("/dev/urandom")[1] = 0xb2ac39c4
XPledging...
fopen("/dev/urandom")[2]: Operation not permitted
fread("/dev/urandom")[1] = 0x2e1bd1c4
</code></pre></div></div>

<p>The pledge succeeds but the second <code class="language-plaintext highlighter-rouge">fopen(3)</code> does not since the tracer
blocked it with <code class="language-plaintext highlighter-rouge">EPERM</code>.</p>

<p>This concept could be taken much further, to, say, change file paths or
return fake results. A tracer could effectively chroot its tracee,
prepending some chroot path to the root of any path passed through a
system call. It could even lie to the process about what user it is,
claiming that it’s running as root. In fact, this is exactly how the
<a href="https://fakeroot-ng.lingnu.com/index.php/Home_Page">Fakeroot NG</a> program works.</p>

<h3 id="foreign-system-emulation">Foreign system emulation</h3>

<p>Suppose you don’t just want to intercept <em>some</em> system calls, but
<em>all</em> system calls. You’ve got <a href="/blog/2017/11/30/">a binary intended to run on another
operating system</a>, so none of the system calls it makes will ever
work.</p>

<p>You could manage all this using only what I’ve described so far. The
tracer would always replace the system call number with a dummy, allow
it to fail, then service the system call itself. But that’s really
inefficient. That’s essentially three context switches for each system
call: one to stop on the entrance, one to make the always-failing
system call, and one to stop on the exit.</p>

<p>The Linux version of Ptrace has had a more efficient operation for
this technique since 2005: <code class="language-plaintext highlighter-rouge">PTRACE_SYSEMU</code>. Ptrace stops only <em>once</em>
per system call, and it’s up to the tracer to service that system
call before allowing the tracee to continue.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
    <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_SYSEMU</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
    <span class="n">waitpid</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>

    <span class="k">struct</span> <span class="n">user_regs_struct</span> <span class="n">regs</span><span class="p">;</span>
    <span class="n">ptrace</span><span class="p">(</span><span class="n">PTRACE_GETREGS</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">regs</span><span class="p">);</span>

    <span class="k">switch</span> <span class="p">(</span><span class="n">regs</span><span class="p">.</span><span class="n">orig_rax</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">case</span> <span class="n">OS_read</span><span class="p">:</span>
            <span class="cm">/* ... */</span>

        <span class="k">case</span> <span class="n">OS_write</span><span class="p">:</span>
            <span class="cm">/* ... */</span>

        <span class="k">case</span> <span class="n">OS_open</span><span class="p">:</span>
            <span class="cm">/* ... */</span>

        <span class="k">case</span> <span class="n">OS_exit</span><span class="p">:</span>
            <span class="cm">/* ... */</span>

        <span class="cm">/* ... and so on ... */</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>To run binaries for the same architecture from any system with a
stable (enough) system call ABI, you just need this <code class="language-plaintext highlighter-rouge">PTRACE_SYSEMU</code>
tracer, a loader (to take the place of <code class="language-plaintext highlighter-rouge">exec(2)</code>), and whatever system
libraries the binary needs (or only run static binaries).</p>

<p>In fact, this sounds like a fun weekend project.</p>

<h3 id="see-also">See also</h3>

<ul>
  <li><a href="https://www.youtube.com/watch?v=uXgxMDglxVM">Implementing a clone of OpenBSD pledge into the Linux kernel</a></li>
</ul>

]]>
    </content>
  </entry>
    
  
    
  
    
  
    
  
    
  
    
  
    
  <entry>
    <title>A Crude Personal Package Manager</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2018/03/27/"/>
    <id>urn:uuid:b100f50f-c8f8-3a08-e149-a04b2308226b</id>
    <updated>2018-03-27T02:10:35Z</updated>
    <category term="c"/><category term="posix"/><category term="linux"/><category term="bsd"/>
    <content type="html">
      <![CDATA[<p>For the past couple of months I’ve been using a custom package manager
to manage a handful of software packages within various unix-like
environments. Packages are <a href="/blog/2017/06/19/">installed in my home directory</a> under
<code class="language-plaintext highlighter-rouge">~/.local/bin</code>, and the package manager itself is just a 110 line Bourne
shell script. It’s not intended to replace the system’s package
manager but, instead, complement it in some cases where I need more
flexibility. I use it to run custom versions of specific pieces of
software — newer or older than the system-installed versions, or with my
own patches and modifications — without interfering with the rest of
the system, and without a need for root access. It’s worked out <em>really</em>
well so far and I expect to continue making heavy use of it in the
future.</p>

<p>It’s so simple that I haven’t even bothered putting the script in its
own repository. It sits unadorned within my dotfiles repository with the
name <em>qpkg</em> (“quick package”):</p>

<ul>
  <li><a href="https://github.com/skeeto/dotfiles/blob/master/bin/qpkg">https://github.com/skeeto/dotfiles/blob/master/bin/qpkg</a></li>
</ul>

<p>Sitting alongside my dotfiles means it’s always there when I need it,
just as if it was a built-in command.</p>

<p>I say it’s crude because its “install” (<code class="language-plaintext highlighter-rouge">-I</code>) procedure is little more
than a wrapper around tar. It doesn’t invoke libtool after installing a
library, and there’s no post-install script — or <code class="language-plaintext highlighter-rouge">postinst</code> as Debian
calls it. It doesn’t check for conflicts between packages, though
there’s a command for doing so manually ahead of time. It doesn’t manage
dependencies, nor even have them as a concept. That’s all on the user to
screw up.</p>

<p>In other words, it doesn’t attempt to solve most of the hard problems
tackled by package managers… <em>except</em> for three important issues:</p>

<ol>
  <li>
    <p>It provides a clean, guaranteed-to-work uninstall procedure. Some
Makefiles <em>do</em> have a token “uninstall” target, but it’s often
unreliable.</p>
  </li>
  <li>
    <p>Unlike blindly using a Makefile “install” target, I can check for
conflicts <em>before</em> installing the software. I’ll know if and how a
package clobbers an already-installed package, and I can manage, or
ignore, that conflict manually as needed.</p>
  </li>
  <li>
    <p>It produces a compact, reusable package file that I can reinstall
later, even on a different machine (with a couple of caveats). I
don’t need to keep around the original source and build directories
should I want to install or uninstall later. I can also rapidly
switch back and forth between different builds of the same software.</p>
  </li>
</ol>

<p>The first caveat is that the package will be configured for exactly my
own home directory, so I usually can’t share it with other users, or
install it on machines where I have a different home directory. Though I
could still create packages for different installation prefixes.</p>

<p>The second caveat is that some builds tailor themselves by default to
the host (e.g. <code class="language-plaintext highlighter-rouge">-march=native</code>). If care isn’t taken, those packages may
not be very portable. This is more common than I had expected and has
mildly annoyed me.</p>

<h3 id="birth-of-a-package-manager">Birth of a package manager</h3>

<p>While the package manager is new, I’ve been building and installing
software in my home directory for years. I’d follow the normal process
of setting the install <em>prefix</em> to <code class="language-plaintext highlighter-rouge">$HOME/.local</code>, running the build,
and then letting the “install” target do its thing.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ tar xzf name-version.tar.gz
$ cd name-version/
$ ./configure --prefix=$HOME/.local
$ make -j$(nproc)
$ make install
</code></pre></div></div>

<p>This worked well enough for years. However, I’ve come to rely a lot on
this technique, and I’m using it for increasingly sophisticated
purposes, such as building custom cross-compiler toolchains.</p>

<p>A common difficulty has been handling the release of new versions of
software. I’d like to upgrade to the new version, but lack a way to
cleanly uninstall the previous version. Simply clobbering the old
version by installing it on top <em>usually</em> works. Occasionally it
wouldn’t, and I’d have to blow away <code class="language-plaintext highlighter-rouge">~/.local</code> and start all over again.
With more and more software installed in my home directory, restarting
has become more and more of a chore that I’d like to avoid.</p>

<p>What I needed was a way to track exactly which files were installed so
that I could remove them later when I needed to uninstall. Fortunately
there’s a widely-used convention for exactly this purpose: <code class="language-plaintext highlighter-rouge">DESTDIR</code>.</p>

<p>It’s expected that when a Makefile provides an “install” target, it
prefixes the installation path with the <code class="language-plaintext highlighter-rouge">DESTDIR</code> macro, which is
assigned to the empty string by default. This allows the user to install
the software to a temporary location for the purposes of packaging.
Unlike the installation prefix (<code class="language-plaintext highlighter-rouge">--prefix</code>) configured before the build
began, the software is not expected to function properly when run in the
<code class="language-plaintext highlighter-rouge">DESTDIR</code> location.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ DESTDIR=_destdir
$ mkdir $DESTDIR
$ make DESTDIR=$DESTDIR install
</code></pre></div></div>

<p>A different tool will be used to copy these files into place and actually
install it. This tool can track what files were installed, allowing them
to be removed later when uninstalling. My package manager uses the tar
program for both purposes. First it creates a package by packing up the
<code class="language-plaintext highlighter-rouge">DESTDIR</code> (at the root of the actual install prefix):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ tar czf package.tgz -C $DESTDIR$HOME/.local .
</code></pre></div></div>
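
<p>To see what ends up inside such a package, here’s a rough sketch that
fakes a staged tree and packs it the same way. The temporary directory,
the <code class="language-plaintext highlighter-rouge">stage</code> variable, and the <code class="language-plaintext highlighter-rouge">hello</code> files are made up for
illustration. They stand in for the real staged prefix and are not part
of the actual script.</p>

```shell
# Hypothetical stand-ins: "stage" plays the role of $DESTDIR$HOME/.local,
# and the hello files are invented for this demo.
work=$(mktemp -d)
stage="$work/stage"
mkdir -p "$stage/bin" "$stage/share/man/man1"
touch "$stage/bin/hello" "$stage/share/man/man1/hello.1"

# Pack from inside the staged prefix so member names are relative to it.
tar czf "$work/package.tgz" -C "$stage" .

# List the package contents, one member per line.
listing=$(tar tzf "$work/package.tgz")
printf '%s\n' "$listing"
```

<p>Because of <code class="language-plaintext highlighter-rouge">-C</code>, the members are stored relative to the install
prefix, so the same tarball can be unpacked directly into that prefix.</p>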

<p>So a package is nothing more than a gzipped tarball. To install, it
unpacks the tarball in <code class="language-plaintext highlighter-rouge">~/.local</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd $HOME/.local
$ tar xzf ~/package.tgz
</code></pre></div></div>

<p>But how does it uninstall a package? It didn’t keep track of what was
installed. Easy! The tarball itself contains the package list, and it’s
printed with tar’s <code class="language-plaintext highlighter-rouge">t</code> mode.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd $HOME/.local
for file in $(tar tzf ~/package.tgz | grep -v '/$'); do
    rm -f "$file"
done
</code></pre></div></div>

<p>I’m using <code class="language-plaintext highlighter-rouge">grep</code> to skip directories, which are conveniently listed with
a trailing slash. Note that in the example above, there are a couple of
issues with file names containing whitespace. If the file contains a
space character, it will word split incorrectly in the <code class="language-plaintext highlighter-rouge">for</code> loop. A
Makefile couldn’t handle such a file in the first place, but, in case
it’s still necessary, my package manager sets <code class="language-plaintext highlighter-rouge">IFS</code> to just a newline.</p>
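
<p>As a minimal sketch of that idea, and not the actual code from the
script, the loop below sets <code class="language-plaintext highlighter-rouge">IFS</code> to a single newline before
splitting the listing. The temporary prefix and the file names are
assumptions invented for the demonstration.</p>

```shell
# Sketch only: the paths and file names here are invented for the demo.
work=$(mktemp -d)
prefix="$work/prefix"      # stands in for ~/.local
stage="$work/stage"        # stands in for the staged install tree
mkdir -p "$stage/share" "$prefix"
touch "$stage/share/read me.txt" "$stage/share/notes.txt"
tar czf "$work/package.tgz" -C "$stage" .

# "Install" by unpacking into the prefix.
tar xzf "$work/package.tgz" -C "$prefix"

# "Uninstall": split the listing on newlines only, so a name
# containing a space stays in one piece.
cd "$prefix" || exit 1
IFS='
'
for file in $(tar tzf "$work/package.tgz" | grep -v '/$'); do
    rm -f "$file"
done
unset IFS
```

<p>The unquoted command substitution is still subject to pathname
expansion, so names containing glob characters remain a hazard in this
sketch.</p>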

<p>If the file name contains a newline, then my package manager relies on
<a href="http://dinaburg.org/bitsquatting.html">a cosmic ray striking just the right bit at just the right
instant</a> to make it all work out, because no version of tar can
unambiguously print such file names. Crossing your fingers during this
process may help.</p>

<h3 id="commands">Commands</h3>

<p>There are five commands, each assigned to a capital letter: <code class="language-plaintext highlighter-rouge">-B</code>, <code class="language-plaintext highlighter-rouge">-C</code>,
<code class="language-plaintext highlighter-rouge">-I</code>, <code class="language-plaintext highlighter-rouge">-V</code>, and <code class="language-plaintext highlighter-rouge">-U</code>. It’s an interface pattern inspired by <a href="https://www.openbsd.org/papers/bsdcan-signify.html">Ted
Unangst’s signify</a> (see <a href="https://man.openbsd.org/signify.1"><code class="language-plaintext highlighter-rouge">signify(1)</code></a>). I also used this
pattern with <a href="/blog/2017/09/15/">Blowpipe</a> and, in retrospect, wish I had also used it
with <a href="/blog/2017/03/12/">Enchive</a>.</p>

<h4 id="build--b">Build (<code class="language-plaintext highlighter-rouge">-B</code>)</h4>

<p>Unlike the other four commands, the “build” command isn’t essential,
and is just for convenience. It assumes the build uses an Autoconf-like
configure script and runs it automatically, followed by <code class="language-plaintext highlighter-rouge">make</code> with the
appropriate <code class="language-plaintext highlighter-rouge">-j</code> (jobs) option. It automatically sets the <code class="language-plaintext highlighter-rouge">--prefix</code>
argument when running the configure script.</p>

<p>If the build uses something other than an Autoconf-like configure script,
such as CMake, then you can’t use the “build” command and must perform
the build yourself. For example, I must do this when building LLVM and
Clang.</p>

<p>Before using the “build” command, the package must first be unpacked and
patched if necessary. Then the package manager can take over to run the
build.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ tar xzf name-version.tar.gz
$ cd name-version/
$ patch -p1 &lt; ../0001.patch
$ patch -p1 &lt; ../0002.patch
$ patch -p1 &lt; ../0003.patch
$ cd ..
$ mkdir build
$ cd build/
$ qpkg -B ../name-version/
</code></pre></div></div>

<p>In this example I’m doing an out-of-source build by invoking the
configure script from a different directory. Did you know Autoconf
scripts support this? I didn’t know until recently! Unfortunately some
hand-written Autoconf-like scripts don’t, though this will
be immediately obvious.</p>

<p>Once <code class="language-plaintext highlighter-rouge">qpkg</code> returns, the program will be fully built — or stuck on a
build error if you’re unlucky. If you need to pass custom configure
options, just tack them on the <code class="language-plaintext highlighter-rouge">qpkg</code> command:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qpkg -B ../name-version/ --without-libxml2 --with-ncurses
</code></pre></div></div>

<p>Since the second and third steps (creating the build directory and
moving into it) are so common, there’s an optional switch for it: <code class="language-plaintext highlighter-rouge">-d</code>.
This option’s argument is the build directory. <code class="language-plaintext highlighter-rouge">qpkg</code> creates that
directory and runs the build inside it. In practice I just use “x” for
the build directory since it’s so quick to add “dx” to the command.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ tar xzf name-version.tar.gz
$ qpkg -Bdx ../name-version/
</code></pre></div></div>

<p>With the software compiled, the next step is creating the package.</p>

<h4 id="create--c">Create (<code class="language-plaintext highlighter-rouge">-C</code>)</h4>

<p>The “create” command creates the <code class="language-plaintext highlighter-rouge">DESTDIR</code> (<code class="language-plaintext highlighter-rouge">_destdir</code> in the working
directory) and runs the “install” Makefile target to fill it with files.
Continuing with the example above and its <code class="language-plaintext highlighter-rouge">x/</code> build directory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qpkg -Cdx name
</code></pre></div></div>

<p>Where “name” is the name of the package, without any file name
extension. Like with “build”, extra arguments after the package name are
passed to <code class="language-plaintext highlighter-rouge">make</code> in case there needs to be any additional tweaking.</p>

<p>When the “create” command finishes, there will be a new package named
<code class="language-plaintext highlighter-rouge">name.tgz</code> in the working directory. At this point the source and build
directories are no longer needed, assuming everything went fine.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rm -rf name-version/
$ rm -rf x/
</code></pre></div></div>

<p>This package is ready to install, though you may want to verify it
first.</p>

<h4 id="verify--v">Verify (<code class="language-plaintext highlighter-rouge">-V</code>)</h4>

<p>The “verify” command checks for collisions against installed packages.
It works like uninstallation, but rather than deleting files, it checks
if any of the files already exist. If they do, it means there’s a
conflict with an existing package. These file names are printed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qpkg -V name.tgz
</code></pre></div></div>
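
<p>The check itself can be sketched in a few lines of shell. This is an
illustration of the idea under my own assumptions, not the script’s
real code: <code class="language-plaintext highlighter-rouge">list_conflicts</code> is a hypothetical helper, and the prefix
and file names are invented for the demonstration.</p>

```shell
# Hypothetical helper: print every file in package $1 (an absolute
# path) that already exists under prefix $2.
list_conflicts() (
    cd "$2" || exit 1
    IFS='
'
    for file in $(tar tzf "$1" | grep -v '/$'); do
        if [ -e "$file" ]; then
            printf '%s\n' "$file"
        fi
    done
)

# Demo with made-up names: the prefix already contains info/dir.
work=$(mktemp -d)
prefix="$work/prefix"
stage="$work/stage"
mkdir -p "$stage/bin" "$stage/info" "$prefix/info"
touch "$stage/bin/tool" "$stage/info/dir"
touch "$prefix/info/dir"
tar czf "$work/package.tgz" -C "$stage" .

found=$(list_conflicts "$work/package.tgz" "$prefix")
printf '%s\n' "$found"
```

<p>The function body runs in a subshell, so neither the <code class="language-plaintext highlighter-rouge">cd</code> nor the
<code class="language-plaintext highlighter-rouge">IFS</code> change leaks into the caller.</p>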

<p>The most common conflict I’ve seen is in the info index (<code class="language-plaintext highlighter-rouge">info/dir</code>)
file, which is safe to ignore since I don’t care about it.</p>

<p>If the package has already been installed, there will of course be tons
of conflicts. This is the easiest way to check if a package has been
installed.</p>

<h4 id="install--i">Install (<code class="language-plaintext highlighter-rouge">-I</code>)</h4>

<p>The “install” command is just the dumb <code class="language-plaintext highlighter-rouge">tar xzf</code> explained above. It
will clobber anything in its way without warning, which is why, if that
matters, “verify” should be used first.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qpkg -I name.tgz
</code></pre></div></div>

<p>When <code class="language-plaintext highlighter-rouge">qpkg</code> returns, the package has been installed and is probably
ready to go. A lot of packages complain that you need to run libtool to
finalize an installation, but I’ve never had a problem skipping it. This
dumb unpacking generally works fine.</p>

<h4 id="uninstall--u">Uninstall (<code class="language-plaintext highlighter-rouge">-U</code>)</h4>

<p>Obviously the last command is “uninstall”. As explained above, this
needs the original package that was given to the “install” command.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ qpkg -U name.tgz
</code></pre></div></div>

<p>Just as “install” is dumb, so is “uninstall,” blindly deleting anything
listed in the tarball. One thing I like about dumb tools is that there
are no surprises.</p>

<p>I typically suffix the package name with the version number to help keep
the packages organized. When upgrading to a new version of a piece of
software, I build the new package, which, thanks to the version suffix,
will have a distinct name. Then I uninstall the old package, and,
finally, install the new one in its place. So far I’ve been keeping the
old package around in case I still need it, though I could always
rebuild it in a pinch.</p>

<h3 id="package-by-accumulation">Package by accumulation</h3>

<p>Building a GCC cross-compiler toolchain is a tricky case that doesn’t
fit so well with the build, create, and install process illustrated
above. It would be nice for the cross-compiler to be a single, big
package, but due to the way it’s built, it would need to be five or so
packages, a couple of which will conflict (one being a subset of
another):</p>

<ol>
  <li>binutils</li>
  <li>C headers</li>
  <li>core GCC</li>
  <li>C runtime</li>
  <li>rest of GCC</li>
</ol>

<p>Each step needs to be installed before the next step will work. (I don’t
even want to think about cross-compiling a cross-compiler.)</p>

<p>To deal with this, I added a “keep” (<code class="language-plaintext highlighter-rouge">-k</code>) option that leaves the
<code class="language-plaintext highlighter-rouge">DESTDIR</code> around after creating the package. To keep things tidy, the
intermediate packages exist and are installed, but the final, big
cross-compiler package <em>accumulates</em> into the <code class="language-plaintext highlighter-rouge">DESTDIR</code>. The final
package at the end is actually the whole cross-compiler in one package,
a superset of them all.</p>

<p>Complicated situations like these are where I can really understand the
value of Debian’s <a href="https://wiki.debian.org/FakeRoot">fakeroot</a> tool.</p>

<h3 id="my-use-case-and-an-alternative">My use case, and an alternative</h3>

<p>The role filled by my package manager is actually pretty well suited for
<a href="https://www.pkgsrc.org/">pkgsrc</a>, which is NetBSD’s ports system made available to other
unix-like systems. However, I just need something really lightweight
that gives me absolute control — even more than I get with pkgsrc — in
the dozen or so cases where I <em>really</em> need it.</p>

<p>All I need is a standard C toolchain in a unix-like environment (even a
really old one), the source tarballs for the software I need, my 110
line shell script package manager, and one to two cans of elbow grease.
From there I can bootstrap everything I might need without root access,
even <a href="/blog/2017/04/01/">in a disaster</a>. If the software I need isn’t written in C, it
can ultimately get bootstrapped from some crusty old C compiler, which
might even involve building some newer C compilers in between. After a
certain point it’s C all the way down.</p>

]]>
    </content>
  </entry>
    
  
    
  
    
  
    
  
    
  
    
  <entry>
    <title>Debugging Emacs or: How I Learned to Stop Worrying and Love DTrace</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2018/01/17/"/>
    <id>urn:uuid:a55cabc9-2d87-30a4-9066-9ec5e45b8bce</id>
    <updated>2018-01-17T23:59:49Z</updated>
    <category term="emacs"/><category term="elfeed"/><category term="bsd"/>
    <content type="html">
      <![CDATA[<p><em>Update: This article was featured on <a href="https://www.youtube.com/watch?v=Xi_pX2QIzho">BSD Now 233</a> (starting
at 21:38).</em></p>

<p>For some time <a href="https://github.com/skeeto/elfeed">Elfeed</a> was experiencing a strange, spurious
failure. Every so often users were <a href="https://github.com/skeeto/elfeed/issues/248">seeing an error</a> (spoiler
warning) when updating feeds: “error in process sentinel: Search
failed.” If you use Elfeed, you might have even seen this yourself.
From the surface it appeared that curl, tasked with the
<a href="/blog/2016/06/16/">responsibility for downloading feed data</a>, was producing
incomplete output despite reporting a successful run. Since the run
was successful, Elfeed assumed certain data was in curl’s output
buffer, but, since it wasn’t, it failed hard.</p>

<!--more-->

<p>Unfortunately this issue was not reproducible. Manually running curl
outside of Emacs never revealed any issues. Asking Elfeed to retry
fetching the feeds would work fine. The issue would only randomly rear
its head when Elfeed was fetching many feeds in parallel, under
stress. By the time the error was discovered, the curl process had
exited and vital debugging information was lost. Considering that
this was likely to be a bug in Emacs itself, there really wasn’t a
reliable way to capture the necessary debugging information from
within Emacs Lisp. And, indeed, this later proved to be the case.</p>

<p>A quick-and-dirty workaround is to use <code class="language-plaintext highlighter-rouge">condition-case</code> to catch and
swallow the error. When the bizarre issue shows up, rather than fail
badly in front of the user, Elfeed could attempt to swallow the error
— assuming it can be reliably detected — and treat the fetch as simply
a failure. That didn’t sit comfortably with me. Elfeed had done its
due diligence checking for errors already. <em>Someone</em> was lying to
Elfeed, and I intended to catch them with their pants on fire.
Someday.</p>

<p>I’d just need to witness the bug on one of my own machines. Elfeed is
part of my daily routine, so surely I’d have to experience this issue
myself someday. My plan was, should that day come, to run a modified
Elfeed, instrumented to capture extra data. I would have also routinely
run Emacs under GDB so that I could inspect the failure more deeply.</p>

<p>For now I just had to wait to <a href="https://www.youtube.com/watch?v=fE2KDzZaxvE">hunt that zebra</a>.</p>

<h3 id="bryan-cantrill-dtrace-and-freebsd">Bryan Cantrill, DTrace, and FreeBSD</h3>

<p>Over the holidays I re-discovered <a href="https://en.wikipedia.org/wiki/Bryan_Cantrill">Bryan Cantrill</a>, a systems
software engineer who worked for Sun between 1996 and 2010, and is best
known for <a href="http://dtrace.org/blogs/about/">DTrace</a>. My first exposure to him was in a <a href="https://www.youtube.com/watch?v=l6XQUciI-Sc">BSD
Now interview</a> in 2015. I had re-watched that interview and decided
there was a lot more I had to learn from him. He’s become a personal
hero to me. So I scoured the internet for <a href="http://dtrace.org/blogs/bmc/2018/02/03/talks/">more of his writing and
talks</a>. Besides what I’ve already linked in this article, here
are a couple more great presentations:</p>

<ul>
  <li><a href="https://www.youtube.com/watch?v=4PaWFYm0kEw">Oral Tradition in Software Engineering</a></li>
  <li><a href="https://www.youtube.com/watch?v=-zRN7XLCRhc">Fork Yeah! The Rise and Development of illumos</a></li>
</ul>

<p>You can also find some of his writing <a href="http://dtrace.org/blogs/bmc/">scattered around the DTrace
blog</a>.</p>

<p>Some interesting operating system technology came out of Sun during
its final 15 or so years — most notably DTrace and ZFS — and Bryan
speaks about it passionately. Almost as a matter of luck, most of it
survived the Oracle acquisition thanks to Sun releasing it as open
source in just the nick of time. Otherwise it would have been lost
forever. The scattered ex-Sun employees, still passionate about their
prior work at Sun, along with some of their old customers have since
picked up the pieces and kept going as a community under the name
<a href="https://illumos.org/">illumos</a>. It’s like an open source flotilla.</p>

<p>Naturally I wanted to get my hands on this stuff to try it out for
myself. Is it really as good as they say? Normally I stick to Linux,
but it (generally) doesn’t have these Sun technologies. The main
reason is license incompatibility. Sun released its code under the
<a href="https://opensource.org/licenses/CDDL-1.0">CDDL</a>, which is incompatible with the GPL. Ubuntu <em>does</em>
<a href="https://insights.ubuntu.com/2016/02/18/zfs-licensing-and-linux/">infamously include ZFS</a>, but other distributions are
unwilling to take that risk. Porting DTrace is a serious undertaking
since it’s got its fingers throughout the kernel, which also makes the
licensing issues even more complicated.</p>

<p>(<em>Update February 2018</em>: <a href="https://gnu.wildebeest.org/blog/mjw/2018/02/14/dtrace-for-linux-oracle-does-the-right-thing/">DTrace has been released under the
GPLv2</a>, allowing it to be legally integrated with Linux.)</p>

<p>Linux has a reputation for Not Invented Here (NIH) syndrome, and these
licensing issues certainly contribute to that. Rather than adopt ZFS
and DTrace, they’ve been reinvented from scratch: btrfs instead of
ZFS, and <a href="http://www.brendangregg.com/blog/2015-07-08/choosing-a-linux-tracer.html">a slew of partial options</a> instead of DTrace.
Normally I’m most interested in system call tracing, and my go-to is
<a href="https://en.wikipedia.org/wiki/Strace">strace</a>, though it certainly has its limitations — including
this situation of debugging curl under Emacs. Another famous example
of NIH is Linux’s <a href="http://man7.org/linux/man-pages/man7/epoll.7.html"><code class="language-plaintext highlighter-rouge">epoll(2)</code></a>, which is a <a href="https://idea.popcount.org/2017-02-20-epoll-is-fundamentally-broken-12/">broken</a>
<a href="https://idea.popcount.org/2017-03-20-epoll-is-fundamentally-broken-22/">version</a> of BSD <a href="https://www.freebsd.org/cgi/man.cgi?query=kqueue&amp;sektion=2"><code class="language-plaintext highlighter-rouge">kqueue(2)</code></a>.</p>

<p>So, if I want to try these for myself, I’ll need to install a
different operating system. I’ve dabbled with <a href="https://omnios.omniti.com/">OmniOS</a>, an OS
built on illumos, in virtual machines, using it as an alien
environment to test some of my software (e.g. <a href="/blog/2017/03/12/">enchive</a>).
OmniOS has a philosophy called <a href="https://omnios.omniti.com/wiki.php/KYSTY">Keep Your Software To Yourself</a>
(KYSTY), which is really just code for “we don’t do packaging.”
Honestly, you can’t blame them since <a href="https://utcc.utoronto.ca/~cks/space/blog/solaris/IllumosSupportLimits">they’re a tiny community</a>.
The best solution to this is probably <a href="https://www.pkgsrc.org/">pkgsrc</a>, which is
essentially a universal packaging system. Otherwise <a href="/blog/2017/06/19/">you’re on your
own</a>.</p>

<p>There’s also <a href="https://www.openindiana.org/">openindiana</a>, which is a more friendly
desktop-oriented illumos distribution. Still, the short of it is that
you’re very much on your own when things don’t work. The situation is
like running Linux a couple decades ago, when it was still difficult
to do.</p>

<p>If you’re interested in trying DTrace, the easiest option these days is
probably <a href="https://www.freebsd.org/">FreeBSD</a>. It’s got a big, active community, thorough
documentation, and a huge selection of packages. Its license (the <em>BSD
license</em>, duh) is compatible with the CDDL, so both ZFS and DTrace have
been ported to FreeBSD.</p>

<h3 id="what-is-dtrace">What is DTrace?</h3>

<p>I’ve done all this talking but haven’t yet described what <a href="https://wiki.freebsd.org/DTrace/Tutorial">DTrace
really is</a>. I won’t pretend to write my own tutorial, but I’ll
provide enough information to follow along. DTrace is a tracing
framework for debugging production systems <em>in real time</em>, both for
the kernel and for applications. The “production systems” part means
it’s stable and safe — using DTrace won’t put your system at risk of
crashing or damaging data. The “real time” part means it has little
impact on performance. You can use DTrace on live, active systems with
little impact. Both of these core design principles are vital for
troubleshooting those really tricky bugs that only show up in
production.</p>

<p>There are DTrace <em>probes</em> scattered all throughout the system: on
system calls, scheduler events, networking events, process events,
signals, virtual memory events, etc. Using a specialized language
called D (unrelated to the general purpose programming language D),
you can dynamically add behavior at these instrumentation points.
Generally the behavior is to capture information, but it can also
manipulate the event being traced.</p>

<p>Each probe is fully identified by a 4-tuple delimited by colons:
provider, module, function, and probe name. An empty element denotes a
sort of wildcard. For example, <code class="language-plaintext highlighter-rouge">syscall::open:entry</code> is a probe at the
beginning (i.e. “entry”) of <code class="language-plaintext highlighter-rouge">open(2)</code>. <code class="language-plaintext highlighter-rouge">syscall:::entry</code> matches all
system call entry probes.</p>
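
<p>To make the 4-tuple concrete, here’s a tiny sketch that assembles probe
descriptions from their four fields. The <code class="language-plaintext highlighter-rouge">probe_spec</code> function is a
hypothetical helper of my own, not part of DTrace. Leaving a field empty
produces the wildcard form.</p>

```shell
# Hypothetical helper: join provider, module, function, and probe
# name with colons to form a probe description.
probe_spec() {
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

probe_spec syscall '' open entry    # one specific probe
probe_spec syscall '' '' entry      # every system call entry probe

# On a system with DTrace, and typically as root, such a description
# could be handed to dtrace(1) to list the matching probes:
#   dtrace -l -n "$(probe_spec syscall '' '' entry)"
```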

<p>Unlike strace on Linux which monitors a specific process, DTrace
applies to the entire system when active. To run curl under strace
from Emacs, I’d have to modify Emacs’ behavior to do so. With DTrace I
can instrument every curl process without making a single change to
Emacs, and with negligible impact to Emacs. That’s a big deal.</p>

<p>So, when it comes to this Elfeed issue, FreeBSD is much better poised
for debugging the problem. All I have to do is catch it in the act.
However, it’s been months since that bug report and I’m not really
making this connection yet. I’m just hoping I eventually find an
interesting problem where I can apply DTrace.</p>

<h3 id="freebsd-on-a-raspberry-pi-2">FreeBSD on a Raspberry Pi 2</h3>

<p>So I’ve settled on FreeBSD as the playground for these technologies. I
just have to decide where. I could always run it in a virtual machine,
but it’s always more interesting to try things out on real hardware.
<a href="https://wiki.freebsd.org/FreeBSD/arm/Raspberry%20Pi">FreeBSD supports the Raspberry Pi 2</a> as a Tier 2 system, and
I had a Raspberry Pi 2 sitting around collecting dust, so I put it to
use.</p>

<p>I wrote the image to an SD card, and for a few days I stretched my
legs on this new system. I cloned a couple dozen of my own git
repositories, ran the builds and the tests, and just got a feel for
things. I tried out the ports system for the first time, mainly to
discover that the low-powered Raspberry Pi 2 takes days to build some
of the packages I want to try.</p>

<p>I <a href="/blog/2017/04/01/">mostly program in Vim these days</a>, so it’s some days before I
even set up Emacs. Eventually I do build Emacs, clone my
configuration, fire it up, and give Elfeed a spin.</p>

<p>And that’s when the “search failed” bug strikes! Not just once, but
dozens of times. Perfect! This low-powered platform is the jackpot for
this particular bug, triggering it left and right. Given that I’ve got
DTrace at my disposal, it’s <em>the</em> perfect place to debug this.
Something is lying to Elfeed and DTrace will play the judge.</p>

<p>Before I dive in I see three possibilities:</p>

<ol>
  <li>curl is reporting success but truncating its output.</li>
  <li>Emacs is quietly truncating curl’s output.</li>
  <li>Emacs is misinterpreting curl’s exit status.</li>
</ol>

<p>With DTrace I can observe what every curl process writes to Emacs, and
I can also double check curl’s exit status. I come up with the
following (newbie) DTrace script:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>syscall::write:entry
/execname == "curl"/
{
    printf("%d WRITE %d \"%s\"\n",
           pid, arg2, stringof(copyin(arg1, arg2)));
}

syscall::exit:entry
/execname == "curl"/
{
    printf("%d EXIT  %d\n", pid, arg0);
}
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">/execname == "curl"/</code> is a predicate that restricts each
probe to curl processes. The first probe has DTrace
print a line for every <code class="language-plaintext highlighter-rouge">write(2)</code> from curl. <code class="language-plaintext highlighter-rouge">arg0</code>, <code class="language-plaintext highlighter-rouge">arg1</code>, and
<code class="language-plaintext highlighter-rouge">arg2</code> correspond to the arguments of <code class="language-plaintext highlighter-rouge">write(2)</code>: fd, buf, count. The probe
logs the process ID (pid) of the writer, the length of the write, and
the actual contents written. Remember that these curl processes are
run in parallel by Emacs, so the pid allows me to associate the
separate writes and the exit status.</p>

<p>The second probe prints the pid and the exit status (the first argument
to <code class="language-plaintext highlighter-rouge">exit(2)</code>).</p>

<p>I also want to compare this to exactly what is delivered to Elfeed when
curl exits, so I modify the <a href="http://www.gnu.org/software/emacs/manual/html_node/elisp/Sentinels.html">process sentinel</a> — the callback
that handles a subprocess exiting — to call <code class="language-plaintext highlighter-rouge">write-file</code> before any
action is taken. I can compare these buffer dumps to the logs produced
by DTrace.</p>
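
<p>A comparison like this can be scripted. What follows is a rough Python
sketch, not the tooling used for this investigation: <code class="language-plaintext highlighter-rouge">summarize</code>,
<code class="language-plaintext highlighter-rouge">missing_bytes</code>, and <code class="language-plaintext highlighter-rouge">RECORD</code> are made-up names, and it assumes the
log has exactly the format printed by the script above. It totals the
requested write lengths per pid (on every file descriptor, since the script
doesn’t filter on <code class="language-plaintext highlighter-rouge">arg0</code>) and records each exit status, so treat the
byte counts as an approximate check rather than an exact one.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re
from collections import defaultdict

# Hypothetical helper names; the log format is assumed to match the
# DTrace script above. Written data can span several log lines, and
# those continuation lines are simply skipped. Payload that happens to
# look like a record header would confuse this.
RECORD = re.compile(r'^(\d+) (WRITE|EXIT) +(\d+)')


def summarize(log_text):
    """Map each pid to its requested bytes, write count, and exit status."""
    procs = defaultdict(lambda: {"bytes": 0, "writes": 0, "exit": None})
    for line in log_text.splitlines():
        m = RECORD.match(line)
        if m is None:
            continue
        pid, kind, value = int(m.group(1)), m.group(2), int(m.group(3))
        if kind == "WRITE":
            procs[pid]["bytes"] += value
            procs[pid]["writes"] += 1
        else:
            procs[pid]["exit"] = value
    return dict(procs)


def missing_bytes(summary, pid, dump_size):
    """Bytes curl asked to write that are absent from the buffer dump."""
    return max(0, summary[pid]["bytes"] - dump_size)
</code></pre></div></div>

<p>An empty dump for a pid where <code class="language-plaintext highlighter-rouge">summarize</code> reports thousands of bytes
and an exit status of 0 would point at the receiving side rather than at
curl.</p>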

<p>There are two important findings.</p>

<p>First, when the “search failed” bug occurred, the buffer was completely
empty (95% of the time) or truncated at the end of the HTTP headers
(5% of the time), right at the blank line. DTrace indicates that curl
did its job to the full, so it’s Emacs who’s the liar. It’s not
delivering all of curl’s data to Elfeed. That’s pretty annoying.</p>

<p>Second, <strong>curl was line-buffered</strong>. Each line was a separate,
independent <code class="language-plaintext highlighter-rouge">write(2)</code>. I was certainly <em>not</em> expecting this. Normally
the C library only does line buffering when the output is a terminal.
That’s because it’s guessing a user may be watching, expecting the
output to arrive a line at a time.</p>

<p>Here’s a sample of what it looked like in the log:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>88188 WRITE 32 "Server: Apache/2.4.18 (Ubuntu)
"
88188 WRITE 46 "Location: https://blog.plover.com/index.atom
"
88188 WRITE 21 "Content-Length: 299
"
88188 WRITE 45 "Content-Type: text/html; charset=iso-8859-1
"
88188 WRITE 2 "
"
</code></pre></div></div>
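
<p>To make the cost of line buffering concrete, here is a small Python sketch
that counts how many low-level writes each buffering mode produces for the
same output. It only imitates what a C library does, using Python’s own
<code class="language-plaintext highlighter-rouge">io</code> layers; <code class="language-plaintext highlighter-rouge">CountingSink</code> and <code class="language-plaintext highlighter-rouge">count_writes</code> are hypothetical
names, and the sink stands in for a real file descriptor.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import io


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a file descriptor; counts write() calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)  # pretend everything was accepted


def count_writes(lines, line_buffering):
    """Push lines through a buffered text stream; return raw write count."""
    sink = CountingSink()
    out = io.TextIOWrapper(io.BufferedWriter(sink), encoding="ascii",
                           newline="", line_buffering=line_buffering)
    for line in lines:
        out.write(line)
    out.flush()
    return sink.calls


headers = [
    "Server: Apache/2.4.18 (Ubuntu)\r\n",
    "Location: https://blog.plover.com/index.atom\r\n",
    "Content-Length: 299\r\n",
    "Content-Type: text/html; charset=iso-8859-1\r\n",
    "\r\n",
]
print(count_writes(headers, line_buffering=True))   # 5: one write per line
print(count_writes(headers, line_buffering=False))  # 1: a single write
</code></pre></div></div>

<p>Five system calls versus one hardly matters for a handful of headers, but
the same ratio applies to every line of a large feed.</p>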

<p>Why would curl think Emacs is a terminal?</p>

<p><em>Oh.</em> That’s right. <em>This is the <a href="/blog/2014/02/06/">same problem I ran into four years
ago when writing EmacSQL</a>.</em> By default Emacs connects to
subprocesses through a pseudo-terminal (pty). I called this a mistake
in Emacs back then, and I still stand by that claim. The pty causes
weird, annoying problems for little benefit:</p>

<ul>
  <li>Interpreting control characters. Hope you weren’t transferring binary
data!</li>
  <li>Subprocesses will generally get line buffered. This makes them
slower, though in some situations it might be desirable.</li>
  <li>Stdout and stderr get mixed together. (Optional since Emacs 25.)</li>
  <li><em>New!</em> There’s a bug somewhere in Emacs that causes truncation when
ptys are used heavily in parallel.</li>
</ul>
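
<p>The first two items are easy to observe outside of Emacs. The following
is a rough Python sketch, assuming a Unix-like system with pty support;
<code class="language-plaintext highlighter-rouge">probe</code>, <code class="language-plaintext highlighter-rouge">through_pty</code>, and <code class="language-plaintext highlighter-rouge">through_pipe</code> are made-up names. The
<code class="language-plaintext highlighter-rouge">isatty</code> check stands in for the decision a C library makes internally
when it sets up stdout.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os
import pty


def probe(fd):
    """Buffering a stdio-style library would likely pick for fd."""
    return "line buffered" if os.isatty(fd) else "fully buffered"


def through_pty(data):
    """Write data to the slave side of a new pty, read it from the master."""
    master, slave = pty.openpty()
    try:
        os.write(slave, data)
        return os.read(master, 1024)
    finally:
        os.close(slave)
        os.close(master)


def through_pipe(data):
    """Write data to a new pipe and read it back unchanged."""
    reader, writer = os.pipe()
    try:
        os.write(writer, data)
        return os.read(reader, 1024)
    finally:
        os.close(writer)
        os.close(reader)
</code></pre></div></div>

<p>With default terminal settings, <code class="language-plaintext highlighter-rouge">through_pty(b"a\n")</code> should come back
with a carriage return inserted before the newline, while <code class="language-plaintext highlighter-rouge">through_pipe</code>
returns its input untouched. Likewise, <code class="language-plaintext highlighter-rouge">probe</code> reports line buffering for
the slave side of a pty and full buffering for the write end of a pipe.</p>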

<p>Just from eyeballing the DTrace log I knew what to do: dump the pty
and switch to a pipe. This is controlled with the
<code class="language-plaintext highlighter-rouge">process-connection-type</code> variable, and fixing it <a href="https://github.com/skeeto/elfeed/commit/945765a57d2f27996b6a43bc62e803dc167d1547">is a
one-liner</a>.</p>

<p>Not only did this completely resolve the truncation issue, but Elfeed is
also noticeably faster at fetching feeds on all machines. It’s no longer
receiving mountains of XML one line at a time, like sucking pudding
through a straw. It’s now quite zippy even on my Raspberry Pi 2, which
had <em>never</em> been the case before, even when the “search failed” bug didn’t strike.
Even if you were never affected by this bug, you will benefit from the
fix.</p>

<p>I haven’t officially reported this as an Emacs bug yet because
reproducibility is still an issue. It needs something better than
“fire off a bunch of HTTP requests across the internet in parallel
from a Raspberry Pi.”</p>

<p>The fix reminds me of that <a href="https://www.buzzmaven.com/old-engineer-hammer-2/">old boilermaker story</a> about
charging a lot of money just to swing a hammer. Once the problem
arose, <strong>DTrace quickly helped to identify the place to hit Emacs with
the hammer</strong>.</p>

<p><em>Finally, a big thanks to alphapapa for originally taking the time to
report this bug months ago.</em></p>

]]>
    </content>
  </entry>

</feed>
