<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Articles tagged asyncio at null program</title>
  <link rel="alternate" type="text/html"
        href="https://nullprogram.com/tags/asyncio/"/>
  <link rel="self" type="application/atom+xml"
        href="https://nullprogram.com/tags/asyncio/feed/"/>
  <updated>2026-04-09T13:25:45Z</updated>
  <id>urn:uuid:702ad32f-4903-4692-9b8c-b8182b548416</id>

  <author>
    <name>Christopher Wellons</name>
    <uri>https://nullprogram.com</uri>
    <email>wellons@nullprogram.com</email>
  </author>

  <entry>
    <title>Asynchronously Opening and Closing Files in Asyncio</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2020/09/04/"/>
    <id>urn:uuid:ae94da45-f65d-4c72-a10e-9e421ea843ec</id>
    <updated>2020-09-04T01:36:20Z</updated>
    <category term="c"/><category term="linux"/><category term="python"/><category term="asyncio"/>
    <content type="html">
      <![CDATA[<p>Python <a href="https://docs.python.org/3/library/asyncio.html">asyncio</a> has support for asynchronous networking,
subprocesses, and interprocess communication. However, it has nothing
for asynchronous file operations — opening, reading, writing, or
closing. This is likely in part because operating systems themselves
also lack these facilities. If a file operation takes a long time,
perhaps because the file is on a network mount, then the entire Python
process will hang. It’s possible to work around this, so let’s build a
utility that can asynchronously open and close files.</p>

<p>The usual way to work around the lack of operating system support for a
particular asynchronous operation is to <a href="http://docs.libuv.org/en/v1.x/design.html#file-i-o">dedicate threads to waiting on
those operations</a>. By using a thread pool, we can even avoid the
overhead of spawning threads when we need them. Plus asyncio is designed
to play nicely with thread pools anyway.</p>
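<p>As a minimal standalone sketch of that technique (the names here are mine and purely illustrative, not from the code built later in this article), a blocking call handed to the default pool leaves the event loop free:</p>

```python
import asyncio
import time

def blocking_io():
    # Stand-in for a slow, blocking operation (hypothetical example).
    time.sleep(0.1)
    return "done"

async def main():
    loop = asyncio.get_running_loop()
    # Hand the blocking call to the default thread pool; the event
    # loop stays responsive while a pool thread waits on it.
    return await loop.run_in_executor(None, blocking_io)

print(asyncio.run(main()))  # done
```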

<h3 id="test-setup">Test setup</h3>

<p>Before we get started, we’ll need some way to test that it’s working. We
need a slow file system. One thought is to <a href="/blog/2018/06/23/">use ptrace to intercept the
relevant system calls</a>, though this isn’t quite so simple. The
other threads need to continue running while the thread waiting on
<code class="language-plaintext highlighter-rouge">open(2)</code> is paused, but ptrace pauses the whole process. Fortunately
there’s a simpler solution anyway: <code class="language-plaintext highlighter-rouge">LD_PRELOAD</code>.</p>

<p>Setting the <code class="language-plaintext highlighter-rouge">LD_PRELOAD</code> environment variable to the name of a shared
object will cause the loader to load this shared object ahead of
everything else, allowing that shared object to override other
libraries. I’m on x86-64 Linux (Debian), and so I’m looking to override
<code class="language-plaintext highlighter-rouge">open64(2)</code> in glibc. Here’s my <code class="language-plaintext highlighter-rouge">open64.c</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define _GNU_SOURCE
#include</span> <span class="cpf">&lt;dlfcn.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;string.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;unistd.h&gt;</span><span class="cp">
</span>
<span class="kt">int</span>
<span class="nf">open64</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">,</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span> <span class="kt">int</span> <span class="n">mode</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">strncmp</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="s">"/tmp/"</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span> <span class="p">{</span>
        <span class="n">sleep</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="kt">int</span> <span class="p">(</span><span class="o">*</span><span class="n">f</span><span class="p">)(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="o">=</span> <span class="n">dlsym</span><span class="p">(</span><span class="n">RTLD_NEXT</span><span class="p">,</span> <span class="s">"open64"</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">f</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now Python must go through my C function when it opens files. If the
file resides under <code class="language-plaintext highlighter-rouge">/tmp/</code>, opening the file will be delayed by 3
seconds. Since I still want to actually open a file, I use <code class="language-plaintext highlighter-rouge">dlsym()</code> to
access the <em>real</em> <code class="language-plaintext highlighter-rouge">open64()</code> in glibc. I build it like so:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cc -shared -fPIC -o open64.so open64.c -ldl
</code></pre></div></div>

<p>And to test that it works with Python, let’s time how long it takes to
open <code class="language-plaintext highlighter-rouge">/tmp/x</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ touch /tmp/x
$ time LD_PRELOAD=./open64.so python3 -c 'open("/tmp/x")'

real    0m3.021s
user    0m0.014s
sys     0m0.005s
</code></pre></div></div>

<p>Perfect! (Note: It’s a little strange putting <code class="language-plaintext highlighter-rouge">time</code> <em>before</em> setting the
environment variable, but that’s because I’m using Bash, where <code class="language-plaintext highlighter-rouge">time</code> is a
shell keyword rather than an ordinary command, so it must come first.)</p>

<h3 id="thread-pools">Thread pools</h3>

<p>Python’s standard <code class="language-plaintext highlighter-rouge">open()</code> is most commonly used as a <em>context manager</em>
so that the file is automatically closed no matter what happens.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'output.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">out</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'hello world'</span><span class="p">,</span> <span class="nb">file</span><span class="o">=</span><span class="n">out</span><span class="p">)</span>
</code></pre></div></div>

<p>I’d like my asynchronous open to follow this pattern using <a href="https://www.python.org/dev/peps/pep-0492/"><code class="language-plaintext highlighter-rouge">async
with</code></a>. It’s like <code class="language-plaintext highlighter-rouge">with</code>, but the context manager is acquired and
released asynchronously. I’ll call my version <code class="language-plaintext highlighter-rouge">aopen()</code>:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">with</span> <span class="n">aopen</span><span class="p">(</span><span class="s">'output.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">out</span><span class="p">:</span>
    <span class="p">...</span>
</code></pre></div></div>

<p>So <code class="language-plaintext highlighter-rouge">aopen()</code> will need to return an <em>asynchronous context manager</em>, an
object with methods <code class="language-plaintext highlighter-rouge">__aenter__</code> and <code class="language-plaintext highlighter-rouge">__aexit__</code> that both return
<a href="https://docs.python.org/3/glossary.html#term-awaitable"><em>awaitables</em></a>. Usually this is by virtue of these methods being
<a href="https://docs.python.org/3/glossary.html#term-coroutine-function"><em>coroutine functions</em></a>, but a normal function that directly returns
an awaitable also works, which is what I’ll be doing for <code class="language-plaintext highlighter-rouge">__aenter__</code>.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">_AsyncOpen</span><span class="p">():</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">args</span><span class="p">,</span> <span class="n">kwargs</span><span class="p">):</span>
        <span class="p">...</span>

    <span class="k">def</span> <span class="nf">__aenter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="p">...</span>

    <span class="k">async</span> <span class="k">def</span> <span class="nf">__aexit__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">exc_type</span><span class="p">,</span> <span class="n">exc</span><span class="p">,</span> <span class="n">tb</span><span class="p">):</span>
        <span class="p">...</span>
</code></pre></div></div>

<p>Ultimately we have to call <code class="language-plaintext highlighter-rouge">open()</code>. The arguments for <code class="language-plaintext highlighter-rouge">open()</code> will be
given to the constructor to be used later. This will make more sense
when you see the definition for <code class="language-plaintext highlighter-rouge">aopen()</code>.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">args</span><span class="p">,</span> <span class="n">kwargs</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">_args</span> <span class="o">=</span> <span class="n">args</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">_kwargs</span> <span class="o">=</span> <span class="n">kwargs</span>
</code></pre></div></div>

<p>When it’s time to actually open the file, Python will call <code class="language-plaintext highlighter-rouge">__aenter__</code>.
We can’t call <code class="language-plaintext highlighter-rouge">open()</code> directly since that will block, so we’ll use a
thread pool to wait on it. Rather than create a thread pool, we’ll use
the one that comes with the current event loop. The <code class="language-plaintext highlighter-rouge">run_in_executor()</code>
method runs a function in a thread pool — where <code class="language-plaintext highlighter-rouge">None</code> means use the
default pool — returning an asyncio future representing the future
result, in this case the opened file object.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">__aenter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">def</span> <span class="nf">thread_open</span><span class="p">():</span>
            <span class="k">return</span> <span class="nb">open</span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">_args</span><span class="p">,</span> <span class="o">**</span><span class="bp">self</span><span class="p">.</span><span class="n">_kwargs</span><span class="p">)</span>
        <span class="n">loop</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">get_event_loop</span><span class="p">()</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">_future</span> <span class="o">=</span> <span class="n">loop</span><span class="p">.</span><span class="n">run_in_executor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">thread_open</span><span class="p">)</span>
        <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">_future</span>
</code></pre></div></div>

<p>Since this <code class="language-plaintext highlighter-rouge">__aenter__</code> is not a coroutine function, it returns the
future directly as its awaitable result. The caller will await it.</p>

<p>The default thread pool is capped at <code class="language-plaintext highlighter-rouge">min(32, os.cpu_count() + 4)</code> threads
as of Python 3.8, a sensible default for CPU-bound operations but not
for I/O-bound operations like these, where threads spend most of their
time waiting. In a real program we may want to use a larger thread
pool.</p>
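<p>As a hedged sketch of swapping in a larger pool (the size of 32 is an arbitrary example), a <code class="language-plaintext highlighter-rouge">ThreadPoolExecutor</code> can be passed per call or installed as the loop’s default:</p>

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# A bigger pool for I/O-bound work; 32 is an arbitrary example size.
pool = ThreadPoolExecutor(max_workers=32)

async def main():
    loop = asyncio.get_running_loop()
    # Pass the pool explicitly per call...
    r1 = await loop.run_in_executor(pool, lambda: 1 + 1)
    # ...or install it as the loop's default so that None selects it.
    loop.set_default_executor(pool)
    r2 = await loop.run_in_executor(None, lambda: 2 + 2)
    return r1, r2

result = asyncio.run(main())
print(result)  # (2, 4)
```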

<p>Closing a file may block, so we’ll do that in a thread pool as well.
First pull the file object <a href="/blog/2020/07/30/">from the future</a>, then close it in the
thread pool, waiting until the file has actually closed:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">async</span> <span class="k">def</span> <span class="nf">__aexit__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">exc_type</span><span class="p">,</span> <span class="n">exc</span><span class="p">,</span> <span class="n">tb</span><span class="p">):</span>
        <span class="nb">file</span> <span class="o">=</span> <span class="k">await</span> <span class="bp">self</span><span class="p">.</span><span class="n">_future</span>
        <span class="k">def</span> <span class="nf">thread_close</span><span class="p">():</span>
            <span class="nb">file</span><span class="p">.</span><span class="n">close</span><span class="p">()</span>
        <span class="n">loop</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">get_event_loop</span><span class="p">()</span>
        <span class="k">await</span> <span class="n">loop</span><span class="p">.</span><span class="n">run_in_executor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">thread_close</span><span class="p">)</span>
</code></pre></div></div>

<p>The open and close are paired in this context manager, but it may be
concurrent with an arbitrary number of other <code class="language-plaintext highlighter-rouge">_AsyncOpen</code> context
managers. There will be some upper limit to the number of open files, so
<strong>we need to be careful not to use too many of these things
concurrently</strong>, something <a href="/blog/2020/05/24/">which easily happens when using unbounded
queues</a>. Lacking back pressure, all it takes is for tasks to be
opening files slightly faster than they close them.</p>
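<p>One way to add that back pressure, sketched here with an arbitrary cap of 4 and a stand-in for the real open/close, is an <code class="language-plaintext highlighter-rouge">asyncio.Semaphore</code> around each open/close pair:</p>

```python
import asyncio

MAX_OPEN = 4  # arbitrary cap on concurrently "open" files

async def main():
    # Create the semaphore inside the running loop to avoid binding
    # it to the wrong event loop on older Python versions.
    sem = asyncio.Semaphore(MAX_OPEN)
    state = {"open": 0, "peak": 0}

    async def fake_open_close(i):
        # Stand-in for "async with aopen(...)": the semaphore makes
        # task number MAX_OPEN+1 wait until an earlier task finishes.
        async with sem:
            state["open"] += 1
            state["peak"] = max(state["peak"], state["open"])
            await asyncio.sleep(0.01)  # pretend to do file work
            state["open"] -= 1

    await asyncio.gather(*(fake_open_close(i) for i in range(16)))
    return state["peak"]

peak = asyncio.run(main())
print(peak)  # 4
```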

<p>With all the hard work done, the definition for <code class="language-plaintext highlighter-rouge">aopen()</code> is trivial:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">aopen</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">_AsyncOpen</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="n">kwargs</span><span class="p">)</span>
</code></pre></div></div>

<p>That’s it! Let’s try it out with the <code class="language-plaintext highlighter-rouge">LD_PRELOAD</code> test.</p>

<h3 id="a-test-drive">A test drive</h3>

<p>First define a “heartbeat” task that will tell us the asyncio loop is
still chugging away while we wait on opening the file.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">heartbeat</span><span class="p">():</span>
    <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span>
        <span class="k">print</span><span class="p">(</span><span class="s">'HEARTBEAT'</span><span class="p">)</span>
</code></pre></div></div>

<p>Here’s a test function for <code class="language-plaintext highlighter-rouge">aopen()</code> that asynchronously opens a file
under <code class="language-plaintext highlighter-rouge">/tmp/</code> named by an integer, (synchronously) writes that integer
to the file, then asynchronously closes it.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">write</span><span class="p">(</span><span class="n">i</span><span class="p">):</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">aopen</span><span class="p">(</span><span class="sa">f</span><span class="s">'/tmp/</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">out</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="nb">file</span><span class="o">=</span><span class="n">out</span><span class="p">)</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">main()</code> function creates the heartbeat task and opens 4 files
concurrently through the intercepted file opening routine:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="n">beat</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">heartbeat</span><span class="p">())</span>
    <span class="n">tasks</span> <span class="o">=</span> <span class="p">[</span><span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">write</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">)]</span>
    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">gather</span><span class="p">(</span><span class="o">*</span><span class="n">tasks</span><span class="p">)</span>
    <span class="n">beat</span><span class="p">.</span><span class="n">cancel</span><span class="p">()</span>

<span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">main</span><span class="p">())</span>
</code></pre></div></div>

<p>The result:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ LD_PRELOAD=./open64.so python3 aopen.py
HEARTBEAT
HEARTBEAT
HEARTBEAT
HEARTBEAT
HEARTBEAT
HEARTBEAT
$ cat /tmp/{0,1,2,3}
0
1
2
3
</code></pre></div></div>

<p>As expected, there are 6 heartbeats corresponding to the 3 seconds all 4
tasks spent concurrently waiting on the intercepted <code class="language-plaintext highlighter-rouge">open()</code>. Here’s the full
source if you want to try it out for yourself:</p>

<p><a href="https://gist.github.com/skeeto/89af673a0a0d24de32ad19ee505c8dbd">https://gist.github.com/skeeto/89af673a0a0d24de32ad19ee505c8dbd</a></p>

<h3 id="caveat-no-asynchronous-reads-and-writes">Caveat: no asynchronous reads and writes</h3>

<p><em>Only</em> opening and closing the file is asynchronous. Reads and writes are
unchanged, still fully synchronous and blocking, so this is only a half
solution. A full solution is not nearly as simple because asyncio is
built on async/await. Asynchronous reads and writes would require all new APIs
<a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">with different coloring</a>. You’d need an <code class="language-plaintext highlighter-rouge">aprint()</code> to complement
<code class="language-plaintext highlighter-rouge">print()</code>, and so on, each returning an <code class="language-plaintext highlighter-rouge">awaitable</code> to be awaited.</p>
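<p>A hypothetical <code class="language-plaintext highlighter-rouge">awrite()</code>, whose name and shape are my invention rather than any real API, could be built the same way as <code class="language-plaintext highlighter-rouge">aopen()</code>, by pushing the blocking call onto the thread pool:</p>

```python
import asyncio
import io

async def awrite(file, data):
    # Hypothetical helper: run the blocking write in the default
    # thread pool, just as aopen() does for open().
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, file.write, data)

async def main():
    buf = io.StringIO()  # any file-like object works
    await awrite(buf, "hello\n")
    return buf.getvalue()

print(asyncio.run(main()), end="")  # hello
```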

<p>This is one of the unfortunate downsides of async/await. I strongly
prefer conventional, preemptive concurrency, <em>but</em> we don’t always have
that luxury.</p>

]]>
    </content>
  </entry>
  <entry>
    <title>Exactly-Once Initialization in Asynchronous Python</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2020/07/30/"/>
    <id>urn:uuid:c6796958-9178-47be-8411-8f48c2c85d83</id>
    <updated>2020-07-30T23:39:12Z</updated>
    <category term="python"/><category term="asyncio"/>
    <content type="html">
      <![CDATA[<p><em>This article was discussed <a href="https://news.ycombinator.com/item?id=24007354">on Hacker News</a>.</em></p>

<p>A common situation in <a href="https://docs.python.org/3/library/asyncio.html">asyncio</a> Python programs is asynchronous
initialization. Some resource must be initialized exactly once before it
can be used, but the initialization itself is asynchronous — such as an
<a href="https://github.com/MagicStack/asyncpg">asyncpg</a> database. Let’s talk about a couple of solutions.</p>

<!--more-->

<p>The naive “solution” would be to track the initialization state in a
variable:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">initialized</span> <span class="o">=</span> <span class="bp">False</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">one_time_setup</span><span class="p">():</span>
    <span class="s">"Do not call more than once!"</span>
    <span class="p">...</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">maybe_initialize</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">initialized</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">initialized</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">one_time_setup</span><span class="p">()</span>
        <span class="n">initialized</span> <span class="o">=</span> <span class="bp">True</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">initialized</code> flag exists because we expect the function to be
called more than once. However, if it might be called from concurrent
tasks there’s a <em>race condition</em>. If the second caller arrives while the
first is awaiting <code class="language-plaintext highlighter-rouge">one_time_setup()</code>, the function will be called a
second time.</p>
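<p>A self-contained demonstration, using a hypothetical slow <code class="language-plaintext highlighter-rouge">one_time_setup()</code> that counts its own calls, shows the setup running twice:</p>

```python
import asyncio

calls = 0
initialized = False

async def one_time_setup():
    # Hypothetical slow initialization; count how often it runs.
    global calls
    calls += 1
    await asyncio.sleep(0.01)

async def maybe_initialize():
    global initialized
    if not initialized:
        await one_time_setup()
        initialized = True

async def main():
    # Both callers observe initialized == False before either
    # finishes awaiting one_time_setup().
    await asyncio.gather(maybe_initialize(), maybe_initialize())

asyncio.run(main())
print(calls)  # 2
```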

<p>Switching the order of the call and the assignment won’t help:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">maybe_initialize</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">initialized</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">initialized</span><span class="p">:</span>
        <span class="n">initialized</span> <span class="o">=</span> <span class="bp">True</span>
        <span class="k">await</span> <span class="n">one_time_setup</span><span class="p">()</span>
</code></pre></div></div>

<p>Since asyncio is cooperative, the first caller doesn’t give up control
to other tasks until the <code class="language-plaintext highlighter-rouge">await</code>, meaning <code class="language-plaintext highlighter-rouge">one_time_setup()</code> will
never be called twice. However, the second caller may return before
<code class="language-plaintext highlighter-rouge">one_time_setup()</code> has completed. What we want is for <code class="language-plaintext highlighter-rouge">one_time_setup()</code>
to be called exactly once, but for no caller to return until it has
returned.</p>

<h3 id="mutual-exclusion">Mutual exclusion</h3>

<p>My first thought was to use a <a href="https://docs.python.org/3/library/asyncio-sync.html#lock">mutex lock</a>. This will protect the
variable <em>and</em> prevent followup callers from progressing too soon. Tasks
arriving while <code class="language-plaintext highlighter-rouge">one_time_setup()</code> is still running will block on the
lock.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">initialized</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">initialized_lock</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Lock</span><span class="p">()</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">maybe_initialize</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">initialized</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">initialized_lock</span><span class="p">:</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">initialized</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">one_time_setup</span><span class="p">()</span>
            <span class="n">initialized</span> <span class="o">=</span> <span class="bp">True</span>
</code></pre></div></div>

<p>Unfortunately this has a serious downside: <strong>asyncio locks are
associated with the <a href="https://docs.python.org/3/library/asyncio-eventloop.html">loop</a> where they were created</strong>. Since the
lock variable is global, <code class="language-plaintext highlighter-rouge">maybe_initialize()</code> can only be called from
the same loop that loaded the module. <code class="language-plaintext highlighter-rouge">asyncio.run()</code> creates a new loop
so it’s incompatible.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create a loop: always an error
</span><span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">maybe_initialize</span><span class="p">())</span>

<span class="c1"># reuse the loop: maybe an error
</span><span class="n">loop</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">get_event_loop</span><span class="p">()</span>
<span class="n">loop</span><span class="p">.</span><span class="n">run_until_complete</span><span class="p">((</span><span class="n">maybe_initialize</span><span class="p">()))</span>
</code></pre></div></div>

<p>(IMHO, it was a mistake for the asyncio API to include explicit loop
objects. It’s a low-level concept that unavoidably leaks through most
high-level abstractions.)</p>

<p>A workaround is to create the lock lazily. Thank goodness creating a
lock isn’t itself asynchronous!</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">initialized</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">initialized_lock</span> <span class="o">=</span> <span class="bp">None</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">maybe_initialize</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">initialized</span><span class="p">,</span> <span class="n">initialized_lock</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">initialized_lock</span><span class="p">:</span>
        <span class="n">initialized_lock</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Lock</span><span class="p">()</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">initialized_lock</span><span class="p">:</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">initialized</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">one_time_setup</span><span class="p">()</span>
            <span class="n">initialized</span> <span class="o">=</span> <span class="bp">True</span>
</code></pre></div></div>

<p>This is better, but <code class="language-plaintext highlighter-rouge">maybe_initialize()</code> can still only ever be called
from a single loop.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">maybe_initialize</span><span class="p">())</span> <span class="c1"># ok
</span><span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">maybe_initialize</span><span class="p">())</span> <span class="c1"># error!
</span></code></pre></div></div>

<h3 id="once">Once</h3>

<p>The pthreads API provides <a href="https://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_once.html"><code class="language-plaintext highlighter-rouge">pthread_once</code></a> to solve this problem.
C++11 similarly has <a href="https://en.cppreference.com/w/cpp/thread/call_once"><code class="language-plaintext highlighter-rouge">std::call_once</code></a>. We can build something
similar using a future-like object.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">future</span> <span class="o">=</span> <span class="bp">None</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">maybe_initialize</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">future</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">future</span><span class="p">:</span>
        <span class="n">future</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">one_time_setup</span><span class="p">())</span>
    <span class="k">await</span> <span class="n">future</span>
</code></pre></div></div>

<p>Awaiting a coroutine more than once is an error, but <a href="https://docs.python.org/3/library/asyncio-task.html#task-object">tasks</a> are
future-like objects and can be awaited more than once. At least on
CPython, they can also be awaited in other loops! So not only is this
simpler, it also solves the loop problem!</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">maybe_initialize</span><span class="p">())</span> <span class="c1"># ok
</span><span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">maybe_initialize</span><span class="p">())</span> <span class="c1"># still ok
</span></code></pre></div></div>

<p>This can be tidied up nicely in a <code class="language-plaintext highlighter-rouge">@once</code> decorator:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">once</span><span class="p">(</span><span class="n">func</span><span class="p">):</span>
    <span class="n">future</span> <span class="o">=</span> <span class="bp">None</span>
    <span class="k">async</span> <span class="k">def</span> <span class="nf">once_wrapper</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
        <span class="k">nonlocal</span> <span class="n">future</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">future</span><span class="p">:</span>
            <span class="n">future</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">func</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">))</span>
        <span class="k">return</span> <span class="k">await</span> <span class="n">future</span>
    <span class="k">return</span> <span class="n">once_wrapper</span>
</code></pre></div></div>

<p>No more need for <code class="language-plaintext highlighter-rouge">maybe_initialize()</code>, just decorate the original
<code class="language-plaintext highlighter-rouge">one_time_setup()</code>:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">@</span><span class="n">once</span>
<span class="k">async</span> <span class="k">def</span> <span class="nf">one_time_setup</span><span class="p">():</span>
    <span class="p">...</span>
</code></pre></div></div>
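<p>As a quick sanity check (a sketch; the <code class="language-plaintext highlighter-rouge">calls</code> counter is only for
illustration), awaiting the decorated coroutine twice within a single
loop runs the body just once:</p>

```python
import asyncio

def once(func):
    # Cache the task created on the first call; later calls await it.
    future = None
    async def once_wrapper(*args, **kwargs):
        nonlocal future
        if not future:
            future = asyncio.create_task(func(*args, **kwargs))
        return await future
    return once_wrapper

calls = 0

@once
async def one_time_setup():
    global calls
    calls += 1
    return 'ready'

async def main():
    # Awaited twice, but the body runs only once.
    return await asyncio.gather(one_time_setup(), one_time_setup())

results = asyncio.run(main())
```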

]]>
    </content>
  </entry>
    
  
    
  
    
  <entry>
    <title>Latency in Asynchronous Python</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2020/05/24/"/>
    <id>urn:uuid:529e2382-d4ec-47a9-93a8-f450311e5a05</id>
    <updated>2020-05-24T02:44:50Z</updated>
    <category term="python"/><category term="asyncio"/>
    <content type="html">
      <![CDATA[<p>This week I was debugging a misbehaving Python program that makes
significant use of <a href="https://docs.python.org/3/library/asyncio.html">Python’s asyncio</a>. The program would
eventually take very long periods of time to respond to network
requests. My first suspicion was a CPU-heavy coroutine hogging the
thread, preventing the socket coroutines from running, but an
inspection with <code class="language-plaintext highlighter-rouge">pdb</code> showed this wasn’t the case. Instead, the
program’s author had made a couple of fundamental mistakes using
asyncio. Let’s discuss them using small examples.</p>

<p>Setting the stage: There’s a heartbeat coroutine that “beats” once per
second. A real program would send out a packet as the heartbeat, but
here it just prints how late it was scheduled.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">heartbeat</span><span class="p">():</span>
    <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
        <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="p">.</span><span class="n">time</span><span class="p">()</span>
        <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">delay</span> <span class="o">=</span> <span class="n">time</span><span class="p">.</span><span class="n">time</span><span class="p">()</span> <span class="o">-</span> <span class="n">start</span> <span class="o">-</span> <span class="mi">1</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'heartbeat delay = </span><span class="si">{</span><span class="n">delay</span><span class="si">:</span><span class="p">.</span><span class="mi">3</span><span class="n">f</span><span class="si">}</span><span class="s">s'</span><span class="p">)</span>
</code></pre></div></div>

<p>Running this with <code class="language-plaintext highlighter-rouge">asyncio.run(heartbeat())</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>heartbeat delay = 0.001s
heartbeat delay = 0.001s
heartbeat delay = 0.001s
</code></pre></div></div>

<p>It’s consistently 1ms late, but good enough, especially considering
what’s to come. A program that <em>only</em> sends a heartbeat is pretty
useless, so a real program will be busy working on other things
concurrently. In this example, we have little 10ms payloads of work to
do, which are represented by this <code class="language-plaintext highlighter-rouge">process()</code> function:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">JOB_DURATION</span> <span class="o">=</span> <span class="mf">0.01</span>  <span class="c1"># 10ms
</span>
<span class="k">async</span> <span class="k">def</span> <span class="nf">process</span><span class="p">():</span>
    <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="n">JOB_DURATION</span><span class="p">)</span> <span class="c1"># simulate CPU time
</span></code></pre></div></div>

<p>That’s a synchronous sleep because it’s standing in for actual CPU work.
Maybe it’s parsing JSON in a loop or crunching numbers in NumPy. Use
your imagination. During this 10ms no other coroutines can be scheduled
because this is, after all, still <a href="https://rachelbythebay.com/w/2020/03/07/costly/">just a single-threaded program</a>.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">JOB_COUNT</span> <span class="o">=</span> <span class="mi">200</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">heartbeat</span><span class="p">())</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">2.5</span><span class="p">)</span>

    <span class="k">print</span><span class="p">(</span><span class="s">'begin processing'</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">JOB_COUNT</span><span class="p">):</span>
        <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">process</span><span class="p">())</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</code></pre></div></div>

<p>This program starts the heartbeat coroutine in a task. A coroutine
doesn’t make progress unless something is waiting on it, and that
something can be a task. So the heartbeat will continue along
independently without prodding.</p>
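<p>The task-vs-coroutine distinction is worth a tiny demonstration (a
sketch; names are hypothetical):</p>

```python
import asyncio

ran = []

async def side_effect():
    ran.append('ran')

async def main():
    coro = side_effect()  # creating a coroutine does nothing by itself
    coro.close()          # (closed here only to silence the warning)

    asyncio.create_task(side_effect())  # a task runs without prodding
    await asyncio.sleep(0.01)  # yield so the loop can schedule the task

asyncio.run(main())
```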

<p>The arbitrary 2.5 second sleep simulates waiting, say, for a network
request. In the output we’ll see the heartbeat tick a couple of times,
then it will create and process 200 jobs concurrently. In a real program
we’d have some way to collect the results, but we can ignore that part
for now. They’re <em>only</em> 10ms, so the effect on the heartbeat should be
pretty small, right?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>heartbeat delay = 0.001s
heartbeat delay = 0.001s
begin processing
heartbeat delay = 1.534s
heartbeat delay = 0.001s
heartbeat delay = 0.001s
</code></pre></div></div>

<p>The heartbeat was delayed for 1.5 seconds by a mere 200 tasks, each
doing only 10ms of work. What happened?</p>

<p>Python calls the object that schedules tasks a <em>loop</em>, and this is no
coincidence. Everything to be scheduled gets put into a loop and is
scheduled round robin, one after another. The 200 tasks got scheduled
ahead of the heartbeat, and so it doesn’t get scheduled again until each
of those tasks either yields (<code class="language-plaintext highlighter-rouge">await</code>) or completes.</p>

<p>It really didn’t take much to significantly hamper the heartbeat, and,
with a <a href="/blog/2019/02/24/">dumb bytecode compiler</a>, 10ms may not be much work at all.
The lesson here is to avoid spawning many tasks if latency is an
important consideration.</p>

<h3 id="a-semaphore-is-not-the-answer">A semaphore is not the answer</h3>

<p>My first idea at a solution: What if we used a semaphore to limit the
number of “active” tasks at a time? Then perhaps the heartbeat wouldn’t
have to compete with so many other tasks for time.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">WORKER_COUNT</span> <span class="o">=</span> <span class="mi">4</span>  <span class="c1"># max "active" jobs at a time
</span>
<span class="k">async</span> <span class="k">def</span> <span class="nf">main_with_semaphore</span><span class="p">():</span>
    <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">heartbeat</span><span class="p">())</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">2.5</span><span class="p">)</span>

    <span class="n">sem</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Semaphore</span><span class="p">(</span><span class="n">WORKER_COUNT</span><span class="p">)</span>
    <span class="k">async</span> <span class="k">def</span> <span class="nf">process</span><span class="p">():</span>
        <span class="k">await</span> <span class="n">sem</span><span class="p">.</span><span class="n">acquire</span><span class="p">()</span>
        <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="n">JOB_DURATION</span><span class="p">)</span>
        <span class="n">sem</span><span class="p">.</span><span class="n">release</span><span class="p">()</span>

    <span class="k">print</span><span class="p">(</span><span class="s">'begin processing'</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">JOB_COUNT</span><span class="p">):</span>
        <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">process</span><span class="p">())</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</code></pre></div></div>

<p>When the heartbeat sleep completes, about half the jobs will be complete
and the other half blocked on the semaphore. So perhaps the heartbeat
gets to skip ahead of all the blocked tasks since they’re not yet ready
to run?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>heartbeat delay = 0.001s
heartbeat delay = 0.001s
begin processing
heartbeat delay = 1.537s
heartbeat delay = 0.001s
heartbeat delay = 0.001s
</code></pre></div></div>

<p>It made no difference whatsoever because the tasks each “held their
place” in line in the loop! Even reducing <code class="language-plaintext highlighter-rouge">WORKER_COUNT</code> to 1 would have
no effect. As soon as a task completes, it frees the task waiting next
in line. The semaphore does practically nothing here.</p>

<h3 id="solving-it-with-a-job-queue">Solving it with a job queue</h3>

<p>Here’s what does work: a <a href="https://docs.python.org/3/library/asyncio-queue.html">job queue</a>. Create a queue to be populated
with coroutines (not tasks), and have a small number of tasks run jobs
from the queue. Since this is a real solution, I’ve made this example
more complete.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">main_with_queue</span><span class="p">():</span>
    <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">heartbeat</span><span class="p">())</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">2.5</span><span class="p">)</span>

    <span class="n">queue</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Queue</span><span class="p">(</span><span class="n">maxsize</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">async</span> <span class="k">def</span> <span class="nf">worker</span><span class="p">():</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="n">coro</span> <span class="o">=</span> <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">get</span><span class="p">()</span>
            <span class="k">await</span> <span class="n">coro</span>  <span class="c1"># consider using try/except
</span>            <span class="n">queue</span><span class="p">.</span><span class="n">task_done</span><span class="p">()</span>
    <span class="n">workers</span> <span class="o">=</span> <span class="p">[</span><span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">worker</span><span class="p">())</span>
                   <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">WORKER_COUNT</span><span class="p">)]</span>

    <span class="k">print</span><span class="p">(</span><span class="s">'begin processing'</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">JOB_COUNT</span><span class="p">):</span>
        <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">process</span><span class="p">())</span>
    <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">join</span><span class="p">()</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'end processing'</span><span class="p">)</span>

    <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">workers</span><span class="p">:</span>
        <span class="n">w</span><span class="p">.</span><span class="n">cancel</span><span class="p">()</span>

    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">task_done()</code> and <code class="language-plaintext highlighter-rouge">join()</code> methods make it trivial to
synchronize on full job completion. I also take the time to destroy the
worker tasks.
It’s harmless to leave them blocked on the queue. They’ll be garbage
collected so it’s not a resource leak. However, CPython complains about
garbage collecting running tasks because it looks like a mistake — and
it usually is.</p>

<p>If you read carefully you might have noticed the queue’s maximum size is
set to 1: not much of a “queue”! <a href="https://golang.org/">Go</a> developers will recognize this
as being (nearly) an <em>unbuffered channel</em>, the default and most common
kind of channel. So it’s more a synchronized rendezvous between producer
(<code class="language-plaintext highlighter-rouge">put()</code>) and consumer (<code class="language-plaintext highlighter-rouge">get()</code>). The producer waits at the queue with a
job until a task is free to come take it. A task waits at the queue
until a producer arrives with a job for it.</p>
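<p>This rendezvous behavior is easy to observe in isolation. In this
sketch (hypothetical names), the producer can never get more than one
item ahead of the consumer:</p>

```python
import asyncio

async def demo():
    queue = asyncio.Queue(maxsize=1)  # (nearly) unbuffered channel
    order = []

    async def producer():
        for i in range(3):
            await queue.put(i)       # blocks while the queue is full
            order.append(('put', i))

    async def consumer():
        for _ in range(3):
            i = await queue.get()    # blocks while the queue is empty
            order.append(('got', i))
            queue.task_done()

    await asyncio.gather(producer(), consumer())
    return order

order = asyncio.run(demo())
# put and got events alternate: the producer never gets far ahead
```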

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>heartbeat delay = 0.001s
heartbeat delay = 0.001s
begin processing
heartbeat delay = 0.014s
heartbeat delay = 0.020s
end processing
heartbeat delay = 0.002s
heartbeat delay = 0.001s
</code></pre></div></div>

<p>The output shows that the impact to the heartbeat was modest — about
the best we could hope for from async/await — and the heartbeat
continued while jobs were running. The more concurrency — the more
worker tasks running on the queue — the greater the latency.</p>

<p>Note: Increasing the <code class="language-plaintext highlighter-rouge">WORKER_COUNT</code> in this toy example won’t have an
impact on latency since the jobs aren’t actually concurrent. They start,
run, and complete before another worker task can draw from the queue.
Putting a couple awaits in <code class="language-plaintext highlighter-rouge">process()</code> allows for concurrency:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">WORKER_COUNT</span> <span class="o">=</span> <span class="mi">200</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">process</span><span class="p">():</span>
    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">0.01</span><span class="p">)</span>
    <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="n">JOB_DURATION</span><span class="p">)</span>
    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">0.01</span><span class="p">)</span>
</code></pre></div></div>

<p>Since there are so many worker tasks, this is back to the initial
problem:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>heartbeat delay = 0.001s
heartbeat delay = 0.001s
begin processing
heartbeat delay = 1.655s
end processing
heartbeat delay = 0.001s
heartbeat delay = 0.001s
</code></pre></div></div>

<p>As <code class="language-plaintext highlighter-rouge">WORKER_COUNT</code> decreases, so does heartbeat latency.</p>

<h3 id="unbounded-queues">Unbounded queues</h3>

<p>Here’s another defect from the same program. Create an unbounded queue,
a producer, and a consumer. The consumer prints the queue size so we can
see what’s happening:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">def</span> <span class="nf">producer_consumer</span><span class="p">():</span>
    <span class="n">queue</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Queue</span><span class="p">()</span>
    <span class="n">done</span> <span class="o">=</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">Condition</span><span class="p">()</span>

    <span class="k">async</span> <span class="k">def</span> <span class="nf">producer</span><span class="p">():</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100_000</span><span class="p">):</span>
            <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
        <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">join</span><span class="p">()</span>
        <span class="k">async</span> <span class="k">with</span> <span class="n">done</span><span class="p">:</span>
            <span class="n">done</span><span class="p">.</span><span class="n">notify</span><span class="p">()</span>

    <span class="k">async</span> <span class="k">def</span> <span class="nf">consumer</span><span class="p">():</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">get</span><span class="p">()</span>
            <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'qsize = </span><span class="si">{</span><span class="n">queue</span><span class="p">.</span><span class="n">qsize</span><span class="p">()</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
            <span class="n">queue</span><span class="p">.</span><span class="n">task_done</span><span class="p">()</span>

    <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">producer</span><span class="p">())</span>
    <span class="n">asyncio</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">consumer</span><span class="p">())</span>

    <span class="k">async</span> <span class="k">with</span> <span class="n">done</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">done</span><span class="p">.</span><span class="n">wait</span><span class="p">()</span>
</code></pre></div></div>

<p>The output of this program begins:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qsize = 99999
qsize = 99998
qsize = 99997
qsize = 99996
...
</code></pre></div></div>

<p>So the entire queue is populated before the consumer does anything at
all: tons of latency for whatever is being consumed. Since the queue is
unbounded, the producer never needs to yield. You might be tempted to
use <code class="language-plaintext highlighter-rouge">asyncio.sleep(0)</code> in the producer to yield explicitly:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">async</span> <span class="k">def</span> <span class="nf">producer</span><span class="p">():</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100_000</span><span class="p">):</span>
            <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
            <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>  <span class="c1"># yield
</span>        <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">join</span><span class="p">()</span>
        <span class="k">async</span> <span class="k">with</span> <span class="n">done</span><span class="p">:</span>
            <span class="n">done</span><span class="p">.</span><span class="n">notify</span><span class="p">()</span>
</code></pre></div></div>

<p>This even seems to work! The output looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qsize = 0
qsize = 0
qsize = 0
qsize = 0
</code></pre></div></div>

<p>However, this is fragile and not a real solution. If the consumer yields
just two times in its own loop, it’s nearly back to where we started:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">async</span> <span class="k">def</span> <span class="nf">consumer</span><span class="p">():</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">queue</span><span class="p">.</span><span class="n">get</span><span class="p">()</span>
            <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'qsize = </span><span class="si">{</span><span class="n">queue</span><span class="p">.</span><span class="n">qsize</span><span class="p">()</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
            <span class="n">queue</span><span class="p">.</span><span class="n">task_done</span><span class="p">()</span>
            <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
            <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>

<p>The output shows that the producer gradually creeps ahead of the
consumer. On each consumer iteration, the producer iterates twice:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qsize = 0
qsize = 1
qsize = 2
qsize = 3
...
</code></pre></div></div>

<p>There’s a really simple solution to this: <a href="https://lucumr.pocoo.org/2020/1/1/async-pressure/">Never, ever use unbounded
queues.</a> In fact <strong>every unbounded <code class="language-plaintext highlighter-rouge">asyncio.Queue()</code> is a bug</strong>.
It’s a serious API defect that asyncio allows unbounded queues to be
created at all. The default <code class="language-plaintext highlighter-rouge">maxsize</code> should have been <em>actually</em> zero
(unbuffered), not infinite. Because unbounded is the default, virtually
every example of <code class="language-plaintext highlighter-rouge">asyncio.Queue</code> — online, offline, and even the
official documentation — is broken in some way.</p>
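<p>Until linters catch up, a tiny wrapper can enforce the rule in your
own code (a hypothetical helper, not part of asyncio):</p>

```python
import asyncio

def bounded_queue(maxsize):
    """Create an asyncio.Queue, refusing the unbounded default."""
    if maxsize < 1:
        raise ValueError('refusing to create an unbounded asyncio.Queue')
    return asyncio.Queue(maxsize=maxsize)

queue = bounded_queue(1)   # ok: a rendezvous-style channel

try:
    bounded_queue(0)       # the dangerous default
except ValueError as e:
    error = str(e)
```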

<h3 id="important-takeaways">Important takeaways</h3>

<ol>
  <li>The default <code class="language-plaintext highlighter-rouge">asyncio.Queue()</code> is <em>always</em> wrong.</li>
  <li><code class="language-plaintext highlighter-rouge">asyncio.sleep(0)</code> is <em>nearly always</em> used incorrectly.</li>
  <li>Use a <code class="language-plaintext highlighter-rouge">maxsize=1</code> job queue instead of spawning many identical tasks.</li>
</ol>

<p>Python linters should be updated to warn about 1 and 2 by default.</p>

<p>Update: A couple of people have pointed out <a href="https://trio.readthedocs.io/en/stable/reference-core.html#buffering-in-channels">an argument in the Trio
documentation for unbounded queues</a>. This argument conflates two
different concepts: data structure queues and concurrent communication
infrastructure queues. To distinguish, the latter is often called a
channel. An unbounded <em>queue</em> (<code class="language-plaintext highlighter-rouge">collections.deque</code>) is necessary, but
an unbounded <em>channel</em> (<code class="language-plaintext highlighter-rouge">asyncio.Queue</code>) is always wrong. The Trio
documentation describes a web crawler, which is fundamentally a
breadth-first search (read: queue-oriented) of a graph. So this is a
plain old BFS queue, not a channel, which is why it’s reasonable for it
to be unbounded.</p>
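<p>The distinction is visible in a miniature crawler sketch (the graph
and names here are hypothetical): the BFS frontier is a plain
<code class="language-plaintext highlighter-rouge">collections.deque</code>, private to one coroutine and free to grow, while
results flow out through a bounded queue acting as a channel:</p>

```python
import asyncio
from collections import deque

# Hypothetical graph standing in for pages and links.
GRAPH = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}

async def crawl(start, results):
    # Data-structure queue: unbounded is fine; it never crosses a
    # coroutine boundary, so it applies no backpressure.
    frontier = deque([start])
    seen = {start}
    while frontier:
        node = frontier.popleft()
        await results.put(node)  # channel: bounded, applies backpressure
        for nxt in GRAPH[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    await results.put(None)  # sentinel: crawl finished

async def main():
    results = asyncio.Queue(maxsize=1)  # channel, never unbounded
    task = asyncio.create_task(crawl('a', results))
    visited = []
    while (node := await results.get()) is not None:
        visited.append(node)
    await task
    return visited

visited = asyncio.run(main())
```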

]]>
    </content>
  </entry>
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  
    
  <entry>
    <title>Endlessh: an SSH Tarpit</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2019/03/22/"/>
    <id>urn:uuid:5429ee15-3d42-4af2-8690-f7f402870dd0</id>
    <updated>2019-03-22T17:26:45Z</updated>
    <category term="netsec"/><category term="python"/><category term="c"/><category term="posix"/><category term="asyncio"/>
    <content type="html">
      <![CDATA[<p><em>This article was discussed <a href="https://news.ycombinator.com/item?id=19465967">on Hacker News</a> (<a href="https://news.ycombinator.com/item?id=24491453">later</a>), <a href="https://old.reddit.com/r/programming/comments/b4iq00/endlessh_an_ssh_tarpit/">on
reddit</a> (<a href="https://old.reddit.com/r/netsec/comments/b4dwjl/endlessh_an_ssh_tarpit/">also</a>), featured in <a href="https://www.youtube.com/watch?v=bM65iyRRW0A&amp;t=3m52s">BSD Now 294</a>.
Also check out <a href="https://github.com/bediger4000/ssh-tarpit-behavior">this Endlessh analysis</a>.</em></p>

<p>I’m a big fan of tarpits: a network service that intentionally inserts
delays in its protocol, slowing down clients by forcing them to wait.
This arrests the speed at which a bad actor can attack or probe the
host system, and it ties up some of the attacker’s resources that
might otherwise be spent attacking another host. When done well, a
tarpit imposes more cost on the attacker than the defender.</p>

<!--more-->

<p>The Internet is a very hostile place, and anyone who’s ever stood up
an Internet-facing IPv4 host has witnessed the immediate and
continuous attacks against their server. I’ve maintained <a href="/blog/2017/06/15/">such a
server</a> for nearly six years now, and more than 99% of my
incoming traffic has ill intent. One part of my defenses has been
tarpits in various forms. The latest addition is an SSH tarpit I wrote
a couple of months ago:</p>

<p><a href="https://github.com/skeeto/endlessh"><strong>Endlessh: an SSH tarpit</strong></a></p>

<p>This program opens a socket and pretends to be an SSH server. However,
it actually just ties up SSH clients with false promises indefinitely
— or at least until the client eventually gives up. After cloning the
repository, here’s how you can try it out for yourself (default port
2222):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ make
$ ./endlessh &amp;
$ ssh -p2222 localhost
</code></pre></div></div>

<p>Your SSH client will hang there and wait for at least several days
before finally giving up. Like a mammoth in the La Brea Tar Pits, it
got itself stuck and can’t get itself out. As I write, my
Internet-facing SSH tarpit currently has 27 clients trapped in it. A
few of these have been connected for weeks. In one particular spike it
had 1,378 clients trapped at once, lasting about 20 hours.</p>

<p>My Internet-facing Endlessh server listens on port 22, which is the
standard SSH port. I long ago moved my real SSH server off to another
port where it sees a whole lot less SSH traffic — essentially none.
This makes the logs a whole lot more manageable. And (hopefully)
Endlessh convinces attackers not to look around for an SSH server on
another port.</p>

<p>How does it work? Endlessh exploits <a href="https://tools.ietf.org/html/rfc4253#section-4.2">a little paragraph in RFC
4253</a>, the SSH protocol specification. Immediately after the TCP
connection is established, and before negotiating the cryptography,
both ends send an identification string:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SSH-protoversion-softwareversion SP comments CR LF
</code></pre></div></div>

<p>The RFC also notes:</p>

<blockquote>
  <p>The server MAY send other lines of data before sending the version
string.</p>
</blockquote>

<p>There is no limit on the number of lines, just that these lines must
not begin with “SSH-” since that would be ambiguous with the
identification string, and lines must not be longer than 255
characters including CRLF. So <strong>Endlessh sends an <em>endless</em> stream of
randomly-generated “other lines of data”</strong> without ever intending to
send a version string. By default it waits 10 seconds between each
line. This slows down the protocol, but prevents it from actually
timing out.</p>
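<p>To make the constraint concrete, here is a minimal Python sketch of a conforming “other line” generator (the random-hex format is an illustration of the idea, not Endlessh’s exact output):</p>

```python
import random

def banner_line():
    # One "other line of data" permitted by RFC 4253: it must not
    # begin with "SSH-" and must not exceed 255 characters including
    # the trailing CRLF. A short random hex string satisfies both
    # constraints by construction.
    return b'%x\r\n' % random.getrandbits(32)
```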

<p>This means Endlessh need not know anything about cryptography or the
vast majority of the SSH protocol. It’s dead simple.</p>

<h3 id="implementation-strategies">Implementation strategies</h3>

<p>Ideally the tarpit’s resource footprint should be as small as
possible. It’s just a security tool, and the server does have an
actual purpose that doesn’t include being a tarpit. It should tie up
the attacker’s resources, not the server’s, and should generally be
unnoticeable. (Take note all those who write the awful “security”
products I have to tolerate at my day job.)</p>

<p>Even when many clients have been trapped, Endlessh spends more than
99.999% of its time waiting around, doing nothing. It wouldn’t even be
accurate to call it I/O-bound. If anything, it’s <em>timer-bound</em>,
waiting around before sending off the next line of data. <strong>The most
precious resource to conserve is <em>memory</em>.</strong></p>

<h4 id="processes">Processes</h4>

<p>The most straightforward way to implement something like Endlessh is a
fork server: accept a connection, fork, and the child simply alternates
between <code class="language-plaintext highlighter-rouge">sleep(3)</code> and <code class="language-plaintext highlighter-rouge">write(2)</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
    <span class="kt">ssize_t</span> <span class="n">r</span><span class="p">;</span>
    <span class="kt">char</span> <span class="n">line</span><span class="p">[</span><span class="mi">256</span><span class="p">];</span>

    <span class="n">sleep</span><span class="p">(</span><span class="n">DELAY</span><span class="p">);</span>
    <span class="n">generate_line</span><span class="p">(</span><span class="n">line</span><span class="p">);</span>
    <span class="n">r</span> <span class="o">=</span> <span class="n">write</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">line</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">line</span><span class="p">));</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">r</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span> <span class="o">&amp;&amp;</span> <span class="n">errno</span> <span class="o">!=</span> <span class="n">EINTR</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">exit</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
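<p>For illustration, the whole fork-server shape might be sketched in Python like this (the function names and the <code>max_lines</code> escape hatch are hypothetical, not part of Endlessh):</p>

```python
import os
import socket
import time

def tarpit(sock, delay, max_lines=None):
    # Child's loop: alternate between sleeping and writing a random
    # hex line until the peer hangs up. max_lines exists only so the
    # loop can terminate deterministically when exercising the code.
    sent = 0
    try:
        while max_lines is None or sent < max_lines:
            time.sleep(delay)
            sock.sendall(b'%x\r\n' % int.from_bytes(os.urandom(4), 'big'))
            sent += 1
    except (BrokenPipeError, ConnectionResetError):
        pass
    finally:
        sock.close()

def serve_forking(port=2222, delay=10):
    # Parent's loop: accept a connection, fork, and hand the socket
    # to the child. One whole process per trapped client.
    srv = socket.create_server(('0.0.0.0', port))
    while True:
        client, _ = srv.accept()
        if os.fork() == 0:
            srv.close()
            tarpit(client, delay)
            os._exit(0)
        client.close()
```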

<p>A process per connection is a lot of overhead when connections are
expected to be up hours or even weeks at a time. An attacker who knows
about this could exhaust the server’s resources with little effort by
opening up lots of connections.</p>

<h4 id="threads">Threads</h4>

<p>A better option is, instead of processes, to create a thread per
connection. On Linux <a href="/blog/2015/05/15/">this is practically the same thing</a>, but it’s
still better. However, you still have to allocate a stack for the thread
and the kernel will have to spend some resources managing the thread.</p>

<h4 id="poll">Poll</h4>

<p>For Endlessh I went for an even more lightweight version: a
single-threaded <code class="language-plaintext highlighter-rouge">poll(2)</code> server, analogous to stackless green threads.
The overhead per connection is about as low as it gets.</p>

<p>Clients that are being delayed are not registered in <code class="language-plaintext highlighter-rouge">poll(2)</code>. Their
only overhead is the socket object in the kernel, and another 78 bytes
to track them in Endlessh. Most of those bytes are used only for
accurate logging. Only those clients that are overdue for a new line
are registered for <code class="language-plaintext highlighter-rouge">poll(2)</code>.</p>

<p>When clients are waiting, but no clients are overdue, <code class="language-plaintext highlighter-rouge">poll(2)</code> is
essentially used in place of <code class="language-plaintext highlighter-rouge">sleep(3)</code>. Though since it still needs
to manage the <em>accept</em> server socket, it (almost) never actually waits
on <em>nothing</em>.</p>
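<p>The scheduling idea can be sketched as a pure function: given each client’s deadline for its next line, only overdue clients go to <code>poll(2)</code>, and the nearest future deadline becomes the poll timeout (the <code>(fd, deadline)</code> representation here is my simplification, not Endlessh’s actual bookkeeping):</p>

```python
import time

def split_clients(clients, now=None):
    # clients is a list of (fd, deadline) pairs. Overdue clients are
    # the ones to register with poll(2); the timeout is how long
    # poll(2) should sleep before the next client becomes overdue
    # (None means wait on the accept socket alone, indefinitely).
    now = time.monotonic() if now is None else now
    overdue = [fd for fd, deadline in clients if deadline <= now]
    waiting = [deadline for _, deadline in clients if deadline > now]
    timeout = min(waiting) - now if waiting else None
    return overdue, timeout
```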

<p>There’s an option to limit the total number of client connections so
that it doesn’t get out of hand. In this case it will stop polling the
accept socket until a client disconnects. I probably shouldn’t have
bothered with this option and instead relied on <code class="language-plaintext highlighter-rouge">ulimit</code>, a feature
already provided by the operating system.</p>

<p>I could have used epoll (Linux) or kqueue (BSD), which would be much
more efficient than <code class="language-plaintext highlighter-rouge">poll(2)</code>. The problem with <code class="language-plaintext highlighter-rouge">poll(2)</code> is that it’s
constantly registering and unregistering Endlessh on each of the
overdue sockets each time around the main loop. This is by far the
most CPU-intensive part of Endlessh, and it’s all inflicted on the
kernel. Most of the time, even with thousands of clients trapped in
the tarpit, only a small number of them are polled at once, so I opted
for better portability instead.</p>

<p>One consequence of not polling connections that are waiting is that
disconnections aren’t noticed in a timely fashion. This makes the logs
less accurate than I like, but otherwise it’s pretty harmless.
Unfortunately even if I wanted to fix this, the <code class="language-plaintext highlighter-rouge">poll(2)</code> interface
isn’t quite equipped for it anyway.</p>

<h4 id="raw-sockets">Raw sockets</h4>

<p>With a <code class="language-plaintext highlighter-rouge">poll(2)</code> server, the biggest overhead remaining is in the
kernel, where it allocates send and receive buffers for each client
and manages the proper TCP state. The next step to reducing this
overhead is Endlessh opening a <em>raw socket</em> and speaking TCP itself,
bypassing most of the operating system’s TCP/IP stack.</p>

<p>Much of the TCP connection state doesn’t matter to Endlessh and doesn’t
need to be tracked. For example, it doesn’t care about any data sent by
the client, so no receive buffer is needed, and any data that arrives
could be dropped on the floor.</p>

<p>Even more, raw sockets would allow for some even nastier tarpit tricks.
Despite the long delays between data lines, the kernel itself responds
very quickly on the TCP layer and below. ACKs are sent back quickly and
so on. An astute attacker could detect that the delay is artificial,
imposed above the TCP layer by an application.</p>

<p>If Endlessh worked at the TCP layer, it could <a href="https://nyman.re/super-simple-ssh-tarpit/">tarpit the TCP protocol
itself</a>. It could introduce artificial “noise” to the connection
that requires packet retransmissions, delay ACKs, etc. It would look a
lot more like network problems than a tarpit.</p>

<p>I haven’t taken Endlessh this far, nor do I plan to do so. At the
moment attackers either have a hard timeout, so this wouldn’t matter,
or they’re pretty dumb and Endlessh already works well enough.</p>

<h3 id="asyncio-and-other-tarpits">asyncio and other tarpits</h3>

<p>Since writing Endlessh <a href="/blog/2019/03/10/">I’ve learned about Python’s <code class="language-plaintext highlighter-rouge">asyncio</code></a>, and
it’s actually a near perfect fit for this problem. I should have just
used it in the first place. The hard part is already implemented within
<code class="language-plaintext highlighter-rouge">asyncio</code>, and the problem isn’t CPU-bound, so being written in Python
<a href="/blog/2019/02/24/">doesn’t matter</a>.</p>

<p>Here’s a simplified (no logging, no configuration, etc.) version of
Endlessh implemented in about 20 lines of Python 3.7:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">asyncio</span>
<span class="kn">import</span> <span class="nn">random</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">handler</span><span class="p">(</span><span class="n">_reader</span><span class="p">,</span> <span class="n">writer</span><span class="p">):</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span>
            <span class="n">writer</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="sa">b</span><span class="s">'%x</span><span class="se">\r\n</span><span class="s">'</span> <span class="o">%</span> <span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="o">**</span><span class="mi">32</span><span class="p">))</span>
            <span class="k">await</span> <span class="n">writer</span><span class="p">.</span><span class="n">drain</span><span class="p">()</span>
    <span class="k">except</span> <span class="nb">ConnectionResetError</span><span class="p">:</span>
        <span class="k">pass</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="n">server</span> <span class="o">=</span> <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">start_server</span><span class="p">(</span><span class="n">handler</span><span class="p">,</span> <span class="s">'0.0.0.0'</span><span class="p">,</span> <span class="mi">2222</span><span class="p">)</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">server</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">server</span><span class="p">.</span><span class="n">serve_forever</span><span class="p">()</span>

<span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">main</span><span class="p">())</span>
</code></pre></div></div>

<p>Since Python coroutines are stackless, the per-connection memory
overhead is comparable to the C version. So it seems asyncio is
perfectly suited for writing tarpits! Here’s an HTTP tarpit to trip up
attackers trying to exploit HTTP servers. It slowly sends a random,
endless HTTP header:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">asyncio</span>
<span class="kn">import</span> <span class="nn">random</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">handler</span><span class="p">(</span><span class="n">_reader</span><span class="p">,</span> <span class="n">writer</span><span class="p">):</span>
    <span class="n">writer</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="sa">b</span><span class="s">'HTTP/1.1 200 OK</span><span class="se">\r\n</span><span class="s">'</span><span class="p">)</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
            <span class="n">header</span> <span class="o">=</span> <span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="o">**</span><span class="mi">32</span><span class="p">)</span>
            <span class="n">value</span> <span class="o">=</span> <span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="o">**</span><span class="mi">32</span><span class="p">)</span>
            <span class="n">writer</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="sa">b</span><span class="s">'X-%x: %x</span><span class="se">\r\n</span><span class="s">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
            <span class="k">await</span> <span class="n">writer</span><span class="p">.</span><span class="n">drain</span><span class="p">()</span>
    <span class="k">except</span> <span class="nb">ConnectionResetError</span><span class="p">:</span>
        <span class="k">pass</span>

<span class="k">async</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="n">server</span> <span class="o">=</span> <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">start_server</span><span class="p">(</span><span class="n">handler</span><span class="p">,</span> <span class="s">'0.0.0.0'</span><span class="p">,</span> <span class="mi">8080</span><span class="p">)</span>
    <span class="k">async</span> <span class="k">with</span> <span class="n">server</span><span class="p">:</span>
        <span class="k">await</span> <span class="n">server</span><span class="p">.</span><span class="n">serve_forever</span><span class="p">()</span>

<span class="n">asyncio</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">main</span><span class="p">())</span>
</code></pre></div></div>

<p>Try it out for yourself. Firefox and Chrome will spin on that server
for hours before giving up. I have yet to see curl actually timeout on
its own in the default settings (<code class="language-plaintext highlighter-rouge">--max-time</code>/<code class="language-plaintext highlighter-rouge">-m</code> does work
correctly, though).</p>

<p>Parting exercise for the reader: Using the examples above as a starting
point, implement an SMTP tarpit using asyncio. Bonus points for using
TLS connections and testing it against real spammers.</p>

]]>
    </content>
  </entry>
    
  
    
  <entry>
    <title>An Async / Await Library for Emacs Lisp</title>
    <link rel="alternate" type="text/html" href="https://nullprogram.com/blog/2019/03/10/"/>
    <id>urn:uuid:5d1462fa-a30d-432e-9a4f-827eb67862b2</id>
    <updated>2019-03-10T20:57:03Z</updated>
    <category term="emacs"/><category term="elisp"/><category term="lisp"/><category term="python"/><category term="javascript"/><category term="lang"/><category term="asyncio"/>
    <content type="html">
      <![CDATA[<p>As part of <a href="/blog/2019/02/24/">building my Python proficiency</a>, I’ve learned how to
use <a href="https://docs.python.org/3/library/asyncio.html">asyncio</a>. This new language feature <a href="https://docs.python.org/3/whatsnew/3.5.html#whatsnew-pep-492">first appeared in
Python 3.5</a> (<a href="https://www.python.org/dev/peps/pep-0492/">PEP 492</a>, September 2015). JavaScript grew <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function">a
nearly identical feature</a> in ES2017 (June 2017). An async function
can pause to await on an asynchronously computed result, much like a
generator pausing when it yields a value.</p>

<p>In fact, both Python and JavaScript async functions are essentially just
fancy generator functions with some specialized syntax and semantics.
That is, they’re <a href="https://blog.varunramesh.net/posts/stackless-vs-stackful-coroutines/">stackless coroutines</a>. Both languages already had
generators, so their generator-like async functions are a natural
extension that — unlike <a href="/blog/2017/06/21/"><em>stackful</em> coroutines</a> — do not require
significant, new runtime plumbing.</p>
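<p>That correspondence can be sketched in a few lines of Python (a toy, fully synchronous driver for illustration only; a real event loop resumes the generator later, from callbacks):</p>

```python
def drive(gen, value=None):
    # A generator "awaits" by yielding a request for an asynchronous
    # result; the driver resumes it with each result. This is the same
    # mechanism that async/await syntax packages up.
    try:
        while True:
            request = gen.send(value)   # the generator pauses here
            value = request()           # compute the "async" result
    except StopIteration as stop:
        return stop.value

def add_tenfold(a, b):
    # A "coroutine": yields a thunk, is resumed with its result.
    result = yield (lambda: a + b)
    return result * 10
```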

<p>Emacs <a href="/blog/2018/05/31/">officially got generators in 25.1</a> (September 2016),
though, unlike Python and JavaScript, it didn’t require any additional
support from the compiler or runtime. It’s implemented entirely using
Lisp macros. In other words, it’s just another library, not a core
language feature. In theory, the generator library could be easily
backported to the first Emacs release to <a href="/blog/2016/12/22/">properly support lexical
closures</a>, Emacs 24.1 (June 2012).</p>

<p>For the same reason, stackless async/await coroutines can also be
implemented as a library. So that’s what I did, letting Emacs’ generator
library do most of the heavy lifting. The package is called <code class="language-plaintext highlighter-rouge">aio</code>:</p>

<ul>
  <li><strong><a href="https://github.com/skeeto/emacs-aio">https://github.com/skeeto/emacs-aio</a></strong></li>
</ul>

<p>It’s modeled more closely on JavaScript’s async functions than Python’s
asyncio, with the core representation being <em>promises</em> rather than a
coroutine objects. I just have an easier time reasoning about promises
than coroutines.</p>

<p>I’m definitely <a href="https://github.com/chuntaro/emacs-async-await">not the first person to realize this was
possible</a>, and was beaten to the punch by two years. Wanting to
<a href="http://www.winestockwebdesign.com/Essays/Lisp_Curse.html">avoid fragmentation</a>, I set aside all formality in my first
iteration on the idea, not even bothering with namespacing my
identifiers. It was to be only an educational exercise. However, I got
quite attached to my little toy. Once I got my head wrapped around the
problem, everything just sort of clicked into place so nicely.</p>

<p>In this article I will show step-by-step one way to build async/await
on top of generators, laying out one concept at a time and then
building upon each. But first, some examples to illustrate the desired
final result.</p>

<h3 id="aio-example">aio example</h3>

<p>Ignoring <a href="/blog/2016/06/16/">all its problems</a> for a moment, suppose you want to use
<code class="language-plaintext highlighter-rouge">url-retrieve</code> to fetch some content from a URL and return it. To keep
this simple, I’m going to omit error handling. Also assume that
<code class="language-plaintext highlighter-rouge">lexical-binding</code> is <code class="language-plaintext highlighter-rouge">t</code> for all examples. Besides, lexical scope is
required by the generator library, and therefore also by <code class="language-plaintext highlighter-rouge">aio</code>.</p>

<p>The most naive approach is to fetch the content synchronously:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">fetch-fortune-1</span> <span class="p">(</span><span class="nv">url</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">buffer</span> <span class="p">(</span><span class="nv">url-retrieve-synchronously</span> <span class="nv">url</span><span class="p">)))</span>
    <span class="p">(</span><span class="nv">with-current-buffer</span> <span class="nv">buffer</span>
      <span class="p">(</span><span class="nb">prog1</span> <span class="p">(</span><span class="nv">buffer-string</span><span class="p">)</span>
        <span class="p">(</span><span class="nv">kill-buffer</span><span class="p">)))))</span>
</code></pre></div></div>

<p>The result is returned directly, and errors are communicated by an error
signal (i.e. Emacs’ version of exceptions). This is convenient, but the
function will block the main thread, locking up Emacs until the result
has arrived. This is obviously very undesirable, so, in practice,
everyone nearly always uses the asynchronous version:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">fetch-fortune-2</span> <span class="p">(</span><span class="nv">url</span> <span class="nv">callback</span><span class="p">)</span>
  <span class="p">(</span><span class="nv">url-retrieve</span> <span class="nv">url</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">_status</span><span class="p">)</span>
                      <span class="p">(</span><span class="nb">funcall</span> <span class="nv">callback</span> <span class="p">(</span><span class="nv">buffer-string</span><span class="p">)))))</span>
</code></pre></div></div>

<p>The main thread no longer blocks, but it’s a whole lot less
convenient. The result isn’t returned to the caller, and instead the
caller supplies a callback function. The result, whether success or
failure, will be delivered via callback, so the caller must split
itself into two pieces: the part before the callback and the callback
itself. Errors cannot be delivered using an error signal because of the
inverted flow control.</p>

<p>The situation gets worse if, say, you need to fetch results from two
different URLs. You either fetch results one at a time (inefficient),
or you manage two different callbacks that could be invoked in any
order, and therefore have to coordinate.</p>
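<p>For comparison, this is exactly the coordination problem async/await dissolves in Python, where <code>asyncio.gather</code> runs both fetches concurrently and returns results in argument order (the <code>fetch</code> stub here is hypothetical):</p>

```python
import asyncio

async def fetch(url):
    # Stand-in for a real asynchronous fetch; illustrative only.
    await asyncio.sleep(0)
    return 'content of %s' % url

async def fetch_both(url1, url2):
    # Both requests run concurrently; results come back in argument
    # order, so no callback ordering is coordinated by hand.
    return await asyncio.gather(fetch(url1), fetch(url2))
```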

<p><em>Wouldn’t it be nice for the function to work like the first example,
but be asynchronous like the second example?</em> Enter async/await:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">fetch-fortune-3</span> <span class="p">(</span><span class="nv">url</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">buffer</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">aio-url-retrieve</span> <span class="nv">url</span><span class="p">))))</span>
    <span class="p">(</span><span class="nv">with-current-buffer</span> <span class="nv">buffer</span>
      <span class="p">(</span><span class="nb">prog1</span> <span class="p">(</span><span class="nv">buffer-string</span><span class="p">)</span>
        <span class="p">(</span><span class="nv">kill-buffer</span><span class="p">)))))</span>
</code></pre></div></div>

<p>A function defined with <code class="language-plaintext highlighter-rouge">aio-defun</code> is just like <code class="language-plaintext highlighter-rouge">defun</code> except that
it can use <code class="language-plaintext highlighter-rouge">aio-await</code> to pause and wait on any other function defined
with <code class="language-plaintext highlighter-rouge">aio-defun</code> — or, more specifically, any function that returns a
promise. Borrowing Python parlance: Returning a promise makes a
function <em>awaitable</em>. If there’s an error, it’s delivered as an error
signal from <code class="language-plaintext highlighter-rouge">aio-url-retrieve</code>, just like the first example. When
called, this function returns immediately with a promise object that
represents a future result. The caller might look like this:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">defcustom</span> <span class="nv">fortune-url</span> <span class="o">...</span><span class="p">)</span>

<span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">display-fortune</span> <span class="p">()</span>
  <span class="p">(</span><span class="nv">interactive</span><span class="p">)</span>
  <span class="p">(</span><span class="nv">message</span> <span class="s">"%s"</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">fetch-fortune-3</span> <span class="nv">fortune-url</span><span class="p">))))</span>
</code></pre></div></div>

<p>How wonderfully clean that looks! And, yes, it even works with
<code class="language-plaintext highlighter-rouge">interactive</code> like that. I can <code class="language-plaintext highlighter-rouge">M-x display-fortune</code> and a fortune is
printed in the minibuffer as soon as the result arrives from the
server. In the meantime Emacs doesn’t block and I can continue my
work.</p>

<p>You can’t do anything you couldn’t already do before. It’s just a
nicer way to organize the same callbacks: <em>implicit</em> rather than
<em>explicit</em>.</p>

<h3 id="promises-simplified">Promises, simplified</h3>

<p>The core object at play is the <em>promise</em>. Promises are already a
rather simple concept, but <code class="language-plaintext highlighter-rouge">aio</code> promises have been distilled to their
essence, as they’re only needed for this singular purpose. More on
this later.</p>

<p>As I said, a promise represents a future result. In practical terms, a
promise is just an object to which one can subscribe with a callback.
When the result is ready, the callbacks are invoked. Another way to
put it is that <em>promises <a href="https://en.wikipedia.org/wiki/Reification_(computer_science)">reify</a> the concept of callbacks</em>. A
callback is no longer just the idea of an extra argument to a function.
It’s a first-class <em>thing</em> that itself can be passed around as a
value.</p>

<p>Promises have two slots: the final promise <em>result</em> and a list of
<em>subscribers</em>. A <code class="language-plaintext highlighter-rouge">nil</code> result means the result hasn’t been computed
yet. It’s so simple I’m not even <a href="/blog/2018/02/14/">bothering with <code class="language-plaintext highlighter-rouge">cl-struct</code></a>.</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-promise</span> <span class="p">()</span>
  <span class="s">"Create a new promise object."</span>
  <span class="p">(</span><span class="nv">record</span> <span class="ss">'aio-promise</span> <span class="no">nil</span> <span class="p">()))</span>

<span class="p">(</span><span class="nv">defsubst</span> <span class="nv">aio-promise-p</span> <span class="p">(</span><span class="nv">object</span><span class="p">)</span>
  <span class="p">(</span><span class="nb">and</span> <span class="p">(</span><span class="nb">eq</span> <span class="ss">'aio-promise</span> <span class="p">(</span><span class="nb">type-of</span> <span class="nv">object</span><span class="p">))</span>
       <span class="p">(</span><span class="nb">=</span> <span class="mi">3</span> <span class="p">(</span><span class="nb">length</span> <span class="nv">object</span><span class="p">))))</span>

<span class="p">(</span><span class="nv">defsubst</span> <span class="nv">aio-result</span> <span class="p">(</span><span class="nv">promise</span><span class="p">)</span>
  <span class="p">(</span><span class="nb">aref</span> <span class="nv">promise</span> <span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>

<p>To subscribe to a promise, use <code class="language-plaintext highlighter-rouge">aio-listen</code>:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-listen</span> <span class="p">(</span><span class="nv">promise</span> <span class="nv">callback</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">result</span> <span class="p">(</span><span class="nv">aio-result</span> <span class="nv">promise</span><span class="p">)))</span>
    <span class="p">(</span><span class="k">if</span> <span class="nv">result</span>
        <span class="p">(</span><span class="nv">run-at-time</span> <span class="mi">0</span> <span class="no">nil</span> <span class="nv">callback</span> <span class="nv">result</span><span class="p">)</span>
      <span class="p">(</span><span class="nb">push</span> <span class="nv">callback</span> <span class="p">(</span><span class="nb">aref</span> <span class="nv">promise</span> <span class="mi">2</span><span class="p">)))))</span>
</code></pre></div></div>

<p>If the result isn’t ready yet, add the callback to the list of
subscribers. If the result is ready <em>call the callback in the next
event loop turn</em> using <code class="language-plaintext highlighter-rouge">run-at-time</code>. This is important because it
keeps all the asynchronous components isolated from one another. They
won’t see each others’ frames on the call stack, nor frames from
<code class="language-plaintext highlighter-rouge">aio</code>. This is so important that the <a href="https://promisesaplus.com/">Promises/A+ specification</a>
is explicit about it.</p>

<p>The other half of the equation is resolving a promise, which is done
with <code class="language-plaintext highlighter-rouge">aio-resolve</code>. Unlike other promises, <code class="language-plaintext highlighter-rouge">aio</code> promises don’t care
whether the promise is being <em>fulfilled</em> (success) or <em>rejected</em>
(error). Instead a promise is resolved using a <em>value function</em> — or,
usually, a <em>value closure</em>. Subscribers receive this value function
and extract the value by invoking it with no arguments.</p>

<p>Why? This lets the promise’s resolver decide the semantics of the
result. Instead of returning a value, the value function can signal an
error, propagating an error signal that terminated an async function.
Because of this, the promise doesn’t need to know how it’s being
resolved.</p>
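
<p>For illustration, here’s what each kind of resolution looks like as a
value function. The <code class="language-plaintext highlighter-rouge">my-error</code> symbol is made up for this sketch:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>;; Fulfill: subscribers get 42 when they (funcall v)
(aio-resolve promise (lambda () 42))

;; Reject: (funcall v) signals instead of returning
(aio-resolve promise (lambda () (signal 'my-error nil)))
</code></pre></div></div>

<p>Either way, <code class="language-plaintext highlighter-rouge">aio-resolve</code> itself doesn’t care which it is; only a
subscriber invoking the value function finds out.</p>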

<p>When a promise is resolved, subscribers are each scheduled in their own
event loop turns in the same order that they subscribed. If a promise
has already been resolved, nothing happens. (Thought: Perhaps this
should be an error in order to catch API misuse?)</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-resolve</span> <span class="p">(</span><span class="nv">promise</span> <span class="nv">value-function</span><span class="p">)</span>
  <span class="p">(</span><span class="nb">unless</span> <span class="p">(</span><span class="nv">aio-result</span> <span class="nv">promise</span><span class="p">)</span>
    <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">callbacks</span> <span class="p">(</span><span class="nb">nreverse</span> <span class="p">(</span><span class="nb">aref</span> <span class="nv">promise</span> <span class="mi">2</span><span class="p">))))</span>
      <span class="p">(</span><span class="nb">setf</span> <span class="p">(</span><span class="nb">aref</span> <span class="nv">promise</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">value-function</span>
            <span class="p">(</span><span class="nb">aref</span> <span class="nv">promise</span> <span class="mi">2</span><span class="p">)</span> <span class="p">())</span>
      <span class="p">(</span><span class="nb">dolist</span> <span class="p">(</span><span class="nv">callback</span> <span class="nv">callbacks</span><span class="p">)</span>
        <span class="p">(</span><span class="nv">run-at-time</span> <span class="mi">0</span> <span class="no">nil</span> <span class="nv">callback</span> <span class="nv">value-function</span><span class="p">)))))</span>
</code></pre></div></div>

<p>If you’re not an async function, you might subscribe to a promise like
so:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-listen</span> <span class="nv">promise</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">v</span><span class="p">)</span>
                      <span class="p">(</span><span class="nv">message</span> <span class="s">"%s"</span> <span class="p">(</span><span class="nb">funcall</span> <span class="nv">v</span><span class="p">))))</span>
</code></pre></div></div>

<p>The simplest example of a non-async function that creates and delivers
on a promise is a “sleep” function:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-sleep</span> <span class="p">(</span><span class="nv">seconds</span> <span class="k">&amp;optional</span> <span class="nv">result</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">promise</span> <span class="p">(</span><span class="nv">aio-promise</span><span class="p">))</span>
        <span class="p">(</span><span class="nv">value-function</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="nv">result</span><span class="p">)))</span>
    <span class="p">(</span><span class="nb">prog1</span> <span class="nv">promise</span>
      <span class="p">(</span><span class="nv">run-at-time</span> <span class="nv">seconds</span> <span class="no">nil</span>
                   <span class="nf">#'</span><span class="nv">aio-resolve</span> <span class="nv">promise</span> <span class="nv">value-function</span><span class="p">))))</span>
</code></pre></div></div>

<p>Similarly, here’s a “timeout” promise that delivers a special timeout
error signal at a given time in the future.</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-timeout</span> <span class="p">(</span><span class="nv">seconds</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">promise</span> <span class="p">(</span><span class="nv">aio-promise</span><span class="p">))</span>
        <span class="p">(</span><span class="nv">value-function</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="p">(</span><span class="nb">signal</span> <span class="ss">'aio-timeout</span> <span class="no">nil</span><span class="p">))))</span>
    <span class="p">(</span><span class="nb">prog1</span> <span class="nv">promise</span>
      <span class="p">(</span><span class="nv">run-at-time</span> <span class="nv">seconds</span> <span class="no">nil</span>
                   <span class="nf">#'</span><span class="nv">aio-resolve</span> <span class="nv">promise</span> <span class="nv">value-function</span><span class="p">))))</span>
</code></pre></div></div>

<p>That’s all there is to promises.</p>

<h3 id="evaluate-in-the-context-of-a-promise">Evaluate in the context of a promise</h3>

<p>Before we get into pausing functions, let’s deal with the slightly
simpler matter of delivering their return values using a promise. What
we need is a way to evaluate a “body” and capture its result in a
promise. If the body exits due to a signal, we want to capture that as
well.</p>

<p>Here’s a macro that does just this:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defmacro</span> <span class="nv">aio-with-promise</span> <span class="p">(</span><span class="nv">promise</span> <span class="k">&amp;rest</span> <span class="nv">body</span><span class="p">)</span>
  <span class="o">`</span><span class="p">(</span><span class="nv">aio-resolve</span> <span class="o">,</span><span class="nv">promise</span>
                <span class="p">(</span><span class="nv">condition-case</span> <span class="nb">error</span>
                    <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">result</span> <span class="p">(</span><span class="k">progn</span> <span class="o">,@</span><span class="nv">body</span><span class="p">)))</span>
                      <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="nv">result</span><span class="p">))</span>
                  <span class="p">(</span><span class="nb">error</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
                           <span class="p">(</span><span class="nb">signal</span> <span class="p">(</span><span class="nb">car</span> <span class="nb">error</span><span class="p">)</span> <span class="c1">; rethrow</span>
                                   <span class="p">(</span><span class="nb">cdr</span> <span class="nb">error</span><span class="p">)))))))</span>
</code></pre></div></div>

<p>The body result is captured in a closure and delivered to the promise.
If there’s an error signal, it’s “<em>rethrown</em>” into subscribers by the
promise’s value function.</p>

<p>This is where Emacs Lisp has a serious weak spot. There’s not really a
concept of rethrowing a signal. Unlike a language with explicit
exception objects that can capture a snapshot of the backtrace, the
original backtrace is completely lost where the signal is caught.
There’s no way to “reattach” it to the signal when it’s rethrown. This
is unfortunate because it would greatly help debugging if you got to see
the full backtrace on the other side of the promise.</p>

<h3 id="async-functions">Async functions</h3>

<p>So we have promises and we want to pause a function on a promise.
Generators have <code class="language-plaintext highlighter-rouge">iter-yield</code> for pausing an iterator’s execution. To
tackle this problem:</p>

<ol>
  <li>Yield the promise to pause the iterator.</li>
  <li>Subscribe a callback on the promise that continues the generator
(<code class="language-plaintext highlighter-rouge">iter-next</code>) with the promise’s result as the yield result.</li>
</ol>

<p>All the hard work is done in either side of the yield, so <code class="language-plaintext highlighter-rouge">aio-await</code> is
just a simple wrapper around <code class="language-plaintext highlighter-rouge">iter-yield</code>:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defmacro</span> <span class="nv">aio-await</span> <span class="p">(</span><span class="nv">expr</span><span class="p">)</span>
  <span class="o">`</span><span class="p">(</span><span class="nb">funcall</span> <span class="p">(</span><span class="nv">iter-yield</span> <span class="o">,</span><span class="nv">expr</span><span class="p">)))</span>
</code></pre></div></div>

<p>Remember: that <code class="language-plaintext highlighter-rouge">funcall</code> is there to extract the promise value from the
value function. If the value function signals an error, it propagates directly into
the iterator just as if it had been a direct call — minus an accurate
backtrace.</p>
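
<p>Concretely, an await such as <code class="language-plaintext highlighter-rouge">(aio-await (aio-sleep 1))</code> expands to:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>;; iter-yield pauses the iterator, yielding the promise to whoever
;; drives it; when resumed, iter-yield returns the value function,
;; which funcall unwraps
(funcall (iter-yield (aio-sleep 1)))
</code></pre></div></div>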

<p>So <code class="language-plaintext highlighter-rouge">aio-lambda</code> / <code class="language-plaintext highlighter-rouge">aio-defun</code> needs to wrap the body in a generator
function (<code class="language-plaintext highlighter-rouge">iter-lambda</code>), invoke it to produce an iterator, then drive the
iterator using callbacks. Here’s a simplified, unhygienic definition of
<code class="language-plaintext highlighter-rouge">aio-lambda</code>:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defmacro</span> <span class="nv">aio-lambda</span> <span class="p">(</span><span class="nv">arglist</span> <span class="k">&amp;rest</span> <span class="nv">body</span><span class="p">)</span>
  <span class="o">`</span><span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="k">&amp;rest</span> <span class="nv">args</span><span class="p">)</span>
     <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">promise</span> <span class="p">(</span><span class="nv">aio-promise</span><span class="p">))</span>
           <span class="p">(</span><span class="nv">iter</span> <span class="p">(</span><span class="nb">apply</span> <span class="p">(</span><span class="nv">iter-lambda</span> <span class="o">,</span><span class="nv">arglist</span>
                          <span class="p">(</span><span class="nv">aio-with-promise</span> <span class="nv">promise</span>
                            <span class="o">,@</span><span class="nv">body</span><span class="p">))</span>
                        <span class="nv">args</span><span class="p">)))</span>
       <span class="p">(</span><span class="nb">prog1</span> <span class="nv">promise</span>
         <span class="p">(</span><span class="nv">aio--step</span> <span class="nv">iter</span> <span class="nv">promise</span> <span class="no">nil</span><span class="p">)))))</span>
</code></pre></div></div>

<p>The body is evaluated inside <code class="language-plaintext highlighter-rouge">aio-with-promise</code> with the result
delivered to the promise returned directly by the async function.</p>

<p>Before returning, the iterator is handed to <code class="language-plaintext highlighter-rouge">aio--step</code>, which drives
the iterator forward until it delivers its first promise. When the
iterator yields a promise, <code class="language-plaintext highlighter-rouge">aio--step</code> attaches a callback back to
itself on the promise as described above. Immediately driving the
iterator up to the first yielded promise “primes” it, which is
important for getting the ball rolling on any asynchronous operations.</p>

<p>If the iterator ever yields something other than a promise, it’s
delivered right back into the iterator.</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio--step</span> <span class="p">(</span><span class="nv">iter</span> <span class="nv">promise</span> <span class="nv">yield-result</span><span class="p">)</span>
  <span class="p">(</span><span class="nv">condition-case</span> <span class="nv">_</span>
      <span class="p">(</span><span class="nv">cl-loop</span> <span class="nv">for</span> <span class="nv">result</span> <span class="nb">=</span> <span class="p">(</span><span class="nv">iter-next</span> <span class="nv">iter</span> <span class="nv">yield-result</span><span class="p">)</span>
               <span class="nv">then</span> <span class="p">(</span><span class="nv">iter-next</span> <span class="nv">iter</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="nv">result</span><span class="p">))</span>
               <span class="nv">until</span> <span class="p">(</span><span class="nv">aio-promise-p</span> <span class="nv">result</span><span class="p">)</span>
               <span class="nv">finally</span> <span class="p">(</span><span class="nv">aio-listen</span> <span class="nv">result</span>
                                   <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">value</span><span class="p">)</span>
                                     <span class="p">(</span><span class="nv">aio--step</span> <span class="nv">iter</span> <span class="nv">promise</span> <span class="nv">value</span><span class="p">))))</span>
    <span class="p">(</span><span class="nv">iter-end-of-sequence</span><span class="p">)))</span>
</code></pre></div></div>

<p>When the iterator is done, nothing more needs to happen since the
iterator resolves its own return value promise.</p>

<p>The definition of <code class="language-plaintext highlighter-rouge">aio-defun</code> just uses <code class="language-plaintext highlighter-rouge">aio-lambda</code> with <code class="language-plaintext highlighter-rouge">defalias</code>.
There’s nothing to it.</p>
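
<p>A rough, unhygienic sketch of what that looks like (docstring
handling omitted):</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>;; Sketch only: bind NAME to the promise-returning function
(defmacro aio-defun (name arglist &amp;rest body)
  `(defalias ',name (aio-lambda ,arglist ,@body)))
</code></pre></div></div>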

<p>That’s everything you need! Everything else in the package is merely
useful, awaitable functions like <code class="language-plaintext highlighter-rouge">aio-sleep</code> and <code class="language-plaintext highlighter-rouge">aio-timeout</code>.</p>
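
<p>To see all the pieces working together, here’s a tiny async function
built entirely from the machinery above (the name <code class="language-plaintext highlighter-rouge">countdown</code> is made
up for this example):</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(aio-defun countdown (n)
  (dotimes (i n)
    (aio-await (aio-sleep 1))
    (message "tick %d" (1+ i))))

;; Returns a promise immediately; messages arrive once per second
(countdown 3)
</code></pre></div></div>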

<h3 id="composing-promises">Composing promises</h3>

<p>Unfortunately <code class="language-plaintext highlighter-rouge">url-retrieve</code> doesn’t support timeouts. We can work
around this by composing two promises: a <code class="language-plaintext highlighter-rouge">url-retrieve</code> promise and an
<code class="language-plaintext highlighter-rouge">aio-timeout</code> promise. First define a promise-returning function,
<code class="language-plaintext highlighter-rouge">aio-select</code>, that takes a list of promises and returns (as another
promise) the first promise to resolve:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nb">defun</span> <span class="nv">aio-select</span> <span class="p">(</span><span class="nv">promises</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">result</span> <span class="p">(</span><span class="nv">aio-promise</span><span class="p">)))</span>
    <span class="p">(</span><span class="nb">prog1</span> <span class="nv">result</span>
      <span class="p">(</span><span class="nb">dolist</span> <span class="p">(</span><span class="nv">promise</span> <span class="nv">promises</span><span class="p">)</span>
        <span class="p">(</span><span class="nv">aio-listen</span> <span class="nv">promise</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">_</span><span class="p">)</span>
                              <span class="p">(</span><span class="nv">aio-resolve</span>
                               <span class="nv">result</span>
                               <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span> <span class="nv">promise</span><span class="p">))))))))</span>
</code></pre></div></div>

<p>We give <code class="language-plaintext highlighter-rouge">aio-select</code> both our <code class="language-plaintext highlighter-rouge">url-retrieve</code> and <code class="language-plaintext highlighter-rouge">aio-timeout</code> promises, and
it tells us which resolved first:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">fetch-fortune-4</span> <span class="p">(</span><span class="nv">url</span> <span class="nv">timeout</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let*</span> <span class="p">((</span><span class="nv">promises</span> <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nv">aio-url-retrieve</span> <span class="nv">url</span><span class="p">)</span>
                         <span class="p">(</span><span class="nv">aio-timeout</span> <span class="nv">timeout</span><span class="p">)))</span>
         <span class="p">(</span><span class="nv">fastest</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">aio-select</span> <span class="nv">promises</span><span class="p">)))</span>
         <span class="p">(</span><span class="nv">buffer</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="nv">fastest</span><span class="p">)))</span>
    <span class="p">(</span><span class="nv">with-current-buffer</span> <span class="nv">buffer</span>
      <span class="p">(</span><span class="nb">prog1</span> <span class="p">(</span><span class="nv">buffer-string</span><span class="p">)</span>
        <span class="p">(</span><span class="nv">kill-buffer</span><span class="p">)))))</span>
</code></pre></div></div>

<p>Cool! Note: this will not actually cancel the URL request; it just
resumes the async function earlier and prevents it from ever receiving
the result.</p>

<h3 id="threads">Threads</h3>

<p>Despite <code class="language-plaintext highlighter-rouge">aio</code> being entirely about managing concurrent, asynchronous
operations, it has nothing at all to do with threads — as in Emacs 26’s
support for kernel threads. All async functions and promise callbacks
are expected to run <em>only</em> on the main thread. That’s not to say an
async function can’t await on a result from another thread. It just must
be <a href="/blog/2017/02/14/">done very carefully</a>.</p>

<h3 id="processes">Processes</h3>

<p>The package also includes two functions for realizing promises on
processes, whether they be subprocesses or network sockets.</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">aio-process-filter</code></li>
  <li><code class="language-plaintext highlighter-rouge">aio-process-sentinel</code></li>
</ul>

<p>For example, this function loops over each chunk of output (typically
4kB) from the process, as delivered to a filter function:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">process-chunks</span> <span class="p">(</span><span class="nv">process</span><span class="p">)</span>
  <span class="p">(</span><span class="nv">cl-loop</span> <span class="nv">for</span> <span class="nv">chunk</span> <span class="nb">=</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">aio-process-filter</span> <span class="nv">process</span><span class="p">))</span>
           <span class="nv">while</span> <span class="nv">chunk</span>
           <span class="nb">do</span> <span class="p">(</span><span class="o">...</span> <span class="nv">process</span> <span class="nv">chunk</span> <span class="o">...</span><span class="p">)))</span>
</code></pre></div></div>

<p>Exercise for the reader: Write an awaitable function that returns a line
at a time rather than a chunk at a time. You can build it on top of
<code class="language-plaintext highlighter-rouge">aio-process-filter</code>.</p>
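
<p>One possible sketch: buffer the chunks in a variable and split on
newlines. The buffer variable and function name here are made up, and
it assumes the process produces newline-terminated output:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(defvar aio--line-buffer "")

(aio-defun aio-process-line (process)
  ;; Keep awaiting chunks until the buffer contains a newline
  (while (not (string-match "\n" aio--line-buffer))
    (setf aio--line-buffer
          (concat aio--line-buffer
                  (aio-await (aio-process-filter process)))))
  ;; Return everything before the newline, keep the rest buffered
  (prog1 (substring aio--line-buffer 0 (match-beginning 0))
    (setf aio--line-buffer
          (substring aio--line-buffer (match-end 0)))))
</code></pre></div></div>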

<p>I considered wrapping functions like <code class="language-plaintext highlighter-rouge">start-process</code> so that their <code class="language-plaintext highlighter-rouge">aio</code>
versions would return a promise representing some kind of result from
the process. However there are <em>so</em> many different ways to create and
configure processes that I would have ended up duplicating all the
process functions. Focusing on the filter and sentinel, and letting the
caller create and configure the process is much cleaner.</p>

<p>Unfortunately Emacs has no asynchronous API for writing output to a
process. Both <code class="language-plaintext highlighter-rouge">process-send-string</code> and <code class="language-plaintext highlighter-rouge">process-send-region</code> will block
if the pipe or socket is full. There is no callback, so you cannot await
on writing output. Maybe there’s a way to do it with a dedicated thread?</p>

<p>Another issue is that the <code class="language-plaintext highlighter-rouge">process-send-*</code> functions <a href="/blog/2013/01/14/">are
preemptible</a>, made necessary because they block. The
<code class="language-plaintext highlighter-rouge">aio-process-*</code> functions leave a gap (i.e. between filter awaits)
where no filter or sentinel function is attached. It’s a consequence
of promises being single-fire. The gap is harmless so long as the
async function doesn’t await something else or get preempted. This
needs some more thought.</p>

<p><strong><em>Update</em></strong>: These process functions no longer exist and have been
replaced by a small framework for building chains of promises. See
<code class="language-plaintext highlighter-rouge">aio-make-callback</code>.</p>

<h3 id="testing-aio">Testing aio</h3>

<p>The test suite for <code class="language-plaintext highlighter-rouge">aio</code> is a bit unusual. Emacs’ built-in test framework,
ERT, doesn’t support asynchronous tests. Furthermore, tests are
generally run in batch mode, where Emacs invokes a single function and
then exits rather than pump an event loop. Batch mode can only handle
asynchronous process I/O, not the async functions of <code class="language-plaintext highlighter-rouge">aio</code>. So it’s
not possible to run the tests in batch mode.</p>

<p>Instead I hacked together a really crude callback-based test suite. It
runs in non-batch mode and writes the test results into a buffer
(run with <code class="language-plaintext highlighter-rouge">make check</code>). Not ideal, but it works.</p>

<p>One of the tests is a sleep sort (with reasonable tolerances). It’s a
pretty neat demonstration of what you can do with <code class="language-plaintext highlighter-rouge">aio</code>:</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">sleep-sort</span> <span class="p">(</span><span class="nb">values</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nv">promises</span> <span class="p">(</span><span class="nb">mapcar</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">v</span><span class="p">)</span> <span class="p">(</span><span class="nv">aio-sleep</span> <span class="nv">v</span> <span class="nv">v</span><span class="p">))</span> <span class="nb">values</span><span class="p">)))</span>
    <span class="p">(</span><span class="nv">cl-loop</span> <span class="nv">while</span> <span class="nv">promises</span>
             <span class="nv">for</span> <span class="nv">next</span> <span class="nb">=</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">aio-select</span> <span class="nv">promises</span><span class="p">))</span>
             <span class="nb">do</span> <span class="p">(</span><span class="nb">setf</span> <span class="nv">promises</span> <span class="p">(</span><span class="nv">delq</span> <span class="nv">next</span> <span class="nv">promises</span><span class="p">))</span>
             <span class="nv">collect</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="nv">next</span><span class="p">))))</span>
</code></pre></div></div>

<p>To see it in action (<code class="language-plaintext highlighter-rouge">M-x sleep-sort-demo</code>):</p>

<div class="language-cl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nv">aio-defun</span> <span class="nv">sleep-sort-demo</span> <span class="p">()</span>
  <span class="p">(</span><span class="nv">interactive</span><span class="p">)</span>
  <span class="p">(</span><span class="k">let</span> <span class="p">((</span><span class="nb">values</span> <span class="o">'</span><span class="p">(</span><span class="mf">0.1</span> <span class="mf">0.4</span> <span class="mf">1.1</span> <span class="mf">0.2</span> <span class="mf">0.8</span> <span class="mf">0.6</span><span class="p">)))</span>
    <span class="p">(</span><span class="nv">message</span> <span class="s">"%S"</span> <span class="p">(</span><span class="nv">aio-await</span> <span class="p">(</span><span class="nv">sleep-sort</span> <span class="nb">values</span><span class="p">)))))</span>
</code></pre></div></div>

<h3 id="asyncawait-is-pretty-awesome">Async/await is pretty awesome</h3>

<p>I’m quite happy with how this all came together. Once I had the
concepts straight — particularly resolving to value functions —
everything made sense and all the parts fit together well, and mostly
by accident. That feels good.</p>

]]>
    </content>
  </entry>
</feed>
