Articles tagged elisp at null program

A Makefile for Emacs Packages

2020-01-22T02:54:41Z

Each of my Emacs packages has a Makefile to byte-compile all source files, run the tests, build a package file, and, in some cases, run the package in an interactive, temporary, isolated Emacs instance. These portable Makefiles have a similar structure and follow the same conventions. It would require more thought and feedback before I’d try to make it a standard, but these are conventions I’d like to see in other package Makefiles.

Here’s an incomplete list of examples:

You should make a habit of compiling your Emacs Lisp files even if you don’t think you need the performance. The byte-compiler, while dumb, does static analysis and may spot bugs and other issues early.

First things first: Every portable Makefile starts with a special target, .POSIX, to request standard behavior. This is followed by macro definitions. When compiling a C program, the CC macro is the name of the compiler. Analogously, when compiling Emacs packages the EMACS macro is the name of the Emacs program.

.POSIX:
EMACS = emacs

Users can now override the macro to specify alternate Emacs binaries. I use this all the time to test my packages under different versions of Emacs.

$ make clean
$ make EMACS=emacs-24.3 check
$ make clean
$ make EMACS=emacs-25.1 check

Note: It’s common to use ?= assignment here, but that is both non-standard and unnecessary. If you want to override macro definitions from the environment, use the -e option:

$ export EMACS=emacs-24.3
$ make -e

The first non-special target in the Makefile is the default target. For Emacs packages, this target should byte-compile all the source files, including tests. List the byte-compiled file names as the target dependencies:

compile: foo.elc foo-test.elc

Now for the tedious part: Define the dependencies between your different source files. It would be nice to automate this part somehow, but fortunately most packages just aren’t that complicated. You do not need to list trivial dependencies — i.e. mapping each .el file to its .elc file — since make will figure that out on its own.

Since foo-test.elc relies on foo.elc — it’s testing this file after all — the relationship must be indicated to make. For single file packages (one package file, one test file), this is all that’s needed:

foo-test.elc: foo.elc

I call my testing targets “check” and this target must depend on the byte-compiled files containing tests. It will transitively depend on the other package source files because of the previous section.

check: foo-test.elc
    $(EMACS) -Q --batch -L . -l foo-test.elc -f ert-run-tests-batch

The -Q option runs Emacs with “minimum customizations.” The -L . option puts the current directory in the load path so that (require 'foo) will work. Finally it loads the file containing the tests and instructs ERT to run all defined tests.
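To make that last step concrete, here is a minimal sketch of what a foo-test.el might contain. The function foo-double and the test name are hypothetical, invented for illustration. In a real package the function would live in foo.el and the test file would begin with (require 'foo), which is exactly why foo-test.elc depends on foo.elc.

```elisp
;;; foo-test.el --- hypothetical tests for a hypothetical package -*- lexical-binding: t; -*-

(require 'ert)

;; Assumption: this definition stands in for the package itself.  In a
;; real project it would be in foo.el, and this file would instead say
;; (require 'foo).  It is inlined here so the sketch runs on its own.
(defun foo-double (n)
  "Return N multiplied by two."
  (* 2 n))

;; Each ert-deftest form registers one test with ERT.
(ert-deftest foo-double-test ()
  "Check `foo-double' on a couple of inputs."
  (should (= 0 (foo-double 0)))
  (should (= 42 (foo-double 21))))
```

Loading a file like this only defines the tests. Running them is the job of the -f ert-run-tests-batch part of the “check” recipe.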

A good build can clean up after itself:

clean:
    rm -f foo.elc foo-test.elc

Finally we need one more thing to tie it all together: an inference rule to teach make how to compile .elc files from .el files.

.SUFFIXES: .el .elc
.el.elc:
    $(EMACS) -Q --batch -L . -f batch-byte-compile $<

This is similar to the “check” target, but compiles a source file instead of running tests.

For simple, single source file packages, this is all you need!

Complex packages

My most complex package is Elfeed which has 10 source files and 4 test files. It also includes a target to build a package file, which I would upload to Marmalade when it was still functioning. I did a few extra things to keep this tidy.

First, I define the package version in the Makefile:

VERSION = 1.2.3

It would be nice to grab this information from a reliable place (Git tag, source file, etc.), but I never found a reliable and satisfactory way to do this. Simple wins.

To avoid repeating myself, I list the source files in a macro as well:

EL   = foo-a.el foo-b.el foo-c.el
DOC  = README.md
TEST = foo-test.el

These will still need to have all their interdependencies individually defined for make. For example, if C depends on both A and B, but neither A nor B depend on each other, this is all you’d need:

foo-c.elc: foo-a.elc foo-b.elc

Done correctly you can perform parallel builds with the non-standard but common -j make option. This is pretty nice since Emacs can’t do parallel builds itself.

I use the file list macros in the “compile” and “check” targets:

compile: $(EL:.el=.elc) $(TEST:.el=.elc)
check: $(TEST:.el=.elc)

The “package” target copies everything under a directory and tars it up. The directory is removed first, if it exists, so that any potential leftover garbage doesn’t get included.

package: foo-$(VERSION).tar
foo-$(VERSION).tar: $(EL) $(DOC)
    rm -rf foo-$(VERSION)/
    mkdir foo-$(VERSION)/
    cp $(EL) $(DOC) foo-$(VERSION)/
    tar cf $@ foo-$(VERSION)/
    rm -rf foo-$(VERSION)/

In Elfeed, the target to test in an interactive, temporary Emacs instance is called “virtual”. In Skewer it’s called “run”. The name of the target and the specific rules will depend on the package, should you even want this target at all. It’s handy to have the option to test without my own configuration contaminating Emacs, and vice versa. When people report issues, I can also direct them to reproduce their issue in the clean environment.

Here’s what a simple “run” target might look like:

run: $(EL:.el=.elc)
    $(EMACS) -Q -L . -l foo-c.elc -f foo-mode

Make is not really designed to run interactive programs like this, but it works in practice.

Dependencies

What about packages with dependencies? I’ve used Cask in the past but was never satisfied, especially when integrating it into a Makefile. So, again, I’ve opted for the dumb-but-reliable option: request that dependencies are cloned in adjacent directories matching the dependency’s package name. For example, the EmacSQL Makefile header:

# Clone the dependencies of this package in sibling directories:
#     $ git clone https://github.com/cbbrowne/pg.el ../pg

I also define a new “linker flags” macro, LDFLAGS. Like with EMACS, this lets users override it if needed:

LDFLAGS = -L ../pg

Everywhere I use -L . I also include $(LDFLAGS). For example, in the inference rule:

.SUFFIXES: .el .elc
.el.elc:
    $(EMACS) -Q --batch -L . $(LDFLAGS) -f batch-byte-compile $<

If the dependencies follow these conventions, then these can also be compiled in a recursive way with little effort:

$ make -C ../pg

I’m not completely satisfied with this solution, particularly since it’s an odd burden on anyone using the Makefile, but it’s worked well enough for my needs. This is when I wish Emacs had distributed package management.

Efficient Alias of a Built-In Emacs Lisp Function

2019-12-10T02:32:04Z

Suppose you don’t like the names car and cdr, the traditional identifiers for two halves of a lisp cons cell. This is misguided. A cons is really just a 2-tuple, and the halves don’t have any particular meaning on their own, even as “head” and “tail.” However, maybe this is really important to you so you want to do it anyway. What’s the best way to go about it?

defalias

Emacs Lisp has a built-in function just for this, defalias, which is the obvious choice.

(defalias 'car-alias #'car)

The car built-in function is so fundamental to the language that it gets its own byte-code opcode. When you call car in your code, the byte-compiler doesn’t generate a function call, but instead uses a single instruction. For example, here’s an add function that sums the car of its two arguments. I’ve followed the definition with its disassembly (Emacs 26.3, lexical scope):

(defun add (a b)
  (+ (car a) (car b)))
;; 0       stack-ref 1
;; 1       car
;; 2       stack-ref 1
;; 3       car
;; 4       plus
;; 5       return

There are zero function calls because of the dedicated car opcode, and it has the optimal six byte-code instructions.

The problem with defalias is that the definition is permitted to change (or be advised), and that robs the byte-compiler of optimization opportunities. It’s a constraint. When the byte-code compiler sees car-alias, it must emit a function call:

(defun add-alias (a b)
  (+ (car-alias a) (car-alias b)))
;; 0       constant  car-alias
;; 1       stack-ref 2
;; 2       call      1
;; 3       constant  car-alias
;; 4       stack-ref 2
;; 5       call      1
;; 6       plus
;; 7       return

This has two function calls and eight byte-code instructions. Those function calls are significantly more expensive than a car instruction, which will show in the benchmark later.

defsubst

An alternative is defsubst, an inlined function definition, which will inline an actual car. The semantics for defsubst are, like macros, explicit that re-definitions may not affect previous uses, so the constraint is gone. Unfortunately the byte-code compiler is pretty dumb, and does a poor job inlining car-subst.

(defsubst car-subst (x)
  (car x))

(defun add-subst (a b)
  (+ (car-subst a) (car-subst b)))
;; 0       stack-ref 1
;; 1       dup
;; 2       car
;; 3       stack-set 1
;; 5       stack-ref 1
;; 6       dup
;; 7       car
;; 8       stack-set 1
;; 10      plus
;; 11      return

There are zero function calls and ten byte-code instructions. The car opcode is in use, but there are four unnecessary instructions. This is still faster than making the function calls, though. If the byte-code compiler were just a little smarter and could compile this to the ideal case, then this would be the end of the discussion.

cl-first

The built-in cl-lib package has a cl-first alias for car. This was written by someone with intimate knowledge of Emacs Lisp, so how well did they do?

(require 'cl-lib)

(defun add-cl-first (a b)
  (+ (cl-first a) (cl-first b)))
;; 0       stack-ref 1
;; 1       car
;; 2       stack-ref 1
;; 3       car
;; 4       plus
;; 5       return

It’s just like plain old car! How did they manage this? By using a byte-compiler hint:

(defalias 'cl-first 'car)
(put 'cl-first 'byte-optimizer 'byte-compile-inline-expand)

They used defalias, but they also manually told the byte-compiler to inline the definition like defsubst. In fact, defsubst expands to an expression that sets byte-compile-inline-expand, but, as seen above, the inline function overhead gets inlined and doesn’t get eliminated.
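The same hint can be applied to an alias of your own. The following is a sketch rather than a recommendation: my-head and my-add-head are hypothetical names, the byte-optimizer property is an internal detail of the byte-compiler that could change between Emacs versions, and the hint has to be in place before the caller is compiled.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'benchmark)

;; Hypothetical alias for car, with the same hint cl-lib gives cl-first.
(defalias 'my-head #'car)
(put 'my-head 'byte-optimizer 'byte-compile-inline-expand)

(defun my-add-head (a b)
  "Sum the first elements of the lists A and B."
  (+ (my-head a) (my-head b)))

;; Compile only after the hint is set, so the compiler can see it.
(byte-compile 'my-add-head)

;; Time one million calls.  The car of the result is the running time.
(let ((a (list 1))
      (b (list 2)))
  (benchmark-run 1000000 (my-add-head a b)))
```

Evaluating (disassemble 'my-add-head) afterwards is a quick way to check whether the car opcode was actually used in your Emacs version.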

Benchmark

So how do the alternatives perform? (benchmark source)

add           (0.594811299 0 0.0)
add-alias     (1.232037132 0 0.0)
add-subst     (0.700044324 0 0.0)
add-cl-first  (0.58332882 0 0.0)

(The car of the list is the running time.) Since add and add-cl-first have the same byte-codes, we shouldn’t, and didn’t, see a significant difference. The simple use of defalias doubles the running time, and using defsubst is about 18% slower.

UTF-8 String Indexing Strategies

2019-05-29T21:52:06Z

This article was discussed on Hacker News.

When designing or, in some cases, implementing a programming language with built-in support for Unicode strings, an important decision must be made about how to represent or encode those strings in memory. Not all representations are equal, and there are trade-offs between different choices.

One issue to consider is that strings typically feature random access indexing of code points with a time complexity resembling constant time (O(1)). However, not all string representations actually support this well. Strings using variable length encoding, such as UTF-8 or UTF-16, have O(n) time complexity indexing, ignoring special cases (discussed below). The most obvious choice to achieve O(1) time complexity — an array of 32-bit values, as in UCS-4 — makes very inefficient use of memory, especially with typical strings.

Despite this, UTF-8 is still chosen in a number of programming languages, or at least in their implementations. In this article I’ll discuss three examples — Emacs Lisp, Julia, and Go — and how each takes a slightly different approach.

Emacs Lisp

Emacs Lisp has two different types of strings that generally can be used interchangeably: unibyte and multibyte. In fact, the difference between them is so subtle that I bet that most people writing Emacs Lisp don’t even realize there are two kinds of strings.

Emacs Lisp uses UTF-8 internally to encode all “multibyte” strings and buffers. To fully support arbitrary sequences of bytes in the files being edited, Emacs uses its own extension of Unicode to precisely and unambiguously represent raw bytes intermixed with text. Any arbitrary sequence of bytes can be decoded into Emacs’ internal representation, then losslessly re-encoded back into the exact same sequence of bytes.

Unibyte strings and buffers are really just byte-strings. In practice, they’re essentially ISO/IEC 8859-1, a.k.a. Latin-1. It’s a Unicode string where all code points are below 256. Emacs prefers the smallest and simplest string representation when possible, similar to CPython 3.3+.

(multibyte-string-p "hello")
;; => nil

(multibyte-string-p "π ≈ 3.14")
;; => t

Emacs Lisp strings are mutable, and therein lies the kicker: As soon as you insert a code point above 255, Emacs quietly converts the string to multibyte.

(defvar fish "fish")

(multibyte-string-p fish)
;; => nil

(setf (aref fish 2) ?ŝ
      (aref fish 3) ?o)

fish
;; => "fiŝo"

(multibyte-string-p fish)
;; => t

Constant time indexing into unibyte strings is straightforward, and Emacs does the obvious thing when indexing into unibyte strings. It helps that most strings in Emacs are probably unibyte, even when the user isn’t working in English.

Most buffers are multibyte, even if those buffers are generally just ASCII. Since Emacs uses gap buffers it generally doesn’t matter: Nearly all accesses are tightly clustered around the point, so O(n) indexing doesn’t often matter.

That leaves multibyte strings. Consider these idioms for iterating across a string in Emacs Lisp:

(dotimes (i (length string))
  (let ((c (aref string i)))
    ...))

(cl-loop for c being the elements of string
         ...)

The latter expands into essentially the same as the former: An incrementing index that uses aref to index to that code point. So is iterating over a multibyte string — a common operation — an O(n^2) operation?

The good news is that, at least in this case, no! It’s essentially just as efficient as iterating over a unibyte string. Before going over why, consider this little puzzle. Here’s a little string comparison function that compares two strings a code point at a time, returning their first difference:

(defun compare (string-a string-b)
  (cl-loop for a being the elements of string-a
           for b being the elements of string-b
           unless (eql a b)
           return (cons a b)))

Let’s examine benchmarks with some long strings (100,000 code points):

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (compare a b)))
;; => (0.012568031 0 0.0)

Using two zeroed unibyte strings, it takes 13ms. How about changing the last code point in one of them to 256, converting it to a multibyte string:

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (setf (aref a (1- (length a))) 256)
      (compare a b)))
;; => (0.012680513 0 0.0)

Same running time, so that multibyte string cost nothing more to iterate across. Let’s try making them both multibyte:

(benchmark-run
    (let ((a (make-string 100000 0))
          (b (make-string 100000 0)))
      (setf (aref a (1- (length a))) 256
            (aref b (1- (length b))) 256)
      (compare a b)))
;; => (2.327959762 0 0.0)

That took 2.3 seconds: about 180x longer to run! Iterating over two multibyte strings concurrently seems to have broken an optimization. Can you reason about what’s happened?

To avoid the O(n) cost on this common indexing operation, Emacs keeps a “bookmark” for the last indexing location into a multibyte string. If the next access is nearby, it can start looking from this bookmark, forwards or backwards. Like a gap buffer, this gives a big advantage to clustered accesses, including iteration.

However, this string bookmark is global, one per Emacs instance, not one per string. In the last benchmark, the two multibyte strings are constantly fighting over a single string bookmark, and indexing in the comparison function is reduced to O(n^2) time complexity.
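To see why one shared bookmark hurts, here is a toy model of the idea, not Emacs’ actual implementation. It assumes a “string” is a unibyte string of UTF-8 bytes, it only ever scans forward (the real bookmark can search in both directions), and every demo- name is hypothetical. It counts bytes skipped instead of measuring time.

```elisp
(defvar demo-bookmark nil
  "The single global bookmark: (BYTES CHAR-INDEX . BYTE-OFFSET), or nil.")

(defvar demo-steps 0
  "Number of bytes skipped while scanning, to make the cost visible.")

(defun demo-char-to-byte (bytes index)
  "Return the byte offset of code point INDEX in the UTF-8 data BYTES."
  (let ((char 0)
        (byte 0))
    ;; Resume from the bookmark only if it belongs to this very string
    ;; and does not lie past the target.
    (when (and demo-bookmark
               (eq bytes (car demo-bookmark))
               (<= (cadr demo-bookmark) index))
      (setq char (cadr demo-bookmark)
            byte (cddr demo-bookmark)))
    (while (< char index)
      ;; Skip the lead byte, then any continuation bytes (10xxxxxx).
      (setq byte (1+ byte)
            demo-steps (1+ demo-steps))
      (while (and (< byte (length bytes))
                  (= #x80 (logand #xC0 (aref bytes byte))))
        (setq byte (1+ byte)
              demo-steps (1+ demo-steps)))
      (setq char (1+ char)))
    ;; Every access overwrites the one and only bookmark.
    (setq demo-bookmark (cons bytes (cons index byte)))
    byte))

(defun demo-walk (n &rest strings)
  "Index code points 0..N-1 of each of STRINGS in lockstep.
Return the total number of bytes skipped."
  (setq demo-bookmark nil
        demo-steps 0)
  (dotimes (i n)
    (dolist (s strings)
      (demo-char-to-byte s i)))
  demo-steps)

(let* ((n 200)
       (a (encode-coding-string (make-string n ?π) 'utf-8))
       (b (copy-sequence a)))
  (list (demo-walk n a)      ; one string: the bookmark always helps
        (demo-walk n a b)))  ; two strings: each access evicts the other
```

Walking one string skips a number of bytes proportional to its length. Walking two in lockstep makes every access restart from the beginning, so the total grows quadratically, which is the effect seen in the last benchmark.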

So, Emacs pretends it has constant time access into its UTF-8 text data, but it’s only faking it with some simple optimizations. This usually works out just fine.

Julia

Another approach is to not pretend at all, and to make this limitation of UTF-8 explicit in the interface. Julia took this approach, and it was one of my complaints about the language. I don’t think this is necessarily a bad choice, but I do still think it’s inappropriate considering Julia’s target audience (i.e. Matlab users).

Julia strings are explicitly byte strings containing valid UTF-8 data. All indexing occurs on bytes, which is trivially constant time, and always decodes the multibyte code point starting at that byte. But it is an error to index to a byte that doesn’t begin a code point. That error is also trivially checked in constant time.

s = "π"

s[1]
# => 'π'

s[2]
# ERROR: UnicodeError: invalid character index
#  in getindex at ./strings/basic.jl:37

Slices are still over bytes, but they “round up” to the end of the current code point:

s[1:1]
# => "π"

Iterating over a string requires helper functions which keep an internal “bookmark” so that each access is constant time:

for i in eachindex(string)
    c = string[i]
    # ...
end

So Julia doesn’t pretend, it makes the problem explicit.

Go

Go is very similar to Julia, but takes an even more explicit view of strings. All strings are byte strings and there are no restrictions on their contents. Conventionally strings contain UTF-8 encoded text, but this is not strictly required. There’s a unicode/utf8 package for working with strings containing UTF-8 data.

Beyond convention, the range clause also assumes the string contains UTF-8 data, and it’s not an error if it does not. Bytes not containing valid UTF-8 data appear as a REPLACEMENT CHARACTER (U+FFFD).

func main() {
    s := "π\xff"
    for _, r := range s {
        fmt.Printf("U+%04x\n", r)
    }
}

// U+03c0
// U+fffd

A further case of the language favoring UTF-8 is that casting a string to []rune decodes strings into code points, like UCS-4, again using REPLACEMENT CHARACTER:

func main() {
    s := "π\xff"
    r := []rune(s)
    fmt.Printf("U+%04x\n", r[0])
    fmt.Printf("U+%04x\n", r[1])
}

// U+03c0
// U+fffd

So, like Julia, there’s no pretending, and the programmer explicitly must consider the problem.

Preferences

All-in-all I probably prefer how Julia and Go are explicit with UTF-8’s limitations, rather than Emacs Lisp’s attempt to cover it up with an internal optimization. Since the abstraction is leaky, it may as well be made explicit.

An Async / Await Library for Emacs Lisp

2019-03-10T20:57:03Z

As part of building my Python proficiency, I’ve learned how to use asyncio. This new language feature first appeared in Python 3.5 (PEP 492, September 2015). JavaScript grew a nearly identical feature in ES2017 (June 2017). An async function can pause to await on an asynchronously computed result, much like a generator pausing when it yields a value.

In fact, both Python and JavaScript async functions are essentially just fancy generator functions with some specialized syntax and semantics. That is, they’re stackless coroutines. Both languages already had generators, so their generator-like async functions are a natural extension that — unlike stackful coroutines — do not require significant, new runtime plumbing.

Emacs officially got generators in 25.1 (September 2016), though, unlike Python and JavaScript, it didn’t require any additional support from the compiler or runtime. It’s implemented entirely using Lisp macros. In other words, it’s just another library, not a core language feature. In theory, the generator library could be easily backported to the first Emacs release to properly support lexical closures, Emacs 24.1 (June 2012).

For the same reason, stackless async/await coroutines can also be implemented as a library. So that’s what I did, letting Emacs’ generator library do most of the heavy lifting. The package is called aio:

https://github.com/skeeto/emacs-aio

It’s modeled more closely on JavaScript’s async functions than Python’s asyncio, with the core representation being promises rather than coroutine objects. I just have an easier time reasoning about promises than coroutines.

I’m definitely not the first person to realize this was possible, and was beaten to the punch by two years. Wanting to avoid fragmentation, I set aside all formality in my first iteration on the idea, not even bothering with namespacing my identifiers. It was to be only an educational exercise. However, I got quite attached to my little toy. Once I got my head wrapped around the problem, everything just sort of clicked into place so nicely.

In this article I will show step-by-step one way to build async/await on top of generators, laying out one concept at a time and then building upon each. But first, some examples to illustrate the desired final result.

aio example

Ignoring all its problems for a moment, suppose you want to use url-retrieve to fetch some content from a URL and return it. To keep this simple, I’m going to omit error handling. Also assume that lexical-binding is t for all examples. Besides, lexical scope is required by the generator library, and therefore also required by aio.

The most naive approach is to fetch the content synchronously:

(defun fetch-fortune-1 (url)
  (let ((buffer (url-retrieve-synchronously url)))
    (with-current-buffer buffer
      (prog1 (buffer-string)
        (kill-buffer)))))

The result is returned directly, and errors are communicated by an error signal (i.e. Emacs’ version of exceptions). This is convenient, but the function will block the main thread, locking up Emacs until the result has arrived. This is obviously very undesirable, so, in practice, everyone nearly always uses the asynchronous version:

(defun fetch-fortune-2 (url callback)
  (url-retrieve url (lambda (_status)
                      (funcall callback (buffer-string)))))

The main thread no longer blocks, but it’s a whole lot less convenient. The result isn’t returned to the caller, and instead the caller supplies a callback function. The result, whether success or failure, will be delivered via callback, so the caller must split itself into two pieces: the part before the callback and the callback itself. Errors cannot be delivered using an error signal because of the inverted flow control.

The situation gets worse if, say, you need to fetch results from two different URLs. You either fetch results one at a time (inefficient), or you manage two different callbacks that could be invoked in any order, and therefore have to coordinate.

Wouldn’t it be nice for the function to work like the first example, but be asynchronous like the second example? Enter async/await:

(aio-defun fetch-fortune-3 (url)
  (let ((buffer (aio-await (aio-url-retrieve url))))
    (with-current-buffer buffer
      (prog1 (buffer-string)
        (kill-buffer)))))

A function defined with aio-defun is just like defun except that it can use aio-await to pause and wait on any other function defined with aio-defun or, more specifically, any function that returns a promise. Borrowing Python parlance: Returning a promise makes a function awaitable. If there’s an error, it’s delivered as an error signal from aio-url-retrieve, just like the first example. When called, this function returns immediately with a promise object that represents a future result. The caller might look like this:

(defcustom fortune-url ...)

(aio-defun display-fortune ()
  (interactive)
  (message "%s" (aio-await (fetch-fortune-3 fortune-url))))

How wonderfully clean that looks! And, yes, it even works with interactive like that. I can M-x display-fortune and a fortune is printed in the minibuffer as soon as the result arrives from the server. In the meantime Emacs doesn’t block and I can continue my work.

You can’t do anything you couldn’t already do before. It’s just a nicer way to organize the same callbacks: implicit rather than explicit.

Promises, simplified

The core object at play is the promise. Promises are already a rather simple concept, but aio promises have been distilled to their essence, as they’re only needed for this singular purpose. More on this later.

As I said, a promise represents a future result. In practical terms, a promise is just an object to which one can subscribe with a callback. When the result is ready, the callbacks are invoked. Another way to put it is that promises reify the concept of callbacks. A callback is no longer just the idea of an extra argument on a function. It’s a first-class thing that itself can be passed around as a value.

Promises have two slots: the final promise result and a list of subscribers. A nil result means the result hasn’t been computed yet. It’s so simple I’m not even bothering with cl-struct.

(defun aio-promise ()
  "Create a new promise object."
  (record 'aio-promise nil ()))

(defsubst aio-promise-p (object)
  (and (eq 'aio-promise (type-of object))
       (= 3 (length object))))

(defsubst aio-result (promise)
  (aref promise 1))

To subscribe to a promise, use aio-listen:

(defun aio-listen (promise callback)
  (let ((result (aio-result promise)))
    (if result
        (run-at-time 0 nil callback result)
      (push callback (aref promise 2)))))

If the result isn’t ready yet, add the callback to the list of subscribers. If the result is ready, call the callback in the next event loop turn using run-at-time. This is important because it keeps all the asynchronous components isolated from one another. They won’t see each other’s frames on the call stack, nor frames from aio. This is so important that the Promises/A+ specification is explicit about it.

The other half of the equation is resolving a promise, which is done with aio-resolve. Unlike other promises, aio promises don’t care whether the promise is being fulfilled (success) or rejected (error). Instead a promise is resolved using a value function — or, usually, a value closure. Subscribers receive this value function and extract the value by invoking it with no arguments.

Why? This lets the promise’s resolver decide the semantics of the result. Instead of returning a value, this function can instead signal an error, propagating an error signal that terminated an async function. Because of this, the promise doesn’t need to know how it’s being resolved.
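A small sketch of that idea, independent of aio: the subscriber just calls whatever function it was handed, and the function itself decides between returning and signaling. Here demo-extract is a hypothetical helper standing in for a subscriber, and arith-error is used only as a convenient built-in error condition.

```elisp
(defun demo-extract (value-function)
  "Invoke VALUE-FUNCTION as a subscriber would.
Return (value RESULT) on success, or (signal CONDITION) if it
signaled an error."
  (condition-case err
      (list 'value (funcall value-function))
    (error (list 'signal (car err)))))

(list
 ;; A value function representing success simply returns the result.
 (demo-extract (lambda () 42))
 ;; A value function representing failure signals when invoked.
 (demo-extract (lambda () (signal 'arith-error nil))))
;; => ((value 42) (signal arith-error))
```

The promise in between never has to inspect which kind of function it is carrying.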

When a promise is resolved, subscribers are each scheduled in their own event loop turns in the same order that they subscribed. If a promise has already been resolved, nothing happens. (Thought: Perhaps this should be an error in order to catch API misuse?)

(defun aio-resolve (promise value-function)
  (unless (aio-result promise)
    (let ((callbacks (nreverse (aref promise 2))))
      (setf (aref promise 1) value-function
            (aref promise 2) ())
      (dolist (callback callbacks)
        (run-at-time 0 nil callback value-function)))))

If you’re not an async function, you might subscribe to a promise like so:

(aio-listen promise (lambda (v)
                      (message "%s" (funcall v))))

The simplest example of a non-async function that creates and delivers on a promise is a “sleep” function:

(defun aio-sleep (seconds &optional result)
  (let ((promise (aio-promise))
        (value-function (lambda () result)))
    (prog1 promise
      (run-at-time seconds nil
                   #'aio-resolve promise value-function))))

Similarly, here’s a “timeout” promise that delivers a special timeout error signal at a given time in the future.

(defun aio-timeout (seconds)
  (let ((promise (aio-promise))
        (value-function (lambda () (signal 'aio-timeout nil))))
    (prog1 promise
      (run-at-time seconds nil
                   #'aio-resolve promise value-function))))

That’s all there is to promises.

Evaluate in the context of a promise

Before we get into pausing functions, let’s deal with the slightly simpler matter of delivering their return values using a promise. What we need is a way to evaluate a “body” and capture its result in a promise. If the body exits due to a signal, we want to capture that as well.

Here’s a macro that does just this:

(defmacro aio-with-promise (promise &rest body)
  `(aio-resolve ,promise
                (condition-case error
                    (let ((result (progn ,@body)))
                      (lambda () result))
                  (error (lambda ()
                           (signal (car error) ; rethrow
                                   (cdr error)))))))

The body result is captured in a closure and delivered to the promise. If there’s an error signal, it’s “rethrown” into subscribers by the promise’s value function.

This is where Emacs Lisp has a serious weak spot. There’s not really a concept of rethrowing a signal. Unlike a language with explicit exception objects that can capture a snapshot of the backtrace, the original backtrace is completely lost where the signal is caught. There’s no way to “reattach” it to the signal when it’s rethrown. This is unfortunate because it would greatly help debugging if you got to see the full backtrace on the other side of the promise.

Async functions

So we have promises and we want to pause a function on a promise. Generators have iter-yield for pausing an iterator’s execution. To tackle this problem:

Yield the promise to pause the iterator.
Subscribe a callback on the promise that continues the generator (iter-next) with the promise’s result as the yield result.

All the hard work is done on either side of the yield, so aio-await is just a simple wrapper around iter-yield:

(defmacro aio-await (expr)
  `(funcall (iter-yield ,expr)))

Remember, that funcall is here to extract the promise value from the value function. If it signals an error, this propagates directly into the iterator just as if it had been a direct call — minus an accurate backtrace.
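If the two-way traffic through iter-yield is unfamiliar, this sketch shows it in isolation using only the generator library. The names demo-conversation and demo-drive are hypothetical, and the caller here plays the role that the promise callback plays in aio: each iter-next resumes the iterator and supplies the value of the pending iter-yield.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'generator)

(iter-defun demo-conversation ()
  "Yield two requests, then return the sum of the two replies."
  (let* ((x (iter-yield 'first-request))
         (y (iter-yield 'second-request)))
    (+ x y)))

(defun demo-drive ()
  "Drive `demo-conversation' by hand and collect what it hands back."
  (let ((iter (demo-conversation)))
    (list (iter-next iter)      ; run up to the first yield
          (iter-next iter 10)   ; 10 becomes the value of the first yield
          (condition-case done
              (iter-next iter 32) ; 32 becomes the value of the second yield
            ;; A generator that returns signals iter-end-of-sequence
            ;; with its return value as the signal data.
            (iter-end-of-sequence (cdr done))))))

(demo-drive)
;; => (first-request second-request 42)
```

In aio the yielded values are promises and the replies are value functions, but the mechanics of pausing and resuming are the same.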

So aio-lambda / aio-defun needs to wrap the body in a generator (iter-lambda), invoke it to produce an iterator, then drive the iterator using callbacks. Here’s a simplified, unhygienic definition of aio-lambda:

(defmacro aio-lambda (arglist &rest body)
  `(lambda (&rest args)
     (let ((promise (aio-promise))
           (iter (apply (iter-lambda ,arglist
                          (aio-with-promise promise
                            ,@body))
                        args)))
       (prog1 promise
         (aio--step iter promise nil)))))

The body is evaluated inside aio-with-promise with the result delivered to the promise returned directly by the async function.

Before returning, the iterator is handed to aio--step, which drives the iterator forward until it delivers its first promise. When the iterator yields a promise, aio--step attaches a callback back to itself on the promise as described above. Immediately driving the iterator up to the first yielded promise “primes” it, which is important for getting the ball rolling on any asynchronous operations.

If the iterator ever yields something other than a promise, it’s delivered right back into the iterator.

(defun aio--step (iter promise yield-result)
  (condition-case _
      (cl-loop for result = (iter-next iter yield-result)
               then (iter-next iter (lambda () result))
               until (aio-promise-p result)
               finally (aio-listen result
                                   (lambda (value)
                                     (aio--step iter promise value))))
    (iter-end-of-sequence)))

When the iterator is done, nothing more needs to happen since the iterator resolves its own return value promise.

The definition of aio-defun just uses aio-lambda with defalias. There’s nothing to it.

That’s everything you need! Everything else in the package is merely useful, awaitable functions like aio-sleep and aio-timeout.

Composing promises

Unfortunately url-retrieve doesn’t support timeouts. We can work around this by composing two promises: a url-retrieve promise and an aio-timeout promise. First define a promise-returning function, aio-select, that takes a list of promises and returns (as another promise) the first promise to resolve:

(defun aio-select (promises)
  (let ((result (aio-promise)))
    (prog1 result
      (dolist (promise promises)
        (aio-listen promise (lambda (_)
                              (aio-resolve
                               result
                               (lambda () promise))))))))

We give aio-select both our url-retrieve and timeout promises, and it tells us which resolved first:

(aio-defun fetch-fortune-4 (url timeout)
  (let* ((promises (list (aio-url-retrieve url)
                         (aio-timeout timeout)))
         (fastest (aio-await (aio-select promises)))
         (buffer (aio-await fastest)))
    (with-current-buffer buffer
      (prog1 (buffer-string)
        (kill-buffer)))))

Cool! Note: This will not actually cancel the URL request, just move the async function forward earlier and prevent it from getting the result.
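For readers more familiar with Python, roughly the same "race a request against a timer" composition can be sketched with asyncio. This is only an analogy, not part of aio: slow_fetch and fetch_with_timeout are hypothetical names, and asyncio.sleep stands in for a real network request. One difference from the aio version is that this sketch does cancel the loser.

```python
import asyncio

async def slow_fetch(delay):
    # Hypothetical stand-in for a network request.
    await asyncio.sleep(delay)
    return "payload"

async def fetch_with_timeout(delay, timeout):
    fetch = asyncio.ensure_future(slow_fetch(delay))
    timer = asyncio.ensure_future(asyncio.sleep(timeout))
    # Resume as soon as either task finishes, much like awaiting aio-select.
    done, pending = await asyncio.wait(
        {fetch, timer}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # unlike the aio example, the loser is cancelled
    return fetch.result() if fetch in done else None

print(asyncio.run(fetch_with_timeout(0.01, 1.0)))  # payload
print(asyncio.run(fetch_with_timeout(1.0, 0.01)))  # None
```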

Threads

Despite aio being entirely about managing concurrent, asynchronous operations, it has nothing at all to do with threads — as in Emacs 26’s support for kernel threads. All async functions and promise callbacks are expected to run only on the main thread. That’s not to say an async function can’t await on a result from another thread. It just must be done very carefully.

Processes

The package also includes two functions for realizing promises on processes, whether they be subprocesses or network sockets.

aio-process-filter
aio-process-sentinel

For example, this function loops over each chunk of output (typically 4kB) from the process, as delivered to a filter function:

(aio-defun process-chunks (process)
  (cl-loop for chunk = (aio-await (aio-process-filter process))
           while chunk
           do (... process chunk ...)))

Exercise for the reader: Write an awaitable function that returns a line at a time rather than a chunk at a time. You can build it on top of aio-process-filter.

I considered wrapping functions like start-process so that their aio versions would return a promise representing some kind of result from the process. However there are so many different ways to create and configure processes that I would have ended up duplicating all the process functions. Focusing on the filter and sentinel, and letting the caller create and configure the process is much cleaner.

Unfortunately Emacs has no asynchronous API for writing output to a process. Both process-send-string and process-send-region will block if the pipe or socket is full. There is no callback, so you cannot await on writing output. Maybe there’s a way to do it with a dedicated thread?

Another issue is that the process-send-* functions are preemptible, made necessary because they block. The aio-process-* functions leave a gap (i.e. between filter awaits) where no filter or sentinel function is attached. It’s a consequence of promises being single-fire. The gap is harmless so long as the async function doesn’t await something else or get preempted. This needs some more thought.

Update: These process functions no longer exist and have been replaced by a small framework for building chains of promises. See aio-make-callback.

Testing aio

The test suite for aio is a bit unusual. Emacs’ built-in test suite, ERT, doesn’t support asynchronous tests. Furthermore, tests are generally run in batch mode, where Emacs invokes a single function and then exits rather than pump an event loop. Batch mode can only handle asynchronous process I/O, not the async functions of aio. So it’s not possible to run the tests in batch mode.

Instead I hacked together a really crude callback-based test suite. It runs in non-batch mode and writes the test results into a buffer (run with make check). Not ideal, but it works.

One of the tests is a sleep sort (with reasonable tolerances). It’s a pretty neat demonstration of what you can do with aio:

(aio-defun sleep-sort (values)
  (let ((promises (mapcar (lambda (v) (aio-sleep v v)) values)))
    (cl-loop while promises
             for next = (aio-await (aio-select promises))
             do (setf promises (delq next promises))
             collect (aio-await next))))

To see it in action (M-x sleep-sort-demo):

(aio-defun sleep-sort-demo ()
  (interactive)
  (let ((values '(0.1 0.4 1.1 0.2 0.8 0.6)))
    (message "%S" (aio-await (sleep-sort values)))))

Async/await is pretty awesome

I’m quite happy with how this all came together. Once I had the concepts straight — particularly resolving to value functions — everything made sense and all the parts fit together well, and mostly by accident. That feels good.

The CPython Bytecode Compiler is Dumb

2019-02-24T21:56:35Z

This article was discussed on Hacker News.

Due to sheer coincidence of several unrelated tasks converging on Python at work, I recently needed to brush up on my Python skills. So far for me, Python has been little more than a fancy extension language for BeautifulSoup, though I also used it to participate in the recent tradition of writing one’s own static site generator, in this case for my wife’s photo blog. I’ve been reading through Fluent Python by Luciano Ramalho, and it’s been quite effective at getting me up to speed.

As I write Python, like with Emacs Lisp, I can’t help but consider what exactly is happening inside the interpreter. I wonder if the code I’m writing is putting undue constraints on the bytecode compiler and limiting its options. Ultimately I’d like the code I write to drive the interpreter efficiently and effectively. The Zen of Python says there should be “only one obvious way to do it,” but in practice there’s a lot of room for expression. Given multiple ways to express the same algorithm or idea, I tend to prefer the one that compiles to the more efficient bytecode.

Fortunately CPython, the main and most widely used implementation of Python, is very transparent about its bytecode. It’s easy to inspect and reason about its bytecode. The disassembly listing is easy to read and understand, and I can always follow it without consulting the documentation. This contrasts sharply with modern JavaScript engines and their opaque use of JIT compilation, where performance is guided by obeying certain patterns (hidden classes, etc.), helping the compiler understand my program’s types, and being careful not to unnecessarily constrain the compiler.

So, besides just catching up with Python the language, I’ve been studying the bytecode disassembly of the functions that I write. One fact has become quite apparent: the CPython bytecode compiler is pretty dumb. With a few exceptions, it’s a very literal translation of a Python program, and there is almost no optimization. Below I’ll demonstrate a case where it’s possible to detect one of the missed optimizations without inspecting the bytecode disassembly thanks to a small abstraction leak in the optimizer.

To be clear: This isn’t to say CPython is bad, or even that it should necessarily change. In fact, as I’ll show, dumb bytecode compilers are par for the course. In the past I’ve lamented how the Emacs Lisp compiler could do a better job, but CPython and Lua are operating at the same level. There are benefits to a dumb and straightforward bytecode compiler: the compiler itself is simpler, easier to maintain, and more amenable to modification (e.g. as Python continues to evolve). It’s also easier to debug Python (pdb) because it’s such a close match to the source listing.

Update: Darius Bacon points out that Guido van Rossum himself said, “Python is about having the simplest, dumbest compiler imaginable.” So this is all very much by design.

The consensus seems to be that if you want or need better performance, use something other than Python. (And if you can’t do that, at least use PyPy.) That’s a fairly reasonable and healthy goal. Still, if I’m writing Python, I’d like to do the best I can, which means exploiting the optimizations that are available when possible.

Disassembly examples

I’m going to compare three bytecode compilers in this article: CPython 3.7, Lua 5.3, and Emacs 26.1. Each of these languages is dynamically typed, is primarily executed on a bytecode virtual machine, and makes it easy to access its disassembly listings. One caveat: CPython and Emacs use a stack-based virtual machine while Lua uses a register-based virtual machine.

For CPython I’ll be using the dis module. For Emacs Lisp I’ll use M-x disassemble, and all code will use lexical scoping. In Lua I’ll use luac -l on the command line.

Local variable elimination

Will the bytecode compiler eliminate local variables? Keeping the variable around potentially involves allocating memory for it, assigning to it, and accessing it. Take this example:

def foo():
    x = 0
    y = 1
    return x

This function is equivalent to:

def foo():
    return 0

Despite this, CPython completely misses this optimization for both x and y:

  2           0 LOAD_CONST               1 (0)
              2 STORE_FAST               0 (x)
  3           4 LOAD_CONST               2 (1)
              6 STORE_FAST               1 (y)
  4           8 LOAD_FAST                0 (x)
             10 RETURN_VALUE

It assigns both variables, and even loads again from x for the return. Missed optimizations, but, as I said, by keeping these variables around, debugging is more straightforward. Users can always inspect variables.
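A listing like the CPython one above can be reproduced with the dis module. The following is a minimal sketch. Opcode names and offsets differ between CPython versions, so the printed disassembly may not match the 3.7 listing exactly, but the code object's variable names show that both locals survive compilation.

```python
import dis

def foo():
    x = 0
    y = 1
    return x

# Print the disassembly. Opcode names and offsets differ between CPython
# versions, so the output may not match the CPython 3.7 listing exactly.
dis.dis(foo)

# Both locals still have slots in the compiled code object.
print(foo.__code__.co_varnames)  # ('x', 'y')
```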

How about Lua?

function foo()
    local x = 0
    local y = 1
    return x
end

It also misses this optimization, though it matters a little less due to its architecture (the return instruction references a register regardless of whether or not that register is allocated to a local variable):

        1       [2]     LOADK           0 -1    ; 0
        2       [3]     LOADK           1 -2    ; 1
        3       [4]     RETURN          0 2
        4       [5]     RETURN          0 1

Emacs Lisp also misses it:

(defun foo ()
  (let ((x 0)
        (y 1))
    x))

Disassembly:

constant  0
constant  1
stack-ref 1
return

All three are on the same page.

Constant folding

Does the bytecode compiler evaluate simple constant expressions at compile time? This is simple and everyone does it.

def foo():
    return 1 + 2 * 3 / 4

Disassembly:

  2           0 LOAD_CONST               1 (2.5)
              2 RETURN_VALUE

Lua:

function foo()
    return 1 + 2 * 3 / 4
end

Disassembly:

        1       [2]     LOADK           0 -1    ; 2.5
        2       [2]     RETURN          0 2
        3       [3]     RETURN          0 1

Emacs Lisp:

(defun foo ()
  (+ 1 (/ (* 2 3) 4.0)))

Disassembly:

0	constant  2.5
1	return

That’s something we can count on so long as the operands are all numeric literals (or also, for Python, string literals) that are visible to the compiler. Don’t count on your operator overloads to work here, though.
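Constant folding can also be observed without reading a disassembly by looking at the constants stored in the code object. The sketch below assumes CPython. The function names folded and not_folded are mine, and the second function routes one operand through a local variable to show the folder giving up.

```python
def folded():
    return 1 + 2 * 3 / 4

def not_folded():
    x = 2  # routing an operand through a local hides it from the folder
    return 1 + x * 3 / 4

# The folded result is stored directly as a constant of the first function.
print(2.5 in folded.__code__.co_consts)      # True on CPython
# The second function has to compute the result at run time.
print(2.5 in not_folded.__code__.co_consts)  # False on CPython
print(folded() == not_folded())              # True
```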

Allocation optimization

Optimizers often perform escape analysis, to determine if objects allocated in a function ever become visible outside of that function. If they don’t then these objects could potentially be stack-allocated (instead of heap-allocated) or even be eliminated entirely.

None of the bytecode compilers are this sophisticated. However CPython does have a trick up its sleeve: tuple optimization. Since tuples are immutable, in certain circumstances CPython will reuse them and avoid both the constructor and the allocation.

def foo():
    return (1, 2, 3)

Check it out, the tuple is used as a constant:

  2           0 LOAD_CONST               1 ((1, 2, 3))
              2 RETURN_VALUE

Which we can detect by evaluating foo() is foo(), which is True. Though deviate from this too much and the optimization is disabled. Remember how CPython can’t optimize away variables, and that they break constant folding? They break this, too:

def foo():
    x = 1
    return (x, 2, 3)

Disassembly:

  2           0 LOAD_CONST               1 (1)
              2 STORE_FAST               0 (x)
  3           4 LOAD_FAST                0 (x)
              6 LOAD_CONST               2 (2)
              8 LOAD_CONST               3 (3)
             10 BUILD_TUPLE              3
             12 RETURN_VALUE

This function might document that it always returns a simple tuple, but we can tell if it’s being optimized or not using is like before: foo() is foo() is now False! In some future version of Python with a cleverer bytecode compiler, that expression might evaluate to True. (Unless the Python language specification is specific about this case, which I didn’t check.)

Note: Curiously PyPy replicates this exact behavior when examined with is. Was that deliberate? I’m impressed that PyPy matches CPython’s semantics so closely here.

Putting a mutable value, such as a list, in the tuple will also break this optimization. But that’s not the compiler being dumb. That’s a hard constraint on the compiler: the caller might change the mutable component of the tuple, so it must always return a fresh copy.
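The three tuple cases discussed here can be checked side by side with is. This is a small sketch assuming CPython, with function names of my own choosing. The first result depends on an implementation detail, so other implementations or future versions are free to behave differently.

```python
def constant_tuple():
    return (1, 2, 3)

def tuple_from_local():
    x = 1
    return (x, 2, 3)

def tuple_with_list():
    return (1, [2], 3)

# The all-literal tuple is a shared constant, so both calls return one object.
print(constant_tuple() is constant_tuple())      # True on CPython
# Going through a local forces a fresh tuple to be built on every call.
print(tuple_from_local() is tuple_from_local())  # False on CPython
# A mutable element requires a fresh tuple regardless of optimization.
print(tuple_with_list() is tuple_with_list())    # False
```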

Neither Lua nor Emacs Lisp has a language-level equivalent of an immutable tuple, so there’s nothing to compare.

Other than the tuples situation in CPython, none of the bytecode compilers eliminate unnecessary intermediate objects.

def foo():
    return [1024][0]

Disassembly:

  2           0 LOAD_CONST               1 (1024)
              2 BUILD_LIST               1
              4 LOAD_CONST               2 (0)
              6 BINARY_SUBSCR
              8 RETURN_VALUE

Lua:

function foo()
    return ({1024})[1]
end

Disassembly:

        1       [2]     NEWTABLE        0 1 0
        2       [2]     LOADK           1 -1    ; 1024
        3       [2]     SETLIST         0 1 1   ; 1
        4       [2]     GETTABLE        0 0 -2  ; 1
        5       [2]     RETURN          0 2
        6       [3]     RETURN          0 1

Emacs Lisp:

(defun foo ()
  (car (list 1024)))

Disassembly:

constant  1024
list1
car
return

Don’t expect too much

I could go on with lots of examples, looking at loop optimizations and so on, and each case is almost certainly unoptimized. The general rule of thumb is to simply not expect much from these bytecode compilers. They’re very literal in their translation.

Working so much in C has put me in the habit of expecting all obvious optimizations from the compiler. This frees me to be more expressive in my code. Lots of things are cost-free thanks to these optimizations, such as breaking a complex expression up into several variables, naming my constants, or not using a local variable to manually cache memory accesses. I’m confident the compiler will optimize away my expressiveness. The catch is that clever compilers can take things too far, so I’ve got to be mindful of how it might undermine my intentions — i.e. when I’m doing something unusual or not strictly permitted.

These bytecode compilers will never truly surprise me. The cost is that being more expressive in Python, Lua, or Emacs Lisp may reduce performance at run time because it shows in the bytecode. Usually this doesn’t matter, but sometimes it does.

Emacs 26 Brings Generators and Threads

2018-05-31T17:45:16Z

Emacs 26.1 was recently released. As you would expect from a major release, it comes with lots of new goodies. Being a bit of an Emacs Lisp enthusiast, the two most interesting new features are generators (iter) and native threads (thread).

Correction: Generators were actually introduced in Emacs 25.1 (Sept. 2016), not Emacs 26.1. Doh!

Update: ThreadSanitizer (TSan) quickly shows that Emacs’ threading implementation has many data races, making it completely untrustworthy. Until this is fixed, nobody should use Emacs threads for any purpose, and threads should be disabled at compile time.

Generators

Generators are one of those cool language features that provide a lot of power at a small implementation cost. They’re like a constrained form of coroutines, but, unlike coroutines, they’re typically built entirely on top of first-class functions (e.g. closures). This means no additional run-time support is needed in order to add generators to a language. The only complications are the changes to the compiler. Generators are not compiled the same way as normal functions despite looking so similar.

What’s perhaps coolest of all about lisp-family generators, including Emacs Lisp, is that the compiler component can be implemented entirely with macros. The compiler need not be modified at all, making generators no more than a library, and not actually part of the language. That’s exactly how they’ve been implemented in Emacs Lisp (emacs-lisp/generator.el).

So what’s a generator? It’s a function that returns an iterator object. When an iterator object is invoked (e.g. iter-next) it evaluates the body of the generator. Each iterator is independent. What makes them unusual (and useful) is that the evaluation is paused in the middle of the body to return a value, saving all the internal state in the iterator. Normally pausing in the middle of functions isn’t possible, which is what requires the special compiler support.

Emacs Lisp generators appear to be most closely modeled after Python generators, though they also share some similarities with JavaScript generators. What makes them most like Python is the use of signals for flow control, something I’m not personally enthused about. When a Python generator completes, it throws a StopIteration exception. In Emacs Lisp, it’s an iter-end-of-sequence signal. A signal is out-of-band and avoids the issue of relying on some special in-band value to communicate the end of iteration.
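For comparison, here is a small Python sketch of the behavior just described. The generator walk is my own illustration (the strings stand in for Lisp keywords). It shows the out-of-band exception at the end of iteration and how a for-style consumer hides it.

```python
def walk(items):
    for item in items:
        yield item

it = walk([":a", ":b", ":c"])
print(next(it))  # :a
print(next(it))  # :b
print(next(it))  # :c

# Completion is reported out-of-band, as an exception.
try:
    next(it)
except StopIteration:
    print("done")  # done

# Consumers such as list() and for loops catch the exception internally.
print(list(walk([":a", ":b"])))  # [':a', ':b']
```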

In contrast, JavaScript’s solution is to return a “rich” object wrapping the actual yield value. This object has a done field that communicates whether iteration has completed. This avoids the use of exceptions for flow control, but the caller has to unpack the rich object.

Fortunately the flow control issue isn’t normally exposed to Emacs Lisp code. Most of the time you’ll use the iter-do macro or (my preference) the new cl-loop keyword iter-by.

To illustrate how a generator works, here’s a really simple iterator that iterates over a list:

(iter-defun walk (list)
  (while list
    (iter-yield (pop list))))

Here’s how it might be used:

(setf i (walk '(:a :b :c)))

(iter-next i)  ; => :a
(iter-next i)  ; => :b
(iter-next i)  ; => :c
(iter-next i)  ; error: iter-end-of-sequence

The iterator object itself is opaque and you shouldn’t rely on any part of its structure. That being said, I’m a firm believer that we should understand how things work underneath the hood so that we can make the most effective use of them. No program should rely on the particulars of the iterator object internals for correctness, but a well-written program should employ them in a way that best exploits their expected implementation.

Currently iterator objects are closures, and iter-next invokes the closure with its own internal protocol. It asks the closure to return the next value (:next operation), and iter-close asks it to clean itself up (:close operation).

Since they’re just closures, another really cool thing about Emacs Lisp generators is that iterator objects are generally readable. That is, you can serialize them out with print and bring them back to life with read, even in another instance of Emacs. They exist independently of the original generator function. This will not work if one of the values captured in the iterator object is not readable (e.g. buffers).

How does pausing work? Well, one of the other exciting new features of Emacs 26 is the introduction of a jump table opcode, switch. I’d lamented in the past that large cond and cl-case expressions could be a lot more efficient if Emacs’ byte code supported jump tables. It turns an O(n) sequence of comparisons into an O(1) lookup and jump. It’s essentially the perfect foundation for a generator since it can be used to jump straight back to the position where evaluation was paused.

Buuut, generators do not currently use jump tables. The generator library predates the new switch opcode, and, being independent of it, its author, Daniel Colascione, went with the best option at the time. Chunks of code between yields are packaged as individual closures. These closures are linked together a bit like nodes in a graph, creating a sort of state machine. To get the next value, the iterator object invokes the closure representing the next state.

I’ve manually macro expanded the walk generator above into a form that roughly resembles the expansion of iter-defun:

(defun walk (list)
  (let (state)
    (cl-flet* ((state-2 ()
                 (signal 'iter-end-of-sequence nil))
               (state-1 ()
                 (prog1 (pop list)
                   (when (null list)
                     (setf state #'state-2))))
               (state-0 ()
                 (if (null list)
                     (state-2)
                   (setf state #'state-1)
                   (state-1))))
      (setf state #'state-0)
      (lambda ()
        (funcall state)))))

This omits the protocol I mentioned, and it doesn’t have yield results (values passed to the iterator). The actual expansion is a whole lot messier and less optimal than this, but hopefully my hand-rolled generator is illustrative enough. Without the protocol, this iterator is stepped using funcall rather than iter-next.

The state variable keeps track of where in the body of the generator this iterator is currently “paused.” Continuing the iterator is therefore just a matter of invoking the closure that represents this state. Each state closure may update state to point to a new part of the generator body. The terminal state is obviously state-2. Notice how state transitions occur around branches.

I had said generators can be implemented as a library in Emacs Lisp. Unfortunately there’s a hole in this: unwind-protect. It’s not valid to yield inside an unwind-protect form. Unlike, say, a throw-catch, there’s no mechanism to trap an unwinding stack so that it can be restarted later. The state closure needs to return and fall through the unwind-protect.

A jump table version of the generator might look like the following. I’ve used cl-labels since it allows for recursion.

(defun walk (list)
  (let ((state 0))
    (cl-labels
        ((closure ()
           (cl-case state
             (0 (if (null list)
                    (setf state 2)
                  (setf state 1))
                (closure))
             (1 (prog1 (pop list)
                  (when (null list)
                    (setf state 2))))
             (2 (signal 'iter-end-of-sequence nil)))))
      #'closure)))

When byte compiled on Emacs 26, that cl-case is turned into a jump table. This “switch” form is closer to how generators are implemented in other languages.

Iterator objects can share state between themselves if they close over a common environment (or, of course, use the same global variables).

(setf foo
      (let ((list '(:a :b :c)))
        (list
         (funcall
          (iter-lambda ()
            (while list
              (iter-yield (pop list)))))
         (funcall
          (iter-lambda ()
            (while list
              (iter-yield (pop list))))))))

(iter-next (nth 0 foo))  ; => :a
(iter-next (nth 1 foo))  ; => :b
(iter-next (nth 0 foo))  ; => :c

For years there has been a very crude way to “pause” a function and allow other functions to run: accept-process-output. It only works in the context of processes, but five years ago this was sufficient for me to build primitives on top of it. Unlike this old process function, generators do not block threads, including the user interface, which is really important.

Threads

Emacs 26 also brings us threads, which have been attached in a very bolted-on fashion. It’s not much more than a subset of pthreads: shared memory threads, recursive mutexes, and condition variables. The interfaces look just like they do in pthreads, and there hasn’t been much done to integrate more naturally into the Emacs Lisp ecosystem.

This is also only the first step in bringing threading to Emacs Lisp. Right now there’s effectively a global interpreter lock (GIL), and threads only run one at a time cooperatively. Like with generators, the Python influence is obvious. In theory, sometime in the future this interpreter lock will be removed, making way for actual concurrency.

This is, again, where I think it’s useful to contrast with JavaScript, which was also initially designed to be single-threaded. Low-level threading primitives weren’t exposed — though mostly because JavaScript typically runs sandboxed and there’s no safe way to expose those primitives. Instead it got a web worker API that exposes concurrency at a much higher level, along with an efficient interface for thread coordination.

For Emacs Lisp, I’d prefer something safer, more like the JavaScript approach. Low-level pthreads are now a great way to wreck Emacs with deadlocks (with no C-g escape). Playing around with the new threading API for just a few days, I’ve already had to restart Emacs a bunch of times. Bugs in Emacs Lisp are normally a lot more forgiving.

One important detail that has been designed well is that dynamic bindings are thread-local. This is really essential for correct behavior. This is also an easy way to create thread-local storage (TLS): dynamically bind variables in the thread’s entrance function.

;;; -*- lexical-binding: t; -*-

(defvar foo-counter-tls)
(defvar foo-path-tls)

(defun foo-make-thread (path)
  (make-thread
   (lambda ()
     (let ((foo-counter-tls 0)
           (foo-path-tls path))
       ...))))

However, cl-letf “bindings” are not thread-local, which makes this otherwise incredibly useful macro quite dangerous in the presence of threads. This is one way that the new threading API feels bolted on.

Building generators on threads

In my stack clashing article I showed a few different ways to add coroutine support to C. One method spawned per-coroutine threads, and coordinated using semaphores. With the new threads API in Emacs, it’s possible to do exactly the same thing.

Since generators are just a limited form of coroutines, this means threads offer another, very different way to implement them. The threads API doesn’t provide semaphores, but condition variables can fill in for them. To “pause” in the middle of the generator, just wait on a condition variable.

So, naturally, I just had to see if I could make it work. I call it a “thread iterator” or “thriter.” The API is very similar to iter:

https://github.com/skeeto/thriter

This is merely a proof of concept so don’t actually use this library for anything. These thread-based generators are about 5x slower than iter generators, and they’re a lot more heavy-weight, needing an entire thread per iterator object. This makes thriter-close all the more important. On the other hand, these generators have no problem yielding inside unwind-protect.

Originally this article was going to dive into the details of how these thread-iterators worked, but thriter turned out to be quite a bit more complicated than I anticipated, especially as I worked towards feature matching iter.

The gist of it is that each side of a next/yield transaction gets its own condition variable, but they share a common mutex. Values are passed between the threads using slots on the iterator object. The side that isn’t currently running waits on a condition variable until the other side frees it, after which the releaser waits on its own condition variable for the result. This is similar to asynchronous requests in Emacs dynamic modules.
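To make the handoff concrete, here is a rough Python sketch of the same idea using threading.Condition. It is not thriter's code: the ThreadIterator class, its method names, and the use of StopIteration for completion are all assumptions of mine, and it omits closing an unfinished iterator entirely.

```python
import threading

class ThreadIterator:
    """Hypothetical sketch: a generator driven by its own thread."""

    def __init__(self, body):
        self._mutex = threading.Lock()
        # One condition variable per side, both sharing the same mutex.
        self._wake_producer = threading.Condition(self._mutex)
        self._wake_consumer = threading.Condition(self._mutex)
        self._turn = "consumer"  # which side is currently allowed to run
        self._slot = None        # value passed from producer to consumer
        self._done = False
        thread = threading.Thread(target=self._run, args=(body,), daemon=True)
        thread.start()

    def _run(self, body):
        with self._mutex:
            while self._turn != "producer":  # wait for the first next()
                self._wake_producer.wait()
        try:
            body(self._emit)
        finally:
            with self._mutex:  # wake the consumer even if body raised
                self._done = True
                self._turn = "consumer"
                self._wake_consumer.notify()

    def _emit(self, value):
        """Called by the body to 'yield' a value and pause."""
        with self._mutex:
            self._slot = value
            self._turn = "consumer"
            self._wake_consumer.notify()
            while self._turn != "producer":
                self._wake_producer.wait()

    def next(self):
        with self._mutex:
            if self._done:
                raise StopIteration
            self._turn = "producer"
            self._wake_producer.notify()
            while self._turn != "consumer":
                self._wake_consumer.wait()
            if self._done:
                raise StopIteration
            return self._slot

def body(emit):
    for value in (":a", ":b", ":c"):
        emit(value)

it = ThreadIterator(body)
print(it.next(), it.next(), it.next())  # :a :b :c
```

Exactly one side runs at a time, so the slot needs no further synchronization. A real implementation would also need a way to cancel a thread whose iterator is abandoned partway through.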

Rather than use signals to indicate completion, I modeled it after JavaScript generators. Iterators return a cons cell. The car indicates continuation and the cdr holds the yield result. To terminate an iterator early (thriter-close or garbage collection), thread-signal is used to essentially “cancel” the thread and knock it off the condition variable.

Since threads aren’t (and shouldn’t be) garbage collected, failing to run a thread-iterator to completion would normally cause a memory leak, as the thread sits there forever waiting on a “next” that will never come. To deal with this, a finalizer is attached to the iterator object in such a way that it’s not visible to the thread. A lost iterator is eventually cleaned up by the garbage collector, but, as usual with finalizers, this is only a last resort.

The future of threads

This thread-iterator project was my initial, little experiment with Emacs Lisp threads, similar to why I connected a joystick to Emacs using a dynamic module. While I don’t expect the current thread API to go away, it’s not really suitable for general use in its raw form. Bugs in Emacs Lisp programs should virtually never bring down Emacs and require a restart. Outside of threads, the few situations that break this rule are very easy to avoid (and very obvious that something dangerous is happening). Dynamic modules are dangerous by necessity, but concurrency doesn’t have to be.

There really needs to be a safe, high-level API with clean thread isolation. Perhaps this higher-level API will eventually build on top of the low-level threading API.

Emacs Lisp Lambda Expressions Are Not Self-Evaluating

2018-02-22T21:30:57Z

This week I made a mistake that ultimately enlightened me about the nature of function objects in Emacs Lisp. There are three kinds of function objects, but they each behave very differently when evaluated as objects.

But before we get to that, let’s talk about one of Emacs’ embarrassing, old missteps: eval-after-load.

Taming an old dragon

One of the long-standing issues with Emacs is that loading Emacs Lisp files (.el and .elc) is a slow process, even when those files have been byte compiled. There are a number of dirty hacks in place to deal with this issue, and the biggest and nastiest of them all is the dumper, also known as unexec.

The Emacs you routinely use throughout the day is actually a previous instance of Emacs that’s been resurrected from the dead. Your undead Emacs was probably created months, if not years, earlier, back when it was originally compiled. The first stage of compiling Emacs is to compile a minimal C core called temacs. The second stage is loading a bunch of Emacs Lisp files, then dumping a memory image in an unportable, platform-dependent way. On Linux, this actually requires special hooks in glibc. The Emacs you know and love is this dumped image loaded back into memory, continuing from where it left off just after it was compiled. Regardless of your own feelings on the matter, you have to admit this is a very lispy thing to do.

There are two notable costs to Emacs’ dumper:

The dumped image contains hard-coded memory addresses. This means Emacs can’t be a Position Independent Executable (PIE). It can’t take advantage of a security feature called Address Space Layout Randomization (ASLR), which would increase the difficulty of exploiting some classes of bugs. This might be important to you if Emacs processes untrusted data, such as when it’s used as a mail client, a web server or generally parses data downloaded across the network.
It’s not possible to cross-compile Emacs since it can only be dumped by running temacs on its target platform. As an experiment I’ve attempted to dump the Windows version of Emacs on Linux using Wine, but was unsuccessful.

The good news is that there’s a portable dumper in the works that makes this a lot less nasty. If you’re adventurous, you can already disable dumping and run temacs directly by setting CANNOT_DUMP=yes at compile time. Be warned, though, that a non-dumped Emacs takes several seconds, or worse, to initialize before it even begins loading your own configuration. It’s also somewhat buggy since it seems nobody ever runs it this way productively.

The other major way Emacs users have worked around slow loading is aggressive use of lazy loading, generally via autoloads. The major package interactive entry points are defined ahead of time as stub functions. These stubs, when invoked, load the full package, which overrides the stub definition, then finally the stub re-invokes the new definition with the same arguments.
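
As a rough sketch of what such a stub looks like, the autoload function registers an entry point without loading anything. The package name my-package and the command my-package-start are hypothetical, chosen only for illustration:

```elisp
;; Hypothetical: my-package.el (somewhere on `load-path') would define
;; the interactive command `my-package-start'.
(autoload 'my-package-start "my-package"
  "Start my-package (hypothetical command)." t)

;; Until the command is first invoked, its function cell holds only
;; the autoload stub, not a real definition.
(autoloadp (symbol-function 'my-package-start))
;; => t
```

Package managers normally generate these forms from ;;;###autoload cookies, so they are rarely written by hand.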

To further assist with lazy loading, an evaluated defvar form will not override an existing global variable binding. This means you can, to a certain extent, configure a package before it’s loaded. The package will not clobber any existing configuration when it loads. This also explains the bizarre interfaces for the various hook functions, like add-hook and run-hooks. These accept symbols — the names of the variables — rather than values of those variables as would normally be the case. The add-to-list function does the same thing. It’s all intended to cooperate with lazy loading, where the variable may not have been defined yet.
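
A minimal sketch of that cooperation, assuming a hypothetical package with an option my-demo-timeout and a hook my-demo-mode-hook (neither exists in Emacs):

```elisp
;; The user configures the (hypothetical) package before it is loaded.
;; Neither variable has been defined yet.
(setq my-demo-timeout 30)
(add-hook 'my-demo-mode-hook #'ignore)

;; Later the package loads and evaluates its own definitions.
(defvar my-demo-timeout 10
  "Timeout in seconds (hypothetical option).")
(defvar my-demo-mode-hook nil
  "Hook run by the hypothetical my-demo-mode.")

;; The earlier configuration survives both defvar forms.
(list my-demo-timeout my-demo-mode-hook)
;; => (30 (ignore))
```

One caveat: evaluating a defvar interactively with C-M-x (eval-defun) deliberately resets the variable, so this protection applies to normal loading.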

eval-after-load

Sometimes this isn’t enough and you need some configuration to take place after the package has been loaded, but without forcing it to load early. That is, you need to tell Emacs “evaluate this code after this particular package loads.” That’s where eval-after-load comes into play, except for its fatal flaw: it takes the word “eval” completely literally.

The first argument to eval-after-load is the name of a package. Fair enough. The second argument is a form that will be passed to eval after that package is loaded. Now hold on a minute. The general rule of thumb is that if you’re calling eval, you’re probably doing something seriously wrong, and this function is no exception. This is completely the wrong mechanism for the task.

The second argument should have been a function — either a (sharp quoted) symbol or a function object. And then instead of eval it would be something more sensible, like funcall. Perhaps this improved version would be named call-after-load or run-after-load.

The big problem with passing an s-expression is that it will be left uncompiled due to being quoted. I’ve talked before about the importance of evaluating your lambdas. eval-after-load not only encourages badly written Emacs Lisp, it demands it.

;;; BAD!
(eval-after-load 'simple-httpd
                 '(push '("c" . "text/plain") httpd-mime-types))

This was all corrected in Emacs 25. If the second argument to eval-after-load is a function — the result of applying functionp is non-nil — then it uses funcall. There’s also a new macro, with-eval-after-load, to package it all up nicely.

;;; Better (Emacs >= 25 only)
(eval-after-load 'simple-httpd
  (lambda ()
    (push '("c" . "text/plain") httpd-mime-types)))

;;; Best (Emacs >= 25 only)
(with-eval-after-load 'simple-httpd
  (push '("c" . "text/plain") httpd-mime-types))

Though in both of these examples the compiler will likely warn about httpd-mime-types not being defined. That’s a problem for another day.

A workaround

But what if you need to use Emacs 24, as was the situation that sparked this article? What can we do with the bad version of eval-after-load? We could situate a lambda such that it’s evaluated, but then smuggle the resulting function object into the form passed to eval-after-load, all using a backquote.

;;; Note: this is subtly broken
(eval-after-load 'simple-httpd
  `(funcall
    ,(lambda ()
       (push '("c" . "text/plain") httpd-mime-types))))

When everything is compiled, the backquoted form evaluates to this:

(funcall #[0  [httpd-mime-types ("c" . "text/plain")] 2])

Where the second value (#[...]) is a byte-code object. However, as the comment notes, this is subtly broken. A cleaner and correct way to solve all this is with a named function. The damage caused by eval-after-load will have been (mostly) minimized.

(defun my-simple-httpd-hook ()
  (push '("c" . "text/plain") httpd-mime-types))

(eval-after-load 'simple-httpd
  '(funcall #'my-simple-httpd-hook))

But, let’s go back to the anonymous function solution. What was broken about it? It all has to do with evaluating function objects.

Evaluating function objects

So what happens when we evaluate an expression like the one above with eval? Here’s what it looks like again.

(funcall #[...])

First, eval notices it’s been given a non-empty list, so it’s probably a function call. The first element is the name of the function to be called (funcall) and the remaining elements are its arguments. But each of these elements must be evaluated first, and the results of that evaluation become the arguments.

Any value that isn’t a list or a symbol is self-evaluating. That is, it evaluates to its own value:

(eval 10)
;; => 10

If the value is a symbol, it’s treated as a variable. If the value is a list, it goes through the function call process I’m describing (or one of a number of other special cases, such as macro expansion, lambda expressions, and special forms).
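
These rules can be checked directly. The variable my-demo-answer below is a made-up name used only for this illustration:

```elisp
(defvar my-demo-answer 42
  "Hypothetical variable used only for this illustration.")

(eval "hello")          ; a string is self-evaluating
;; => "hello"

(eval [1 2 3])          ; so is a vector
;; => [1 2 3]

(eval 'my-demo-answer)  ; a symbol is looked up as a variable
;; => 42

(eval '(+ 1 2))         ; a list is treated as a function call
;; => 3

(eval ''(+ 1 2))        ; quote is a special form: its argument is returned as-is
;; => (+ 1 2)
```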

So, conceptually eval recurses on the function object #[...]. A function object is not a list or a symbol, so it’s self-evaluating. No problem.

;; Byte-code objects are self-evaluating

(let ((x (byte-compile (lambda ()))))
  (eq x (eval x)))
;; => t

What if this code wasn’t compiled? Rather than a byte-code object, we’d have some other kind of function object for the interpreter. Let’s examine the dynamic scope (shudder) case. Here, a lambda appears to evaluate to itself, but appearances can be deceiving:

(eval (lambda ()))
;; => (lambda ())

However, this is not self-evaluation. Lambda expressions are not self-evaluating. It’s merely coincidence that the result of evaluating a lambda expression looks like the original expression. This is just how the Emacs Lisp interpreter is currently implemented and, strictly speaking, it’s an implementation detail that just so happens to be mostly compatible with byte-code objects being self-evaluating. It would be a mistake to rely on this.

Instead, dynamic scope lambda expression evaluation is idempotent. Applying eval to the result will return an equal, but not identical (eq), expression. In contrast, a self-evaluating value is also idempotent under evaluation, but with eq results.

;; Not self-evaluating:

(let ((x '(lambda ())))
  (eq x (eval x)))
;; => nil

;; Evaluation is idempotent:

(let ((x '(lambda ())))
  (equal x (eval x)))
;; => t

(let ((x '(lambda ())))
  (equal x (eval (eval x))))
;; => t

So, with dynamic scope, the subtly broken backquote example will still work, but only by sheer luck. Under lexical scope, the situation isn’t so lucky:

;;; -*- lexical-binding: t; -*-

(lambda ())
;; => (closure (t) nil)

These interpreted lambda functions are neither self-evaluating nor idempotent. Passing t as the second argument to eval tells it to use lexical scope, as shown below:

;; Not self-evaluating:

(let ((x '(lambda ())))
  (eq x (eval x t)))
;; => nil

;; Not idempotent:

(let ((x '(lambda ())))
  (equal x (eval x t)))
;; => nil

(let ((x '(lambda ())))
  (equal x (eval (eval x t) t)))
;; error: (void-function closure)

I can imagine an implementation of Emacs Lisp where dynamic scope lambda expressions are in the same boat, where they’re not even idempotent. For example:

;;; -*- lexical-binding: nil; -*-

(lambda ())
;; => (totally-not-a-closure ())

Most Emacs Lisp would work just fine under this change, and only code that makes some kind of logical mistake — where there’s nested evaluation of lambda expressions — would break. This essentially already happened when lots of code was quietly switched over to lexical scope after Emacs 24. Lambda idempotency was lost and well-written code didn’t notice.

There’s a temptation here for Emacs to define a closure function or special form that would allow interpreter closure objects to be either self-evaluating or idempotent. This would be a mistake. It would only serve as a hack that covers up logical mistakes that lead to nested evaluation. Much better to catch those problems early.

Solving the problem with one character

So how do we fix the subtly broken example? With a strategically placed quote right before the comma.

(eval-after-load 'simple-httpd
  `(funcall
    ',(lambda ()
        (push '("c" . "text/plain") httpd-mime-types))))

So the form passed to eval-after-load becomes:

;; Compiled:
(funcall (quote #[...]))

;; Dynamic scope:
(funcall (quote (lambda () ...)))

;; Lexical scope:
(funcall (quote (closure (t) () ...)))

The quote prevents eval from evaluating the function object, which would be either needless or harmful. There’s also an argument to be made that this is a perfect situation for a sharp-quote (#'), which exists to quote functions.
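
The quote-before-comma pattern can be wrapped up as something like the call-after-load function imagined earlier. This is only a sketch: my-call-after-load, my-demo-ran, and my-demo-feature are hypothetical names, not part of Emacs.

```elisp
(defun my-call-after-load (feature function)
  "Arrange for FUNCTION to be called with no arguments after FEATURE loads.
Hypothetical helper, not part of Emacs."
  ;; The quote before the comma keeps `eval' from evaluating the
  ;; function object a second time.
  (eval-after-load feature
    `(funcall ',function)))

(defvar my-demo-ran nil
  "Set to t by the demonstration below.")

;; Provide a made-up feature first, so the form runs immediately.
(provide 'my-demo-feature)
(my-call-after-load 'my-demo-feature
                    (lambda () (setq my-demo-ran t)))

my-demo-ran
;; => t
```

Because the helper is an ordinary function, its lambda argument is evaluated (and compiled) at the call site like any other argument.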

Options for Structured Data in Emacs Lisp

2018-02-14T17:43:34Z

So your Emacs package has grown beyond a dozen or so lines of code, and the data it manages is now structured and heterogeneous. Informal plain old lists, the bread and butter of any lisp, are no longer cutting it. You really need to cleanly abstract this structure, both for your own organizational sake and for anyone reading your code.

With informal lists as structures, you might regularly ask questions like, “Was the ‘name’ slot stored in the third list element, or was it the fourth element?” A plist or alist helps with this problem, but those are better suited for informal, externally-supplied data, not for internal structures with fixed slots. Occasionally someone suggests using hash tables as structures, but Emacs Lisp’s hash tables are much too heavy for this. Hash tables are more appropriate when keys themselves are data.
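
To make the trade-off concrete, here is a small sketch of the same kind of record as a plist and as an alist. The keys are made up for illustration:

```elisp
;; A plist: alternating keys and values.
(let ((item (list :name "Eggs" :expiry nil :weight 11.1)))
  (plist-get item :weight))
;; => 11.1

;; An alist: a list of (key . value) pairs.
(let ((item (list (cons 'name "Eggs") (cons 'weight 11.1))))
  (cdr (assq 'weight item)))
;; => 11.1
```

Both lookups are by name rather than position, but a misspelled key quietly returns nil instead of signaling an error.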

Defining a data structure from scratch

Imagine a refrigerator package that manages a collection of food in a refrigerator. A food item could be structured as a plain old list, with slots at specific positions.

(defun fridge-item-create (name expiry weight)
  (list name expiry weight))

A function that computes the mean weight of a list of food items might look like this:

(defun fridge-mean-weight (items)
  (if (null items)
      0.0
    (let ((sum 0.0)
          (count 0))
      (dolist (item items (/ sum count))
        (setf count (1+ count)
              sum (+ sum (nth 2 item)))))))

Note the use of (nth 2 item) at the end, used to get the item’s weight. That magic number 2 is easy to mess up. Even worse, if lots of code accesses “weight” this way, then future extensions will be inhibited. Defining some accessor functions solves this problem.

(defsubst fridge-item-name (item)
  (nth 0 item))

(defsubst fridge-item-expiry (item)
  (nth 1 item))

(defsubst fridge-item-weight (item)
  (nth 2 item))

The defsubst defines an inline function, so there are effectively no additional run-time costs for these accessors compared to a bare nth. Since these only cover getting slots, we should also define some setters using the built-in gv (generalized variable) package.

(require 'gv)

(gv-define-setter fridge-item-name (value item)
  `(setf (nth 0 ,item) ,value))

(gv-define-setter fridge-item-expiry (value item)
  `(setf (nth 1 ,item) ,value))

(gv-define-setter fridge-item-weight (value item)
  `(setf (nth 2 ,item) ,value))

This makes each slot setf-able. Generalized variables are great for simplifying APIs, since otherwise there would need to be an equal number of setter functions (fridge-item-set-name, etc.). With generalized variables, both are at the same entrypoint:

(setf (fridge-item-name item) "Eggs")
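
Other macros built on generalized variables pick up the setter as well. A short sketch, repeating the weight accessor and setter so the block stands alone:

```elisp
(require 'cl-lib)
(require 'gv)

;; Same accessor and setter as above, repeated for self-containment.
(defsubst fridge-item-weight (item)
  (nth 2 item))

(gv-define-setter fridge-item-weight (value item)
  `(setf (nth 2 ,item) ,value))

(let ((item (list "Eggs" nil 10.0)))
  ;; `cl-incf' expands through the same setter that `setf' uses.
  (cl-incf (fridge-item-weight item) 1.5)
  (fridge-item-weight item))
;; => 11.5
```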

There are still two more significant improvements.

As far as Emacs Lisp is concerned, this isn’t a real type. The type-ness of it is just a fiction created by the conventions of the package. It would be easy to make the mistake of passing an arbitrary list to these fridge-item functions, and the mistake wouldn’t be caught so long as that list has at least three items. A common solution is to add a type tag: a symbol at the beginning of the structure that identifies it.
It’s still a linked list, and nth has to walk the list (i.e. O(n)) to retrieve items. It would be much more efficient to use a vector, turning this into an efficient O(1) operation.

Addressing both of these at once:

(defun fridge-item-create (name expiry weight)
  (vector 'fridge-item name expiry weight))

(defsubst fridge-item-p (object)
  (and (vectorp object)
       (= (length object) 4)
       (eq 'fridge-item (aref object 0))))

(defsubst fridge-item-name (item)
  (unless (fridge-item-p item)
    (signal 'wrong-type-argument (list 'fridge-item item)))
  (aref item 1))

(defsubst fridge-item-name--set (item value)
  (unless (fridge-item-p item)
    (signal 'wrong-type-argument (list 'fridge-item item)))
  (setf (aref item 1) value))

(gv-define-setter fridge-item-name (value item)
  `(fridge-item-name--set ,item ,value))

;; And so on for expiry and weight...

As long as fridge-mean-weight uses the fridge-item-weight accessor, it continues to work unmodified across all these changes. But, whew, that’s quite a lot of boilerplate to write and maintain for each data structure in our package! Boilerplate code generation is a perfect candidate for a macro definition. Luckily for us, Emacs already defines a macro to generate all this code: cl-defstruct.

(require 'cl-lib)

(cl-defstruct fridge-item
  name expiry weight)

In Emacs 25 and earlier, this innocent looking definition expands into essentially all the above code. The code it generates is expressed in the most optimal form for its version of Emacs, and it exploits many of the available optimizations by using function declarations such as side-effect-free and error-free. It’s configurable, too, allowing for the exclusion of a type tag (:named) — discarding all the type checks — or using a list rather than a vector as the underlying structure (:type). As a crude form of structural inheritance, it even allows for directly embedding other structures (:include).
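
Here is a brief sketch of two of those options. All of the structure names are hypothetical and exist only for this illustration:

```elisp
(require 'cl-lib)

;; (:type list) without :named: the structure is a plain list with
;; no type tag.
(cl-defstruct (my-pair (:type list)
                       (:constructor my-pair-create)
                       (:copier nil))
  left right)

(let ((p (my-pair-create :left 1 :right 2)))
  (list (listp p) (my-pair-left p) (my-pair-right p)))
;; => (t 1 2)

;; (:include ...): my-circle starts with all the slots of my-shape.
(cl-defstruct (my-shape (:constructor my-shape-create)
                        (:copier nil))
  name)

(cl-defstruct (my-circle (:include my-shape)
                         (:constructor my-circle-create)
                         (:copier nil))
  radius)

(let ((c (my-circle-create :name "dot" :radius 2)))
  (list (my-shape-name c)      ; parent accessor accepts the child
        (my-circle-name c)     ; inherited slot via the child accessor
        (my-circle-radius c)))
;; => ("dot" "dot" 2)
```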

Two pitfalls

There are a couple of pitfalls, though. First, for historical reasons, the macro will define two namespace-unfriendly functions: make-NAME and copy-NAME. I always override these, preferring the -create convention for the constructor, and tossing the copier since it’s either useless or, worse, semantically wrong.

(cl-defstruct (fridge-item (:constructor fridge-item-create)
                           (:copier nil))
  name expiry weight)

If the constructor needs to be more sophisticated than just setting slots, it’s common to define a “private” constructor (double dash in the name) and wrap it with a “public” constructor that has some behavior.

(cl-defstruct (fridge-item (:constructor fridge-item--create)
                           (:copier nil))
  name expiry weight entry-time)

(cl-defun fridge-item-create (&rest args)
  (apply #'fridge-item--create :entry-time (float-time) args))

The other pitfall is related to printing. In Emacs 25 and earlier, types defined by cl-defstruct are still only types by convention. They’re really just vectors as far as Emacs Lisp is concerned. One benefit from this is that printing and reading these structures is “free” because vectors are printable. It’s trivial to serialize cl-defstruct structures out to a file. This is exactly how the Elfeed database works.

The pitfall is that once a structure has been serialized, there’s no more changing the cl-defstruct definition. It’s now a file format definition, so the slots are locked in place. Forever.
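
A small round-trip sketch shows both the convenience and the trap. The structure my-food is a hypothetical stand-in, named so it doesn’t clash with the definitions above:

```elisp
(require 'cl-lib)

(cl-defstruct (my-food (:constructor my-food-create)
                       (:copier nil))
  name expiry weight)

(let* ((item (my-food-create :name "Eggs" :weight 11.1))
       (text (prin1-to-string item))   ; serialize to a string
       (copy (read text)))             ; read it back in
  (list (my-food-p copy)
        (my-food-name copy)
        (my-food-weight copy)))
;; => (t "Eggs" 11.1)
```

The printed text records slot values by position only. If a later version of the package reordered or removed slots, previously written data would still read without complaint, just into the wrong slots.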

Emacs 26 throws a wrench in all this, though it’s worth it in the long run. There’s a new primitive type in Emacs 26 with its own reader syntax: records. This is similar to hash tables becoming first class in the reader in Emacs 23.2. In Emacs 26, cl-defstruct uses records instead of vectors.

;; Emacs 25:
(fridge-item-create :name "Eggs" :weight 11.1)
;; => [cl-struct-fridge-item "Eggs" nil 11.1]

;; Emacs 26:
(fridge-item-create :name "Eggs" :weight 11.1)
;; => #s(fridge-item "Eggs" nil 11.1)

So far slots are still accessed using aref, and all the type checking still happens in Emacs Lisp. The only practical change is the record function is used in place of the vector function when allocating a structure. But it does pave the way for more interesting things in the future.

The major short-term downside is that this breaks printed compatibility across the Emacs 25/26 boundary. The cl-old-struct-compat-mode function can be used for some degree of backwards, but not forwards, compatibility. Emacs 26 can read and use some structures printed by Emacs 25 and earlier, but the reverse will never be true. This issue initially tripped up Emacs’ built-in packages, and when Emacs 26 is released we’ll see more of these issues arise in external packages.

Dynamic dispatch

Prior to Emacs 25, the major built-in package for dynamic dispatch — functions that specialize on the run-time type of their arguments — was EIEIO, though it only supported single dispatch (specializing on a single argument). EIEIO brought much of the Common Lisp Object System (CLOS) to Emacs Lisp, including classes and methods.

Emacs 25 introduced a more sophisticated dynamic dispatch package called cl-generic. It focuses only on dynamic dispatch and supports multiple dispatch, completely replacing the dynamic dispatch portion of EIEIO. Since cl-defstruct does inheritance and cl-generic does dynamic dispatch, there’s not really much left for EIEIO — besides bad ideas like multiple inheritance and method combination.

Without either of these packages, the most direct way to build single dispatch on top of cl-defstruct would be to shove a function in one of the slots. Then the “method” is just a wrapper that calls this function.

;; Base "class"

(cl-defstruct greeter
  greeting)

(defun greet (thing)
  (funcall (greeter-greeting thing) thing))

;; Cow "class"

(cl-defstruct (cow (:include greeter)
                   (:constructor cow--create)))

(defun cow-create ()
  (cow--create :greeting (lambda (_) "Moo!")))

;; Bird "class"

(cl-defstruct (bird (:include greeter)
                    (:constructor bird--create)))

(defun bird-create ()
  (bird--create :greeting (lambda (_) "Chirp!")))

;; Usage:

(greet (cow-create))
;; => "Moo!"

(greet (bird-create))
;; => "Chirp!"

Since cl-generic is aware of the types created by cl-defstruct, functions can specialize on them as if they were native types. It’s a lot simpler to let cl-generic do all the hard work. The people reading your code will appreciate it, too:

(require 'cl-generic)

(cl-defgeneric greet (greeter))

(cl-defstruct cow)

(cl-defmethod greet ((_ cow))
  "Moo!")

(cl-defstruct bird)

(cl-defmethod greet ((_ bird))
  "Chirp!")

(greet (make-cow))
;; => "Moo!"

(greet (make-bird))
;; => "Chirp!"
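
Multiple dispatch, mentioned above as something EIEIO lacked, looks much the same. This is a sketch with hypothetical types (my-ship, my-rock) and a hypothetical generic function (my-collide):

```elisp
(require 'cl-lib)
(require 'cl-generic)

(cl-defstruct (my-ship (:constructor my-ship-create) (:copier nil)))
(cl-defstruct (my-rock (:constructor my-rock-create) (:copier nil)))

(cl-defgeneric my-collide (a b)
  "Describe a collision between A and B.")

;; Each method specializes on the types of both arguments.
(cl-defmethod my-collide ((_a my-ship) (_b my-rock))
  "ship hits rock")

(cl-defmethod my-collide ((_a my-rock) (_b my-ship))
  "rock hits ship")

(my-collide (my-ship-create) (my-rock-create))
;; => "ship hits rock"

(my-collide (my-rock-create) (my-ship-create))
;; => "rock hits ship"
```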

The majority of the time a simple cl-defstruct will fulfill your needs, keeping in mind the gotcha with the constructor and copier names. Its use should feel almost as natural as defining functions.

What's in an Emacs Lambda

2017-12-14T18:18:57Z

There was recently some interesting discussion about correctly using backquotes to express a mixture of data and code. Since lambda expressions seem to evaluate to themselves, what’s the difference? For example, an association list of operations:

'((add . (lambda (a b) (+ a b)))
  (sub . (lambda (a b) (- a b)))
  (mul . (lambda (a b) (* a b)))
  (div . (lambda (a b) (/ a b))))

It looks like it would work, and indeed it does work in this case. However, there are good reasons to actually evaluate those lambda expressions. Eventually invoking the lambda expressions in the quoted form above is equivalent to using eval. So, instead, prefer the backquote form:

`((add . ,(lambda (a b) (+ a b)))
  (sub . ,(lambda (a b) (- a b)))
  (mul . ,(lambda (a b) (* a b)))
  (div . ,(lambda (a b) (/ a b))))

There are a lot of interesting things to say about this, but let’s first reduce it to two very simple cases:

(lambda (x) x)

'(lambda (x) x)

What’s the difference between these two forms? The first is a lambda expression, and it evaluates to a function object. The other is a quoted list that looks like a lambda expression, and it evaluates to a list — a piece of data.

A naive evaluation of these expressions in *scratch* (C-x C-e) suggests they are identical, and so it would seem that quoting a lambda expression doesn’t really matter:

(lambda (x) x)
;; => (lambda (x) x)

'(lambda (x) x)
;; => (lambda (x) x)

However, there are two common situations where this is not the case: byte compilation and lexical scope.

Lambda under byte compilation

It’s a little trickier to evaluate these forms byte compiled in the scratch buffer since that doesn’t happen automatically. But if it did, it would look like this:

;;; -*- lexical-binding: nil; -*-

(lambda (x) x)
;; => #[(x) "\010\207" [x] 1]

'(lambda (x) x)
;; => (lambda (x) x)

The #[...] is the syntax for a byte-code function object. As discussed in detail in my byte-code internals article, it’s a special vector object that contains byte-code, and other metadata, for evaluation by Emacs’ virtual stack machine. Elisp is one of very few languages with readable function objects, and this feature is core to its ahead-of-time byte compilation.

The quote, by definition, prevents evaluation, and so inhibits byte compilation of the lambda expression. It’s vital that the byte compiler does not try to guess the programmer’s intent and compile the expression anyway, since that would interfere with lists that just so happen to look like lambda expressions — i.e. any list containing the lambda symbol.

There are three reasons you want your lambda expressions to get byte compiled:

Byte-compiled functions are significantly faster. That’s the main purpose for byte compilation after all.
The compiler performs static checks, producing warnings and errors ahead of time. This lets you spot certain classes of problems before they occur. The static analysis is even better under lexical scope due to its tighter semantics.
Under lexical scope, byte-compiled closures may use less memory. More specifically, they won’t accidentally keep objects alive longer than necessary. I’ve never seen a name for this implementation issue, but I call it overcapturing. More on this later.

While it’s common for personal configurations to skip byte compilation, Elisp should still generally be written as if it were going to be byte compiled. General rule of thumb: Ensure your lambda expressions are actually evaluated.
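
The effect of the quote on compilation can be observed directly. In this sketch the two functions (hypothetical names) return the same table, once quoted and once backquoted, and both are byte compiled:

```elisp
(defun my-ops-quoted ()
  '((add . (lambda (a b) (+ a b)))))

(defun my-ops-evaluated ()
  `((add . ,(lambda (a b) (+ a b)))))

(byte-compile 'my-ops-quoted)
(byte-compile 'my-ops-evaluated)

;; The quoted lambda is still a plain list after compilation.
(byte-code-function-p (cdr (assq 'add (my-ops-quoted))))
;; => nil

;; The evaluated lambda was compiled along with its enclosing function.
(byte-code-function-p (cdr (assq 'add (my-ops-evaluated))))
;; => t

(funcall (cdr (assq 'add (my-ops-evaluated))) 1 2)
;; => 3
```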

Lambda in lexical scope

As I’ve stressed many times, you should always use lexical scope. There’s no practical disadvantage or trade-off involved. Just do it.

Once lexical scope is enabled, the two expressions diverge even without byte compilation:

;;; -*- lexical-binding: t; -*-

(lambda (x) x)
;; => (closure (t) (x) x)

'(lambda (x) x)
;; => (lambda (x) x)

Under lexical scope, lambda expressions evaluate to closures. Closures capture their lexical environment in their closure object — nothing in this particular case. It’s a type of function object, making it a valid first argument to funcall.

Since the quote prevents the second expression from being evaluated, semantically it evaluates to a list that just so happens to look like a (non-closure) function object. Invoking a data object as a function is like using eval — i.e. executing data as code. Everyone already knows eval should not be used lightly.

It’s a little more interesting to look at a closure that actually captures a variable, so here’s a definition for constantly, a higher-order function that returns a closure that accepts any number of arguments and returns a particular constant:

(defun constantly (x)
  (lambda (&rest _) x))

Without byte compiling it, here’s an example of its return value:

(constantly :foo)
;; => (closure ((x . :foo) t) (&rest _) x)

The environment has been captured as an association list (with a trailing t), and we can plainly see that the variable x is bound to the symbol :foo in this closure. Consider that we could manipulate this data structure (e.g. setcdr or setf) to change the binding of x for this closure. This is essentially how closures mutate their own environment. Moreover, closures from the same environment share structure, so such mutations are also shared. More on this later.

Semantically, closures are distinct objects (via eq), even if the variables they close over are bound to the same value. This is because they each have a distinct environment attached to them, even if in some invisible way.

(eq (constantly :foo) (constantly :foo))
;; => nil

Without byte compilation, this is true even when there’s no lexical environment to capture:

(defun dummy ()
  (lambda () t))

(eq (dummy) (dummy))
;; => nil

The byte compiler is smart, though. As an optimization, the same closure object is reused when possible, avoiding unnecessary work, including multiple object allocations. Though this is a bit of an abstraction leak. A function can (ab)use this to introspect whether it’s been byte compiled:

(defun have-i-been-compiled-p ()
  (let ((funcs (vector nil nil)))
    (dotimes (i 2)
      (setf (aref funcs i) (lambda ())))
    (eq (aref funcs 0) (aref funcs 1))))

(have-i-been-compiled-p)
;; => nil

(byte-compile 'have-i-been-compiled-p)

(have-i-been-compiled-p)
;; => t

The trick here is to evaluate the exact same non-capturing lambda expression twice, which requires a loop (or at least some sort of branch). Semantically we should think of these closures as being distinct objects, but, if we squint our eyes a bit, we can see the effects of the behind-the-scenes optimization.

Don’t actually do this in practice, of course. That’s what byte-code-function-p is for, which won’t rely on a subtle implementation detail.

Overcapturing

I mentioned before that one of the potential gotchas of not byte compiling your lambda expressions is overcapturing closure variables in the interpreter.

To evaluate lisp code, Emacs has both an interpreter and a virtual machine. The interpreter evaluates code in list form: cons cells, numbers, symbols, etc. The byte compiler is like the interpreter, but instead of directly executing those forms, it emits byte-code that, when evaluated by the virtual machine, produces identical visible results to the interpreter — in theory.

What this means is that Emacs contains two different implementations of Emacs Lisp, one in the interpreter and one in the byte compiler. The Emacs developers have been maintaining and expanding these implementations side-by-side for decades. A pitfall to this approach is that the implementations can, and do, diverge in their behavior. We saw this above with that introspective function, and it comes up in practice with advice.

Another way they diverge is in closure variable capture. For example:

;;; -*- lexical-binding: t; -*-

(defun overcapture (x y)
  (when y
    (lambda () x)))

(overcapture :x :some-big-value)
;; => (closure ((y . :some-big-value) (x . :x) t) nil x)

Notice that the closure captured y even though it’s unnecessary. This is because the interpreter doesn’t, and shouldn’t, take the time to analyze the body of the lambda to determine which variables should be captured. That would need to happen at run-time each time the lambda is evaluated, which would make the interpreter much slower. Overcapturing can get pretty messy if macros are introducing their own hidden variables.

On the other hand, the byte compiler can do this analysis just once at compile-time. And it’s already doing the analysis as part of its job. It can avoid this problem easily:

(overcapture :x :some-big-value)
;; => #[0 "\300\207" [:x] 1]

It’s clear that :some-big-value isn’t present in the closure.

But… how does this work?

How byte compiled closures are constructed

Recall from the internals article that the four core elements of a byte-code function object are:

Parameter specification
Byte-code string (opcodes)
Constants vector
Maximum stack usage

While a closure seems like compiling a whole new function each time the lambda expression is evaluated, there’s actually not that much to it! Namely, the behavior of the function remains the same. Only the closed-over environment changes.

What this means is that closures produced by a common lambda expression can all share the same byte-code string (second element). Their bodies are identical, so they compile to the same byte-code. Where they differ is in their constants vector (third element), which gets filled out according to the closed over environment. It’s clear just from examining the outputs:

(constantly :a)
;; => #[128 "\300\207" [:a] 2]

(constantly :b)
;; => #[128 "\300\207" [:b] 2]

constantly has three of the four components of the closure in its own constant pool. Its job is to construct the constants vector, and then assemble the whole thing into a byte-code function object (#[...]). Here it is with M-x disassemble:

     constant  make-byte-code
     constant  128
     constant  "\300\207"
     constant  vector
     stack-ref 4
     call      1
     constant  2
     call      4
     return

(Note: since the byte compiler doesn’t produce perfectly optimal code, I’ve simplified it for this discussion.)

It pushes most of its constants on the stack. Then stack-ref 4 puts x on the stack. Next it calls vector (call 1) to create the constants vector. Finally, it constructs the function object (#[...]) by calling make-byte-code (call 4).

Since this might be clearer, here’s the same thing expressed back in terms of Elisp:

(defun constantly (x)
  (make-byte-code 128 "\300\207" (vector x) 2))

To see the disassembly of the closure’s byte-code:

(disassemble (constantly :x))

The result isn’t very surprising:

0       constant  :x
1       return

Things get a little more interesting when mutation is involved. Consider this adder closure generator, which mutates its environment every time it’s called:

(defun adder ()
  (let ((total 0))
    (lambda () (cl-incf total))))

(let ((count (adder)))
  (funcall count)
  (funcall count)
  (funcall count))
;; => 3

(adder)
;; => #[0 "\300\211\242T\240\207" [(0)] 2]

The adder essentially works like this:

(defun adder ()
  (make-byte-code 0 "\300\211\242T\240\207" (vector (list 0)) 2))

In theory, this closure could operate by mutating its constants vector directly. But that wouldn’t be much of a constants vector, now would it!? Instead, mutated variables are boxed inside a cons cell. Closures don’t share constant vectors, so the main reason for boxing is to share variables between closures from the same environment. That is, they have the same cons in each of their constant vectors.

There’s no equivalent Elisp for the closure in adder, so here’s the disassembly:

     constant  (0)
     dup
     car-safe
     add1
     setcar
     return

It puts two references to the boxed integer on the stack (constant, dup), unboxes the top one (car-safe), increments that unboxed integer (add1), and stores it back in the box (setcar) via the bottom reference, leaving the incremented value behind to be returned.

This all gets a little more interesting when closures interact:

(defun fancy-adder ()
  (let ((total 0))
    `(:add ,(lambda () (cl-incf total))
      :set ,(lambda (v) (setf total v))
      :get ,(lambda () total))))

(let ((counter (fancy-adder)))
  (funcall (plist-get counter :set) 100)
  (funcall (plist-get counter :add))
  (funcall (plist-get counter :add))
  (funcall (plist-get counter :get)))
;; => 102

(fancy-adder)
;; => (:add #[0 "\300\211\242T\240\207" [(0)] 2]
;;     :set #[257 "\300\001\240\207" [(0)] 3]
;;     :get #[0 "\300\242\207" [(0)] 1])

This is starting to resemble object oriented programming, with methods acting upon fields stored in a common, closed-over environment.

All three closures share a common variable, total. Since I didn’t use print-circle, this isn’t obvious from the last result, but each of those (0) conses is the same object. When one closure mutates the box, they all see the change. Here’s essentially how fancy-adder is transformed by the byte compiler:

(defun fancy-adder ()
  (let ((box (list 0)))
    (list :add (make-byte-code 0 "\300\211\242T\240\207" (vector box) 2)
          :set (make-byte-code 257 "\300\001\240\207" (vector box) 3)
          :get (make-byte-code 0 "\300\242\207" (vector box) 1))))

The backquote in the original fancy-adder brings this article full circle. This final example wouldn’t work correctly if those lambdas weren’t evaluated properly.
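
As a small sketch of the sharing described above: the closures returned by one call to fancy-adder share a box, while a second call gets a fresh one. The helper fancy-adder-demo is a hypothetical name introduced only for this illustration, and the example assumes lexical binding is enabled.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defun fancy-adder ()
  (let ((total 0))
    `(:add ,(lambda () (cl-incf total))
      :set ,(lambda (v) (setf total v))
      :get ,(lambda () total))))

(defun fancy-adder-demo ()
  "Return the totals of two counters after updating only the first."
  (let ((a (fancy-adder))
        (b (fancy-adder)))
    ;; Only A's box is touched; B has its own environment.
    (funcall (plist-get a :set) 10)
    (funcall (plist-get a :add))
    (list (funcall (plist-get a :get))
          (funcall (plist-get b :get)))))

(fancy-adder-demo)
;; => (11 0)
```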

Make Flet Great Again

2017-10-27T21:02:58Z

Do you long for the days before Emacs 24.3 when flet was dynamically scoped? Well, you probably shouldn’t since there are some very good reasons for lexical scope. But, still, a dynamically scoped flet is situationally really useful, particularly in unit testing. The good news is that it’s trivial to get this original behavior back without relying on deprecated functions or third-party packages.

But first, what is flet and what does it mean for it to be dynamically scoped? The name stands for “function let” (or something to that effect). It’s a macro to bind named functions within a local scope, just as let binds variables within some local scope. It’s provided by the now-deprecated cl package.

(require 'cl)  ; deprecated!

(defun norm (x y)
  (flet ((square (v) (* v v)))
    (sqrt (+ (square x) (square y)))))

However, a gotcha here is that square is visible not just to the body of norm but also to any function called directly or indirectly from the flet body. That’s dynamic scope.

(flet ((sqrt (v) (/ v 2)))  ; close enough
  (norm 2 2))
;; -> 4

Note: This works because sqrt hasn’t (yet?) been assigned a bytecode opcode. One weakness with flet is that, due to being dynamically scoped, it is unable to define or override functions whose calls evaporate under byte compilation. For example, addition:

(defun add-with-flet ()
  (flet ((+ (&rest _) :override))
    (+ 1 2 3)))

(add-with-flet)
;; -> :override

(funcall (byte-compile #'add-with-flet))
;; -> 6

Since + has its own opcode, the function call is eliminated under byte-compilation and flet can’t do its job. This is similar to these same functions being unadvisable.

cl-lib and cl-flet

The cl-lib package introduced in Emacs 24.3, replacing cl, adds a namespace prefix, cl-, to all of these Common Lisp style functions. In most cases this was the only change. One exception is cl-flet, which has different semantics: It’s lexically scoped, just like in Common Lisp. Its bindings aren’t visible outside of the cl-flet body.

(require 'cl-lib)

(cl-flet ((sqrt (v) (/ v 2)))
  (norm 2 2))
;; -> 2.8284271247461903

In most cases this is what you actually want. The old flet subtly changes the environment for all functions called directly or indirectly from its body.

Besides being cleaner and less error prone, cl-flet also doesn’t have special exceptions for functions with assigned opcodes. At macro-expansion time it walks the body, taking its action before the byte-compiler can interfere.

(defun add-with-cl-flet ()
  (cl-flet ((+ (&rest _) :override))
    (+ 1 2 3)))

(add-with-cl-flet)
;; -> :override

(funcall (byte-compile #'add-with-cl-flet))
;; -> :override

In order for it to work properly, it’s essential that functions are quoted with sharp-quotes (#') so that the macro can tell the difference between functions and symbols. Just make a general habit of sharp-quoting functions.
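
Here is a minimal sketch of why the sharp-quote matters when a local function is passed as a value. The names double-all and double are hypothetical, chosen only for this example.

```elisp
(require 'cl-lib)

(defun double-all (numbers)
  "Return NUMBERS with each element doubled, using a local function."
  (cl-flet ((double (x) (* 2 x)))
    ;; #'double refers to the local binding. A plain 'double would be
    ;; just a symbol, naming a global function that does not exist here.
    (mapcar #'double numbers)))

(double-all '(1 2 3))
;; => (2 4 6)
```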

In unit testing, temporarily overriding functions for all of Emacs is useful, so flet still has some uses. But it’s deprecated!

Unit testing with flet

Since Emacs can do anything, suppose there is an Emacs package that makes sandwiches. In this package there’s an interactive function to set the default sandwich cheese.

(defvar default-cheese 'cheddar)

(defun set-default-cheese (type)
  (interactive
   (let* ((options '("cheddar" "swiss" "american"))
          (input (completing-read "Cheese: " options nil t)))
     (when input
       (list (intern input)))))
  (setf default-cheese type))

Since it’s interactive, it uses completing-read to prompt the user for input. A unit test could call this function non-interactively, but perhaps we’d also like to test the interactive path. The code inside interactive occasionally gets messy and may warrant testing. It would obviously be inconvenient to prompt the user for input during testing, and it wouldn’t work at all in batch mode (-batch).

With flet we can stub out completing-read just for the unit test:

;;; -*- lexical-binding: t; -*-

(ert-deftest test-set-default-cheese ()
  ;; protect original with dynamic binding
  (let (default-cheese)
    ;; simulate user entering "american"
    (flet ((completing-read (&rest _) "american"))
      (call-interactively #'set-default-cheese)
      (should (eq 'american default-cheese)))))

Since default-cheese was defined with defvar, it will be dynamically scoped despite let normally using lexical scope in this example. Both of the side effects of the tested function — setting a global variable and prompting the user — are captured using a combination of let and flet.
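
The effect of defvar on let can be sketched in isolation. The variable my-cheese and the function my-current-cheese are hypothetical stand-ins for default-cheese and the code under test.

```elisp
;;; -*- lexical-binding: t; -*-

(defvar my-cheese 'cheddar
  "Hypothetical special variable, standing in for `default-cheese'.")

(defun my-current-cheese ()
  "Return the value of `my-cheese' visible to this call."
  my-cheese)

(list (let ((my-cheese 'swiss))   ; dynamic binding, visible to callees
        (my-current-cheese))
      (my-current-cheese))        ; binding undone after the `let'
;; => (swiss cheddar)
```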

Since cl-flet is lexically scoped, it cannot serve this purpose. If flet is deprecated and cl-flet can’t do the job, what’s the right way to fix it? The answer lies in generalized variables.

cl-letf

What’s really happening inside flet is it’s globally binding a function name to a different function, evaluating the body, and rebinding it back to the original definition when the body completes. It macro-expands to something like this:

(let ((original (symbol-function 'completing-read)))
  (setf (symbol-function 'completing-read)
        (lambda (&rest _) "american"))
  (unwind-protect
      (call-interactively #'set-default-cheese)
    (setf (symbol-function 'completing-read) original)))

The unwind-protect ensures the original function is rebound even if the body of the call were to fail. This is very much a let-like pattern, and I’m using symbol-function as a generalized variable via setf. Is there a generalized variable version of let?

Yes! It’s called cl-letf! In this case the f suffix is analogous to the f suffix in setf. That form above can be reduced to a more general form:

(cl-letf (((symbol-function 'completing-read)
           (lambda (&rest _) "american")))
  (call-interactively #'set-default-cheese))

And that’s the way to reproduce the dynamically scoped behavior of flet since Emacs 24.3. There’s nothing complicated about it.

(ert-deftest test-set-default-cheese ()
  (let (default-cheese)
    (cl-letf (((symbol-function 'completing-read)
               (lambda (&rest _) "american")))
      (call-interactively #'set-default-cheese)
      (should (eq 'american default-cheese)))))

Keep in mind that this suffers the exact same problem with bytecode-assigned functions as flet, and for exactly the same reasons. If completing-read were to ever be assigned its own opcode then cl-letf would no longer work for this particular example.
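
As a rough, self-contained illustration of the two properties relied on above (the stub is visible to callees, and the original definition comes back even when the body fails), consider this sketch. All of the my- names are hypothetical.

```elisp
(require 'cl-lib)

(defun my-current-user ()
  "Hypothetical function to be stubbed."
  "alice")

(defun my-greeting ()
  (concat "Hello, " (my-current-user)))

(defun my-greeting-demo ()
  "Return the greeting seen under a stub, then after a failed stub body."
  (list
   ;; The stub is visible inside my-greeting, outside the cl-letf body.
   (cl-letf (((symbol-function 'my-current-user) (lambda () "bob")))
     (my-greeting))
   ;; An error in the body still restores the original definition.
   (progn
     (ignore-errors
       (cl-letf (((symbol-function 'my-current-user) (lambda () "eve")))
         (error "Simulated test failure")))
     (my-greeting))))

(my-greeting-demo)
;; => ("Hello, bob" "Hello, alice")
```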

Asynchronous Requests from Emacs Dynamic Modules

2017-02-14T02:30:00Z

A few months ago I had a discussion with Vladimir Kazanov about his Orgfuse project: a Python script that exposes an Emacs Org-mode document as a FUSE filesystem. It permits other programs to navigate the structure of an Org-mode document through the standard filesystem APIs. I suggested that, with the new dynamic modules in Emacs 25, Emacs itself could serve a FUSE filesystem. In fact, support for FUSE services in general could be a package of its own.

So that’s what he did: Elfuse. It’s an old joke that Emacs is an operating system, and here it is handling system calls.

However, there’s a tricky problem to solve, an issue also present in my joystick module. Both modules handle asynchronous events — filesystem requests or joystick events — but Emacs runs the event loop and owns the main thread. The external events somehow need to feed into the main event loop. It’s even more difficult with FUSE because FUSE also wants control of its own thread for its own event loop. This requires Elfuse to spawn a dedicated FUSE thread and negotiate a request/response hand-off.

When a filesystem request or joystick event arrives, how does Emacs know to handle it? The simple and obvious solution is to poll the module from a timer.

struct queue requests;

emacs_value
Frequest_next(emacs_env *env, ptrdiff_t n, emacs_value *args, void *p)
{
    emacs_value next = Qnil;
    queue_lock(requests);
    if (queue_length(requests) > 0) {
        void *request = queue_pop(requests, env);
        next = env->make_user_ptr(env, fin_empty, request);
    }
    queue_unlock(requests);
    return next;
}

And then ask Emacs to check the module every, say, 10ms:

(defun request--poll ()
  (let ((next (request-next)))
    (when next
      (request-handle next))))

(run-at-time 0 0.01 #'request--poll)

Blocking directly on the module’s event pump with Emacs’ thread would prevent Emacs from doing important things like, you know, being a text editor. The timer allows it to handle its own events uninterrupted. It gets the job done, but it’s far from perfect:

It imposes an arbitrary latency to handling requests. Up to the poll period could pass before a request is handled.
Polling the module 100 times per second is inefficient. Unless you really enjoy recharging your laptop, that’s no good.
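
If polling is used anyway, one way to limit the cost is to keep the timer object around so polling can be stopped when the module is idle. This is only a sketch: my-request-poll-timer, my-request-poll-start, and my-request-poll-stop are hypothetical names, and the polled function is passed in as an argument rather than assumed.

```elisp
(defvar my-request-poll-timer nil
  "Timer object for the polling loop, or nil when not polling.")

(defun my-request-poll-stop ()
  "Cancel the polling timer, if one is running."
  (when my-request-poll-timer
    (cancel-timer my-request-poll-timer)
    (setq my-request-poll-timer nil)))

(defun my-request-poll-start (function &optional period)
  "Call FUNCTION every PERIOD seconds (default 0.01), replacing any old timer."
  (my-request-poll-stop)
  (setq my-request-poll-timer
        (run-at-time 0 (or period 0.01) function)))

;; Usage, with the polling function from above:
;;   (my-request-poll-start #'request--poll)
;;   (my-request-poll-stop)
```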

The poll period is a sliding trade-off between latency and battery life. If only there was some mechanism to, ahem, signal the Emacs thread, informing it that a request is waiting…

SIGUSR1

Emacs Lisp programs can handle the POSIX SIGUSR1 and SIGUSR2 signals, which is exactly the mechanism we need. The interface is a “key” binding on special-event-map, the keymap that handles these kinds of events. When the signal arrives, Emacs queues it up for the main event loop.

(define-key special-event-map [sigusr1]
  (lambda ()
    (interactive)
    (request-handle (request-next))))
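
A slightly more defensive variant drains the queue instead of handling exactly one request per event. It is a sketch under two assumptions: that request-next returns nil when the queue is empty (as in the polling version), and that more than one request may be waiting by the time the handler runs. The helper my-drain-requests is a hypothetical name and takes the two functions as arguments so it can be tried without the module.

```elisp
(defun my-drain-requests (next-fn handle-fn)
  "Call HANDLE-FN on each value from NEXT-FN until NEXT-FN returns nil.
Return the number of requests handled."
  (let ((count 0)
        next)
    (while (setq next (funcall next-fn))
      (funcall handle-fn next)
      (setq count (1+ count)))
    count))

;; Usage with the module functions from the article:
;;   (define-key special-event-map [sigusr1]
;;     (lambda ()
;;       (interactive)
;;       (my-drain-requests #'request-next #'request-handle)))
```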

The module blocks on its own thread on its own event pump. When a request arrives, it queues the request, rings the bell for Emacs to come handle it (raise()), and waits on a semaphore. For illustration purposes, assume the module reads requests from and writes responses to a file descriptor, like a socket.

int event_fd = /* ... */;
struct request request;
sem_init(&request.sem, 0, 0);

for (;;) {
    /* Blocking read for request event */
    read(event_fd, &request.event, sizeof(request.event));

    /* Put request on the queue */
    queue_lock(requests);
    queue_push(requests, &request);
    queue_unlock(requests);
    raise(SIGUSR1);  // TODO: Should raise() go inside the lock?

    /* Wait for Emacs */
    while (sem_wait(&request.sem))
        ;

    /* Reply with Emacs' response */
    write(event_fd, &request.response, sizeof(request.response));
}

The sem_wait() is in a loop because signals will wake it up prematurely. In fact, it may even wake up due to its own signal on the line before. This is the only way this particular use of sem_wait() might fail, so there’s no need to check errno.

If there are multiple module threads making requests to the same global queue, the lock is necessary to protect the queue. The semaphore is only for blocking the thread until Emacs has finished writing its particular response. Each thread has its own semaphore.

When Emacs is done writing the response, it releases the module thread by incrementing the semaphore. It might look something like this:

emacs_value
Frequest_complete(emacs_env *env, ptrdiff_t n, emacs_value *args, void *p)
{
    struct request *request = env->get_user_ptr(env, args[0]);
    if (request)
        sem_post(&request->sem);
    return Qnil;
}

The top-level handler dispatches to the specific request handler, calling request-complete above when it’s done.

(defun request-handle (next)
  (condition-case e
      (cl-ecase (request-type next)
        (:open  (request-handle-open  next))
        (:close (request-handle-close next))
        (:read  (request-handle-read  next)))
    (error (request-respond-as-error next e)))
  (request-complete next))

This SIGUSR1+semaphore mechanism is roughly how Elfuse currently processes requests.

Making it work on Windows

Windows doesn’t have signals. This isn’t a problem for Elfuse since Windows doesn’t have FUSE either. Nor does it matter for Joymacs since XInput isn’t event-driven and always requires polling. But someday someone will need this mechanism for a dynamic module on Windows.

Fortunately there’s a solution: input language change events, WM_INPUTLANGCHANGE. It’s also on special-event-map:

(define-key special-event-map [language-change]
  (lambda ()
    (interactive)
    (request-handle (request-next))))

Instead of raise() (or pthread_kill()), broadcast the window event with PostMessage(). Outside of invoking the language-change key binding, Emacs will ignore the event because WPARAM is 0 — it doesn’t belong to any particular window. We don’t really want to change the input language, after all.

PostMessageA(HWND_BROADCAST, WM_INPUTLANGCHANGE, 0, 0);

Naturally you’ll also need to replace the POSIX threading primitives with the Windows versions (CreateThread(), CreateSemaphore(), etc.). With a bit of abstraction in the right places, it should be pretty easy to support both POSIX and Windows in these asynchronous dynamic module events.

How to Write Fast(er) Emacs Lisp

2017-01-30T21:08:19Z

Not everything written in Emacs Lisp needs to be fast. Most of Emacs itself — around 82% — is written in Emacs Lisp because those parts are generally not performance-critical. Otherwise these functions would be built-ins written in C. Extensions to Emacs don’t have a choice and — outside of a few exceptions like dynamic modules and inferior processes — must be written in Emacs Lisp, including their performance-critical bits. Common performance hot spots are automatic indentation, AST parsing, and interactive completion.

Here are 5 guidelines, each very specific to Emacs Lisp, that will result in faster code. The non-intrusive guidelines could be applied at all times as a matter of style — choosing one equally expressive and maintainable form over another just because it performs better.

There’s one caveat: These guidelines are focused on Emacs 25.1 and “nearby” versions. Emacs is constantly evolving. Changes to the virtual machine and byte-code compiler may transform currently-slow expressions into fast code, obsoleting some of these guidelines. In the future I’ll add notes to this article for anything that changes.

(1) Use lexical scope

This guideline refers to the following being the first line of every Emacs Lisp source file you write:

;;; -*- lexical-binding: t; -*-

This point is worth mentioning again and again. Not only will your code be more correct, it will be measurably faster. Dynamic scope is still opt-in through the explicit use of special variables, so there’s absolutely no reason not to be using lexical scope. If you’ve written clean, dynamic scope code, then switching to lexical scope won’t have any effect on its behavior.

Along similar lines, special variables are a lot slower than local, lexical variables. Only use them when necessary.

(2) Prefer built-in functions

Built-in functions are written in C and are, as expected, significantly faster than the equivalent written in Emacs Lisp. Complete as much work as possible inside built-in functions, even if it might mean taking more conceptual steps overall.

For example, what’s the fastest way to accumulate a list of items? That is, new items go on the tail but, for algorithm reasons, the list must be constructed from the head.

You might be tempted to keep track of the tail of the list, appending new elements directly to the tail with setcdr (via setf below).

(defun fib-track-tail (n)
  (let* ((a 0)
         (b 1)
         (head (list 1))
         (tail head))
    (dotimes (_ n head)
      (psetf a b
             b (+ a b))
      (setf (cdr tail) (list b)
            tail (cdr tail)))))

(fib-track-tail 8)
;; => (1 1 2 3 5 8 13 21 34)

Actually, it’s much faster to construct the list in reverse, then destructively reverse it at the end.

(defun fib-nreverse (n)
  (let* ((a 0)
         (b 1)
         (list (list 1)))
    (dotimes (_ n (nreverse list))
      (psetf a b
             b (+ a b))
      (push b list))))

It might not look it, but nreverse is very fast. Not only is it a built-in, it’s got its own opcode. Using push in a loop, then finishing with nreverse is the canonical and fastest way to accumulate a list of items.

In fib-track-tail, the added complexity of tracking the tail in Emacs Lisp is much slower than zipping over the entire list a second time in C.
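
To check a claim like this on your own machine, something along these lines should work. It is a sketch, not the benchmark behind the numbers in this article: the my- functions are hypothetical, they build lists of squares rather than Fibonacci numbers to keep the integers small, and the timings depend on the Emacs version.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'benchmark)

(defun my-squares-track-tail (n)
  "Return the squares of 0..N-1, appending at a tracked tail."
  (let* ((head (list nil))              ; dummy first cell
         (tail head))
    (dotimes (i n (cdr head))
      (setcdr tail (list (* i i)))
      (setq tail (cdr tail)))))

(defun my-squares-nreverse (n)
  "Return the squares of 0..N-1, built in reverse and then reversed."
  (let ((list ()))
    (dotimes (i n (nreverse list))
      (push (* i i) list))))

;; Interpreted timings are not representative, so compile both first.
(byte-compile 'my-squares-track-tail)
(byte-compile 'my-squares-nreverse)

(defun my-compare-accumulators (n repetitions)
  "Return (TRACK-TAIL-SECONDS NREVERSE-SECONDS) for lists of length N."
  (list (car (benchmark-run repetitions (my-squares-track-tail n)))
        (car (benchmark-run repetitions (my-squares-nreverse n)))))

;; Example call: (my-compare-accumulators 10000 100)
```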

(3) Avoid unnecessary lambda functions

I’m talking about mapcar and friends.

;; Slower
(defun expt-list (list e)
  (mapcar (lambda (x) (expt x e)) list))

Listen, I know you love dash.el and higher order functions, but this habit ain’t cheap. The byte-code compiler does not know how to inline these lambdas, so there’s an additional per-element function call overhead.

Worse, if you’re using lexical scope like I told you, the above example forms a closure over e. This means a new function object is created (e.g. make-byte-code) each time expt-list is called. To be clear, I don’t mean that the lambda is recompiled each time — the same byte-code string is shared between all instances of the same lambda. A unique function vector (#[...]) and constants vector are allocated and initialized each time expt-list is invoked.

Related mini-guideline: Don’t create any more garbage than strictly necessary in performance-critical code.

Compare to an implementation with an explicit loop, using the nreverse list-accumulation technique.

(defun expt-list-fast (list e)
  (let ((result ()))
    (dolist (x list (nreverse result))
      (push (expt x e) result))))

No unnecessary garbage is created.
No unnecessary per-element function calls.

This is the fastest possible definition for this function, and it’s what you need to use in performance-critical code.

Personally I prefer the list comprehension approach, using cl-loop from cl-lib.

(defun expt-list-fast (list e)
  (cl-loop for x in list
           collect (expt x e)))

The cl-loop macro will expand into essentially the previous definition, making them practically equivalent. It takes some getting used to, but writing efficient loops is a whole lot less tedious with cl-loop.

In Emacs 24.4 and earlier, catch/throw is implemented by converting the body of the catch into a lambda function and calling it. If code inside the catch accesses a variable outside the catch (very likely), then, in lexical scope, it turns into a closure, resulting in the garbage function object like before.

In Emacs 24.5 and later, the byte-code compiler uses a new opcode, pushcatch. It’s a whole lot more efficient, and there’s no longer a reason to shy away from catch/throw in performance-critical code. This is important because it’s often the only way to perform an early bailout.

(4) Prefer using functions with dedicated opcodes

When following the guideline about using built-in functions, you might have several to pick from. Some built-in functions have dedicated virtual machine opcodes, making them much faster to invoke. Prefer these functions when possible.

How can you tell when a function has an assigned opcode? Take a peek at the byte-defop listings in bytecomp.el. Optimization often involves getting into the weeds, so don’t be shy.
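
As an alternative to reading the listings by eye, the lookup can be sketched in code. This assumes an internal detail of bytecomp.el: that byte-defop-compiler records the opcode's name in the byte-opcode symbol property of the function. That is not a documented interface and may change between Emacs versions. The helper my-function-opcode is a hypothetical name.

```elisp
(require 'bytecomp)

(defun my-function-opcode (function)
  "Return the byte-code opcode number assigned to FUNCTION, or nil.
Relies on the `byte-opcode' symbol property set in bytecomp.el,
an internal detail that may differ between Emacs versions."
  (let ((op (get function 'byte-opcode)))
    (and op (boundp op) (symbol-value op))))

;; A number means a dedicated opcode; nil means an ordinary call.
(mapcar #'my-function-opcode '(assq assoc))
```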

For example, the assq and assoc functions search for a matching key in an association list (alist). Both are built-in functions, and the only difference is that the former compares keys with eq (e.g. symbol or integer keys) and the latter with equal (typically string keys). The difference in performance between eq and equal isn’t as important as another factor: assq has its own opcode (158).

This means in performance-critical code you should prefer assq, perhaps even going as far as restructuring your alists specifically to have eq keys. That last step is probably a trade-off, which means you’ll want to make some benchmarks to help with that decision.

Another example is eq, =, eql, and equal. Some macros and functions use eql, especially cl-lib which inherits eql as a default from Common Lisp. Take cl-case, which is like switch from the C family of languages. It compares elements with eql.

(defun op-apply (op a b)
  (cl-case op
    (:norm (+ (* a a) (* b b)))
    (:disp (abs (- a b)))
    (:isin (/ b (sin a)))))

The cl-case expands into a cond. Since Emacs byte-code lacks support for jump tables, there’s not much room for cleverness.

Update: Emacs 26.1, released May 2018, introduced a jump table opcode.

(defun op-apply (op a b)
  (cond
   ((eql op :norm) (+ (* a a) (* b b)))
   ((eql op :disp) (abs (- a b)))
   ((eql op :isin) (/ b (sin a)))))

It turns out eql is pretty much always the worst choice for cl-case. Of the four equality functions I listed, the only one lacking an opcode is eql. A faster definition would use eq. (In theory, cl-case could have done this itself because it knows all the keys are symbols.)

(defun op-apply (op a b)
  (cond
   ((eq op :norm) (+ (* a a) (* b b)))
   ((eq op :disp) (abs (- a b)))
   ((eq op :isin) (/ b (sin a)))))

Fortunately eq can safely compare integers in Emacs Lisp. You only need eql when comparing symbols, integers, and floats all at once, which is unusual.

(5) Unroll loops using and/or

Consider the following function which checks its argument against a list of numbers, bailing out on the first match. I used % instead of mod since the former has an opcode (166) and the latter does not.

(defun detect (x)
  (catch 'found
    (dolist (f '(2 3 5 7 11 13 17 19 23 29 31))
      (when (= 0 (% x f))
        (throw 'found f)))))

The byte-code compiler doesn’t know how to unroll loops. Fortunately that’s something we can do for ourselves using and and or. The compiler will turn this into clean, efficient jumps in the byte-code.

(defun detect-unrolled (x)
  (or (and (= 0 (% x 2)) 2)
      (and (= 0 (% x 3)) 3)
      (and (= 0 (% x 5)) 5)
      (and (= 0 (% x 7)) 7)
      (and (= 0 (% x 11)) 11)
      (and (= 0 (% x 13)) 13)
      (and (= 0 (% x 17)) 17)
      (and (= 0 (% x 19)) 19)
      (and (= 0 (% x 23)) 23)
      (and (= 0 (% x 29)) 29)
      (and (= 0 (% x 31)) 31)))

In Emacs 24.4 and earlier with the old-fashioned lambda-based catch, the unrolled definition is seven times faster. With the faster pushcatch-based catch it’s about twice as fast. This means the loop overhead accounts for about half the work of the first definition of this function.

Update: It was pointed out in the comments that this particular example is equivalent to a cond. That’s literally true all the way down to the byte-code, and it would be a clearer way to express the unrolled code. In real code it’s often not quite equivalent.

Unlike some of the other guidelines, this is certainly something you’d only want to do in code you know for sure is performance-critical. Maintaining unrolled code is tedious and error-prone.

I’ve had the most success with this approach not by unrolling these loops myself, but by using a macro, or similar, to generate the unrolled form.

(defmacro with-detect (var list)
  (cl-loop for e in list
           collect `(and (= 0 (% ,var ,e)) ,e) into conditions
           finally return `(or ,@conditions)))

(defun detect-unrolled (x)
  (with-detect x (2 3 5 7 11 13 17 19 23 29 31)))
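
To confirm that the macro generates the same shape as the hand-unrolled version, expand it on a shorter list. The macro is repeated here so the snippet stands alone.

```elisp
(require 'cl-lib)

(defmacro with-detect (var list)
  (cl-loop for e in list
           collect `(and (= 0 (% ,var ,e)) ,e) into conditions
           finally return `(or ,@conditions)))

(macroexpand '(with-detect x (2 3 5)))
;; => (or (and (= 0 (% x 2)) 2)
;;        (and (= 0 (% x 3)) 3)
;;        (and (= 0 (% x 5)) 5))
```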

How can I find more optimization opportunities myself?

Use M-x disassemble to inspect the byte-code for your own hot spots. Observe how the byte-code changes in response to changes in your functions. Take note of the sorts of forms that allow the byte-code compiler to produce the best code, and then exploit it where you can.
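
Outside of an interactive session, the same listing can be captured as a string, which is handy for comparing two versions of a function. This is a minimal sketch; my-disassembly is a hypothetical helper, and the exact listing varies between Emacs versions.

```elisp
(require 'disass)

(defun my-disassembly (function)
  "Return the byte-code disassembly of FUNCTION as a string."
  (with-temp-buffer
    ;; With a buffer argument, `disassemble' writes there instead of
    ;; popping up the *Disassemble* buffer.
    (disassemble (byte-compile function) (current-buffer))
    (buffer-string)))

;; Example: inspect the code generated for a small lambda.
(my-disassembly '(lambda (x) (% x 2)))
```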

Domain-Specific Language Compilation in Elfeed

2016-12-27T21:46:30Z

Last night I pushed another performance enhancement for Elfeed, this time reducing the time spent parsing feeds. It’s accomplished by compiling, during macro expansion, a jQuery-like domain-specific language within Elfeed.

Heuristic parsing

Given the nature of the domain — an under-specified standard and a lack of robust adherence — feed parsing is much more heuristic than strict. Sure, everyone’s feed XML is strictly conforming since virtually no feed reader tolerates invalid XML (thank you, XML libraries), but, for the schema, the situation resembles the de facto looseness of HTML. Sometimes important or required information is missing, or is only available in a different namespace. Sometimes, especially in the case of timestamps, it’s in the wrong format, or encoded incorrectly, or ambiguous. It’s real world data.

To get a particular piece of information, Elfeed looks in a number of different places within the feed, starting with the preferred source and stopping when the information is found. For example, to find the date of an Atom entry, Elfeed searches for elements in this order:

published
updated
date
modified
issued

Failing to find any of these elements, or if no parsable date is found, it settles on the current time. Only the updated element is required, but published usually has the desired information, so it goes first. The last three are only valid for another namespace, but are useful fallbacks.

Before Elfeed even starts this search, the XML text is parsed into an s-expression using xml-parse-region — a pure Elisp XML parser included in Emacs. The search is made over the resulting s-expression.

For example, here’s a sample from the Atom specification.

<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link rel="alternate" href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>

</feed>

Which is parsed into this s-expression.

((feed ((xmlns . "http://www.w3.org/2005/Atom"))
       (title () "Example Feed")
       (link ((href . "http://example.org/")))
       (updated () "2003-12-13T18:30:02Z")
       (author () (name () "John Doe"))
       (id () "urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6")
       (entry ()
              (title () "Atom-Powered Robots Run Amok")
              (link ((rel . "alternate")
                     (href . "http://example.org/2003/12/13/atom03")))
              (id () "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a")
              (updated () "2003-12-13T18:30:02Z")
              (summary () "Some text."))))

Each XML element is converted to a list. The first item is a symbol that is the element’s name. The second item is an alist of attributes — cons pairs of symbols and strings. And the rest are its children, both string nodes and other elements. I’ve trimmed the extraneous string nodes from the sample s-expression.

A subtle detail is that xml-parse-region doesn’t just return the root element. It returns a list of elements, which always happens to be a single element list, which is the root element. I don’t know why this is, but I’ve built everything to assume this structure as input.
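
The single-element wrapper is easy to see on a tiny document. A minimal sketch, where my-parse-xml-string is a hypothetical helper around xml-parse-region:

```elisp
(require 'xml)

(defun my-parse-xml-string (string)
  "Parse STRING as XML and return the result of `xml-parse-region'."
  (with-temp-buffer
    (insert string)
    (xml-parse-region (point-min) (point-max))))

(my-parse-xml-string "<feed><title>Example Feed</title></feed>")
;; => ((feed nil (title nil "Example Feed")))
```

The root element is the car of the result, not the result itself.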

Elfeed strips all namespaces from both elements and attributes to make parsing simpler. As I said, it’s heuristic rather than strict, so namespaces are treated as noise.

A domain-specific language

Coding up Elfeed’s s-expression searches in straight Emacs Lisp would be tedious, error-prone, and difficult to understand. It’s a lot of loops, assoc, etc. So instead I invented a jQuery-like, CSS selector-like, domain-specific language (DSL) to express these searches concisely and clearly.

For example, all of the entry links are “selected” using this expression:

(feed entry link [rel "alternate"] :href)

Reading right-to-left, this matches every href attribute under every link element with the rel="alternate" attribute, under every entry element, under the feed root element. Symbols match element names, two-element vectors match elements with a particular attribute pair, and keywords (which must come last) narrow the selection to a specific attribute value.

Imagine hand-writing the code to navigate all these conditions for each piece of information that Elfeed requires. The RSS parser makes up to 16 such queries, and the Atom parser makes as many as 24. That would add up to a lot of tedious code.

The package (included with Elfeed) that executes this query is called “xml-query.” It comes in two flavors: xml-query and xml-query-all. The former returns just the first match, and the latter returns all matches. The naming parallels the querySelector() and querySelectorAll() DOM methods in JavaScript.

(let ((xml (elfeed-xml-parse-region)))
  (xml-query-all '(feed entry link [rel "alternate"] :href) xml))

;; => ("http://example.org/2003/12/13/atom03")

That date search I mentioned before looks roughly like this. The * matches text nodes within the selected element. It must come last just like the keyword matcher.

(or (xml-query '(feed entry published *))
    (xml-query '(feed entry updated *))
    (xml-query '(feed entry date *))
    (xml-query '(feed entry modified *))
    (xml-query '(feed entry issued *))
    (current-time))

Over the past three years, Elfeed has gained more and more of these selectors as it collects more and more information from feeds. Most recently, Elfeed collects author and category information provided by feeds. Each new query slows feed parsing a little bit, and it’s a perfect example of a program slowing down as it gains more features and capabilities.

But I don’t want Elfeed to slow down. I want it to get faster!

Optimizing the domain-specific language

Just like the primary jQuery function ($), both xml-query and xml-query-all are functions. The xml-query engine processes the selector from scratch on each invocation. It examines the first element, dispatches on its type/value to apply it to the input, and then recurses on the rest of the selector with the narrowed input, stopping when it hits the end of the list. That’s the way it’s worked from the start.
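
To make the interpreting approach concrete, here is a toy interpreter in the same spirit. It is not Elfeed’s actual xml-query code: my-query-all and my-query--matched are hypothetical names, and the behavior is inferred only from the selector rules described earlier (symbols match element names, two-element vectors filter on an attribute pair, a trailing keyword selects an attribute value, and a trailing * selects text nodes).

```elisp
(require 'cl-lib)

(defun my-query-all (selector nodes)
  "Return all matches for SELECTOR among NODES, a list of candidate nodes.
The first step of SELECTOR must be an element name or `*'."
  (let ((step (car selector)))
    (if (eq step '*)
        ;; `*' selects the text nodes among the candidates.
        (cl-remove-if-not #'stringp nodes)
      ;; A symbol keeps only the elements with that name.
      (my-query--matched
       (cdr selector)
       (cl-remove-if-not (lambda (n) (and (consp n) (eq (car n) step)))
                         nodes)))))

(defun my-query--matched (selector matched)
  "Apply the rest of SELECTOR to MATCHED, a list of already-matched elements."
  (let ((step (car selector)))
    (cond
     ((null selector) matched)
     ;; [attr "value"] filters the matched elements themselves.
     ((vectorp step)
      (my-query--matched
       (cdr selector)
       (cl-remove-if-not
        (lambda (n) (equal (cdr (assq (aref step 0) (cadr n)))
                           (aref step 1)))
        matched)))
     ;; :attr returns that attribute's value from each matched element.
     ((keywordp step)
      (let ((attr (intern (substring (symbol-name step) 1))))
        (delq nil (mapcar (lambda (n) (cdr (assq attr (cadr n))))
                          matched))))
     ;; Anything else applies to the children of the matched elements.
     (t (my-query-all selector
                      (apply #'append (mapcar #'cddr matched)))))))

(my-query-all '(feed entry link [rel "alternate"] :href)
              '((feed ()
                      (link ((href . "http://example.org/")))
                      (entry ()
                             (link ((rel . "alternate")
                                    (href . "http://example.org/2003/12/13/atom03")))))))
;; => ("http://example.org/2003/12/13/atom03")
```

Every call re-examines the selector step by step, including the type dispatch in the cond. That per-query work is the cost an interpreter of this kind pays.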

However, every selector argument in Elfeed is a static, quoted list. Unlike user-supplied filters, I know exactly what I want to execute ahead of time. It would be much better if the engine didn’t have to waste time reparsing the DSL for each query.

This is the classic split between interpreters and compilers. An interpreter reads input and immediately executes it, doing what the input tells it to do. A compiler reads input and, rather than execute it, produces output, usually in a simpler language, that, when evaluated, has the same effect as executing the input.

Rather than interpret the selector, it would be better to compile it into Elisp code, compile that into byte-code, and then have the Emacs byte-code virtual machine (VM) execute the query each time it’s needed. The extra work of parsing the DSL is performed ahead of time, the dispatch is entirely static, and the selector ultimately executes on a much faster engine (byte-code VM). This should be a lot faster!

So I wrote a function that accepts a selector expression and emits Elisp source that implements that selector: a compiler for my DSL. Having a readily-available syntax tree is one of the big advantages of homoiconicity, and this sort of function makes perfect sense in a lisp. For the external interface, this compiler function is called by a new pair of macros, xml-query* and xml-query-all*. These macros consume a static selector and expand into the compiled Elisp form of the selector.

To demonstrate, remember that link query from before? Here’s the macro version of that selection, but only returning the first match. Notice the selector is no longer quoted. This is because it’s consumed by the macro, not evaluated.

(xml-query* (feed entry link [rel "alternate"] :href) xml)

This will expand into the following code.

(catch 'done
  (dolist (v xml)
    (when (and (consp v) (eq (car v) 'feed))
      (dolist (v (cddr v))
        (when (and (consp v) (eq (car v) 'entry))
          (dolist (v (cddr v))
            (when (and (consp v) (eq (car v) 'link))
              (let ((value (cdr (assq 'rel (cadr v)))))
                (when (equal value "alternate")
                  (let ((v (cdr (assq 'href (cadr v)))))
                    (when v
                      (throw 'done v))))))))))))

As soon as it finds a match, it’s thrown to the top level and returned. Without the DSL, the expansion is essentially what would have to be written by hand. This is exactly the sort of leverage you should be getting from a compiler. It compiles to around 130 byte-code instructions.

The xml-query-all* form is nearly the same, but instead of a throw, it pushes the result into the return list. Only the prologue (the outermost part) and the epilogue (the innermost part) are different.
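
To make the difference concrete, here is a hand-written sketch of the shape such a collecting expansion could take. It is not actual macro output: the function name my-all-link-hrefs, the simplified selector (feed, entry, link, then the href attribute), and the final nreverse are all assumptions made for illustration.

```elisp
(defun my-all-link-hrefs (xml)
  "Return the href of every feed > entry > link node in XML, in document order."
  (let ((result ()))                    ; prologue: start with an empty list
    (dolist (v xml)
      (when (and (consp v) (eq (car v) 'feed))
        (dolist (v (cddr v))
          (when (and (consp v) (eq (car v) 'entry))
            (dolist (v (cddr v))
              (when (and (consp v) (eq (car v) 'link))
                (let ((v (cdr (assq 'href (cadr v)))))
                  (when v
                    ;; epilogue: one cons per match instead of a throw
                    (push v result)))))))))
    (nreverse result)))
```

The middle of the function is the same nested matching as in the single-result expansion; only the outermost and innermost forms change.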

Parsing feeds is a hot spot for Elfeed, so I wanted the compiler’s output to be as efficient as possible. I had three goals for this:

No extraneous code. It’s easy for the compiler to emit unnecessary code. The byte-code compiler might be able to eliminate some of it, but I don’t want to rely on that. Except for the identifiers, it should basically look like a human wrote it.
Avoid function calls. I don’t want to pay function call overhead, and, with some care, it’s easy to avoid. In the xml-query* expansion, the only function call is throw, which is unavoidable. The xml-query-all* version makes no function calls whatsoever. Notice that I used assq rather than assoc. First, it only needs to match symbols, so it should be faster. Second, assq has its own byte-code instruction (158) and assoc does not.
No unnecessary memory allocations. The xml-query* expansion makes no allocations. The xml-query-all* version only conses once per output, which is the minimum possible.

The end result is at least as optimal as hand-written code, but without the chance of human error (typos, fat fingering) and sourced from an easy-to-read DSL.

Performance

In my tests, the xml-query macros are a full order of magnitude faster than the functions. Yes, ten times faster! It’s an even bigger gain than I expected.

In the full picture, xml-query is only one part of parsing a feed. Measuring the time starting from raw XML text (as delivered by cURL) to a list of database entry objects, I’m seeing an overall 25% speedup with the macros. The remaining time is dominated by xml-parse-region, which is mostly out of my control.

With xml-query so computationally cheap, I don’t need to worry about using it more often. Compared to parsing XML text, it’s virtually free.

When it came time to validate my DSL compiler, I was really happy that Elfeed had a test suite. I essentially rewrote a core component from scratch, and passing all of the unit tests was a strong sign that it was correct. Many times that test suite has provided confidence in changes made both by me and by others.

I’ll end by describing another possible application: Apply this technique to regular expressions, such that static strings containing regular expressions are compiled into Elisp/byte-code via macro expansion. I wonder if situationally this would be faster than Emacs’ own regular expression engine.

Some Performance Advantages of Lexical Scope

2016-12-22T02:33:36Z

I recently had a discussion with Xah Lee about lexical scope in Emacs Lisp. The topic was why lexical-binding exists at a file-level when there was already lexical-let (from the cl package), prompted by my previous article on JIT byte-code compilation. The specific context is Emacs Lisp, but these concepts apply to language design in general.

Until Emacs 24.1 (June 2012), Elisp only had dynamically scoped variables — a feature, mostly by accident, common to old lisp dialects. While dynamic scope has some selective uses, it’s widely regarded as a mistake for local variables, and virtually no other languages have adopted it.

Way back in 1993, Dave Gillespie’s deviously clever lexical-let macro was committed to the cl package, providing a rudimentary form of opt-in lexical scope. The macro walks its body replacing local variable names with guaranteed-unique gensym names: the exact same technique used in macros to create “hygienic” bindings that aren’t visible to the macro body. It essentially “fakes” lexical scope within Elisp’s dynamic scope by preventing variable name collisions.
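
The gensym technique mentioned above is easiest to see in an ordinary macro. This is a minimal sketch with a hypothetical macro, my-swap. It is not how lexical-let itself is written, only the same renaming idea on a small scale.

```elisp
(defmacro my-swap (a b)
  "Exchange the values of the variables A and B."
  ;; `make-symbol' returns a fresh uninterned symbol, so the temporary
  ;; cannot collide with any variable named in the caller's code, not
  ;; even one that is itself called tmp.
  (let ((tmp (make-symbol "tmp")))
    `(let ((,tmp ,a))
       (setq ,a ,b
             ,b ,tmp))))
```

Had the macro used the plain symbol tmp for its temporary, a call like (my-swap tmp y) would silently fail to swap anything, under dynamic and lexical scope alike.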

For example, here’s one of the consequences of dynamic scope.

(defun inner ()
  (setq v :inner))

(defun outer ()
  (let ((v :outer))
    (inner)
    v))

(outer)
;; => :inner

The “local” variable v in outer is visible to its callee, inner, which can access and manipulate it. The meaning of the free variable v in inner depends entirely on the run-time call stack. It might be a global variable, or it might be a local variable for a caller, direct or indirect.

Using lexical-let deconflicts these names, giving the effect of lexical scope.

(defvar v)

(defun lexical-outer ()
  (lexical-let ((v :outer))
    (inner)
    v))

(lexical-outer)
;; => :outer

But there’s more to lexical scope than this. Closures only make sense in the context of lexical scope, and the most useful feature of lexical-let is that lambda expressions evaluate to closures. The macro implements this using a technique called closure conversion. Additional parameters are added to the original lambda function, one for each lexical variable (and not just each closed-over variable), and the whole thing is wrapped in another lambda function that invokes the original lambda function with the additional parameters filled with the closed-over variables: yes, the variables (i.e. symbols) themselves, not just their values (i.e. pass-by-reference). The last point means different closures can properly close over the same variables, and they can bind new values.

To roughly illustrate how this works, the first lambda expression below, which closes over the lexical variables x and y, would be converted into the latter by lexical-let. The #: is Elisp’s syntax for uninterned symbols. So #:x is a symbol x, but not the symbol x (see print-gensym).

;; Before conversion:
(lambda ()
  (+ x y))

;; After conversion:
(lambda (&rest args)
  (apply (lambda (x y)
           (+ (symbol-value x)
              (symbol-value y)))
         '#:x '#:y args))

I’ve said on multiple occasions that lexical-binding: t has significant advantages, both in performance and static analysis, and so it should be used for all future Elisp code. The only reason it’s not the default is because it breaks some old (badly written) code. However, lexical-let doesn’t realize any of these advantages! In fact, it has worse performance than straightforward dynamic scope with let.

New symbol objects are allocated and initialized (make-symbol) on each run-time evaluation, one per lexical variable.
Since it’s just faking it, lexical-let still uses dynamic bindings, which are more expensive than lexical bindings. It varies depending on the C compiler that built Emacs, but dynamic variable accesses (opcode varref) take around 30% longer than lexical variable accesses (opcode stack-ref). Assignment is far worse, where dynamic variable assignment (varset) takes 650% longer than lexical variable assignment (stack-set). How I measured all this is a topic for another article.
The “lexical” variables are accessed using symbol-value, a full function call, so they’re even slower than normal dynamic variables.
Because converted lambda expressions are constructed dynamically at run-time within the body of lexical-let, the resulting closure is only partially byte-compiled even if the code as a whole has been byte-compiled. In contrast, lexical-binding: t closures are fully compiled. How this works is worth its own article.
Converted lambda expressions include the additional internal function invocation, making them slower.

While lexical-let is clever, and occasionally useful prior to Emacs 24, it may come at a hefty performance cost if evaluated frequently. There’s no reason to use it anymore.

Constraints on code generation

Another reason to be wary of dynamic scope is that it puts needless constraints on the compiler, preventing a number of important optimization opportunities. For example, consider the following function, bar:

(defun bar ()
  (let ((x 1)
        (y 2))
    (foo)
    (+ x y)))

Byte-compile this function under dynamic scope (lexical-binding: nil) and disassemble it to see what it looks like.

(byte-compile #'bar)
(disassemble #'bar)

That pops up a buffer with the disassembly listing:

     constant  1
     constant  2
     varbind   y
     varbind   x
     constant  foo
     call      0
     discard
     varref    x
     varref    y
     plus
     unbind    2
     return

It’s 12 instructions, 5 of which deal with dynamic bindings. The byte-compiler doesn’t always produce optimal byte-code, but this just so happens to be nearly optimal byte-code. The discard (a very fast instruction) isn’t necessary, but otherwise no more compiler smarts can improve on this. Since the variables x and y are visible to foo, they must be bound before the call and loaded after the call. While generally this function will return 3, the compiler cannot assume so since it ultimately depends on the behavior of foo. Its hands are tied.

Compare this to the lexical scope version (lexical-binding: t):

     constant  1
     constant  2
     constant  foo
     call      0
     discard
     stack-ref 1
     stack-ref 1
     plus
     return

It’s only 9 instructions, none of which are expensive dynamic variable instructions. And this isn’t even close to the optimal byte-code. In fact, as of Emacs 25.1 the byte-compiler often doesn’t produce the optimal byte-code for lexical scope code and still needs some work. Despite not firing on all cylinders, lexical scope still manages to beat dynamic scope in performance benchmarks.

Here’s the optimal byte-code, should the byte-compiler become smarter someday:

     constant  foo
     call      0
     constant  3
     return

It’s down to 4 instructions due to computing the math operation at compile time. Emacs’ byte-compiler only has rudimentary constant folding, so it doesn’t notice that x and y are constants and misses this optimization. I speculate this is due to its roots compiling under dynamic scope. Since x and y are no longer exposed to foo, the compiler has the opportunity to optimize them out of existence. I haven’t measured it, but I would expect this to be significantly faster than the dynamic scope version of this function.

Optional dynamic scope

You might be thinking, “What if I really do want x and y to be dynamically bound for foo?” This is often useful. Many of Emacs’ own functions are designed to have certain variables dynamically bound around them. For example, the print family of functions use the global variable standard-output to determine where to send output by default.

(let ((standard-output (current-buffer)))
  (princ "value = ")
  (prin1 value))

Have no fear: With lexical-binding: t you can have your cake and eat it too. Variables declared with defvar, defconst, or defvaralias are marked as “special” with an internal bit flag (declared_special in C). When the compiler detects one of these variables (special-variable-p), it uses a classical dynamic binding.

Declaring both x and y as special restores the original semantics, reverting bar back to its old byte-code definition (next time it’s compiled, that is). But it would be poor form to mark x or y as special: You’d de-optimize all code (compiled after the declaration) anywhere in Emacs that uses these names. As a package author, only do this with the namespace-prefixed variables that belong to you.
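
Here is a small sketch of that advice. The package prefix my-pkg and all three names are hypothetical. The let in the last function creates a dynamic binding even in a file with lexical-binding: t, because defvar has marked the variable special.

```elisp
;; `defvar' marks the variable special, so `let' binds it dynamically
;; even under lexical-binding: t.
(defvar my-pkg-indent 2
  "Hypothetical option belonging to a hypothetical package my-pkg.")

(defun my-pkg-current-indent ()
  "Return whatever binding of `my-pkg-indent' is currently in effect."
  my-pkg-indent)

(defun my-pkg-with-wide-indent ()
  "Call `my-pkg-current-indent' with `my-pkg-indent' rebound to 8."
  (let ((my-pkg-indent 8))
    (my-pkg-current-indent)))
```

Calling (my-pkg-with-wide-indent) returns 8 because the callee sees the caller's binding, while (my-pkg-current-indent) on its own still returns 2. Because the name carries the package prefix, no unrelated code is de-optimized by the declaration.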

The only way to unmark a special variable is with the undocumented function internal-make-var-non-special. I expected makunbound to do this, but as of Emacs 25.1 it does not. This could possibly be considered a bug.

Accidental closures

I’ve said there are absolutely no advantages to lexical-binding: nil. It’s only the default for the sake of backwards-compatibility. However, there is one case where lexical-binding: t introduces a subtle issue that would otherwise not exist. Take this code for example (and never mind prin1-to-string for a moment):

;; -*- lexical-binding: t; -*-

(defun function-as-string ()
  (with-temp-buffer
    (prin1 (lambda () :example) (current-buffer))
    (buffer-string)))

This creates and serializes a closure, which is one of Elisp’s unique features. It doesn’t close over any variables, so it should be pretty simple. However, this function will only work correctly under lexical-binding: t when byte-compiled.

(function-as-string)
;; => "(closure ((temp-buffer . #<buffer  *temp*>) t) nil :example)"

The interpreter doesn’t analyze the closure, so just closes over everything. This includes the hidden variable temp-buffer created by the with-temp-buffer macro, resulting in an abstraction leak. Buffers aren’t readable, so this will signal an error if an attempt is made to read this function back into an s-expression. The byte-compiler fixes this by noticing temp-buffer isn’t actually closed over and so doesn’t include it in the closure, making it work correctly.

Under lexical-binding: nil it works correctly either way:

(function-as-string)
;; => "(lambda nil :example)"

This may seem contrived — it’s certainly unlikely — but it has come up in practice. Still, it’s no reason to avoid lexical-binding: t.

Use lexical scope in all new code

As I’ve said again and again, always use lexical-binding: t. Use dynamic variables judiciously. And lexical-let is no replacement. It has virtually none of the benefits, performs worse, and it only applies to let, not any of the other places bindings are created: function parameters, dotimes, dolist, and condition-case.

Faster Elfeed Search Through JIT Byte-code Compilation

2016-12-11T23:16:42Z

Today I pushed an update for Elfeed that doubles the speed of the search filter in the worst case. This is the user-entered expression that dynamically narrows the entry listing to a subset that meets certain criteria: published after a particular date, with/without particular tags, and matching/non-matching zero or more regular expressions. The filter is live, applied to the database as the expression is edited, so it’s important for usability that this search completes under a threshold that the user might notice.

The typical workaround for these kinds of interfaces is to make filtering/searching asynchronous. It’s possible to do this well, but it’s usually a terrible, broken design. If the user acts upon the asynchronous results — say, by typing the query and hitting enter to choose the current or expected top result — then the final behavior is non-deterministic, a race between the user’s typing speed and the asynchronous search. Elfeed will keep its synchronous live search.

For anyone not familiar with Elfeed, here’s a filter that finds all entries from within the past year tagged “youtube” (+youtube) that mention Linux or Linus (linu[xs]), but aren’t tagged “bsd” (-bsd), limited to the most recent 15 entries (#15):

@1-year-old +youtube linu[xs] -bsd #15

The database is primarily indexed over publication date, so filters on publication dates are the most efficient filters. Entries are visited in order starting with the most recently published, and the search can bail out early once it crosses the filter threshold. Time-oriented filters have been encouraged as the solution to keep the live search feeling lively.

Filtering Overview

The first step in filtering is parsing the filter text entered by the user. This string is broken into its components using the elfeed-search-parse-filter function. Date filter components are converted into a unix epoch interval, tags are interned into symbols, regular expressions are gathered up as strings, and the entry limit is parsed into a plain integer. Absence of a filter component is indicated by nil.

(elfeed-search-parse-filter "@1-year-old +youtube linu[xs] -bsd #15")
;; => (31557600.0 (youtube) (bsd) ("linu[xs]") nil 15)

Previously, the next step was to apply the elfeed-search-filter function with this structured filter representation to the database. Except for special early-bailout situations, it works left-to-right across the filter, checking each condition against each entry. This is analogous to an interpreter, with the filter being a program.

Thinking about it that way, what if the filter was instead compiled into an Emacs byte-code function and executed directly by the Emacs virtual machine? That’s what this latest update does.

Benchmarks

With six different filter components, the actual filtering routine is a bit too complicated for an article, so I’ll set up a simpler, but roughly equivalent, scenario. With a reasonable cut-off date, the filter was already sufficiently fast, so for benchmarking I’ll focus on the worst case: no early bailout opportunities. An entry will be just a list of tags (symbols), and the filter will have to test every entry.

My real-world Elfeed database currently has 46,772 entries with 36 distinct tags. For my benchmark I’ll round this up to a nice 100,000 entries, and use 26 distinct tags (A–Z), which has the nice alphabet property and more closely reflects the number of tags I still care about.

First, here’s make-random-entry to generate a random list of 1–5 tags (i.e. an entry). The state parameter is the random state, allowing for deterministic benchmarks on a randomly-generated database.

(cl-defun make-random-entry (&key state (min 1) (max 5))
  (cl-loop repeat (+ min (cl-random (1+ (- max min)) state))
           for letter = (+ ?A (cl-random 26 state))
           collect (intern (format "%c" letter))))

The database is just a big list of entries. In Elfeed this is actually an AVL tree. Without dates, the order doesn’t matter.

(cl-defun make-random-database (&key state (count 100000))
  (cl-loop repeat count collect (make-random-entry :state state)))

Here’s my old time macro. An important change I’ve made since years ago is to call garbage-collect before starting the clock, eliminating bad samples from unlucky garbage collection events. Depending on what you want to measure, it may even be worth disabling garbage collection during the measurement by setting gc-cons-threshold to a high value.

(defmacro measure-time (&rest body)
  (declare (indent defun))
  (let ((start (make-symbol "start")))
    `(progn
       ;; Collect garbage at run time, just before starting the clock.
       (garbage-collect)
       (let ((,start (float-time)))
         ,@body
         (- (float-time) ,start)))))
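
For the variation mentioned above, disabling garbage collection during the measurement, a minimal sketch could look like this. The macro name my-measure-time-no-gc is hypothetical, and using most-positive-fixnum as the "high value" is just one convenient choice.

```elisp
(defmacro my-measure-time-no-gc (&rest body)
  "Return the seconds taken by BODY, with GC effectively disabled meanwhile."
  (declare (indent defun))
  (let ((start (make-symbol "start")))
    ;; Binding `gc-cons-threshold' with `let' restores the old value
    ;; automatically once the measurement is over.
    `(let ((gc-cons-threshold most-positive-fixnum)
           (,start (float-time)))
       ,@body
       (- (float-time) ,start))))
```

This trades accuracy for memory: everything BODY allocates stays alive until the next collection after the macro returns, so it is best kept to short measurements.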

Finally, the benchmark harness. It uses a hard-coded seed to generate the same pseudo-random database. The test is run against a filter function, f, 100 times in search of the same 6 tags, and the timing results are averaged.

(cl-defun benchmark (f &optional (n 100) (tags '(A B C D E F)))
  (let* ((state (copy-sequence [cl-random-state-tag -1 30 267466518]))
         (db (make-random-database :state state)))
    (cl-loop repeat n
             sum (measure-time
                   (funcall f db tags))
             into total
             finally return (/ total (float n)))))

The baseline will be memq (test for membership using identity, eq). There are two lists of tags to compare: the list that is the entry, and the list from the filter. This requires a nested loop for each entry, one explicit (cl-loop) and one implicit (memq), both with early bailout.

(defun memq-count (db tags)
  (cl-loop for entry in db count
           (cl-loop for tag in tags
                    when (memq tag entry)
                    return t)))

Byte-code compiling everything and running the benchmark on my laptop I get:

(benchmark #'memq-count)
;; => 0.041 seconds

That’s actually not too bad. One of the advantages of this definition is that there are no function calls. The memq built-in function has its own opcode (62), and the rest of the definition is special forms and macros expanding to special forms (cl-loop). It’s exactly the thing I need to exploit to make filters faster.

As a sanity check, what would happen if I used member instead of memq? In theory it should be slower because it uses equal for tests instead of eq.

(defun member-count (db tags)
  (cl-loop for entry in db count
           (cl-loop for tag in tags
                    when (member tag entry)
                    return t)))

It’s only slightly slower because member, like many other built-ins, also has an opcode (157). It’s just a tiny bit more overhead.

(benchmark #'member-count)
;; => 0.047 seconds

To test function call overhead while still using the built-in (i.e. written in C) memq, I’ll alias it so that the byte-code compiler is forced to emit a function call.

(defalias 'memq-alias 'memq)

(defun memq-alias-count (db tags)
  (cl-loop for entry in db count
           (cl-loop for tag in tags
                    when (memq-alias tag entry)
                    return t)))

To verify that this is doing what I expect, I M-x disassemble the function and inspect the byte-code disassembly. Here’s a simple example.

(disassemble
 (byte-compile (lambda (list) (memq :foo list))))

When compiled under lexical scope (lexical-binding is true), here’s the disassembly. To understand what this means, see Emacs Byte-code Internals.

     constant  :foo
     stack-ref 1
     memq
     return

Notice the memq instruction. Try using memq-alias instead:

(disassemble
 (byte-compile (lambda (list) (memq-alias :foo list))))

Resulting in a function call:

     constant  memq-alias
     constant  :foo
     stack-ref 2
     call      2
     return

And the benchmark:

(benchmark #'memq-alias-count)
;; => 0.052 seconds

So the function call adds about 27% overhead. This means it would be a good idea to avoid calling functions in the filter if I can help it. I should rely on these special opcodes.
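
One way to check ahead of time whether a function has a dedicated opcode, without disassembling anything, is to look at the bookkeeping the byte-compiler keeps on the function's symbol. This sketch assumes the byte-opcode symbol property used inside bytecomp.el, which is an internal detail rather than a documented interface and may change between Emacs versions. The helper name my-opcode-of is hypothetical.

```elisp
(require 'bytecomp)

(defun my-opcode-of (function)
  "Return the dedicated byte-code opcode number for FUNCTION, or nil.
This reads the `byte-opcode' symbol property, an internal detail of
bytecomp.el, so treat the result as a hint rather than a guarantee."
  (let ((op (get function 'byte-opcode)))
    ;; The property names a constant such as `byte-memq' whose value
    ;; is the opcode number.
    (and op (boundp op) (symbol-value op))))
```

With this, (my-opcode-of 'memq) should return memq's opcode number, while an alias such as memq-alias carries no such property and yields nil, which matches the function call seen in its disassembly.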

Suppose memq was written in Emacs Lisp rather than C. How much would that hurt performance? My version of my-memq below isn’t quite the same since it returns t rather than the sublist, but it’s good enough for this purpose. (I’m using cl-loop because writing early bailout in plain Elisp without recursion is, in my opinion, ugly.)

(defun my-memq (needle haystack)
  (cl-loop for element in haystack
           when (eq needle element)
           return t))

(defun my-memq-count (db tags)
  (cl-loop for entry in db count
           (cl-loop for tag in tags
                    when (my-memq tag entry)
                    return t)))

And the benchmark:

(benchmark #'my-memq-count)
;; => 0.137 seconds

Oof! It’s more than 3 times slower than the opcode. This means I should use built-ins as much as possible in the filter.

Dynamic vs. lexical scope

There’s one last thing to watch out for. Everything so far has been compiled with lexical scope. You should really turn this on by default for all new code that you write. It has three important advantages:

It allows the compiler to catch more mistakes.
It eliminates a class of bugs related to dynamic scope: Local variables are exposed to manipulation by callees.
Lexical scope has better performance.

Here are all the benchmarks with the default dynamic scope:

(benchmark #'memq-count)
;; => 0.065 seconds

(benchmark #'member-count)
;; => 0.070 seconds

(benchmark #'memq-alias-count)
;; => 0.074 seconds

(benchmark #'my-memq-count)
;; => 0.256 seconds

It halves the performance in this benchmark, and for no benefit. Under dynamic scope, local variables use the varref opcode — a global variable lookup — instead of the stack-ref opcode — a simple array index.

(defun norm (a b)
  (* (- a b) (- a b)))

Under dynamic scope, this compiles to:

     varref    a
     varref    b
     diff
     varref    a
     varref    b
     diff
     mult
     return

And under lexical scope (notice the variable names disappear):

     stack-ref 1
     stack-ref 1
     diff
     stack-ref 2
     stack-ref 2
     diff
     mult
     return

JIT-compiled filters

So far I’ve been moving in the wrong direction, making things slower rather than faster. How can I make it faster than the straight memq version? By compiling the filter into byte-code.

I won’t write the byte-code directly, but instead generate Elisp code and use the byte-code compiler on it. This is safer, will work correctly in future versions of Emacs, and leverages the optimizations performed by the byte-compiler. This sort of thing recently got a bad rap on Emacs Horrors, but I was happy to see that this technique is already established.

(defun jit-count (db tags)
  (let* ((memq-list (cl-loop for tag in tags
                             collect `(memq ',tag entry)))
         (function `(lambda (db)
                      (cl-loop for entry in db
                               count (or ,@memq-list))))
         (compiled (byte-compile function)))
    (funcall compiled db)))

It dynamically builds the code as an s-expression, runs that through the byte-code compiler, executes it, and throws it away. It’s “just-in-time,” though compiling to byte-code and not native code. For the benchmark tags of (A B C D E F), this builds the following:

(lambda (db)
  (cl-loop for entry in db
           count (or (memq 'A entry)
                     (memq 'B entry)
                     (memq 'C entry)
                     (memq 'D entry)
                     (memq 'E entry)
                     (memq 'F entry))))

Due to its short-circuiting behavior, or is a special form, so this function is just special forms and memq in its opcode form. It’s as fast as Elisp can get.

Having s-expressions is a real strength for lisp, since the alternative (in, say, JavaScript) would be to assemble the function by concatenating code strings. By contrast, this looks a lot like a regular lisp macro. Invoking the byte-code compiler does add some overhead compared to the interpreted filter, but it’s insignificant.

How much faster is this?

(benchmark #'jit-count)
;; => 0.017 seconds

It’s more than twice as fast! The big gain here is through loop unrolling. The explicit loop over the filter’s tags has been unrolled into the or expression. That section of byte-code looks like this:

     constant  A
     stack-ref 1
     memq
     goto-if-not-nil-else-pop 1
     constant  B
     stack-ref 1
     memq
     goto-if-not-nil-else-pop 1
     constant  C
     stack-ref 1
     memq
     goto-if-not-nil-else-pop 1
     constant  D
     stack-ref 1
     memq
     goto-if-not-nil-else-pop 1
     constant  E
     stack-ref 1
     memq
     goto-if-not-nil-else-pop 1
     constant  F
     stack-ref 1
     memq
1    return

In Elfeed, not only does it unroll these loops, it completely eliminates the overhead for unused filter components. Comparing to this benchmark, I’m seeing roughly matching gains in Elfeed’s worst case. In Elfeed, I also bind lexical-binding around the byte-compile call to force lexical scope, since otherwise it just uses the buffer-local value (usually nil).
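
The following sketch combines both ideas on the simplified tags-only entries from the benchmark: clauses are only generated for filter components that are actually present, and lexical-binding is bound around the byte-compile call. It is not Elfeed's real filter compiler. The function name my-compile-tag-filter and its two-list interface are assumptions made for illustration.

```elisp
(require 'cl-lib)

(defun my-compile-tag-filter (must-have must-not-have)
  "Return a byte-compiled predicate for an entry (a list of tag symbols).
Empty tag lists contribute no code at all to the predicate."
  (let* ((clauses
          (append
           (cl-loop for tag in must-have
                    collect `(memq ',tag entry))
           (cl-loop for tag in must-not-have
                    collect `(not (memq ',tag entry)))))
         ;; Force lexical scope regardless of the current buffer's setting.
         (lexical-binding t))
    (byte-compile `(lambda (entry) (and ,@clauses t)))))
```

For example, (my-compile-tag-filter '(youtube) '(bsd)) returns a function that accepts (youtube linux) and rejects (youtube bsd). A filter with no excluded tags simply has no negative clauses in its byte-code, rather than a loop over an empty list.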

Filter compilation can be toggled on and off by setting elfeed-search-compile-filter. If you’re up to date, try out live filters with it both enabled and disabled. See if you can notice the difference.

Result summary

Here are the results in a table, all run with Emacs 24.4 on x86-64.

(ms)      memq      member    memq-alias my-memq   jit
lexical   41        47        52         137       17
dynamic   65        70        74         256       21

And the same benchmarks on Aarch64 (Emacs 24.5, ARM Cortex-A53), where I also occasionally use Elfeed, and where I have been very interested in improving performance.

(ms)      memq      member    memq-alias my-memq   jit
lexical   170       235       242        614       79
dynamic   274       340       345        1130      92

And here’s how you can run the benchmarks for yourself, perhaps with different parameters:

jit-bench.el

The header explains how to run the benchmark in batch mode:

$ emacs -Q -batch -f batch-byte-compile jit-bench.el
$ emacs -Q -batch -l jit-bench.elc -f benchmark-batch

Emacs, Dynamic Modules, and Joysticks

2016-11-05T04:01:51Z

Two months ago Emacs 25 was released and introduced a new dynamic module feature. Emacs can now load shared libraries built against Emacs’ module API, defined in emacs-module.h. What’s interesting about this API is that it doesn’t require linking against Emacs or any sort of library. Instead, at run time Emacs supplies the module’s initialization function with function pointers for the entire API.

As a demonstration, in this article I’ll build an Emacs joystick interface (Linux only) using a dynamic module. It will allow Emacs to read events from any joystick on the system. All the source code is here:

https://github.com/skeeto/joymacs

It includes a calibration interface (M-x joydemo) within Emacs.

Currently, Emacs’ emacs-module.h header is the entirety of the module documentation. It’s a bit thin and leaves ambiguities that require some reading of the Emacs source code. Even reading the source, it’s not clear which behaviors are a reliable part of the interface. For example, if there’s a pending non-local exit, it’s safe for a function to return NULL since the return value is never inspected (Emacs 25.1), but will this always be the case? While mistakes are unforgiving (a hard crash), the API is mostly intuitive and it’s been pretty easy to feel my way around it.

Update: Philipp Stephani has written thorough, reliable module documentation.

Dynamic Module Types

All Emacs values — integers, floats, cons cells, vectors, strings, etc. — are represented as the polymorphic, pointer-valued type, emacs_value. Despite being a pointer, NULL is not a valid value, as convenient as that would be. The API includes functions for creating and extracting the fundamental types: integers, floats, strings. Almost all other object types can only be accessed by making Lisp function calls to regular Emacs functions from the module.

Modules also introduce a brand new Emacs object type: a user pointer. These are non-readable, opaque pointer values returned by modules, typically representing a handle to some resource, be it a memory block, database connection, or a joystick. These objects include a finalizer function pointer — which, surprisingly, is not permitted to be NULL — and their lifetime is managed by Emacs’ garbage collector.

User pointers are a somewhat dangerous feature since there’s little to stop Emacs Lisp code from misusing them. A Lisp program can take a user pointer from one module and pass it to a function in a different module. Since it’s just a pointer, there’s no way to type check it. At best, a module could maintain a table of all its live pointers, checking all user pointer arguments against the table before dereferencing. But I don’t expect this to be normal practice.

Module Initialization

After loading the module through the platform’s mechanism, the first thing Emacs does is check for the symbol plugin_is_GPL_compatible. While tacky, this is not surprising given the culture around Emacs.

Next it calls emacs_module_init(), passing it the first function pointer. From this, the module can get a Lisp environment and start doing Emacs things, such as binding module functions to Lisp symbols.

Here’s a complete “Hello, world!” example:

#include "emacs-module.h"

int plugin_is_GPL_compatible;

int
emacs_module_init(struct emacs_runtime *ert)
{
    emacs_env *env = ert->get_environment(ert);
    emacs_value message = env->intern(env, "message");
    const char hi[] = "Hello, world!";
    emacs_value string = env->make_string(env, hi, sizeof(hi) - 1);
    env->funcall(env, message, 1, &string);
    return 0;
}

In a real module, it’s common to create function objects for native functions, then fetch the fset symbol and make a Lisp call on it to bind the newly-created function object to a name. You’ll see this in action later.
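
On the Lisp side, a compiled module is loaded with module-load (or with load and require, like any other library). The sketch below guards that call, since Emacs may be built without module support. The helper names are hypothetical, and so is the file name hello.so, which stands for wherever the "Hello, world!" example above was compiled to.

```elisp
(defun my-modules-supported-p ()
  "Return t if this Emacs was built with dynamic module support."
  (and (fboundp 'module-load)
       (boundp 'module-file-suffix)
       module-file-suffix
       t))

(defun my-load-hello-module (file)
  "Load the compiled module FILE, e.g. a hypothetical \"hello.so\"."
  (if (my-modules-supported-p)
      (module-load (expand-file-name file))
    (error "This Emacs was built without module support")))
```

Evaluating (my-load-hello-module "hello.so") in a module-enabled Emacs runs emacs_module_init, so the greeting should appear in the echo area as a side effect of loading.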

Joystick API

The joystick API will closely resemble Linux’s own joystick API, making for a fairly thin wrapper. It’s so thin that Emacs almost doesn’t even need a dynamic module. This is because, on Linux, joysticks are just files under /dev/input/. Want to see the input events on the first joystick? Just read /dev/input/js0. So Plan 9.

Emacs already knows how to read files, but these virtual files are a little too special for that. The header linux/joystick.h defines a struct js_event:

struct js_event {
    uint32_t time;  /* event timestamp in milliseconds */
    int16_t value;
    uint8_t type;
    uint8_t number; /* axis/button number */
};

The idea is to read from the joystick device into this structure. The first several reads return initialization events that define the axes and buttons of the joystick and their initial state. Further events are queued up for the file descriptor. This all means that the file can’t just be opened each time joystick input is needed. It has to be held open for the duration, and is typically configured non-blocking.

The Emacs package will be called joymacs and there will be three functions:

(joymacs-open N)
(joymacs-close JOYSTICK)
(joymacs-read JOYSTICK EVENT-VECTOR)

joymacs-open

The joymacs-open function will take an integer, opening the Nth joystick (/dev/input/jsN). It will create a file descriptor for the joystick device, returning it as a user pointer. Think of it as a sort of “joystick handle.” Now, it could instead return the file descriptor as an integer, but the user pointer has two significant benefits:

The resource will be garbage collected. If the caller loses track of a file descriptor returned as an integer, the joystick device will be held open until Emacs shuts down, using up one of Emacs’ file descriptors. By putting it in a user pointer, the garbage collector will have the module release the file descriptor if the user loses track of it.
It should be difficult for the user to make a dangerous call. Emacs Lisp can’t create user pointers (they only come from modules), and so the module is less likely to get passed the wrong thing. In the case of joymacs-close, the module will be calling close(2) on the argument. We definitely don’t want to make that system call on file descriptors owned by Emacs. Further, since user pointers are mutable, the module can ensure it doesn’t call close(2) twice.

Here’s the implementation for joymacs-open. I’ll go over each part in detail.

static emacs_value
joymacs_open(emacs_env *env, ptrdiff_t n, emacs_value *args, void *ptr)
{
    (void)ptr;
    (void)n;
    int id = env->extract_integer(env, args[0]);
    if (env->non_local_exit_check(env) != emacs_funcall_exit_return)
        return nil;
    char buf[64];
    int buflen = sprintf(buf, "/dev/input/js%d", id);
    int fd = open(buf, O_RDONLY | O_NONBLOCK);
    if (fd == -1) {
        emacs_value signal = env->intern(env, "file-error");
        emacs_value message = env->make_string(env, buf, buflen);
        env->non_local_exit_signal(env, signal, message);
        return nil;
    }
    return env->make_user_ptr(env, fin_close, (void *)(intptr_t)fd);
}

The C function name doesn’t matter to Emacs. It’s static because it doesn’t even matter if the function is visible to Emacs. Emacs will get the function pointer later as part of initialization.

This is the prototype for all functions callable by Emacs Lisp, regardless of their arity. It has four arguments:

It gets an environment, env, through which to call back into Emacs.
It gets n, the number of arguments. This is guaranteed to be the correct number of arguments, as specified later when creating the function object, so only variadic functions need to inspect this argument.
The Lisp arguments are passed as an array of values, args. There’s no type declaration when declaring a function object, so these may be of the wrong type. I’ll go over how to deal with this.
Finally, it gets an arbitrary pointer, supplied at function object creation time. This allows the module to create closures, but will usually be ignored.

The first thing the function does is extract its integer argument. This is actually an intmax_t, but I don’t think anyone has that many USB ports. An int will suffice.

    int id = env->extract_integer(env, args[0]);
    if (env->non_local_exit_check(env) != emacs_funcall_exit_return)
        return nil;

As for not underestimating fools, what if the user passed a value that isn’t an integer? Will the world come crashing down? Fortunately Emacs checks that in extract_integer and, if there’s a mismatch, sets a pending error signal in the environment. This is really great because checking types directly in the module is a real pain in the ass. So, before committing to anything further, such as opening a file, I check for this signal and bail out early if necessary. In Emacs 25.1 it’s safe to return NULL since the return value will be completely ignored, but I’d rather hedge my bets.

By the way, the nil here is a global variable set in initialization. You don’t just get that for free!

The next step is opening the joystick device, read-only and non-blocking. The non-blocking is vital because the module would otherwise hang Emacs later if there are no events (well, except for the read being quickly interrupted by a POSIX signal).

    char buf[64];
    int buflen = sprintf(buf, "/dev/input/js%d", id);
    int fd = open(buf, O_RDONLY | O_NONBLOCK);

If the joystick fails to open (e.g. it doesn’t exist, or the user lacks permission), manually set an error signal for a non-local exit. I chose the file-error signal and I’m just using the filename as the signal data.

    if (fd == -1) {
        emacs_value signal = env->intern(env, "file-error");
        emacs_value message = env->make_string(env, buf, buflen);
        env->non_local_exit_signal(env, signal, message);
        return nil;
    }

Otherwise create the user pointer. No need to allocate any memory; just stuff it in the pointer itself. If the user mistakenly passes it to another module, it will sure be in for a surprise when it tries to dereference it.

    return env->make_user_ptr(env, fin_close, (void *)(intptr_t)fd);

The fin_close() function is defined as:

static void
fin_close(void *fdptr)
{
    int fd = (intptr_t)fdptr;
    if (fd != -1)
        close(fd);
}

The garbage collector will call this function when the user pointer is lost. If the user closes it early with joymacs-close, that function will set the user pointer to -1, an invalid file descriptor, so that it doesn’t get closed a second time here.

joymacs-close

Here’s joymacs-close, which is a bit simpler.

static emacs_value
joymacs_close(emacs_env *env, ptrdiff_t n, emacs_value *args, void *ptr)
{
    (void)ptr;
    (void)n;
    int fd = (intptr_t)env->get_user_ptr(env, args[0]);
    if (env->non_local_exit_check(env) != emacs_funcall_exit_return)
        return nil;
    if (fd != -1) {
        close(fd);
        env->set_user_ptr(env, args[0], (void *)(intptr_t)-1);
    }
    return nil;
}

Again, it starts by extracting its argument, relying on Emacs to do the check:

    int fd = (intptr_t)env->get_user_ptr(env, args[0]);
    if (env->non_local_exit_check(env) != emacs_funcall_exit_return)
        return nil;

If the user pointer hasn’t been closed yet, then close it and strip out the file descriptor to prevent further closes.

    if (fd != -1) {
        close(fd);
        env->set_user_ptr(env, args[0], (void *)(intptr_t)-1);
    }
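The close-once logic can be tried outside of Emacs. The sketch below is plain C with hypothetical names (fd_to_ptr, ptr_to_fd, handle_close), and a bare void * slot stands in for the user pointer. It only illustrates the technique of packing a descriptor into the pointer value and using -1 as the "already closed" marker; it is not the module code itself.

```c
#include <stdint.h>
#include <unistd.h>

/* Pack a file descriptor into a pointer-sized value. */
static void *
fd_to_ptr(int fd)
{
    return (void *)(intptr_t)fd;
}

/* Recover the file descriptor from the pointer-sized value. */
static int
ptr_to_fd(void *p)
{
    return (int)(intptr_t)p;
}

/* Hypothetical stand-in for a user pointer: one mutable pointer slot.
 * Closes the descriptor at most once. Returns 1 if this call closed
 * it, or 0 if the slot was already marked closed. */
static int
handle_close(void **slot)
{
    int fd = ptr_to_fd(*slot);
    if (fd == -1)
        return 0;
    close(fd);
    *slot = fd_to_ptr(-1);  /* mark closed so later calls do nothing */
    return 1;
}
```

Calling handle_close twice on the same slot closes the descriptor on the first call and does nothing on the second, which is the same guarantee the finalizer and joymacs-close rely on between them.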

joymacs-read

The joymacs-read function is doing something a little unusual for an Emacs Lisp function. It takes two arguments: the joystick handle and a 5-element vector. Instead of returning the event in some representation, it fills the vector with the event details. There are two reasons for this:

The API has no function for creating vectors … though the module could intern the make-vector symbol and call it to create a vector.
The idiom for event pumps is for the caller to supply a buffer to the pump. This has better performance by avoiding lots of unnecessary allocations, especially since events tend to be message-like objects with a short, well-defined extent.

Here’s the full definition:

static emacs_value
joymacs_read(emacs_env *env, ptrdiff_t n, emacs_value *args, void *ptr)
{
    (void)n;
    (void)ptr;
    int fd = (intptr_t)env->get_user_ptr(env, args[0]);
    if (env->non_local_exit_check(env) != emacs_funcall_exit_return)
        return nil;
    struct js_event e;
    int r = read(fd, &e, sizeof(e));
    if (r == -1 && errno == EAGAIN) {
        /* No more events. */
        return nil;
    } else if (r == -1) {
        /* An actual read error (joystick unplugged, etc.). */
        emacs_value signal = env->intern(env, "file-error");
        const char *error = strerror(errno);
        size_t len = strlen(error);
        emacs_value message = env->make_string(env, error, len);
        env->non_local_exit_signal(env, signal, message);
        return nil;
    } else {
        /* Fill out event vector. */
        emacs_value v = args[1];
        emacs_value type = e.type & JS_EVENT_BUTTON ? button : axis;
        emacs_value value;
        if (type == button)
            value = e.value ? t : nil;
        else
            value = env->make_float(env, e.value / (double)INT16_MAX);
        env->vec_set(env, v, 0, env->make_integer(env, e.time));
        env->vec_set(env, v, 1, type);
        env->vec_set(env, v, 2, value);
        env->vec_set(env, v, 3, env->make_integer(env, e.number));
        env->vec_set(env, v, 4, e.type & JS_EVENT_INIT ? t : nil);
        return args[1];
    }
}

As before, extract the first argument and check for a signal. Then call read(2) to get an event. If the read fails with EAGAIN, it’s not a real failure. There are just no more events, so return nil.

    struct js_event e;
    int r = read(fd, &e, sizeof(e));
    if (r == -1 && errno == EAGAIN) {
        /* No more events. */
        return nil;
    }
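The EAGAIN behavior is easy to observe without a joystick, since any non-blocking descriptor with nothing queued behaves the same way. Here is a small sketch using a pipe instead of /dev/input/jsN. The function names and result codes are hypothetical, chosen just for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Result codes for poll_byte(); hypothetical names for this sketch. */
enum { POLL_ERROR = -1, POLL_EMPTY = 0, POLL_GOT_DATA = 1 };

/* Put a descriptor into non-blocking mode. Returns 0 on success. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read one byte without blocking. */
static int
poll_byte(int fd, char *out)
{
    ssize_t r = read(fd, out, 1);
    if (r == 1)
        return POLL_GOT_DATA;
    if (r == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return POLL_EMPTY;  /* nothing queued: not a real failure */
    return POLL_ERROR;      /* a real error, or end of file (r == 0) */
}
```

With an empty pipe, poll_byte reports POLL_EMPTY immediately instead of hanging. After one byte is written it reports POLL_GOT_DATA once, then POLL_EMPTY again.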

If the read failed with something else — perhaps the joystick was unplugged — signal an error. The strerror(3) string is used for the signal data.

    if (r == -1) {
        /* An actual read error (joystick unplugged, etc.). */
        emacs_value signal = env->intern(env, "file-error");
        const char *error = strerror(errno);
        emacs_value message = env->make_string(env, error, strlen(error));
        env->non_local_exit_signal(env, signal, message);
        return nil;
    }

Otherwise fill out the event vector. If the second argument isn’t a vector, or if it’s too short, the signal will automatically get raised by Emacs. The module can keep plowing through the vec_set() calls safely since it’s not committing to anything.

        /* Fill out event vector. */
        emacs_value v = args[1];
        emacs_value type = e.type & JS_EVENT_BUTTON ? button : axis;
        emacs_value value;
        if (type == button)
            value = e.value ? t : nil;
        else
            value = env->make_float(env, e.value / (double)INT16_MAX);
        env->vec_set(env, v, 0, env->make_integer(env, e.time));
        env->vec_set(env, v, 1, type);
        env->vec_set(env, v, 2, value);
        env->vec_set(env, v, 3, env->make_integer(env, e.number));
        env->vec_set(env, v, 4, e.type & JS_EVENT_INIT ? t : nil);
        return args[1];

The Linux event struct has four fields and the function fills out five values of the vector. This is because the type field has a bit flag indicating initialization events. This is split out into an extra t/nil value. It also normalizes axis values and converts button values into t/nil, which makes more sense for Emacs Lisp. The event itself is returned since it’s a truthy value and it’s convenient for the caller.
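To make the split of the type field concrete, here is a sketch in plain C that separates the event kind from the initialization flag. The flag values are the ones defined in linux/joystick.h, repeated under hypothetical DEMO_ names so the example compiles on any platform. The struct and function names are likewise made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as in linux/joystick.h, copied under DEMO_ names. */
#define DEMO_JS_EVENT_BUTTON 0x01
#define DEMO_JS_EVENT_AXIS   0x02
#define DEMO_JS_EVENT_INIT   0x80

/* Hypothetical plain-C view of the two facts packed into `type`. */
struct demo_event_kind {
    bool is_button;  /* true for a button event, false for an axis */
    bool is_init;    /* true for the synthetic initial-state events */
};

/* Split the type byte into its kind and its initialization flag. */
static struct demo_event_kind
demo_classify(uint8_t type)
{
    struct demo_event_kind k;
    k.is_button = (type & DEMO_JS_EVENT_BUTTON) != 0;
    k.is_init   = (type & DEMO_JS_EVENT_INIT) != 0;
    return k;
}
```

For example, a type of 0x82 is an axis event reported during initialization, so is_button is false and is_init is true.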

The astute programmer might notice that the negative side of the axis could go just below -1.0, since INT16_MIN has one extra value over INT16_MAX (two’s complement). It doesn’t seem to be documented, but the joystick drivers I’ve seen never exactly return INT16_MIN, so this is in fact the correct way to normalize it.
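The arithmetic behind that remark is small enough to check directly. A sketch, using a hypothetical normalize_axis helper that performs the same division as joymacs_read:

```c
#include <stdint.h>

/* Same normalization as in joymacs_read: divide by INT16_MAX. */
static double
normalize_axis(int16_t value)
{
    return value / (double)INT16_MAX;
}
```

Here 32767 maps to exactly 1.0 and -32767 to exactly -1.0, while INT16_MIN (-32768) would map to -32768/32767, roughly -1.00003. That is the one input that falls outside the [-1.0, 1.0] range.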

Initialization

Update 2021: In a previous version of this article, I talked about interning symbols during initialization so that they do not need to be re-interned each time the module is called. This no longer works, and it was probably never intended to work in the first place. The lesson is simple: Do not reuse Emacs objects between module calls.

First grab the fset symbol since this function will be needed to bind names to the module’s functions.

    emacs_value fset = env->intern(env, "fset");

Using fset, bind the functions. The second and third arguments to make_function are the minimum and maximum number of arguments, which may look familiar. The last argument is that closure pointer I mentioned at the beginning.

    emacs_value args[2];
    args[0] = env->intern(env, "joymacs-open");
    args[1] = env->make_function(env, 1, 1, joymacs_open, doc, 0);
    env->funcall(env, fset, 2, args);

If the module is to be loaded with require like any other package, it needs to provide: (provide 'joymacs).

    emacs_value provide = env->intern(env, "provide");
    emacs_value joymacs = env->intern(env, "joymacs");
    env->funcall(env, provide, 1, &joymacs);

And that’s it!

The source repository now includes a port to Windows (XInput). If you’re on Linux or Windows, have Emacs 25 with modules enabled, and a joystick is plugged in, then make run in the repository should bring up Emacs running a joystick calibration demonstration. The module can’t poke at Emacs when events are ready, so instead there’s a timer that polls the module for events.

I’d like to someday see an Emacs Lisp game well-suited for a joystick.

Elfeed, cURL, and You

2016-06-16T18:22:16Z

This morning I pushed out an important update to Elfeed, my web feed reader for Emacs. The update should be available in MELPA by the time you read this. Elfeed now has support for fetching feeds using cURL through a curl inferior process. You’ll need the program in your PATH or configured through elfeed-curl-program-name.

I’ve been using it for a couple of days now, but, while I work out the remaining kinks, it’s disabled by default. So in addition to having cURL installed, you’ll need to set elfeed-use-curl to non-nil. Sometime soon it will be enabled by default whenever cURL is available. The original url-retrieve fetcher will remain in place for the time being. However, cURL may become a requirement someday.

Fetching with a curl inferior process has some huge advantages.

It’s much faster

The most obvious change is that you should experience a huge speedup on updates and better responsiveness during updates after the first cURL run. There are two important reasons:

Asynchronous DNS and TCP: Emacs 24 and earlier performs DNS queries synchronously even for asynchronous network processes. This is being fixed on some platforms (including Linux) in Emacs 25, but now we don’t have to wait.

On Windows it’s even worse: the TCP connection is also established synchronously. This is especially bad when fetching relatively small items such as feeds, because the DNS look-up and TCP handshake dominate the overall fetch time. It essentially makes the whole process synchronous.

Conditional GET: HTTP has two mechanisms to avoid transmitting information that a client has previously fetched. One is the Last-Modified header delivered by the server with the content. When querying again later, the client echoes the date back like a token in the If-Modified-Since header.

The second is the “entity tag,” an arbitrary server-selected token associated with each version of the content. The server delivers it along with the content in the ETag header, and the client hands it back later in the If-None-Match header, sort of like a cookie.

This is highly valuable for feeds because, unless the feed is particularly active, most of the time the feed hasn’t been updated since the last query. This avoids sending anything other than a handful of headers each way. In Elfeed’s case, it means it doesn’t have to parse the same XML over and over again.

Both of these being outside of cURL’s scope, Elfeed has to manage conditional GET itself. I had no control over the HTTP headers until now, so I couldn’t take advantage of it. Emacs’ url-retrieve function allows for sending custom headers through dynamically binding url-request-extra-headers, but this isn’t available when calling url-queue-retrieve since the request itself is created asynchronously.

Both the ETag and Last-Modified values are stored in the database and persist across sessions. This is the reason the full speedup isn’t realized until the second fetch. The initial cURL fetch doesn’t have these values.
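To show what the client side of a conditional GET amounts to, here is a sketch in C that turns the two stored values into request headers. This is not Elfeed’s code (Elfeed is Emacs Lisp and hands its headers to curl). The function names are hypothetical and only the header names come from HTTP. When the content is unchanged, the server is expected to answer with a bodiless 304 Not Modified.

```c
#include <stdio.h>
#include <string.h>

/* Append one "Name: value\r\n" line to the string already in buf.
 * Returns 0 on success, or -1 if it does not fit in len bytes. */
static int
append_header(char *buf, size_t len, const char *name, const char *value)
{
    size_t used = strlen(buf);
    int n = snprintf(buf + used, len - used, "%s: %s\r\n", name, value);
    return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}

/* Build conditional request headers from previously stored values.
 * Either value may be NULL if the server never supplied it. Returns
 * the length of the result, or -1 if buf is too small. */
static int
build_conditional_headers(char *buf, size_t len,
                          const char *etag, const char *last_modified)
{
    if (len == 0)
        return -1;
    buf[0] = '\0';
    if (etag && append_header(buf, len, "If-None-Match", etag) != 0)
        return -1;
    if (last_modified &&
        append_header(buf, len, "If-Modified-Since", last_modified) != 0)
        return -1;
    return (int)strlen(buf);
}
```

On the very first fetch both values are NULL and no conditional headers are produced, which matches the observation that the speedup only shows up from the second fetch onward.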

Fewer bugs

As mentioned previously, Emacs has a built-in URL retrieval library called url. The central function is url-retrieve which asynchronously fetches the content at an arbitrary URL (usually HTTP) and delivers the buffer and status to a callback when it’s ready. There’s also a queue front-end for it, url-queue-retrieve which limits the number of parallel connections. Elfeed hands this function a pile of feed URLs all at once and it fetches them N at a time.

Unfortunately both these functions are incredibly buggy. It’s been a thorn in my side for years.

Here’s what the interface looks like for both:

(url-retrieve URL CALLBACK &optional CBARGS SILENT INHIBIT-COOKIES)

It takes a URL and a callback. Seeing this, the sane, unsurprising expectation is the callback will be invoked exactly once for each time url-retrieve was called. In any case where the request fails, it should report it through the callback. This is not the case. The callback may be invoked any number of times, including zero.

In this example, suppose you have a webserver that will return an HTTP 404 for a requested URL. Below, I fire off 10 asynchronous requests in a row.

(defvar results ())
(dotimes (i 10)
  (url-retrieve "http://127.0.0.1:8080/404"
                (lambda (status) (push (cons i status) results))))

What would you guess is the length of results? It’s initially 0 before any requests complete and over time (a very short time) I would expect this to top out at 10. On Emacs 24, here’s the real answer:

(length results)
;; => 46

The same error is reported multiple times to the callback. At least the pattern is obvious.

(cl-count 0 results :key #'car)
;; => 9
(cl-count 1 results :key #'car)
;; => 8
(cl-count 2 results :key #'car)
;; => 7

(cl-count 9 results :key #'car)
;; => 1

Here’s another one, this time to the non-existent foo.example. The DNS query should never resolve.

(setf results ())
(dotimes (i 10)
  (url-retrieve "http://foo.example/"
                (lambda (status) (push (cons i status) results))))

What’s the length of results? This time it’s zero. Remember how DNS is synchronous? Because of this, DNS failures are reported synchronously as a signaled error. This gets a lot worse with url-queue-retrieve. Since the request is put off until later, DNS doesn’t fail until later, and you get neither a callback nor an error signal. This also puts the queue in a bad state and necessitated elfeed-unjam to manually clear it. This one should get fixed in Emacs 25 when DNS is asynchronous.

This last one assumes you don’t have anything listening on port 57432 (pulled out of nowhere) so that the connection fails.

(setf results ())
(dotimes (i 10)
  (url-retrieve "http://127.0.0.1:57432/"
                (lambda (status) (push (cons i status) results))))

On Linux, we finally get the sane result of 10. However, on Windows, it’s zero. The synchronous TCP connection will fail, signaling an error just like DNS failures. Not only is it broken, it’s broken in different ways on different platforms.

There are many more cases of callback weirdness which depend on the connection and HTTP session being in various states when things go awry. These were just the easiest to demonstrate. By using cURL, I get to bypass this mess.

No more GnuTLS issues

At compile time, Emacs can optionally be linked against GnuTLS, giving it robust TLS support so long as the shared library is available. url-retrieve uses this for fetching HTTPS content. Unfortunately, this library is noisy and will occasionally echo non-informational messages in the minibuffer and in *Messages* that cannot be suppressed.

When not linked against GnuTLS, Emacs will instead run the GnuTLS command line program as an inferior process, just like Elfeed now does with cURL. Unfortunately this interface is very slow and frequently fails, basically preventing Elfeed from fetching HTTPS feeds. I suspect it’s in part due to an improper coding-system-for-read.

cURL handles all the TLS negotiation itself, so both these problems disappear. The compile-time configuration doesn’t matter.

Windows is now supported

Emacs’ Windows networking code is so unstable, even in Emacs 25, that I couldn’t make any practical use of Elfeed on that platform. Even the Cygwin emacs-w32 version couldn’t cut it. It hard crashes Emacs every time I’ve tried to fetch feeds. Fortunately the inferior process code is a whole lot more stable, meaning fetching with cURL works great. As of today, you can now use Elfeed on Windows. The biggest obstacle is getting cURL installed and configured.

Interface changes

With cURL, obviously the values of url-queue-timeout and url-queue-parallel-processes no longer have any meaning to Elfeed. If you set these for yourself, you should instead call the functions elfeed-set-timeout and elfeed-set-max-connections, which will do the appropriate thing depending on the value of elfeed-use-curl. Each also comes with a getter so you can query the current value.

The deprecated elfeed-max-connections has been removed.

Feed objects now have meta tags :etag, :last-modified, and :canonical-url. The last of these can identify feeds that have been moved, though it needs a real UI.

See any bugs?

If you use Elfeed, grab the current update and give the cURL fetcher a shot. Please open a ticket if you find problems. Be sure to report your Emacs version, operating system, and cURL version.

As of this writing there’s just one thing missing compared to url-queue: connection reuse. cURL supports it, so I just need to code it up.

RSA Signatures in Emacs Lisp

2015-10-30T22:35:13Z

Emacs comes with a wonderful arbitrary-precision computer algebra system called calc. I’ve discussed it previously and continue to use it on a daily basis. That’s right, people, Emacs can do calculus. Like everything Emacs, it’s programmable and extensible from Emacs Lisp. In this article, I’m going to implement the RSA public-key cryptosystem in Emacs Lisp using calc.

If you want to dive right in first, here’s the repository:

https://github.com/skeeto/emacs-rsa

This is only a toy implementation and not really intended for serious cryptographic work. It’s also far too slow when using keys of reasonable length.

Evaluation with calc

The calc package is particularly useful when considering Emacs’ limited integer type. Emacs uses a tagged integer scheme where integers are embedded within pointers. It’s a lot faster than the alternative (individually-allocated integer objects), but it means they’re always a few bits short of the platform’s native integer type.

calc has a large API, but the user-friendly porcelain for it is the under-documented calc-eval function. It evaluates an expression string with format-like argument substitutions ($n).

(calc-eval "2^16 - 1")
;; => "65535"

(calc-eval "2^$1 - 1" nil 128)
;; => "340282366920938463463374607431768211455"

Notice it returns strings, which is one of the ways calc represents arbitrary precision numbers. For arguments, it accepts regular Elisp numbers and strings just like this function returns. The implicit radix is 10. To explicitly set the radix, prefix the number with the radix and #. This is the same as in the user interface of calc. For example:

(calc-eval "16#deadbeef")
;; => "3735928559"

The second argument (optional) to calc-eval adjusts its behavior. Given nil, it simply evaluates the string and returns the result. The manual documents the different options, but the only other relevant option for RSA is the symbol pred, which asks it to return a boolean “predicate” result.

(calc-eval "$1 < $2" 'pred "4000" "5000")
;; => t

Generating primes

RSA is founded on the difficulty of factoring large composites with large factors. Generating an RSA keypair starts with generating two prime numbers, p and q, and using these primes to compute two mathematically related composite numbers.

calc has a function calc-next-prime for finding the next prime number following any arbitrary number. It uses a probabilistic primality test, the Miller-Rabin primality test, to efficiently test large integers. It increments the input until it finds a result that passes enough iterations of the primality test.

(calc-eval "nextprime($1)" nil "100000000000000000")
;; => "100000000000000003"
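The increment-until-prime loop can be sketched in C for small numbers. Note the swap: this version uses deterministic trial division rather than Miller-Rabin, so it only illustrates the search loop and is far too slow for key-sized integers. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Deterministic trial division; practical for small inputs only. */
static bool
is_prime(uint64_t n)
{
    if (n < 2)
        return false;
    if (n % 2 == 0)
        return n == 2;
    for (uint64_t d = 3; d <= n / d; d += 2)
        if (n % d == 0)
            return false;
    return true;
}

/* Smallest prime strictly greater than n, found by incrementing. */
static uint64_t
next_prime(uint64_t n)
{
    do {
        n++;
    } while (!is_prime(n));
    return n;
}
```

Replacing is_prime with a probabilistic test is what makes the same loop usable on numbers hundreds of bits long.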

So to generate a random n-bit prime, first generate a random n-bit number and then increment it until a prime number is found.

;; Generate a 128-bit prime, 10 iterations (0.000084% error rate)
(calc-eval "nextprime(random(2^$1), 10)" nil 128)
;; => "111618319598394878409654851283959105123"

Unfortunately calc’s random function is based on Emacs’ random function, which is entirely unsuitable for cryptography. In the real implementation I read n bits from /dev/urandom to generate an n-bit number.

(with-temp-buffer
  (set-buffer-multibyte nil)
  (call-process "head" "/dev/urandom" t nil "-c" (format "%d" (/ bits 8)))
  (let ((f (apply-partially #'format "%02x")))
    (concat "16#" (mapconcat f (buffer-string) ""))))

(Note: /dev/urandom is the right choice. There’s no reason to use /dev/random for generating keys.)

Computing e and d

From here the code just follows along from the Wikipedia article. After generating the primes p and q, two composites are computed, n = p * q and i = (p - 1) * (q - 1). Lacking any reason to do otherwise, I chose 65,537 for the public exponent e.

The function rsa--inverse is just a straight Emacs Lisp + calc implementation of the extended Euclidean algorithm from the Wikipedia article pseudocode, computing d ≡ e^-1 (mod i). It’s not much use sharing it here, so take a look at the repository if you’re curious.

(defun rsa-generate-keypair (bits)
  "Generate a fresh RSA keypair plist of BITS length."
  (let* ((p (rsa-generate-prime (+ 1 (/ bits 2))))
         (q (rsa-generate-prime (+ 1 (/ bits 2))))
         (n (calc-eval "$1 * $2" nil p q))
         (i (calc-eval "($1 - 1) * ($2 - 1)" nil p q))
         (e (calc-eval "2^16+1"))
         (d (rsa--inverse e i)))
    `(:public  (:n ,n :e ,e) :private (:n ,n :d ,d))))

The public key is n and e and the private key is n and d. From here we can compute and verify cryptographic signatures.
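For a feel of what the modular inverse step does, here is a sketch of the extended Euclidean algorithm in C on machine integers. It is not the repository’s rsa--inverse, which works on calc’s arbitrary-precision numbers. The mod_inverse name and the convention of returning 0 for "no inverse" are assumptions made for this example, which expects a >= 0 and m > 1.

```c
#include <stdint.h>

/* Modular inverse of a modulo m via the extended Euclidean algorithm.
 * Returns 0 if no inverse exists, i.e. gcd(a, m) != 1. */
static int64_t
mod_inverse(int64_t a, int64_t m)
{
    int64_t t = 0, new_t = 1;
    int64_t r = m, new_r = a % m;
    while (new_r != 0) {
        int64_t q = r / new_r;
        int64_t tmp;
        tmp = t - q * new_t; t = new_t; new_t = tmp;
        tmp = r - q * new_r; r = new_r; new_r = tmp;
    }
    if (r != 1)
        return 0;  /* a and m share a factor: not invertible */
    if (t < 0)
        t += m;    /* normalize the result into [0, m) */
    return t;
}
```

With the textbook toy primes p = 61 and q = 53, i = 60 * 52 = 3120, and choosing e = 17 gives d = mod_inverse(17, 3120) = 2753, since 17 * 2753 = 46801 = 15 * 3120 + 1.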

Signatures

To compute signature s of an integer m (where m < n), compute s ≡ m^d (mod n). I chose the right-to-left binary method, again straight from the Wikipedia pseudocode (lazy!). I’ll share this one since it’s short. The backslash denotes integer division.

(defun rsa--mod-pow (base exponent modulus)
  (let ((result 1))
    (setf base (calc-eval "$1 % $2" nil base modulus))
    (while (calc-eval "$1 > 0" 'pred exponent)
      (when (calc-eval "$1 % 2 == 1" 'pred exponent)
        (setf result (calc-eval "($1 * $2) % $3" nil result base modulus)))
      (setf exponent (calc-eval "$1 \\ 2" nil exponent)
            base (calc-eval "($1 * $1) % $2" nil base modulus)))
    result))

Verifying the signature is the same process, but with the public key’s e: m ≡ s^e (mod n). If the signature is valid, m will be recovered. In theory, only someone who knows d can feasibly compute s from m. If n is small enough to factor, revealing p and q, then d can be feasibly recomputed from the public key. So mind your Ps and Qs.
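The same right-to-left binary method fits in a few lines of C when the numbers are small. This sketch assumes a nonzero modulus below 2^32 so that the 64-bit intermediate products cannot overflow. Real key sizes need arbitrary precision, which is what calc provides in the Emacs Lisp version.

```c
#include <stdint.h>

/* Right-to-left binary modular exponentiation. Assumes modulus is
 * nonzero and below 2^32 so the 64-bit products cannot overflow. */
static uint64_t
mod_pow(uint64_t base, uint64_t exponent, uint64_t modulus)
{
    uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1)  /* low bit set: fold in the current power */
            result = (result * base) % modulus;
        exponent >>= 1;
        base = (base * base) % modulus;
    }
    return result;
}
```

Using the toy key n = 3233, e = 17, d = 2753 (from p = 61, q = 53), signing m = 65 is s = mod_pow(65, 2753, 3233), and verifying it with mod_pow(s, 17, 3233) recovers 65.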

So that leaves one problem: generally users want to sign strings and files and such, not integers. A hash function is used to reduce an arbitrary quantity of data into an integer suitable for signing. Emacs comes with a bunch of them, accessible through secure-hash. It hashes strings and buffers.

(secure-hash 'sha224 "Hello, world!")
;; => "8552d8b7a7dc5476cb9e25dee69a8091290764b7f2a64fe6e78e9568"

Since the result is hexadecimal, just prefix 16# to turn it into a calc integer.

Here’s the signature and verification functions. Any string or buffer can be signed.

(defun rsa-sign (private-key object)
  (let ((n (plist-get private-key :n))
        (d (plist-get private-key :d))
        (hash (concat "16#" (secure-hash 'sha384 object))))
    ;; truncate hash such that hash < n
    (while (calc-eval "$1 > $2" 'pred hash n)
      (setf hash (calc-eval "$1 \\ 2" nil hash)))
    (rsa--mod-pow hash d n)))

(defun rsa-verify (public-key object sig)
  (let ((n (plist-get public-key :n))
        (e (plist-get public-key :e))
        (hash (concat "16#" (secure-hash 'sha384 object))))
    ;; truncate hash such that hash < n
    (while (calc-eval "$1 > $2" 'pred hash n)
      (setf hash (calc-eval "$1 \\ 2" nil hash)))
    (let* ((result (rsa--mod-pow sig e n)))
      (calc-eval "$1 == $2" 'pred result hash))))

Note the hash truncation step. If this is actually necessary, then your n is very easy to factor! It’s in there since this is just a toy and I want it to work with small keys.

Putting it all together

Here’s the whole thing in action with an extremely small, 128-bit key.

(setf message "hello, world!")

(setf keypair (rsa-generate-keypair 128))
;; => (:public  (:n "74924929503799951536367992905751084593"
;;               :e "65537")
;;     :private (:n "74924929503799951536367992905751084593"
;;               :d "36491277062297490768595348639394259869"))

(setf sig (rsa-sign (plist-get keypair :private) message))
;; => "31982247477262471348259501761458827454"

(rsa-verify (plist-get keypair :public) message sig)
;; => t

(rsa-verify (plist-get keypair :public) (capitalize message) sig)
;; => nil

Each of these operations took less than a second. For larger, secure-length keys, this implementation is painfully slow. For example, generating a 2048-bit key takes my laptop about half an hour, and computing a signature with that key (any size message) takes about a minute. That’s probably a little too slow for, say, signing ELPA packages.

Counting Processor Cores in Emacs

2015-10-14T03:17:16Z

One of the great advantages of dependency analysis is parallelization. Modern processors reorder instructions whose results don’t affect each other. Compilers reorder expressions and statements to improve throughput. Build systems know which outputs are inputs for other targets and can choose any arbitrary build order within that constraint. This article involves the last case.

The build system I use most often is GNU Make, either directly or indirectly (Autoconf, CMake). It’s far from perfect, but it does what I need. I almost always invoke it from within Emacs rather than in a terminal. In fact, I do it so often that I’ve wrapped Emacs’ compile command for rapid invocation.

I recently helped a co-worker set this up for himself, so it had me thinking about the problem again. The situation in my config is much more complicated than it needs to be, so I’ll share a simplified version instead.

First bring in the usual goodies (we’re going to be making closures):

;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

We need a couple of configuration variables.

(defvar quick-compile-command "make -k ")
(defvar quick-compile-build-file "Makefile")

Then a couple of interactive functions to set these on the fly. It’s not strictly necessary, but I like giving each a key binding. I also like having a history available via read-string, so I can switch between a couple of different options with ease.

(defun quick-compile-set-command (command)
  (interactive
   (list (read-string "Command: " quick-compile-command)))
  (setf quick-compile-command command))

(defun quick-compile-set-build-file (build-file)
  (interactive
   (list (read-string "Build file: " quick-compile-build-file)))
  (setf quick-compile-build-file build-file))

Now finally to the good part. Below, quick-compile is a non-interactive function that returns an interactive closure ready to be bound to any key I desire. It takes an optional target. This means I don’t use the above quick-compile-set-command to choose a target, only for setting other options. That will make more sense in a moment.

(cl-defun quick-compile (&optional (target ""))
  "Return an interactive function that runs `compile' for TARGET."
  (lambda ()
    (interactive)
    (save-buffer)  ; so I don't get asked
    (let ((default-directory
            (locate-dominating-file
             default-directory quick-compile-build-file)))
      (if default-directory
          (compile (concat quick-compile-command " " target))
        (error "Cannot find %s" quick-compile-build-file)))))

It traverses up (down?) the directory hierarchy towards root looking for a Makefile — or whatever is set for quick-compile-build-file — then invokes the build system there. I don’t believe in recursive make.
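To see the lookup in isolation, here is a minimal sketch. The helper name quick-compile-sketch-root is made up for illustration and is not part of my configuration; only locate-dominating-file, expand-file-name, and file-name-as-directory are real Emacs functions.

```elisp
(defun quick-compile-sketch-root (start build-file)
  "Return the nearest directory at or above START that contains BUILD-FILE.
Return nil when no such directory exists.  Hypothetical helper, not
part of the configuration above."
  (let ((dir (locate-dominating-file start build-file)))
    ;; `locate-dominating-file' may return an abbreviated name such as
    ;; "~/src/", so expand it before handing it to anything else.
    (when dir
      (file-name-as-directory (expand-file-name dir)))))
```

Called with a deeply nested source directory and "Makefile", it should return the closest ancestor directory that holds a Makefile, which is the directory quick-compile binds default-directory to.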

So how do I put this to use? I clobber some key bindings I don’t otherwise care about. A better choice might be the F-keys, but my muscle memory is already committed elsewhere.

(global-set-key (kbd "C-x c") (quick-compile)) ; default target
(global-set-key (kbd "C-x C") (quick-compile "clean"))
(global-set-key (kbd "C-x t") (quick-compile "test"))
(global-set-key (kbd "C-x r") (quick-compile "run"))

Each of those invokes a different target without second guessing me. Let me tell you, having “clean” at the tip of my fingers is wonderful.

Parallel Builds

An extension common to many different make programs is -j, which asks make to build targets in parallel where possible. These days, when multi-core machines are the norm, you nearly always want to use this option, ideally set to the number of logical processor cores on your system. It’s a huge time-saver.

My recent revelation was that my default build command could be better: make -k is minimal. It should at least include -j, but choosing an argument (number of processor cores) is a problem. Today I use different machines with 2, 4, or 8 cores, so most of the time any given number will be wrong. I could use a per-system configuration, but I’d rather not. Unfortunately GNU Make will not automatically detect the number of cores. That leaves the matter up to Emacs Lisp.

Emacs doesn’t currently have a built-in function that returns the number of processor cores. I’ll need to reach into the operating system to figure it out. My usual development environments are Linux, Windows, and OpenBSD, so my solution should work on each. I’ve ranked them by order of importance.

Number of cores on Linux

Linux has the /proc virtual filesystem in the fashion of Plan 9, allowing different aspects of the system to be explored through the standard filesystem API. The relevant file here is /proc/cpuinfo, listing useful information about each of the system’s processors. To get the number of processors, count the number of processor entries in this file. I’ve wrapped it in a file-exists-p check so that it returns nil on other operating systems instead of signaling an error.

(when (file-exists-p "/proc/cpuinfo")
  (with-temp-buffer
    (insert-file-contents "/proc/cpuinfo")
    (how-many "^processor[[:space:]]+:")))

Number of cores on Windows

When I was first researching how to do this on Windows, I thought I would need to invoke the wmic command line program and hope the output could be parsed the same way on different versions of the operating system and tool. However, it turns out the solution for Windows is trivial. The environment variable NUMBER_OF_PROCESSORS gives every process the answer for free. Being an environment variable, it will need to be parsed.

(let ((number-of-processors (getenv "NUMBER_OF_PROCESSORS")))
  (when number-of-processors
    (string-to-number number-of-processors)))

Number of cores on BSD

This seems to work the same across all the BSDs, including OS X, though I haven’t yet tested it exhaustively. Invoke sysctl, which returns an undecorated number to be parsed.

(with-temp-buffer
  (ignore-errors
    (when (zerop (call-process "sysctl" nil t nil "-n" "hw.ncpu"))
      (string-to-number (buffer-string)))))

Also not complicated, but it’s the heaviest solution of the three.

Putting it all together

Join all these together with or, call it numcores, and ta-da.

(setf quick-compile-command (format "make -kj%d" (numcores)))

Now make is invoked correctly on any system by default.
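For reference, here is one way numcores might look once the three snippets are joined with or. Treat it as a sketch: the guard against a zero count and the final fallback to 1 are my additions here, so that format never receives nil.

```elisp
(defun numcores ()
  "Return the number of logical processors, or 1 if it cannot be determined.
A sketch: the zero-count guard and the fallback to 1 are additions,
not part of the three snippets above."
  (let ((count
         (or
          ;; Linux: count processor entries in /proc/cpuinfo.
          (when (file-exists-p "/proc/cpuinfo")
            (with-temp-buffer
              (insert-file-contents "/proc/cpuinfo")
              (let ((n (how-many "^processor[[:space:]]+:")))
                (and (> n 0) n))))
          ;; Windows: the environment provides the answer.
          (let ((n (getenv "NUMBER_OF_PROCESSORS")))
            (and n (string-to-number n)))
          ;; BSD and OS X: ask sysctl.
          (with-temp-buffer
            (ignore-errors
              (when (zerop (call-process "sysctl" nil t nil "-n" "hw.ncpu"))
                (string-to-number (buffer-string))))))))
    ;; Anything that is not a positive integer falls back to 1.
    (if (and (integerp count) (> count 0)) count 1)))
```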

Emacs Autotetris Mode

2014-10-19T21:45:53Z

For more than a decade now, Emacs has come with a built-in Tetris clone, originally written by XEmacs’ Glynn Clements. Just run M-x tetris any time you want to play. For anyone too busy to waste time playing Tetris, earlier this year I wrote an autotetris-mode that will play the Emacs game automatically.

https://github.com/skeeto/autotetris-mode

Load the source, autotetris-mode.el, and run M-x autotetris. It will start the built-in Tetris but make all the moves itself. It works best when byte compiled.

At the time I had read an article and was interested in trying my hand at my own Tetris AI. Like most things Emacs, the built-in Tetris game is very hackable. It’s also pretty simple and easy to understand. Rather than write my own I chose to build upon this one.

Heuristics

It’s not a particularly strong AI. It doesn’t pay attention to the next piece in queue, it doesn’t know the game’s basic shapes, and it doesn’t try to maximize the score (clearing multiple rows at once). The goal is to continue running for as long as possible. But since it’s able to get to the point where the game is so fast that the AI is unable to move pieces fast enough (it’s rate limited like a human player), that means it’s good enough.

When a new piece appears at the top of the screen, the AI, in memory, tries placing it in all possible positions and all possible orientations. For each of these positions it runs a heuristic on the resulting game state, summing five metrics. Each metric is scaled by a hand-tuned weight to adjust its relative priority. Smaller is better, so the position with the lowest score is selected.

Number of Holes

A hole is any open space that has a solid block above it, even if that hole is accessible without passing through a solid block. Count these holes.

Maximum Height

Add the height of the tallest column. Column height includes any holes in the column. The game ends when a column touches the top of the screen (or something like that), so this should be kept in check.

Mean Height

Add the mean height of all columns. The higher this is, the closer we are to losing the game. Since each row will have at least one hole, this will be a similar measure to the hole count.

Height Disparity

Add the difference between the shortest column height and the tallest column height. If this number is large it means we’re not making effective use of the playing area. It also discourages the AI from getting into that annoying situation we all remember: when you really need a 4x1 piece that never seems to come. Those are the brief moments when I truly believe the version I’m playing has to be rigged.

Surface Roughness

Take the root mean square of the column heights. A rougher surface leaves fewer options when placing pieces. This measure will be similar to the disparity measurement.
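The five metrics above could be combined along these lines. This is a sketch rather than the code in autotetris-mode: the function names are hypothetical, the weights are placeholders (the real hand-tuned values aren’t listed here), and the board is reduced to a list of column heights plus a hole count.

```elisp
(defvar autotetris-sketch-weights
  '(:holes 1.0 :max-height 1.0 :mean-height 1.0 :disparity 1.0 :roughness 1.0)
  "Placeholder weights; the real hand-tuned values are not given here.")

(defun autotetris-sketch-score (heights holes)
  "Score a board from its column HEIGHTS (a list of integers) and HOLES count.
Smaller is better."
  (let* ((w autotetris-sketch-weights)
         (n (float (length heights)))
         (tallest (apply #'max heights))
         (shortest (apply #'min heights))
         (mean (/ (apply #'+ heights) n))
         ;; Root mean square of the column heights, as described above.
         (rms (sqrt (/ (apply #'+ (mapcar (lambda (h) (* h h)) heights)) n))))
    (+ (* (plist-get w :holes) holes)
       (* (plist-get w :max-height) tallest)
       (* (plist-get w :mean-height) mean)
       (* (plist-get w :disparity) (- tallest shortest))
       (* (plist-get w :roughness) rms))))

(defun autotetris-sketch-best (candidates)
  "Return the lowest-scoring element of CANDIDATES.
Each candidate is a list (HEIGHTS HOLES)."
  (car (sort (copy-sequence candidates)
             (lambda (a b)
               (< (apply #'autotetris-sketch-score a)
                  (apply #'autotetris-sketch-score b))))))
```

With every weight at 1.0, a flat board of four columns of height 3 and no holes scores 9.0, and each hole adds 1.0 on top of that.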

Emacs-specific Details

With a position selected, the AI sends player inputs at a limited rate to the game itself, moving the piece into place. This is done by calling tetris-move-right, tetris-move-left, and tetris-rotate-next, which, in the normal game, are bound to the arrow keys.

The built-in tetris-mode isn’t quite designed for this kind of extension, so it needs a little bit of help. I defined two pieces of advice to create hooks. These hooks alert my AI to two specific events in the game: the game start and a fresh, new piece.

(defadvice tetris-new-shape (after autotetris-new-shape-hook activate)
  (run-hooks 'autotetris-new-shape-hook))

(defadvice tetris-start-game (after autotetris-start-game-hook activate)
  (run-hooks 'autotetris-start-game-hook))
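defadvice is the older advice interface. On Emacs 24.4 and later the same hook-from-advice pattern can be sketched with advice-add. The function sketch-new-shape below is a hypothetical stand-in so the example runs without loading tetris; in practice the advised function would be tetris-new-shape.

```elisp
(defvar sketch-new-shape-hook nil
  "Hook run after `sketch-new-shape' returns.")

(defun sketch-new-shape ()
  "Hypothetical stand-in for the game function being advised."
  'shape)

(defun sketch-run-new-shape-hook (&rest _args)
  "Run `sketch-new-shape-hook', ignoring the advised function's arguments."
  (run-hooks 'sketch-new-shape-hook))

;; :after advice runs once the original function has returned and
;; leaves its return value untouched.
(advice-add 'sketch-new-shape :after #'sketch-run-new-shape-hook)
```

A named advice function has the small advantage that advice-remove can undo it later.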

I talked before about the problems with global state. Fortunately, tetris-mode doesn’t store any game state in global variables. It stores everything in buffer-local variables, which can be exploited for use in the AI. To perform the “in memory” heuristic checks, it creates a copy of the game state and manipulates the copy. The copy is made by way of clone-buffer on the *Tetris* buffer. The tetris-mode functions all work equally well on the clone, so I can use the existing game rules to properly place the next piece in each available position. The game’s own rules take care of clearing rows and checking for collisions for me. I wrote an autotetris-save-excursion macro to handle the messy details.

(defmacro autotetris-save-excursion (&rest body)
  "Restore tetris game state after BODY completes."
  (declare (indent defun))
  `(with-current-buffer tetris-buffer-name
     (let ((autotetris-saved (clone-buffer "*Tetris-saved*")))
       (unwind-protect
           (with-current-buffer autotetris-saved
             (kill-local-variable 'kill-buffer-hook)
             ,@body)
         (kill-buffer autotetris-saved)))))

The kill-buffer-hook variable is also cloned, but I don’t want tetris-mode to respond to the clone being killed, so I clear out the hook.
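The reason cloning works is that clone-buffer copies buffer-local variables along with the text. Here is a small sketch of that behavior outside of Tetris; sketch-game-score and sketch-simulate are hypothetical names standing in for real game state and a real simulation step.

```elisp
(defvar-local sketch-game-score 0
  "Hypothetical stand-in for a piece of buffer-local game state.")

(defun sketch-simulate (new-score)
  "Set the score to NEW-SCORE in a throwaway clone of the current buffer.
Return the clone's score; the current buffer is left untouched."
  (let ((clone (clone-buffer "*sketch-clone*")))
    (unwind-protect
        (with-current-buffer clone
          ;; Only the clone's copy of the variable changes.
          (setq sketch-game-score new-score)
          sketch-game-score)
      ;; Always discard the clone, even if the body signals an error.
      (kill-buffer clone))))
```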

That’s basically all there is to it! While watching, it feels like it’s making dumb mistakes, not placing pieces in optimal positions, but it recovers well from these situations almost every time, so it must know what it’s doing. Currently it’s a better player than me, which is my rule-of-thumb for calling an AI successful.

Emacs Unicode Pitfalls

2014-06-13T05:58:34Z

GNU Emacs is seven years older than Unicode. Support for Unicode had to be added relatively late in Emacs’ existence. This means Emacs has existed longer without Unicode support (16 years) than with it (14 years). Despite this, Emacs has excellent Unicode support. It feels as if it was there the whole time.

However, as a natural result of Unicode covering all sorts of edge cases for every known human language, there are pitfalls and complications. As a user of Emacs, you’re not particularly affected by these, but extension developers might run into trouble while handling Emacs character-oriented data structures: strings and buffers.

In this article I’ll go over Elisp’s Unicode surprises. I’ve been caught by some of these myself. In fact, as a result of writing this article, I’ve discovered subtle encoding bugs in some of my own extensions. None of these pitfalls are Emacs’ fault. They’re just the result of complexities of natural language.

Unicode and Code Points

First, there are excellent materials online for learning Unicode. I recommend starting with UTF-8 and Unicode FAQ for Unix/Linux. There’s no reason for me to repeat all this information here, but I’ll attempt to quickly summarize it.

Unicode maps code points (integers) to specific characters, along with a standard name. As of this writing, Unicode defines over 110,000 characters. For backwards compatibility, the first 128 code points are mapped to ASCII. This trend continues for other character standards, like Latin-1.

In Emacs, Unicode characters are entered into a buffer with C-x 8 RET (insert-char). You can enter either the official name of the character (e.g. “GREEK SMALL LETTER PI” for π) or the hexadecimal code point. Outside of Emacs it depends on the application, but C-S-u followed by the hexadecimal code works for most of the applications I care about.

Encodings

The Unicode standard also describes several methods for encoding sequences of code points into sequences of bytes. Obviously a selection of 110,000 characters cannot be encoded with one byte per letter, so these are multibyte encodings. The two most popular encodings are probably UTF-8 and UTF-16.

UTF-8 was designed to be backwards compatible with ASCII, Unix, and existing C APIs (null-terminated C strings). The first 128 code points are encoded directly as a single byte. Every other character is encoded with two to four bytes (the original design allowed up to six), with the highest bit of each byte set to 1. This ensures that no part of a multibyte character will be interpreted as ASCII, nor will it contain a null (0). The latter means that C programs and C APIs can handle UTF-8 strings with few or no changes. Most importantly, every ASCII encoded file is automatically a UTF-8 encoded file.
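These properties are easy to check from Elisp with encode-coding-string. The helper name utf8-sketch-bytes is made up for this illustration.

```elisp
(defun utf8-sketch-bytes (string)
  "Return the UTF-8 encoding of STRING as a list of byte values."
  ;; `encode-coding-string' returns a unibyte string; appending nil
  ;; turns it into a list of its bytes.
  (append (encode-coding-string string 'utf-8) nil))

;; ASCII encodes to itself, one byte per character.
(utf8-sketch-bytes "A")                    ; => (65)
;; U+03C0 GREEK SMALL LETTER PI takes two bytes, both with the high bit set.
(utf8-sketch-bytes "\u03C0")               ; => (207 128)
;; U+1F4A9, outside the BMP, takes four.
(length (utf8-sketch-bytes "\U0001F4A9"))  ; => 4
```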

UTF-16 encodes all the characters from the Basic Multilingual Plane (BMP) with two bytes. Even the original ASCII characters get two bytes (16 bits). The BMP covers virtually all modern languages and is generally all you’ll ever practically need. However, this doesn’t include the important TROPICAL DRINK or PILE OF POO characters from the supplemental (“astral”) plane. If you need to use these characters in UTF-16, you’re going to run into problems: characters outside the BMP don’t fit in two bytes. To accommodate these characters, UTF-16 uses surrogate pairs: these characters are encoded with two 16-bit units.

Because of this last point, UTF-16 offers no practical advantages over UTF-8. Its existence was probably a big mistake. You can’t do constant-time character lookup because you have to scan for surrogate pairs. It’s not backwards compatible and cannot be stored in null-terminated strings. In both Java and JavaScript, it leads to the awkward situation where the “length” of a string is not the number of characters, code points, or even bytes. Worst of all, it has serious security implications. New applications should avoid it whenever possible.

Emacs and UTF-8

Emacs internally stores all text as UTF-8. This was an excellent choice! When text leaves Emacs, such as writing to a file or to a process, Emacs automatically converts it to the coding system configured for that particular file or process. When it accepts text from a file or process, it either converts it to UTF-8 or preserves it as raw bytes.

There are two modes for this in Emacs: unibyte and multibyte. Unibyte strings/buffers are just raw bytes. They have constant-time, O(1), access but can only hold single-byte values. The byte-code compiler outputs unibyte strings.

Multibyte strings/buffers hold UTF-8 encoded code points. Character access is O(n) because the string/buffer has to be scanned to count characters.

The actual encoding is rarely relevant because there’s little way (and need) to access it directly. Emacs automatically converts text as needed when it leaves Emacs and arrives in Emacs, so there’s no need to know the internal encoding. If you really want to see it anyway, you can use string-as-unibyte to get a copy of a string with the exact same bytes, but as a byte-string.

(string-as-unibyte "π")
;; => "\317\200"

This can be reversed with string-as-multibyte, which changes a unibyte string holding UTF-8 encoded text back into a multibyte string. Note that these functions are different from string-to-unibyte and string-to-multibyte, which will attempt a conversion rather than preserving the raw bytes.

The length and buffer-size functions always count characters in multibyte and bytes in unibyte. Being UTF-8, there are no surrogate pairs to worry about here. The string-bytes and position-bytes functions return byte information for both multibyte and unibyte.

To specify a Unicode character in a string literal without using the character directly, use \uXXXX. The XXXX is the hexadecimal code point for the character and is always 4 digits long. For characters outside the BMP, which won’t fit in four digits, use a capital U with eight digits: \UXXXXXXXX.

"\u03C0"
;; => "π"

"\U0001F4A9"
;; => "💩"  (PILE OF POO)

Finally, Emacs extends Unicode with 128 additional “characters” representing the raw bytes #x80 through #xFF. This allows raw bytes to be embedded distinctly within UTF-8 sequences. For example, it’s used to distinguish the code point U+0080 from the raw byte #x80. As far as I can tell, this isn’t used very often.

Combining Characters

Some Unicode characters are defined as combining characters. These characters modify the non-combining character that appears before them, typically with accents or diacritical marks.

For example, the word “naïve” can be written as six characters as "nai\u0308ve". The fourth character, U+0308 (COMBINING DIAERESIS), is a combining character that changes the “i” (U+0069 LATIN SMALL LETTER I) into an umlaut character.

The most commonly accented characters have a code of their own. These are called precomposed characters. This includes ï (U+00EF LATIN SMALL LETTER I WITH DIAERESIS). This means “naïve” can also be written as five characters as "na\u00EFve".

Normalization

So what happens when comparing two different representations of the same text? They’re not equal.

(string= "nai\u0308ve" "na\u00EFve")
;; => nil

To deal with situations like this, the Unicode standard defines four different kinds of normalization. The two most important ones are NFC (composition) and NFD (decomposition). The former uses precomposed characters whenever possible and the latter breaks them apart. The functions ucs-normalize-NFC-string and ucs-normalize-NFD-string perform this operation.

Pitfall #1: Proper string comparison requires normalization. It doesn’t matter which normalization you use (though NFD should be slightly faster), you just need to use it consistently. Unfortunately this can get tricky when using equal to compare complex data structures with multiple strings.

(string= (ucs-normalize-NFD-string "nai\u0308ve")
         (ucs-normalize-NFD-string "na\u00EFve"))
;; => t

Emacs itself fails to do this. It doesn’t normalize strings before interning them, which is probably a mistake. This means you can have differently defined variables and functions with the same canonical name.

(eq (intern "nai\u0308ve")
    (intern "na\u00EFve"))
;; => nil

(defun print-résumé ()
  "NFC-normalized form."
  (print "I'm going to sabotage your team."))

(defun print-résumé ()
  "NFD-normalized form."
  (print "I'd be a great asset to your team."))

(print-résumé)
;; => "I'm going to sabotage your team."

String Width

There are three ways to quantify multibyte text. These are often the same value, but in some circumstances they can each be different.

length: number of characters, including combining characters
bytes: number of bytes in its UTF-8 encoding
width: number of columns it would occupy in the current buffer

Most of the time, one character is one column (a width of one). Some characters, like combining characters, consume no columns. Many Asian characters consume two columns (U+4000, 䀀). Tabs consume tab-width columns, usually 8.

Generally, a string should have the same width regardless of whether it’s NFD or NFC. However, due to bugs and incomplete Unicode support, this isn’t strictly true. For example, some combining characters, such as U+20DD ⃝, won’t combine correctly in Emacs nor in other applications.

Pitfall #2: Always measure text by width, not length, when laying out a buffer. Width is measured with the string-width function. This comes up when laying out tables in a buffer. The number of characters that fit in a column depends on what those characters are.
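Here is a small sketch of the three measurements side by side, plus a padding helper of the kind a table layout needs. Both function names are hypothetical. The expected values assume the default character width table, where combining marks take no columns.

```elisp
(defun width-sketch-measure (string)
  "Return (LENGTH BYTES WIDTH) for STRING."
  (list (length string) (string-bytes string) (string-width string)))

(defun width-sketch-pad (string width)
  "Pad STRING on the right with spaces until it occupies WIDTH columns."
  (concat string
          (make-string (max 0 (- width (string-width string))) ?\s)))

(width-sketch-measure "abc")          ; => (3 3 3)
(width-sketch-measure "\u4000")       ; => (1 3 2)
(width-sketch-measure "nai\u0308ve")  ; => (6 7 5)
```

Padding by length instead of width would leave the U+4000 cell one column too wide, which is exactly how table columns drift out of alignment.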

Fortunately I accidentally got this right in Elfeed because I used the format function for layout. The %s directive operates on width, as would be expected. However, this has the side effect that the output of format may change depending on the current buffer! Pitfall #3: Be mindful of the current buffer when using the format function.

(let ((tab-width 4))
  (length (format "%.6s" "\t")))
;; => 1

(let ((tab-width 8))
  (length (format "%.6s" "\t")))
;; => 0

String Reversal

Say you want to reverse a multibyte string. Simple, right?

(defun reverse-string (string)
  (concat (reverse (string-to-list string))))

(reverse-string "abc")
;; => "cba"

Wrong! The combining characters will get flipped around to the wrong side of the character they’re meant to modify.

(reverse-string "nai\u0308ve")
;; => "ev̈ian"

Pitfall #4: Reversing Unicode strings is non-trivial. The Rosetta Code page is full of incorrect examples, and I’m personally guilty of this, too. The other day I submitted a patch to s.el to correct its s-reverse function for Unicode. If it’s accepted, you should never need to worry about this.
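One way to approach it, as a rough sketch: keep each base character together with the combining marks that follow it, and reverse those groups rather than individual code points. This is not the s.el patch itself. The function names are hypothetical, and it only looks at the Unicode general category (Mn, Mc, Me) via get-char-code-property, so it won’t handle every kind of grapheme cluster.

```elisp
(defun reverse-sketch-combining-p (char)
  "Return non-nil if CHAR is a combining mark (general category Mn, Mc, or Me)."
  (memq (get-char-code-property char 'general-category) '(Mn Mc Me)))

(defun reverse-sketch (string)
  "Reverse STRING, keeping combining marks attached to their base character."
  (let ((clusters '())
        (current '()))
    (dolist (char (string-to-list string))
      (if (and current (reverse-sketch-combining-p char))
          ;; A combining mark stays with the cluster in progress.
          (push char current)
        ;; Anything else closes that cluster and starts a new one.
        (when current
          (push (nreverse current) clusters))
        (setq current (list char))))
    (when current
      (push (nreverse current) clusters))
    ;; Pushing built CLUSTERS in reverse order already.
    (apply #'concat clusters)))

(reverse-sketch "abc")          ; => "cba"
(reverse-sketch "nai\u0308ve")  ; => "evi\u0308an"
```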

Regular Expressions

Regular expressions operate on code points. This means combining characters are counted separately and the match may change depending on how characters are composed. To avoid this, you might want to consider NFC normalization before performing some kinds of regular expression matching.

;; Like string= from before:
(string-match-p  "na\u00EFve" "nai\u0308ve")
;; => nil

;; The . only matches part of the composition
(string-match-p "na.ve" "nai\u0308ve")
;; => nil

Pitfall #5: Be mindful of combining characters when using regular expressions. Prefer NFC normalization when dealing with regular expressions.
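A minimal sketch of that advice, assuming it’s acceptable to normalize both the pattern and the subject. The wrapper name regexp-sketch-match-p is made up. Note that the returned index refers to the normalized string, not the original.

```elisp
(require 'ucs-normalize)

(defun regexp-sketch-match-p (regexp string)
  "Like `string-match-p', but NFC-normalize REGEXP and STRING first."
  (string-match-p (ucs-normalize-NFC-string regexp)
                  (ucs-normalize-NFC-string string)))

;; Both of the failing matches above succeed once normalized.
(regexp-sketch-match-p "na.ve" "nai\u0308ve")       ; => 0
(regexp-sketch-match-p "nai\u0308ve" "na\u00EFve")  ; => 0
```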

Another potential problem is ranges, though this is quite uncommon. Ranges of characters can be expressed inside brackets, e.g. [a-zA-Z]. If the range begins or ends with a decomposed combining character you won’t get the proper range because its parts are considered separately by the regular expression engine.

(defvar match-weird "[\u00E0-\u00F6]+")

(string-match-p match-weird "áâãäå")
;; => 0  (successful match)

(string-match-p (ucs-normalize-NFD-string match-weird) "áâãäå")
;; => nil

It’s especially important to keep all of this in mind when sanitizing untrusted input, such as when using Emacs as a web server. An attacker might use a denormalized or strange grapheme cluster to bypass a filter.

Interacting with the World

Here’s a mistake I’ve made twice now. Emacs uses UTF-8 internally, regardless of whatever encoding the original text came in. Pitfall #6: When working with bytes of text, the counts may differ from those in the original source of the text.

For example, HTTP/1.1 introduced persistent connections. Before this, a client connects to a server and asks for content. The server sends the content and then closes the connection to signal the end of the data. In HTTP/1.1, when Connection: close isn’t specified, the server will instead send a Content-Length header indicating the length of the content in bytes. The connection can then be re-used for more requests, or, more importantly, pipelining requests.

The main problem is that HTTP headers usually have a different encoding than the content body. Emacs is not prepared to handle multiple encodings from a single source, so the only correct way to talk HTTP with a network process is raw. My mistake was allowing Emacs to do the UTF-8 conversion, then measuring the length of the content in its UTF-8 encoding. This just happens to work fine about 99.9% of the time since clients tend to speak UTF-8, or something like it, anyway, but it’s not correct.
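To make the mismatch concrete, here’s a sketch that measures the same text under two encodings. The helper name http-sketch-content-length is hypothetical. A Content-Length computed from the internal UTF-8 representation would be off by one for a Latin-1 peer in this example.

```elisp
(defun http-sketch-content-length (body coding-system)
  "Return the number of bytes BODY occupies once encoded with CODING-SYSTEM."
  (length (encode-coding-string body coding-system)))

(length "na\u00EFve")                                  ; => 5 characters
(http-sketch-content-length "na\u00EFve" 'utf-8)       ; => 6 bytes
(http-sketch-content-length "na\u00EFve" 'iso-8859-1)  ; => 5 bytes
```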

Emacs Lisp Buffer Passing Style

2014-05-27T01:58:09Z

Emacs Lisp strings are mutable, fixed-length character (multibyte) or byte (unibyte) arrays. Any operation that would change a string’s length requires allocating a new string object. This is common across programming languages. Python, Java, and JavaScript go even further, with strings being completely immutable.

In these languages, performing many string operations at a time, especially with the += operator, allocates many temporary strings. It’s also awkward. For these situations, Java provides a class, StringBuilder, so that these operations can be done with a temporary, efficient, mutable data structure that will emit the final string when complete.

java.util.Collection<T> collection;

public String toString() {
    StringBuilder sb = new StringBuilder();
    for (T element : collection) {
        sb.append(element);
    }
    return sb.toString();
}

In JavaScript a popular string building idiom is to use an array. Push the components onto an array and join() the result.

function toString(object) {
    var output = [];
    for (var k in object) {
        output.push(k);
        output.push(' -> ');
        output.push(object[k]);
        output.push('\n');
    }
    return output.join('');
}

toString({a: 1, b: 2});
// => "a -> 1\nb -> 2\n"

Emacs Lisp

What character sequence data structure already exists in Elisp that’s efficient at insert, update, and delete? Buffers, of course! I know it’s easy to forget, but editing sequences of characters is the primary purpose of Emacs, after all. To make use of a buffer as a string builder, use one of my favorite macros: with-temp-buffer. I like to combine this with setting standard-output so that all of the printing functions go there.

(defun to-string (alist)
  (with-temp-buffer
    (let ((standard-output (current-buffer)))
      (dolist (pair alist)
        (princ (cl-first pair))
        (princ " -> ")
        (princ (cl-second pair))
        (princ "\n")))
    (buffer-string)))

Update: Jon O. pointed out that Emacs has a with-output-to-string macro available to do this more concisely.
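With that macro, the function above might shrink to something like this sketch (renamed to-string-concise here only to keep it distinct from the original). with-output-to-string binds standard-output itself, so the explicit buffer handling goes away.

```elisp
(defun to-string-concise (alist)
  "Format ALIST, a list of (KEY VALUE) lists, one \"KEY -> VALUE\" per line."
  (with-output-to-string
    (dolist (pair alist)
      (princ (car pair))
      (princ " -> ")
      (princ (cadr pair))
      (princ "\n"))))

(to-string-concise '((a 1) (b 2)))
;; => "a -> 1\nb -> 2\n"
```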

Internally Elisp buffers are gap buffers, a rather simple data structure where the data is split into two sequences with a “gap” in between. Insertion and deletion occur at the gap, which is slid up and down the overall sequence. This makes gap buffers efficient for making lots of edits localized in a single area, just as a human would do while editing text.

Each character in a buffer is a full Unicode code point and can have an arbitrary set of properties associated with it (font-lock-face, read-only, nonstickiness, etc.). Along with inline image objects, this makes buffers rich enough to display rendered HTML (to a limited extent).

The Catch

There’s an important caveat to using buffers as mutable strings: they’re not managed by the garbage collector. Each buffer goes into the global buffer list, implemented internally as an intrusive linked list. If a buffer is not on this list, it’s a dead buffer.

Ultimately this makes buffer objects poor return values. It’s an impedance mismatch. The caller has to be careful to free (“kill”) the buffer. It’s easy to miss if an error is signaled. For example, url-retrieve and url-retrieve-synchronously return a buffer with the response from a web server. It’s not uncommon for Elisp programs to leak these buffers during normal operation.

(with-current-buffer (url-retrieve-synchronously some-url)
  (goto-char url-http-end-of-headers)
  (prog1 (json-read)
    (kill-buffer)))

If json-read fails, the buffer is leaked.
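The usual defense is unwind-protect. Here’s a sketch of a small wrapper, with a hypothetical name, that takes the buffer-returning call and the consuming code as functions so it can be tried without a network connection.

```elisp
(defun buffer-sketch-consume (producer consumer)
  "Call PRODUCER for a buffer, run CONSUMER inside it, then kill it.
The buffer is killed even when CONSUMER signals an error."
  (let ((buffer (funcall producer)))
    (unwind-protect
        (with-current-buffer buffer
          (funcall consumer))
      ;; The cleanup form runs on normal return and on error alike.
      (when (buffer-live-p buffer)
        (kill-buffer buffer)))))
```

With a producer that calls url-retrieve-synchronously and a consumer that skips the headers and calls json-read, a parse failure should no longer leak the response buffer.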

As a side note: alternatively you could use my finalize package to associate the buffer with an object that is subject to garbage collection. The buffer will be killed immediately when the object is garbage collected.

Buffer Passing Style

To deal with this, my preferred idiom is what I call buffer-passing style. Rather than have the callee instantiate the buffer, the caller instantiates the buffer and “passes” it implicitly as the current buffer. The callee fills it with something. The caller should use something like with-temp-buffer so that the buffer has a clean life-cycle, fully managed by the caller.

Imagine that, instead of returning a buffer, url-retrieve-synchronously put the result in the current buffer. If anything goes wrong, the buffer will be automatically killed by with-temp-buffer.

(with-temp-buffer
  (url-retrieve-synchronously some-url)
  (goto-char url-http-end-of-headers)
  (json-read))
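Since url-retrieve-synchronously doesn’t actually work that way, here’s a runnable sketch of the same idiom with a made-up callee. Both function names are hypothetical. The callee only writes into whatever buffer is current, and the caller owns that buffer’s whole life-cycle.

```elisp
(defun bps-sketch-insert-pairs (alist)
  "Callee: write ALIST, a list of (KEY VALUE) lists, into the current buffer."
  (dolist (pair alist)
    (insert (format "%s -> %s\n" (car pair) (cadr pair)))))

(defun bps-sketch-to-string (alist)
  "Caller: provide a temporary buffer, let the callee fill it, return its text."
  (with-temp-buffer
    ;; If the callee signals an error, `with-temp-buffer' still kills
    ;; the buffer, so nothing leaks.
    (bps-sketch-insert-pairs alist)
    (buffer-string)))

(bps-sketch-to-string '((a 1) (b 2)))
;; => "a -> 1\nb -> 2\n"
```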

Buffer-passing style is what I settled on for simple-httpd. Servlets are called with the output buffer as the current buffer and with standard-output set to this buffer. The servlet is only responsible for filling this buffer with content. Thanks to process-send-region, the content is never actually copied into a string.

(defservlet* search :application/json (q)
  (princ (json-encode (search-results q))))

I didn’t recognize buffer-passing style until much later. As a result, far too much of simple-httpd is still string oriented when it shouldn’t be.

An Emacs Foreign Function Interface

2014-04-26T16:25:51Z

For many years Richard Stallman (RMS) prohibited a foreign function interface (FFI) in GNU Emacs. An FFI is an API for dynamically calling native libraries at run-time, like the Java Native Interface (JNI). He was concerned that people might use it to make proprietary extensions to the popular editor. This was the same (paranoid) justification for rejecting a package manager in Emacs for many years, that someone might use it to distribute proprietary packages.

Fortunately, times have changed. RMS reevaluated his stances on FFI and on package managers. Today Emacs comes with a package manager (package.el), and there are multiple package repositories with no proprietary packages in sight. Though, outside of some unaccepted patches, no significant progress has been made to add an FFI.

A few weeks ago I did something about that by writing a package that adds an FFI. It requires no patches or any other changes to Emacs itself. Instead, it drives a subprocess running libffi, passing arguments and return values back and forth through a pipe, in the spirit of EmacSQL. It’s not as efficient as a built-in API, but it could potentially be distributed through an ELPA repository.

Emacs Lisp Foreign Function Interface

The API is modeled loosely after Julia’s elegant FFI. A call interface (CIF) doesn’t need to be prepared ahead of time. Provide all the necessary information at the call site and the library takes care of building and caching CIFs and handles for you.

API Examples

The core function for the FFI is ffi-call. Here’s an example that calls the system’s srand() and then rand().

;; seed with 0
(ffi-call nil "srand" [:void :uint32] 0)
;; => :void

(ffi-call nil "rand" [:sint32])
;; => 1102520059

The first two arguments are similar to the first two arguments of dlsym(). For ffi-call, the first argument is the library shared object name. The back-end automatically takes care of obtaining a handle on the library with dlopen(). In this case we’re accessing a function that’s already in the main program, so we pass nil. This is identical to passing NULL to dlsym(). In this FFI, nil always corresponds to NULL.

The second argument is the function name, just like dlsym()’s second argument.

The third argument is the function signature. It’s a vector of keywords declaring the return value type followed by the types of each argument. In this example, srand() returns nothing (void) and accepts a single 32-bit unsigned argument, so the signature is [:void :uint32].

The remaining arguments are the native function arguments. I can keep making the second FFI call (“rand”) to retrieve different numbers, using the first FFI call (“srand”) to reset the sequence.

Using a Library

Here’s another example, loading libm and calling cos.

;; cos(1.2)
(ffi-call "libm.so" "cos" [:double :double] 1.2)
;; => 0.362357754476674

The first time a library is used, the back-end creates a handle for it with dlopen(). Further calls will reuse the handle, trying to be as efficient as possible. Handles are never closed.

Pointers

Here are a couple of examples that use pointers. As stated before, nil is used to pass a NULL pointer. Like the underlying libffi, the FFI doesn’t care what kind of pointer you’re passing, just that it’s a pointer, so it’s declared with :pointer.

;; time(NULL);
(ffi-call nil "time" [:uint64 :pointer] nil)
;; => 1396496875

Strings are automatically copied to the subprocess, their lifetime tied to the lifetime of the Elisp string (note: this detail is still unimplemented). When used as arguments, they become pointers.

;; getenv("DISPLAY")
(ffi-call nil "getenv" [:pointer :pointer] "DISPLAY")
;; => 0x7fffc13ceb29

(ffi-get-string '0x7fffc13ceb29)
;; => ":0"

Pointers can be handled as values on the Elisp side. They’re represented as symbols whose name is an address. In the above example, 0x7fffc13ceb29 is one of these symbols. I would have preferred to use a plain integer to represent pointers, but, because Elisp integers are tagged, they’re guaranteed not to be wide enough for this. I plan to add pointer operators to do pointer arithmetic on these special pointer values.

The function ffi-get-string is used to retrieve the null-terminated string referenced by a pointer. If the string returned by getenv() needed to be freed (it doesn’t and shouldn’t), the FFI caller would need to be careful to call free() as another FFI call.

How It Works: The Stack Machine

My goal is to keep the back-end as simple as possible. All resource management is handled by Emacs, tied to garbage collection. For example, the pointer returned by dlopen() isn’t stored anywhere in the subprocess. It’s passed to Emacs and managed there. To call a function using the handle, the pointer is transmitted back to the subprocess.

To keep it simple, the back-end is just a stack machine with a simple human-readable bytecode. You can see the instruction set by looking at the big switch statement in ffi-glue.cc. For example, to push a signed 2-byte integer 237 onto the stack, send a j followed by an ASCII representation of the number (terminated by a space if needed): j237.

As usual, my assumption is that the Elisp printer and reader are faster than any possible serialization I could implement within Elisp itself. This also nicely sidesteps the byte-order issue.

The function signature is declared by pushing zeros of the return/argument types onto the stack, with a special void “value” used to communicate void. Once it’s all set up, the C instruction is called, collapsing the signature into a CIF handle: a pointer for the Elisp side to manage.

Pointers to raw strings of bytes are pushed onto the stack with the M instruction. It pops the top integer on the stack to get the byte count, reads that number of bytes from input into a buffer, null-terminates the buffer in case it’s used as a string, and finally puts a pointer to that buffer on the stack.

Calling functions is just a matter of pushing all the needed information onto the stack, invoking libffi to magically call the function, then popping the result off the stack. Popping a value transmits it to Elisp.

Stack Machine Example

Here’s a concise example that calls cos(1.2) (assuming libm.so is already linked). The actual Elisp-generated FFI bytecode doesn’t plan things quite this way — particularly because it needs to keep track of the various pointers involved — but this example keeps it simple.

d1.2d0d0w1Cp0w3McosSco

You can run this example manually by executing the ffi-glue program and pasting in that line as standard input. The result will be printed.

d1.2 : Push a double, 1.2, onto the stack. This will be the function argument.
d0d0 : Push a couple of zero doubles onto the stack. This is our function signature. It takes a double and returns a double.
w1 : Push an unsigned 32-bit 1 onto the stack. Instructions that use integers accept unsigned 32-bit integers. This 1 indicates that our function accepts one argument.
C : Create a CIF. The integer 1 and the two 0 doubles are consumed and a pointer to a CIF is put on the stack. Elisp would normally pop this off and save it for future use, but we’re going to leave it there (and ultimately leak it in the example).
p0 : Push a NULL onto the stack. p means push a pointer and 0 is a NULL pointer. This is our library handle. We’re assuming cos will be in the main program.
w3Mcos : Put a pointer to the string “cos” onto the stack. First push the number 3 (string length), then M to read from input, then pass three bytes: “cos”. In our example, this buffer will be leaked because we lose the buffer pointer.
S : Call dlsym() on the string and handle on top of the stack. This consumes the top two values (NULL and “cos”), and pushes a function handle on top of the stack. At this point the stack has three values: 1.2, the CIF, and the function handle.
c : Call the function pointed to by the top of the stack. This consumes the top pointer, the CIF below it, and the CIF indicates how many more values to consume: just one in this case, since the function takes one argument. The function’s return value is pushed on the stack. If the function is void, the special void “value” is pushed on the stack.
o : Pop the top stack value, sending it to Emacs. This is what would be returned by ffi-call.

Before I got the Elisp side of things going, I was testing out the back-end by writing lots of little programs like this by hand.
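Little programs like the one above can also be assembled from Elisp. This is only a sketch of the idea, using the instructions described in this article (d, w, C, p, M, S, c, o). The my-ffi-* names are hypothetical and this is not how the package itself generates bytecode.

```elisp
(defun my-ffi-push-double (x)
  "Return bytecode that pushes the double X."
  (format "d%s" x))

(defun my-ffi-push-uint32 (n)
  "Return bytecode that pushes the unsigned 32-bit integer N."
  (format "w%d" n))

(defun my-ffi-push-string (string)
  "Return bytecode that pushes a pointer to a copy of STRING.
Assumes STRING is ASCII, so its length equals its byte count."
  (concat (my-ffi-push-uint32 (length string)) "M" string))

(defun my-ffi-cos-program (x)
  "Return a stack machine program computing cos(X)."
  (concat (my-ffi-push-double x)      ; the function argument
          "d0d0"                      ; signature: double -> double
          (my-ffi-push-uint32 1)      ; the function takes one argument
          "C"                         ; collapse the signature into a CIF
          "p0"                        ; NULL library handle
          (my-ffi-push-string "cos")  ; symbol name
          "S"                         ; dlsym()
          "c"                         ; call the function
          "o"))                       ; pop the result to Emacs

(my-ffi-cos-program 1.2)
;; => "d1.2d0d0w1Cp0w3McosSco"
```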

A Safe FFI

While using an FFI through a pipe is slow compared to a built-in FFI, there is a distinct advantage. The FFI can never crash Emacs! Normally, making calls to an FFI is unsafe. It allows the programmer to violate normal language constraints. If the programmer misuses the FFI, the whole process may crash or become corrupt. Here, that process is only the subprocess: a crash will lose any state held behind the foreign interface, but Emacs will be safe.

In my package, the handle for the FFI Emacs subprocess is called the context. A context is automatically established and bound to the ffi-context global variable as needed. This context keeps track of CIFs, string buffers, handles, and any other resources held by the subprocess. If the subprocess dies, the context becomes meaningless since the pointers it holds are dead.

Limitations

This FFI package is about 80% complete. It occasionally leaks memory in the subprocess, it’s overly-sensitive to mis-typing, it doesn’t manage stdin/stdout, it can’t inspect/modify structs, and it can’t set up closures.

The last point, closures, would require some changes to the interprocess communication. The purpose here would be to allow foreign functions to call Elisp functions. The subprocess would need to be able to initiate activity with Elisp.

Manipulating structs is complex, and even libffi has limited support for working with them. It allows structs to be declared, but leaves alignment and access up to the user to sort out. That’s where the previously-mentioned pointer arithmetic comes into play.

Currently stdin, stdout, and stderr are problems, as I found when trying to write a test GTK application with Elisp. Any command line junkie knows that GTK (and Qt) applications are ridiculously noisy. They spew hundreds of lines of warnings and notifications as part of their normal operation. This noise interferes with FFI communication with Emacs. I need to figure out how to separate this and get standard input/output/error to/from Emacs through separate channels.

As with libffi, there are no guarantees about variadic function calls. It should generally Just Work, but you can’t rely on it.

The whole thing will not work as well in 32-bit Emacs, where integers are limited to a tiny 29 bits. For example, those rand() return values will simply not fit. In the long run, this is probably the single largest barrier to making the FFI work smoothly. It’s too easy to run into large integer values.
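To make the size problem concrete, here is a small sketch that checks whether a value would fit in such an integer. It assumes a fixnum range of -2^29 to 2^29 - 1, which is what 32-bit builds of that era provide, and it is meant to be evaluated in a 64-bit Emacs. The names are hypothetical.

```elisp
(defconst my-ffi-32bit-fixnum-max (1- (expt 2 29))
  "Assumed largest integer on a 32-bit Emacs build: 536870911.")

(defun my-ffi-fits-32bit-fixnum-p (n)
  "Return non-nil if integer N fits the assumed 32-bit Emacs range."
  (and (<= (- (expt 2 29)) n)
       (<= n my-ffi-32bit-fixnum-max)))

(my-ffi-fits-32bit-fixnum-p 237)        ; => t
;; The time() result from the earlier example is already too large:
(my-ffi-fits-32bit-fixnum-p 1396496875) ; => nil
```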

Right now I consider it a proof of concept; an FFI really can be done this way. I don’t have any particular uses in mind, and, outside of the “cool factor,” I can’t actually think of any useful applications. If a solid FFI had already existed, I might have tried to use it for EmacSQL rather than use this subprocess trick. My FFI is probably mature enough to drive SQLite, so maybe this is the future of EmacSQL.

If you can think of a good use for an Emacs FFI, please share it. I need good test ideas.

Emacs Lisp Defstruct Namespace Convention

2014-03-19T01:41:52Z

One of the drawbacks of Emacs Lisp is the lack of namespaces. Every defun, defvar, defcustom, defface, defalias, defstruct, and defclass establishes one or more names in the global scope. To work around this, package authors are strongly encouraged to prefix every global name with the name of its package. That way there should never be a naming conflict between two different packages.

(defvar mypackage-foo-limit 10)

(defvar mypackage--bar-counter 0)

(defun mypackage-init ()
  ...)

(defun mypackage-compute-children (node)
  ...)

(provide 'mypackage)

While this has solved the problem for the time being, attaching the package name to almost every identifier, including private function and variable names, is quite cumbersome. Namespaces can almost be hacked into the language by using multiple obarrays, but symbols have internal linked lists that prohibit inclusion in multiple obarrays.

By convention, private names are given a double-dash after the namespace. If a “bar counter” is an implementation detail that may disappear in the future, it will be called mypackage--bar-counter to warn users and other package authors not to rely on it.
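Because the convention is purely textual, it’s easy to check mechanically. A minimal sketch, where my-private-name-p is a hypothetical helper and the package name is passed as a string:

```elisp
(defun my-private-name-p (symbol package)
  "Return non-nil if SYMBOL is a private name of PACKAGE.
PACKAGE is the package name as a string.  A name is private when
it starts with the package name followed by a double dash."
  (string-prefix-p (concat package "--") (symbol-name symbol)))

(my-private-name-p 'mypackage--bar-counter "mypackage") ; => t
(my-private-name-p 'mypackage-foo-limit "mypackage")    ; => nil
```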

There’s been a recent push to follow this namespace-prefix policy more strictly, particularly with the deprecation of cl and introduction of cl-lib. I suspect someday when namespaces are finally introduced, packages with strictly clean namespaces will be at an advantage, somehow automatically supported. Nic Ferrier has proposed ideas for how to move forward on this.

How strict are we talking?

Over the last few years I’ve gotten much stricter in my own packages when it comes to namespace prefixes. You can see the progression going from javadoc-lookup (2010) where I was completely sloppy about it, to EmacSQL (2014) where every single global identifier is meticulously prefixed.

For a time I considered names such as make-* and with-* to be exceptions to the rule, since these names are idioms inherited from Common Lisp. The namespace comes after the expected prefix. I’ve changed my mind about this, which has caused me to change my usage of defstruct (now cl-defstruct).

Just as in Common Lisp, by default cl-defstruct defines a constructor starting with make-*. This is fine in Common Lisp, where it’s a package-private function by default, but in Emacs Lisp this pollutes the global namespace.

(require 'cl-lib)

;; Defines make-circle, circle-x, circle-y, circle-radius, circle-p
(cl-defstruct circle
  x y radius)

(defvar unit-circle (make-circle :x 0.0 :y 0.0 :radius 1.0))

unit-circle
;; => [cl-struct-circle 0.0 0.0 1.0]

(circle-radius unit-circle)
;; => 1.0

This constructor isn’t namespace clean, so package authors should avoid defstruct’s default. If the package is named circle then all of the accessors are perfectly fine, though.

To fix this, I now use another, more recent Emacs Lisp idiom: name the constructor create. That is, for the package circle, we desire circle-create. To get this behavior from cl-defstruct, use the :constructor option.

;; Clean!
(cl-defstruct (circle (:constructor circle-create))
  x y radius)

(circle-create :x 0 :y 0 :radius 1)
;; => [cl-struct-circle 0 0 1]

(provide 'circle)

This affords a new opportunity to craft a better constructor. Have cl-defstruct define a private constructor, then manually write a constructor with a nicer interface. It may also do additional work, like enforce invariants or initialize dependent slots.

(cl-defstruct (circle (:constructor circle--create))
  x y radius)

(defun circle-create (x y radius)
  (let ((circle (circle--create :x x :y y :radius radius)))
    (if (< radius 0)
        (error "must have non-negative radius")
      circle)))

(circle-create 0 0 1)
;; => [cl-struct-circle 0 0 1]

(circle-create 0 0 -1)
;; error: "must have non-negative radius"

This is now how I always use cl-defstruct in Emacs Lisp. It’s a tidy convention that will probably become more common in the future.
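One related detail: by default cl-defstruct also defines a copier named copy-* (copy-circle for the struct above), which has the same prefix problem as make-*. The :copier option renames it the same way. Here’s a sketch using a hypothetical package named disc so it doesn’t clash with the examples above:

```elisp
(require 'cl-lib)

;; Both generated functions now carry the package prefix:
;; disc-create instead of make-disc, disc-copy instead of copy-disc.
(cl-defstruct (disc (:constructor disc-create)
                    (:copier disc-copy))
  x y radius)

(disc-radius (disc-copy (disc-create :x 0 :y 0 :radius 2)))
;; => 2
```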

Introducing EmacSQL

2014-02-06T05:52:37Z

Yesterday I made the first official release of EmacSQL, an Emacs package I’ve been working on for the past few weeks. EmacSQL is a high-level SQL database for Emacs. It primarily targets SQLite as a back-end, but it also currently supports PostgreSQL and MySQL.

https://github.com/skeeto/emacsql

It’s available on MELPA and is ready for immediate use. It depends on the finalizers package I added last week.

While there’s a non-Elisp component, SQLite, there are no special requirements for the user to worry about. When the package’s Elisp is compiled, if a C compiler is available it will use it to compile a SQLite binary for EmacSQL. If not, it will later offer to download a pre-built binary that I built. Ideally this makes the non-Elisp part of EmacSQL completely transparent and users can pretend Emacs has a built-in relational database.

The official SQLite command line shell is not used even if present, and I’ll explain why below.

Just as Skewer jump started my web development experience, EmacSQL has been a crash course in SQL and relational databases. Before starting this project I knew little about this topic and I’ve gained a lot of appreciation for it in the process. Building an Emacs extension is a very rapid way to dive into a new topic.

If you’re a total newb about this stuff like I was and want to learn SQL for SQLite yourself, I highly recommend Using SQLite. It’s a really solid introduction.

High-level SQL Compiler

By “high-level” I mean that it goes beyond assembling strings containing SQL code. In EmacSQL, statements are assembled from s-expressions which, behind the scenes, are compiled into SQL using some simple rules. This means if you already know SQL you should be able to hit the ground running with EmacSQL. Here’s an example,

(require 'emacsql)

;; Connect to the database, SQLite in this case:
(defvar db (emacsql-connect "~/office.db"))

;; Create a table with 3 columns:
(emacsql db [:create-table patients
             ([name (id integer :primary-key) (weight float)])])

;; Insert a few rows:
(emacsql db [:insert :into patients
             :values (["Jeff" 1000 184.2] ["Susan" 1001 118.9])])

;; Query the database:
(emacsql db [:select [name id]
             :from patients
             :where (< weight 150.0)])
;; => (("Susan" 1001))

;; Queries can be templates, using $s1, $i2, etc. as parameters:
(emacsql db [:select [name id]
             :from patients
             :where (> weight $s1)]
         100)
;; => (("Jeff" 1000) ("Susan" 1001))

A query is a vector of keywords, identifiers, parameters, and data. Thanks to parameters, these s-expression statements should not need to be constructed dynamically at run-time.

The compilation rules are listed in the EmacSQL documentation so I won’t repeat them in detail here. In short, lisp keywords become SQL keywords, row-oriented information is always presented as vectors, expressions are lists, and symbols are identifiers, except when quoted.

[:select [name weight] :from patients :where (< weight 150.0)]

That compiles to this,

SELECT name, weight FROM patients WHERE weight < 150.0;
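To give a feel for how little machinery these rules need, here is a toy compiler that handles just this example. It is a sketch of the idea, not EmacSQL’s actual implementation: the my-sql-* names are hypothetical, expressions are assumed to be binary operators, and only numbers are supported as data.

```elisp
(defun my-sql-compile-part (part)
  "Compile one element of a statement vector into a SQL fragment."
  (cond
   ;; Keywords become SQL keywords: :insert-into -> \"INSERT INTO\".
   ((keywordp part)
    (upcase (replace-regexp-in-string
             "-" " " (substring (symbol-name part) 1))))
   ;; Vectors are row-oriented information: comma-separated.
   ((vectorp part)
    (mapconcat #'my-sql-compile-part part ", "))
   ;; Lists are expressions; only (OP LEFT RIGHT) is handled here.
   ((consp part)
    (format "%s %s %s"
            (my-sql-compile-part (nth 1 part))
            (car part)
            (my-sql-compile-part (nth 2 part))))
   ;; Other symbols are identifiers.
   ((symbolp part) (symbol-name part))
   ;; Anything else is printed; only meaningful for numbers here.
   (t (format "%S" part))))

(defun my-sql-compile (statement)
  "Compile the vector STATEMENT into a SQL string."
  (concat (mapconcat #'my-sql-compile-part statement " ") ";"))

(my-sql-compile [:select [name weight] :from patients :where (< weight 150.0)])
;; => "SELECT name, weight FROM patients WHERE weight < 150.0;"
```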

Also, any readable lisp value can be stored in an attribute. Integers are mapped to INTEGER, floats are mapped to REAL, nil is mapped to NULL, and everything else is printed and stored as TEXT. The specifics vary depending on the back-end.

Parameters

A symbol beginning with a dollar sign is a parameter. It has a type — identifier (i), scalar (s), vector (v), schema (S) — and an argument position.

[:select [$i1] :from $i2 :where (< $i3 $s4)]

Given the arguments name people age 21, three symbols and an integer, it compiles to:

SELECT name FROM people WHERE age < 21;
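Recognizing a parameter is a matter of picking apart the symbol’s name. A minimal sketch, with the hypothetical name my-sql-parse-parameter, covering only the four type letters listed above:

```elisp
(defun my-sql-parse-parameter (symbol)
  "Return (TYPE . POSITION) if SYMBOL looks like a parameter, else nil."
  (let ((case-fold-search nil)   ; $s1 (scalar) and $S1 (schema) differ
        (name (symbol-name symbol)))
    (when (string-match "\\`\\$\\([isvS]\\)\\([0-9]+\\)\\'" name)
      (cons (intern (match-string 1 name))
            (string-to-number (match-string 2 name))))))

(my-sql-parse-parameter '$i1)  ; => (i . 1)
(my-sql-parse-parameter '$S2)  ; => (S . 2)
(my-sql-parse-parameter 'name) ; => nil
```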

A vector parameter supplies either rows to be inserted or a set for an IN expression.

[:insert-into people [name age] :values $v1]

Given the argument (["Jim" 45] ["Jeff" 34]), a list of two rows, this becomes,

INSERT INTO people (name, age) VALUES ('"Jim"', 45), ('"Jeff"', 34);

And this,

[:select * :from tags :where (in tag $v1)]

Given the argument [hiking camping biking], this becomes,

SELECT * FROM tags WHERE tag IN ('hiking', 'camping', 'biking');

When writing these expressions keep in mind the command emacsql-show-last-sql. It will display in the minibuffer the SQL result of the s-expression statement before the point.

Schemas

A table schema is a list whose first element is a column specification vector (i.e. row-oriented information is presented as vectors). The remaining elements are table constraints. Here are the examples from the documentation,

;; No constraints schema with four columns:
([name id building room])

;; Add some column constraints:
([(name :unique) (id integer :primary-key) building room])

;; Add some table constraints:
([(name :unique) (id integer :primary-key) building room]
 (:unique [building room])
 (:check (> id 0)))

In the handful of EmacSQL databases I’ve created for practice and testing, I’ve put the schema in a global constant. A table schema is a part of a program’s type specifications, and rows are instances of that type, so it makes sense to declare schemas up top with things like defstructs.

These schemas can be substituted into a SQL statement using a $S parameter (capital “S” for Schema).

(defconst foo-schema-people
  '([(person-id integer :primary-key) name age]))

;; ...

(defun foo-init (db)
  (emacsql db [:create-table $i1 $S2] 'people foo-schema-people))

Back-ends

Everything I’ve discussed so far is restricted to the SQL statement compiler. It’s completely independent of the back-end implementations, themselves mostly handling strings of SQL statements.

SQLite Implementation Difficulties

A little over a year ago I wrote a pastebin webapp in Elisp. I wanted to use SQLite as a back-end for storing pastes but struggled to get the SQLite command shell, sqlite3, to cooperate with Emacs. The problem was that all of the output modes except for “tcl” are ambiguous. This includes the “csv” formatted output. TEXT values can dump newlines, allowing rows to span an arbitrary number of lines. They can dump things that look like the sqlite3 prompt, so it’s impossible to know when sqlite3 is done printing results. I ultimately decided the command shell was inadequate as an Emacs subprocess.

Recently there was some discussion from alexbenjm and Andres Ramirez on an Elfeed post about using SQLite as an Elfeed back-end. This inspired me to take another look and that’s when I came up with a workaround for SQLite’s ambiguity: only store printed Elisp values for TEXT values! With print-escape-newlines set, TEXT values no longer span multiple lines, and I can use read to pull in data from sqlite3. All of sqlite3’s output modes were now unambiguous.
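The printing half of that trick looks roughly like this. The function names are hypothetical; print-escape-newlines, prin1-to-string, and read-from-string are the standard Emacs facilities.

```elisp
(defun my-encode-text (value)
  "Print VALUE as a single line of readable Elisp."
  (let ((print-escape-newlines t))
    (prin1-to-string value)))

(defun my-decode-text (text)
  "Read back a value printed by `my-encode-text'."
  (car (read-from-string text)))

(my-encode-text "line one\nline two")
;; => "\"line one\\nline two\""

(my-decode-text (my-encode-text "line one\nline two"))
;; => "line one\nline two"
```

The encoded form contains a backslash followed by the letter n rather than an actual newline, so each stored value occupies exactly one line of output.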

However, after making significant progress I discovered an even bigger issue: GNU Readline. The sqlite3 binary provided by Linux package repositories is almost always compiled with Readline support. This makes the tool much more friendly to use, but it’s a huge problem for Emacs.

First, sqlite3 the command shell is not up to the same standards as SQLite the database. Not by a long shot. In my short time working with SQLite I’ve already discovered several bugs in the command shell. For one, it’s not properly integrated with GNU Readline. There’s an .echo meta-command that turns command echoing on and off. That is, it repeats your command back to you. Useful in some circumstances, though not mine. The bug is that this echo is separate from GNU Readline’s echo. When Readline is active and .echo is enabled, there are actually two echoes. Turn it off and there’s one echo.

Pseudo-terminals

Under some circumstances, like when communicating over a pipe rather than a PTY, Readline will mostly become deactivated. This would have been a workaround, but when Readline is disabled sqlite3 heavily buffers its output. This breaks any sort of interaction. Even worse, on Windows stderr is not always unbuffered, so sqlite3’s error messages may not appear for a long time (another bug).

Besides the problem of getting Readline to shut up, another problem is getting Readline to stop acting on control characters. The first 32 characters in ASCII are control characters. A pseudo-terminal (PTY) that is not in raw mode will immediately act upon any control characters it sees. There’s no escaping them.

Emacs communicates with subprocesses through a PTY by default (probably an early design mistake), limiting the kind of data that can be transmitted. You can try this yourself in a comint mode sometime where a subprocess is used (not a socket like SLIME). Fire up M-x sql-sqlite (part of Emacs) and try sending a string containing byte 0x1C (28, file separator). You can type one by pressing C-q C-\. Send that byte and the subprocess dies.

There are two ways to work around this. One is to use a pipe (bind process-connection-type to nil). Pipes don’t respond to control characters. This doesn’t work with sqlite3 because of the previously-mentioned buffering issue.
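For a program that doesn’t have sqlite3’s buffering problem, the pipe approach is a one-line change. A sketch, where my-start-pipe-process is a hypothetical wrapper:

```elisp
(defun my-start-pipe-process (name program &rest args)
  "Start PROGRAM with ARGS as process NAME, connected by a pipe.
The process has no associated buffer."
  (let ((process-connection-type nil))   ; nil = pipe, t = PTY
    (apply #'start-process name nil program args)))

;; Example, assuming a cat program is on the PATH:
;; (my-start-pipe-process "cat" "cat")
```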

The other way to work around this is to put the PTY in raw mode. Unfortunately there’s no function to do this so you need to call stty. Of course, this program needs to run on the same PTY, so a start-process-shell-command is required.

(start-process-shell-command name buffer "stty raw && ")

Windows has neither stty nor PTYs (nor any of PTY’s issues) so you’ll need to check the operating system before starting the process. Even this still doesn’t work for sqlite3 because Readline itself will respond to control characters. There’s no option to disable this.

There’s a package called esqlite that is also a SQLite front-end. It’s built to use sqlite3 and therefore suffers from all of these problems.

A Custom SQLite Binary

Since sqlite3 proved unreliable I developed my own protocol and external program. It’s just a tiny bit of C that accepts a SQL string and returns results as an s-expression. I’m no longer constrained to storing readable values, but I’m still keeping that paradigm. First, it keeps the C glue program simple and, more importantly, I can rely entirely on the Emacs reader to parse the results. This makes communication between Emacs and the subprocess as fast as it can possibly be. The reader is faster than any possible Elisp program.

As I mentioned before, this C program is compiled when possible, and otherwise a pre-built binary is fetched from my server (popular platforms only, obviously). It’s likely EmacSQL will have at least one working back-end on whatever you’re using.

Other Back-ends

Both PostgreSQL and MySQL are also supported, though these require the user have the appropriate client programs installed (psql or mysql). Both of these are much better behaved than sqlite3 and, with the stty trick, each can reliably be used without any special help. Both pass all of the unit tests, so, in theory, they’ll work just as well as SQLite.

To use them with the example at the beginning of this article, require emacsql-psql or emacsql-mysql, then swap emacsql-connect for the constructors emacsql-psql or emacsql-mysql (along with the proper arguments). All three of these constructors return an emacsql-connection object that works with the same API.

EmacSQL only goes so far to normalize the interfaces to these databases, so for any non-trivial program you may not be able to swap back-ends without some work. All of the EmacSQL functions that operate on connections are generic functions (EIEIO), so changing back-ends will only have an effect on the program’s SQL statements. For example, if you use a SQLite-ism (dynamic typing) it won’t translate to either of the other databases should they be swapped in.

I’ll cover the connections API, and what it takes to implement a new back-end, in a future post. Outside of the PTY caveats, it’s actually very easy. The MySQL implementation is just 80 lines of code.

EmacSQL’s Future

I hope this becomes a reliable and trusted database solution that other packages can depend upon. Twice so far, the pastebin demo and Elfeed, I’ve really wanted something like this and, instead, ended up having to hack together my own database.

I’ve already started a branch on Elfeed re-implementing its database in EmacSQL. Someday it may become Elfeed’s primary database if I feel there’s no disadvantage to it. EmacSQL builds SQLite with the full-text search engine enabled, which opens the door to a powerful, fast Elfeed search API. Currently the main obstacle is actually Elfeed’s database API being somewhat incompatible with ACID database transactions (shortsightedness on my part!).

Emacs Lisp Object Finalizers

2014-01-27T05:24:16Z

Update: Emacs 25.1 (released Sept. 2016) formally introduced finalizers to Emacs Lisp. This article is left here for historical purposes.

Problem: You have a special resource, such as a buffer or process, associated with an Emacs Lisp object which is not managed by the garbage collector. You want this resource to be cleaned up when the owning lisp object is garbage collected. Unlike some other languages, Elisp doesn’t provide finalizers for this job, so what do you do?

Solution: This is Emacs Lisp. We can just add this feature to the language ourselves!

I’ve already implemented this feature as a package called finalize, available on MELPA. I will be using it as part of a larger, upcoming project.

https://github.com/skeeto/elisp-finalize

In this article I will describe how it works.

Processes and Buffers

Processes and buffers are special types of objects. Immediately after instantiation these objects are added to a global list. They will never become unreachable without explicitly being killed. The garbage collector will never manage them for you.

This is a problem for APIs like those provided by the url package. The functions url-retrieve and url-retrieve-synchronously create buffers and hand them back to their callers. Ownership is transferred to the caller and the caller must be careful to kill the buffer, or transfer ownership again, before it returns. Otherwise the buffer is “leaked.” The url package tries to manage this a little bit with url-gc-dead-buffers, but this can’t be relied upon.

Another issue is when a process is started and is stored in a struct or some other kind of object. There is probably a “close” function that accepts one of these structs and kills the process. But if that function isn’t called, due to a bug or an error condition, it will become a “dangling” process. If the struct is completely lost, it will probably be inconvenient to deal with the process — the “close” function is no longer useful.

With Macros

A common way to deal with this problem is using a with- macro. This macro establishes a resource, evaluates a body, and ensures the resource is properly cleaned up regardless of the body’s termination state. The latter is accomplished using unwind-protect. For example, with-temp-buffer,

;; Fetch the first 10 bytes of foo.txt
(with-temp-buffer
  (insert-file-contents "foo.txt" nil 0 10)
  (buffer-string))

This expands (roughly) to the following expression.

(let ((temp-buffer (generate-new-buffer "*temp*")))
  (with-current-buffer temp-buffer
    (unwind-protect
        (progn
          (insert-file-contents "foo.txt" nil 0 10)
          (buffer-string))
      (and (buffer-live-p temp-buffer)
           (kill-buffer temp-buffer)))))

For dealing with open files, Common Lisp has with-open-stream. It establishes a binding for a new stream over its body and ensures the stream is closed when the body is complete. There’s no chance for a stream to be left open, leaking a system resource.
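The same pattern is easy to apply to other resources. Here’s a sketch of a with- macro for a subprocess whose lifetime is confined to a body. The name my-with-process and its argument layout are my own invention for illustration.

```elisp
(defmacro my-with-process (spec &rest body)
  "Run BODY with a subprocess, killing it afterwards.
SPEC is (VAR NAME PROGRAM ARGS...).  VAR is bound to a process
called NAME running PROGRAM with ARGS.  The process is deleted
when BODY exits, whether normally or by a non-local exit."
  (declare (indent 1))
  (let ((var (car spec)))
    `(let ((,var (start-process ,(nth 1 spec) nil ,@(nthcdr 2 spec))))
       (unwind-protect
           (progn ,@body)
         (when (process-live-p ,var)
           (delete-process ,var))))))

;; Example, assuming a ping program is on the PATH:
;; (my-with-process (p "ping" "ping" "localhost")
;;   (process-id p))
```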

However, with- macros aren’t useful in asynchronous situations. In Emacs this would be the case for asynchronous sub-processes, such as an attached language interpreter. The extent of the process goes beyond a single body.

Finalizers

What would really be useful is to have a callback — a finalizer — that runs when an object is garbage collected. This ensures that the resource will not outlive its owner, restoring management back to the garbage collector. However, Emacs provides no such hook.

Fortunately this feature can be built using weak hash tables and the post-gc-hook, a list of functions that are run immediately after garbage collection.

Weak References

I’ve discussed before how to create weak references in Elisp. The only weak references in Emacs are built into weak hash tables. Normally the language provides weak references first and hash tables are built on top of them. With Emacs we do this backwards.

The make-hash-table function accepts a key argument :weakness to specify how strongly keys and values should be held by the table. To make a weak reference just create a hash table of size 1 and set :weakness to t.

(defun weak-ref (thing)
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (prog1 ref
      (setf (gethash t ref) thing))))

(defun deref (ref)
  (gethash t ref))

The same trick can be used to detect when an object is garbage collected. If the result of deref is nil, then the object was garbage collected. (Or the weakly-referenced object is nil, but this object will never be garbage collected anyway.)

To check if we need to run a finalizer all we have to do is create a weak reference to the object, then check the reference after garbage collection. This check can be done in a post-gc-hook function.

Registration

To avoid cluttering up post-gc-hook with one closure per object we’ll keep a register of all watched objects.

(defvar finalizable-objects ())

(defun register (object callback)
  (push (cons (weak-ref object) callback) finalizable-objects))

Now a function to check for missing objects, try-finalize.

(defun try-finalize ()
  (let ((alive (cl-remove-if-not #'deref finalizable-objects :key #'car))
        (dead (cl-remove-if #'deref finalizable-objects :key #'car)))
    (setf finalizable-objects alive)
    (mapc #'funcall (mapcar #'cdr dead))))

(add-hook 'post-gc-hook #'try-finalize)

Now to try it out. Create a process, stuff it in a vector (like a defstruct), register delete-process as a finalizer, and, for the sake of demonstration, immediately forget the vector.

;;; -*- lexical-binding: t; -*-
(let ((process (start-process "ping" nil "ping" "localhost")))
  (register (vector process) (lambda () (delete-process process))))

;; Assuming the garbage collector has not already run.
(get-process "ping")
;; => #<process ping>

;; Force garbage collection.
(garbage-collect)

(get-process "ping")
;; => nil

The garbage collector killed the process for us!

There are some problems with this implementation. Using cl-remove-if is unwise in a post-gc-hook function. It allocates lots of new cons cells but garbage collection is inhibited while the function is run. The docstring warns us:

Garbage collection is inhibited while the hook functions run, so be careful writing them.

Similarly, all of the finalizers are run within the context of this memory-sensitive hook. Instead they should be delayed until the next evaluation turn (i.e. run-at-time of 0). Some of the finalizers could also fail, which would cause the remaining finalizers to never run. The real implementation deals with all of these issues.
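As a rough illustration of those fixes (not the package’s actual code), here is a sketch of a more careful version. All of the my- names are hypothetical. It walks the list once, defers each finalizer with run-at-time, and isolates failures with condition-case. It still allocates a few cons cells while partitioning the list, so it reduces the allocation problem rather than eliminating it.

```elisp
(defvar my-finalizable-objects ()
  "List of (WEAK-REF . CALLBACK) pairs for watched objects.")

(defun my-weak-ref (thing)
  "Return a weak reference to THING."
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (prog1 ref
      (puthash t thing ref))))

(defun my-register (object callback)
  "Arrange for CALLBACK to run after OBJECT is garbage collected."
  (push (cons (my-weak-ref object) callback) my-finalizable-objects))

(defun my-run-finalizer (callback)
  "Call CALLBACK, reporting rather than propagating any error."
  (condition-case err
      (funcall callback)
    (error (message "finalizer failed: %S" err))))

(defun my-try-finalize ()
  "Schedule finalizers for objects that have been collected."
  (let ((alive ())
        (dead ()))
    ;; One pass over the list instead of two filtered copies.
    (dolist (entry my-finalizable-objects)
      (if (gethash t (car entry))
          (push entry alive)
        (push (cdr entry) dead)))
    (setq my-finalizable-objects alive)
    ;; Defer the real work until after the hook has returned.
    (dolist (callback dead)
      (run-at-time 0 nil #'my-run-finalizer callback))))

(add-hook 'post-gc-hook #'my-try-finalize)
```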

A major drawback to these Emacs Lisp finalizers compared to other languages is that the actual object is not available. We don’t know it’s getting collected until after it’s already gone. This solves the object resurrection problem, but it’s darn inconvenient. One possible workaround in the case of defstructs and EIEIO objects is to make a copy of the original object (copy-sequence or clone) and run the finalizer on the copy as if it was the original.

The Real Implementation

The real implementation is more carefully namespaced and its API has just one function: finalize-register. It works just like register above but it accepts &rest arguments to be passed to the finalizer. This makes the registration call simpler and avoids some significant problems with closures.

(let ((process (start-process "ping" nil "ping" "localhost")))
  (finalize-register (vector process) #'delete-process process))

Here’s a more formal example of how it might really be used.

(cl-defstruct (pinger (:constructor pinger--create))
  process host)

(defun pinger-create (host)
  (let* ((process (start-process "pinger" nil "ping" host))
         (object (pinger--create :process process :host host)))
    (finalize-register object #'delete-process process)
    object))

To make things cleaner for EIEIO classes there’s also a finalizable mixin class that ensures the finalize generic function is called on a copy of the object (the original object is gone) when it’s garbage collected.

Here’s how it would be used for the same “pinger” concept, this time as an EIEIO class. An advantage here is that anyone can manually call finalize early if desired.

(require 'eieio)
(require 'finalizable)

(defclass pinger (finalizable)
  ((process :initarg :process :reader pinger-process)
   (host :initarg :host :reader pinger-host)))

(defun pinger-create (host)
  (make-instance 'pinger
                 :process (start-process "ping" nil "ping" host)
                 :host host))

(defmethod finalize ((pinger pinger))
  (delete-process (pinger-process pinger)))

It’s a small package but I think it can be quite handy.

Measure Elisp Object Memory Usage with Calipers

2014-01-26T01:15:02Z

A couple of weeks ago I wrote a library to measure the retained memory footprint of arbitrary Elisp objects for the purposes of optimization. It’s called Caliper.

https://github.com/skeeto/caliper

Note, Caliper requires predd, my predicate dispatch library. Neither of these packages is on MELPA or Marmalade since they’re mostly for fun.

The reason I wanted this was that I came across a post on reddit where someone had scraped 217,000 Jeopardy! questions from J! Archive and dumped them out into a single, large JSON file. The significance of the effort is that it dealt with some of the inconsistencies of J! Archive’s data presentation, normalizing them for the JSON output.

JEOPARDY_QUESTIONS1.json.gz (12MB, 53MB uncompressed)

When I want to examine a JSON dataset like this I have three preferred options:

Load it into a browser page and poke at it from JavaScript remotely with Skewer. With the JSON text weighing in at 53MB and with such a large object count, I decided this was too large for a browser page. It definitely could be done, it’s just that the browser is not the place to be working on large datasets.
Load it into Clojure. I’m familiar with Clojure’s data.json. This is not a bad choice, but there’s something else I always reach for first if I can.
Load it into Emacs using json.el (part of Emacs). This is what I ended up doing.

(defvar jeopardy
  (with-temp-buffer
    (insert-file-contents "/tmp/JEOPARDY_QUESTIONS1.json")
    (json-read)))

(length jeopardy)
;; => 216930

Here, jeopardy is bound to a vector of 216,930 association lists (alists). I’m curious exactly how much heap memory this data structure is using. To find out, we need to walk the data structure and sum the sizes of everything we come across. However, care must be taken not to count the identical objects twice, such as symbols, which, being interned, appear many times in this data.

Measuring Object Sizes

This is lisp so let’s start with the cons cell. A cons cell is just a pair of pointers, called car and cdr.

These are used to assemble lists.

So a cons cell itself — the shallow size — is two words: 16 bytes on a 64-bit operating system. To make sure Elisp doesn’t happen to have any additional information attached to cons cells, let’s take a look at the Emacs source code.

struct Lisp_Cons
  {
    /* Car of this cons cell.  */
    Lisp_Object car;

    union
    {
      /* Cdr of this cons cell.  */
      Lisp_Object cdr;

      /* Used to chain conses on a free list.  */
      struct Lisp_Cons *chain;
    } u;
  };

The return value from garbage-collect backs this up. The first value after each type is the shallow size of that type. From here on, all values have been computed for 64-bit Emacs running on x86-64 GNU/Linux.

(garbage-collect)
;; => ((conses 16 9923172 2036943)
;;     (symbols 48 57017 54)
;;     (miscs 40 10203 18892)
;;     (strings 32 4810027 197961)
;;     (string-bytes 1 104599635)
;;     (vectors 16 103138)
;;     (vector-slots 8 2921744 131076)
;;     (floats 8 12494 5816)
;;     (intervals 56 119911 69249)
;;     (buffers 960 134)
;;     (heap 1024 593412 133853))

A Lisp_Object is just a pointer to a lisp object. The retained size of a cons cell is its shallow size plus, recursively, the retained size of the objects in its car and cdr.

Integers and Floats

Integers are a special case. Elisp uses what are called tagged integers. They’re not heap-allocated objects. Instead they’re embedded inside the object pointers. That is, those Lisp_Object pointers in Lisp_Cons will hold integers directly. This means that, to Caliper, integers have a retained size of 0. We can use this to verify Caliper’s return value for cons cells.

(caliper-object-size 100)
;; => 0

(caliper-object-size (cons 100 200))
;; => 16

Tagged integers are fast and save on memory. They also compare properly with eq, which is just a pointer (identity) comparison. However, because a few bits need to be reserved for differentiating them from actual pointers these integers have a restricted dynamic range.

Floats are not tagged and exist as immutable objects in the heap. That’s why eql is still useful in Elisp — it’s like eq but will handle numbers properly. (By convention you should use eql for integers, too.)
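A small illustration using only built-ins: two floats computed separately are separate heap objects, so eql (or =) is the dependable comparison. Whether eq happens to return t for a given pair of floats is an implementation detail, so no result is shown for it here.

```elisp
(let ((a (/ 3.0 2))
      (b (/ 3.0 2)))
  (list (eql a b)    ; same type and same value
        (= a b)))    ; numeric equality
;; => (t t)
```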

Symbols and Strings

Not counting the string’s contents, a string’s base size is 32 bytes according to garbage-collect. The length of the string can’t be used here because that counts characters, which vary in size. There’s a string-bytes function for this. A string’s size is 32 plus its string-bytes value.

(string-bytes "naïveté")
;; => 9
(caliper-object-size "naïveté")
;; => 41  (i.e. 32 + 9)

As you can see from above, symbols are huge. Without even counting either the string holding the name of the symbol or the symbol’s plist, a symbol is 48 bytes.

(caliper-object-size 'hello)
;; => 1038

This 1,038 bytes is a little misleading. The symbol itself is 48 bytes, the string "hello" is 37 bytes, and the plist is nil. The retained size of nil is significant. On my system, nil’s plist has 4 key-value pairs, which themselves have retained sizes. When examining symbols, caliper doesn’t care if they’re interned or not, including symbols like nil and t. However, nil is only counted once, so it will have little impact on a large data structure.

Miscellaneous

Outside of vectors, measuring object sizes starts to get fuzzy. For example, it’s not possible to examine the exact internals of a hash table from Elisp. We can see its contents and the number of elements it can hold without re-sizing, but there’s intermediate structure that’s not visible. Caliper makes rough estimates for each of these types.

Circularity and Double Counting

To avoid double counting objects, a hash table with a test of eq is dynamically bound by the top level call. It’s used like a set. Before an object is examined, the hash table is checked. If the object is listed, the reported size is 0 (it consumes no space beyond what has already been accounted for).

This automatically solves the circularity problem. There’s no way we can traverse into the same data structure a second time because we’ll stop when we see it twice.
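Here's a minimal sketch of that idea, not Caliper's actual code. The function my-retained-size is hypothetical, it passes the set as an argument rather than binding it dynamically, it only understands integers, conses, and strings, and it hardcodes the 64-bit sizes reported by garbage-collect above. It also treats every integer as tagged and recurses on the cdr, so very long lists would exceed the evaluation depth limit.

```elisp
(defun my-retained-size (object &optional seen)
  "Roughly estimate the retained size of OBJECT in bytes.
SEEN is an `eq' hash table of objects already counted."
  (let ((seen (or seen (make-hash-table :test 'eq))))
    (cond
     ;; Tagged integers live inside the pointer: no heap space.
     ((integerp object) 0)
     ;; Already counted (shared or circular structure).
     ((gethash object seen) 0)
     (t
      (puthash object t seen)
      (cond
       ((consp object)
        (+ 16
           (my-retained-size (car object) seen)
           (my-retained-size (cdr object) seen)))
       ((stringp object)
        (+ 32 (string-bytes object)))
       ;; Every other type is ignored in this sketch.
       (t 0))))))

;; A shared string is only counted once: 16 + 16 + (32 + 5)
(let ((s (make-string 5 ?x)))
  (my-retained-size (list s s)))
;; => 69

;; A circular list terminates: two cons cells
(let ((c (list 1 2)))
  (setcdr (cdr c) c)
  (my-retained-size c))
;; => 32
```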

Using Caliper

So what’s the total retained size of the jeopardy structure? About 124MB.

(caliper-object-size jeopardy)
;; => 130430198

For fun, let’s see how much we can improve on this.

json.el will return alists for objects by default, but this can be changed by setting json-object-type to something else. Initially I thought maybe using plists instead would save space, but I later realized that plists use exactly the same number of cons cells as alists. If this doesn’t sound right, try to picture the cons cells in your head (an exercise for the reader).

(defvar jeopardy
  (let ((json-object-type 'plist))
    (with-temp-buffer
      (insert-file-contents "~/JEOPARDY_QUESTIONS1.json")
      (goto-char (point-min))
      (json-read))))

(caliper-object-size jeopardy)
;; => 130430077 (plist)

Strangely this is 121 bytes smaller. I don’t know why yet, but in the scope of 124MB that’s nothing.
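For anyone who'd rather check the cons cell claim than picture it, here's a tiny sketch. The helper my-count-conses is hypothetical and simply counts every cons cell reachable from a tree.

```elisp
(defun my-count-conses (tree)
  "Count the cons cells that make up TREE."
  (if (consp tree)
      (+ 1
         (my-count-conses (car tree))
         (my-count-conses (cdr tree)))
    0))

(my-count-conses '((a . 1) (b . 2)))  ; alist
;; => 4
(my-count-conses '(a 1 b 2))          ; plist
;; => 4
```

An alist spends one cell on the spine and one on the pair for each entry, while a plist spends two spine cells per entry, so both come to two cells per key.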

So what do these questions look like?

(elt jeopardy 0)
;; => (:show_number "4680"
;;     :round "Jeopardy!"
;;     :answer "Copernicus"
;;     :value "$200"
;;     :question "..." ;; omitted
;;     :air_date "2004-12-31"
;;     :category "HISTORY")

They’re (now) plists of 7 pairs. All of the keys are symbols, and, as such, are interned and consuming very little memory. All of the values are strings. Surely we can do better here. The strings can be interned and the numbers can be turned into tagged integers. The :category values would probably be good candidates for conversion into symbols.

Here’s an interesting fact about Jeopardy! that can be exploited for our purposes. While Jeopardy! covers a broad range of trivia, it does so very shallowly. The same answers appear many times. For example, the very first answer from our dataset, Copernicus, appears 14 times. That makes even the answers good candidates for interning.

(cl-loop for question across jeopardy
         for answer = (plist-get question :answer)
         count (string= answer "Copernicus"))
;; => 14

A string pool is trivial to implement. Just use a weak, equal hash table to track strings. Making it weak keeps it from leaking memory by holding onto strings for longer than necessary.

(defvar string-pool
  (make-hash-table :test 'equal :weakness t))

(defun intern-string (string)
  (or (gethash string string-pool)
      (setf (gethash string string-pool) string)))

(defun jeopardy-fix (question)
  (cl-loop for (key value) on question by #'cddr
           collect key
           collect (cl-case key
                     (:show_number (read value))
                     (:value (if value (read (substring value 1))))
                     (:category (intern value))
                     (otherwise (intern-string value)))))

(defvar jeopardy-interned
  (cl-map 'vector #'jeopardy-fix jeopardy))
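To see what the pool buys, here's a standalone sketch that uses its own hypothetical pool (my-pool and my-intern-string) so it can be evaluated without touching the definitions above. Two separately allocated but equal strings come back as the very same object.

```elisp
(defvar my-pool
  (make-hash-table :test 'equal :weakness t))

(defun my-intern-string (string)
  "Return the pooled string that is `equal' to STRING."
  (or (gethash string my-pool)
      (puthash string string my-pool)))

(let* ((a (copy-sequence "Copernicus"))
       (b (copy-sequence "Copernicus"))
       (before (eq a b))
       (after (eq (my-intern-string a) (my-intern-string b))))
  (list before after))
;; => (nil t)
```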

So how are we looking now?

(caliper-object-size jeopardy-interned)
;; => 83254322

That’s down to 79MB of memory. Not bad! If we print-circle this, taking advantage of string interning in the printed representation, I wonder how it compares to the original JSON.

(with-temp-buffer
  (let ((print-circle t))
    (prin1 jeopardy-interned (current-buffer))
    (buffer-size)))
;; => 45554437

About 43MB, down from JSON’s 53MB. With print-circle set to nil it’s about 48MB.

Emacs Byte-code Internals

2014-01-04T05:07:26Z

Byte-code compilation is an underdocumented — and in the case of the recent lexical binding updates, undocumented — part of Emacs. Most users know that Elisp is usually compiled into a byte-code saved to .elc files, and that byte-code loads and runs faster than uncompiled Elisp. That’s all users really need to know, and the GNU Emacs Lisp Reference Manual specifically discourages poking around too much.

People do not write byte-code; that job is left to the byte compiler. But we provide a disassembler to satisfy a cat-like curiosity.

Screw that! What if I want to handcraft some byte-code myself? :-) The purpose of this article is to introduce the internals of the Elisp byte-code interpreter. I will explain how it works, why lexically scoped code is faster, and demonstrate writing some byte-code by hand.

The Humble Stack Machine

The byte-code interpreter is a simple stack machine. The stack holds arbitrary lisp objects. The interpreter is backwards compatible but not forwards compatible (old versions can’t run new byte-code). Each instruction is between 1 and 3 bytes. The first byte is the opcode and the second and third bytes are either a single operand or a single immediate value. Some operands are packed into the opcode byte.

As of this writing (Emacs 24.3) there are 142 opcodes, 6 of which have been declared obsolete. Most opcodes refer to commonly used built-in functions for fast access. (Looking at the selection, Elisp really is geared towards text!) Considering packed operands, there are up to 27 potential opcodes unused, reserved for the future.

opcodes 48 - 55
opcode 97
opcode 128
opcodes 169 - 174
opcodes 180 - 181
opcodes 183 - 191

The easiest place to access the opcode listing is in bytecomp.el. Beware that some of the opcode comments are currently out of date.

Segmentation Fault Warning

Byte-code does not offer the same safety as normal Elisp. Bad byte-code can, and will, cause Emacs to crash. You can try it out for yourself right now,

emacs -batch -Q --eval '(print (#[0 "\300\207" [] 0]))'

Or evaluate the code manually in a buffer (save everything first!),

(#[0 "\300\207" [] 0])

This segfault, caused by referencing beyond the end of the constants vector, is not an Emacs bug. Doing a boundary test would slow down the byte-code interpreter. Not performing this test at run-time is a practical engineering decision. The Emacs developers have instead chosen to rely on valid byte-code output from the compiler, making a disclaimer to anyone wanting to write their own byte-code,

You should not try to come up with the elements for a byte-code function yourself, because if they are inconsistent, Emacs may crash when you call the function. Always leave it to the byte compiler to create these objects; it makes the elements consistent (we hope).

You’ve been warned. Now it’s time to start playing with firecrackers.

The Byte-code Object

A byte-code object is functionally equivalent to a normal Elisp vector except that it can be evaluated as a function. Elements are accessed in constant time, the syntax is similar to vector syntax ([...] vs. #[...]), and it can be of any length, though valid functions must have at least 4 elements.

There are two ways to create a byte-code object: using a byte-code object literal or with make-byte-code. Like vector literals, byte-code literals don’t need to be quoted.

(make-byte-code 0 "" [] 0)
;; => #[0 "" [] 0]

#[1 2 3 4]
;; => #[1 2 3 4]

(#[0 "" [] 0])
;; error: Invalid byte opcode

The elements of an object literal are:

Function parameter (lambda) list
Unibyte string of byte-code
Constants vector
Maximum stack usage
Docstring (optional, nil for none)
Interactive specification (optional)

Parameter List

The parameter list takes on two different forms depending on if the function is lexically or dynamically scoped. If the function is dynamically scoped, the argument list is exactly what appears in lisp code.

(byte-compile (lambda (a b &optional c)))
;; => #[(a b &optional c) "\300\207" [nil] 1]

There’s really no shorter way to represent the parameter list because preserving the argument names is critical. Remember that, in dynamic scope, while the function body is being evaluated these variables are globally bound (eww!) to the function’s arguments.

When the function is lexically scoped, the parameter list is packed into an Elisp integer, indicating the counts of the different kinds of parameters: required, &optional, and &rest.

The least significant 7 bits indicate the number of required arguments. Notice that this limits compiled, lexically-scoped functions to 127 required arguments. The 8th bit is the number of &rest arguments (up to 1). The remaining bits indicate the total number of optional and required arguments (not counting &rest). It’s really easy to parse these in your head when viewed as hexadecimal because each portion almost always fits inside its own “digit.”

(byte-compile-make-args-desc '())
;; => #x000  (0 args, 0 rest, 0 required)

(byte-compile-make-args-desc '(a b))
;; => #x202  (2 args, 0 rest, 2 required)

(byte-compile-make-args-desc '(a b &optional c))
;; => #x302  (3 args, 0 rest, 2 required)

(byte-compile-make-args-desc '(a b &optional c &rest d))
;; => #x382  (3 args, 1 rest, 2 required)
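Going the other way is just a few bit operations. This sketch unpacks a descriptor according to the layout described above; my-decode-args-desc is a hypothetical helper, not something provided by bytecomp.el.

```elisp
(defun my-decode-args-desc (desc)
  "Unpack the integer parameter descriptor DESC into a plist."
  (list :required (logand desc #x7f)       ; low 7 bits
        :rest (/= 0 (logand desc #x80))    ; 8th bit
        :total (ash desc -8)))             ; required + optional

(my-decode-args-desc #x202)
;; => (:required 2 :rest nil :total 2)

(my-decode-args-desc #x382)
;; => (:required 2 :rest t :total 3)
```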

The names of the arguments don’t matter in lexical scope: they’re purely positional. This tighter argument specification is one of the reasons lexical scope is faster: the byte-code interpreter doesn’t need to parse the entire lambda list and assign all of the variables on each function invocation.

Unibyte String Byte-code

The second element is a unibyte string — it strictly holds octets and is not to be interpreted as any sort of Unicode encoding. These strings should be created with unibyte-string because string may return a multibyte string. To disambiguate the string type to the lisp reader when higher values are present (> 127), the strings are printed in an escaped octal notation, keeping the string literal inside the ASCII character set.

(unibyte-string 100 200 250)
;; => "d\310\372"

It’s unusual to see a byte-code string that doesn’t end with 135 (#o207, byte-return). Perhaps this should have been implicit? I’ll talk more about the byte-code below.

Constants Vector

The byte-code has very limited operands. Most operands are only a few bits, some fill an entire byte, and occasionally two bytes. The meat of the function that holds all the constants, function symbols, and variable symbols is the constants vector. It’s a normal Elisp vector and can be created with vector or a vector literal. Operands reference either this vector or they index into the stack itself.

(byte-compile (lambda (a b) (my-func b a)))
;; => #[(a b) "\302\010\011\042\207" [b a my-func] 3]

Note that the constants vector lists the variable symbols as well as the external function symbol. If this was a lexically scoped function the constants vector wouldn’t have the variables listed, being only [my-func].

Maximum Stack Usage

This is the maximum stack space used by this byte-code. This value can be derived from the byte-code itself, but it’s pre-computed so that the byte-code interpreter can quickly check for stack overflow. Under-reporting this value is probably another way to crash Emacs.

Docstring

The simplest component and completely optional. It’s either the docstring itself, or if the docstring is especially large it’s a cons cell indicating a compiled .elc and a position for lazy access. Only one position, the start, is needed because the lisp reader is used to load it and it knows how to recognize the end.

Interactive Specification

If this element is present and non-nil then the function is an interactive function. It holds the exact contents of interactive in the uncompiled function definition.

(byte-compile (lambda (n) (interactive "nNumber: ") n))
;; => #[(n) "\010\207" [n] 1 nil "nNumber: "]

(byte-compile (lambda (n) (interactive (list (read))) n))
;; => #[(n) "\010\207" [n] 1 nil (list (read))]

The interactive expression is always interpreted, never byte-compiled. This is usually fine because, by definition, this code is going to be waiting on user input. However, it slows down keyboard macro playback.

Opcodes

The bulk of the established opcode bytes is for variable, stack, and constant access opcodes, most of which use packed operands.

0 - 7 : (stack-ref) stack reference
8 - 15 : (varref) variable reference (from constants vector)
16 - 23 : (varset) variable set (from constants vector)
24 - 31 : (varbind) variable binding (from constants vector)
32 - 39 : (call) function call (immediate = number of arguments)
40 - 47 : (unbind) variable unbinding (immediate = number of variables)
129, 192-255 : (constant) direct constants vector access

Except for the last item, each kind of instruction comes in sets of 8. The nth such instruction means access the nth thing. For example, the instruction “2” copies the third stack item to the top of the stack. An instruction of “9” pushes onto the stack the value of the variable named by the second element listed in the constants vector.

However, the 7th and 8th such instructions in each set take an operand byte or two. The 7th instruction takes a 1-byte operand and the 8th takes a 2-byte operand. A 2-byte operand is written in little-endian byte-order regardless of the host platform.

For example, let’s manually craft an instruction that returns the value of the global variable foo. Each opcode has a named constant of byte-X so we don’t have to worry about their actual byte-code number.

(require 'bytecomp)  ; named opcodes

(defvar foo "hello")

(defalias 'get-foo
  (make-byte-code
    #x000                 ; no arguments
    (unibyte-string
      (+ 0 byte-varref)   ; ref variable under first constant
      byte-return)        ; pop and return
    [foo]                 ; constants
    1))                   ; only using 1 stack space

(get-foo)
;; => "hello"

Ta-da! That’s a handcrafted byte-code function. I left a “+ 0” in there so that I can change the offset. This function has the exact same behavior, it’s just less optimal,

(defalias 'get-foo
  (make-byte-code
    #x000
    (unibyte-string
      (+ 3 byte-varref)     ; 4th form of varref
      byte-return)
    [nil nil nil foo]
    1))

If foo was the 10th constant, we would need to use the 1-byte operand version. Again, the same behavior, just less optimal.

(defalias 'get-foo
  (make-byte-code
    #x000
    (unibyte-string
      (+ 6 byte-varref)     ; 7th form of varref
      9                     ; operand, (constant index 9)
      byte-return)
    [nil nil nil nil nil nil nil nil nil foo]
    1))
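Reading these instructions back is the same arithmetic in reverse. Here's a sketch of a decoder for the packed families in opcodes 0 through 47, following the layout in the table above. Both my-packed-families and my-decode-packed are hypothetical names, and the input is a plain list of byte values.

```elisp
(defconst my-packed-families
  ["stack-ref" "varref" "varset" "varbind" "call" "unbind"]
  "Instruction families for opcodes 0-47, eight opcodes each.")

(defun my-decode-packed (bytes)
  "Decode the instruction at the head of BYTES, a list of integers.
Return (FAMILY OPERAND LENGTH), or nil if the opcode is above 47."
  (let ((opcode (car bytes)))
    (when (< opcode 48)
      (let ((family (aref my-packed-families (/ opcode 8)))
            (form (% opcode 8)))
        (cond
         ;; Forms 0-5: the operand is packed into the opcode itself.
         ((< form 6) (list family form 1))
         ;; Form 6 (the 7th): one operand byte follows.
         ((= form 6) (list family (nth 1 bytes) 2))
         ;; Form 7 (the 8th): two operand bytes, little-endian.
         (t (list family
                  (+ (nth 1 bytes) (* 256 (nth 2 bytes)))
                  3)))))))

(my-decode-packed '(9))
;; => ("varref" 1 1)

(my-decode-packed '(14 9 135))
;; => ("varref" 9 2)

(my-decode-packed '(15 1 2))
;; => ("varref" 513 3)
```

The second call decodes the start of the last get-foo above: the 7th form of varref followed by its operand byte.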

Dynamically-scoped code makes heavy use of varref but lexically-scoped code rarely uses it (global variables only), instead relying heavily on stack-ref, which is faster. This is where the different calling conventions come into play.

Calling Convention

Each kind of scope gets its own calling convention. Here we finally get to glimpse some of the really great work by Stefan Monnier updating the compiler for lexical scope.

Dynamic Scope Calling Convention

Remembering back to the parameter list element of the byte-code object, dynamically scoped functions keep track of all their argument names. Before executing a function the interpreter examines the lambda list and binds (varbind) every variable globally to an argument.

If the caller was byte-compiled, each argument started on the stack, was popped and bound to a variable, and, to be accessed by the function, will be pushed right back onto the stack (varref). There’s a lot of argument indirection for each function call.

Lexical Scope Calling Convention

With lexical scope, the argument names are not actually bound while the byte-code is evaluated. The names are completely gone because the compiler has converted local variables into stack offsets.

When calling a lexically-scoped function, the byte-code interpreter examines the integer parameter descriptor. It checks to make sure the appropriate number of arguments have been provided, and for each unprovided &optional argument it pushes a nil onto the stack. If the function has a &rest parameter, any extra arguments are popped off into a list and that list is pushed onto the stack.
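That setup step can be modeled in plain Elisp. This is only a sketch of the behavior described above, not the interpreter's C code: my-initial-stack is a hypothetical function that takes a descriptor and the caller's argument list and returns the arguments as the function would find them, first argument first.

```elisp
(defun my-initial-stack (desc args)
  "Return the argument stack a function with descriptor DESC sees for ARGS.
The first element of the result is the first argument."
  (let ((required (logand desc #x7f))
        (rest-p (/= 0 (logand desc #x80)))
        (total (ash desc -8))
        (count (length args)))
    (when (or (< count required)
              (and (not rest-p) (> count total)))
      (signal 'wrong-number-of-arguments (list desc count)))
    (let ((stack (append
                  ;; Arguments that fill required and &optional slots.
                  (butlast args (max 0 (- count total)))
                  ;; A nil for every &optional argument not provided.
                  (make-list (max 0 (- total count)) nil))))
      (if rest-p
          ;; Extra arguments are gathered into one list.
          (append stack (list (nthcdr total args)))
        stack))))

(my-initial-stack #x302 '(1 2))        ; (a b &optional c)
;; => (1 2 nil)

(my-initial-stack #x382 '(1 2 3 4 5))  ; (a b &optional c &rest d)
;; => (1 2 3 (4 5))
```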

From here the function can access its arguments directly on the stack without any named variable misdirection. It can even consume them directly.

;; -*- lexical-binding: t -*-
(defun foo (x) x)

(symbol-function #'foo)
;; => #[#x101 "\207" [] 2]

The byte-code for foo is a single instruction: return. The function’s argument is already on the stack so it doesn’t have to do anything. Strangely the maximum stack usage element is wrong here (2), but it won’t cause a crash.

;; (As of this writing `byte-compile' always uses dynamic scope.)

(byte-compile 'foo)
;; => #[(x) "\010\207" [x] 1]

It takes longer to set up (x is implicitly bound), it has to make an explicit variable dereference (varref), then it has to clean up by unbinding x (implicit unbind). It’s no wonder lexical scope is faster!

Note that there’s also a disassemble function for examining byte-code, but it only reveals part of the story.

(disassemble #'foo)
;; byte code:
;;   args: (x)
;; 0       varref    x
;; 1       return

Compiler Intermediate “lapcode”

The Elisp byte-compiler has an intermediate language called lapcode (“Lisp Assembly Program”), which is much easier to optimize than byte-code. It’s basically an assembly language built out of s-expressions. Opcodes are referenced by name and operands, including packed operands, are handled whole. Each instruction is a cons cell, (opcode . operand), and a program is a list of these.

Let’s rewrite our last get-foo using lapcode.

(defalias 'get-foo
  (make-byte-code
    #x000
    (byte-compile-lapcode
      '((byte-varref . 9)
        (byte-return)))
    [nil nil nil nil nil nil nil nil nil foo]
    1))

We didn’t have to worry about which form of varref we were using or even how to encode a 2-byte operand. The lapcode “assembler” took care of that detail.

Project Ideas?

The Emacs byte-code compiler and interpreter are fascinating. Having spent time studying them I’m really tempted to build a project on top of it all. Perhaps implementing a programming language that targets the byte-code interpreter, improving compiler optimization, or, for a really big project, JIT compiling Emacs byte-code.

People can write byte-code!

Emacs Lisp Readable Closures

2013-12-30T23:52:38Z

I’ve stated before that one of the unique features of Emacs Lisp is that its closures are readable. Closures can be serialized by the printer and read back in with the reader. I am unaware of any other programming language that has this feature. In fact it’s essential for Elisp byte-code compilation because byte-compiled Elisp files are merely s-expressions of byte-code dumped out as source.

Lisp Printing

The Lisp family of languages are homoiconic. Lisp source code is written in the syntax of its own data structures, s-expressions. Since a compiler/interpreter is usually provided at run-time, a consequence of this is that reading and printing are a fundamental feature of Lisps. A value can be handed to the printer, which will serialize the value into an s-expression as a sequence of characters. Later on the reader can parse the s-expression back into an equal value.

To compare, JavaScript originally had half of this in place. JavaScript has convenient object syntax for defining an associative array, known today as JSON. The eval function could (dangerously) be used as a reader for parsing a string containing JSON-encoded data into a value. But until JSON.stringify() became standard, developers had to write their own printer. Lisp s-expression syntax is much more powerful (and complicated) than JSON, maintaining both identity and cycles (e.g. *print-circle*).

Not all values can be read. They’ll still print (when *print-readably* is nil) but will do so using special syntax that will signal an error in the reader: #<. For example, in Emacs Lisp buffers cannot be serialized so they print using this syntax.

(prin1-to-string (current-buffer))
;; => "#<buffer *scratch*>"

It doesn’t matter what’s between the angle brackets, or even that there’s a closing angle bracket. The reader will signal an error as soon as it hits a #<.
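A quick way to test a particular value is to attempt the round trip and catch the reader's error. This is a sketch with a hypothetical helper, my-readable-p. It compares with equal, so it reports nil for values that read back fine but aren't equal to the original, such as hash tables.

```elisp
(defun my-readable-p (object)
  "Return non-nil if OBJECT prints and reads back as an `equal' value."
  (condition-case nil
      (equal object
             (car (read-from-string (prin1-to-string object))))
    ;; The reader signals an error when it reaches "#<".
    (error nil)))

(my-readable-p '(1 "two" [3]))
;; => t

(my-readable-p (current-buffer))
;; => nil
```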

Almost Everything Prints Readably

Elisp has a small set of primitive data types. All of these primitive types print readably:

integer (1024, ?a)
float (1.7)
cons/list ((...))
vector (one-dimensional, [...])
bool-vector (#&n"...")
string ("...")
char-table (#^[...])
hash-table (readable as of Emacs 23.3, #s(hash-table ...))
byte-code function object (#[...])
symbol

Here are all the non-readable types. Each one has a good reason for not being serializable.

buffer
process (external state)
frame (user interface element)
marker (live, automatically updates)
overlay (belongs to a buffer)
built-in functions (native code)
user-ptr (opaque pointers from Emacs 25 dynamic modules)

And that’s it. Every other value in Elisp is constructed from one or more of these primitives, including keymaps, functions, macros, syntax tables, defstruct structs, and EIEIO objects. This means that as long as these values don’t refer to an unreadable value, they themselves can be printed.

An interesting note here is that, unlike the Common Lisp Object System (CLOS), EIEIO objects are readable by default. To Elisp they’re just vectors, so of course they print. CLOS objects are unreadable without manually defining a print method per class.

Elisp Closures

Elisp got lexical scoping in Emacs 24, released in June 2012. It’s now one of the relatively few languages to have both dynamic and lexical scope. Like Common Lisp, variables declared with defvar (and family) continue to have dynamic scope. For backwards compatibility with old Lisp code, lexical scope is disabled by default. It’s enabled for a specific file or buffer by setting lexical-binding to non-nil.

With lexical scope, anonymous functions become closures, a powerful functional programming primitive: a function plus a captured lexical environment. It also provides some performance benefits. In my own tests, compiled Elisp with lexical scope enabled is about 10% to 15% faster than with the default dynamic scope.

What do closures look like in Emacs Lisp? They take on two forms depending on whether the closure is compiled or not. For example, consider this function, foo, that takes two arguments and returns a closure that returns the first argument.

;; -*- lexical-binding: t; -*-
(defun foo (x y)
  (lambda () x))

(foo :bar :ignored)
;; => (closure ((y . :ignored) (x . :bar) t) () x)

An uncompiled closure is a list beginning with the symbol closure. The second element is the lexical environment, the third is the argument list (lambda list), and the rest is the body of the function. Here we can see that both x and y have been “closed over.” This is a little bit sloppy because the function never makes use of y. Capturing it has a few problems.

The closure has a larger footprint than necessary.
Values are held longer than necessary, delaying collection.
It affects the readability of the closure, which I’ll get to later.

Fortunately the compiler is smart enough to see this and will avoid capturing unused variables. To prove this, I’ve now compiled foo so that it returns a compiled closure.

(foo :bar :ignored)
;; => #[0 "\300\207" [:bar] 1]

What’s returned here is a byte-code function object, with the #[...] syntax. It has these elements:

The function’s lambda list (zero arguments)
Byte-codes stored in a unibyte string
Constants vector
Maximum stack space needed by this function

Notice that the lexical environment has been captured in the constants vector, specifically noting the lack of :ignored in this vector. The compiler didn’t capture it.

For those curious about the byte-code here’s an explanation. The string syntax shown is in octal, representing a string containing two bytes: 192 and 135. The Elisp byte-code interpreter is stack-based. The 192 (constant 0) says to push the first constant onto the stack. The 135 (return) says to pop the top element from the stack and return it.

(append "\300\207" nil)
;; => (192 135)

The Readable Closures Catch

Since closures are byte-code function objects, they print readably. You can capture an environment in a closure, serialize it, read it back in, and evaluate it. That’s pretty cool! This means closures can be transmitted to other Emacs instances in a multi-processing setup (e.g. Elnode, Async).

The catch is that it’s easy to accidentally capture an unreadable value, especially buffers. Consider this function bar which uses a temporary buffer as an efficient string builder. It returns a closure that returns the result. (Weird, but stick with me here!)

(defun bar (n)
  (with-temp-buffer
    (let ((standard-output (current-buffer)))
      (cl-loop for i from 0 to n do (princ i))
      (let ((string (buffer-string)))
        (lambda () string)))))

The compiled form looks fine,

(bar 3)
;; => #[0 "\300\207" ["0123"] 1]

But the interpreted form of the closure has a problem. The with-temp-buffer macro silently introduced a new binding — an abstraction leak.

(bar 3)
;; => (closure ((string . "0123")
;;              (temp-buffer . #<killed buffer>)
;;              (n . 3) t)
;;      () string)

The temporary buffer is mistakenly captured in the closure, making it unreadable, but only in its uncompiled form. This creates the awkward situation where compiled and uncompiled code have different behavior.

Clojure-style Multimethods in Emacs Lisp

2013-12-18T23:06:15Z

This past week I added Clojure-style multimethods to Emacs Lisp through a package I call predd (predicate dispatch). I believe it is Elisp’s very first complete multiple dispatch object system! That is, methods are dispatched based on the dynamic, run-time type of more than one of their arguments.

https://github.com/skeeto/predd

(Unfortunately I was unaware of the other Clojure-style multimethod library when I wrote mine. However, my version is much more complete, has better performance, and is public domain.)

As of version 23.2, Emacs includes a CLOS-like object system cleverly named EIEIO. While CLOS (Common Lisp Object System) is multiple dispatch, EIEIO is, like most object systems, only single dispatch. The predd package is also very different than my other Elisp object system, @, which was prototype based and, therefore, also single dispatch (and comically slow).

The Clojure multimethods documentation provides a good introduction. The predd package works almost exactly the same way, except that due to Elisp’s lack of namespacing the function names are prefixed with predd-. Also different is that the optional hierarchy (h) argument is handled by the dynamic variable predd-hierarchy, which holds the global hierarchy.

Combination Example

To define a multimethod, pick a name and give it a classifier function. The classifier function will look at the method’s arguments and return a dispatch value. This value is used to select a particular method. What makes predd a multiple dispatch system is that the dispatch value can be derived from any number of the method’s arguments. Because the dispatch value is computed at run-time, this is called late binding.

Here I’m going to define a multimethod called combine that takes two arguments. It combines its arguments appropriately depending on their dynamic run-time types.

(predd-defmulti combine (lambda (a b) (vector (type-of a) (type-of b)))
  "Appropriately combine A and B.")

The classifier uses type-of, an Elisp built-in, to examine its argument types. It returns them as a tuple in the form of a vector. The classifier of a multimethod can be accessed with predd-classifier, which I’ll use to demonstrate what these dispatch values will look like.

(funcall (predd-classifier 'combine) 1 2)    ; => [integer integer]
(funcall (predd-classifier 'combine) 1 "2")  ; => [integer string]

I chose a vector for the dispatch value because I like the bracket style when defining methods (you’ll see below). The dispatch value can be literally anything that equal knows how to compare, not just vectors. Note that it’s actually faster to create a list than a vector up to a length of about 6, so this multimethod would be faster if the classifier returned a list — or even better: a single cons.

Now define some methods for different dispatch values.

(predd-defmethod combine [integer integer] (a b)
  (+ a b))

(predd-defmethod combine [string string] (a b)
  (concat a b))

(predd-defmethod combine [cons cons] (a b)
  (append a b))

Now try it out.

(combine 1 2)            ; => 3
(combine "a" "b")        ; => "ab"
(combine '(1 2) '(3 4))  ; => (1 2 3 4)

(combine 1 '(3 4))
; error: "No method found in combine for [integer cons]"

Notice that in the last case it didn’t know how to combine these two types, so it threw an error. In this simple example we’re only calling a single function, so rather than use the predd-defmethod macro, these methods can be added directly with the predd-add-method function. This has the exact same result except that it has slightly better performance (no wrapper functions).

(predd-add-method 'combine [integer integer] #'+)
(predd-add-method 'combine [string string]   #'concat)
(predd-add-method 'combine [cons cons]       #'append)
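To make the mechanism concrete, here is a stripped-down sketch of the same idea without predd: a classifier computes a dispatch value and an alist, searched with equal, maps dispatch values to functions. This is not how predd is implemented (there is no hierarchy and no cache), and the my/ names are hypothetical.

```elisp
;; Hypothetical method table: dispatch value -> function.
(defvar my/combine-methods
  (list (cons [integer integer] #'+)
        (cons [string string]   #'concat)
        (cons [cons cons]       #'append)))

(defun my/combine (a b)
  "Combine A and B by looking up their types in `my/combine-methods'."
  (let* ((dispatch (vector (type-of a) (type-of b)))   ; the classifier
         (method (cdr (assoc dispatch my/combine-methods))))
    (if method
        (funcall method a b)
      (error "No method found in my/combine for %S" dispatch))))

(my/combine 1 2)            ; => 3
(my/combine "a" "b")        ; => "ab"
(my/combine '(1 2) '(3 4))  ; => (1 2 3 4)
```

Because assoc compares keys with equal, any value that equal can compare works as a dispatch value here, just as described above.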

Use the Hierarchy

Hmmm, the + function is already polymorphic. It seamlessly operates on both floats and integers. So far it seems there’s no way to exploit this with multimethods. Fortunately we can solve this by defining our own ad hoc hierarchy using predd-derive. Both integers and floats are a kind of number. It’s important to note that type-of never returns number. We’re introducing that name here ourselves.

(type-of 1.0)  ; => float

(predd-derive 'integer 'number)
(predd-derive 'float 'number)

;; Types can derive from multiple parents, like multiple inheritance
(predd-derive 'integer 'exact)
(predd-derive 'float 'inexact)

This says that integer and float are each a kind of number. Now we can use number in a dispatch value. When it sees something like [float integer] it knows that it matches [number number].

(predd-add-method 'combine [number number] #'+)

(combine 1.5 2)  ; => 3.5

We can check the hierarchy explicitly with predd-isa-p (like Clojure’s isa?). It compares two values just like equal, but it also accounts for all predd-derive declarations. Because of this extra concern, unlike equal, predd-isa-p is not commutative.

(predd-isa-p 'number 'number)  ; => 0
(predd-isa-p 'float 'number)   ; => 1
(predd-isa-p 'number 'float)   ; => nil

(predd-isa-p [float float] [number number])  ; => 2

(Remember that 0 is truthy in Elisp.) The integer returned is a distance metric used by method dispatch to determine which values are “closer” so that the most appropriate method is selected.
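A distance like this can be computed with a short recursive walk over the hierarchy. The sketch below is a hypothetical re-implementation, not predd’s actual code: the hierarchy lives in a plain alist named my/parents, and vector distances are assumed to be the sum of the element-wise distances, which is consistent with the results shown above.

```elisp
;; Hypothetical hierarchy: each entry is (CHILD PARENT...).
(defvar my/parents
  '((integer number exact)
    (float   number inexact)))

(defun my/isa-distance (child parent)
  "Return the number of derivation steps from CHILD up to PARENT, or nil."
  (cond
   ((equal child parent) 0)
   ((and (vectorp child) (vectorp parent)
         (= (length child) (length parent)))
    ;; Sum element-wise distances; one mismatch makes the result nil.
    (let ((total 0))
      (dotimes (i (length child))
        (let ((d (and total
                      (my/isa-distance (aref child i) (aref parent i)))))
          (setq total (and d (+ total d)))))
      total))
   (t
    ;; Try every declared parent and keep the shortest path.
    (let ((best nil))
      (dolist (p (cdr (assq child my/parents)))
        (let ((d (my/isa-distance p parent)))
          (when (and d (or (null best) (< (1+ d) best)))
            (setq best (1+ d)))))
      best))))

(my/isa-distance 'number 'number)                  ; => 0
(my/isa-distance 'float 'number)                   ; => 1
(my/isa-distance 'number 'float)                   ; => nil
(my/isa-distance [float float] [number number])    ; => 2
```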

You might be worried that introducing number will make the multimethod slower. Examining the hierarchy will definitely have a cost after all. Fortunately predd has a dispatch cache, so introducing this indirection will have no additional performance penalty after the first call with a particular dispatch value.

Struct Example

Something that really sets these multimethods apart from other object systems is a lack of concern about encapsulation — or really about object data in general. That’s the classifier’s concern. So here’s an example of how to combine predd with defstruct from cl/cl-lib.

Imagine we’re making some kind of game where each of the creatures is represented by an actor struct. Each actor has a name, hit points, and active status effects.

(defstruct actor
  (name "Unknown")
  (hp 100)
  (statuses ()))

The defstruct macro has a useful inheritance feature that we can exploit for our game to create subtypes. The parent accessors will work on these subtypes, immediately providing some (efficient) polymorphism even before multimethods are involved.

(defstruct (player (:include actor))
  control-scheme)

(defstruct (stinkmonster (:include actor))
  (type 'sewage))

(actor-hp (make-stinkmonster))  ; => 100

As a side note: this isn’t necessarily the best way to go about modeling a game. We probably shouldn’t be relying on inheritance too much, but bear with me for this example.

Say we want an attack method for handling attacks between different types of actors. Elisp structs have a very useful property by default: they’re simply vectors whose first element is a symbol denoting their type. We can use this in a multimethod classifier.

(make-player)
;; => [cl-struct-player "Unknown" 100 nil nil]

(predd-defmulti attack
    (lambda (attacker victim)
      (vector (aref attacker 0) (aref victim 0)))
  "Perform an attack from ATTACKER on VICTIM.")
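A caveat for newer Emacs versions: since Emacs 26, cl-defstruct instances are records rather than plain vectors, and type-of returns the struct name directly. Under that assumption a classifier can be written without peeking at element 0. The struct and function names below are hypothetical stand-ins for the ones in this post, and the dispatch values become the bare struct names rather than cl-struct- symbols.

```elisp
(require 'cl-lib)

;; Hypothetical stand-ins for the actor/player structs above.
(cl-defstruct my-actor
  (name "Unknown")
  (hp 100)
  (statuses ()))

(cl-defstruct (my-player (:include my-actor))
  control-scheme)

(defun my/attack-classifier (attacker victim)
  "Return a dispatch value built from the struct types (Emacs 26+)."
  (vector (type-of attacker) (type-of victim)))

(my/attack-classifier (make-my-player) (make-my-actor))
;; => [my-player my-actor]
```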

Let’s define a base case. This will be overridden by more specific methods (determined by that distance metric).

(predd-defmethod attack [cl-struct-actor cl-struct-actor] (a v)
  (decf (actor-hp v) 10))

We could have instead used :default for the dispatch value, which is a special catch-all value. The actor-hp function will signal an error for any non-actor victim anyway. However, not using :default will force both argument types to be checked. It will also demonstrate specialization for the example.

However, before we can make use of this we need to teach predd about the relationship between these structs. It doesn’t check defstruct hierarchies. This step is what makes combining defstruct and predd a little unwieldy. A wrapper macro is probably due for this.

(predd-derive 'cl-struct-player 'cl-struct-actor)
(predd-derive 'cl-struct-stinkmonster 'cl-struct-actor)

(let ((player (make-player))
      (monster (make-stinkmonster)))
  (attack player monster)
  (actor-hp monster))
;; => 90

When the stinkmonster attacks players it doesn’t do damage. Instead it applies a status effect.

(predd-defmethod attack [cl-struct-stinkmonster cl-struct-player] (a v)
  (pushnew (stinkmonster-type a) (actor-statuses v)))

(let ((player (make-player))
      (monster (make-stinkmonster)))
  (attack monster player)
  (actor-statuses player))
;; => (sewage)

If the monster applied a status effect in addition to the default attack behavior then CLOS-style method combination would be far more appropriate here (if only it was available in Elisp). The method would instead be defined as an “after” method and it would automatically run in addition to the default behavior.

If I were actually building a system combining structs and predd, I would be using this helper function for building classifiers. It returns a dispatch value for selected arguments.

;;; -*- lexical-binding: t; -*-

(defun struct-classifier (&rest pattern)
  (lambda (&rest args)
    (loop for select-p in pattern and arg in args
          when select-p collect (elt arg 0))))

;; Takes 3 arguments, dispatches on the first 2 argument types.
(predd-defmulti speak (struct-classifier t t nil))

;; Messages sent to the player are displayed.
(predd-defmethod speak '(cl-struct-actor cl-struct-player) (from to message)
  (message "%s says %s." (actor-name from) message))

The Future

As of this writing there isn’t yet a prefer-method for disambiguating equally preferred dispatch values. I will add it in the future. I think prefer-method gets unwieldy quickly as the type hierarchy grows, so it should be avoided anyway.

I haven’t put predd in MELPA or otherwise published it yet. That’s what this post is for. But I think it’s ready for prime time, so feel free to try it out.

Emacs Lisp Reddit API Wrapper

2013-12-16T23:27:23Z

A couple of months ago I wrote an Emacs Lisp wrapper for the reddit API. I didn’t put it in MELPA, not yet anyway. If anyone is finding it useful I’ll see about getting that done. My intention was to give it some exercise and testing before putting it out there for people to use, locking down the API. You can find it here,

https://github.com/skeeto/emacs-reddit-api

Except for logging in, the library is agnostic about the actual API endpoints themselves. It just knows how to translate between Elisp and the reddit API protocol. This makes the library dead simple to use. I had considered supporting OAuth2 authentication rather than password authentication, but reddit’s OAuth2 support is pretty rough around the edges.

Library Usage

The reddit API has two kinds of endpoints, GET and POST, so there are really only three functions to concern yourself with.

reddit-login
reddit-get
reddit-post

And one variable,

reddit-session

The reddit-login function is really just a special case of reddit-post. It returns a session value (cookie/modhash tuple) that is used by the other two functions for authenticating the user. Just as you get automatically with almost all Elisp data structures — probably more so than any other popular programming language — it can be serialized with the printer and reader, allowing a reddit session to be maintained across Emacs sessions.

The return value of reddit-login generally doesn’t need to be captured. It automatically sets the dynamic variable reddit-session, which is what the other functions access for authentication. This can be bound with let to other session values in order to switch between different users.

Both reddit-get and reddit-post take an endpoint name and a list of key-value pairs in the form of a property list (plist). (The api-type key is automatically supplied.) They each return the JSON response from the server in association list (alist) form. The actual shape of this data matches the response from reddit, which, unfortunately, is inconsistent and unspecified, so writing any sort of program to operate on the API requires lots of trial and error. If the API responded with an error, these functions signal a reddit-error.

Typical usage looks like so. Notice that values need not be only strings; they just need to print to something reasonable.

;; Login first
(reddit-login "your-username" "your-password")

;; Subscribe to a subreddit
(reddit-post "/api/subscribe" '(:sr "t5_2s49f" :action sub))

;; Post a comment
(reddit-post "/api/comment/" '(:text "Hello world." :thing_id "t1_cd3ar7y"))
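To illustrate what “print to something reasonable” means for the values, here is a hypothetical sketch of turning such a plist into URL-encoded form data. It is not the library’s actual encoder; my/plist-to-query is a made-up name, and url-hexify-string comes from the built-in url-util.

```elisp
(require 'url-util)

(defun my/plist-to-query (plist)
  "Encode PLIST as a URL query string (illustrative sketch only)."
  (let ((pairs ()))
    (while plist
      (let ((key (pop plist))
            (value (pop plist)))
        (push (format "%s=%s"
                      ;; Drop the leading colon from the keyword.
                      (url-hexify-string (substring (symbol-name key) 1))
                      ;; Print any value the way `princ' would.
                      (url-hexify-string (format "%s" value)))
              pairs)))
    (mapconcat #'identity (nreverse pairs) "&")))

(my/plist-to-query '(:sr "t5_2s49f" :action sub))
;; => "sr=t5_2s49f&action=sub"
```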

For plist keys I considered automatically converting between dashes and underscores so that the keywords could have Lisp-style names. But the reddit API is inconsistent, using both, so there’s no correct way to do this.

To further refine the API it might be worth defining a function for each of the reddit endpoints, forming a facade for the wrapper library, hiding away the plist arguments and complicated responses. That would eliminate the trial and error of using the API.

(defun reddit-api-comment (parent comment)
  (if (null reddit-session)
      (error "Not logged in.")
    ;; TODO: reduce the return value into a thing/struct
    (reddit-post "/api/comment/" `(:thing_id ,parent :text ,comment))))

Furthermore there could be defstructs for comments, posts, subreddits, etc. so that the “thing” ID stuff is hidden away. This is basically what was already done for sessions out of necessity. I might add these structs and functions someday but I don’t currently have a need for it.

It would be neat to use this API to create an interface to reddit from within Emacs. I imagine it might look like one of the Emacs mail clients, or like Elfeed. Almost everything, including viewing image posts within Emacs, should be possible.

Background

For the last 3.5 years I’ve been a moderator of /r/civ, starting back when it had about 100 subscribers. As of this writing it’s just short of 60k subscribers and we’re now up to 9 moderators.

A few months ago we decided to institute a self-post-only Sunday. All day Sunday, midnight to midnight Eastern time, only self-posts are allowed in the subreddit. One of the other moderators was turning this on and off manually, so I offered to write a bot to do the job. There weren’t any Lisp wrappers yet (though raw4j could be used with Clojure), so I decided to write one.

As mentioned before, the reddit API leaves a lot to be desired. It randomly returns errors, so a correct program needs to be prepared to retry requests after a short delay, depending on the error. My particular annoyance is that the /api/site_admin endpoint requires that most of its keys are supplied, and it’s not documented which ones are required. Even worse, there’s no single endpoint to get all of the required values, the key names between endpoints are inconsistent, and even the values themselves can’t be returned as-is, requiring massaging/fixing before returning them back to the API.
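The retry-after-a-delay pattern can be sketched as a small generic wrapper. This is a hypothetical helper, not part of the library: it catches any error, not just reddit-error, and re-signals the last one once the attempts run out.

```elisp
(defun my/call-with-retries (function attempts delay)
  "Call FUNCTION up to ATTEMPTS times, waiting DELAY seconds after a failure.
Return FUNCTION's value, or re-signal the final error."
  (let ((remaining attempts)
        (result nil)
        (done nil))
    (while (not done)
      (condition-case err
          (setq result (funcall function)
                done t)
        (error
         (setq remaining (1- remaining))
         (if (<= remaining 0)
             (signal (car err) (cdr err))   ; out of attempts: give up
           (sleep-for delay)))))
    result))
```

A call such as (my/call-with-retries (lambda () (reddit-get ...)) 3 2) would then try a request up to three times, two seconds apart, with the endpoint arguments filled in as appropriate.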

I hope other people find this library useful!

Emacs, Thanksgiving, and Hanukkah

2013-11-28T22:25:36Z

Today is Thanksgiving in the United States. It also happens to be Hanukkah. There’s been news going around that Thanksgiving and Hanukkah will not coincide again for about 80,000 years. This sounded somewhat unbelievable to me because the Gregorian calendar repeats every 400 years. I decided to compute it for myself to double-check this figure.

I’m not Jewish and I know very little about Hanukkah, so I had to look it up. After learning that Hanukkah is based on the Hebrew calendar, the rumors were sounding more believable. The Hebrew calendar repeats every 689,472 Hebrew years. This means the correspondence between the Gregorian and Hebrew calendars only repeats about every 14 billion years. That 80,000 seems lowball.

Since I decided to use Emacs Lisp for the computation, I fortunately was able to ignore all the unfamiliar, complicated rules for the Hebrew calendar: Emacs knows how to compute Hebrew dates. It can be accessed through the function calendar-hebrew-date-string.

;; Thanksgiving 2013
(calendar-hebrew-date-string '(11 28 2013))
;; => "Kislev 25, 5774"

Hanukkah begins on the 25th of Kislev, so I can write a quick-and-dirty function to detect if a date is the first day of Hanukkah.

(defun hanukkah-p (date)
  "Return non-nil if DATE is the first day of Hanukkah."
  (string-match-p "^Kislev 25" (calendar-hebrew-date-string date)))
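A variant that skips the string formatting and regexp is sketched below. It assumes the cal-hebrew function calendar-hebrew-from-absolute, which returns a (month day year) list in which Kislev is month 9. The name my/hanukkah-p is hypothetical.

```elisp
(require 'calendar)
(require 'cal-hebrew)

(defun my/hanukkah-p (date)
  "Return non-nil if Gregorian DATE (month day year) is Kislev 25."
  (let ((hebrew (calendar-hebrew-from-absolute
                 (calendar-absolute-from-gregorian date))))
    ;; Hebrew dates are also (month day year); Kislev is month 9.
    (and (= 9 (nth 0 hebrew))
         (= 25 (nth 1 hebrew)))))

(my/hanukkah-p '(11 28 2013))  ; non-nil: Thanksgiving 2013
```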

Next I need a function to compute Thanksgiving, which is really simple. Thanksgiving falls on the fourth Thursday of November.

(defun thanksgiving (year)
  "Return the date of Thanksgiving for YEAR."
  (loop for day from 1 upto 7
        when (= 4 (calendar-day-of-week `(11 ,day ,year)))
        return `(11 ,(+ day 21) ,year)))
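The same date can also be computed without a loop: find the weekday of November 1st, step forward to the first Thursday, and add three weeks. This is just an alternative sketch with a hypothetical name.

```elisp
(require 'calendar)

(defun my/thanksgiving (year)
  "Return the fourth Thursday of November in YEAR as (month day year)."
  (let* ((dow (calendar-day-of-week (list 11 1 year)))  ; 0 = Sunday
         ;; Days from Nov 1 to the first Thursday (4), as a day of month.
         (first-thursday (1+ (mod (- 4 dow) 7))))
    (list 11 (+ first-thursday 21) year)))

(my/thanksgiving 2013)
;; => (11 28 2013)
```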

If there was no calendar-day-of-week I could compute it using Zeller’s algorithm, which I already happen to have implemented,

(defun cal/day-of-week (year month day)
  "Return day of week number (0-6, Sunday is 0)."
  (let* ((Y (if (< month 3) (1- year) year))
         (m (1+ (mod (+ month 9) 12)))
         (y (mod Y 100))
         (c (/ Y 100)))
    (mod (+ day (floor (- (* 26 m) 2) 10) y (/ y 4) (/ c 4) (* -2 c)) 7)))

Now for each year find Thanksgiving and test it for Hanukkah. I started with 1942 because that’s when the fourth-Thursday-of-November rule was established. Presumably due to the regexp part, this expression takes a moment to compute.

(loop for year from 1942 to 80000
      when (hanukkah-p (thanksgiving year))
      collect year)
;; => (2013 79043 79290 79537 79564 79635 79784 79811 79882)

My result exactly matches what I’m seeing elsewhere. The rumors are correct! The next coincidence occurs on November 23rd, 79043. Thanks, Emacs!

Elfeed Tips and Tricks

2013-11-26T00:38:20Z

This past weekend I had some questions from next-user-here (NUH) on my original Elfeed post about changing some of Elfeed’s behavior. NUH is an Elisp novice so accomplishing some of the requested modifications wasn’t obvious. A novice is mostly limited to setting variables, not defining advice or using hooks. I’ve also been using Elfeed daily for about three months now as my sole web feed reader and along the way I’ve developed some best practices. In addition to responding to some of NUH’s questions here, I’d like to share some tips and tricks.

Custom Entry Launchers

Currently you can press “b” to launch one or more entries in your browser. You can use “y” to copy a single entry to the clipboard. What if you want to add another action?

In my configuration I have a fancy binding that sends the entry URLs in the selected region to youtube-dl for downloading the videos. It’s too large to share as a snippet so here’s a small example of something similar using a program called xcowsay.

(defun xcowsay (message)
  (call-process "xcowsay" nil nil nil message))

(defun elfeed-xcowsay ()
  (interactive)
  (let ((entry (elfeed-search-selected :single)))
    (xcowsay (elfeed-entry-title entry))))

(define-key elfeed-search-mode-map "x" #'elfeed-xcowsay)

Now when I hit “x” over an entry in Elfeed I’m greeted by a cow announcing the title.

Entry Listing Customization

The search buffer you see when starting Elfeed, where entries are listed, can be customized a few different ways. First, this buffer does grow dynamically. After re-sizing the window/frame horizontally you just have to refresh the view by pressing g (an Emacs convention). How it fills out depends on the settings of these variables,

elfeed-search-title-max-width
elfeed-search-title-min-width
elfeed-search-trailing-width

They control how wide the different columns should be as the window size changes. An important caveat to this is that the cache stored in elfeed-search-cache must be cleared before the changes will be reflected in the display. This cache exists because building the display, assembling all the special faces, is actually quite CPU-intensive. It was an optimization I established early on.

(clrhash elfeed-search-cache)

If you set these variables in your start-up configuration you don’t need to worry about clearing the cache because it will already be empty. It’s only a concern when playing with the settings.

Date Display

Another question was about adding time to the entry listing. Elfeed only displays the entry’s date. Dates are formatted by the function elfeed-search-format-date. This can be redefined to display dates differently.

(defun elfeed-search-format-date (date)
  (format-time-string "%Y-%m-%d %H:%M" (seconds-to-time date)))

It’s given epoch seconds as a float and it returns a string to display as a date.
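To check a format string outside of Elfeed, the same conversion can be tried on a fixed timestamp. The sketch below uses a hypothetical function name and pins the output to UTC (the third argument to format-time-string) so the result doesn’t depend on the local time zone; the redefinition above uses local time instead.

```elisp
(defun my/format-epoch-utc (date)
  "Format DATE, epoch seconds as a float, as a UTC date and time."
  (format-time-string "%Y-%m-%d %H:%M" (seconds-to-time date) t))

(my/format-epoch-utc 1385426300.0)
;; => "2013-11-26 00:38"
```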

Faces and Colors

All of the faces used in the display are declared for customization, so these can be changed to whatever you like.

elfeed-search-date-face
elfeed-search-title-face
elfeed-search-feed-face
elfeed-search-tag-face

Say you suffered a head injury and decided you want your Elfeed dates to be bold, purple, and underlined,

(custom-set-faces
 '(elfeed-search-date-face
   ((t :foreground "#f0f"
       :weight extra-bold
       :underline t))))

Database Manipulation

Feeds and entries in the database can be manipulated to become whatever you want them to be. Because Elfeed is regularly modifying the database, the trick is to perform the manipulation at just the right time.

Feed Title Changes

Say you want to change a feed title because you don’t like the title supplied by the feed. For example, the title to my blog’s feed is “null program” but instead you think it should be “Seriously Handsome Programmer” (head injury, remember?). The function elfeed-db-get-feed can be used to fetch a feed’s data structure from the database, given its exact URL as listed in your elfeed-feeds.

(let ((feed (elfeed-db-get-feed "https://nullprogram.com/feed/")))
  (setf (elfeed-feed-title feed) "Seriously Handsome Programmer"))

Hold it, that didn’t work. First, that display cache is getting in the way again. Feed titles change very infrequently so they’re cached aggressively. More importantly, next time you update your feeds Elfeed will re-synchronize the feed title with the official title. It’s going to fight against your intervention.

The solution is to do it with a little bit of advice just before the title is displayed. Advise the function elfeed-search-update with some “before” advice.

(defadvice elfeed-search-update (before nullprogram activate)
  (let ((feed (elfeed-db-get-feed "https://nullprogram.com/feed/")))
    (setf (elfeed-feed-title feed) "Seriously Handsome Programmer")))

Entry Tweaking

Automatic entry modification should happen immediately upon discovery so that it looks like the entry arrived that way. This is done through the elfeed-new-entry-hook. Generally this would be used for applying custom tags. These examples are from the documentation:

;; Mark all YouTube entries
(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :feed-url "youtube\\.com"
                              :add '(video youtube)))

;; Entries older than 2 weeks are marked as read
(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :before "2 weeks ago"
                              :remove 'unread))

;; Building subset feeds
(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :feed-url "example\\.com"
                              :entry-title '(not "something interesting")
                              :add 'junk
                              :remove 'unread))

Due to a feature I recently ported from my personal configuration, this tagger helper function is less necessary. You can put lists in your elfeed-feeds list to supply automatic tags.

(setq elfeed-feeds
      '(("https://nullprogram.com/feed/" blog emacs)
        "http://www.50ply.com/atom.xml"  ; no autotagging
        ("http://nedroid.com/feed/" webcomic)))

Content Tweaking

Going beyond tagging you could change the content of the feed. Say you want to make feeds 100 times better.

(defun hundred-times-better (entry)
  (let* ((original (elfeed-deref (elfeed-entry-content entry)))
         (replace (replace-regexp-in-string "keyboard" "leopard" original)))
    (setf (elfeed-entry-content entry) (elfeed-ref replace))))

(add-hook 'elfeed-new-entry-hook #'hundred-times-better)

The same trick could be used to remove advertising, change the date, change the title, etc. The elfeed-deref and elfeed-ref parts are needed to fetch and store content in the content database. Only a reference is stored on the structure. You can actually use these functions at any time outside of Elfeed, but they’ll eventually get garbage collected if Elfeed doesn’t know about them.

(setf ref (elfeed-ref "Hello, World"))
;; => [cl-struct-elfeed-ref "907d14fb3af2b0d4f18c2d46abe8aedce17367bd"]

(elfeed-deref ref)
;; => "Hello, World"

Deletion

A question that’s been asked a few times is whether entries can be deleted. To start off, the answer to that question is “no.” There is no function provided to remove entries from the database. If you want to remove entries you’re probably taking the wrong approach.

The main problem with removal is that Elfeed needs to keep track of what it’s seen before. If an entry is removed and then rediscovered, it will reappear as unread. There are better ways to “remove” entries, such as tagging them specially.

On a moderately-powerful computer Elfeed can easily handle at least several tens of thousands of database entries. If “too many entries” ever becomes a performance problem I’d rather solve it by making the database faster than by removing information from the database. It’s already very date-oriented so that older entries are infrequently touched.

If storage is a concern, you shouldn’t get too worked up about that. As of this post I have about 6,000 entries in my database and the index file is only 3.5 MB. The content database after garbage collection, which is the data/ directory under ~/.elfeed/, with these 6k entries is 17 MB. When I run M-x elfeed-db-compact, currently an experimental feature, it drops down to 1.8 MB. That’s less than 1 kB per entry. It’s also less than my personal Liferea database of roughly the same amount of content (~15 MB) before I wrote Elfeed.

If even this storage is still too much you can always blow away your data/ content database directory. This is safe to do even while Emacs is running. You’ll still see all of the entries listed in the search buffer but won’t be able to read them within Emacs until after the next database update (when it re-fetches the most recent entry content).

You can also clear out the content database from within Elisp by visiting every entry and clearing its content field.

(with-elfeed-db-visit (entry _)
  (setf (elfeed-entry-content entry) nil))

(elfeed-db-gc)  ;; garbage collect everything

The same sort of expression can be used to run over all known entries to perform other changes. If there was a delete function you might use it here to remove entries older than a certain date, then hope they’re not rediscovered.

If you never want to store entry content (you never read entries within Emacs), you can use a hook to always drop it on the floor as it arrives,

(add-hook 'elfeed-new-entry-hook
          (lambda (entry) (setf (elfeed-entry-content entry) nil)))

Questions?

If you have any questions or suggestions about how to make Elfeed do what you want it to do, feel free to ask. Some things may actually require that I make changes to Elfeed to support it, though I hope I’ve anticipated your particular need well enough to avoid that.

The Elfeed Database

2013-09-09T05:53:41Z

The design of Elfeed’s database took some experimentation before any part of it was settled. A major design constraint was Emacs’ very limited file input/output. There’s no random access and, without the aid of an external program, files must always be read and written wholesale. That’s not database-friendly at all! In the end I settled on a design that minimized the size of the frequently rewritten parts, an index with two different data models, by storing immutable data in a loose-file, content-addressable database.

At the moment there really aren’t any pure-Elisp database solutions for Emacs. This is almost certainly due to the aforementioned I/O limitations. I ran into this same problem last year when I created an Emacs pastebin server. I attempted, and failed, to interface with a SQLite database through its command line program. Nic Ferrier has published a generic database interface, but it lacks concrete implementations.

As a bit of good news, as far as I know Emacs does properly handle atomic file updates across all platforms, so a pure-Elisp database developer would never have to worry about only writing half the database. It’s always a safe operation. Worst case scenario you’re left with an old version of data rather than no data at all.

A real possibility for a database would be connecting to an established database server via TCP with an Emacs network process. If the server has a specified wire protocol, Elisp could talk to it efficiently. In fact, there exists pg.el, which does exactly this for PostgreSQL. Unfortunately I was not able to get this working with my pastebin, nor is this solution appropriate for Elfeed. It would be unreasonable to require users to first set up a PostgreSQL server just to read web feeds!

Ultimately it would seem that any efficient Emacs database requires the help of an external program. The notmuch mail client, which inspired Elfeed, does this. To access the notmuch database a command line program is run once for each request. A query is passed as a program argument and the output of the program is parsed into the result.

The Early Database

For the first few days of its existence Elfeed only had an in-memory database. Closing Emacs would lose everything. For my personal usage patterns, where I read, or at least address, all entries that arrive — and especially because I use Elfeed on a couple of different computers — I don’t really need to track things long term. I could easily mark everything after a certain date as read and forget about them. However, it would be nice to have and, more importantly, many people wouldn’t use Elfeed without persistence between Emacs sessions.

So, for the first database I did what I always do: dumped the data structure to a file using the printer and parsed it back in later using the reader. This is dead simple in Lisp, it’s very fast, and it even works for circular data structures. It’s something I missed so much with the much-less-capable JSON format earlier this year that I wrote a JavaScript library to do it.

(defun save-data (file data)
  (with-temp-file file
    (let ((standard-output (current-buffer))
          (print-circle t))  ; Allow circular data
      (prin1 data))))

(defun load-data (file)
  (with-temp-buffer
    (insert-file-contents file)
    (read (current-buffer))))

(save-data "demo.dat" '(a b c ["1" 2 3]))
(load-data "demo.dat")
;; => (a b c ["1" 2 3])

Anything with a printed representation can be serialized and stored this way, including symbols, strings, numbers, lists, vectors (structs, objects), hash tables, and even compiled functions (.elc files). Basically every Emacs library that stores data on disk uses this technique.

Unfortunately, this is where I hit another serious database constraint: print-circle is broken in Emacs 24.3, the current stable release. This means Elfeed cannot take advantage of this useful feature, at least not for a long time, as I had been counting on. The final database is slightly slower and larger than strictly required as a result.

The Content Database

After breaking the circular references of the in-memory database I finally had persistence for the first time. With the naive printer/reader approach it was slow, almost 1 second to write just a few thousand entries on my 6-year-old laptop (my minimum requirements target machine). I wanted Elfeed to support hundreds of thousands of entries, if not millions, so this was much too slow.

The big slowdown was writing out all the entry content each time the database is saved. These are large strings containing HTML that rarely change. There’s no reason to write these out every time, nor is there a reason to even keep them in memory all the time, as they’re rarely accessed. The solution is a loose-file, content-addressable database, very similar to an unpacked Git object database.

The content database stores immutable sequences of characters — not just raw bytes, but rather multibyte strings — using an unspecified coding system (right now it’s UTF-8 for all platforms). The filename for the content is the content hashed with SHA-1 (“content-addressable”). To limit the number of files per directory, these files are stored in subdirectories named by the first hex-encoded byte of the hash (just like Git). A database of 4 items might look like this:

data/
   18/
      18ff6f11945b1e9f3e3c4cae8b5275d36b9944e1
      184c06a83f0bc73a8345c6d886f9043bcae095f8
   6b/
      6b59ae257f2bea24703d8adf5747049c138dfc82
   cc/
      cc47d53872ae2a9186151ef1a68392a94e1f091f
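
The layout above can be sketched in a few lines. This is only an illustration of the naming scheme, not Elfeed's actual code: the function name my-content-path is hypothetical, and encoding to UTF-8 before hashing is an assumption based on the coding system mentioned above.

```elisp
(defun my-content-path (root content)
  "Return the file name under ROOT where CONTENT would be stored.
The file is named by the SHA-1 of CONTENT and placed in a
subdirectory named by the first hex-encoded byte of that hash."
  (let ((hash (secure-hash 'sha1 (encode-coding-string content 'utf-8))))
    (expand-file-name hash
                      (expand-file-name (substring hash 0 2) root))))

;; The result ends in a two-character directory followed by the full hash.
(my-content-path "data" "Hello, world!")
```

Because the name depends only on the content, storing the same string twice maps to the same file, which is where the de-duplication comes from.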

Something really neat about the content database is that it’s completely agnostic about Elfeed. If it weren’t for Elfeed’s garbage collector, anyone could use it to store arbitrary content. The function elfeed-ref accepts a string and returns a reference into the database. Because of the hash, providing the same string in the future will return the same reference without actually performing a write. References are dereferenced with elfeed-deref.

(setf ref (elfeed-ref "Hello, world!"))
;; => [cl-struct-elfeed-ref "943a702d06f34599aee1f8da8ef9f7296031d699"]

(elfeed-deref ref)
;; => "Hello, world!"

With content stored elsewhere, entries are a struct containing only some small metadata: title, link, date, and a content database reference. Writing out many of them at once is much, much faster.

I don’t expect it happens often, but this also means content is de-duplicated. If two entries happen to have the same content they’ll share content database storage. A small savings.

At this point it’s really tempting to get fancier and really put this content database to use. The core index itself could be stored as raw content, and the root to accessing the database would be a single SHA-1 hash referencing it, again very similar to Git. If an index stores a reference to the previously written index, then the Elfeed database would be an immutable structure tracking its entire history. Such a change would cost virtually nothing in performance, just disk space.

Multiple Representations

With all the content out of the way, the database is now just a lean index. At this point it’s a hash table mapping feed IDs to feeds. Each feed contains a list of its entries. To build the entry listing for the elfeed-search buffer, Elfeed needs to visit each feed in the hash table, gather its entries into one giant list, then finally sort that list by date. At around O(n log n), that sort operation is a real performance killer. Completely unacceptable. To fix this we need to think about how the data is updated and used.

First, entries are always viewed in date order, no exceptions. From my experience of using web feeds for the last six years I never had a reason to list feed entries by any other order. The vast majority of the time, newer entries are most relevant, and if I need to look for something specific I can search for it.

We definitely want to store entries in date-order so we can create entry listings without performing a sort: something around O(n) or so. Inserting new entries into this structure should also be efficient.

Second, entries are never removed from the database. This isn’t e-mail. Even if a user doesn’t want to see an entry again, we have to keep track of it. Otherwise it will show up as new if it’s discovered in a feed again, which is likely. Things are added to the database and never removed. In Elfeed, I use a junk tag to completely hide entries I don’t want to see, and I always have a -junk element in my filter.

There’s an important caveat to this one that I had missed until after the public release: entry dates can change! When a previously discovered entry is read from a feed, Elfeed updates (read: mutates) the entry struct to reflect the new state. This includes the date. It’s very likely that a date-sorted representation won’t tolerate date changes underneath it since it’s keying off of them. Either we refuse to update the entry date, or we remove the entry, update the date, and then re-insert it (how it currently works).

Third, entries are generally added with a recent date. After the database is initially populated, it’s only picking up new items. We should prefer that adding recently-dated entries be faster than adding older entries. I didn’t get a chance to take advantage of this, but it’s something to keep in mind.

Fourth, entries need to be keyed by an ID string. Each entry has a unique, unchanging identifier string, either provided by the feed itself (RSS’s guid or Atom’s id) or generated intelligently by Elfeed. Especially because of the print-circle bug, we need to be able to talk about feeds in terms of their ID — an indirect pointer.

(Actually, even when RSS guid tags are present, they’re permalinks by default. So, unfortunately, RSS IDs are not at all resistant to collisions across feeds. To work around this, entry identifiers are a pair of strings: feed ID and entry ID. Atom doesn’t have this problem, but we’re stuck with the lowest common denominator.)

A date-oriented representation would be unable to efficiently look up an entry by its ID, so it needs to be supplemented by an ID-oriented representation. This means we need two representations in our database: date-oriented and ID-oriented.

So what do we use? Well, for keeping entries sorted by date we want some sort of balanced tree. A B-tree is probably a good choice. Rather than write one I went with an AVL tree since Emacs comes with a library for it (avl-tree). It’s already debugged and optimized! The bad news is that the internal structure is unspecified, so there are no guarantees that it can be serialized. A future update to the library may break the Elfeed database. I also had to hack into it to work around a security issue. The comparison function is embedded in the tree. After deserializing the database, Elfeed needs to ensure that no one stuck a malicious function in there.

The choice for an ID database was super-easy: a hash table. Due to the print-circle bug, this is actually the main representation. The AVL tree only stores IDs and it has to reach into the hash table to do any date comparisons. If print-circle was working I could store the same exact entry objects in the AVL tree as the hash table, so mutating them would update them in all representations. However, with print-circle off, on deserialization these would become unique objects and updates would break.
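
To make the arrangement concrete, here is a minimal sketch of the two representations working together, using the avl-tree library that ships with Emacs. It is not Elfeed's code: all the my-db- names are hypothetical, dates are plain numbers, and each entry is reduced to just its date.

```elisp
(require 'avl-tree)

(defvar my-db-dates (make-hash-table :test 'equal)
  "ID-oriented representation: maps an entry ID string to its date.")

(defun my-db-newer-p (a b)
  "Return non-nil if entry ID A should be listed before entry ID B.
Dates are looked up in `my-db-dates'; ties are broken by ID so
that two distinct entries never compare as equal."
  (let ((date-a (gethash a my-db-dates))
        (date-b (gethash b my-db-dates)))
    (if (= date-a date-b)
        (string< a b)
      (> date-a date-b))))

(defvar my-db-index (avl-tree-create #'my-db-newer-p)
  "Date-oriented representation: an AVL tree holding only entry IDs.")

(defun my-db-put (id date)
  "Add or update entry ID with DATE, keeping both representations in sync."
  (when (gethash id my-db-dates)
    ;; Remove under the old date first, since the tree is keyed on it.
    (avl-tree-delete my-db-index id))
  (puthash id date my-db-dates)
  (avl-tree-enter my-db-index id))

(my-db-put "b" 200)
(my-db-put "a" 100)
(my-db-put "c" 300)
(my-db-put "a" 400)             ; date change: delete, update, re-insert
(avl-tree-flatten my-db-index)  ; newest first, no sort needed
;; => ("a" "c" "b")
```

Note how the comparison function has to reach into the hash table for every comparison, which is the indirection described above.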

The Future

That’s where the database is today. I put in a few extra fields that aren’t actually used yet, so that there’s room to make a few changes without breaking the database. Perhaps someday I’ll work out a whole new database structure, or maybe a proper database library will come into existence, and this post will simply document the old database.

A Handy Emacs Package Configuration Macro

2013-06-02T00:00:00Z

Update April 2015: I now use use-package instead of the with-package macro explained below. It’s cleaner, nicer, and better maintained.

I was inspired by a post recently written by Milkypostman (the M in MELPA). He describes some of his init.el configuration, specifically focusing on an after macro that wraps the misdesigned eval-after-load function. I wanted to take this macro further in three ways:

The delayed expression should be properly byte-compiled, which doesn’t happen by default with eval-after-load.
In a few cases my expression depends on multiple, independent packages but eval-after-load only accepts one.
If I’m specifying packages when using my macro, why bother listing them at the top of my initialize file? I could DRY things up by learning what packages to install when the macro is used. Here’s the kicker: I can pretend that every available package is already installed like built-in packages!

The result is a pair of macros with-package and with-package* which can be found in package-helper.el. The latter form doesn’t wait but immediately loads the specified packages with require. It’s shaped just like Milkypostman’s after macro, except that it can accept a list of packages in place of a single symbol. Also, the package names aren’t quoted; they don’t need to be since this is a macro instead of a function.

Here’s a typical use case for each macro. That expose higher-order function is from my personal utility library. The expressions to be evaluated depend on both packages and neither needs to be loaded immediately, so I’m using the first form of the macro.

(with-package (skewer-mode utility)
  (skewer-setup)
  (define-key skewer-mode-map (kbd "C-c $")
    (expose #'skewer-bower-load "jquery" "1.9.1")))

(with-package* smex
  (smex-initialize)
  (global-set-key (kbd "M-x") 'smex))

For the second one, I’m going to be using smex right away (takes over M-x), so I use the second form, which immediately loads smex. The macro isn’t really necessary at all here since I could just use require and follow it with these expressions, but I really like how this organizes my init.el. It creates a domain-specific language (DSL) just for Emacs configuration. Each package configuration is grouped up in a clean let-like form. Since I’ve added syntax highlighting to with-package it looks very elegant. Normal syntax highlighters aren’t going to do this, so here’s a screenshot of my buffer.

JavaScript developers with a keen eye may notice a familiar pattern here. This macro is shaped a bit like the Asynchronous Module Definition (AMD), with asynchrony in mind. Since this is Lisp with a powerful macro system, I get to hide away the function wrapper part.
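
For readers who want to see the shape of such a macro, here is a much-reduced sketch of the deferred form. It is not the code from package-helper.el: the name my-with-features is hypothetical, it installs nothing, and it leans on with-eval-after-load, which assumes Emacs 24.4 or later and supplies the function wrapper so the body can be byte-compiled.

```elisp
(defmacro my-with-features (features &rest body)
  "Run BODY once every feature in FEATURES has been loaded.
FEATURES is an unquoted symbol or an unquoted list of symbols.
Hypothetical sketch: unlike with-package it installs nothing."
  (declare (indent 1))
  (let ((form `(progn ,@body)))
    ;; Wrap from the inside out so the first feature is the outermost wait.
    (dolist (feature (reverse (if (listp features) features (list features))))
      (setq form `(with-eval-after-load ',feature ,form)))
    form))

;; Expands to nested `with-eval-after-load' forms, one per feature.
(my-with-features (skewer-mode utility)
  (skewer-setup))
```

Nesting one waiting form per feature is the simplest way to require several independent packages. The body runs as soon as the last of them is loaded, or immediately if they all already are.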

Using this macro has caused me to use eval-after-load with just about everything. This has cut my initialization time down to about 10% of what it was before! On those occasions that I do restart Emacs, it’s really nice that it’s back to under 1 second (0.6 seconds vs 6 seconds).

The problem of eval-after-load

I’m calling eval-after-load poorly designed because it’s a perfect example of an inappropriate use of eval. In function form it should have accepted a function as its second argument instead of an s-expression, so it would work like a hook. This is even more inappropriate now that Emacs has proper lexical closures, which is the perfect mechanism for delayed evaluation. The whole point of eval-after-load is to speed up Emacs initialization time, but using eval is slow. To the compiler, this isn’t code, just data. This means no byte-compilation and no compiler warnings.

A possible alternative design for eval-after-load would be a hook named something like <package>-load-hook. Then when load or require loads a file, it runs the hook with the matching name. This removes eval-after-load as its own standalone language concept.

(add-hook 'skewer-mode-load-hook (lambda () ...))

The problem here is when the package is already loaded the hook is never run. In contrast, when eval-after-load is used on an already-loaded package, the expression is immediately evaluated.

Given this, if there was something I could change about this it would simply be for eval-after-load, whatever it would be called, to take a function for the second argument. I would also provide a simple macro just like after that wraps this function. Why not just a macro? The function form would be really useful for a situation like this,

(eval-after-load 'skewer-mode #'skewer-setup)

Here there’s no need to instantiate a new anonymous function or s-expression. If all it’s doing is calling a zero-arity function, that function can be passed in directly.

Prototype-based Elisp Objects with @

2013-04-07T00:00:00Z

Reflection from the future: This library is super slow and inefficient. It should probably not be used for anything serious.

Last weekend I had the itch to play around with a multiple-inheritance prototype-based object system in lisp. It would look a lot like JavaScript’s object system, but I wanted to try experimenting with some different ideas. My favorite lisp to hack in is Emacs Lisp, so that’s what I built it on. What I ended up with is actually pretty neat. Despite the lack of reader macros in Elisp, I still managed to introduce new syntax by manipulating symbols at compile time.

https://github.com/skeeto/at-el

See the README for a quick demonstration. What follows is the long explanation.

It’s called @, due to the syntax that it adds to Elisp as a domain-specific language. It’s a mini-language, really. The name is also a challenge to the code that supports Elisp, because so much of it — including emacs-lisp-mode and Paredit — doesn’t properly handle @ in identifiers. ~~Even Maruku, the Markdown to HTML translator I use for this blog, has bugs that won’t allow it to handle the @ characters in my code, so I had to forgo most syntax highlighting for this post.~~ (Update: I now use Kramdown so this is no longer an issue.)

Fortunately require does manage just fine.

(require '@)

Objects in @ are vectors with the symbol @ as the first element. The rest of the elements are implementation specific, but, at the moment, the second element is a plist (property list) of all of that object’s properties.

The root object of @ is @, and all other objects are instances of this object, either directly or indirectly. Because it’s prototype based, creating a new object is a matter of extending one or more (multiple-inheritance) existing objects. This is done with the function @extend.

;; Create a brand new object
(defvar foo (@extend @))

If no objects are given to @extend, @ will be used as the parent object, so it’s not necessary as an argument above. This is actually very important, as objects that don’t inherit from @ will not work at all! I’ll get into that detail in a bit. Additionally, @extend accepts keyword arguments, which become properties on the created object.

The function @ is used to access properties on an object. Remember, Elisp is a lisp-2 meaning that variables and functions exist in their own namespaces. This means there can be both a variable @ (the root object) and function @ (property accessor).

(setf rectangle (@extend :width 3 :height 4))
(@ rectangle :width)  ; => 3
(@ rectangle :height)  ; => 4

The @ function is also setf-able, so setting properties should be obvious to any lisper.

(setf (@ rectangle :width) 13)
(@ rectangle :width)  ; => 13

Like JavaScript, methods are just functions stored in properties on an object. In @, the first argument for a method is the object itself, which is called @@ by convention.

(setf (@ rectangle :area)
  (lambda (@@) (* (@ @@ :width) (@ @@ :height))))

(funcall (@ rectangle :area) rectangle)  ; => 52

New Syntax

Here’s the first really neat part. I find all that (@ @@ ...) business to be visually unpleasing. Fortunately this can be fixed by adding syntax. The macro def@ transforms variables that look like @: into these @ accessors. The following declaration is equivalent to the lambda assignment above. It’s meant to be very convenient.

(def@ rectangle :area ()
  (* @:width @:height))

This macro walks the body of the function at compile-time (macro expansion time) and transforms these symbols into the full @ calls above. Like most lisp macros, this has no run-time performance cost.
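
The core of such a walker fits in a few lines. The following is a simplified sketch and not the library's implementation: the name my-expand-at-symbols is hypothetical, and it ignores function position and quoted data, both of which the real macro would need to deal with.

```elisp
(defun my-expand-at-symbols (form)
  "Return FORM with every symbol named like @:NAME replaced by (@ @@ :NAME)."
  (cond ((and (symbolp form)
              (string-prefix-p "@:" (symbol-name form)))
         ;; Drop the leading @ so that \"@:width\" becomes the keyword :width.
         (list '@ '@@ (intern (substring (symbol-name form) 1))))
        ((consp form)
         (cons (my-expand-at-symbols (car form))
               (my-expand-at-symbols (cdr form))))
        (t form)))

(my-expand-at-symbols '(* @:width @:height))
;; => (* (@ @@ :width) (@ @@ :height))
```

Since this runs on the code as data before it is evaluated, the rewritten body is what actually gets compiled.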

Because using funcall all the time and remembering to pass the object as the first argument is tedious, the @! function is provided for calling methods.

(@! rectangle :area)  ; => 52

The @: variables become function calls when in function position.

(def@ rectangle :double-area ()
  (* 2 (@:area)))

In a lisp-1 this would happen for free, but in Elisp this situation expands to the @! form.

Inheritance

This rectangle is starting to look like a nice re-usable object. There’s a @ convention for this: prefix “class” object names with @.

(setf @rectangle rectangle)

Now to create new rectangle objects.

(setf foo (@extend @rectangle :width 3 :height 7.1))
(@! foo :area)  ; => 21.3

Notice that the foo object doesn’t actually have an :area property on itself. It was found on its parent, @rectangle by inheritance. :width and :height were not looked up on the parent because they’re already bound on foo.
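
The lookup rule can be sketched with plain property lists standing in for @ objects. This is an illustration only: the my-proto- names are hypothetical, and the simple depth-first search over :proto is an assumption that may not match the precedence order the library computes.

```elisp
(defun my-proto--search (object property)
  "Throw the value of PROPERTY to `found' if OBJECT or an ancestor has it."
  (when (plist-member object property)
    (throw 'found (plist-get object property)))
  (dolist (parent (plist-get object :proto))
    (my-proto--search parent property)))

(defun my-proto-get (object property)
  "Return PROPERTY from OBJECT, falling back to its :proto parents.
Objects are plain plists here, a hypothetical stand-in for @ objects."
  (catch 'found
    (my-proto--search object property)
    nil))

(let* ((shape (list :proto nil :sides 4))
       (box (list :proto (list shape) :width 3)))
  (list (my-proto-get box :width)     ; bound on box itself
        (my-proto-get box :sides)     ; found on the parent
        (my-proto-get box :missing))) ; bound nowhere
;; => (3 4 nil)
```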

Here’s another re-usable prototype. Notice that @: variables are also setf-able — using push in this case.

(defvar @colored (@extend :color ()))

(def@ @colored :mix (color)
  (push color @:color))

The object system has multiple-inheritance, so colored rectangles can be created from these two objects. The parent objects of an object are listed in the :proto property as a list (similar to JavaScript’s __proto__), which can be modified at any time to change an object’s prototype chain.

(setf foo (@extend @colored @rectangle :width 10 :height 4))

(@! foo :area)  ; => 40
(@! foo :mix :red)
(@! foo :mix :blue)
(@ foo :color)  ; => (:blue :red)

Even though the initial property was read from the parent, the assignment (push), like all assignments, actually occurred on foo.

Setters and Getters

Remember how I said that objects that don’t eventually inherit from @ will be broken? This is because properties are actually set and accessed through :set and :get methods. That is, @ calls these methods as needed. The @ object provides the default actions for these. An interesting part of the @ code: initially setting :set on @ is a circularity problem, so there’s a special bootstrap step to accomplish it.

By providing your own you can fundamentally change how your object works. For example, here’s an @immutable mix-in which prevents all property assignments. It’s provided as part of @.

(defvar @immutable (@extend))

(def@ @immutable :set (property _value)
  (error "Object is immutable, cannot set %s" property))

This :set method will be found before the @ :set method, so it gets overridden.

Remember how I said all objects have a :proto that can be used to modify the object’s inheritance? This can be used to freeze an object’s properties in place. Here’s a :freeze method for all objects.

(def@ @ :freeze ()
  "Make this object immutable."
  (push @immutable @:proto))

Pretty cool, eh?

The :get method can be used to provide virtual properties.

(defvar @squares (@extend))

(def@ @squares :get (property)
  (if (numberp property)
      (expt property 2)
    (@^:get property)))  ; explained in a moment

(mapcar (lambda (n) (@ @squares n)) '(0 1 2 3 4))
; => (0 1 4 9 16)

I use this technique in the @vector class under lib/ to expose the elements of the internal vector as if they were properties. Brian used this trick to make a @buffer prototype that wraps Emacs’ buffers, with methods provided virtually by :get. For example, the :string property would return a lambda that calls buffer-string.

With multiple-inheritance and these setters and getters, there are a lot of interesting mix-in possibilities. I’m only just discovering some of them now.

Supermethods

Sometimes it’s really useful to call supermethods. There’s syntax for this: @^:. This calls the next method of that name in the prototype chain. For example, here’s a @watchable mix-in (also provided by @) that allows other code to be notified of changes to an object. It needs to override :set but still call the original :set.

(defvar @watchable (@extend :watchers nil))

(def@ @watchable :watch (callback)
  (push callback @:watchers))

(def@ @watchable :unwatch (callback)
  (setf @:watchers (remove callback @:watchers)))

(def@ @watchable :set (property new)
  (dolist (callback @:watchers)
    (funcall callback @@ property new))
  (@^:set property new))

This behavior is also used for constructors. By convention, the :init method is the constructor. It should generally call the next constructor with (@^:init). @ has a no-op, no-argument :init method to bottom-out this process.

(def@ @rectangle :init (width height)
  (@^:init)
  (setf @:width width @:height height))

(@! (@! @rectangle :new 13.2 2.1) :area) ; => 27.72

As shown, the :new method provided by the @ object combines both @extend and :init to provide simple single-object inheritance.

The Cost of @

In the lib/ directory there are a bunch of example objects implemented, including @vector, @queue, @stack, and @heap. I found these to be very enjoyable to write, and they’ve been the testing grounds for @. @heap uses an internal @vector instance and exercises @’s features the most.

The performance cost of @ is very apparent with @heap. Even byte-compiled it’s slower than the naive implementation (compose push and sort) for even as high as 1,000 elements. While I think @ leads to elegant code, there’s still plenty to do for performance. It’s comically slow.

This really caught Brian’s interest, because it was an opportunity to put on his programming language designer’s hat — which I believe to be his favorite hat. He’s been trying different caching strategies to reduce all the walking of the prototype chain. This effort can be found in the other repository branches and in his fork. The system is so dynamic that cache invalidation is a really complex problem.

Every time a property is set, @ has to find the :set property for that object, which generally means walking all the way up to @. Because :proto can be modified at any time, every property look-up requires computing the precedence order (lazily). This all makes property assignment quite expensive! I can understand why real object systems aren’t this flexible. It comes at a high price.

The Limits of Emacs Advice

2013-01-22T00:00:00Z

Today at work I was using impatient-mode to share some code with Brian. It makes for a really handy live pastebin. To limit the buffer to the relevant code, I narrowed it down with narrow-to-region. However, the browser wouldn’t update to show only the narrowed region until I made an edit. This makes sense because impatient-mode hooks after-change-functions. Narrowing the buffer doesn’t change anything in the buffer, so, as expected, this hook is not called.

The solution would be to also join whatever hook is called when the buffer restriction changes. Unfortunately, no such hook exists. I thought I could create this hook with some advice, but this turns out to be currently impossible.

Emacs Advice

What’s advice? It’s a handy feature of Emacs Lisp that allows users to modify the behavior of almost any function without having to redefine it. It works a little bit like methods in the Common Lisp Object System (CLOS): advice is code that can be evaluated before, after, or around a function.

Advice is defined with defadvice. Duh. For example, say we wanted to be silly and have Emacs say “Ouch!” when a line is killed with kill-line. We can advise this function to display a message.

(defadvice kill-line (after say-ouch activate)
  (message "Ouch!"))

This says we want to advise the function kill-line, we want this advice to execute after kill-line has run, our advice is named “say-ouch”, and we want to immediately activate this advice so it gets used right away. The rest is the body of the advice, like the body of a function. After evaluating this defadvice, every time I hit C-k Emacs says “Ouch!” in the minibuffer. Cool!

narrow-to-region and widen

A hook is a variable that holds a list of functions. (Or maybe hooks are the functions in this list? Emacs’ documentation calls both of these things hooks.) These functions are called, usually without arguments, when some specific event occurs. For example, every mode has its own mode hook which is called when the mode is activated in a buffer. This allows users to extend or modify the mode — like by enabling additional minor modes — without editing the mode’s source code directly.
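
A tiny example of that mechanism, using only add-hook and run-hooks. The my-demo- names are made up for illustration.

```elisp
(defvar my-demo-hook nil
  "A hypothetical hook: just a variable holding a list of functions.")

(defvar my-demo-log nil
  "Symbols pushed by the hook functions below, newest first.")

(defun my-demo-first () (push 'first my-demo-log))
(defun my-demo-second () (push 'second my-demo-log))

(add-hook 'my-demo-hook #'my-demo-first)
(add-hook 'my-demo-hook #'my-demo-second) ; prepended by default

my-demo-hook             ; => (my-demo-second my-demo-first)
(run-hooks 'my-demo-hook)
my-demo-log              ; => (first second)
```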

To make our hook work we need to advise narrow-to-region and widen to run the hook after they’ve done their work. These are the primitive narrowing functions which all the other narrowing functions eventually call, like narrow-to-defun, narrow-to-page, and any other mode-specific narrowing. Advising these two functions will cover all buffer narrowing. It should be this simple.

(defvar change-restriction-hook ())

(defadvice narrow-to-region (after hook activate)
  (run-hooks 'change-restriction-hook))

(defadvice widen (after hook activate)
  (run-hooks 'change-restriction-hook))

At first this seems to work. I can add a test hook function and see it run when I use M-x narrow-to-region and M-x widen. However, when I use other narrowing functions, like narrow-to-defun, my hook functions aren’t called.

Is there a narrowing primitive I missed? I check the source code. Nope, these are lisp functions which ultimately call narrow-to-region. Is the advice not getting used when called indirectly? I test that out.

(defun foo ()
  (interactive)
  (narrow-to-region 1 2))

This works fine. Hmmm, these other functions are byte-compiled, maybe that’s the problem.

(byte-compile 'foo)

Bingo. The advice has stopped working. It has something to do with byte-compilation.

Bytecode

Let’s take a look at the bytecode for foo.

(symbol-function 'foo)
;; => #[nil "\300\301}\207" [1 2] 2 nil nil]

I don’t know too much about Emacs’ byte code, but here’s the gist of it. A compiled function is a special type of vector (hence the #[] form). This is a legal s-expression which you can use directly in regular Elisp code just like it was a function. The only reason you’d do so is for obfuscation, so it would look very suspicious.

The first element of this function vector is the parameter list, which is empty in this case. The second is a string containing the actual bytecodes. The third is a vector holding the various constants from the function body. This includes the symbols of other functions called by this function. It’s important to note that narrow-to-region does not appear in this list!

Curious. Let’s take a closer look at the bytecode.

(coerce (aref (symbol-function 'foo) 1) 'list)
;; => (192 193 125 135)

Looking at bytecomp.el from the Emacs distribution I can see that codes 192 and 193 are used for accessing constants. This pushes my constants 1 and 2 onto a stack for use as function arguments. Next up is 125, which corresponds to byte-narrow-to-region. Gotcha!

It turns out narrow-to-region is so special — probably because it’s used very frequently — that it gets its own bytecode. The primitive function call is being compiled away into a single instruction. This means my advice will not be considered in byte-compiled code. Darnit. The same is true for widen (code 126).
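
One way to probe for this kind of special treatment is to compile a small form and look for the function's symbol among the constants. The helper below is a hypothetical sketch: the name my-called-by-name-p is made up, and the results depend on the Emacs version doing the compiling.

```elisp
(defun my-called-by-name-p (function form)
  "Return non-nil if compiled FORM refers to FUNCTION by name.
FORM is wrapped in a lambda and byte-compiled; the symbol FUNCTION
is then looked for in the resulting constants vector."
  (let* ((compiled (byte-compile `(lambda () ,form)))
         (constants (aref compiled 2)))
    (and (memq function (append constants nil)) t)))

;; An ordinary function is called through its symbol.
(my-called-by-name-p 'message '(message "hi"))

;; Per the discussion above, this call should compile to a dedicated
;; opcode instead, leaving no symbol behind for advice to intercept.
(my-called-by-name-p 'narrow-to-region '(narrow-to-region 1 2))
```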

Where to go now?

Since it’s not possible to hook or advise the buffer-narrowing primitives, impatient-mode would need to hook some other event that tends to happen at the same time. Perhaps any time a command is executed in the current buffer it could check for changes to the buffer restriction and, if so, update any attached web clients. I’ll figure something out.
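
Here is a rough sketch of that polling idea. It is not what impatient-mode does: all the names are hypothetical, and tracking the number of hidden characters on each side of the restriction is just one assumed way to tell narrowing apart from ordinary edits.

```elisp
(defvar my-restriction-hook nil
  "Hypothetical hook run when a buffer's restriction is seen to change.")

(defvar-local my-last-restriction '(0 . 0)
  "Hidden character counts (BEFORE . AFTER) seen at the last check.")

(defun my-check-restriction ()
  "Run `my-restriction-hook' if the current buffer's narrowing changed.
The state counts hidden characters on each side, so plain edits
inside the accessible region do not register as a change."
  (let ((state (cons (1- (point-min))
                     (- (1+ (buffer-size)) (point-max)))))
    (unless (equal state my-last-restriction)
      (setq my-last-restriction state)
      (run-hooks 'my-restriction-hook))))

;; Poll after every command, since advice misses byte-compiled callers.
(add-hook 'post-command-hook #'my-check-restriction)
```

The obvious limitation is that a restriction changed outside of a command, for example from a timer, would not be noticed until the next command runs.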

Turning Asynchronous into Synchronous in Elisp

2013-01-14T00:00:00Z

As a new user of nREPL I was poking around nrepl.el, seeing what sorts of Elisp tricks I could learn. Even though it was written 6 months before Skewer, and I was completely unaware of nREPL’s existence until two weeks ago, there’s a lot of similarity between nrepl.el and Skewer. Due to serving the same purpose for different platforms, this isn’t very surprising.

In particular, Skewer has skewer-eval for sending a string to the browser for evaluation. Like JavaScript, Emacs Lisp is single-threaded: there’s only one execution context at a time and it has to return to the top-level before a new context can execute. There are no continuations or coroutines. skewer-eval requires coordination with an external process (the browser) making it inherently asynchronous. So as a second, optional argument, a callback can be provided for receiving the result.

;; Echo the result in the minibuffer.
(skewer-eval "Math.pow(2.1, 3.1)"
             (lambda (r) (message (cdr (assoc 'value r)))))

However, the equivalent function in nrepl.el, nrepl-eval, is synchronous! It returns the evaluation result. “That’s not true! That’s impossible!”

;; !!!
(plist-get (nrepl-eval "(Math/pow 2.1 3.1)") :value)
;; => "9.97423999265871"

Well, it turns out what I said above about execution contexts wasn’t completely true. There’s exactly one sneaky function that breaks the rule: accept-process-output. It blocks the current execution context allowing some other execution contexts to run, including timers and I/O. However, it will lock up Emacs’ interface. nrepl-eval uses this function to poll for a response from the nREPL process.
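
The polling pattern looks roughly like the following. This is a generic sketch rather than nrepl.el's code: the helper my-wait-for is hypothetical, and a timer stands in for the external process that would normally deliver the result.

```elisp
(defun my-wait-for (predicate &optional timeout)
  "Block until PREDICATE returns non-nil or TIMEOUT seconds pass.
Returns the final value of PREDICATE.  Hypothetical helper; TIMEOUT
defaults to 5 seconds."
  (let ((deadline (+ (float-time) (or timeout 5.0))))
    (while (and (not (funcall predicate))
                (< (float-time) deadline))
      ;; Give timers and process filters a chance to run for up to 50 ms.
      (accept-process-output nil 0.05))
    (funcall predicate)))

(defvar my-answer nil
  "Set asynchronously by the timer below.")

;; Something else fills in the answer a little later.
(run-at-time 0.2 nil (lambda () (setq my-answer 42)))

;; This call blocks, yet the timer above still gets to run.
(my-wait-for (lambda () my-answer) 2.0)
```

The timeout matters: without it, a response that never arrives would lock up Emacs indefinitely.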

When I saw this, a lightbulb went off in my head. This lone loophole in Emacs’ execution model can be abused to provide interesting benefits. Specifically, it can be used to create a latch synchronization primitive.

The full source code is here if you want to dive right in. I’ll be going over a simplified version piece-by-piece below.

https://github.com/skeeto/elisp-latch

The Latch Primitive

The idea of a latch is that a thread can wait on the latch, blocking its execution. It will remain in that state until another thread notifies the latch, releasing any threads blocked on the latch. Here’s how it might look in Lisp.

(defvar result nil)

(defvar my-latch (make-latch))

(defun get-result ()
  (if result
      result
    (wait my-latch) ; Block, waiting for the result
    result))

(defun set-result (value)
  (setf result value)
  (notify my-latch)) ; Release anyone waiting on my-latch

The pattern above is similar to a promise, which we will later implement on top of latches. In our latch implementation I’d also like to optionally pass a value from notify to anyone waiting, which would make the above simpler.

Emacs doesn’t have threads but instead non-preemptive execution contexts. Ignoring the Emacs UI lockup, we can mostly ignore that distinction for now.

To exploit accept-process-output each latch needs to have its own process object. When blocking on a latch it will simply wait for that process to receive input. To notify a latch, we need to send data to that process.

For the process, we’ll ask Emacs to make a pseudo-terminal “process.” It’s basically just a pipe for Emacs to talk to itself. It’s possible to literally make a pipe, which is better for this purpose, but that’s currently broken. To make such a process, we call start-process with nil as the program name (third argument).

Let’s start by making a new class called latch.

(require 'eieio)

(defclass latch ()
  ((process :initform (start-process "latch" nil nil))
   (value :initform nil)))

This class has two slots, process and value. The process slot holds the aforementioned process we’ll be blocking on. The value slot will be used to pass a value from notify to wait. The process slot is initialized with a brand new process object upon instantiation.

(defmethod wait ((latch latch))
  (accept-process-output (slot-value latch 'process))
  (slot-value latch 'value))

(defmethod notify ((latch latch) &optional value)
  (setf (slot-value latch 'value) value)
  (process-send-string (slot-value latch 'process) "\n"))

To wait, call accept-process-output on the latch’s private process. This function won’t return until data is sent to the process. By that time, the value slot will be filled in with the value from notify.

To notify, send a newline with process-send-string. The data to send is arbitrary, but I wanted to send as little as possible (one byte) and I figure a newline might be safer when it comes to flushing any sort of buffer. Buffers tend to flush on newlines. Before sending data, we set the value slot to the value that wait will return.

That’s basically it! However, processes are not garbage collected by Emacs, so we need a destroy destructor method. The name destroy here is not special to Emacs. It’s something for the user of the library to call.

(defmethod destroy ((latch latch))
  (ignore-errors
    (delete-process (slot-value latch 'process))))

(defun make-latch ()
  (make-instance 'latch))

I also made a convenience constructor function make-latch, with the conventional name make-, since users shouldn’t have to call make-instance for our classes.

That’s enough to turn skewer-eval into a synchronous function.

(defun skewer-eval-synchronously (js-code)
  (lexical-let ((latch (make-latch)))
    (skewer-eval js-code (apply-partially #'notify latch))
    (prog1 (wait latch)
      (destroy latch))))

In combination with lexical-let, apply-partially returns a closure that will notify the latch with the return value passed to it from skewer. We need to get the return value from wait, destroy the latch, then return the value, so I use a prog1 for this.

One-use Latches

In my experimenting, I noticed the prog1 pattern coming up a lot. Having to destroy my latch after a single use was really inconvenient. Fortunately this pattern can be captured by a subclass: one-time-latch.

(defclass one-time-latch (latch)
  ())

(defun make-one-time-latch ()
  (make-instance 'one-time-latch))

(defmethod wait :after ((latch one-time-latch))
  (destroy latch))

This subclass destroys the latch after the superclass’s wait is done, through an :after method (purely for side-effects). CLOS is fun, isn’t it?

(defun skewer-eval-synchronously (js-code)
  (lexical-let ((latch (make-one-time-latch)))
    (skewer-eval js-code (apply-partially #'notify latch))
    (wait latch)))

There, that’s a lot more elegant.

If eieio was a more capable mini-CLOS I could also demonstrate a countdown-latch, but this would require an :around method. Most uses of notify would need to skip over the superclass method.

Promises

We can build promises on top of our latch implementation. Basically, a promise is a one-time-latch where we can query the notify value more than once. In a one-time-latch we can only wait once.

Our promise will have two similar methods, deliver (like notify), and retrieve (like wait). If a value has been delivered already, retrieve will return that value. Otherwise, it will block and wait until a value is delivered,

(defclass promise ()
  ((latch :initform (make-one-time-latch))
   (delivered :initform nil)
   (value :initform nil)))

(defun make-promise ()
  (make-instance 'promise))

It has three slots, the one-time-latch used for blocking, a Boolean determining the delivery status, and the value of the promise.

(defmethod deliver ((promise promise) value)
  (if (slot-value promise 'delivered)
      (error "Promise has already been delivered.")
    (setf (slot-value promise 'value) value)
    (setf (slot-value promise 'delivered) t)
    (notify (slot-value promise 'latch) value)))

(defmethod retrieve ((promise promise))
  (if (slot-value promise 'delivered)
      (slot-value promise 'value)
    (wait (slot-value promise 'latch))))

A promise can only be delivered once, so deliver signals an error if it is called a second time. Otherwise it updates the promise state and releases anything waiting on it.

What to do with this?

Locking up Emacs’ UI really limits the usefulness of this library. Since Emacs’ primary purpose is being a text editor, it needs to remain very lively or else the user will become annoyed. If I used a synchronous version of skewer-eval, Emacs would completely lock up (easily interrupted with C-g) until the browser responds — which would be never if no browser is connected. That’s unacceptable.
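One way to bound the lock-up described above is to give the wait a timeout. accept-process-output takes an optional number of seconds and returns nil when nothing arrived in time. This is only a sketch, not part of the latch library: the name my-wait-for-output is hypothetical, and it assumes a platform where start-process with a nil program yields a pseudo-terminal.

```elisp
(defun my-wait-for-output (process seconds)
  "Wait for output from PROCESS, giving up after SECONDS.
Return non-nil if output arrived, nil on timeout.  Hypothetical helper."
  (accept-process-output process seconds))

;; Usage sketch: nothing ever writes to this process, so the wait
;; gives up after the timeout instead of blocking indefinitely.
(let ((proc (start-process "latch-demo" nil nil)))
  (unwind-protect
      (my-wait-for-output proc 0.2)
    (delete-process proc)))
```

A wait method built this way would need some convention for telling a timeout apart from a delivered nil value.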

Also, not very many Emacs functions have the callback pattern. The only core function I’m aware of that does is url-retrieve, but it already has a url-retrieve-synchronously counterpart.

Please tell me if you have a neat use of any of this!

An Emacs Pastebin

2012-12-29T00:00:00Z

Luke is doing an interesting five-part tutorial on writing a pastebin in PHP: PHP Like a Pro (2, 3, 4, 5). The tutorial is largely an introduction to the set of tools a professional would use to accomplish a more involved project, the most interesting of which, for me, is Vagrant.

Because I have no intention of ever using PHP, I decided to follow along in parallel with my own version. I used Emacs Lisp with my simple-httpd package for the server. I really like my servlet API, so it was a lot more fun than I expected it to be! Here’s the source code,

https://github.com/skeeto/emacs-pastebin

Here’s what it looked like once I was all done,

It has syntax highlighting, paste expiration, and light version control. The server side is as simple as possible, consisting of only three servlets,

/pastebin/: static files
/pastebin/get: serves (immutable) pastes in JSON
/pastebin/post: accepts new pastes in JSON, returns the ID

A paste’s JSON is the raw paste content plus some metadata, including post date, expiration date, language (highlighting), parent paste ID, and title. That’s it! The server is just a database and static file host. It performs no dynamic page generation. Instead, the client-side JavaScript does all the work.

For you non-Emacs users, the repository has a pastebin-standalone.el which can be used to launch a standalone instance of the pastebin server, so long as you have Emacs on your computer. It will fetch any needed dependencies automatically. See the header comment of this file for instructions.

IDs

A paste ID is four or more randomly-generated numbers, letters, dashes or underscores, with some minor restrictions (pastebin-id-valid-p). It’s appended to the end of the servlet URL.

/pastebin/<ID>
/pastebin/get/<ID>

In the first case, the servlet entirely ignores the ID. Its job is only to serve static files. In the second case the server looks up the ID in the database and returns the paste JSON.

The client-side inspects the page’s URL to determine the ID currently being viewed, if any. It performs an asynchronous request to /pastebin/get/ to fetch the paste and insert the result, if found, into the current page.

Form submission isn’t done the normal way. Instead, the submission is intercepted by an event handler, which wraps the form data up in JSON (much cleaner to parse!) and sends it asynchronously to /pastebin/post via POST. This servlet inserts the paste in the database and responds in text/plain with the paste ID it generated. The client-side then redirects the browser to the paste URL for that paste.
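To make the ID format concrete, here is a minimal sketch of a generator for IDs of the shape described above: four or more characters drawn from letters, digits, dashes, and underscores. The names my-paste-id and my-paste-id-alphabet are made up for illustration, and the "minor restrictions" enforced by pastebin-id-valid-p are not reproduced here.

```elisp
(defconst my-paste-id-alphabet
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_"
  "Characters allowed in a paste ID (hypothetical constant).")

(defun my-paste-id (&optional length)
  "Return a random ID of LENGTH characters, never fewer than four."
  (let* ((n (max 4 (or length 4)))
         (id (make-string n ?a)))
    ;; Overwrite each position with a randomly chosen allowed character.
    (dotimes (i n id)
      (aset id i (aref my-paste-id-alphabet
                       (random (length my-paste-id-alphabet)))))))
```

A real server would also have to check a fresh ID against the database and retry on collision.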

Features

As I said, the server performs no page generation, so syntax highlighting is done in the client with highlight.js. I could have used htmlize and supported any language that Emacs supports. However, I wanted to keep the server as simple as possible, and, more importantly, I really don’t trust Emacs’ various modes to be secure in operating on arbitrary data. That’s a huge attack surface and these modes were written without security in mind (fairly reasonable). It’s actually a deliberate feature for Emacs to automatically eval Elisp in comments under certain circumstances.

Version control is accomplished by keeping track of which paste was the parent of the paste being posted. When viewing a paste, the content is also placed in a textarea for editing. Submitting this form will create a new paste with the current paste as the parent. When viewing a paste that has a parent, a “diff” option is provided to view a diff patch of the current paste with its parent (see the screenshot above). Again, the server is dead simple, so this patch is computed by JavaScript after fetching the parent paste from the server.

Databases

As part of my fun I made a generic database API for the servlets, then implemented three different database backends. I used eieio, Emacs Lisp’s CLOS-like object system, to implement this API. Creating a new database backend is just a matter of making a new class that implements two specific methods.

The first, and default, implementation uses an Elisp hash table for storage, which is lost when Emacs exits.
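As a rough sketch of what such a two-method backend could look like, here is a hash table version. The generic function names my-db-get and my-db-put, the class name, and the constructor are assumptions for illustration, not the names used in the pastebin source, and this uses the newer cl-defgeneric/cl-defmethod syntax rather than the defmethod form of the time.

```elisp
(require 'eieio)
(require 'cl-lib)

;; The two operations every backend would implement (hypothetical names).
(cl-defgeneric my-db-get (db id)
  "Return the paste stored under ID in DB, or nil.")
(cl-defgeneric my-db-put (db id paste)
  "Store PASTE under ID in DB.")

(defclass my-db-hash ()
  ((table :initarg :table))
  "In-memory backend; its contents are lost when Emacs exits.")

(defun my-make-db-hash ()
  "Create an empty in-memory database."
  (make-instance 'my-db-hash :table (make-hash-table :test 'equal)))

(cl-defmethod my-db-get ((db my-db-hash) id)
  (gethash id (slot-value db 'table)))

(cl-defmethod my-db-put ((db my-db-hash) id paste)
  (puthash id paste (slot-value db 'table)))
```

With this shape the servlets only ever call the two generic functions, so swapping in a different backend means passing a different object.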

The second is a flat-file database. I estimate it should be able to support at least 16 million different pastes gracefully. The on-disk format for pastes is an s-expression. Basically, this is read by Emacs, expiration date checked, converted to JSON, then served to the client.
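The read, check, convert pipeline can be sketched in a few lines. This is an illustration under assumptions: the paste is taken to be an alist, the expires field name and its epoch-seconds encoding are invented here, and my-paste-to-json is a hypothetical name. Only read and json-encode are real library functions.

```elisp
(require 'json)

(defun my-paste-to-json (sexp-text now)
  "Read a paste alist from SEXP-TEXT and return it encoded as JSON.
Return nil if the paste has expired at time NOW (seconds since epoch).
The `expires' field is an assumed name, not the real on-disk format."
  (let* ((paste (read sexp-text))
         (expires (cdr (assq 'expires paste))))
    ;; A paste without an expiration date never expires.
    (unless (and expires (<= expires now))
      (json-encode paste))))
```

Since the s-expressions are written by the server itself, calling read on them is reasonable; it would not be for text supplied directly by a client.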

To my great surprise there is practically no support for programmatic access to a SQL database from GNU Emacs Lisp (other Emacsen do). The closest I found was pg.el, which is asynchronous by necessity. However, the specific target I had in mind was SQLite.

I did manage to implement a third backend that uses SQLite, but it’s a big hack. It invokes the sqlite3 command line program once for every request, asking for a response in CSV, the only output format that seems to escape unambiguously. This response then has to be parsed, provided it isn’t so long that it blows the regex stack.

Update February 2014: I have found a solution to this problem!

Future

This has been an educational project for me. As a tutorial and for practice I’ll probably write the server again from scratch using other languages and platforms (Node.js and Hunchentoot maybe?), keeping the same front-end.

How a simple-httpd Vulnerability Slipped In

2012-12-18T00:00:00Z

Over Thanksgiving weekend I discovered a vulnerability in my recent simple-httpd overhaul. I fixed it immediately and pushed out the patch. Despite being careful about the translation of request paths to filesystem paths, one thing slipped by. Here’s how.

When writing my original web server a couple of years ago I established a global variable, httpd-root, which is the location on the filesystem from where all files are served. Nothing above this directory should be visible to clients in any way. The simple, dangerous way to do this is with a plain concat.

(concat httpd-root request-path)

The vulnerability here is that request-path could contain .., a reference to the parent directory. This would allow a request to access anything on the filesystem. This was obvious to me at the time, so I wrote a httpd-clean-path to remove any .. portions in the request. As long as httpd-root isn’t an empty string, this closes all the holes.

(concat httpd-root (httpd-clean-path request-path))
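For illustration, a path cleaner in this spirit might look like the following. It is a sketch and not the actual httpd-clean-path: the name my-clean-path is hypothetical, and it simply drops empty, "." and ".." components rather than resolving them.

```elisp
(defun my-clean-path (path)
  "Return PATH with empty, \".\" and \"..\" components removed.
A hypothetical stand-in for a cleaner like `httpd-clean-path'."
  (mapconcat #'identity
             (delete ".." (delete "." (split-string path "/" t)))
             "/"))

(my-clean-path "/../../etc/passwd")  ; => "etc/passwd"
(my-clean-path "/a/./b/../c.html")   ; => "a/b/c.html"
```

Dropping the components outright, instead of interpreting them, means no sequence of ".." can ever climb above the root.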

A couple of years later, after I’ve honed my Emacs Lisp skill, I go through refactoring everything. I’ve since learned about the function expand-file-name and when I see concat being used to build a path, I change it to this function, which is more appropriate for the job. This happened in commit 3b405343 (2012-08-07). I’m using httpd-clean-path to handle everything dangerous, so it’s safe, right?!

(expand-file-name (httpd-clean-path request-path) httpd-root)

When the path being expanded by expand-file-name is an absolute path, that path is returned directly, ignoring the second argument. Unfortunately, anything beginning with ~ is an absolute path, because this automatically expands into a home directory. With concat this didn’t need to be handled because any ~ was always prepended with httpd-root. Now that I was using expand-file-name, this allowed everyone read-access to everything in the hosting user’s home directory if the request path started with a ~. Doh!

(expand-file-name "~foo" "/etc")  ; => "/home/foo"

The fix is dead simple: prefix the cleaned path with a ./ before using expand-file-name. This forces the path to be relative so that it’s expanded properly.

(expand-file-name (concat "./" "~foo") "/etc")  ; => "/etc/~foo"
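As a belt-and-braces sketch, the ./ prefix can be combined with a final check that the expanded result still lies under the root. This is not what simple-httpd does as far as the text above says; my-resolve-path is a hypothetical helper, and the expected results assume a Unix-style filesystem.

```elisp
(defun my-resolve-path (request-path root)
  "Expand REQUEST-PATH relative to ROOT, or return nil if it escapes ROOT.
Hypothetical helper combining the ./ prefix with a containment check."
  (let* ((root (file-name-as-directory (expand-file-name root)))
         ;; The ./ prefix forces the path to be treated as relative.
         (full (expand-file-name (concat "./" request-path) root)))
    (and (string-prefix-p root full) full)))

(my-resolve-path "~foo" "/etc")       ; => "/etc/~foo"
(my-resolve-path "../passwd" "/etc")  ; => nil
```

The second check catches a stray ".." even if the cleaning step upstream were ever bypassed or broken by a refactor.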

I apologize to anyone using simple-httpd. The fix has been on MELPA for about a month now, so make sure you’re updated. I myself have a long-running simple-httpd instance exposed to the Internet (impatient-mode makes for a wonderful pastebin!), and when I found this my stomach sank in panic. I do keep an eye on my *httpd* log and I never saw anyone exploit this. Due to simple-httpd’s obscurity I’m pretty sure no one else discovered this anyway. The vulnerability only existed for about three months before I caught it. If you are able to find any other vulnerability please tell me!

Elisp Weak References

2012-12-17T00:00:00Z

Today I added a skewer-eval-print-last-expression function to Skewer, functionality I’ve been sorely missing for a while. To properly support it I needed a hash table with automatically expiring entries. Specifically, I needed to keep track of state in Emacs that I couldn’t trust the (untrusted) browser to track for me. The alternative would be to send an encrypted blob to the browser along with code to evaluate, which it would send back with the result. Instead of getting into questionable, hand-rolled encryption I wrote an expiring hash table implementation.
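The idea of an expiring table can be sketched by storing a timestamp next to each value and checking it on lookup. This is not Skewer's implementation; the my-cache- names are hypothetical, and the optional NOW argument exists only to make the behavior easy to test.

```elisp
(defun my-cache-make ()
  "Create an empty expiring cache (a plain hash table underneath)."
  (make-hash-table :test 'equal))

(defun my-cache-put (table key value &optional now)
  "Store VALUE under KEY in TABLE, stamped with the time NOW."
  (puthash key (cons (or now (float-time)) value) table))

(defun my-cache-get (table key ttl &optional now)
  "Return the value for KEY in TABLE unless it is older than TTL seconds.
Expired entries are removed as a side effect."
  (let ((entry (gethash key table)))
    (when entry
      (if (> (- (or now (float-time)) (car entry)) ttl)
          (progn (remhash key table) nil)
        (cdr entry)))))
```

Entries that are never looked up again stay in the table with this design, so a long-running process would also want an occasional sweep.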

This had me take a careful look over Elisp’s hash table documentation, which reminded me of a cool feature they have: key/value weakness. The hash table can be configured such that it doesn’t prevent its keys and values from being garbage collected. Elisp’s hash tables are really flexible in this regard; any combination of key and value weakness is supported. This is more flexible than Java’s WeakHashMap, which only supports weak keys. For example, to make a hash table that weakly holds its values,

(make-hash-table :weakness 'value)

Oddly, Elisp lacks functionality to use weak references more generally. Fortunately this can be fixed!

(defun weak-ref (thing)
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (prog1 ref
      (puthash t thing ref))))

(defun deref (ref)
  (gethash t ref))

weak-ref wraps an object in a weak hash table of size 1 under the key t. The second function, deref, fetches the object from the hash table if it’s still there. Otherwise it returns nil. Here it is in action,

(setq ref (weak-ref (list 1 2 3)))

;; It's still there.
(deref ref)  ; => (1 2 3)

;; Now run garbage collection.
(garbage-collect)

;; The list has been garbage collected.
(deref ref)  ; => nil

I had to use setq here instead of defvar because garbage collection seems to always get triggered after defvar.

I don’t have a use-case for this at the moment. Weak references are mostly useful in hash tables (caches), and these functions would be entirely redundant in that case. I originally implemented these as macros, but I feel it made them too inflexible — they couldn’t be passed as a function.

A Use For Macrolet

2012-12-06T00:00:00Z

I recently had a good use for Common Lisp’s macrolet special operator. Just as let establishes new variable bindings and flet establishes new function bindings, macrolet establishes new macro definitions.

For example, here’s a locally-defined anaphoric lambda macro called fn.

(macrolet ((fn (&body body) `(lambda (_) ,@body)))
  (map 'string (fn (if (standard-char-p _) _ #\*)) "naïve"))
;; => "na*ve"

My particular use case was about making my code cleaner for a brainfuck interpreter. The state of the machine was being tracked by this struct. (Interesting side note: SBCL warns about using p as a slot name because the accessor function will look like a predicate.)

(defstruct bf
  (p 0)
  (mem (make-array 30000 :initial-element 0)))

The BF instructions + and - increment and decrement the byte at the data pointer. The Common Lisp incf and decf macros can be used to do this. Similarly, the , instruction sets the byte at the data pointer, which can be done with setf. All three of these macros are place-modifying.

(defun interp (program state)
  ;; ...
  (incf (aref (bf-mem state) (bf-p state)))
  ;; ...
  (decf (aref (bf-mem state) (bf-p state)))
  ;; ...
  (setf (aref (bf-mem state) (bf-p state)) (char-code (read-char))))

That’s a whole lot of redundancy for a Lisp program. Under similar circumstances elsewhere I might use flet to reduce it.

;; This won't work.
(defun interp (program state)
  (flet ((ref () (aref (bf-mem state) (bf-p state))))
    ;; ...
    (incf (ref))
    ;; ...
    (decf (ref))))

The problem is that ref isn’t a generalized reference, which incf, decf, and setf all require. Common Lisp’s place-modifying utilities are implemented as macros. It’s known at compile-time what kind of place they are modifying: a variable, array index, object/struct slot, car, cdr, or many other things (Emacs’ cl package allows all sorts of things to be setfed, like (point)). The macro expands into the proper form for setting that kind of place.

The specific expansion is implementation-dependent, but, for example, setf could expand into a setq when the first argument is a symbol. New generalized references can be defined with defsetf.

In my case, a simple macro expansion can fill the role. Below, the place-modifying macro will expand ref (after looking elsewhere) to decide what to do, and ref will expand to an aref form.

(defun interp (program state)
  (macrolet ((ref () '(aref (bf-mem state) (bf-p state))))
    ;; ...
    (incf (ref))
    ;; ...
    (decf (ref))
    ;; ...
    (setf (ref) (char-code (read-char)))))

Because the macro has no parameters I could have even more easily used symbol-macrolet. I just didn’t think of it at the time.
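The same trick carries over to Emacs Lisp through cl-lib, whose cl-macrolet, cl-incf, and cl-decf mirror the Common Lisp operators. The following is a small sketch rather than the interpreter discussed above: my-bf-step is a hypothetical function that handles only the + and - instructions on a plain vector.

```elisp
(require 'cl-lib)

(defun my-bf-step (op mem p)
  "Apply BF instruction OP (the character + or -) to cell P of vector MEM.
Return MEM.  A hypothetical helper, not a full interpreter."
  ;; (ref) expands to an aref form, which is a valid place for
  ;; cl-incf and cl-decf.
  (cl-macrolet ((ref () '(aref mem p)))
    (cond ((eq op ?+) (cl-incf (ref)))
          ((eq op ?-) (cl-decf (ref))))
    mem))

(my-bf-step ?+ (vector 0 0) 1)  ; => [0 1]
```

Because the local macro expands before the place-modifying macro inspects its argument, no defsetf-style definition is needed.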

Emacs Abnormal Termination

2012-09-28T00:00:00Z

Update: This bug was fixed in Emacs 24.4 (released October 2014).

A few months ago I filed a bug report for Emacs (upstream) when I stumbled across Emacs aborting under very specific circumstances. I was editing in markdown-mode and a regular expression replacement on lists would reliably, and frustratingly, cause Emacs to crash.

Through a sort-of binary search I loaded only half of markdown-mode to see in which half it would trigger, then I cut that half in half again and repeated recursively until I had it down to a small expression that causes a --no-init-file (-q) Emacs to abort. It almost looks like I found it through fuzz testing. Change or remove anything even slightly and it no longer triggers the abort.

To trigger it, there’s an after-change-functions hook that performs a regular expression search immediately after a replace-regexp. A peek at the backtrace with gdb shows that this somehow causes the point to leave the bounds of the buffer. Emacs catches this with an assertion before dereferencing anything, and it aborts, thus preventing a buffer overflow vulnerability. This is important for my Emacs web server because if there’s a way to trigger this bug in the web server I’d much rather have it abort than run arbitrary shellcode injected in by a malicious HTTP request.

My bug report has seen no activity since I posted it. I can understand why. The circumstances to trigger it are unlikely and it’s a very old bug, so it’s low priority. It’s also a huge pain to debug. Hacking on Emacs from Lisp is pleasant but hacking on Emacs from C is not. The bug likely sits in the bowels of the complicated regular expression engine, making it even more unpleasant. I personally have no interest in trying to fix it myself.

So, since it looks like it’s here for the long haul it’s kind of fun to implement an abort function on top of it, allowing Elisp programs to terminate Emacs abnormally — you know, in case kill-emacs isn’t fun enough.

(defun abort ()
  "Ask Emacs to abnormally terminate itself (bug#12077)."
  (interactive)
  (with-temp-buffer
    (insert "#\n*\n")
    (goto-char (point-min))
    (add-hook 'after-change-functions
              (lambda (a b c) (re-search-forward "")))
    (replace-regexp "^\\*" " *")))

It’s interactive so you could even bind a key to it.

Programs as Elisp Macros

2012-09-21T00:00:00Z

This evening I came across an interesting idea: using system programs as functions. Credit for the original idea goes to sh, a Python module that exposes system programs as functions. There’s also a Clojure library called shake to do the same thing in Clojure.

Thanks to symbols, I think the idea maps especially well onto Lisp because arguments don’t need to be provided as strings. Here are some examples,

(ls -lh)
(uname -a)
(cat /etc/debian_version)
(git checkout -b foo)

It’s easy to achieve the same effect in Elisp,

;;; -*- lexical-binding: t; -*-
(require 'cl)

(defun make-shell-macro (program)
  (fset program
        (cons 'macro
              (lambda (&rest args)
                `(with-temp-buffer
                   (funcall #'call-process
                            ,(symbol-name program) nil t nil
                            ,@(mapcar #'prin1-to-string args))
                   (buffer-string))))))

(let ((path (mapcan #'directory-files (parse-colon-path (getenv "PATH")))))
  (dolist (program (remove-if (lambda (f) (member f '("." ".."))) path))
    (let ((symbol (intern program)))
      (unless (fboundp symbol)
        (make-shell-macro symbol)))))

Evaluating the above will install macros for all programs in your PATH, except where you already have functions or macros defined. I messed up on the latter point while writing this and broke Emacs enough to require a restart. The system program is called synchronously and the output is returned as a string.

However, because arguments aren’t evaluated (macros) this has limited usefulness. These function calls are static and can’t be passed variable arguments. In order to do that arguments would need to be evaluated and symbols would need to be quoted. For example,

(defun git-checkout (branch)
  (git 'checkout branch))

(defun ls-l (file)
  (ls '-l file))

So I think I’d prefer this interface to the one provided by Clojure’s shake (and my Elisp code at the top). I have little need to call programs with static arguments.
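A function-based version with evaluated arguments is easy to sketch. Here my-run is a hypothetical name, and arguments are converted with format's %s rather than prin1-to-string so that strings are passed without surrounding quotes. The usage line assumes an echo program is on the PATH.

```elisp
(defun my-run (program &rest args)
  "Run PROGRAM synchronously with ARGS and return its output as a string.
PROGRAM and ARGS may be symbols, strings, or numbers.  Hypothetical helper."
  (with-temp-buffer
    (apply #'call-process (format "%s" program) nil t nil
           (mapcar (lambda (arg) (format "%s" arg)) args))
    (buffer-string)))

;; Arguments are ordinary evaluated expressions, so variables work.
(let ((word "hello"))
  (my-run 'echo word))
```

Per-program wrappers like git-checkout then become one-line functions over my-run instead of macros.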

Elisp Recursive Descent Parser (rdp)

2012-09-20T00:00:00Z

I recently developed a recursive descent parser, named rdp, for use in Emacs Lisp programs. I’ve already used it to write a compiler.

https://github.com/skeeto/rdp

It’s available as a package on MELPA.

The Long Story

Last month Brian invited me to take a free, online programming languages course with him. You may recall that we developed a programming language together so it was only natural we would take this class.

The first part of the class is oriented around a small programming language created just for this class called ParselTongue. It looks like this:

deffun evenp(x)
    if ==(x, 0) then
        true
    else if ==(x, 1) then
            false
        else evenp(-(x, 2))
in defvar x = 14 in {
    while (evenp(x)) { x--; };   # Make sure x odd
    print("This is an odd number: ");
    print(x);
    ""; # No output
}

I’ve gotten so used to having a solid Emacs major mode when coding that I can’t stand writing code without the support of a major mode. Since this language was invented recently just for this class there was no mode for it, nor would there be unless someone stepped up to make one. I ended up taking that role. It was an opportunity to learn how to create a major mode, something I had never done before.

It’s called psl-mode.

At first it was just some syntax highlighting (very easy) and some poor automatic indentation. The indentation function would get confused by anything non-trivial. It’s actually really hard to get it right. I’ve grown a much better appreciation for automatic indentation in other modes.

In an attempt to improve this I decided I would try to fully parse the language and use the resulting parse tree to determine indentation — something like the depth of the pointer in the tree. My experience with Perl’s Parse::RecDescent some years ago was very positive and I wanted to reproduce that effect. However, rather than write the grammar in a separate language that mixes in the programming language, which I find extremely messy, instead I wanted to use pure s-expressions. A grammar looks very nice as an alist of symbols.

Arithmetic Parser

For example, here’s a grammar for simple arithmetic expressions, including operator precedence and grouping (i.e. “4 + 5 * 2.5”, “(4 + 5) * 2.5”, etc.).

(defvar arith-tokens
  '((sum       prod  [([+ -] sum)  no-sum])
    (prod      value [([* /] prod) no-prod])
    (num     . "-?[0-9]+\\(\\.[0-9]*\\)?")
    (+       . "\\+")
    (-       . "-")
    (*       . "\\*")
    (/       . "/")
    (pexpr     "(" [sum prod num pexpr] ")")
    (value   . [pexpr num])
    (no-prod . "")
    (no-sum  . "")))

Strings are regular expressions, the only thing to actually match input text (terminals). Lists are sequences, where each element in the list must match in order. Vectors (in brackets) are choices where one of the elements must match. Symbols name an expression so that it can be referred to by other expressions recursively.
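To make those four rules concrete, here is a toy matcher that interprets a grammar alist of this shape against a string. It is a sketch of the idea only, not how rdp itself is implemented: my-rdp-match is a hypothetical name, it skips leading whitespace before each terminal by assumption, it takes the first alternative that matches without backtracking into earlier choices, and it would loop forever on a left-recursive grammar.

```elisp
(require 'cl-lib)

(defun my-rdp-match (expr grammar text pos)
  "Match EXPR from GRAMMAR against TEXT at POS.
Return (TREE . NEW-POS) on success, nil on failure."
  (cond
   ((stringp expr)                 ; terminal: regexp that must match at POS
    (let ((re (concat "[ \t\n]*\\(" expr "\\)")))
      (when (and (string-match re text pos)
                 (= (match-beginning 0) pos))
        (cons (match-string 1 text) (match-end 0)))))
   ((symbolp expr)                 ; named rule: look it up and tag the result
    (let ((rule (assq expr grammar)))
      (unless rule (error "Undefined rule: %s" expr))
      (let ((m (my-rdp-match (cdr rule) grammar text pos)))
        (when m (cons (list expr (car m)) (cdr m))))))
   ((vectorp expr)                 ; choice: first alternative that matches
    (cl-some (lambda (alt) (my-rdp-match alt grammar text pos)) expr))
   ((listp expr)                   ; sequence: every element, in order
    (let ((trees '()) (ok t))
      (while (and ok expr)
        (let ((m (my-rdp-match (car expr) grammar text pos)))
          (if m
              (setq trees (cons (car m) trees)
                    pos (cdr m)
                    expr (cdr expr))
            (setq ok nil))))
      (when ok (cons (nreverse trees) pos))))))
```

Called as (my-rdp-match 'sum arith-tokens "4 * 5" 0) with the grammar above, it returns a nested tree paired with the position where matching stopped.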

Give this alist to the parser and it will return an s-expression of the parse tree of the current buffer. Due to the way the grammar must be written this parse tree isn’t really pleasant to handle directly. For example, a series of multiplications (“1 * 2 * 3 * 4”) wouldn’t parse to a nice flat list but with further depth for each additional operand.

To help squash these, the parser will accept an alist of symbols and functions which process the parse tree at parse time. For example, these corresponding functions will make sure "4 * 5 * 6" gets parsed into (* 4 (* 5 (* 6 1))).

(defun arith-op (expr)
  (destructuring-bind (a (op b)) expr
    (list op a b)))

(defvar arith-funcs
  `((sum     . ,#'arith-op)
    (prod    . ,#'arith-op)
    (num     . ,#'string-to-number)
    (+       . ,#'intern)
    (-       . ,#'intern)
    (*       . ,#'intern)
    (/       . ,#'intern)
    (pexpr   . ,#'cadr)
    (value   . ,#'identity)
    (no-prod . ,(lambda (e) '(* 1)))
    (no-sum  . ,(lambda (e) '(+ 0)))))

Notice how normal Emacs functions could be supplied directly in most cases! That makes this approach so elegant in my opinion.

Also, in arith-op note the use of destructuring-bind. I’ve found that macro to be invaluable when writing these syntax tree functions.

In this case, we can be even more clever. Rather than build a nice parse tree, the expression can be evaluated directly. All it takes is one small change,

(defun arith-op (expr)
  (destructuring-bind (a (op b)) expr
    (funcall op a b)))

With this, the parser returns the computed value directly. So this evaluates to 120.

(rdp-parse-string "4 * 5 * 6" arith-tokens arith-funcs)

ParselTongue Compiler

I discovered this useful side effect while making my ParselTongue parser. The original intention was that I’d parse the buffer for use in indentation, then maybe I’d create an interpreter to evaluate the parser output. However, the resulting parse tree was looking a lot like Elisp. In an epiphany I realized I could simply emit valid Elisp directly and forgo writing the interpreter altogether. And so I accidentally created a ParselTongue compiler! This was incredibly exciting for me to realize.

This ParselTongue program,

defvar obj = {x: 1} in { obj.x }

Compiles to this Elisp,

(let ((obj (list (cons 'x 1))))
  (progn (cdr (assq 'x obj))))

Because it compiles to such a high level language, and because ParselTongue is very Lisp-like semantically, it’s a bit unconventional: the compiler emits code during parsing. In fact, when the parser backtracks, some emitted code is thrown away.

By the end of the first evening I had implemented the majority of the compiler, which quickly took precedence over indentation. The compiler is now integrated as part of psl-mode. The current buffer can be evaluated at any time with psl-eval-buffer. This function compiles the buffer and has Emacs eval the result, printing the output in the minibuffer. Compiler output can be viewed with psl-show-elisp-compilation (mostly for my own debugging).

After a few days I integrated indentation with parsing, which required modifying the parser (changes included in rdp itself). The parser needed to keep track of where the point is in the parse tree. For indentation it basically counts the depth into the parse tree, plus a few more checks for special cases.

The parser was intentionally isolated from the rest of psl-mode so that it could be separated for general use, which I have now done. It’s been a really handy general purpose tool since then. That arithmetic parser is only 35 lines of code and took about half-an-hour to create.

Future Directions

I also wrote a bencode parser — only the bencode-tokens and bencode-funcs alists are needed to parse bencode, about 30 LOC. Careful observation will reveal that I cheated and the result is a little hackish. Due to the way strings work, bencode is not context-free so it can’t be parsed purely by the grammar. I can work around it by having the parse tree function for strings consume input, since it’s called during parsing.
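The context-sensitive part is the string encoding, where a decimal length prefix says how many characters follow the colon, as in 4:spam. A minimal sketch of consuming one such string by hand is below. my-bencode-read-string is a hypothetical name, it works on a Lisp string rather than a buffer, and it assumes the text is unibyte so that characters and bytes coincide.

```elisp
(defun my-bencode-read-string (text pos)
  "Read a bencode string from TEXT starting at POS.
Return (STRING . NEW-POS), or nil if no well-formed string starts there."
  (when (and (string-match "[0-9]+:" text pos)
             (= (match-beginning 0) pos))
    (let* ((start (match-end 0))
           ;; The digits before the colon give the payload length.
           (len (string-to-number (substring text pos (1- start))))
           (end (+ start len)))
      (when (<= end (length text))
        (cons (substring text start end) end)))))

(my-bencode-read-string "4:spam" 0)  ; => ("spam" . 6)
```

The amount of input consumed depends on a value read from the input itself, which is exactly what a regular-expression terminal cannot express.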

I’ll be using rdp to parse many more things in the future, I’m sure. It’s much more powerful than I expected.

Fractal Rendering in Emacs

2012-09-14T00:00:00Z

Taking advantage of Emacs’ image-mode and the handy Netpbm format it’s possible to generate and render images inside Emacs using Elisp. This function will generate a Sierpinski carpet and display the result in a buffer.

(defun sierpinski (s)
  (pop-to-buffer (get-buffer-create "*sierpinski*"))
  (fundamental-mode) (erase-buffer)
  (labels ((fill-p (x y)
                   (cond ((or (zerop x) (zerop y)) "0")
                         ((and (= 1 (mod x 3)) (= 1 (mod y 3))) "1")
                         (t (fill-p (/ x 3) (/ y 3))))))
    (insert (format "P1\n%d %d\n" s s))
    (dotimes (y s) (dotimes (x s) (insert (fill-p x y) " "))))
  (image-mode))

It’s best called with powers of three,

(sierpinski (expt 3 5))
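For anyone unfamiliar with the plain PBM (P1) format used here, it is just a header followed by whitespace-separated 0 and 1 pixels. The sketch below builds a tiny checkerboard as a string so the layout is visible. my-pbm-checkerboard is a hypothetical helper; to view the result, insert it into a buffer and call image-mode as the function above does.

```elisp
(defun my-pbm-checkerboard (size)
  "Return a SIZE x SIZE checkerboard as a plain PBM (P1) string."
  (with-temp-buffer
    ;; Header: magic number, then width and height.
    (insert (format "P1\n%d %d\n" size size))
    (dotimes (y size)
      (dotimes (x size)
        (insert (if (= 0 (mod (+ x y) 2)) "0 " "1 ")))
      (insert "\n"))
    (buffer-string)))

(my-pbm-checkerboard 2)  ; => "P1\n2 2\n0 1 \n1 0 \n"
```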

This one should look quite familiar. Using the same technique,

(defun mandelbrot ()
  (pop-to-buffer (get-buffer-create "*mandelbrot*"))
  (let ((w 400) (h 300) (d 32))
    (fundamental-mode) (erase-buffer)
    (set-buffer-multibyte nil)
    (insert (format "P6\n%d %d\n255\n" w h))
    (dotimes (y h)
      (dotimes (x w)
        (let* ((cx (* 1.5 (/ (- x (/ w 1.45)) w 0.45)))
               (cy (* 1.5 (/ (- y (/ h 2.0)) h 0.5)))
               (zr 0) (zi 0)
               (v (dotimes (i d d)
                    (if (> (+ (* zr zr) (* zi zi)) 4) (return i)
                      (psetq zr (+ (* zr zr) (- (* zi zi)) cx)
                             zi (+ (* (* zr zi) 2) cy))))))
          (insert-char (floor (* 256 (/ v 1.0 d))) 3))))
    (image-mode)))

Tweak it with a colormap,

(defun colormap (v)
  "Given a value between 0 and 1.0, insert a P6 color."
  (dotimes (i 3)
    (insert-char (floor (* 256 (min 0.99 (sqrt (* (- 3 i) v))))) 1)))

One of the project ideas on my mental back-burner of things I’ll never get to is to create a little graphics library for Elisp. It would use a technique like this to pull it off. Assuming support was compiled in, Emacs can even render SVGs to a buffer, so creating a rich graphics library wouldn’t be difficult at all. Plus, unlike bare Elisp, it would be fast.

Markov Chain Text Generation

2012-09-05T00:00:00Z

You may have been confused by yesterday’s nonsense post. That’s because it was generated by a few Elisp Markov chain functions. It was fed my entire blog and used to generate a ~1500 word post. I tidied up a bit to make sure the markup was valid and parentheses were balanced, but that’s about it.

The algorithm is really simple and I was quite surprised by the quality of the output. After feeding it Great Expectations and A Princess of Mars (easily obtainable from Project Gutenberg) I had a good laugh at some of the output. Some choice quotes,

He wiped himself again, as if he didn’t marry her by hand.

I admit having done so, and the summer afternoon toned down into the house.

My favorite of yesterday’s post was this one,

Suppose you want to read a great story, I recommend it.

The output also looks like some types of spam, so this may be how some spammers generate content in order to get around spam filters.

To build a Markov chain from input, the program looks at markov-text-state-size words (default 3) and makes note of what word follows. Then it slides the window forward one word and repeats. To generate text, the last markov-text-state-size words outputted are the state and the next word is selected from these notes at random, weighted by the frequency of its appearance in the input text. Smaller state sizes generate more random output and larger state sizes generate better structured output. Too large and the output is the input verbatim.

For example, given this sentence and a state size of two words,

Quickly, he ran and he ran until he couldn’t.

The produced chain looks like this in alist form,

((("Quickly," "he") "ran")
 (("he" "ran") "and" "until")
 (("ran" "and") "he")
 (("and" "he") "ran")
 (("ran" "until") "he")
 (("until" "he") "couldn't.")
 (("he" "couldn't.")))

Because there are two options for (“he” “ran”), the generator might loop around that state for a while like so,

Quickly, he ran and he ran and he ran and he ran until he couldn’t.

Or it might skip the section altogether,

Quickly, he ran until he couldn’t.
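The build and generate steps described above can be sketched in a few lines. This is only an illustration under my own assumptions, not the markov-text package itself: the names sketch-markov-build and sketch-markov-generate are hypothetical, the input is already split into a list of words, and the chain is kept as a plain alist like the one shown above.

```elisp
(require 'cl-lib)

;; Hypothetical helper names; this is not the markov-text source.
(defun sketch-markov-build (words n)
  "Return an alist mapping each N-word state in WORDS to its followers."
  (let ((chain '()))
    (while (>= (length words) n)
      (let* ((state (cl-subseq words 0 n))
             (next (nth n words))
             (entry (assoc state chain)))
        (unless entry
          (setq entry (list state))
          (push entry chain))
        ;; Duplicates are kept, so a random pick is weighted by frequency.
        (when next
          (setcdr entry (append (cdr entry) (list next)))))
      ;; Slide the window forward one word.
      (setq words (cdr words)))
    (nreverse chain)))

(defun sketch-markov-generate (chain state limit)
  "Walk CHAIN from STATE, adding at most LIMIT words; return a word list."
  (let ((out (reverse state))
        (followers (cdr (assoc state chain))))
    (while (and followers (> limit 0))
      (let ((next (nth (random (length followers)) followers)))
        (push next out)
        ;; The new state is the old one shifted by one word.
        (setq state (append (cdr state) (list next))
              followers (cdr (assoc state chain))
              limit (1- limit))))
    (nreverse out)))
```

Calling sketch-markov-build on (split-string "Quickly, he ran and he ran until he couldn't.") with a state size of 2 gives the alist shown above, and sketch-markov-generate stops early when it reaches a state with no followers.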

Also notice that the punctuation is part of the word. This makes the output more natural, automatically forming sentences. Moreover, my program also holds onto all newlines. This breaks the output into nice paragraphs without any extra effort. Since I wrote it in Elisp, I use fill-paragraph to properly wrap the paragraphs as I generate them, so superfluous single newlines don’t hurt anything.

One problem I did run into with my input text was quotes. I was using novels so there is a lot of quoted text (character dialog). The generated text tends to balance quotes poorly. My solution for the moment is to strip these out along with spaces when forming words. That’s still not ideal.

I’m going to play with this a bit more, using it as a tool for other project ideas (ERC bot, etc.). I already did this by including a lorem ipsum generator alongside the markov-text package. The input text is Cicero’s De finibus bonorum et malorum, the original source of lorem ipsum. This was actually the original inspiration for this project, after I saw lorem-ipsum.el on EmacsWiki and decided I could do better.

simple-httpd and impatient-mode

2012-08-20T00:00:00Z

After settling in with MELPA I wanted to see about turning my Emacs web server into an installable package. Someone had already uploaded my code to Marmalade after taking credit for all the work and slapping the GPL on it (my version is public domain). So, due to that and because the name httpd.el is already overloaded as it is, I renamed it to simple-httpd. That’s the name of the package in MELPA.

I did more than rename the package; it got an overhaul. I rewrote a few functions, tossed a whole bunch of functions, created a test suite, and finally added directory listings — a feature that had long been on the TODO list. To keep with the name “simple”, I ripped out the clunky servlet system (sorry Chunye). This new version was leaner, cleaner, and more useful.

I’ve definitely improved my software development skill over the last three years since I originally wrote it. In my refactor I made it buffer oriented. When a request comes in, the server fills a buffer with the response and sends it back. This means I could send a Content-Length header and use keep-alive to serve multiple requests over one connection. It also suggested a new servlet paradigm — the servlet prepares a buffer and the server sends it to the client.

Servlets

So I ended up adding servlet support again, from scratch. This time it’s really easy to use. Here’s a “Hello, World” servlet,

(defservlet hello-world text/plain ()
  (insert "Hello, World"))

The “function name” part is the path to the servlet. This one would be found at /hello-world. The second is the MIME type as a symbol. We’re just sending plain text in this example. The third is the argument list. A servlet takes up to three arguments: the path, the query alist, and the full request object (which includes the first two). Unless a more specific servlet is defined, this servlet handles everything under its root. In this case that is /hello-world, including /hello-world/foo and /hello-world/foo/bar.txt. This is why the path argument is relevant.

This servlet uses the path to get a name,

(defservlet hello text/plain (path)
  (insert "hello, " (file-name-nondirectory path)))

If you visit /hello/Chris it will send you “hello, Chris”. Servlets are trivial to write!

This one serves the contents of the *scratch* buffer,

(defservlet scratch text/plain ()
  (insert-buffer-substring (get-buffer "*scratch*")))

In the background I continue to use Chunye’s symbol dispatch technique, so all servlets are actually functions that begin with httpd/ (httpd/hello-world and httpd/hello). For a more advanced servlet, the function can be written directly. There’s another macro, with-httpd-buffer, to help keep this simple. The server will always pass four arguments (the three servlet arguments plus one more), so when creating the function directly it needs to accept at least four arguments.

(defun httpd/hello (proc path &rest args)
  (with-httpd-buffer proc "text/plain"
    (insert "hello, " (file-name-nondirectory path))))

The proc object here is the network connection process, providing more exclusive access to the client. This allows the servlet to do more interesting things like respond in the future (long polls). The with-httpd-buffer macro creates a temporary buffer and, when the body completes, sends an HTTP header and the buffer as the content, similar to defservlet.
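The symbol dispatch idea, where the most specific httpd/ function wins, can be sketched without the server at all. This is not simple-httpd’s actual code. It is a guess at the shape of the lookup: the name sketch-find-servlet is hypothetical, and so is the choice to fall back to the symbol httpd/ when nothing more specific is defined.

```elisp
;; Hypothetical illustration of prefix dispatch; not taken from simple-httpd.
(defun sketch-find-servlet (path)
  "Return the most specific `httpd/' function symbol defined for PATH.
Fall back to the symbol `httpd/' when nothing more specific exists."
  (let ((parts (split-string path "/" t))
        (found nil))
    (while (and parts (not found))
      (let ((sym (intern-soft
                  (concat "httpd/" (mapconcat #'identity parts "/")))))
        (if (and sym (fboundp sym))
            (setq found sym)
          ;; Drop the last path component and try the shorter prefix.
          (setq parts (butlast parts)))))
    (or found 'httpd/)))
```

With httpd/hello-world defined, both /hello-world and /hello-world/foo/bar.txt resolve to it, which matches the behavior described earlier for servlet roots.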

With access to the process, the servlet can do more specialized things like send custom headers with httpd-send-header, send files with httpd-send-file, send an error page with httpd-error, or do redirects with httpd-redirect. The file server part of the server is actually just another servlet as well: httpd/. This could be redefined to redirect the browser to our example servlet (HTTP 301).

(defun httpd/ (proc &rest args)
  (httpd-redirect proc "/hello-world"))

impatient-mode

I showed this to Brian, like I do everything, and he found my servlet concept to be compelling, especially the buffer-serving servlet. I believe his exact words were, “That’s so simple.” He found it interesting enough that he wrote a mode based on it called impatient-mode!

It serves a buffer’s content live to the web browser, including syntax highlighting (via htmlize). Updates to the buffer are communicated by a long-poll. The browser initiates a request in the background for an update. Emacs adds the request to a list. A hook in after-change-functions updates all the browsers waiting for an update.
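The long-poll flow can be modeled with plain callbacks standing in for waiting browser requests. This is a rough sketch of the idea only, not impatient-mode’s source. All of the sketch-imp- names are hypothetical, and no networking is involved.

```elisp
;; Hypothetical model of the long-poll bookkeeping; not impatient-mode itself.
(defvar-local sketch-imp-waiting nil
  "Callbacks standing in for browser requests waiting on this buffer.")

(defun sketch-imp-wait (callback)
  "Park CALLBACK until the current buffer next changes."
  (push callback sketch-imp-waiting))

(defun sketch-imp-notify (&rest _)
  "Hand the buffer text to every parked callback, then forget them."
  (let ((waiting (reverse sketch-imp-waiting))
        (text (buffer-string)))
    (setq sketch-imp-waiting nil)
    (dolist (callback waiting)
      (funcall callback text))))

(defun sketch-imp-publish ()
  "Answer parked callbacks whenever the current buffer changes."
  (add-hook 'after-change-functions #'sketch-imp-notify nil t))
```

Each callback is answered once and then dropped, the way a long-poll request is completed and the browser has to ask again for the next update.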

Enabling impatient-mode, a minor mode, publishes the buffer. If the server’s running, the list of published buffers can be found under /imp — i.e. http://localhost:8080/imp. The buffer can be accessed directly at /imp/live/, which is where /imp will link.

Perhaps the coolest thing is serving an HTML buffer without htmlize. That is, send the raw buffer as text/html. Brian has a demo of this in the linked post. You can tweak CSS and HTML and watch it update live in the browser as you edit. It’s a really neat way to edit CSS, since it’s often unintuitive (at least for me).

impatient-mode can also be installed through MELPA.

Elisp Unit Testing with ERT

2012-08-15T00:00:00Z

Emacs 24 comes with a unit testing library, ERT (Emacs Lisp Regression Testing). I learned about it after watching Extending Emacs Rocks! and I’ve been using it ever since. It’s been a pleasant experience; enough so that I made a key binding for it so that I can effortlessly run tests at any time. When I recently made a major overhaul to my Emacs web server I added a small test suite using ERT.

Emacs also comes with the ERT manual so it’s easy to start learning, but here’s the gist of it. There are essentially two macros to worry about: ert-deftest and should. The first is used to create tests and the second behaves like assert but with nicer behavior. Here’s an example,

(ert-deftest example-test ()
  (should (= (+ 9 2) 11)))

ert-deftest is what you’d expect from every other def*. The empty parameter list does nothing at the moment other than to make it feel like writing a defun. The body is evaluated as normal. This is all turned into an anonymous function which is stuffed in the plist of the symbol example-test. When it comes time to run tests, they are found by searching the plists of every interned symbol.

The other macro, should, takes one argument: a form that should evaluate to true. There is also a should-not and a should-error, which do what you would expect.
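Here is a small sketch of the other two macros in use. The test name sketch-should-variants is made up for illustration.

```elisp
(require 'ert)

(ert-deftest sketch-should-variants ()
  ;; Passes because the form evaluates to nil.
  (should-not (= (+ 9 2) 100))
  ;; Passes because dividing by zero signals `arith-error'.
  (should-error (/ 1 0) :type 'arith-error))
```

The :type keyword narrows should-error to one kind of error. Without it, any signaled error satisfies the check.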

Tests are run with M-x ert. It will ask for a test selector, where t selects all defined tests. There are many ways to select a subset of all tests (:new, :passed, :failed, etc.) but I usually just run all of them (as my key binding makes obvious). The results are displayed in a separate pop-up buffer which, as usual, can be dismissed with q.

Running ERT

What makes should special is error reporting. When tests fail you will be provided with the forms that failed and their return values. For example, if we modify the test above to fail,

(ert-deftest example-test ()
  (should (= (+ 9 2) 100)))

Then run the test and it will note the failure. There is also some red coloring not captured here.

F example-test
    (ert-test-failed
     ((should
       (=
        (+ 9 2)
        100))
      :form
      (= 11 100)
      :value nil))

Displayed are the forms we were comparing — (+ 9 2) and 100 — and what they evaluated to: (= 11 100). If I put the point at the test result and type . it will take me to the test definition so that I can start looking further. Or I can press b to see a backtrace, m to see all output messages from that test, or, if I’m in disbelief, r to rerun that test.

Mocking

Elisp’s dynamic bindings really come in handy when functions need to be mocked. For example, say I have a function that, at some point, needs to check whether or not a particular file exists. This would be done using file-exists-p. Creating or removing the file in the filesystem before the test isn’t a well-contained unit test. Tests running in parallel could interfere and there are a number of ways something could go wrong.

Instead I’ll temporarily override the definition of file-exists-p with a mock function using let’s cousin, flet. Note that file-exists-p is a C source function but I can still override it as if it was any regular lisp function.

(defun determine-next-action ()
  (if (file-exists-p "death-star-plans.org")
      'bring-him-the-passengers
    'tear-this-ship-apart))

(ert-deftest file-check-test ()
  (flet ((file-exists-p (file) t))
    (should (eq (determine-next-action) 'bring-him-the-passengers)))
  (flet ((file-exists-p (file) nil))
    (should (eq (determine-next-action) 'tear-this-ship-apart))))

This is a very simple mock. For a real unit test I might want the mock to return t for some filename patterns and nil for others. There’s an extension to ERT, el-mock.el, which assists in creating more complex mocks, but I haven’t used or needed it yet.
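As a sketch of a pattern-based mock: the version below answers t only for .org files. It uses cl-letf on symbol-function rather than flet, because flet was marked obsolete in later Emacs releases (24.3 onward). The names sketch-next-action and sketch-pattern-mock-test are hypothetical.

```elisp
(require 'cl-lib)
(require 'ert)

;; Hypothetical variant of the function above that takes the file name.
(defun sketch-next-action (file)
  (if (file-exists-p file)
      'bring-him-the-passengers
    'tear-this-ship-apart))

(ert-deftest sketch-pattern-mock-test ()
  ;; Temporarily replace `file-exists-p': only .org files "exist".
  (cl-letf (((symbol-function 'file-exists-p)
             (lambda (file) (string-suffix-p ".org" file))))
    (should (eq (sketch-next-action "death-star-plans.org")
                'bring-him-the-passengers))
    (should (eq (sketch-next-action "death-star-plans.txt")
                'tear-this-ship-apart))))
```

The original definition is restored when the cl-letf body exits, even if an assertion fails, so the mock can’t leak into other tests.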

Since it’s so convenient I’m going to be using ERT more and more until it becomes second-nature.

Switching to the Emacs Lisp Package Archive

2012-08-12T00:00:00Z

Update June 2017: I no longer use Emacs’ package.el and instead manage packages and their dependencies (manually) through my own decentralized package system called gpkg (“git package”).

For those who are unaware, Emacs 24 was finally released this past June. I had been following the official repository for about a year before the release using what was becoming version 24, very quickly becoming dependent on several of the new features. Now that it’s been officially released I’m back to using a stable version of Emacs, about which I’m quite relieved.

One of the new features that I hadn’t been using until recently was the package manager, package, and the Emacs Lisp Package Archive (ELPA). You can now ask Emacs to download and install new modes and extensions from the Internet. By default, it only uses the official archive. It only hosts packages with copyright assigned to the FSF — quite restrictive. There are alternatives, the most popular of which is Marmalade. Fortunately it’s easy to ask package to use additional repositories, so this is a non-issue.

Because it was still unstable and buggy at the time, I avoided using it when setting up my configuration repository. Instead I opted to gather packages by way of Git submodules. I’d give package a shot once Emacs 24 was released. Once it was released in June it was just a matter of time until I invested into this new system.

The trigger was an e-mail from one of my readers, Rolando. He asked me if I could move my recently updated memoization function into its own repository and touch it up so that it could be turned into a package with MELPA, another alternative package repository. This forced me to finally investigate.

It turns out MELPA is really interesting. Each package is described by a “recipe” file, which is essentially just a tiny s-expression listing the repository URL. In the case of my memoization package,

(memoize :repo "skeeto/emacs-memoize"
         :fetcher github)

From a package maintainer’s point-of-view, this is fantastic. I don’t have to take any extra steps to publish updates to my package. I just keep doing what I do and it happens automatically. However, I need to be more careful about not pushing broken commits — which is why I started unit testing (to be covered in a future post). And I need to be extra careful with my SSH keys, since they’re now used to publish code that other people automatically trust and execute.

Excited about MELPA and wanting to actually use my own package, I started throwing out my submodules, replacing them with their package equivalents. If you follow my configuration repository you probably noticed all the recent disruption, because updating requires manual intervention. Git leaves submodules around (for good reason!) so they need to be manually removed.

I also heavily updated and renamed my web server (now called simple-httpd) to provide it as a package (also to be covered in a future post). Thanks to MELPA, I follow the package rather than my own repository since it follows so closely (< 1 hour).

Another barrier was that I was using an old version of Magit due to a bad interaction of modern versions with Wombat, my preferred color theme. After some face tweaking, I not only fixed it but I made it better than it was before. Sinking an hour or two into these sorts of annoyances usually works out really well. I need to remind myself of this in the future when I run into annoyance issues.

Surprisingly, package doesn’t seem to be written with managed configuration in mind. The provided functionality is designed to be used interactively rather than programmatically. package-install is only meant to be invoked once, so care needs to be taken in listing packages in a configuration and doing everything in the right order. Here’s how I have it set up at the moment, after listing the packages to use in my-packages,

(require 'package)
(add-to-list 'package-archives
             '("melpa" . "http://melpa.milkbox.net/packages/") t)
(package-initialize)
(unless package-archive-contents
  (package-refresh-contents))
(dolist (p my-packages)
  (when (not (package-installed-p p))
    (package-install p)))

Upgrading/updating is currently a manual process. Run package-refresh-contents, list the packages with list-packages, type U to mark updates, then x to execute the upgrade. Sometime I may work that into my configuration to be done automatically once-per-week or something.

I really look forward to making more use of the package manager, especially as packages can more easily become interdependent, reducing duplication of effort.

Programmatically Setting Lisp Docstrings

2012-08-02T00:00:00Z

I just updated my Elisp memoization function so that it’s no longer a dirty hack. To work around the lack of closures, due to the lack of lexical scope in Elisp, the original version used uninterned symbols to store the look-up table. The new version in the post uses lexical-let, which does the same thing internally to fake a closure. The new version in my dotfiles repository uses the brand new Emacs 24 lexical scoping.

It was “dirty” because it built a lambda function out of a list at run time, taking advantage of the way Elisp currently handles functions. The reason for this was that I wanted to inject the original documentation string into the new function which can’t normally be done when lambda is used the correct way. When I updated the function I fixed this as well. It uses a trick provided by Elisp, which is different than the Common Lisp way that I assumed.

Both Elisp and Common Lisp have a documentation function for programmatically accessing symbol documentation. The Elisp version only provides function documentation, so it only accepts one argument.

(defun foo ()
  "Foo."
  nil)

(documentation 'foo)
=> "Foo."

The Common Lisp version must be told what type of documentation to return, such as function or variable (defvar, defconst).

(documentation 'foo 'function)
=> "Foo."

As might be expected, this is setf-able! It’s possible to update or modify documentation strings without needing to redefine the function.

(setf (documentation 'foo 'function) "New doc string.")

Unfortunately it’s not setf-able in Elisp. Instead you can set the function-documentation property of the symbol. The documentation function will prefer this over the string stored in the function itself.

(put 'foo 'function-documentation "Foo updated.")

(documentation 'foo)
=> "Foo updated."

The downside is that this is a second place to put docstrings, leading to surprising behavior for developers unaware of this hack.

(put 'foo 'function-documentation "Old docstring.")

(defun foo ()
  "New docstring."
  nil)

(documentation 'foo)
=> "Old docstring."

This can be fixed by setting the symbol property for function-documentation to nil.

(put 'foo 'function-documentation nil)
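Tying this back to the memoization case, here is a minimal sketch of replacing a function’s definition while carrying its docstring over through the symbol property. The names sketch-foo and sketch-replace-keeping-doc are hypothetical, and this is not the code from my memoize function.

```elisp
(defun sketch-foo (x)
  "Return X doubled."
  (* 2 x))

;; Hypothetical helper: swap the definition, keep the documentation.
(defun sketch-replace-keeping-doc (symbol new-function)
  "Install NEW-FUNCTION as SYMBOL's definition, keeping the old docstring."
  (let ((doc (documentation symbol)))
    (fset symbol new-function)
    (put symbol 'function-documentation doc)))

(sketch-replace-keeping-doc 'sketch-foo (lambda (x) (+ x x)))
(documentation 'sketch-foo)
```

The replacement lambda carries no docstring of its own, yet the final documentation call still reports the original text because the property takes precedence.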

I prefer the Common Lisp method.

Literal Arrays and Vectors in Lisp

2012-07-17T00:00:00Z

Despite being a Lisper, unlike Brian I haven’t gotten into Clojure yet. I’ve been following along at a safe distance. Due to a recent post of his I learned about a significant difference between Clojure and other Lisps when it comes to arrays/vectors.

In this recent post, Brian wrote a ClojureScript let-like macro to hide JavaScript asynchronous function chains so that they can be used just like regular synchronous functions. Following Clojure’s style, the asynchronous functions are written inside a vector rather than a list to indicate to the macro that they’re special.

(doasync
  [text [fetch "/foo/json"]
   url (str text ".html")
   result [fetch url]
   _ (.show view result)
   _ [timeout 1000]
   _ (.makeEditable view)])

That sounded completely reasonable to me, since array literals are rarely used inside code in Common Lisp. When they are used, it’s as a global constant.

A few days later when I was talking to Brian at the metaphorical water cooler he mentioned that the macro was actually conflicting with what he would normally write. Sometimes he really did want to use a vector literal in a let binding. Why would he do that? In Common Lisp, that’s just asking for trouble — same for Elisp and Scheme.

(let ((v #(1 2 3)))
  (foo v))

The reason why this is a bad idea is that the same exact array will always be passed to foo. The array is created once at read time by the reader and re-used for the life of that code. If anyone makes a modification to the array it will damage the array for everyone using it.

(defun foo ()
  #(1 2 3))
(eq (foo) (foo))
=> T

The safer method is to create a fresh array every time by not using a literal but instead calling vector.

(let ((v (vector 1 2 3)))
  (foo v))
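The same distinction can be checked in Elisp, where the vector literal syntax is [1 2 3] rather than #(1 2 3). A small sketch with hypothetical function names:

```elisp
(defun sketch-literal ()
  "Return a vector literal embedded in the function body."
  [1 2 3])

(defun sketch-fresh ()
  "Return a newly allocated vector on every call."
  (vector 1 2 3))

(eq (sketch-literal) (sketch-literal))  ; the very same object each time
(eq (sketch-fresh) (sketch-fresh))      ; two distinct vectors
(equal (sketch-fresh) (sketch-fresh))   ; distinct, but equal contents
```

Only the literal version hands every caller the same object, which is why a modification by one caller would be visible to all of them.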

Clojure data structures are immutable, including vectors, so using the same exact vector in multiple places is safe. That makes using literal vectors in code less awkward. But that still left a question hanging: why was Brian using literal vectors so often that he needed one so soon after writing this macro?

In Common Lisp, they’re not very useful because the elements are not evaluated by the parser. When this vector is evaluated the result is a vector where the second element is a list containing three atoms.

#(1 (+ 2 3) 4)
=> #(1 (+ 2 3) 4)

Evaluated arrays return themselves unchanged. To do most useful things, a fresh vector needs to be constructed piecemeal. If somehow the uniqueness of a literal array wasn’t an issue, they still couldn’t be used for much.

(defun foo (x)
  #(x x x))
(foo 10)
=> #(X X X)

To achieve the desired effect, the vector function needs to be used again. Because it’s a normal function call, the arguments are evaluated.

(defun foo (x)
  (vector x x x))
(foo 10)
=> #(10 10 10)

However, to my surprise, Clojure doesn’t work like this! Literal vectors have their elements evaluated and, if necessary, are created fresh on every use — exactly like a call to vector.

(defn foo [x]
  [x x x])
(foo 10)
=> [10 10 10]
(identical? (foo 10) (foo 10))
=> false

If the exact form of the vector is needed unevaluated, it needs to be quoted just like lists.

(defn foo [x]
  '[x x x])
(foo 10)
=> [x x x]
(identical? (foo 10) (foo 10))
=> true

After further reflection, I now feel like this is the right way to go about implementing vectors. When I was first learning Lisp the non-evaluating nature of arrays really caught me by surprise. Vectors should evaluate their elements by default; if the Common Lisp behavior is needed it can always be quoted. It’s impossible to “fix” any established Lisp of course, so I’m merely wishing this was the behavior defined decades ago.

To recap: normally in Lisp, vectors evaluate to themselves, like numbers and strings. Instead, evaluation of a vector should return a new vector containing the result of evaluating each element. Since Clojure’s data structures are immutable, the compiler can take a shortcut when it can guarantee each of a vector’s elements always evaluates to itself, and have the vector evaluate to itself, purely as an optimization.

Fake Emacs Namespaces

2011-08-18T00:00:00Z

Back in May I wrote a crude defpackage function for Elisp, modeled after Common Lisp’s version. I’m calling them fakespaces.

https://github.com/skeeto/elisp-fakespace

It works like so (see example.el for detailed information on this code),

(require 'fakespace)

(defpackage example
  (:use cl ido)
  (:export example-main example-var eq-hello hello))

(defvar my-var 100
  "A hidden variable.")

(defvar example-var nil
  "A public variable.")

(defun my-func ()
  "A private function."
  my-var)

(defun example-main ()
  "An exported function. Notice we can access all the private
variables and functions from here."
  (interactive)
  (list (list (my-func) my-var) example-var
        (ido-completing-read "New value: " (list "foo" "bar"))))

(defun eq-hello (sym)
  (eq sym 'hello))

(end-package)

Notice end-package at the end, which is not needed in Common Lisp. That’s part of what makes it crude.

If you run those functions and try changing the assignment of non-exported symbols, you’ll see the namespace separation in action. my-var and my-func are completely different symbols than the ones you’re seeing after end-package.

It’s really simple in how it works (it’s 40 lines of code). The defpackage macro takes a snapshot of the symbol table. Then new symbols get interned through various function and variable definitions. Finally end-package compares the current symbol table to the snapshot and uninterns any new symbols. These symbols will be inaccessible to other code, effectively giving them their own namespace.
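The snapshot-and-unintern mechanism can be sketched with mapatoms and unintern. This is a simplified illustration with hypothetical function names, not the fakespace source. It handles a single snapshot rather than a stack.

```elisp
;; Hypothetical sketch of the mechanism; not the fakespace implementation.
(defun sketch-fakespace-snapshot ()
  "Return a hash table holding every symbol currently interned."
  (let ((table (make-hash-table :test 'eq)))
    (mapatoms (lambda (sym) (puthash sym t table)))
    table))

(defun sketch-fakespace-unintern-new (snapshot)
  "Unintern every symbol interned since SNAPSHOT; return those symbols."
  (let ((new '()))
    ;; Collect first, so the obarray is not modified while it is walked.
    (mapatoms (lambda (sym)
                (unless (gethash sym snapshot)
                  (push sym new))))
    (dolist (sym new)
      (unintern sym obarray))
    new))
```

Code that already holds a reference to an uninterned symbol keeps working. Only later lookups by name get a brand-new, unrelated symbol.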

Snapshots are pushed onto a stack, so it’s safe to create a new package within another package, as long as end-package is used properly. This is necessary when one namespaced package depends on another, because the dependency will tend to be loaded in the middle of defining the current package.

in-package is not provided, so there’s no way to get the symbols back to where they can be accessed. It’s impossible to modify a package using fake namespacing. Worst of all, implementing in-package is currently (and will likely always be) impossible. When symbols are uninterned they would need to be stored in a package symbol table for future re-interning. in-package’s job would be to unintern and store away the current package’s symbols and then place the new package’s symbols into the main symbol table.

However, symbols cannot be re-interned. This is because it’s impossible for a symbol to exist in two different obarrays at the same time, so the functionality is intentionally not provided. An obarray is an Elisp vector containing symbols. It’s treated like a hash table: the symbol is hashed to choose a location in the vector. If the slot is already taken, the symbol is invisibly chained behind the residing symbol by an inaccessible linked list. If the symbol was in two obarrays at once, it would need to be able to chain to two different symbols at the same time.

Providing access to symbols through a colon-specified namespace (my-package:my-symbol) is also currently impossible, at least without hacking in C.

There’s a neat trick to the :export list. The defpackage macro definition actually ignores that list altogether, because it works automatically. By the time defpackage is invoked, the listed symbols have already been interned by the reader, so they get stored in the snapshot.

I doubt I’ll ever make use of this for my own packages. This was mostly a fun exercise in toying with Elisp.

Elisp Function Composition

2010-11-15T00:00:00Z

During my recent Elisp hacking I've run into the situation enough times where I really wanted function composition that I officially implemented it for myself. While there is an apply-partially function, Elisp does not currently come with a compose function. Here's an Elisp definition,

;; ID: f0c736a9-afec-3e3f-455c-40997023e130
(defun compose (&rest funs)
  "Return function composed of FUNS."
  (lexical-let ((lex-funs funs))
    (lambda (&rest args)
      (reduce 'funcall (butlast lex-funs)
              :from-end t
              :initial-value (apply (car (last lex-funs)) args)))))

Here it is in action with three functions.

(funcall (compose 'prin1-to-string 'random* 'exp) 10)
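For reference, here is a sketch of the same idea that avoids lexical-let and the unprefixed reduce by leaning on apply-partially and cl-lib’s cl-reduce. The names sketch-compose and sketch-compose-call are hypothetical, and this is a variation on the definition above rather than a replacement for it.

```elisp
(require 'cl-lib)

(defun sketch-compose-call (funs &rest args)
  "Apply the last of FUNS to ARGS, then pass the result through the rest."
  (cl-reduce #'funcall (butlast funs)
             :from-end t
             :initial-value (apply (car (last funs)) args)))

(defun sketch-compose (&rest funs)
  "Return the composition of FUNS, applied right to left."
  (apply-partially #'sketch-compose-call funs))

(funcall (sketch-compose #'number-to-string #'1+ #'abs) -4)
;; => "5"
```

Since apply-partially does the capturing, this version does not depend on whether the file defining it uses lexical binding.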

I'll be using this in later posts (and linking back here when I do).

Emacs Find All Files

2010-09-30T00:00:00Z

Here's another bit of code I started using recently. I often find myself wanting to open (or reopen after kill-matching-buffers) all the files under a specific point in the file system. I'm using it at work now to open up all the source files in a deep Java source tree on a small-ish project. Once it's all open I can switch to any file quickly with ido's fuzzy matching, flattening out the directory structure a bit. (And the ridiculous "security" software at work imposes a 3-second I/O block when opening files, so I get to pay this all up front at once rather than having it later break my flow.)

This just recursively travels down the sub-directories opening a buffer for everything it comes across. It ignores dot-files, like the ones your source control might litter.

;; ID: 72dc0a9e-c41c-31f8-c8f5-d9db8482de1e
(defun find-all-files (dir)
  "Open all files and sub-directories below the given directory."
  (interactive "DBase directory: ")
  (let* ((list (directory-files dir t "^[^.]"))
         (files (remove-if 'file-directory-p list))
         (dirs (remove-if-not 'file-directory-p list)))
    (dolist (file files)
      (find-file-noselect file))
    (dolist (dir dirs)
      (find-file-noselect dir)
      (find-all-files dir))))

One caveat: if you have a symbolic link that creates a file system loop, this will probably get hung on it.
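As a hedged alternative sketch, much newer Emacs releases provide directory-files-recursively, which by default does not follow symbolic links to directories. The version below assumes Emacs 27 or later for the predicate argument, and the name sketch-find-all-files is hypothetical. Unlike the function above, it opens files only and skips the directory buffers.

```elisp
;; Assumes Emacs 27+ (the PREDICATE argument); hypothetical function name.
(defun sketch-find-all-files (dir)
  "Visit every non-dot file below DIR and return the list of buffers.
Dot-directories are not descended into."
  (interactive "DBase directory: ")
  (mapcar #'find-file-noselect
          (directory-files-recursively
           dir "\\`[^.]" nil
           ;; Descend only into directories whose name has no leading dot.
           (lambda (subdir)
             (not (string-prefix-p
                   "." (file-name-nondirectory
                        (directory-file-name subdir))))))))
```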

Elisp Higher-order Conversion to Interactive

2010-09-29T00:00:00Z

For those not familiar with extending Emacs, when you create a function in Elisp it cannot be called directly by the user ("interactively") without declaring the function interactive. The simplest way to do this is by adding (interactive) to the top of the function definition. The interactive call can be made more complex, if needed, to ask the user interactively for input.

(defun hello-world ()
  "Example function."
  (interactive)
  (message "hello"))

There are some handy higher-order functions in Elisp, such as compose and apply-partially. Today I wanted to bind the output of apply-partially to a key. My situation was this: I use revert-buffer often enough that it needs a binding. Also because I use it so much, I wanted it to stop asking me for confirmation. (Yes, there are other ways to do this including revert-without-query, but I wanted a general solution.) Using apply-partially I could supply the needed function arguments at keybind time.

The problem is that you can only bind interactive functions, and the output of apply-partially is not interactive. A quick way to work around this is to wrap it in an anonymous function, which also takes away the need for apply-partially.

(lambda () (interactive) (revert-buffer nil t))

I'd rather there be another higher-order function that takes a non-interactive function and creates an interactive version. Here it is,

;; ID: c7db6dec-e7ab-3b0f-bf26-0fa268674c6c
(defun expose (function)
  "Return an interactive version of FUNCTION."
  (lexical-let ((lex-func function))
    (lambda ()
      (interactive)
      (funcall lex-func))))

Now the binding looks like this,

(global-set-key [f2] (expose (apply-partially 'revert-buffer nil t)))

I think this more clearly expresses my intention than the lambda wrapper would. Maybe?

Distributed Computing with Emacs

2010-08-07T00:00:00Z

I got an Elisp idea today and even went as far as implementing a proof of concept for it: distributed computing with Emacs Lisp. As usual for me the idea takes advantage of Lisp features to make the task pretty simple, very specifically Elisp's implementation. In this case it's the Lisp reader, printer, and the fact that Elisp functions have a printed representation, both byte-compiled and not.

Here's the proof of concept code: dist-emacs.el

A central server listens for TCP connections. Clients offering their CPU for use connect to the server and await instructions. The server sends a single, no-argument, anonymous function to the client. The client calls the function, returning the resulting form back to the server. In order to transmit the function it's encoded into a string using the Lisp printer, and the client turns it back into an executable function with the Lisp reader.

For some simple security there is a shared password between the client and server. When the server sends a function it includes a signature, and the client only runs code that matches the signature. To create a signature the password is concatenated with the string encoded version of the function (both strings) and hashed with a secure hashing algorithm. Only someone who knows the password, including other clients, can create the signature.

(defun sign-sexp (password sexp)
  "Return signature of the given s-exp."
  (sha1 (format "%s%s" password sexp)))

To make it easy for the client to read in both the signature and the function we just cons them together before encoding them as text.

(defun encode (password sexp)
  "Encode a s-exp for transmission to client."
  (prin1-to-string (cons (sign-sexp password sexp) sexp)))

The client calls the Lisp reader on the string, then checks the signature in the car cell against the s-expression in the cdr cell. This will return the function if it's legitimate, otherwise nil.

(defun decode (password str)
  "Decode string into s-exp, checking the signature in the process."
  (let* ((cons (read str))
         (sig  (car cons))
         (sexp (cdr cons)))
    (if (equal sig (sign-sexp password sexp))
        sexp
      nil)))

And that's the core of it. It just needs some network code to move the string between computers. That part can be found in the linked source above.
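To see the round trip in isolation, here is a self-contained sketch with hypothetical sketch- names. It differs from the functions above in two assumed details: it hashes with SHA-256 through secure-hash, and it serializes the s-expression with prin1-to-string before signing.

```elisp
;; Hypothetical variant of the functions above, for illustration only.
(defun sketch-sign (password sexp)
  "Return a hex signature of SEXP keyed by PASSWORD."
  (secure-hash 'sha256 (concat password (prin1-to-string sexp))))

(defun sketch-encode (password sexp)
  "Return SEXP and its signature printed as one string."
  (prin1-to-string (cons (sketch-sign password sexp) sexp)))

(defun sketch-decode (password str)
  "Read STR and return its s-exp, or nil if the signature does not match."
  (let* ((pair (read str))
         (sig (car pair))
         (sexp (cdr pair)))
    (and (equal sig (sketch-sign password sexp))
         sexp)))

;; Returns the lambda form when the passwords agree, nil otherwise.
(sketch-decode "secret" (sketch-encode "secret" '(lambda () (+ 1 2))))
(sketch-decode "wrong" (sketch-encode "secret" '(lambda () (+ 1 2))))
```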

To demo this, I'll use the whiten function from my previous post. I'll run it with three different strings on three different computers. Assume we started the dist-emacs server (dist-start) and connected three clients (dist-connect) from three computers to it. The clients were fired up from scratch so there's no whiten function on them yet, but there is one defined on the server. First we'll send the function definition to the clients. The dist-dist function takes a list of functions and passes each one to a client. Ideally I'd want this function to be more intelligent, managing a work queue so that an arbitrary length list of functions will be fed one at a time to each client. That's not the case here.

(dist-dist (mapcar (lambda (p)
                     `(lambda ()
                        (fset 'whiten ,(symbol-function 'whiten))))
                   dist-clients))

Also like in the previous post, this is an abstraction leak with the Emacs implementation. But I like this trick so I'm going to use it anyway. :-) Next we call it on each client with a different string.

(dist-dist (list (lambda () (whiten "good"))
                 (lambda () (whiten "news"))
                 (lambda () (whiten "everyone"))))

The way I have it set up for my proof of concept, the results are just spit back into the server's *Messages* buffer. If we watch that buffer we can see each result come back, one at a time, as each machine finishes. I can watch Emacs saturate the CPU on every client machine simultaneously as it works.

"2577343027adf7817185db876032d8ed"
"46a65dac2c0040afde175adf1e9a81fd"
"f39baf9e74475dd5be7d5495a025fe84"

This isn't the same order as the clients, but the order in which the jobs were completed.

As for the practicality, I doubt there really is one. It's really only a neat concept (or maybe not even that). For almost the exact same reasons as my distributed JavaScript idea, this is a solution looking for a problem. The problem needs to be able to be broken into small computation units, because Emacs has no threading, and it has to be low bandwidth, because it has to be parsed all at once from a string. If you want to pass large data sets it needs to be done out-of-band, which probably defeats the purpose. There seem to be few to no problems that fit these limitations.

Elisp Memoize

2010-07-26T00:00:00Z

Memoization is something I think should be packaged as a standard function for just about every language. That's not generally the case, but luckily this is easy to fix in Lisps. I needed memoization recently for an Elisp project I'm working on. I could have hand-written one but a generic memoization function would have worked just fine. Since I didn't find any generic Elisp memoization on-line I wrote my own.

Download: memoize.el

Just put it in your path and (require 'memoize) it. Here's the core function.

;; ID: 83bae208-da65-3e26-2ecb-4941fb310848
(defun memoize-wrap (func)
  "Return the memoized version of FUNC."
  (lexical-let ((table (make-hash-table :test 'equal)))
    (lambda (&rest args)
      (let ((value (gethash args table)))
        (if value
            value
          (puthash args (apply func args) table))))))

The hash table is stored inside the fake closure provided by lexical-let. In a previous version of this function, I stored it in an uninterned symbol, which is what is going on behind the scenes of lexical-let.

Note that in the full code it keeps the original function documentation intact. I want the memoization wrapper to be as unobtrusive as possible.

Here's a demo of it in action. This whiten function is computationally expensive: it performs key whitening. It repeats a hash function thousands of times to produce an expensive value. This isn't something you generally want to memoize, but stick with me.

(defun whiten (key)
  "Perform key whitening with the md5 hash function."
  (dotimes (i 100000 key)
    (setq key (md5 key))))

(whiten "password")   ; takes a couple of seconds

On my laptop that takes a couple of seconds to run. Increase that counter if it's quick on your computer. My memoize package provides a memoize function which will create a new function that wraps the original, then installs the new function in place of the old one if we give it the function symbol.

(memoize 'whiten)

The first time you run it after memoization it will be slow, but after that the memoization kicks in for a quick return.

There are two Elisp specific issues at hand. First is that memoizing an interactive function will produce a non-interactive function. It would be easy to fix this problem when it comes to non-byte-compiled functions, but recovering the interactive definition from a byte-compiled function is more complex than I care to deal with. Besides, interactive functions are always used for their side effects so there's no reason to memoize them.

Second is a limitation of how the wrapper uses Elisp hash tables. A plain gethash call returns nil both for a stored nil and for a missing key, so the code above cannot tell the two apart. This means nil returns are never memoized. But a computationally expensive function shouldn't be returning nil anyway.
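One possible workaround relies on the optional third argument of gethash, which is returned when the key is absent. The sketch below is a hypothetical variant, not part of memoize.el: it uses a fresh uninterned symbol as the "missing" marker and assumes the code is evaluated with lexical binding (hence the cookie on the first line). The names memoize-wrap-with-nil, demo-call-count, and demo-always-nil are invented for the example.

```elisp
;;; -*- lexical-binding: t; -*-

(defun memoize-wrap-with-nil (func)
  "Return a memoized version of FUNC that also caches nil results.
Hypothetical variant for illustration; not part of memoize.el."
  (let ((table (make-hash-table :test 'equal))
        ;; A fresh uninterned symbol can never be a real return value.
        (missing (make-symbol "missing")))
    (lambda (&rest args)
      (let ((value (gethash args table missing)))
        (if (eq value missing)
            (puthash args (apply func args) table)
          value)))))

;; Count how often the wrapped function really runs.
(defvar demo-call-count 0)

(defun demo-always-nil (_x)
  "Increment `demo-call-count' and return nil."
  (setq demo-call-count (1+ demo-call-count))
  nil)

(let ((fast (memoize-wrap-with-nil #'demo-always-nil)))
  (funcall fast 1)
  (funcall fast 1)
  demo-call-count)   ; => 1, the second call is served from the table
```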

Update: As of August 2012, several other people and I have gotten good mileage out of this function! It's an essential part of my Emacs dotfiles.

Emacs ParEdit and IELM

2010-06-10T00:00:00Z

ParEdit is a powerful extension to Emacs that I've just begun using recently. It's a minor mode that forces all parentheses, square brackets, and quotes to be balanced at all times. While it's useful for any programming language it's especially suited for Lisps, because it's designed for manipulating nested parentheses, i.e. s-expressions. It's not currently part of Emacs so you have to drop the script in your load-path somewhere.

I've frequently thought that a Lisp-based shell would be an interesting and powerful tool, much like a normal Lisp REPL. Programs would be treated like Lisp functions. For example,

wellons@luna:~$ (ls -l .emacs)
-rw------- 1 wellons wellons 4859 2010-06-10 23:20 .emacs
wellons@luna:~$

But typing all those parentheses all the time would be quite the nuisance. I know this from experience typing at Lisp REPLs. I imagined something that works exactly like ParEdit would be needed to make all that work go away. To save even more time each prompt would begin with a nested pair, with the cursor placed between them. Then typing a quick command is no different than a normal shell.

wellons@luna:~$ ()

Well, in Emacs we have both ParEdit and REPLs, so we can compose these features together with just a little advice. Here's how to do it with the Interactive Emacs-Lisp Mode (IELM) REPL. First tell IELM to use ParEdit,

(add-hook 'ielm-mode-hook (lambda () (paredit-mode 1)))

The function in IELM that spits out the next prompt is ielm-eval-input, so we give it the advice to call the ParEdit function afterwards to insert a parenthesis pair.

(defadvice ielm-eval-input (after ielm-paredit activate)
  "Begin each IELM prompt with a ParEdit parenthesis pair."
  (paredit-open-round))

And that's it! Note that the first IELM prompt is not placed by this function so it won't appear until the second prompt.

*** Welcome to IELM ***  Type (describe-mode) for help.
ELISP>
ELISP> ()

If you want to enter a single atom and don't need parentheses, just hit backspace once. This is much less common so it gets the extra keystroke.

This can be done for inferior-lisp and SLIME to enhance those REPLs as well. You just have to figure out which defun to advise.
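On Emacs 24.4 and later the same setup can also be written with advice-add rather than defadvice. The fragment below is only a sketch of that alternative for an init file: it assumes ParEdit is installed and on the load-path, and my-ielm-open-round is a made-up helper name.

```elisp
;; Assumes paredit is installed; `my-ielm-open-round' is a hypothetical name.
(defun my-ielm-open-round (&rest _args)
  "Insert a ParEdit parenthesis pair after IELM evaluates input."
  (paredit-open-round))

(with-eval-after-load 'ielm
  ;; Turn on ParEdit in every IELM buffer.
  (add-hook 'ielm-mode-hook #'paredit-mode)
  ;; Run the helper each time `ielm-eval-input' returns.
  (advice-add 'ielm-eval-input :after #'my-ielm-open-round))
```

Removing the behavior again is a matter of calling advice-remove with the same two symbols.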

Emacs Web Servlets

2009-11-03T00:00:00Z

Remember that Emacs web server I wrote back in May? Well, I got an e-mail last night from Chunye Wang containing a patch with a variant of my dynamic lisp idea, called "servlets" (not to be confused with Java servlets). Chunye had had a similar concept for an Emacs web server for a long time, but never implemented it because Emacs lacked network functionality until recently (specifically, make-network-process in Emacs 22.1, June 2007). This led Chunye to find my implementation.

Again, you can clone/view the code here. I turned the patch into a series of commits,

git clone git://github.com/skeeto/emacs-http-server.git

This is some cool stuff here.

The servlets are simply functions installed under an "httpd/" namespace, where the trailing slash represents the server root. So, the function httpd/example-servlet will be executed when "/example-servlet" is requested from the server. The servlet runs on a temporary buffer, whose contents are served when the servlet function returns.

To assist in HTML generation, Chunye also wrote a function to turn an S-expression into HTML, similar to the one I described in the previous web server post. Symbols are converted into strings, alists are attributes, and the elisp symbol indicates code to be executed, with the results used to generate HTML. For a simple hello world,

(html (head (title "hello world")) (body "hello world"))

And for some dynamic content, a die roller,

(defun httpd/roll-die (uri-query req uri-path)
  "Rolls a die with the requested number of sides (default 6)."
  (let ((sides
         (string-to-number (or (cadr (assoc "sides" uri-query)) "6"))))
    (httpd-generate-html
     '(html
       (head
        (title "Die Roll Servlet"))
       (body
        (h1 "Die Roll Servlet")
        "You rolled a "
        (b
         (elisp (list (number-to-string (1+ (random sides)))))))))))

That one would be accessed from the browser with "/roll-die" or "/roll-die?sides=100".
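To make the static hello world conversion concrete, here is a minimal sketch of an S-expression to HTML converter. It is not Chunye's function: sketch-sexp-to-html is a hypothetical helper that handles only tags and text, with no attributes, no escaping, and no support for the elisp form.

```elisp
(defun sketch-sexp-to-html (sexp)
  "Convert SEXP to an HTML string.
Hypothetical helper for illustration: handles only tags and text,
not attributes or the `elisp' form."
  (cond
   ((stringp sexp) sexp)
   ((symbolp sexp) (symbol-name sexp))
   ;; A list is a tag name followed by its children.
   (t (let ((tag (symbol-name (car sexp))))
        (concat "<" tag ">"
                (mapconcat #'sketch-sexp-to-html (cdr sexp) "")
                "</" tag ">")))))

(sketch-sexp-to-html
 '(html (head (title "hello world")) (body "hello world")))
;; => "<html><head><title>hello world</title></head><body>hello world</body></html>"
```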

Chunye provided some sample servlets that list the buffers, with links that serve them up. There is also another servlet that will switch the current buffer, which I find compelling. All of Emacs' functionality is available to the servlet.

Now, to write a servlet that runs the Emacs psychiatrist ...

Lisp Fantasy Name Generator

2009-07-03T00:00:00Z

Earlier this year I implemented the RinkWorks fantasy name generator in Perl. I think lisp lends itself even better for that, and so I have a partial elisp implementation for you.

What stands out for me is that the patterns can easily be represented as an S-expression. We represent substitutions with symbols, literals with strings, and groups with lists. For example, this pattern,

s(ith|<'C>)V

can be represented in code as,

(s ("ith" ("'" C)) V)

I want a function I can apply to this to generate a name. First, I set up an association list mapping symbols to their replacements,

(defvar namegen-subs
  '((s ach ack ad age ald ale an ang ar ard as ash at ath augh
       aw ban bel bur cer cha che dan dar del den dra dyn
       ech eld elm em en end eng enth er ess est et gar gha
       hat hin hon ia ight ild im ina ine ing ir is iss it
       kal kel kim kin ler lor lye mor mos nal ny nys old om
       on or orm os ough per pol qua que rad rak ran ray ril
       ris rod roth ryn sam say ser shy skel sul tai tan tas
       ther tia tin ton tor tur um und unt urn usk ust ver
       ves vor war wor yer)
    (v a e i o u y)
    ...
    (d elch idiot ob og ok olph olt omph ong onk oo oob oof oog
       ook ooz org ork orm oron ub uck ug ulf ult um umb ump umph
       un unb ung unk unph unt uzz))
  "Substitutions for the name generator.")

Since we will need this in a couple places, make a function to randomly select an element from a list,

(defun randth (lst)
  "Select random element from the given list."
  (nth (random (length lst)) lst))

A function for replacing a symbol,

(defun namegen-select (sym)
  "Select a replacement for the given symbol."
  (if (null (assoc sym namegen-subs))
      (throw 'bad-symbol
             (concat "Invalid substitution symbol: " (format "%s" sym)))
    (symbol-name (randth (cdr (assoc sym namegen-subs))))))

And finally, the generator. Find a string, pass it through, find a symbol, substitute it, find a list, pick one element and recurse on it.

(defun namegen (sexp)
  "Generate a name from the given sexp generator."
  (cond
   ((null sexp) "")
   ((stringp sexp) sexp)
   ((symbolp sexp) (namegen-select sexp))
   ((listp sexp)
    (concat (if (listp (car sexp)) (namegen (randth (car sexp)))
              (namegen (car sexp)))
            (namegen (cdr sexp))))))

That's it! We can apply it to the expression above,

(namegen '(s ("ith" ("'" C)) V))
-> "rynithi"
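To try the generator without the full substitution table, the sketch below repeats the three functions and binds namegen-subs to a tiny toy table. The entries for V and C in that table are invented for the example and are not the real lists. It also shows that the throw in namegen-select needs a matching catch around the call.

```elisp
(defvar namegen-subs nil
  "Substitutions for the name generator.")

(defun randth (lst)
  "Select random element from the given list."
  (nth (random (length lst)) lst))

(defun namegen-select (sym)
  "Select a replacement for the given symbol."
  (if (null (assoc sym namegen-subs))
      (throw 'bad-symbol
             (concat "Invalid substitution symbol: " (format "%s" sym)))
    (symbol-name (randth (cdr (assoc sym namegen-subs))))))

(defun namegen (sexp)
  "Generate a name from the given sexp generator."
  (cond
   ((null sexp) "")
   ((stringp sexp) sexp)
   ((symbolp sexp) (namegen-select sexp))
   ((listp sexp)
    (concat (if (listp (car sexp)) (namegen (randth (car sexp)))
              (namegen (car sexp)))
            (namegen (cdr sexp))))))

;; Toy table: the entries for V and C are made up for this sketch.
(let ((namegen-subs '((s ryn tor) (V a ei) (C th st))))
  (list
   ;; A random name such as "rynitha" or "tor'stei".
   (namegen '(s ("ith" ("'" C)) V))
   ;; An unknown symbol throws to `bad-symbol'; catch returns the message.
   (catch 'bad-symbol (namegen '(s x)))))
```

The second element of the result is the string "Invalid substitution symbol: x". Without the catch, the throw would signal a no-catch error instead.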

But that's really the easy part. The hard part would be converting the original pattern into the S-expression, which I don't plan on doing right now.

Something else to note: this is thousands of times faster than the Perl version I wrote earlier.

I threw the code in with the rest of my name generation code (namegen.el),

git clone git://github.com/skeeto/fantasyname.git

S-expressions are handy anywhere.

United States Hamiltonian Paths

2009-06-21T00:00:00Z

A while ago I wanted to find every Hamiltonian path in the contiguous 48 states. That is, trips that visit each state exactly once. Writing a program to search for Hamiltonian paths is easy (I did this already). The most time-consuming part was actually putting together the data that specified the graph to be searched. I hope someone somewhere finds it useful. Here is a map for reference,

It took me several passes before I stopped finding errors. I think I have it all right now, but there could still be some mistakes. If you see one, leave a comment and I'll fix it here. Here is the graph as an S-expression alist; the car (first) element in each list is a state, and the cdr (rest) is the unordered list of states that can be reached from it.

((me nh)
 (nh vt ma me)
 (vt ny ma nh)
 (ma ri ct ny nh vt)
 (ny pa nj ma ct vt)
 (ri ma ct)
 (ct ri ma ny)
 (nj pa ny de)
 (de md pa nj)
 (pa nj ny de md wv oh)
 (md pa de va wv)
 (va md wv ky tn nc)
 (nc va tn ga sc)
 (sc nc ga)
 (ga fl sc al nc tn)
 (al ms fl ga tn)
 (ms la ar tn al)
 (tn ms al ga nc va ky mo ar)
 (ky wv va tn mo il in oh)
 (wv md pa oh ky va)
 (oh pa wv ky in mi)
 (fl al ga)
 (mi wi oh in)
 (wi mn ia il mi)
 (il in ky mo ia wi)
 (in oh ky il mi)
 (mo il ky tn ar ok ks ne ia)
 (ar mo tn ms la tx ok)
 (la ms ar tx)
 (tx ok nm ar la)
 (ok ks mo ar tx nm co)
 (ks ok co ne mo)
 (ne sd ia mo ks co wy)
 (sd nd mn ia ne wy mt)
 (nd mt sd mn)
 (ia ne mo il wi mn sd)
 (mn wi ia sd nd)
 (mt id wy sd nd)
 (wy id ut co ne sd mt)
 (co ne ks ok nm ut wy)
 (nm co ok tx az)
 (az nm ut ca nv)
 (ut nv id wy co az)
 (id mt wy ut nv or wa)
 (wa or id)
 (or wa id nv ca)
 (nv or id ut az ca)
 (ca az nv or))

Note that all paths must start or end in Maine because it connects to only one other state.
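As a rough illustration of how the alist can be searched, here is a minimal depth-first sketch. It is not the program mentioned above, and the function names are made up. To keep it fast it runs on only the six New England states, with the New York edges dropped from the entries.

```elisp
(defvar new-england
  '((me nh)
    (nh vt ma me)
    (vt ma nh)
    (ma ri ct nh vt)
    (ri ma ct)
    (ct ri ma))
  "Subgraph of the state alist; edges to ny are removed.")

(defun hampath--search (graph path total)
  "Return every Hamiltonian path in GRAPH that extends PATH.
PATH is the list of states visited so far, most recent first.
TOTAL is the number of states in GRAPH."
  (if (= (length path) total)
      (list (reverse path))
    (let ((results '()))
      ;; Try each unvisited neighbor of the most recent state.
      (dolist (next (cdr (assq (car path) graph)))
        (unless (memq next path)
          (setq results
                (append results
                        (hampath--search graph (cons next path) total)))))
      results)))

(defun hampath-from (graph start)
  "Return all Hamiltonian paths in GRAPH that begin at START."
  (hampath--search graph (list start) (length graph)))

(hampath-from new-england 'me)
;; => ((me nh vt ma ri ct) (me nh vt ma ct ri))
```

The same function accepts the full 48-state alist, but a plain exhaustive search like this one will take far longer there.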

Elisp Wishlist

2009-05-29T00:00:00Z

Update: It looks like all these wishes, except the last one, may actually be coming true! Guile can run Elisp better than Emacs! The idea is that the Elisp engine is replaced with Guile — the GNU project's Scheme implementation designed to be used as an extension language — and written in Scheme is an Elisp compiler that targets Guile's VM. The extension language of Emacs then becomes Scheme, but Emacs is still able to run all the old Elisp code. At the same time Elisp itself, which I'm sure many people will continue to use, gets an upgrade of arbitrary precision, closures, and better performance.

I've been using elisp a lot lately, but unfortunately it's missing a lot of features that one would find in a more standard lisp. The following are some features I wish elisp had. Many of these could be fit into a generic "be more like Scheme or Common Lisp". Some of these features would break the existing mountain of elisp code out there, requiring a massive rewrite, which is likely the main reason they are being held back.

Closures, and maybe continuations. Closures are one of the features I miss the most when writing elisp. They would allow the implementation of Scheme-style lazy evaluation with delay and force, among other neat tools. Continuations would just be a neat thing to have, though they come with a performance penalty.

Closures would also pretty much require Emacs switch to lexical scoping.

Arbitrary precision. Really, any higher order language's numbers should be bignums. Emacs 22 does come with the Calc package which provides arbitrary precision via defmath. Perl does something like this with the bignum module.

Packages/namespaces. Without namespaces, each Emacs package prefixes its functions and variables with its name (e.g. dired-). Some real namespaces would be useful for large projects.

C interface. This is something GNU Emacs will never have because Richard Stallman considers Emacs shared libraries support to be a GPL threat. If Emacs could be dynamically extended some useful libraries could be linked in and exposed to elisp.

Concurrency. If some elisp is being executed Emacs will lock up. This is a particular problem for Gnus. Again, Emacs would really need to switch to lexical scoping before this could happen. Threading would be nice.

Speed. Emacs lisp is pretty slow, even when compiled. Lexical scoping would help with performance (compile time vs. run time binding).

Regex type. I mention this last because I think this would be really cool, and I am not aware of any other lisps that do it. Emacs does regular expressions with strings, which is silly and cumbersome. Backslashes need extra escaping, for example. Instead, I would rather have a regex type like Perl and JavaScript have. So instead of,

(string-match "\\w[0-9]+" "foo525")

we have,

(string-match /\w[0-9]+/ "foo525")

Naturally there would be a regexpp predicate for checking its type. There could also be a function for compiling a regexp from a string into a regexp object. As a bonus, I would also like to use it directly as a function,

(/\w[0-9]+/ "foo525")

I think a regexp type would really give elisp an edge, and would be entirely appropriate for a text editor. It could also be done without breaking anything (keep string-style regexp support).

There is more commentary over at EmacsWiki: Why Does Elisp Suck.

Elisp Running Time Macro

2009-05-28T00:00:00Z

I wanted an elisp macro that could measure the running time of a block of code. Specifically, I wanted it to work like this,

(measure-time
  ...
  body
  ...)

And it would return the running time as seconds in floating point. Well, here's a macro that does it!

;; ID: 6a3f3d99-f0da-329a-c01c-bb6b868f3239
(defmacro measure-time (&rest body)
  "Measure and return the running time of the code block."
  (declare (indent defun))
  (let ((start (make-symbol "start")))
    `(let ((,start (float-time)))
       ,@body
       (- (float-time) ,start))))

It's only good for up to around 18 hours, then the time integer overflows. If only Emacs had arbitrary precision numbers. Here it is in action using my binomial function from last week.

(measure-time
  (nck 20 10)
  (nck 30 7))

Which, just now, returned 3.643713 seconds when executed.
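The make-symbol call is what keeps the macro from colliding with the caller's variables. The sketch below repeats the macro and shows a body that uses its own variable named start; the sleep-for call is just a stand-in workload.

```elisp
(defmacro measure-time (&rest body)
  "Measure and return the running time of the code block."
  (declare (indent defun))
  (let ((start (make-symbol "start")))
    `(let ((,start (float-time)))
       ,@body
       (- (float-time) ,start))))

;; The body may use its own variable named `start' without clashing
;; with the macro's uninterned symbol.
(let ((start 10))
  (measure-time
    (setq start (1+ start)))
  start)   ; => 11

;; The return value is a floating point number of seconds.
(measure-time
  (sleep-for 0.05))
```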

Emacs Web Server

2009-05-17T00:00:00Z

As part of my quest of developing solid knowledge of GNU Emacs lisp, I have implemented a pseudo-HTTP/1.0 web server within Emacs. Behold,

git clone git://github.com/skeeto/emacs-http-server.git

To all other non-emacsen text editors, can your text editor do that?! Ha! Even though elisp is a slow, closure-less, dynamically scoped, ugly cousin of more popular lisps, it's still a lot of fun to write.

To fire it up, load it into Emacs and run the extended command (M-x) httpd-start. By default it will serve files from "~/public_html". To change this, change the variable httpd-root to the desired web root. You can stop the server with httpd-stop.

It's about 200 lines of code and can serve static websites made of small, static files. I say small files because it serves files from buffers, meaning it has to read the entire file in first.

For a simple, text editor based server it can hold up to a pretty decent load. At one point I hit it with 8 wget instances all making rapid recursive downloads and my manual navigation wasn't slowed down noticeably. Despite running in the slow elisp interpreter, I think it can have much better performance by caching commonly served files in buffers.

It should run, unmodified, anywhere a modern Emacs can run, so I expect that it's already very portable. I can imagine it being useful in a situation where someone needs to temporarily host some files but there isn't a web server on the machine. Just grab this script and throw it at Emacs.

Well, it only does IPv4 right now, though I expect IPv6 only requires changing one number (namely, 4 to 6). I don't have any IPv6 systems to test it on.

When writing it I also had security in mind so, as far as I know, it should be safe to use. It cleans up the GET from the client so that no files outside the serving root can be accessed.

The server log is lisp itself. Here is an example log starting the server, serving one request, and halting,

'(log
  (start "Wed May 13 23:33:34 2009")
  (connection
   (date "Wed May 13 23:36:25 2009")
   (address "192.168.0.3")
   (get "/0001.html")
   (req
    ("Referer" "http://192.168.0.2:8080/")
    ("Connection" "keep-alive")
    ("Keep-Alive" "300")
    ("Accept-Charset" "ISO-8859-1,utf-8;q=0.7,*;q=0.7")
    ("Accept-Encoding" "gzip,deflate")
    ("Accept-Language" "en-us,en;q=0.5")
    ("Accept" "image/png,image/*;q=0.8,*/*;q=0.5")
    ("User-Agent" "Mozilla/5.0 [...] Iceweasel/3.0.9 (Debian-3.0.9-1)")
    ("Host" "192.168.0.2:8080")
    ("GET" "/0001.html" "HTTP/1.1"))
   (path "~/public_html/0001.html")
   (status 200))
  (stop "Wed May 13 23:38:17 2009"))

The log is alists of alists, making a hierarchical tree structure that can be explored with some simple lisp functions. Normally this sort of thing is done with XML, but lisp already has its own structured format: lists!
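As a small illustration, the sketch below pulls a few fields out of a trimmed copy of the log using only assq, assoc, and cadr. The names sample-log and log-field are made up for the example.

```elisp
(defvar sample-log
  '(log
    (start "Wed May 13 23:33:34 2009")
    (connection
     (date "Wed May 13 23:36:25 2009")
     (address "192.168.0.3")
     (get "/0001.html")
     (req
      ("Host" "192.168.0.2:8080")
      ("GET" "/0001.html" "HTTP/1.1"))
     (path "~/public_html/0001.html")
     (status 200))
    (stop "Wed May 13 23:38:17 2009"))
  "Trimmed copy of the example server log.")

(defun log-field (key entry)
  "Return the first value stored under KEY in ENTRY.
ENTRY is a list whose car is a label and whose cdr is an alist."
  (cadr (assoc key (cdr entry))))

(let ((conn (assq 'connection (cdr sample-log))))
  (list (log-field 'status conn)
        (log-field 'get conn)
        (log-field "Host" (assq 'req (cdr conn)))))
;; => (200 "/0001.html" "192.168.0.2:8080")
```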

When GET is a directory, it looks for "index.html" and serves that if it exists. More indexes can be added to the variable httpd-indexes. This can actually be done in a special ".htaccess.el" file.

If a ".htaccess.el" exists in the directory from which a file is being served, Emacs will first load/execute it. You see, it's just a lisp program. If you wanted to add a new index file name, the hypertext access file could contain this,

(add-to-list 'httpd-indexes "0001.html")

It's a bit like a .emacs file.

But I think one of the coolest things about having a lisp-based server is that the server can be modified in place without disrupting or restarting it. In my Emacs web server, the only change that requires a restart is changing the server port. In fact, I wrote most of it while the server was running and tested my changes from a browser right as I made them — all on the same instance of the server.

If you want to look into the AI side of this, the server could modify its own code in response to its use.

I also had the idea of creating dynamic websites with elisp, in the same way PHP or Perl does. If a .el file (or .elc) is accessed, the server would pass the GET/POST arguments as an alist to a function in the elisp file. The server would also provide some nifty HTML generation macros. A dynamic script might look like this,

(defun script (get)
  (html
   (head
    (title "My Script"))
   (body
    (h1 "Your Query")
    (p (concat "Your query was "
               (html-sanitize (cdr (assoc "q" get))) "."))))))

However, this is not (yet?) implemented. Just an idea.

I will continue to work on it, though I don't expect to add much more to it. I will mostly improve the code and documentation.