Emacs Lisp Object Finalizers

Update: Emacs 25.1 (released Sept. 2016) formally introduced finalizers to Emacs Lisp. This article is left here for historical purposes.

Problem: You have a special resource, such as a buffer or process, that is associated with an Emacs Lisp object but is not itself managed by the garbage collector. You want this resource to be cleaned up when the owning Lisp object is garbage collected. Unlike some other languages, Elisp doesn’t provide finalizers for this job, so what do you do?

Solution: This is Emacs Lisp. We can just add this feature to the language ourselves!

I’ve already implemented this feature as a package called finalize, available on MELPA. I will be using it as part of a larger, upcoming project.

In this article I will describe how it works.

Processes and Buffers

Processes and buffers are special types of objects. Immediately after instantiation these objects are added to a global list. They will never become unreachable without explicitly being killed. The garbage collector will never manage them for you.
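This is easy to observe directly. The following is a minimal sketch, with an arbitrary buffer name and hypothetical variable name: it creates a buffer, keeps only the buffer's name rather than the buffer object, and forces a collection.

```elisp
;; Create a buffer and keep only its name, not the buffer object itself.
(defvar demo-leaked-name
  (buffer-name (generate-new-buffer " *leak-demo*")))

(garbage-collect)

;; The buffer is still reachable through Emacs' global buffer list.
(buffer-live-p (get-buffer demo-leaked-name))
;; => t

;; It only goes away when killed explicitly.
(kill-buffer demo-leaked-name)
(get-buffer demo-leaked-name)
;; => nil
```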

This is a problem for APIs like those provided by the url package. The functions url-retrieve and url-retrieve-synchronously create buffers and hand them back to their callers. Ownership is transferred to the caller and the caller must be careful to kill the buffer, or transfer ownership again, before it returns. Otherwise the buffer is “leaked.” The url package tries to manage this a little bit with url-gc-dead-buffers, but this can’t be relied upon.

Another issue is when a process is started and is stored in a struct or some other kind of object. There is probably a “close” function that accepts one of these structs and kills the process. But if that function isn’t called, due to a bug or an error condition, it will become a “dangling” process. If the struct is completely lost, it will probably be inconvenient to deal with the process — the “close” function is no longer useful.

With Macros

A common way to deal with this problem is using a with- macro. This macro establishes a resource, evaluates a body, and ensures the resource is properly cleaned up regardless of the body’s termination state. The latter is accomplished using unwind-protect. For example, with-temp-buffer:

;; Fetch the first 10 bytes of foo.txt
(with-temp-buffer
  (insert-file-contents "foo.txt" nil 0 10)
  (buffer-string))

This expands (roughly) to the following expression.

(let ((temp-buffer (generate-new-buffer "*temp*")))
  (with-current-buffer temp-buffer
    (unwind-protect
        (progn
          (insert-file-contents "foo.txt" nil 0 10)
          (buffer-string))
      (and (buffer-live-p temp-buffer)
           (kill-buffer temp-buffer)))))

For dealing with open files, Common Lisp has with-open-stream. It establishes a binding for a new stream over its body and ensures the stream is closed when the body is complete. There’s no chance for a stream to be left open, leaking a system resource.
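The same pattern carries over to other resources. Here is a minimal sketch of a with- macro for a subprocess. The macro name demo-with-process and the layout of its first argument are hypothetical, not part of Emacs or of any package mentioned here.

```elisp
(defmacro demo-with-process (spec &rest body)
  "Run BODY with a fresh subprocess, then delete the process.
SPEC is (VAR NAME PROGRAM ARGS...).  VAR is bound to the process
over BODY.  The macro name and SPEC layout are hypothetical."
  (declare (indent 1))
  (let ((var (nth 0 spec))
        (name (nth 1 spec))
        (command (nthcdr 2 spec)))
    `(let ((,var (start-process ,name nil ,@command)))
       (unwind-protect
           (progn ,@body)
         ;; Runs on normal return, error, or non-local exit.
         (delete-process ,var)))))

;; Usage, assuming a `sleep' program is on the PATH:
;;
;;   (demo-with-process (proc "demo-sleep" "sleep" "60")
;;     (process-live-p proc))
```

The process cannot outlive the body, which is exactly the guarantee with-temp-buffer gives for its buffer.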

However, with- macros aren’t useful in asynchronous situations. In Emacs this would be the case for asynchronous sub-processes, such as an attached language interpreter. The extent of the process goes beyond a single body.

Finalizers

What would really be useful is to have a callback — a finalizer — that runs when an object is garbage collected. This ensures that the resource will not outlive its owner, restoring management back to the garbage collector. However, Emacs provides no such hook.

Fortunately this feature can be built using weak hash tables and the post-gc-hook, a list of functions that are run immediately after garbage collection.

Weak References

I’ve discussed before how to create weak references in Elisp. The only weak references in Emacs are built into weak hash tables. Normally the language provides weak references first and hash tables are built on top of them. With Emacs we do this backwards.

The make-hash-table function accepts a keyword argument :weakness to specify how strongly keys and values should be held by the table. To make a weak reference just create a hash table of size 1 and set :weakness to t.

(defun weak-ref (thing)
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (prog1 ref
      (setf (gethash t ref) thing))))

(defun deref (ref)
  (gethash t ref))

The same trick can be used to detect when an object is garbage collected. If the result of deref is nil, then the object was garbage collected. (Or the weakly-referenced object is nil, but this object will never be garbage collected anyway.)

To check if we need to run a finalizer all we have to do is create a weak reference to the object, then check the reference after garbage collection. This check can be done in a post-gc-hook function.
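To see the difference between a held and a dropped object, here is a small self-contained sketch. The demo- names are hypothetical stand-ins for weak-ref and deref above, repeated so the snippet runs on its own.

```elisp
(defun demo-weak-ref (thing)
  "Return a weak reference to THING (a one-entry weak hash table)."
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (puthash t thing ref)
    ref))

(defun demo-deref (ref)
  "Return the referent of REF, or nil if it has been collected."
  (gethash t ref))

;; A strongly held object survives any number of collections.
(defvar demo-held (list 'still 'here))
(defvar demo-held-ref (demo-weak-ref demo-held))

;; An object with no other references is eligible for collection.
(defvar demo-dropped-ref (demo-weak-ref (list 'soon 'gone)))

(garbage-collect)

(eq (demo-deref demo-held-ref) demo-held)
;; => t

;; Usually nil by now, but not guaranteed: Emacs scans the C stack
;; conservatively, so a stale reference may keep the object alive
;; for a while longer.
(demo-deref demo-dropped-ref)
```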

Registration

To avoid cluttering up post-gc-hook with one closure per object we’ll keep a register of all watched objects.

(defvar finalizable-objects ())

(defun register (object callback)
  (push (cons (weak-ref object) callback) finalizable-objects))

Now a function to check for missing objects, try-finalize.

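(require 'cl-lib)

(defun try-finalize ()
  ;; Split the register by whether each weak reference still resolves.
  (let ((alive (cl-remove-if-not #'deref finalizable-objects :key #'car))
        (dead (cl-remove-if #'deref finalizable-objects :key #'car)))
    (setf finalizable-objects alive)
    ;; Run the callback of every object that was collected.
    (mapc #'funcall (mapcar #'cdr dead))))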

(add-hook 'post-gc-hook #'try-finalize)

Now to try it out. Create a process, stuff it in a vector (like a defstruct), register delete-process as a finalizer, and, for the sake of demonstration, immediately forget the vector.

;;; -*- lexical-binding: t; -*-
(let ((process (start-process "ping" nil "ping" "localhost")))
  (register (vector process) (lambda () (delete-process process))))

;; Assuming the garbage collector has not already run.
(get-process "ping")
;; => #<process ping>

;; Force garbage collection.
(garbage-collect)

(get-process "ping")
;; => nil

The garbage collector killed the process for us!

There are some problems with this implementation. Using cl-remove-if is unwise in a post-gc-hook function. It allocates lots of new cons cells but garbage collection is inhibited while the function is run. The docstring warns us:

Garbage collection is inhibited while the hook functions run, so be careful writing them.

Similarly, all of the finalizers are run within the context of this memory-sensitive hook. Instead they should be delayed until the next evaluation turn (i.e. run-at-time of 0). Some of the finalizers could also fail, which would cause the remaining finalizers to never run. The real implementation deals with all of these issues.
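As an illustration of how those three issues might be addressed, here is a sketch of a more careful hook. It is not the code from the finalize package; all demo- names are hypothetical. It relinks the existing cons cells instead of building new lists, defers the callbacks with run-at-time, and isolates each callback with condition-case.

```elisp
(defvar demo-finalizable-objects ()
  "List of (WEAK-REF . THUNK) pairs, laid out like the register above.")

(defun demo-run-finalizers (entries)
  "Call the thunk of each entry in ENTRIES, isolating failures."
  (dolist (entry entries)
    (condition-case err
        (funcall (cdr entry))
      ;; One broken finalizer must not stop the rest.
      (error (message "Finalizer failed: %S" err)))))

(defun demo-try-finalize ()
  "Unlink entries whose object is gone and schedule their thunks.
Existing cons cells are relinked in place, so the scan itself
creates no new cons cells; only the timer is newly allocated."
  (let ((dead nil)
        (prev nil)
        (cell demo-finalizable-objects))
    (while cell
      (let ((next (cdr cell)))
        (if (gethash t (car (car cell)))
            (setq prev cell)            ; still alive, keep it
          ;; Splice the cell out of the live list...
          (if prev
              (setcdr prev next)
            (setq demo-finalizable-objects next))
          ;; ...and push the same cell onto the dead list.
          (setcdr cell dead)
          (setq dead cell))
        (setq cell next)))
    (when dead
      ;; Defer the real work until the hook has returned.
      (run-at-time 0 nil #'demo-run-finalizers dead))))

(add-hook 'post-gc-hook #'demo-try-finalize)
```

Passing the dead list as a timer argument, rather than closing over it, keeps the sketch independent of lexical-binding.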

A major drawback to these Emacs Lisp finalizers compared to other languages is that the actual object is not available. We don’t know it’s getting collected until after it’s already gone. This solves the object resurrection problem, but it’s darn inconvenient. One possible workaround in the case of defstructs and EIEIO objects is to make a copy of the original object (copy-sequence or clone) and run the finalizer on the copy as if it were the original.
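The copy workaround could look something like the following sketch for sequence-like objects. The helper name demo-finalizer-on-copy is hypothetical. The important detail is that the returned thunk holds only the copy, never the original, since a strong reference to the original would prevent it from ever being collected.

```elisp
(defun demo-finalizer-on-copy (object finalizer)
  "Return a thunk that calls FINALIZER with a shallow copy of OBJECT.
The thunk holds the copy, never OBJECT itself, so it does not
keep OBJECT alive."
  (apply-partially finalizer (copy-sequence object)))

;; Hypothetical usage with the `register' function from earlier:
;;
;;   (register object
;;             (demo-finalizer-on-copy
;;              object
;;              (lambda (copy) (delete-process (aref copy 0)))))
```

The copy is shallow, so slots such as the process are shared with the original, which is what the finalizer needs.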

The Real Implementation

The real implementation is more carefully namespaced and its API has just one function: finalize-register. It works just like register above but it accepts &rest arguments to be passed to the finalizer. This makes the registration call simpler and avoids some significant problems with closures.

(let ((process (start-process "ping" nil "ping" "localhost")))
  (finalize-register (vector process) #'delete-process process))
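For illustration only, here is one way a registration function with &rest arguments might be structured. This is a sketch under my own assumptions, not the package's source, and the demo- names are hypothetical. One closure problem it sidesteps: without lexical-binding, a lambda that mentions process would not capture it at all, while storing the arguments explicitly works either way.

```elisp
(defvar demo-finalizers ()
  "List of (WEAK-REF FUNCTION . ARGS) entries.")

(defun demo-finalize-register (object function &rest args)
  "Arrange for FUNCTION to be applied to ARGS once OBJECT is collected.
ARGS are held strongly, so OBJECT itself must not be among them."
  (let ((ref (make-hash-table :size 1 :weakness t :test 'eq)))
    (puthash t object ref)
    (push (cons ref (cons function args)) demo-finalizers)
    nil))

(defun demo-run-entry (entry)
  "Apply the stored function of ENTRY to its stored arguments."
  (apply (cadr entry) (cddr entry)))
```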

Here’s a more formal example of how it might really be used.

(cl-defstruct (pinger (:constructor pinger--create))
  process host)

(defun pinger-create (host)
  (let* ((process (start-process "pinger" nil "ping" host))
         (object (pinger--create :process process :host host)))
    (finalize-register object #'delete-process process)
    object))
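On Emacs 25.1 and later, the built-in make-finalizer mentioned in the update note can fill the same role without any package. The sketch below is an assumption about how one might adapt the example, with hypothetical demo- names: the finalizer object lives in a slot of the struct, so it becomes unreachable together with the struct, and its function refers only to the process.

```elisp
(require 'cl-lib)

(cl-defstruct (demo-pinger (:constructor demo-pinger--create))
  process host finalizer)

(defun demo-pinger-create (host)
  "Start pinging HOST; the process dies when the struct is collected."
  (let ((process (start-process "pinger" nil "ping" host)))
    (demo-pinger--create
     :process process
     :host host
     ;; The function must not reference the struct itself, or the
     ;; struct would never become unreachable.
     :finalizer (make-finalizer
                 (apply-partially #'delete-process process)))))
```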

To make things cleaner for EIEIO classes there’s also a finalizable mixin class that ensures the finalize generic function is called on a copy of the object (the original object is gone) when it’s garbage collected.

Here’s how it would be used for the same “pinger” concept, this time as an EIEIO class. An advantage here is that anyone can manually call finalize early if desired.

(require 'eieio)
(require 'finalizable)

(defclass pinger (finalizable)
  ((process :initarg :process :reader pinger-process)
   (host :initarg :host :reader pinger-host)))

(defun pinger-create (host)
  (make-instance 'pinger
                 :process (start-process "ping" nil "ping" host)
                 :host host))

(defmethod finalize ((pinger pinger))
  (delete-process (pinger-process pinger)))

It’s a small package but I think it can be quite handy.


null program

Chris Wellons
