nullprogram.com/blog/2021/09/15/
This article was discussed on Hacker News.
Years ago, OpenBSD gained two new security system calls, pledge(2)
(originally tame(2)) and unveil. In both, an application
surrenders capabilities at run-time. The idea is to perform initialization
like usual, then drop capabilities before handling untrusted input,
limiting unwanted side effects. This feature is applicable even where type
safety isn’t an issue, such as Python, where a program might still get
tricked into accessing sensitive files or making network connections when
it shouldn’t. So how can a Python program access these system calls?
As discussed previously, it’s quite easy to access C APIs from
Python through its ctypes package, and this is no exception.
In this article I show how to do it. Here’s the full source if you want to
dive in: openbsd.py.
I’ve chosen these extra constraints:
- As extra safety features, unnecessary for correctness, attempts to call
  these functions on systems where they don’t exist will silently do
  nothing, as though they succeeded. They’re provided as a best effort.
- Systems other than OpenBSD may support these functions, now or in the
  future, and it would be nice to automatically make use of them when
  available. This means no checking for OpenBSD specifically but instead
  feature sniffing for their presence.
- The interfaces should be Pythonic as though they were implemented in
  Python itself. Raise exceptions for errors, and accept strings since
  they’re more convenient than bytes.
For reference, here are the function prototypes:
int pledge(const char *promises, const char *execpromises);
int unveil(const char *path, const char *permissions);
The string-oriented interface of pledge will make this a whole
lot easier to implement.
Finding the functions
The first step is to grab functions through ctypes. Like a lot of Python
documentation, this area is frustratingly imprecise and under-documented.
I want to grab a handle to the already-linked libc and search for either
function. However, getting that handle is a little different on each
platform, and in the process I saw four different exceptions, only one of
which is documented.
I came up with passing None to ctypes.CDLL, which ultimately just passes
NULL to dlopen(3). That’s really all I wanted. Currently on
Windows this is a TypeError. Once the handle is in hand, try to access the
pledge attribute, which will fail with AttributeError if it doesn’t
exist. In the event of any exception, just assume the behavior isn’t
available. If found, I also define the function prototype for ctypes.
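# Imports used by the snippets in this article
import ctypes
import os
from typing import Optional, Union

_pledge = None
try:
    _pledge = ctypes.CDLL(None, use_errno=True).pledge
    _pledge.restype = ctypes.c_int
    _pledge.argtypes = ctypes.c_char_p, ctypes.c_char_p
except Exception:
    _pledge = None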
Catching a broad Exception isn’t great, but it’s the best we can do since
the documentation is incomplete. From this block I’ve seen TypeError,
AttributeError, FileNotFoundError, and OSError. I wouldn’t be surprised if
there are more possibilities, and I don’t want to risk missing them.
Note that I’m catching Exception rather than using a bare except. My
code will not catch KeyboardInterrupt nor SystemExit. This is deliberate,
and I never want to catch these.
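The distinction follows from Python's exception hierarchy, which a quick check can confirm. This is only an illustrative sketch; the helper name `caught_by_except_exception` is made up for this example.

```python
def caught_by_except_exception(exc_type):
    """Report whether `except Exception` would catch this exception type."""
    return issubclass(exc_type, Exception)


# The four exceptions observed while probing for the functions are all caught.
for exc_type in (TypeError, AttributeError, FileNotFoundError, OSError):
    assert caught_by_except_exception(exc_type)

# These derive directly from BaseException, so they pass through untouched.
for exc_type in (KeyboardInterrupt, SystemExit):
    assert not caught_by_except_exception(exc_type)
```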
The same story for unveil:
_unveil = None
try:
    _unveil = ctypes.CDLL(None, use_errno=True).unveil
    _unveil.restype = ctypes.c_int
    _unveil.argtypes = ctypes.c_char_p, ctypes.c_char_p
except Exception:
    _unveil = None
Pythonic wrappers
The next and final step is to wrap the low-level calls in an interface that
hides their C and ctypes nature.
Python strings must be encoded to bytes before they can be passed to C
functions. Rather than make the caller worry about this, we’ll let them
pass friendly strings and have the wrapper do the conversion. Either may
also be NULL, so None is allowed.
def pledge(promises: Optional[str], execpromises: Optional[str]):
    if not _pledge:
        return  # unimplemented
    r = _pledge(None if promises is None else promises.encode(),
                None if execpromises is None else execpromises.encode())
    if r == -1:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
As usual, a return of -1 means there was an error, in which case we fetch
errno and raise the appropriate OSError.
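One detail that makes this pattern pleasant: when OSError is constructed with an error number, Python returns the matching subclass where one exists, so callers can catch specific exceptions. A small sketch, where `make_oserror` is a hypothetical helper that mirrors the raise statement above:

```python
import errno
import os


def make_oserror(code):
    """Build the exception the same way the wrappers do."""
    return OSError(code, os.strerror(code))


# Error codes with a dedicated subclass produce that subclass...
assert type(make_oserror(errno.ENOENT)) is FileNotFoundError
assert type(make_oserror(errno.EACCES)) is PermissionError

# ...while others, such as EINVAL, remain a plain OSError.
err = make_oserror(errno.EINVAL)
assert type(err) is OSError and err.errno == errno.EINVAL
```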
unveil works a little differently since the first argument is a path.
Python functions that accept paths, such as open, generally accept
either strings or bytes. On unix-like systems, paths are fundamentally
bytestrings and not necessarily Unicode, so it’s necessary to accept
bytes. Since strings are nearly always more convenient, they take both.
The unveil wrapper here will do the same. If it’s a string, encode it,
otherwise pass it straight through.
def unveil(path: Union[str, bytes, None], permissions: Optional[str]):
    if not _unveil:
        return  # unimplemented
    r = _unveil(path.encode() if isinstance(path, str) else path,
                None if permissions is None else permissions.encode())
    if r == -1:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
That’s it!
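A possible variation, not what the wrapper above does: the standard library's `os.fsencode` converts paths the same way Python's own file functions do, and it also accepts `pathlib` objects. The helper name `encode_path` is an assumption for this sketch.

```python
import os


def encode_path(path):
    """Convert a path-like argument to bytes, passing None through as NULL."""
    if path is None:
        return None
    # os.fsencode accepts str, bytes, and os.PathLike objects, and encodes
    # str using the file system encoding rather than always using UTF-8.
    return os.fsencode(path)
```

The trade-off is a slightly wider interface than the type annotations above advertise, so whether it is worth it depends on how strictly the wrapper should mirror the C prototype.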
Trying it out
Let’s start with unveil. Initially a process has access to the whole
file system with the usual restrictions. On the first call to unveil
it’s immediately restricted to some subset of the tree. Each call reveals
a little more until a final NULL which locks it in place for the rest of
the process’s existence.
Suppose a program has been tricked into accessing your shell history,
perhaps by mishandling a path:
import pathlib

def hackme():
    try:
        with open(pathlib.Path.home() / ".bash_history"):
            print("You've been hacked!")
    except FileNotFoundError:
        print("Blocked by unveil.")

hackme()
If you’re a Bash user, this prints:
You've been hacked!
Using our new feature to restrict the program’s access first:
# restrict access to static program data
unveil("/usr/share", "r")
unveil(None, None)
hackme()
On OpenBSD this now prints:
Blocked by unveil.
Working just as it should!
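The reveal-then-lock sequence generalizes naturally to an allow-list. Below is a minimal sketch under stated assumptions: `unveil_only` is a hypothetical helper, the paths are placeholders, and the `unveil` wrapper is passed in as a parameter so the example can be exercised with a stand-in on any system.

```python
def unveil_only(allowed, unveil):
    """Reveal each path in `allowed` with its permissions, then lock.

    `allowed` maps paths to permission strings such as "r" or "rw".
    `unveil` is the wrapper function to call, passed in so this sketch
    stands alone.
    """
    for path, permissions in allowed.items():
        unveil(path, permissions)
    unveil(None, None)  # the final NULL call locks the view in place


# Demonstration with a stand-in that only records its arguments.
calls = []
unveil_only({"/usr/share": "r", "/var/log/myapp": "rw"},
            lambda path, permissions: calls.append((path, permissions)))
```

In a real program the second argument would be the `unveil` wrapper defined earlier.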
With pledge we declare what abilities we’d like to keep by supplying a
list of promises, pledging to use only those abilities afterward. A
common case is the stdio promise which allows reading and writing of
open files, but not opening files. A program might open its log file,
then drop the ability to open files while retaining the ability to write
to its log.
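The log file scenario might look roughly like the following. This is a sketch, not code from the article: `process_requests` is a hypothetical function, and the `pledge` wrapper is passed in as a parameter so the ordering can be demonstrated anywhere. Whether `stdio` alone is enough for a real Python program is not something this sketch establishes.

```python
def process_requests(log_path, requests, pledge):
    """Open the log, drop abilities, then handle untrusted input."""
    # Initialization: opening files is still permitted here.
    with open(log_path, "a") as log:
        # Keep only the ability to use already-open files.
        pledge("stdio", None)
        for request in requests:
            # Writing to the open log remains allowed; on OpenBSD a new
            # open() at this point would violate the pledge.
            log.write(f"handled: {request!r}\n")
```

The important part is the ordering: everything that needs to open files happens before the call to `pledge`, and untrusted input is only touched afterward.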
An invalid or unknown promise is an error. Does that work?
>>> pledge("doesntexist", None)
OSError: [Errno 22] Invalid argument
So far so good. How about the functionality itself?
pledge("stdio", None)
hackme()
The program is instantly killed, leaving a core dump, when it makes the
disallowed system call.
If you want something a little softer, include the error promise:
pledge("stdio error", None)
hackme()
Instead it’s an exception, which is a lot easier to debug in Python:
OSError: [Errno 78] Function not implemented
The core dump isn’t going to be much help to a Python program, so you
probably always want to use this promise. In general you need to be extra
careful about pledge in complex runtimes like Python’s which may
reasonably need to do many arbitrary, undocumented things at any time.