Software Serial Codes

Download the entire source here: serial.el

I was thinking recently about how software serial codes might work in proprietary software. They're a self-consistent series of letters and numbers entered during installation. The software can, without phoning home, verify the validity of the code. This sort of thing is just DRM that relies completely on obscurity for security, so it's only a matter of time before the secret is out, breaking the restriction. But it's still fun to speculate about them.

In general, serial codes have these two properties.

  1. Self-consistent. The serial code can be validated without querying into a database. There is a mathematical consistency, using either a secret algorithm or secret values. Imagine the set of all possible serial numbers (the key space), valid and invalid, and valid serial numbers are distributed pseudo-randomly throughout that set.
  2. Short. Short enough for a human to manage manually. They're generally typed in by hand.

Sometimes serial codes are an entire file, one that would take hours to type in by hand, but I'm not considering those right now. They have a trivial solution: use a cryptographic digital signature, because length isn't an issue. The validation algorithm just checks the signature against the key on file.

A digital signature is the obvious solution for a serial code, but to protect these against even the shallowest brute-force attacks it requires using numbers long enough to break criteria #2.

Here's how I would do it. A serial number consists of two parts: a code and a checksum. The code is some random number. The checksum is a number derived from the code.

When encoded, the code is first and the checksum is second.

For my checksum I'm just going to use a hash function: mix the code component and some secret password together, then hash it to produce the checksum. To generate a serial number all we have to do is generate a random number as our code, generate the checksum from it, then encode them together.

Since we want to keep it short, we have to choose how many characters of the serial code to dedicate to each component. It's like a slider we can slide between each side of the encoded serial number. As we allocate more characters to the code component, the density of valid serial numbers in the key space increases. If we go all the way and shrink the checksum to zero characters, then every possible serial number is valid. The denser the valid serial numbers, the easier it is to stumble across one during brute force.

In the other direction, the more characters we allocate to the checksum the more distinct the serial numbers become. If we allocate all characters to the checksum, the other extreme, then no serial number is valid. The more distinct the serial numbers are, the easier they are to find by brute force.

The difficulty of finding a valid serial number by brute force at various split positions is going to look something like this. The exact curve is a function of the checksum algorithm and the knowledge of the attacker.

The plot of difficulty vs. split position is shaped like a U.

Since it looks like this, I'm just going to split it right in the middle: allocate half the characters to the code and half to the checksum. To begin, I'm going to encode the serial number as hexadecimal digits. First let's define a little bit about what our serial codes look like. I'm going to implement this in my favorite try-stuff language, Elisp.

(defvar serial-password "password"
  "Secret password used in serial generation and checks.")

(defvar serial-code-len 12
  "Length of the code component of the serial number.")

(defvar serial-check-len 12
  "Length of the check component of the serial number.")

I decided to use 12 characters, 6 bytes, to represent each component. This makes for a 24-character serial code. A bit long, but we'll use a trick to shorten that later.

Next, define the hash and mixing functions. For the hash I'm just going to use SHA1 and truncate the output. For the password mixing function, we're just going to concatenate them. There are stronger ways to perform this mix, such as with HMAC, but concat is good enough for a DRM scheme.

(defalias 'serial-hash 'sha1
  "Hash function for generating serial codes.")

(defalias 'serial-mixer 'concat
  "Function for mixing the password and the code part of the serial number.")

Below, first function is just a random number generator. It could be fancier, like the UUID generating function, but this is good enough. The compose function was defined in a previous post.

(defun serial-gen-code-hex ()
  "Generate a the code component of a new serial number."
  (substring (serial-hash (mapconcat (compose 'prin1-to-string 'random)
                                     (make-list serial-code-len 256) ""))
             0 serial-code-len))

(defun serial-gen-hex ()
  "Generate a hex-encoded serial number."
  (let* ((code (serial-gen-code-hex))
         (check (substring (serial-hash (serial-mixer serial-password code))
                           0 serial-check-len)))
    (concat code check)))

The serial codes it generates look like this.

34ef80635ca7340f1f10e170
8471cc8e87378f15377cf038
2ba0b74bfb86262d5400264e
5e02185457970a0b89656e9b
cf0d611512f1aacb22d8095d

Next we need a function that accepts these numbers and checks their self-consistency. Simple enough: extract the code component, generate the checksum, and check against the checksum component.

(defun serial-check-hex (serial)
  "Return t if the given hex-encoded serial is valid."
  (let ((code (substring serial 0 serial-code-len))
        (check (substring serial (- serial-check-len))))
    (string= check (substring (serial-hash (serial-mixer serial-password code))
                              0 serial-check-len))))

This quick test below always returns true, so it's working at least partially right.

(serial-check-hex (serial-gen-hex))

Those serial numbers could be shorter because we're limiting ourselves to 0-9 and a-f. Instead, we could base64 encode the serial numbers to cram more information in each character.

(defun serial-hex-to-bytes (hex)
  "Convert hexadecimal string to string of bytes."
  (if (= 0 (length hex))
      ""
    (concat (char-to-string (string-to-number (substring hex 0 2) 16))
            (serial-hex-to-bytes (substring hex 2)))))

(defun serial-gen ()
  "Generate a base64 encoded serial number."
  (base64-encode-string (serial-hex-to-bytes (serial-gen-hex))))

(defun serial-check (serial)
  "Return t if base64 serial number is valid."
  (serial-check-hex (mapconcat (apply-partially 'format "%02x")
                               (base64-decode-string serial) "")))

Now the serial codes from serial-gen look like this.

Nsu8TXFC4IdvlBxc
b5HduiyZmNmcq4fQ
vUd6KuIyDgmBgDWW
dRMW6/rctm2Z5TKn
ol4cP+5b3Y7vqWEZ

One problem with these is it's easy to mix up the zeros, Os, ones, and ls. A filter could be added to the generator to reject and regenerate any serial number containing these confusing characters, but that's outside the scope of this post.

Have a comment on this article? Start a discussion in my public inbox by sending an email to ~skeeto/public-inbox@lists.sr.ht [mailing list etiquette] , or see existing discussions.

This post has archived comments.

null program

Chris Wellons

wellons@nullprogram.com (PGP)
~skeeto/public-inbox@lists.sr.ht (view)