nullprogram.com/blog/2019/07/10/
tl;dr: passphrase2pgp.
One of my long-term concerns has been losing my core cryptographic keys,
or just not having access to them when I need them. I keep my important
data backed up, and if that data is private then I store it encrypted.
My keys are private, but how am I supposed to encrypt them? The chicken
or the egg?
The OpenPGP solution is to (optionally) encrypt secret keys using a key
derived from a passphrase. GnuPG prompts the user for this passphrase
when generating keys and when using secret keys. This protects the keys
at rest, and, with some caution, they can be included as part of regular
backups. The OpenPGP specification, RFC 4880 has many options
for deriving a key from this passphrase, called String-to-Key, or S2K,
algorithms. None of the options are great.
In 2012, I selected the strongest S2K configuration at the time and,
along with a very strong passphrase, put my GnuPG keyring on the
internet as part of my public dotfiles repository. It was a
kind of super-backup that would guarantee their availability anywhere
I’d need them.
My timing was bad because, with the release of GnuPG 2.1 in 2014, GnuPG
fundamentally changed its secret keyring format. S2K options are now
(quietly!) ignored when deriving the protection keys. Instead it
auto-calibrates to much weaker settings. With this new version of GnuPG,
I could no longer update the keyring in my dotfiles repository without
significantly downgrading its protection.
By 2017 I was pretty irritated with the whole situation. I let my
OpenPGP keys expire, and then I wrote my own tool to replace
the only feature of GnuPG I was actively using: encrypting my backups
with asymmetric encryption. One of its core features is that the
asymmetric keypair can be derived from a passphrase using a memory-hard
key derivation function (KDF). Attackers must commit a significant
quantity of memory (expensive) when attempting to crack the passphrase,
making the passphrase that much more effective.
Since the asymmetric keys themselves, not just the keys protecting them,
are derived from a passphrase, I never need to back them up! They’re
also always available whenever I need them. My keys are essentially
stored entirely in my brain as if I was a character in a William
Gibson story.
Tackling OpenPGP key generation
At the time I had expressed my interest in having this feature for
OpenPGP keys. It’s something I’ve wanted for a long time. I first took
a crack at it in 2013 (now the the old-version
branch) for
generating RSA keys. RSA isn’t that complicated but it’s very
easy to screw up. Since I was rolling it from scratch, I didn’t
really trust myself not to subtly get it wrong. Plus I never figured out
how to self-sign the key. GnuPG doesn’t accept secret keys that aren’t
self-signed, so it was never useful.
I took another crack at it in 2018 with a much more brute force
approach. When a program needs to generate keys, it will either read
from /dev/u?random
or, on more modern systems, call getentropy(3)
.
These are all ultimately system calls, and I know how to intercept
those with Ptrace. If I want to control key generation for any
program, not just GnuPG, I could intercept these inputs and replace them
with the output of a CSPRNG keyed by a passphrase.
Keyed: Linux Entropy Interception
In practice this doesn’t work at all. Real programs like GnuPG and
OpenSSH’s ssh-keygen
don’t rely solely on these entropy inputs. They
also grab entropy from other places, like getpid(2)
,
gettimeofday(2)
, and even extract their own scheduler and execution
time noise. Without modifying these programs I couldn’t realistically
control their key generation.
Besides, even if it did work, it would still be fragile and unreliable
since these programs could always change how they use the inputs. So,
ultimately, it was more of an experiment than something practical.
passphrase2pgp
For regular readers, it’s probably obvious that I recently learned
Go. While searching for good projects idea for cutting my teeth, I
noticed that Go’s “extended” standard library has a lot of useful
cryptographic support, so the idea of generating the keys myself may be
worth revisiting.
Something else also happened since my previous attempt: The OpenPGP
ecosystem now has widespread support for elliptic curve cryptography. So
instead of RSA, I could generate a Curve25519 keypair, which, by design,
is basically impossible to screw up. Not only would I be generating
keys on my own terms, I’d being doing it in style, baby.
There are two different ways to use Curve25519:
- Digital signatures: Ed25519 (EdDSA)
- Diffie–Hellman (encryption): X25519 (ECDH)
In GnuPG terms, the first would be a “sign only” key and the second is
an “encrypt only” key. But can’t you usually do both after you generate
a new OpenPGP key? If you’ve used GnuPG, you’ve probably seen the terms
“primary key” and “subkey”, but you probably haven’t had think about
them since it’s all usually automated.
The primary key is the one associated directly with your identity.
It’s always a signature key. The OpenPGP specification says this is a
signature key only by convention, but, practically speaking, it really
must be since signatures is what holds everything together. Like
packaging tape.
If you want to use encryption, independently generate an encryption key,
then sign that key with the primary key, binding that key as a subkey
to the primary key. This all happens automatically with GnuPG.
Fun fact: Two different primary keys can have the same subkey. Anyone
could even bind any of your subkeys to their primary key! They only need
to sign the public key! Though, of course, they couldn’t actually use
your key since they’d lack the secret key. It would just be really
confusing, and could, perhaps in certain situations, even cause some
OpenPGP clients to malfunction. (Note to self: This demands
investigation!)
It’s also possible to have signature subkeys. What good is that?
Paranoid folks will keep their primary key only on a secure, air-gapped,
then use only subkeys on regular systems. The subkeys can be revoked and
replaced independently of the primary key if something were to go wrong.
In Go, generating an X25519 key pair is this simple (yes, it actually
takes array pointers, which is rather weird):
package main
import (
"crypto/rand"
"fmt"
"golang.org/x/crypto/curve25519"
)
func main() {
var seckey, pubkey [32]byte
rand.Read(seckey[:]) // FIXME: check for error
seckey[0] &= 248
seckey[31] &= 127
seckey[31] |= 64
curve25519.ScalarBaseMult(&pubkey, &seckey)
fmt.Printf("pub %x\n", pubkey[:])
fmt.Printf("sec %x\n", seckey[:])
}
The three bitwise operations are optional since it will do these
internally, but it ensures that the secret key is in its canonical form.
The actual Diffie–Hellman exchange requires just one more function call:
curve25519.ScalarMult()
.
For Ed25519, the API is higher-level:
package main
import (
"crypto/rand"
"fmt"
"golang.org/x/crypto/ed25519"
)
func main() {
seed := make([]byte, ed25519.SeedSize)
rand.Read(seed) // FIXME: check for error
key := ed25519.NewKeyFromSeed(seed)
fmt.Printf("pub %x\n", key[32:])
fmt.Printf("sec %x\n", key[:32])
}
Signing a message with this key is just one function call:
ed25519.Sign()
.
Unfortunately that’s the easy part. The other 400 lines of the real
program are concerned only with encoding these values in the complex
OpenPGP format. That’s the hard part. GnuPG’s --list-packets
option
was really useful for debugging this part.
OpenPGP specification
(Feel free to skip this section if the OpenPGP wire format isn’t
interesting to you.)
Following the specification was a real challenge, especially since many
of the details for Curve25519 only appear in still incomplete (and still
erroneous) updates to the specification. I certainly don’t envy the
people who have to parse arbitrary OpenPGP packets. It’s finicky and has
arbitrary parts that don’t seem to serve any purpose, such as redundant
prefix and suffix bytes on signature inputs. Fortunately I only had to
worry about the subset that represents an unencrypted secret key export.
OpenPGP data is broken up into packets. Each packet begins with a tag
identifying its type, followed by a length, which itself is a variable
length. All the packets produced by passphrase2pgp are short, so I could
pretend lengths were all a single byte long.
For a secret key export with one subkey, we need the following packets
in this order:
- Secret-Key: Public-Key packet with secret key appended
- User ID: just a length-prefixed, UTF-8 string
- Signature: binds Public-Key packet (1) and User ID packet (2)
- Secret-Subkey: Public-Subkey packet with secret subkey appended
- Signature: binds Public-Key packet (1) and Public-Subkey packet (4)
A Public-Key packet contains the creation date, key type, and public key
data. A Secret-Key packet is the same, but with the secret key literally
appended on the end and a different tag. The Key ID is (essentially) a
SHA-1 hash of the Public-Key packet, meaning the creation date is part
of the Key ID. That’s important for later.
I had wondered if the SHAttered attack could be used to create
two different keys with the same full Key ID. However, there’s no slack
space anywhere in the input, so I doubt it.
User IDs are usually a RFC 2822 name and email address, but that’s only
convention. It can literally be an empty string, though that wouldn’t be
useful. OpenPGP clients that require anything more than an empty string,
such as GnuPG during key generation, are adding artificial restrictions.
The first Signature packet indicates the signature date, the signature
issuer’s Key ID, and then optional metadata about how the primary key is
to be used and the capabilities the key owner’s client. The signature
itself is formed by appending the Public-Key packet portion of the
Secret-Key packet, the User ID packet, and the previously described
contents of the signature packet. The concatenation is hashed, the hash
is signed, and the signature is appended to the packet. Since the
options are included in the signature, they can’t be changed by another
person.
In theory the signature is redundant. A client could accept the
Secret-Key packet and User ID packet and consider the key imported. It
would then create its own self-signature since it has everything it
needs. However, my primary target for passphrase2pgp is GnuPG, and it
will not accept secret keys that are not self-signed.
The Secret-Subkey packet is exactly the same as the Secret-Key packet
except that it uses a different tag to indicate it’s a subkey.
The second Signature packet is constructed the same as the previous
signature packet. However, it signs the concatenation of the Public-Key
and Public-Subkey packets, binding the subkey to that primary key. This
key may similarly have its own options.
To create a public key export from this input, a client need only chop
off the secret keys and fix up the packet tags and lengths. The
signatures remain untouched since they didn’t include the secret keys.
That’s essentially what other people will receive about your key.
If someone else were to create a Signature packet binding your
Public-Subkey packet with their Public-Key packet, they could set their
own options on their version of the key. So my question is: Do clients
properly track this separate set of options and separate owner for the
same key? If not, they have a problem!
The format may not sound so complex from this description, but there are
a ton of little details that all need to be correct. To make matters
worse, the feedback is usually just a binary “valid” or “invalid”. The
world could use an OpenPGP format debugger.
Usage
There is one required argument: either --uid
(-u
) or --load
(-l
). The former specifies a User ID since a key with an empty User ID
is pretty useless. It’s my own artificial restriction on the User ID.
The latter loads a previously-generated key which will come with a User
ID.
To generate a key for use in GnuPG, just pipe the output straight into
GnuPG:
$ passphrase2pgp --uid "Foo <foo@example.com>" | gpg --import
You will be prompted for a passphrase. That passphrase is run through
Argon2id, a memory-hard KDF, with the User ID as the salt.
Deriving the key requires 8 passes over 1GB of state, which takes my
current computers around 8 seconds. With the --paranoid
(-x
) option
enabled, that becomes 16 passes over 2GB (perhaps not paranoid enough?).
The output is 64 bytes: 32 bytes to seed the primary key and 32 bytes to
seed the subkey.
Despite the aggressive KDF settings, you will still need to choose a
strong passphrase. Anyone who has your public key can mount an offline
attack. A 10-word Diceware or Pokerware passphrase is more than
sufficient (~128 bits) while also being quite reasonable to memorize.
Since the User ID is the salt, an attacker couldn’t build a single
rainbow table to attack passphrases for different people. (Though your
passphrase really should be strong enough that this won’t matter!) The
cost is that you’ll need to use exactly the same User ID again to
reproduce the key. In theory you could change the User ID afterward to
whatever you like without affecting the Key ID, though it will require a
new self-signature.
The keys are not encrypted (no S2K), and there are few options you can
choose when generating the keys. If you want to change any of this, use
GnuPG’s --edit-key
tool after importing. For example, to set a
protection passphrase:
$ gpg --edit-key Foo
gpg> passwd
There’s a lot that can be configured from this interface.
If you just need the public key to publish or share, the --public
(-p
) option will suppress the private parts and output only a public
key. It works well in combination with ASCII armor, --armor
(-a
).
For example, to put your public key on the clipboard:
$ passphrase2pgp -u '...' -ap | xclip
The tool can create detached signatures (--sign
, -S
) entirely on its
own, too, so you don’t need to import the keys into GnuPG just to make
signatures:
$ passphrase2pgp --sign --uid '...' program.exe
This would create a file named program.exe.sig
with the detached
signature, ready to be verified by another OpenPGP implementation. In
fact, you can hook it directly up to Git for signing your tags and
commits without GnuPG:
$ git config --global gpg.program passphrase2pgp
This only works for signing, and it cannot verify (verify-tag
or
verify-commit
).
It’s pretty tedious to enter the --uid
option all the time, so, if
omitted, passphrase2pgp will infer the User ID from the environment
variables REALNAME and EMAIL. Combined with the KEYID environment
variable (see the README for details), you can easily get away with
never storing your keys: only generate them on demand when needed.
That’s how I intend to use passphrase2pgp. When I want to sign a file,
I’ll only need one option, one passphrase prompt, and a few seconds of
patience:
$ passphrase2pgp -S path/to/file
January 1, 1970
The first time you run the tool you might notice one offensive aspect of
its output: Your key will be dated January 1, 1970 — i.e. unix epoch
zero. This predates PGP itself by more than two decades, so it might
alarm people who receive your key.
Why do this? As I noted before, the creation date is part of the Key ID.
Use a different date, and, as far as OpenPGP is concerned, you have a
different key. Since users probably don’t want to remember a specific
datetime, at seconds resolution, in addition to their passphrase,
passphrase2pgp uses the same hard-coded date by default. A date of
January 1, 1970 is like NULL in a database: no data.
If you don’t like this, you can override it with the --time
(-t
) or
--now
(-n
) options, but it’s up to you to remain consistent.
Vanity Keys
If you’re interested in vanity keys — e.g. where the Key ID spells out
words or looks unusual — it wouldn’t take much work to hack up the
passphrase2pgp source into generating your preferred vanity keys. It
would easily beat anything else I could find online.
Reconsidering limited OpenPGP
Initially my intention was never to output an encryption subkey, and
passphrase2pgp would only be useful for signatures. By default it still
only produces a sign key, but you can still get an encryption subkey
with the --subkey
(-s
) option. I figured it might be useful to
generate an encryption key, even if it’s not output by default. Users
can always ask for it later if they have a need for it.
Why only a signing key? Nobody should be using OpenPGP for encryption
anymore. Use better tools instead and retire the 20th century
cryptography. If you don’t have an encryption subkey, nobody can
send you OpenPGP-encrypted messages.
In contrast, OpenPGP signatures are still kind of useful and lack a
practical alternative. The Web of Trust failed to reach critical mass,
but that doesn’t seem to matter much in practice. Important OpenPGP keys
can be bootstrapped off TLS by strategically publishing them on HTTPS
servers. Keybase.io has done interesting things in this area.
Further, GitHub officially supports OpenPGP signatures, and I
believe GitLab does too. This is another way to establish trust for a
key. IMHO, there’s generally too much emphasis on binding a person’s
legal identity to their OpenPGP key (e.g. the idea behind key-signing
parties). I suppose that’s useful for holding a person legally
accountable if they do something wrong. I’d prefer trust a key with has
an established history of valuable community contributions, even if done
so only under a pseudonym.
So sometime in the future I may again advertise an OpenPGP public key.
If I do, those keys would certainly be generated with passphrase2pgp. I
may not even store the secret keys on a keyring, and instead generate
them on the fly only when I occasionally need them.