Articles tagged openpgp at null program

Keyringless GnuPG

2019-08-09T23:52:39Z

This article was discussed on Hacker News.

My favorite music player is Audacious. It follows the Winamp Classic tradition of not trying to manage my music library. Instead it waits patiently for me to throw files and directories at it. These selections will be informally grouped into transient, disposable playlists of whatever I fancy that day.

This matters to me because my music collection is the result of around 25 years of hoarding music files from various sources including CD rips, Napster P2P sharing, and, most recently, YouTube downloads. It’s not well-organized, but it’s organized well enough. Each album has its own directory, and related albums are sometimes grouped together under a directory for a particular artist.

Over the years I’ve tried various music players, and some have either wanted to manage this library or hide the underlying file-organized nature of my collection. Both situations are annoying because I really don’t want or need that abstraction. I’m going just fine thinking of my music library in terms of files, thank you very much. Same goes for ebooks.

GnuPG is like a media player that wants to manage your whole music library. Rather than MP3s, it’s crypto keys on a keyring. Nearly every operation requires keys that have been imported into the keyring. Until GnuPG 2.2.8 (June 2018), which added the --show-keys command, you couldn’t even be sure what you were importing until after it was already imported. Hopefully it wasn’t garbage.

GnuPG does has a pretty good excuse. It’s oriented around the Web of Trust model, and it can’t follow this model effectively without having all the keys at once. However, even if you don’t buy into the Web of Trust, the GnuPG interface still requires you to play by its rules. Sometimes I’ve got a message, a signature, and a public key and I just want to verify that they’re all consistent with each other, damnit.

$ gpg --import foo.asc
gpg: key 1A719EF63AEB2CFE: public key "foo" imported
gpg: Total number processed: 1
gpg:               imported: 1
$ gpg --verify --trust-model always message.txt.sig message.txt
gpg: Signature made Fri 09 Aug 2019 05:44:43 PM EDT
gpg:                using EDDSA key ...1A719EF63AEB2CFE
gpg: Good signature from "foo" [unknown]
gpg: WARNING: Using untrusted key!
$ gpg --batch --yes --delete-key 1A719EF63AEB2CFE

Three commands and seven lines of output when one of each would do. Plus there’s a false warning: Wouldn’t an “always” trust model mean that this key is indeed trusted?

Signify

Compare this to OpenBSD’s signify (also). There’s no keyring, and it’s up to the user — or the program shelling out to signify — to supply the appropriate key. It’s like the music player that just plays whatever I give it. Here’s a simplified usage overview:

signify -G [-c comment] -p pubkey -s seckey
signify -S [-x sigfile] -s seckey -m message
signify -V [-x sigfile] -p pubkey -m message

When generating a new keypair (-G), the user must choose the destination files for the public and secret keys. When signing a message (a file), the user must supply the secret key and the message. When verifying a file, the user must supply the public key and the message. This is a popular enough model that other, compatible implementations with the same interface have been developed.

Signify is deliberately incompatible with OpenPGP and uses its own simpler, and less featureful, format. Wouldn’t it be nice to have a similar interface to verify OpenPGP signatures?

SimpleGPG

Well, I thought so. So I put together a shell script that wraps GnuPG and provides such an interface:

SimpleGPG

The interface is nearly identical to signify, and the GnuPG keyring is hidden away as if it didn’t exist. The main difference is that the keys and signatures produced and consumed by this tool are fully compatible with OpenPGP. You could use this script without requiring anyone else to adopt something new or different.

To avoid touching your real keyring, the script creates a temporary keyring directory each time it’s run. The GnuPG option --homedir instructs it to use this temporary keyring and ignore the usual one. The temporary keyring is destroyed when the script exits. This is kind of clunky, but there’s no way around it.

Verification looks roughly like this in the script:

$ tmp=$(mktemp -d simplegpg-XXXXXX)
$ gpg --homedir $tmp
$ gpg --homedir $tmp --import foo.asc
$ gpg --homedir $tmp --verify message.txt.sig message.txt
$ rm -rf $tmp

Generating a key is trivial, and there’s only a prompt for the protection passphrase. Like signify, it will generate an Ed25519 key and all outputs are ASCII-armored.

$ simplegpg -G -p keyname.asc -s keyname.pgp
passphrase:
passphrase (confirm):

Since signify doesn’t have a concept of a user ID for a key, just an “untrusted comment”, the user ID is not emphasized here. The default user ID will be “simplegpg key”, so, if you plan to share the key with regular GnuPG users who will need to import it into a keyring, you probably want to use -c to give it a more informative name.

Unfortunately due GnuPG’s very limited, keyring-oriented interface, key generation is about three times slower than it should be. That’s because the protection key is run though the String-to-Key (S2K) algorithm three times:

Immediately after the key is generated, the passphrase is converted to a key, the key is encrypted, and it’s put onto the temporary keyring.
When exporting, the key passphrase is again run through the S2K to get the protection key to decrypt it.
The export format uses a slightly different S2K algorithm, so this export S2K is now used to create yet another protection key.

Technically the second could be avoided since gpg-agent, which is always required, could be holding the secret key material. As far as I can tell, gpg-agent simply does not learn freshly-generated keys. I do not know why this is the case.

This is related to another issue. If you’re accustomed to GnuPG, you may notice that the passphrase prompt didn’t come from pinentry, a program specialized for passphrase prompts. GnuPG normally uses it for this. Instead, the script handles the passphrase prompt and passes the passphrase to GnuPG (via a file descriptor). This would not be necessary if gpg-agent did its job. Without this part of the script, users are prompted three times, via pinentry, for their passphrase when generating a key.

When signing messages, the passphrase prompt comes from pinentry since it’s initiated by GnuPG.

$ simplegpg -S -s keyname.pgp -m message.txt
passphrase:

This will produce message.txt.sig with an OpenPGP detached signature.

The passphrase prompt is for --import, not --detach-sign. As with key generation, the S2K is run more than necessary: twice instead of once. First to generate the decryption key, then a second time to generate a different encryption key for the keyring since the export format and keyring use different algorithms. Ugh.

But at least gpg-agent does its job this time, so only one passphrase prompt is necessary. In general, a downside of these temporary keyrings is that gpg-agent treats each as different keys, and you will need to enter your passphrase once for each message signed. Just like signify.

Verification, of course, requires no prompting and no S2K.

$ simplegpg -V -p keyname.asc -m message.txt

That’s all there is to keyringless OpenPGP signatures. Since I’m not interested in the Web of Trust or keyservers, I wish GnuPG was more friendly to this model of operation.

passphrase2pgp

I mentioned that SimpleGPG is fully compatible with other OpenPGP systems. This includes my own passphrase2pgp, where your secret key is stored only in your brain. No need for a secret key file. In the time since I first wrote about it, passphrase2pgp has gained the ability to produce signatures itself!

I’ve got my environment set up — $REALNAME, $EMAIL, and $KEYID per the README — so I don’t need to supply a user ID argument, nor will I be prompted to confirm my passphrase since it’s checked against a known fingerprint. Generating the public key, for sharing, looks like this:

$ passphrase2pgp -K --armor --public >keyname.asc

Or just:

$ passphrase2pgp -ap >keyname.asc

Like with signify and SimplePGP, to sign a message I’m prompted for my passphrase. It takes longer since the “S2K” here is much stronger by necessity. The passphrase is used to generate the secret key, then from that the signature on the message:

$ passphrase2pgp -S message.txt

For the SimpleGPG user on the other side it all looks the same as before:

$ simplegpg -V -p keyname.asc -m message.txt

I’m probably going to start signing my open source software releases, and this is how I intend to do it.

The Long Key ID Collider

2019-07-22T21:27:02Z

Over the last couple weeks I’ve spent a lot more time working with OpenPGP keys. It’s a consequence of polishing my passphrase-derived PGP key generator. I’ve tightened up the internals, and it’s enabled me to explore the corners of the format, try interesting things, and observe how various OpenPGP implementations respond to weird inputs.

For one particularly cool trick, take a look at these two (private) keys I generated yesterday. Here’s the first:

And the second:

Concatenate these and then import them into GnuPG to have a look at them. To avoid littering in your actual keyring, especially with private keys, use the --homedir option to set up a temporary keyring. I’m going to omit that option in the examples.

$ gpg --import < keys.asc
gpg: key 992A5EEE1D1049FA: public key "nullprogram.com" imported
gpg: key 992A5EEE1D1049FA: secret key imported
gpg: key 992A5EEE1D1049FA: public key "nullprogram.com" imported
gpg: key 992A5EEE1D1049FA: secret key imported
gpg: Total number processed: 2
gpg:               imported: 2
gpg:       secret keys read: 2
gpg:   secret keys imported: 2

The user ID is “nullprogram.com” since I made these and that’s me taking credit. “992A5EEE1D1049FA” is called the long key ID: a 64-bit value that identifies the key. It’s the lowest 64 bits of the full key ID, a 160-bit SHA-1 hash. In the old days everyone used a short key ID to identify keys, which was the lowest 32 bits of the key. For these keys, that would be “1D1049FA”. However, this was deemed way too short, and everyone has since switched to long key IDs, or even the full 160-bit key ID.

The key ID is nothing more than a SHA-1 hash of the key creation date — unsigned 32-bit unix epoch seconds — and the public key material. So secret keys have the same key ID as their associated public key. This makes sense since they’re a key pair and they go together.

Look closely and you’ll notice that both keypairs have the same long key ID. If you hadn’t already guessed from the title of this article, these are two different keys with the same long key ID. In other words, I’ve created a long key ID collision. The GnuPG --list-keys command prints the full key ID since it’s so important:

$ gpg --list-keys
---------------------
pub   ed25519 2019-07-22 [SCA]
      A422F8B0E1BF89802521ECB2992A5EEE1D1049FA
uid           [ unknown] nullprogram.com

pub   ed25519 2019-07-22 [SCA]
      F43BC80C4FC2603904E7BE02992A5EEE1D1049FA
uid           [ unknown] nullprogram.com

I was only targeting the lower 64 bits, but I actually managed to collide the lowest 68 bits by chance. So a long key ID still isn’t enough to truly identify any particular key.

This isn’t news, of course. Nor am I the first person to create a long key ID collision. In 2013, David Leon Gil published a long key ID collision for two 4096-bit RSA public keys. However, that is the only other example I was able to find. He did not include the private keys and did not elaborate on how he did it. I know he did generate viable keys, not just garbage for the public key portions, since they’re both self-signed.

Creating these keys was trickier than I had anticipated, and there’s an old, clever trick that makes it work. Building atop the work I did for passphrase2pgp, I created a standalone tool that will create a long key ID collision and print the two keypairs to standard output:

https://github.com/skeeto/pgpcollider

Example usage:

$ go get -u github.com/skeeto/pgpcollider
$ pgpcollider --verbose > keys.asc

This can take up to a day to complete when run like this. The tool can optionally coordinate many machines — see the --server / -S and --client / -C options — to work together, greatly reducing the total time. It took around 4 hours to create the keys above on a single machine, generating a around 1 billion extra keys in the process. As discussed below, I actually got lucky that it only took 1 billion. If you modify the program to do short key ID collisions, it only takes a few seconds.

The rest of this article is about how it works.

Birthday Attacks

An important detail is that this technique doesn’t target any specific key ID. Cloning someone’s long key ID is still very expensive. No, this is a birthday attack. To find a collision in a space of 2^64, on average I only need to generate 2^32 samples — the square root of that space. That’s perfectly feasible on a regular desktop computer. To collide long key IDs, I need only generate about 4 billion IDs and efficiently do membership tests on that set as I go.

That last step is easier said than done. Naively, that might look like this (pseudo-code):

seen := map of long key IDs to keys
loop forever {
    key := generateKey()
    longID := key.ID[12:20]
    if longID in seen {
        output seen[longID]
        output key
        break
    } else {
        seen[longID] = key
    }
}

Consider the size of that map. Each long ID is 8 bytes, and we expect to store around 2^32 of them. That’s at minimum 32 GB of storage just to track all the long IDs. The map itself is going to have some overhead, too. Since these are literally random lookups, this all mostly needs to be in RAM or else lookups are going to be very slow and impractical.

And I haven’t even counted the keys yet. As a saving grace, these are Ed25519 keys, so that’s 32 bytes for the public key and 32 bytes for the private key, which I’ll need if I want to make a self-signature. (The signature itself will be larger than the secret key.) That’s around 256GB more storage, though at least this can be stored on the hard drive. However, to address these from the map I’d need at least 38 bits, plus some more in case it goes over. Just call it another 8 bytes.

So that’s, at a bare minimum, 64GB of RAM plus 256GB of other storage. Since nothing is ideal, we’ll need more than this. This is all still feasible, but will require expensive hardware. We can do a lot better.

Keys from seeds

The first thing you might notice is that we can jettison that 256GB of storage by being a little more clever about how we generate keys. Since we don’t actually care about the security of these keys, we can generate each key from a seed much smaller than the key itself. Instead of using 8 bytes to reference a key in storage, just use those 8 bytes to store the seed used to make the key.

counter := rand64()
seen := map of long key IDs to 64-bit seeds
loop forever {
    seed := counter
    counter++
    key := generateKey(seed)
    longID := key.ID[12:20]
    if longID in seen {
        output generateKey(seen[longID])
        output key
        break
    } else {
        seen[longID] = seed
    }
}

I’m incrementing a counter to generate the seeds because I don’t want to experience the birthday paradox to apply to my seeds. Each really must be unique. I’m using SplitMix64 for the PRNG since I learned it’s the fastest for Go, so a simple increment to generate seeds is perfectly fine.

Ultimately, this still uses utterly excessive amounts of memory. Wouldn’t it be crazy if we could somehow get this 64GB map down to just a few MBs of RAM? Well, we can!

Rainbow tables

For decades, password crackers have faced a similar problem. They want to precompute the hashes for billions of popular passwords so that they can efficiently reverse those password hashes later. However, storing all those hashes would be unnecessarily expensive, or even infeasible.

So they don’t. Instead they use rainbow tables. Password hashes are chained together into a hash chain, where a password hash leads to a new password, then to a hash, and so on. Then only store the beginning and the end of each chain.

To lookup a hash in the rainbow table, run the hash chain algorithm starting from the target hash and, for each hash, check if it matches the end of one of the chains. If so, recompute that chain and note the step just before the target hash value. That’s the corresponding password.

For example, suppose the password “foo” hashes to 9bfe98eb, and we have a reduction function that maps a hash to some password. In this case, it maps 9bfe98eb to “bar”. A trivial reduction function could just be an index into a list of passwords. A hash chain starting from “foo” might look like this:

foo -> 9bfe98eb -> bar -> 27af0841 -> baz -> d9d4bbcb

In reality a chain would be a lot longer. Another chain starting from “apple” might look like this:

apple -> 7bbc06bc -> candle -> 82a46a63 -> dog -> 98c85d0a

We only store the tuples (foo, d9d4bbcb) and (apple, 98c85d0a) in our database. If the chains had been one million hashes long, we’d still only store those two tuples. That’s literally a 1:1000000 compression ratio!

Later on we’re faced with reversing the hash 27af0841, which isn’t listed directly in the database. So we run the chain forward from that hash until either I hit the maximum chain length (i.e. password not in the table), or we recognize a hash:

27af0841 -> baz -> d9d4bbcb

That d9d4bbcb hash is listed as being in the “foo” hash chain. So I regenerate that hash chain to discover that “bar” leads to 27af0841. Password cracked!

Collider rainbow table

My collider works very similarly. A hash chain works like this: Start with a 64-bit seed as before, generate a key, get the long key ID, then use the long key ID as the seed for the next key.

There’s one big difference. In the rainbow table the purpose is to run the hash function backwards by looking at the previous step in the chain. For the collider, I want to know if any of the hash chains collide. So long as each chain starts from a unique seed, it would mean we’ve found two different seeds that lead to the same long key ID.

Alternatively, it could be two different seeds that lead to the same key, which wouldn’t be useful, but that’s trivial to avoid.

A simple and efficient way to check if two chains contain the same sequence is to stop them at the same place in that sequence. Rather than run the hash chains for some fixed number of steps, they stop when they reach a distinguishing point. In my collider a distinguishing point is where the long key ID ends with at least N 0 bits, where N determines the average chain length. I chose 17 bits.

func computeChain(seed) {
    loop forever {
        key := generateKey(seed)
        longID := key.ID[12:20]
        if distinguished(longID) {
            return longID
        }
        seed = longID
    }
}

If two different hash chains end on the same distinguishing point, they’re guaranteed to have collided somewhere in the middle.

To determine where two chains collided, regenerate each chain and find the first long key ID that they have in common. The step just before are the colliding keys.

counter := rand64()
seen := map of long key IDs to 64-bit seeds
loop forever {
    seed := counter
    counter++
    longID := computeChain(seed)
    if longID in seen {
        output findCollision(seed, seen[longID])
        break
    } else {
        seen[longID] = seed
    }
}

Hash chains computation is embarrassingly parallel, so the load can be spread efficiently across CPU cores. With these rainbow(-like) tables, my tool can generate and track billions of keys in mere megabytes of memory. The additional computational cost is the time it takes to generate a couple more chains than otherwise necessary.

Predictable, Passphrase-Derived PGP Keys

2019-07-10T04:18:29Z

tl;dr: passphrase2pgp.

One of my long-term concerns has been losing my core cryptographic keys, or just not having access to them when I need them. I keep my important data backed up, and if that data is private then I store it encrypted. My keys are private, but how am I supposed to encrypt them? The chicken or the egg?

The OpenPGP solution is to (optionally) encrypt secret keys using a key derived from a passphrase. GnuPG prompts the user for this passphrase when generating keys and when using secret keys. This protects the keys at rest, and, with some caution, they can be included as part of regular backups. The OpenPGP specification, RFC 4880 has many options for deriving a key from this passphrase, called String-to-Key, or S2K, algorithms. None of the options are great.

In 2012, I selected the strongest S2K configuration at the time and, along with a very strong passphrase, put my GnuPG keyring on the internet as part of my public dotfiles repository. It was a kind of super-backup that would guarantee their availability anywhere I’d need them.

My timing was bad because, with the release of GnuPG 2.1 in 2014, GnuPG fundamentally changed its secret keyring format. S2K options are now (quietly!) ignored when deriving the protection keys. Instead it auto-calibrates to much weaker settings. With this new version of GnuPG, I could no longer update the keyring in my dotfiles repository without significantly downgrading its protection.

By 2017 I was pretty irritated with the whole situation. I let my OpenPGP keys expire, and then I wrote my own tool to replace the only feature of GnuPG I was actively using: encrypting my backups with asymmetric encryption. One of its core features is that the asymmetric keypair can be derived from a passphrase using a memory-hard key derivation function (KDF). Attackers must commit a significant quantity of memory (expensive) when attempting to crack the passphrase, making the passphrase that much more effective.

Since the asymmetric keys themselves, not just the keys protecting them, are derived from a passphrase, I never need to back them up! They’re also always available whenever I need them. My keys are essentially stored entirely in my brain as if I was a character in a William Gibson story.

Tackling OpenPGP key generation

At the time I had expressed my interest in having this feature for OpenPGP keys. It’s something I’ve wanted for a long time. I first took a crack at it in 2013 (now the the old-version branch) for generating RSA keys. RSA isn’t that complicated but it’s very easy to screw up. Since I was rolling it from scratch, I didn’t really trust myself not to subtly get it wrong. Plus I never figured out how to self-sign the key. GnuPG doesn’t accept secret keys that aren’t self-signed, so it was never useful.

I took another crack at it in 2018 with a much more brute force approach. When a program needs to generate keys, it will either read from /dev/u?random or, on more modern systems, call getentropy(3). These are all ultimately system calls, and I know how to intercept those with Ptrace. If I want to control key generation for any program, not just GnuPG, I could intercept these inputs and replace them with the output of a CSPRNG keyed by a passphrase.

Keyed: Linux Entropy Interception

In practice this doesn’t work at all. Real programs like GnuPG and OpenSSH’s ssh-keygen don’t rely solely on these entropy inputs. They also grab entropy from other places, like getpid(2), gettimeofday(2), and even extract their own scheduler and execution time noise. Without modifying these programs I couldn’t realistically control their key generation.

Besides, even if it did work, it would still be fragile and unreliable since these programs could always change how they use the inputs. So, ultimately, it was more of an experiment than something practical.

passphrase2pgp

For regular readers, it’s probably obvious that I recently learned Go. While searching for good projects idea for cutting my teeth, I noticed that Go’s “extended” standard library has a lot of useful cryptographic support, so the idea of generating the keys myself may be worth revisiting.

Something else also happened since my previous attempt: The OpenPGP ecosystem now has widespread support for elliptic curve cryptography. So instead of RSA, I could generate a Curve25519 keypair, which, by design, is basically impossible to screw up. Not only would I be generating keys on my own terms, I’d being doing it in style, baby.

There are two different ways to use Curve25519:

Digital signatures: Ed25519 (EdDSA)
Diffie–Hellman (encryption): X25519 (ECDH)

In GnuPG terms, the first would be a “sign only” key and the second is an “encrypt only” key. But can’t you usually do both after you generate a new OpenPGP key? If you’ve used GnuPG, you’ve probably seen the terms “primary key” and “subkey”, but you probably haven’t had think about them since it’s all usually automated.

The primary key is the one associated directly with your identity. It’s always a signature key. The OpenPGP specification says this is a signature key only by convention, but, practically speaking, it really must be since signatures is what holds everything together. Like packaging tape.

If you want to use encryption, independently generate an encryption key, then sign that key with the primary key, binding that key as a subkey to the primary key. This all happens automatically with GnuPG.

Fun fact: Two different primary keys can have the same subkey. Anyone could even bind any of your subkeys to their primary key! They only need to sign the public key! Though, of course, they couldn’t actually use your key since they’d lack the secret key. It would just be really confusing, and could, perhaps in certain situations, even cause some OpenPGP clients to malfunction. (Note to self: This demands investigation!)

It’s also possible to have signature subkeys. What good is that? Paranoid folks will keep their primary key only on a secure, air-gapped, then use only subkeys on regular systems. The subkeys can be revoked and replaced independently of the primary key if something were to go wrong.

In Go, generating an X25519 key pair is this simple (yes, it actually takes array pointers, which is rather weird):

package main

import (
	"crypto/rand"
	"fmt"

	"golang.org/x/crypto/curve25519"
)

func main() {
	var seckey, pubkey [32]byte
	rand.Read(seckey[:]) // FIXME: check for error
	seckey[0] &= 248
	seckey[31] &= 127
	seckey[31] |= 64
	curve25519.ScalarBaseMult(&pubkey, &seckey)
	fmt.Printf("pub %x\n", pubkey[:])
	fmt.Printf("sec %x\n", seckey[:])
}

The three bitwise operations are optional since it will do these internally, but it ensures that the secret key is in its canonical form. The actual Diffie–Hellman exchange requires just one more function call: curve25519.ScalarMult().

For Ed25519, the API is higher-level:

package main

import (
	"crypto/rand"
	"fmt"

	"golang.org/x/crypto/ed25519"
)

func main() {
	seed := make([]byte, ed25519.SeedSize)
	rand.Read(seed) // FIXME: check for error
	key := ed25519.NewKeyFromSeed(seed)
	fmt.Printf("pub %x\n", key[32:])
	fmt.Printf("sec %x\n", key[:32])
}

Signing a message with this key is just one function call: ed25519.Sign().

Unfortunately that’s the easy part. The other 400 lines of the real program are concerned only with encoding these values in the complex OpenPGP format. That’s the hard part. GnuPG’s --list-packets option was really useful for debugging this part.

OpenPGP specification

(Feel free to skip this section if the OpenPGP wire format isn’t interesting to you.)

Following the specification was a real challenge, especially since many of the details for Curve25519 only appear in still incomplete (and still erroneous) updates to the specification. I certainly don’t envy the people who have to parse arbitrary OpenPGP packets. It’s finicky and has arbitrary parts that don’t seem to serve any purpose, such as redundant prefix and suffix bytes on signature inputs. Fortunately I only had to worry about the subset that represents an unencrypted secret key export.

OpenPGP data is broken up into packets. Each packet begins with a tag identifying its type, followed by a length, which itself is a variable length. All the packets produced by passphrase2pgp are short, so I could pretend lengths were all a single byte long.

For a secret key export with one subkey, we need the following packets in this order:

Secret-Key: Public-Key packet with secret key appended
User ID: just a length-prefixed, UTF-8 string
Signature: binds Public-Key packet (1) and User ID packet (2)
Secret-Subkey: Public-Subkey packet with secret subkey appended
Signature: binds Public-Key packet (1) and Public-Subkey packet (4)

A Public-Key packet contains the creation date, key type, and public key data. A Secret-Key packet is the same, but with the secret key literally appended on the end and a different tag. The Key ID is (essentially) a SHA-1 hash of the Public-Key packet, meaning the creation date is part of the Key ID. That’s important for later.

I had wondered if the SHAttered attack could be used to create two different keys with the same full Key ID. However, there’s no slack space anywhere in the input, so I doubt it.

User IDs are usually a RFC 2822 name and email address, but that’s only convention. It can literally be an empty string, though that wouldn’t be useful. OpenPGP clients that require anything more than an empty string, such as GnuPG during key generation, are adding artificial restrictions.

The first Signature packet indicates the signature date, the signature issuer’s Key ID, and then optional metadata about how the primary key is to be used and the capabilities the key owner’s client. The signature itself is formed by appending the Public-Key packet portion of the Secret-Key packet, the User ID packet, and the previously described contents of the signature packet. The concatenation is hashed, the hash is signed, and the signature is appended to the packet. Since the options are included in the signature, they can’t be changed by another person.

In theory the signature is redundant. A client could accept the Secret-Key packet and User ID packet and consider the key imported. It would then create its own self-signature since it has everything it needs. However, my primary target for passphrase2pgp is GnuPG, and it will not accept secret keys that are not self-signed.

The Secret-Subkey packet is exactly the same as the Secret-Key packet except that it uses a different tag to indicate it’s a subkey.

The second Signature packet is constructed the same as the previous signature packet. However, it signs the concatenation of the Public-Key and Public-Subkey packets, binding the subkey to that primary key. This key may similarly have its own options.

To create a public key export from this input, a client need only chop off the secret keys and fix up the packet tags and lengths. The signatures remain untouched since they didn’t include the secret keys. That’s essentially what other people will receive about your key.

If someone else were to create a Signature packet binding your Public-Subkey packet with their Public-Key packet, they could set their own options on their version of the key. So my question is: Do clients properly track this separate set of options and separate owner for the same key? If not, they have a problem!

The format may not sound so complex from this description, but there are a ton of little details that all need to be correct. To make matters worse, the feedback is usually just a binary “valid” or “invalid”. The world could use an OpenPGP format debugger.

Usage

There is one required argument: either --uid (-u) or --load (-l). The former specifies a User ID since a key with an empty User ID is pretty useless. It’s my own artificial restriction on the User ID. The latter loads a previously-generated key which will come with a User ID.

To generate a key for use in GnuPG, just pipe the output straight into GnuPG:

$ passphrase2pgp --uid "Foo " | gpg --import

You will be prompted for a passphrase. That passphrase is run through Argon2id, a memory-hard KDF, with the User ID as the salt. Deriving the key requires 8 passes over 1GB of state, which takes my current computers around 8 seconds. With the --paranoid (-x) option enabled, that becomes 16 passes over 2GB (perhaps not paranoid enough?). The output is 64 bytes: 32 bytes to seed the primary key and 32 bytes to seed the subkey.

Despite the aggressive KDF settings, you will still need to choose a strong passphrase. Anyone who has your public key can mount an offline attack. A 10-word Diceware or Pokerware passphrase is more than sufficient (~128 bits) while also being quite reasonable to memorize.

Since the User ID is the salt, an attacker couldn’t build a single rainbow table to attack passphrases for different people. (Though your passphrase really should be strong enough that this won’t matter!) The cost is that you’ll need to use exactly the same User ID again to reproduce the key. In theory you could change the User ID afterward to whatever you like without affecting the Key ID, though it will require a new self-signature.

The keys are not encrypted (no S2K), and there are few options you can choose when generating the keys. If you want to change any of this, use GnuPG’s --edit-key tool after importing. For example, to set a protection passphrase:

$ gpg --edit-key Foo
gpg> passwd

There’s a lot that can be configured from this interface.

If you just need the public key to publish or share, the --public (-p) option will suppress the private parts and output only a public key. It works well in combination with ASCII armor, --armor (-a). For example, to put your public key on the clipboard:

$ passphrase2pgp -u '...' -ap | xclip

The tool can create detached signatures (--sign, -S) entirely on its own, too, so you don’t need to import the keys into GnuPG just to make signatures:

$ passphrase2pgp --sign --uid '...' program.exe

This would create a file named program.exe.sig with the detached signature, ready to be verified by another OpenPGP implementation. In fact, you can hook it directly up to Git for signing your tags and commits without GnuPG:

$ git config --global gpg.program passphrase2pgp

This only works for signing, and it cannot verify (verify-tag or verify-commit).

It’s pretty tedious to enter the --uid option all the time, so, if omitted, passphrase2pgp will infer the User ID from the environment variables REALNAME and EMAIL. Combined with the KEYID environment variable (see the README for details), you can easily get away with never storing your keys: only generate them on demand when needed.

That’s how I intend to use passphrase2pgp. When I want to sign a file, I’ll only need one option, one passphrase prompt, and a few seconds of patience:

$ passphrase2pgp -S path/to/file

January 1, 1970

The first time you run the tool you might notice one offensive aspect of its output: Your key will be dated January 1, 1970 — i.e. unix epoch zero. This predates PGP itself by more than two decades, so it might alarm people who receive your key.

Why do this? As I noted before, the creation date is part of the Key ID. Use a different date, and, as far as OpenPGP is concerned, you have a different key. Since users probably don’t want to remember a specific datetime, at seconds resolution, in addition to their passphrase, passphrase2pgp uses the same hard-coded date by default. A date of January 1, 1970 is like NULL in a database: no data.

If you don’t like this, you can override it with the --time (-t) or --now (-n) options, but it’s up to you to remain consistent.

Vanity Keys

If you’re interested in vanity keys — e.g. where the Key ID spells out words or looks unusual — it wouldn’t take much work to hack up the passphrase2pgp source into generating your preferred vanity keys. It would easily beat anything else I could find online.

Reconsidering limited OpenPGP

Initially my intention was never to output an encryption subkey, and passphrase2pgp would only be useful for signatures. By default it still only produces a sign key, but you can still get an encryption subkey with the --subkey (-s) option. I figured it might be useful to generate an encryption key, even if it’s not output by default. Users can always ask for it later if they have a need for it.

Why only a signing key? Nobody should be using OpenPGP for encryption anymore. Use better tools instead and retire the 20th century cryptography. If you don’t have an encryption subkey, nobody can send you OpenPGP-encrypted messages.

In contrast, OpenPGP signatures are still kind of useful and lack a practical alternative. The Web of Trust failed to reach critical mass, but that doesn’t seem to matter much in practice. Important OpenPGP keys can be bootstrapped off TLS by strategically publishing them on HTTPS servers. Keybase.io has done interesting things in this area.

Further, GitHub officially supports OpenPGP signatures, and I believe GitLab does too. This is another way to establish trust for a key. IMHO, there’s generally too much emphasis on binding a person’s legal identity to their OpenPGP key (e.g. the idea behind key-signing parties). I suppose that’s useful for holding a person legally accountable if they do something wrong. I’d prefer trust a key with has an established history of valuable community contributions, even if done so only under a pseudonym.

So sometime in the future I may again advertise an OpenPGP public key. If I do, those keys would certainly be generated with passphrase2pgp. I may not even store the secret keys on a keyring, and instead generate them on the fly only when I occasionally need them.

Why I've Retired My PGP Keys and What's Replaced It

2017-03-12T21:54:38Z

Update August 2019: I’ve got a PGP key again but only for signing. I use another of my own tools, passphrase2pgp, to manage it.

tl;dr: Enchive (rhymes with “archive”) has replaced my use of GnuPG.

Two weeks ago I tried to encrypt a tax document for archival and noticed my PGP keys had just expired. GnuPG had (correctly) forbidden the action, requiring that I first edit the key and extend the expiration date. Rather than do so, I decided to take this opportunity to retire my PGP keys for good. Over time I’ve come to view PGP as largely a failure — it never reached the critical mass, the tooling has always been problematic, and it’s now a dead end. The only thing it’s been successful at is signing Linux packages, and even there it could be replaced with something simpler and better.

I still have a use for PGP: encrypting sensitive files to myself for long term storage. I’ve also been using it to consistently to sign Git tags for software releases. However, very recently this lost its value, though I doubt anyone was verifying these signatures anyway. It’s never been useful for secure email, especially when most people use it incorrectly. I only need to find a replacement for archival encryption.

I could use an encrypted filesystem, but which do I use? I use LUKS to protect my laptop’s entire hard drive in the event of a theft, but for archival I want something a little more universal. Basically I want the following properties:

Sensitive content must not normally be in a decrypted state. PGP solves this by encrypting files individually. The archive filesystem can always be mounted. An encrypted volume would need to be mounted just prior to accessing it, during which everything would be exposed.
I should be able to encrypt files from any machine, even less-trusted ones. With PGP I can load my public key on any machine and encrypt files to myself. It’s like a good kind of ransomware.
It should be easy to back these files up elsewhere, even on less-trusted machines/systems. This isn’t reasonably possible with an encrypted filesystem which would need to be backed up as a huge monolithic block of data. With PGP I can toss encrypted files anywhere.
I don’t want to worry about per-file passphrases. Everything should be encrypted with/to the same key. PGP solves this by encrypting files to a recipient. This requirement prevents most stand-alone crypto tools from qualifying.

I couldn’t find anything that fit the bill, so I did exactly what you’re not supposed to do and rolled my own: Enchive. It was loosely inspired by OpenBSD’s signify. It has the tiny subset of PGP features that I need — using modern algorithms — plus one more feature I’ve always wanted: the ability to generate a keypair from a passphrase. This means I can reliably access my archive keypair anywhere.

On Enchive

Here’s where I’d put the usual disclaimer about not using it for anything serious, blah blah blah. But really, I don’t care if anyone else uses Enchive. It exists just to scratch my own personal itch. If you have any doubts, don’t use it. I’m putting it out there in case anyone else is in the same boat. It would also be nice if any glaring flaws I may have missed were pointed out.

Not expecting it to be available as a nice package, I wanted to make it trivial to build Enchive anywhere I’d need it. Except for including stdint.h in exactly one place to get the correct integers for crypto, it’s written in straight C89. All the crypto libraries are embedded, and there are no external dependencies. There’s even an “amalgamation” build, so make isn’t required: just point your system’s cc at it and you’re done.

Algorithms

For encryption, Enchive uses Curve25519, ChaCha20, and HMAC-SHA256.

Rather than the prime-number-oriented RSA as used in classical PGP (yes, GPG 2 can do better), Curve25519 is used for the asymmetric cryptography role, using the relatively new elliptic curve cryptography. It’s stronger cryptography and the keys are much smaller. It’s a Diffie-Hellman function — an algorithm used to exchange cryptographic keys over a public channel — so files are encrypted by generating an ephemeral keypair and using this ephemeral keypair to perform a key exchange with the master keys. The ephemeral public key is included with the encrypted file and the ephemeral private key is discarded.

I used the “donna” implementation in Enchive. Despite being the hardest to understand (mathematically), this is the easiest to use. It’s literally just one function of two arguments to do everything.

Curve25519 only establishes the shared key, so next is the stream cipher ChaCha20. It’s keyed by the shared key to actually encrypt the data. This algorithm has the same author as Curve25519 (djb), so it’s natural to use these together. It’s really straightforward, so there’s not much to say about it.

For the Message Authentication Code (MAC), I chose HMAC-SHA256. It prevents anyone from modifying the message. Note: This doesn’t prevent anyone who knows the master public key from replacing the file wholesale. That would be solved with a digital signature, but this conflicts with my goal of encrypting files without the need of my secret key. The MAC goes at the end of the file, allowing arbitrarily large files to be encrypted single-pass as a stream.

There’s a little more to it (IV, etc.) and is described in detail in the README.

Usage

The first thing you’d do is generate a keypair. By default this is done from /dev/urandom, in which case you should immediately back them up. But if you’re like me, you’ll be using Enchive’s --derive (-d) feature to create it from a passphrase. In that case, the keys are backed up in your brain!

$ enchive keygen --derive
secret key passphrase:
secret key passphrase (repeat):
passphrase (empty for none):
passphrase (repeat):

The first prompt is for the secret key passphrase. This is converted into a Curve25519 keypair using an scrypt-like key derivation algorithm. The process requires 512MB of memory (to foil hardware-based attacks) and takes around 20 seconds.

The second passphrase (or the only one when --derive isn’t used), is the protection key passphrase. The secret key is encrypted with this passphrase to protect it at rest. You’ll need to enter it any time you decrypt a file. The key derivation step is less aggressive for this key, but you could also crank it up if you like.

At the end of this process you’ll have two new files under $XDG_CONFIG_DIR/enchive: enchive.pub (32 bytes) and enchive.sec (64 bytes). The first you can distribute anywhere you’d like to encrypt files; it’s not particularly sensitive. The second is needed to decrypt files.

To encrypt a file for archival:

$ enchive archive sensitive.zip

No prompt for passphrase. This will create sensitive.zip.enchive.

To decrypt later:

$ enchive extract sensitive.zip.enchive
passphrase:

If you’ve got many files to decrypt, entering your passphrase over and over would get tiresome, so Enchive includes a key agent that keeps the protection key in memory for a period of time (15 minutes by default). Enable it with the --agent flag (it may be enabled by default someday).

$ enchive --agent extract sensitive.zip.enchive

Unlike ssh-agent and gpg-agent, there’s no need to start the agent ahead of time. It’s started on demand as needed and terminates after the timeout. It’s completely painless.

Both archive and extract operate stdin to stdout when no file is given.

Feature complete

As far as I’m concerned, Enchive is feature complete. It does everything I need, I don’t want it to do anything more, and at least two of us have already started putting it to use. The interface and file formats won’t change unless someone finds a rather significant flaw. There is some wiggle room to replace the algorithms in the future should Enchive have that sort of longevity.

Publishing My Private Keys

2012-06-24T00:00:00Z

Update March 2017: I no longer use PGP. Also, there’s a bug in GnuPG that silently discards these security settings, and it’s unlikely to ever get fixed. You’ll need to find/build an old version of GnuPG if you want to properly protect your secret keys.

Update August 2019: I’ve got a PGP key again, but I’m using my own tool, passphrase2pgp, to manage it. This tool allows for a particular workflow that GnuPG has never and will never provide. It doesn’t rely on S2K as described below.

One of the items in my dotfiles repository is my PGP keys, both private and public. I believe this is a unique approach that hasn’t been done before — a public experiment. It may seem dangerous, but I’ve given it careful thought and I’m only using the tools already available from GnuPG. It ensures my keys are well backed-up (via the Torvalds method) and available wherever I should need them.

In your GnuPG directory there are two core files: secring.gpg and pubring.gpg. The first contains your secret keys and the second contains public keys. secring.gpg is not itself encrypted. You can (should) have different passphrases for each key, after all. These files (or any PGP file) can be inspected with --list-packets. Notice it won’t prompt for a passphrase in order to get this data,

$ gpg --list-packets ~/.gnupg/secring.gpg
:secret key packet:
    version 4, algo 1, created 1298734547, expires 0
    skey[0]: [2048 bits]
    skey[1]: [17 bits]
    iter+salt S2K, algo: 9, SHA1 protection, hash: 10, salt: ...
    protect count: 10485760 (212)
    protect IV:  a6 61 4a 95 44 1e 7e 90 88 c3 01 70 8d 56 2e 11
    encrypted stuff follows
:user ID packet: "Christopher Wellons <...>"
:signature packet: algo 1, keyid 613382C548B2B841
... and so on ...

Each key is encrypted individually within this file with a passphrase. If you try to use the key, GPG will attempt to decrypt it by asking for the passphrase. If someone were to somehow gain access to your secring.gpg, they’d still need to get your passphrase, so pick a strong one. The official documentation advises you to keep your secring.gpg well-guarded and only rely on the passphrase as a cautionary measure. I’m ignoring that part.

If you’re using GPG’s defaults, your secret key is encrypted with CAST5, a symmetric block cipher. The encryption key is your passphrase salted (mixed with a non-secret random number) and hashed with SHA-1 65,536 times. Using the hash function over and over is called key stretching. It greatly increases the amount of required work for a brute-force attack, making your passphrase more effective. All of these settings can be adjusted to better protect the secret key at the cost of less portability. Since I’ve chosen to publish my secring.gpg in my dotfiles repository I cranked up the settings as far as I can.

I changed the cipher to AES256, which is more modern, more trusted, and more widely used than CAST5. For the passphrase digest, I selected SHA-512. There are better passphrase digest algorithms out there but this is the longest, slowest one that GPG offers. The PGP spec supports between 1024 and 65,011,712 digest iterations, so I picked one of the largest. 65 million iterations takes my laptop over a second to process — absolutely brutal for someone attempting a brute-force attack. Here’s the command to change to this configuration on an existing key,

gpg --s2k-cipher-algo AES256 --s2k-digest-algo SHA512 --s2k-mode 3 \
    --s2k-count 65000000 --edit-key 

When the edit key prompt comes up, enter passwd to change your passphrase. You can enter the same passphrase again and it will re-use it with the new configuration.

I’m feeling quite secure with my secret key, despite publishing my secring.gpg. Before now, I was much more at risk of losing it to disk failure than having it exposed. I challenge anyone who doubts my security to crack my secret key. I’d rather learn that I’m wrong sooner than later!

With this established in my dotfiles repository, I can more easily include private dotfiles. Rather than use a symmetric cipher with an individual passphrase on each file, I encrypt the private dotfiles to myself. All my private dotfiles are managed with one key: my PGP key. This also plays better with Emacs. While it supports transparent encryption, it doesn’t even attempt to manage your passphrase (with good reason). If the file is encrypted with a symmetric cipher, Emacs will prompt for a passphrase on each save. If I encrypt them with my public key, I only need the passphrase when I first open the file.

How it works right now is any dotfile that ends with .priv.pgp will be decrypted into place — not symlinked, unfortunately, since this is impossible. The install script has a -p switch to disable private dotfiles, such as when I’m using an untrusted computer. gpg-agent ensures that I only need to enter my passphrase once during the install process no matter how many private dotfiles there are.