null program

Don't Write Your Own E-mail Validator

This man disapproves of your e-mail validator. Gmail has a nice feature: when delivering e-mail, everything including and after a + in a Gmail address is ignored. For example, mail arriving at all of these addresses would go to the same place if they were Gmail addresses,

account@example.com
account+nullprogram@example.com
account+slashdot@example.com

Thanks to this feature, when a user acquires a Gmail account, Google is actually providing about a googol (as in the number 10100) different e-mail addresses to that user! Quite appropriate, really.

I have seen other mailers do similar things, like ignoring everything after dashes. A nice advantage to this is when registering at a new website I can customize my e-mail address for them by, say, throwing the website name in it. Because I have a google of e-mail addresses available, it is impossible to run out, so I can give every person I meet their own version of my address. The custom address can come in handy for sorting and filtering, and it will also tell me who is selling out my e-mail address. This, of course, assumes that someone isn't stripping out the extra text in my address to counter the Gmail feature.

However, in my personal experience, most websites will not permit +'s in addresses. This is completely ridiculous, because it means that virtually every website will incorrectly invalidate perfectly valid e-mail addresses. Even major websites, like coca-cola.com, screw this up. They see the + in the address and give up.

In fact, if I do a Google search for "email validation regex" right now, 9 of the first 10 results return websites with regular expressions that are complete garbage and will toss out many common, valid addresses. The only useful result was at the fifth spot (linked below).

For the love of Stallman's beard, stop writing your own e-mail address validators!

Why shouldn't you even bother writing your own? Because the proper Perl regular expression for RFC822 is over 6 kilobytes in length! Follow that link and look at that. This is the smallest regular expression you would need to get it right.

If you really insist on having a nice short one and don't want to use a validation library, which, again, is a stupid idea and you should be using a library, then use the dumbest, most liberal expression you can. (Just don't forget the security issues.) Like this,

.+@.+

Seriously, if you add anything else you most almost surely make it incorrectly reject valid addresses. Note that e-mail addresses can contain spaces, and even more than one @! These are valid addresses,

"John Doe"@example.com
"@TheBeach"@example.com

I have not yet found a website that will accept either of these, even though both are completely valid addresses. Even MS Outlook, which I use at work (allowing me to verify this), will refuse to send e-mail to these addresses (Gmail accepts it just fine). Hmmm... maybe having an address like these is a good anti-spam measure!

So if your e-mail address is "John Doe"@example.com no one using Outlook can send you e-mail, which sounds like a feature to me, really.

So, everyone, please stop writing e-mail validation regular expressions. The work has been done, and you will only get it wrong, guaranteed.

This is a similar rant I came across while writing mine: Stop Doing Email Validation the Wrong Way.


A DnD Story: The Fire Gem

The Fire Gem If you are at all familiar with Dungeons and Dragons, you may have already heard of the stories of The Tale of Eric and the Dread Gazebo by Richard Aronson and The Head of Vecna. These are real classics. Read them now, if you haven't already, before you go on. If you are unfamiliar with tabletop role-playing games you should go learn a little about them first.

I would like to share one of my Dungeons and Dragons stories. It won't be anywhere as interesting as the two stories I mentioned, but I have gotten some laughs out of people I tell it to.

Back in my college days, a friend was running a 3.5 campaign and asked me to join in. The other players had rolled up their characters and done a couple practice sessions before I joined, so I was jumping into a group that had already had their start.

It was an evil campaign. Everyone in the group was either evil or neutral. We were serving some kind of evil sorcerer. I felt that being evil allowed much more variety of play and usually gave us more freedom in the game world than being good. If things started to get dry, someone would do something stupid like burn down the nearest town inn. The lawful evil characters weren't so happy about this, though.

I rolled up a chaotic evil half-orc barbarian named Gnohkk. I was lucky on my rolls and had some wonderful stats, so he was very big and strong. Gnohkk started with a two-handed axe that contained a magic fire gem in its head. After some training he was able to cause a fiery explosion at will centered at the location of the fire gem. If an untrained character wielded the axe, upon striking anything he would have to make a saving throw to prevent an explosion.

Because Gnohkk would also suffer from the explosion, this wasn't terribly useful at first, except against targets particularly vulnerable to fire, like trolls and magic trees. The training was mainly to make the axe usable.

At some point during the campaign the party got a hold of a ring of fire resistance, which was then used by Gnohkk. With this ring he was invulnerable to his own explosions, making the axe very powerful. The ring also provided a handful of simple fire spells that made combat more interesting.

Sadly, not long after the ring was found the fire gem axe chipped broke. Before discarding the broken axe, Gnohkk carefully removed the fire gem for safe keeping.

After an intense battle against some type of monsters with big horns, Gnohkk found himself a nice metal helmet. He also managed to scrounge up 10 horns from the monster corpses, which gave him a great idea. When the party got to the next town, Gnohkk visited the blacksmith, and requested to have him affix the 10 horns around the outside of the helmet. In the front of the helmet he wanted to have a socket created and the fire gem embedded inside it. The blacksmith said it would take a couple of days.

I was hoping this giant helmet would make Gnohkk more intimidating. So far in the campaign, every time he had attempted to intimidate an NPC he failed his rolls and came off looking stupid instead.

So, this was pretty cool: a giant horned helmet with a gem in the front. With the helmet on, Gnohkk could explode into a ball of flames at will and without warning. What a wonderful toy for a chaotic evil!

When Gnohkk returned to the blacksmith a couple days later he found the shop was destroyed. Some of the walls remained, the roof was gone, and weapons and tools were thrown all over the area. The blacksmith's charred body had been dead for some time. However, the helmet was completed and basically intact. Gnohkk shrugged, grabbed the helmet and left, all without having to pay a single copper to the now dead blacksmith.

We figured out later what happened (the GM probably just told us). Gnohkk failed to tell the blacksmith about the properties of the gem. When the blacksmith tapped the gem into the socket, the fire gem activated and exploded, killing him and destroying his shop.


Controlling a Minefield

Naval Mine (this is not a space mine) Some time ago I was watching through the entire series of Deep Space 9. It was a Star Trek television show about a space station that rests next a wormhole that connects to the other side of the galaxy (The Delta quadrant).

The Delta quadrant is ruled by a group called the Dominion, and they are looking to conquer the Federation side of the galaxy (the Alpha quadrant). At one point during the series, the Federation needs to temporarily disable the wormhole to prevent Dominion ships from crossing through. They do this by mining the wormhole with identical, cloaked, self-replicating mines.

If a mine is destroyed, the neighboring mines will replicate a replacement. The minefield repairs itself. This makes removing the minefield within a reasonable amount of time difficult to impossible. If even a single mine is left behind, it can replicate the entire minefield again.

The most interesting question here is this:

When the Federation returns and wants to remove the minefield, how would they do it? What would stop the Dominion from doing the same thing?

The first thing that comes to mind is having a kill signal, but what would this signal be? It could simply be a plain "kill" command, but the Dominion could also broadcast such a signal to disable the minefield. Consider that the Dominion could capture a single mine and study everything about its workings. The minefield itself could therefore hold no secrets whatsoever. This leaves out any possibility of a secret kill command stored in the mines.

Here's what I would do, assuming that humans or aliens have not yet discovered some giant breakthrough in factoring in the Star Trek universe. I would randomly generate two very large prime numbers. Today, two 1024-bit primes should be more than enough, but in 350 years even larger numbers would probably be necessary. Then, I multiply these two number together and store this number in the mine software. To disable the minefield, I simply broadcast these two numbers into the minefield. The mines would be programmed to take the product of any pairs of numbers it receives. If the product matches the internal number, the mine shuts down.

Voila! A method for shutting down the minefield. The enemy can know everything about every single mine's construction, including the software and data stored on every mine, but will be unable to disable the minefield without factoring a very large composite number, which would presumably be difficult or impossible (within a reasonable amount of time).

Another possibility would be using a hash. Come up with a strong passphrase, then use a hashing algorithm like SHA-1 or MD5, or whatever is available and appropriate in 350 years, to hash the passphrase. Store the hash in the mines. When you want to disable the minefield, broadcast the passphrase. These mines will hash the broadcast and compare it to the stored hash. It's really the same solution as before: a one-way function. This is also similar to how passwords are stored inside a computer today.

If we wanted more commands, like "don't blow up any ships for awhile" or "increase minefield density", we could generate more composites corresponding to each command. However, once a command is issued, the secret -- the two prime numbers -- is out, and it cannot be used again. In this case, I would go into the realm of public key cryptography.

I would issue a command, along with a timestamp, and maybe even a nonce that could double as a global identifier for the command, and sign the whole deal using my private key. On each mine I would store the public key. When a command is received, the mines would check the signature before executing the command. I could then issue repeat commands, as the timestamps would change each time. An adversary learns nothing when a command is issued, because the time stamps would make any replay attacks useless.

Minefields just like this exist today all over the Internet, as botnets. Thousands of computers all around the world become infected with malware and come under the control of a single individual or group. Individual machines in the botnet could be taken out, but removing the entire botnet is difficult as it grows and repairs itself. Any security researcher could disassemble the botnet malware and learn anything about it, so the malware can store no secrets. How does a malicious person control the botnet, then, without someone else taking control? Public key cryptography, just as described above.


Play NetHack

Your ally in your search for The Amulet Patience. NetHack is all about patience. Never be too hasty, as the game is extremely unforgiving. If you let your guard down for just a few turns, you can easily lose everything. Death is permanent (with some exceptions), so dying means starting it all over with a new character.

I got into NetHack a couple of years ago. I play it in cycles, playing heavily for a couple of months at a time, then take a break for a couple months after I tire of getting stuck at the same point each game. I generally return a better player. The game is very complex, taking probably a hundred hours of play just to nail down the basic gameplay and techniques.

This is what my typical session looks like. I like it almost as simple as it comes. By default, color is off, but I like to have it on (a lot more information is available with color). The graphics may seem crappy, but it is said,

While the graphics may seem primitive by today's standards, today's gameplay seems primitive by NetHack standards.

The dingo is blinded by the flash!  The dingo turns to flee!

                                                                       -------
                                                                       |++...|
       --------------                               ----------         |.+...|
       |............|              --------         |>........#        ---|---
       |............|              |......|         |........|#           #
       |.........<...#            #.......|         |........|#         ###
       |...^........|           ###|......|         --.-------#         #
       --.----.------           #  ------.-#          #       ###       #
         #######              ###        ###          ###       #       #
            ####              #           #             #       ###     #
               ###          ###           ####          ###       #     #
                 #          #               #  #          #       ###---.----
                 ##       ###               #########     #         #.......|
                 -.----   #                      # -.-----|--        |@.....|
                 |.`...####                      ##-........|        |...[..|
                 |....|                            |........|        |......|
                 |{...|         0##################-........|########-......|
                 ------                            |........|#       --------
                                                   ----------#

wellons the Sightseer       St:12 Dx:12 Co:18 In:12 Wi:8 Ch:16  Neutral S:1967
Dlvl:6  $:1329 HP:48(48) Pw:17(17) AC:7  Xp:5/185 T:3717

Like most rogue-like games, playing NetHack is a rewarding experience, so if you never tried it out before, I suggest taking a look at the NetHack Guidebook and firing it up for yourself! If you don't want to install it, but you have a telnet client available, you can connect to the nethack.alt.org server to play and watch others play, which is a good way to learn more. Building it from source can be a little tricky (took me a little while to figure it out the first time), so using your package manager is advised if you do choose to install it to try it out.

Another good resource is the NetHack wiki Wikihack. However, if you don't want to "spoil" yourself, don't go there. Learning things about the game outside of the game is not considered cheating, but rather spoiling. I don't mind spoiling myself. I will run into plenty of things I have not yet spoiled myself on. The game is hard enough as it is!

There is a theory that if someone ever completely beats NetHack under the hardest conditions, the source code will be thrown out and replaced with something even harder and more unforgiving. There is another theory which states that this has already happened.

When play for the first time, start with a Valkyrie or a Barbarian as these classes are pretty tough against baddies, simpler to play, and more foolproof than some of the squishier classes, like wizards.

WARNING: SPOILERS FOLLOW

To give an idea of some of the interesting gameplay, I have a couple examples.

               #
          -----.--
          |....y.|     ## 
          |......|     #
          |.n....@f######
          |......|
          |....n.|
          --------

I am the @ symbol, which is how NetHack represents the player. I am entering the room with two awakened nymphs (n) and a yellow light (y). Behind me is my cat who fights alongside me and assists me. Nymphs do not do damage, but rather steal your equipment. They approach you, charm you into giving them something from your inventory, which includes weapons and armor, and teleport away. This means that you generally want to kill them before they close in.

Yellow lights also do no damage. They run up to you and explode, causing you to be blind for a number of turns. Normally, this can be a really bad situation if you do not have the right tools available to you, which is typical for early in the game. In the worst case, which also happens to be the most likely, while I work on the nymphs the yellow light will run up to be blinding me. Now, since I cannot see the nymphs, I cannot attack them at range and they run up to me each stealing one item. If I remain blind for long enough, they may find me again and steal something else while I am helpless. They would rob me blind! It would be up to my cat to dispatch them.

However, I did have some good tools available. I happened to get extremely lucky (the Random Number God was nice to me that day) and find both a cloak of displacement and jumping boots at a store at dungeon level 1. The cloak makes me appear to be in a different place than I really am while the jumping boots let me travel several squares in a single turn. I was also playing a rogue, so my main attack was already a ranged one: throwing daggers.

The yellow light ran up to my displaced image and exploded, causing no blindness to me. One down. Next, I threw my plentiful supply of daggers at the nymphs, keeping my distance by jumping around the room. I managed to prevent a deadly situation, being stuck naked with no weapon, by applying my resources well. With NetHack, I had plenty of time to plan out each move I made. There is no timer. NetHack moves at my pace.

There was a another situation a couple months ago (which I will not bother drawing) where I was attacked in the middle of a room by a werewolf. The werewolf summoned help immediately and I was surrounded by winter wolves and coyotes and such. Winter wolves can be dangerous because they shoot deadly frost bolts, and I had not yet have the cold resistance intrinsic.

I cut a hole through the canines towards the hallway where I could fight them one at a time, and I lost most of my health in the process. Then I prayed to recover my health. Then I starting hacking away at the canines. Again, I was low on health and running out of options. The winter wolves were firing bolts down the hallway at me. Engraving Elbereth on the floor was not going to save me from the ranged frost bolts. Things were looking quite grim for me. The end seemed near.

I did the only thing I could think to do at this point: I reached down and took a bite into one of the dead frost wolves that was resting on the floor beneath me.

Take a moment and picture this. Frost bolts are zipping down a dark cold hallway at a nearly dead Valkyrie. There are shattered potions and frozen liquids all over the floor. The winter wolves have caused the temperature to plummet. Her breath hangs visibly in the air. She quickly crouches down and shoves her face into a dead wolf, taking a nasty bite right into its furry, bloodied side. Between her clenched teeth she tears off a piece dripping with blood. Desperately, she stuffs her face with the fresh kill, trying to eat as much as fast as she can as she dodges more frost bolts.

At this moment, the Random Number God blessed me with good fortune, and I was granted the cold resistance intrinsic just before the next frost bolt was going to kill me, allowing me to harmlessly absorb it and continue hacking away the remaining canines. Her reward for victory was a canine feast!

So, yes, the simple looking NetHack can be quite exciting. :-)


A GNU Octave Feature

At work they recently moved me to a new project. It is a Matlab-based data analysis thing. I haven't really touched Matlab in over a year (the last time I used Matlab at work), and, instead, use GNU Octave at home when the language is appropriate. I got so used to Octave that I found a pretty critical feature missing from Matlab's implementation: treat an expression as if it were of the type of its output.

Let's say we want to index into the result of a function. Take, for example, the magic square function, magic(). This spits out a magic square of the given size. In Octave we can generate a 4x4 magic square and chop out the middle 2x2 portion in one line.

octave> magic(4)(2:3,2:3)
ans =

   11   10
    7    6

Or more possibly clearly,

octave> [magic(4)](2:3,2:3)
ans =

   11   10
    7    6

Try this in Matlab and you will get a big, fat error. You have to assign the magic square to a temporary variable to do the same thing. I kept trying to do this sort of thing in Matlab and was thinking to myself, "I know I can do this somehow!". Nope, I was just used to having Octave.

Where this really shows is when you want to reshape a matrix into a nice, simple vector. If you have a matrix M and want to count the number of NaN's it has, you can't just apply the sum() function over isnan() because it only does sums of columns. You can get around this with a special index, (:).

So, to sum all elements in M directly,

octave> sum(M(:))

In Octave, to count NaN's with isnan(),

octave> sum(isnan(M)(:))

Again, Matlab won't let you index the result of isnan() directly. Stupid. I guess the Matlab way to do this is to apply sum() twice.

Every language I can think of handles this properly. C, C++, Perl, Ruby, etc. It is strange that Matlab itself doesn't have it. Score one more for Octave.


The Arcfour Stream Cipher

Stream ciphers are one of the two types of symmetric key algorithms, which is when the same key is used for encryption and decryption. They follow this simple concept: take a good pseudo-random number generator and combine, usually by XOR, its output with your plaintext stream. This is very similiar to the one-time pad, but the random pad is pseudo-random rather than truly random. The key is the seed (or part of one) for the PRNG.

Probably the most well known stream cipher is the extremely simple, yet cryptographically strong, Arcfour algorithm. The official name is actually RC4, which comes from RSA Security where it was developed. It was a trade secret until someone anonymously leaked the algorithm to the public. The name RC4 is still trademarked, though, so Arcfour is generally used instead, meaning "Alleged RC4" (alleged because RSA Security never confirmed the algorithm as being RC4). You have almost certainly used the cipher yourself, because it is used in applications such as WEP and SSL.

The algorithm has two parts: the key schedule algorithm and pseudo-random generation algorithm. The key schedule uses the key, and possible a non-secret initialization vector, to set up the state of the PRNG. The state is an array of length 256 holding all of the values from 0 to 255 in some order, along with two integers (initialized to 0 after the key schedule). The algorithm looks like this,

for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap(S[i],S[j])
endfor

The PRNG deals with one byte at a time, emitting a stream of bytes,

i := 0
j := 0
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap(S[i],S[j])
    output S[(S[i] + S[j]) mod 256]
endwhile

If you implement this in C and use the char type, you can toss the modulus parts because they will just work automatically.

Now you just XOR your message with the output of the PRNG. The Wikipedia article probably explains it better than I can, so check it out if you still don't follow.

Now, Arcfour has some flaws. Specifically, the algorithm itself doesn't specify how an initialization vector is applied, which is important. Using a plan key twice is bad it allows an adversary to get information easily. For example, Lets say you have two messages A and B. You use the same key k, which will produce the same keystream K. Now, you create your two ciphertexts CA and CB

CA = A ^ K
CB = B ^ K

But notice if the adversary has both ciphertexts,

CA ^ CB = A ^ K ^ B ^ K = A ^ B

They are left with your two original messages XORed together. Let me illustrate: we have two messages as bitmap images (here as PNGs for the web),

Plain pattern GNU Head

Encrypt them using the same key. In this case, my key was "somekey".

Plain pattern encrypted GNU Head encrypted

If the adversary has both of these second "images", she can XOR them together (having no knowledge of the key!) and get this,

Images superimposed

An initialization vector (IV) is a set of bytes we combine with the key. The IV is not a secret, as it is sent plaintext with the ciphertext. If different IVs were used above with the same key, XORing the ciphertext would result in nothing, because the keystreams are totally different for each message.

However, the way the IV is combined is important too. Simply concatenating the IV and the key can lead to weaknesses due to the way the key schedule algorithm works. Something more secure would be a cryptographic hash of the key and the IV. The reason WEP is broken is because in its design the IV wasn't used properly.

Another weakness is that the initial bytes of the keystream have low entropy. That is, some bits tend to be 0's or 1's consistently, which can leak information to an adversary. This can be countered by tossing the first few bytes of the keystream. Often the first 768 bytes are dropped, but a conservative amount would be 3072 bytes. Another way to deal with this is running the key schedule algorithm 10 or 20 times (not reinitializing the S array between them of course) rather than just once, which is the way CipherSaber-2 does it.

Yet another weakness is that the keystream becomes distinguishable from random data after about a gigabyte of output. That is, after about a gigabyte, the entropy of the overall stream can become too low and compromise the security of the message. A solution might be to change the IV each gigabyte.

I wrote an implementation of Arcfour in C, which you can get from my Git repostiory with,

git clone http://git.nullprogram.com/arcfour.git

Or grab a snapshot.

It is written as a library that can be used in different applications. Included are a couple programs that make use of it. I strongly suggest writing your own implementation. It is really easy to do, and you will automatically have the algorithm memorized once you do it. The Wikipedia article has some test vectors you can use to test it.


Two-Man Double Blind Coke vs. Pepsi Taste Test

The Setup: 6 labeled cups, 2 drinks, and a 20-sided die My fiancee, Kelsey, claimed that she could tell the difference between Coke and Pepsi. I wanted to put this to the test. Since there were only two of us, arranging the test wasn't a simple matter of asking someone else to pour some cups. I also wanted to do this right: testing must be double-blind. I devised a little scheme that allowed us to perform two different tests.

The first test was seeing if Kelsey or I could determine which beverage was Pepsi and which was Coke. The second was determining if there was any distinction in taste between the two drinks at all, which consisted matching two different samples together. The second test also acts as a check on the first test.

We bought one bottle of each at CVS. Next, we labeled six different cups with the numbers one though six. Each odd number is paired with the following even number. Kelsey, who was alone, used a die with an even number of sides (this includes a something as simple as a coin toss) to put one beverage in cup #1 and the other in cup #2. In this case, we used my 20-sided die I use for Dungeons and Dragons, because using it for this purpose was just full of win.

The die is important here as a random number generator. If it is left up to a human to decide what drinks go where, we may bias the setup. For example, I may be more likely to put Pepsi in an odd-numbered cup than an even-numbered one.

It is difficult for human beings to behave randomly. Try generating a list of 50 coin tosses yourself. I mean, without a coin. Just type a series of 50 H's and T's. If you examine your list of flips, you will find that you often generate very improbable series of flips (excessive heads or tails) and exhibit patterns. We need dice or coins to make decisions for us in this experiment.

To do it right, the beverage must be chosen before the roll: "I will be pouring Pepsi now.". Roll the die. If it rolls an odd number, pour the drink into the odd cup (#1). Write this information down and keep it secret.

The Final Setup Next, after allowing the foam to calm down (which might accidentally reveal information to me), Kelsey leaves the room and I enter. I perform a similar procedure to distribute the drinks into cups #3 and #4, then #5 and #6. I keep track of what drink, #1 or #2, goes into what cup. I keep it secret. These last two cups are for the purpose of the second experiment.

At this point Kelsey knows what was in cups #1 and #2, but not where they went. I know where they went, but not what was in the cups. Together we know everything, but individually we know nothing.

In order to make the second test double-blind, and allow me to participate, I leave the room. Kelsey rolls the die. If it rolls odd she switches the label on cups #5 and #6. It is important that these cups are identical. One flaw potential, however, is that the liquid in each cup may look unique. One may be more fizzy or one cup a little more full. Noticing this may happen subconsciously, which is the whole point of doing double-blind tests.

Ok, we didn't actually do this last part because I didn't think of it till later.

We sample all four cups (#3 - #6) in pairs, alone, making notes on what beverage we think is in what cup. Once we are both done we share our secrets and see how well we did.

Our results? Neither of us could tell the difference between Coke and Pepsi.


Sudoku Solver

My 'Thinking Chair' I was at my fiancee's parent's house over Fourth of July weekend. Her family likes to leave plenty of reading material right by the toilet, which is something fairly new to me. They take their time on the john quite seriously.

While I was in there I saw a large book of Sudoku puzzles. Since the toilet is a good spot to think (I like to call it my " thinking chair"), I thought out an algorithm for solving Sudokus. I then left the bathroom and implemented it in order to verify that it worked.

The method is trial-and-error, which it does recursively: fill in the next available spot with a valid number as defined by the rules (cannot have the same number in a column, row, or partition), and recurse. The function reports success (true) when a solution was found, or failure (false), which means we try the next available number. If no more valid numbers are available for testing at the current position, then the puzzle is not solvable (we made an error at a previous position), so we stop recursing and return failure.

More formally,

Note that the recursion depth does not exceed 81, as it only recurses once per blank square. The "game tree" is broad rather than deep. It doesn't have to duplicate the puzzle matrix in memory either because all operations can be done in place.

Here is the implementation in C I typed up just after I left the bathroom,

int solve (char matrix[9][9])
{
  /* Find an empty spot. */
  int x, y, i, j, s = 0;
  for (i = 0; i < 9 && !s; i++)
    for (j = 0; j < 9 && !s; j++)
      if (matrix[i][j] == 0)
	{
	  x = i; y = j; s = 1;
	}
  
  /* No empty spots, we found a solution! */
  if (!s)
    return 1;
  
  /* Determine legal numbers for this spot. */
  char nums[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
  for (i = 0, j = y; i < 9; i++)
    nums[(int)matrix[i][j]] = 0; /* Vertically */
  for (i = x, j = 0; j < 9; j++)
    nums[(int)matrix[i][j]] = 0; /* Horizontally */
  for (i = 0; i < 3; i++)
    for (j = 0; j < 3; j++)
      nums[(int)matrix
	   [i + ((int)(x / 3)) * 3]
	   [j + ((int)(y / 3)) * 3]
	   ] = 0;                /* Within the partition */
  
  /* Try each possible number and recurse for each. */
  for (i = 1; i < 10; i++)
    if (nums[i] > 0)
      {
	matrix[x][y] = i;
	if (solve (matrix))
	  return 1;
      }
  
  /* Each attempt failed: reset this position and report failure. */
  matrix[x][y] = 0;
  return 0;
}

I assumed that it would be slow solving the puzzles, having to search a wide tree, but it turns out to be very fast. It solves normal human-solvable puzzles in a couple of milliseconds. Wikipedia has a near-worst case Sudoku that is designed to make algorithms like mine perform their worst.

Worst-case Sudoku

char worst_case[9][9] =
  {
    {0, 0, 0,   0, 0, 0,   0, 0, 0},
    {0, 0, 0,   0, 0, 3,   0, 8, 5},
    {0, 0, 1,   0, 2, 0,   0, 0, 0},
    
    {0, 0, 0,   5, 0, 7,   0, 0, 0},
    {0, 0, 4,   0, 0, 0,   1, 0, 0},
    {0, 9, 0,   0, 0, 0,   0, 0, 0},
    
    {5, 0, 0,   0, 0, 0,   0, 7, 3},
    {0, 0, 2,   0, 1, 0,   0, 0, 0},
    {0, 0, 0,   0, 4, 0,   0, 0, 9}
  };

On my laptop, my program solves this in 15 seconds, which means that it should take no more than 15 seconds to solve any given Sudoku puzzle. This provides me a nice upper limit.

There is a way to "defeat" this particular puzzle. For example, say an attacker was trying to perform a denial-of-service (DoS) attack on your Sudoku solver by giving it puzzles like this one (making your server spend lots of time solving only a few puzzles). However, these puzzles assume a certain guessing order. By simply randomizing the order of guessing, both in choosing positions and the order that numbers are guessed, the attacker will have a much harder time creating a difficult puzzle. The worst case could very well be the best case. This is very similar to how Perl randomizes its hash array hash functions.

Now suppose we kept our guess order random then "solved" an empty Sudoku puzzle. What we have is a solution to a randomly generated Sudoku. To turn it into a puzzle, we just back it off a bit. A Sudoku is only supposed to have a single unambiguous solution, so we can only back off until just before the point where two solutions becomes possible. If you imagine a solution tree, this would be backing up a branch until you hit a fork.

Normally, Sudokus are symmetric (in the matrix sense), but completely randomizing the position guessing order won't achieve this. To make this work, the randomizing process can be adjusted to only select random points on the upper triangle (including the diagonal). For each point it selects not on the diagonal, the mirror point is automatically selected next. This will preserve symmetry when generating puzzles.

One issue remains: there seems to be no way to control the difficulty of the puzzles it generates. Maybe a number of open spaces left behind is a good metric? This will require some further study (and another post!).


Variable Declarations and the C Call Stack

A co-worker asked me a question today about C/C++ pointers,

If a pointer is declared inside a function with no explicit initialization, can I assume that the pointer is initialized to NULL?

We were down in the lab and, therefore, he had no Internet access to look it up himself, which is why he asked. When I code C, it is just a sort of mental habit to not use a non-static function variable without first initializing it, but is this accurate? I knew the answer was "no", but I wanted to be able to explain the "why".

Anyway, I quickly recalled some of my experimental C programs and thought carefully about the mechanics of what is going on behind-the-scenes, allowing me to confidently give him a "no" answer. I then threw this together in a few seconds to prove it,

#include <stdio.h>

void a ()
{
  int   * x = (int *) 0x012345FF;
  double  y = -63454;
  (void) x;
  (void) y;
}

void b ()
{
  int   * x;
  double  y;
  printf ("%p, %f\n", x, y);
}

int main ()
{
  a ();
  b ();
  return 0;
}

When you compile it, make sure you don't use the optimization options (-O, -O2, or -O3 for gcc) because they change the inner-workings of the program. It might do things like make those functions inline (so they won't be on the stack as I am intending), or even toss out a(), as it appears to do nothing. The compiler sees that, even though I "used" variables in a() by casting them to void, nothing is really happening, so a() can be ignored. We can probably get around this with a tacked on volatile declaration, which you might see a lot of in a micro-controller program. In a micro-controller, some memory addresses are mapped to registers external to the software, so, from the compiler's point of view, access to these locations may look like nothing is really happening. Optimizing away variables that point to these memory locations will lead to an incorrect binary, so your robot or laser guided shark or whatever won't work.

Anyway, compiling with optimization will break my example! So don't do it here.

When compiling, you should get some warnings about using uninitialized variables, which is kind of the point of my example. Ignore it. That warning alone gives away the answer to the main question, really, but this example is a bit more fun!

Before you run it, study it and think about what the output should look like. When a() is called, its stack frame goes into the call stack, which contains the two declared variables. These variables are then assigned as part of the function execution. When a() returns, the frame is popped off the stack. Then b() is called, and, as the variable declarations are exactly the same, it will fit right over top of a()'s old stack frame, and its variables will line up. x and y are not assigned any value, so they pick up whatever junk was lying around, which happens to be the values assigned in a().

When you run the program, this is the output,

0x12345ff, -63454.000000

The pointer is not initialized to NULL. If x is passed back uninitialized under the assumption that a NULL is being passed, some other poor function that handles the return value may dereference it, resulting in possibly some nasal demons, but most likely an annoying segmentation fault. Worse, this error may occur far, far away from where the actual problem is, and even worse than that, only sometimes (depending on the state of the call stack at just the right moment).

Note here that I am talking about non-static function variable declarations. Global variables and static function variables will not be on the stack. They are placed in a fixed location (in the data segment), and their values are implicitly initialized to 0 at compile time.


A One-Time Pad Mistake

In a previous post I discussed one-time pads. Note: this is about a cryptosystem, not some kind of menstruation disaster. The information for this post comes from a little gem I saw on the xkcd forums. I generally don't lurk around there because it suffers from the same disease most non-self-moderating forums suffer from: anal retentive, power-abusing, whiny moderators. I lucked out this time.

The user Berengal posed the question,

One thing I've been wondering about is if you can use a single one-time-pad to encrypt other one-time-pads to send around.

My first though was, "Hey, why didn't I think of this?". Then, after a moment, I realized that it was the sort of thing that is too good to be true. This is along the lines of ideas that break the laws of thermodynamics. We are looking at a perpetual motion machine here. Just as you cannot create or destroy energy, neither can you use a one-time pad to distribute more than one other equal length one-time pad. Let's make it formal,

In any closed one-time pad cryptosystem, the total number of one-time pad bits in the system remains the same.

Also, if something like this could work, people would be using it already everywhere. Before I got a chance to think too much about it, user AJR spoiled it with an excellent proof on why it won't work.

Note below that I use ^ to indicate exclusive or (XOR).

Suppose you have a master pad, K, that you use to distribute two message pads, k1 and k2. You have two messages, m1 and m2. We then have four transmitted texts: two encrypted message pads e1 and e2, and two ciphertexts c1 and c2. We define these as,

e1 = K ^ k1
e2 = K ^ k2

c1 = k1 ^ m1
c2 = k2 ^ m2

Suppose an adversary is able to record all four transmitted texts. He can apply them like so,

e1 ^ c1 = K ^ k1 ^ k1 ^ m1 = K ^ m1

e2 ^ c2 = K ^ k2 ^ k2 ^ m2 = K ^ m2

This cancels the original message keys and effectively leaves you encrypting two messages with the same pad, which is exactly the wrong thing to do when using one-time pads. Once that is done, the adversary can do tricks like,

e1 ^ e2 ^ c1 ^ c2 = m1 ^ m2

What is left is the two messages XORed together with no keys/pads involved, which, depending on the messages, might reveal something. Don't make this mistake.

Update: My friend Gavin thought he discovered a way to make one-time pads useful, but I was able to show him that it simply reduces to the same situation here. His website details the algorithm and the crack that breaks it.


One-Time Pads and Plausible Deniability

In a previous post I discussed one-time pads. The information for this post comes from Bruce Schneier's Applied Cryptography (section 10.8).

One-time pads are great for something called plausible deniability. With plausible deniability, when a person holding encrypted data is coerced into decrypting their data, the interrogator will not be able to tell if the person is complying with the decryption order or not. For example, the victim could provide an alternate key that decrypts the ciphertext into some harmless dummy plaintext. To make this more plausible, the plaintext would probably be something potentially embarrassing, such as pornography or secret love letters.

We have a one-time pad K, a plaintext P, a dummy plaintext (the pornography or love letters) D, a dummy key K', and a ciphertext C. Below, I denote XOR with ^.

To encrypt our plaintext, its the normal one-time pad algorithm,

P ^ K = C

Bob and Alice share K, so decryption works like,

C ^ K = P

However, the secret police come along with their thumbscrews and demand that Alice and Bob give them the one-time pad K. Instead, they will provide K'. How is K' defined? Like this,

K' = C ^ D

Because K is a one-time pad and is randomly generated, there is no way to prove that K' is not the real key. Alice and Bob give up K'. The secret police decrypt it,

C ^ K' = C ^ C ^ D = D

"See? We were just keeping our love affair a secret from our spouses!"


One-Time Pads

Example code: otp.c, Makefile (this Makefile may be useful)

The one-time pad (OTP) is an encryption algorithm that provides perfect secrecy, provided it is implemented properly. The pad, which is essentially a key that is the same length as the message, must be used only once (hence the name), kept secret, and, the hard part, generated using some truly random method. It works by XORing the pad and the message together, which makes the ciphertext indistinguishable from randomly generated data.

One-time pad diagram

I want to be clear with the terms here. A one-time pad is not "random". Numbers aren't random. It's just not a property numbers can have. To be pedantic, what is random is the method by which the numbers are generated. Think "random number-generators", not "random-number generators".

A pseudo-random number generator (PRNG), like what your computer normally uses to randomly generate numbers for games, will not work here. Your computer is mindlessly deterministic and is not capable of being a true random number generator by itself. PRNGs exhibit patterns and eventually loop, spitting out the same numbers in the same order as it did when it started. When the pad is generated randomly, each bit in the pad will be statistically independent from every other bit. That is, one value of one bit has no affect on the value on other bits. Because of this, the ciphertext encrypted from a one-time pad is equally likely to decipher into any possible plaintext of the same length. This makes it immune to brute-force attacks. The bits in a PRNG are not statistically independent, making them vulnerable to attack. If you know the value of one bit of the key, you also know something about other bits in the key.

Generally, two parties wanting to communicate using a one-time pad will exchange a large pad, such as a CD filled with randomly generated data, in person before one of them travels somewhere, say another country, where data needs to be sent back securely. As bytes from the pad are used, they are destroyed and never used again. If the pad is 100kB in length, only up to 100kB in data can be transmitted securely. Like most forms of encryption, key management is one of the trickiest parts.

I have provided a simple little C program for encrypting a message using a one-time pad (download link at the top). It encrypts stdin using the one-time pad provided in the file indicated as the first argument. Ciphertext is sent to stdout. As an added bonus, when a second argument is provided, the used bits of the key are written to the file provided in the second filename. I will show you why I added this in a moment.

I have provided an example pad for you, generated from my own /dev/random, which provides a true random number generator (RNG).

Get it: random.data (207kB). It looks kind of like this (hexdump),

f0a0 382b fdaf 6dff 5f15 0f2a 4634 f1c9
bf36 67f3 b896 7237 ac82 836b c906 e28f
2cea 8f06 b6a5 56f9 1faa 6dcb 06e6 f1e2
779f 728e ead1 b5ad fbb3 e615 9906 2006
f245 db61 8231 a2b7 2b10 37bb 5e48 8970
49b7 0856 c0e3 440f fde4 9e78 7384 3f5f
05c5 fed9 1c6e e5ac b85d cc7e fd20 a280
1688 3206 ddb5 5c47 9dfe df53 18ea df52
9b95 7e6b b76e c6c7 061c 43b8 f4f7 b3e3
4948 e543 d8f1 476d 0daf 400f d345 60f4
[...]

Here is how you put it to use in encrypting the source code for this program,

otp random.data < otp.c > otp.c.otp

If you shared the pad random.data with your friend, whom you are trying to communicate with, she can decrypt it with a similar command,

otp random.data < otp.c.otp > otp.c

random.data is only 207kB, so your message cannot be any longer than that. Additionally, once 207kB has been sent, you must exchange a new pad before more ciphertext can be transmitted. If you re-use your pad, you compromise the security of the new ciphertext as well as the old ciphertext.

The reason I provided the second option was this: you can use /dev/random directly without losing the pad (assuming you have a operating system that has a /dev/random, as all decent operating systems do).

# encrypt
otp /dev/random random.pad < otp.c > otp.c.otp

# decrypt
otp random.pad < otp.c.opt > otp.c

This is only academic, as the one-time pad is generally exchanged before the plaintext exists. Otherwise you can just exchange the plaintext when you would have normally exchanged the one-time pad! Also, /dev/random is slow. It generates numbers using environmental noise, which is only available at a trickle. If you want to encrypt a 1MB message, it could take days. With shorter messages you can usually speed things up by moving the mouse around or mashing the keyboard (this can be fun).

More on one-time pads next. In particular, I will present two tricks: one that works and one that does not work.


Up is Down

I was in an elevator the other day and I noticed something about the digital numbers. I was on the 7th floor going down. Inside the elevator was the standard 7-bar number display indicating the current floor,

Digital number 8

However, from the angle I was standing, the top right bar was not visible. Like this,

Digital number 8 covered

The arrow was indicating that the elevator was traveling downward, but with that bar in the way the numbers can appear to be going upwards, not downwards. It is an ambiguity, as without that bar, 6 is equivalent to 8 and 5 is equivalent to 9. One version is covered, the other is not.

Counting up/down Counting down

At first I thought this was simply some kind of quirky behavior of the elevator. "Why is it going up when the arrow points down? Seems like bad design to me!", I thought -- until we got to floor 4, where it was obvious what was going on.

"Oh... now I feel foolish."


For a Good Time

At work I am currently maintaining/adding features to a rather large piece of software. It takes over two hours just to build (but this is mostly because C++ takes a long time to compile, being the ugly mess C++ is). I wondered if there would be any, let's say "interesting", comments in there. With all that code there had to be something! I ran this to find out,

find . -name "*.[chCH]" -exec grep \
	'damn\|fuck\|shit\|kludge\|dragon\|idiot\|stupid\|hell' \ 
	{} -Hi \;

And I found maybe a hundred matches. A couple of us got some good laughs for awhile reading some very creative comments and, in a few cases, strings for debugging log output. If you want some code to run this on yourself this 2.4.1 release of Pidgin works fairly well. This is my favorite comment, which is followed by a piece of code handling just what it is talking about,

/* holy crap. who the fuck would transfer gigabytes through AIM?! */

For an easier way to search a whole lot of code, someone has already compiled some interesting Google Code Search searches that have interesting results. My favorite is probably this one.

Have fun!


Memoization

I had written in a previous post about a neat feature of Lua. I found out later that this is simply a form of Memoization. The idea is that you trade memory for speed by only doing calculations once and keeping track of previously calculated values. I had even complained about Perl hashes not being flexible enough (not true, thanks to Tie::Hash). Perl actually has something even cooler, which is the Memoize module.

The module can memoize any function, although it is only useful on "pure" functions: functions with no side effects and not dependant external data that will change. The official documentation contains a nice example demonstrating a recursive implementation of a Fibonacci sequence generator. My example is a little program I wrote the other day where the memoize module came in handy.

You have coins valued at 1, 2, 5, 10, 20, 50, 100, and 200. How many different ways can 200 be made using any number of coins. A simple recursive solution is this: stick in each coin one at a time and ask the same question again. So, we use a coin worth 1, now the question is how many ways can we make change for 199. Then 198, then 195, then 190, etc. Because the order of the coins is not important these two sets are identical: (1 1 5) (5 1 1). So, to avoid counting the same set twice, we also want to tell the function the largest size coin to use from then on. Our function may look like this now (Perl),

use List::Util qw(sum);

sub count {
    my ($s, $m) = @_;
    return 1 if ($s == 0);
    
    my @valid = grep {$_ <= $s and $_ >= $m} @coins;
    return 0 if ($#valid == -1);
    
    return sum map {count($s - $_, $_)} @valid;
}

Where it is called as count(total, max_coin_value).

However, we will be calculating the same value twice many times over. For example, lets say we start filling the first 10 of 200 like this: (1 1 1 1 1 5) or (5 5). The next call to count will be count(190, 5) for both cases. Just like the recursive Fibonacci implementation, we are spending an enormous amount of time repeating ourselves. Running this for a value of 200 will take minutes. Running it for a value of 2000 may take days! Memoization to the rescue!

We will now add this,

use Memoize;
memoize('count');

The module has now transparently installed a new version of the function over our original. If we ever pass the same arguments that we already have passed, the module will look up the original calculated value and return it instead of calling the real function. It now can calculate the number of ways to make change for a value of 2000 in a couple seconds rather than days. That's how much redundant work the function was doing. Here is the whole thing,

#!/usr/bin/perl

use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(sum);

use Memoize;
memoize('count');

my @coins = (1, 2, 5, 10, 20, 50, 100, 200);

print count(200, 1);
print "\n";

sub count {
    my ($s, $m) = @_;
    return 1 if ($s == 0);
    
    my @valid = grep {$_ <= $s and $_ >= $m} @coins;
    return 0 if ($#valid == -1);
    
    return sum map {count($s - $_, $_)} @valid;
}

Now, to apply it to the Collatz problem from my previous post we get a nice simple little program,

#!/usr/bin/perl

use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(max);

use Memoize;
memoize('collatz');

while (<>) {
    my ($n, $m) = split;
    printf("$n $m %d\n", max map { collatz($_) } ($n..$m));
}

sub collatz {
    my $n = shift;
    return 1 if ($n == 1);
    return 1 + collatz(3*$n+1) if ($n & 1);
    return 1 + collatz($n/2);
}

I really do love Perl.


Lisp Number Representations

This exercise partly comes from a couple different chapters in the book The Little Schemer. The book is an introduction to the Scheme programming language, a dialect of Lisp. The purpose to to teach basic programming concepts in a way that anyone can follow along just as well as someone with a degree in, say, computer science. It is still very useful for us programmer types because there are some good practice you get from reading and playing along.

First of all, Lisp is famous (infamous?) for lacking syntax. Any Lisp program is simply an S-expression, put simply, a list of lists. There is no operator precedence because operators are treated just like functions. This leads to prefix notation for mathematical expressions,

(+ 4 5)
=> 9

where the => indicates the result of evaluating the expression. We can apply as many operands as we want,

(+ 2 3 4 5 10)
=> 24

We can put another list right in there as an operand,

(+ 3 (* 2 5) 4)
=> 17

You get the idea. In a function, the value of the last expression is the return value. For example, here is the square function in Scheme, which squares its input,

(define (square x)
  (* x x))

Then we can use it,

(+ (square 2) (square 5))
=> 29

There are three important list operators to understand as well: car, cdr, and cons. car returns the first element in a list. In the example below, the ', a single quote, tells the interpreter or compiler that the list is to be treated as data and not to be executed. This is shorthand, or syntactic sugar, for the quote operator: (quote (stallman moglen)) is the same as '(stallman moglen).

(car '(stallman moglen lessig))
=> stallman

cdr returns the "rest" of a list (everything but the car of the list). When passing a list with only one element cdr returns the empty list: ().

(cdr '(stallman moglen lessig))
=> (moglen lessig)
(cdr '(stallman))
=> ()

We can ask if a list is empty or not with null?. #t and #f are true and false.

(null? '(stallman moglen lessig))
=> #f
(null? '())
=> #t

And finally, for lists, we have cons. This function allows us to build a list. It glues the first argument to the front of the list in the second argument,

(cons 'stallman '(moglen lessig))
=> (stallman moglen lessig)
(cons 'stallman '())
=> (stallman)

And one last function you need to know: eq?. It determines the two atoms are the same atom,

(eq? 'stallman 'moglen)
=> #f
(eq? 'stallman 'stallman)
=> #t

Now, for this exercise we will pretend that the basic arithmetic functions have not been defined for us. Instead all we have is add1 and sub1, each of which adds or subtracts 1 from its argument respectively.

(add1 5)
=> 6
(sub1 5)
=> 4

Oh, I almost forgot. We also have the zero? function defined for us, which tells us if its argument is 0 or not. Notice that functions that return true or false, called predicates, have a ? on the end.

(zero? 2)
=> #f
(zero? 0)
=> #t

To make things simple, these definitions will only consider positive numbers. We can define the + function (for only two arguments) in terms of the three basic functions shown above. It might be interesting to try to write this yourself before you look any further. (Hint: define it recursively!)

;; Adds together n and m
(define (+ n m)
  (if (zero? m) n
      (add1 (+ n (sub1 m)))))

If the second argument is 0 we are done and simply return the first argument. If not, we add 1 to n + (m - 1). The - function is defined similarly.

;; Subtracts m from n
(define (- n m)
  (if (zero? m) n
      (sub1 (- n (sub1 m)))))

Multiplication is the act of performing addition many times. We can go on defining it in terms of addition,

(define (* n m)
  (if (zero? m) 0
      (+ n (* n (sub1 m)))))

(We'll leave division as an exercise for the reader as it gets a little more complicated than I need to go in order to get my overall point across.)

We will leave math behind for a moment take a look at The Roots of Lisp. In that link is an excellent paper written by Paul Graham about John McCarthy, the inventor (or perhaps discoverer?) of Lisp, and how Lisp came to be. It turns out that in order to have a fully functional Lisp engine we only need seven primitive operators: operators defined outside of the language itself as building blocks for the language. For Lisp these seven operators are (Scheme-ized for our purposes): eq?, atom?, car, cdr, cons, quote, and if.

Notice how none of these are math operators. You may wonder how we can possibly perform mathematical operations when we lack these facilities. The answer: we have to define our own representation for numbers! Let's try this, define a number as a list of empty lists. So, the number 3 is,

'(() () ())

And here is 0, 2, and 4,

'()
'(() ())
'(() () () ())

See how that works? Before, when we wanted to define addition and subtraction, we needed three other functions: zero?, add1, and sub1. With our number representation, how could we define add1 with our seven primitive operators? Our numbers are defined as lists, so we can use our list operators. To add 1 to a number, we append another empty list. Hey, that sounds a lot like cons!

(define (add1 n)
  (cons '() n))

Subtraction is removing an element from the list, which sounds a lot like cdr,

(define (sub1 n)
  (cdr n))

And to define zero? we need to check for an empty list. Notice this will also be the definition for null?.

(define (zero? n)
  (eq? '() n))

And now we are back where we started. In fact, you can use the exact definitions above to define +, -, and *. Our entire method number representation depends on how we define add1, sub1, and zero?. Let's try it out,

;; 3 + 4
(+ '(() () ()) '(() () () ()))
=> (() () () () () () ())

;; 5 - 2
(- '(() () () () ()) '(() ()))
=> (() () ())

;; 2 * 2
(* '(() ()) '(() ()))
=> (() () () ())

;; 3 + 4 * 2   bolded for clarity
(+ (* '(() () () ()) '(() ())) '(() () ()))
=> (() () () () () () () () () () ())

Pretty cool, huh? We just added arithmetic (albeit extremely simple) to our basic Lisp engine. With some modifications we should be able to define and operate on negative integers and even define any rational number (limited by how much memory your computer's hardware can provide).

Now, thank goodness this isn't how real Lisp implementations actually handle numbers. It would be incredibly slow and impractical, not to mention annoying to read. Normally, numbers and math operators are primitive so that they are fast.


Rumors of My Death Have Been Greatly Exaggerated

It has been a few weeks since I made my last post. I just started my new job at Johns Hopkins University Applied Physics Laboratory, which has reminded my just how exhausting starting a new job can be. There is so much to learn and balance all at once, and when I get home from work I don't feel like doing anything constructive. This is why I haven't posted anything recently.

However, I have been working on the problems (25 completed so far) over at Project Euler. If you like programming and/or math, I strongly suggest you check it out. It would also be handy to be familiar with a language or system that natively supports (ruby, python, scheme, common lisp, bc, etc.) or transparently supports (perl with the bignum module) multi-precision math. Many of the "tricks" of the problems involve very large numbers, well beyond normal 32-bit or 64-bit integers.

I would also suggest you follow the "rules": any program you write should have a running time of less than a minute. All the problems can be solved with programs that don't need to run overnight. And don't search Google for answers until you have solved it yourself. If you do this, you will be missing out!

Anyway, hopefully I will get something interesting written up soon. I want to get back to my once a week schedule. And happy Pi Day and Talk Like a Physicist Day!


Simple Hash Table Correction

In December I made a post about a simple hash table I had written in C. I got an e-mail from Benoit Rouits ( http://brouits.free.fr/) pointing out a memory leak in ht_destroy() that he found with Valgrind. I had failed to free the main array in the hashtable when it was being destroyed,

free (hashtable->arr);

I added the fix and updated my e-mail address in the source tarball, which is in the same location as before. There are no version numbers involved, so I guess this is version "now" and the old one is version "then", with names subject to change.


Linear Spatial Filters with GNU Octave

Test Image Original Test Image Averaged Test Image Guassian Test Image Edge

I have gotten several e-mails lately about using GNU Octave. One specifically was about blurring images in Octave. In response, I am writing this in-depth post to cover spatial filters, and how to use them in GNU Octave (a free implementation of the Matlab programming language). This should be the sort of information you would find near the beginning of an introductory digital image processing textbook, but written out more simply. In the future, I will probably be writing a post covering non-linear spatial and/or frequency domain filters in Octave.

If you want to follow along in Octave, I strongly recommend that you upgrade to the new Octave 3.0. It is considered stable, but differs significantly from Octave 2.1, which many people may be used to. You will also need to install the image processing package from Octave-Forge. To get help with any Octave function, just type help <function>.

The most common linear spatial image filtering involves convolving a filter mask, sometimes called a convolution kernel, over an image, which is a two-dimensional matrix. In the case of an RGB color image, the image is actually composed of three two-dimensional grayscale images, each representing a single color, where each is convolved with the filter mask separately.

Convolution is sliding a mask over an image. The new value at the mask's position is the sum of the value of each element of the mask multiplied by the value of the image at that position. For an example, let's start with 1-dimensional convolution. Define a mask,

5 3 2 4 8

The 2 is the anchor for the mask. Define an image,

0 0 1 2 1 0 0

As we convolve, the mask will extend beyond the image at the edges. One way to handle this is to pad the image with 0's. We start by placing the mask at the left edge. (zero-padding is underlined)

Mask:   5 3 2 4 8
Image:  0 0 0 0 1 2 1 0 0

The first output value is 8, as every other element of the mask is multiplied by zero.

Output: 8 x x x x x x

Now, slide the mask over by one position,

Mask:   5 3 2 4 8
Image:  0 0 0 1 2 1 0 0

The output here is 20, because 8*2 + 4*1 = 20;

Output: 8 20 x x x x x

If we continue sliding the mask along, the output becomes,

Output: 8 20 18 11 13 13 5

Here is the correlation done in Octave interactively, (filter2() is the correlation function).

octave> filter2([5 3 2 4 8], [0 0 1 2 1 0 0])
ans =

    8   20   18   11   13   13    5

The same thing happens in two-dimensional convolution, with the mask moving in the vertical direction as well, so that each element in the image is covered.

Convolution Drawing

Sometimes you will hear this described as correlation (Octave's filter2) or convolution (Octave's conv2). The only difference between these operations is that in convolution the filter masked is rotated 180 degrees. Whoop-dee-doo. Most of the time your filter is probably symmetrical anyway. So, don't worry much about the difference between these two. Especially in Octave, where rotating a matrix is easy (see rot90()).

Now that we know convolution, let's introduce the sample image we will be using. I carefully put this together in Inkscape, which should give us a nice scalable test image. When converting to a raster format, there is a bit of unwanted anti-aliasing going on (couldn't find a way to turn that off), but it is minimal.

Test Image Original

Save that image (the PNG file, not the linked SVG file) where you can get to it in Octave. Now, let's load the image into Octave using imread().

m = imread("image-test.png");

The image is a grayscale image, so it has only one layer. The size of m should be 300x300. You can check this like so (note the lack of semicolon so we can see the output),

size(m)

You can view the image stored in m with imshow. It doesn't care about the image dimensions or size, so until you resize the plot window, it will probably be stretched.

imshow(m);

Now, let's make an extremely simple 5x5 filter mask.

f = ones(5) * 1/25

Octave will show us what this matrix looks like.

f =

   0.040000   0.040000   0.040000   0.040000   0.040000
   0.040000   0.040000   0.040000   0.040000   0.040000
   0.040000   0.040000   0.040000   0.040000   0.040000
   0.040000   0.040000   0.040000   0.040000   0.040000
   0.040000   0.040000   0.040000   0.040000   0.040000

This filter mask is called an averaging filter. It simply averages all the pixels around the image (think about how this works out in the convolution). The effect will be to blur the image. It is important to note here that the sum of the elements is 1 (or 100% if you are thinking of averages). You can check it like so,

sum(f(:))

Now, to convolve the image with the filter mask using filter2().

ave_m = filter2(f, m);

You can view the filtered image again with imshow() except that we need to first convert the image matrix to a matrix of 8-bit unsigned integers. It is kind of annoying that we need this, but this is the way it is as of this writing.

ave_m = uint8(ave_m);
imshow(ave_m);

Or, we can save this image to a file using imwrite(). Just like with imshow(), you will first need to convert the image to uint8.

imwrite("averaged.png", ave_m);

Test Image Averaged

There are a few things to notice about this image. First there is a black border around the outside of the filtered image. This is due to the zero-padding (black border) done by filter2(). The border of the image had 0's averaged into them. Second, some parts of the blurred image are "noisy". Here are some selected parts at 4x zoom.

Average 4x Zoom

Notice how the circle, and the "a" seem a little bit boxy? This is due to the shape of our filter. Also notice that the blurring isn't as smooth as it could be. This is because the filter itself isn't very smooth. We'll fix both these problems with a new filter later.

First, here is how we can fix the border problem: we pad the image with itself. Octave provides us three easy ways to do this. The first is replicate padding: the padding outside the image is the same as the nearest border pixel in the image. Circular padding: the padding from from the opposite side of the image, as if it was wrapped. This would be a good choice for a periodic image. Last, and probably the most useful is symmetric: the padding is a mirror reflection of the image itself.

To apply symmetric padding, we use the padarray() function. We only want to pad the image by the amount that the mask will "hang off". Let's pad the original image for a 9x9 filter, which will hang off by 4 pixels each way,

mpad = padarray(m, [4 4], "symmetric");

Next, we will replace the averaging filter with a 2D Gaussian distribution. The Gaussian, or normal, distribution has many wonderful and useful properties (as a statistics professor I had once said, anyone who considers themselves to be educated should know about the normal distribution). One property that makes it useful is that if we integrate the Gaussian distribution from minus infinity to infinity, the result is 1. The easiest way to get the curve without having to type in the equation is using fspecial(): a special function for creating image filters.

f_gauss = fspecial("gaussian", 9, 2);

This creates a 9x9 Gaussian filter with variance 2. The variance controls the effective size of the filter. Increasing the size of the filter from 9 to 99 will actually have virtually no impact on the final result. It just needs to be large enough to cover the curve. Six times the variance covers over 99% of the curve, so for a variance of 2, a filter of size 7x7 (always make your filters odd in size) is plenty. A larger filter means a longer convolution time. Here is what the 9x9 filter looks like,

Gaussian 2D

And to filter with the Gaussian,

gauss_m = filter2(f_gauss, mpad, "valid";
gauss_m = uint8(guass_m);

Notice the extra argument "valid"? Since we padded the image before filtering, we don't want this padding to be part of the image result. filter2() normally returns an image of the same size as the input image, but we only want the part that didn't undergo (additional) zero-padding. The result is now the same size as the original image, but without the messy border,

Test Image Gaussian

Also, compare the result to the average filter above. See how much smoother this image is? If you are interested in blurring an image, you will generally want to go with a Gaussian filter like this.

Now I will let you in on a little shortcut. In Matlab, there is a function called imfilter which does the padding and filtering in one step. As of this writing, the Octave-Forge image package doesn't officially include this function, but it is there in the source repository now, meaning that it will probably appear in the next version of that package. I actually wrote my own before I found this one. You can grab the official one here: imfilter.m

With this new function, we can filter with the Gaussian and save like this. Notice the flipping of the first two arguments from filter2, as well as the lack of converting to uint8.

gauss_m = imfilter(m, f, "symmetric");
imwrite("gauss.png", gauss_m);

imfilter() will also handle the 3-layer color images seamlessly. Without it, you would need to run filter2() on each layer separately.

So that is just about all there is. fspecial() has many more filters available including motion blur, unsharp, and edge detection. For example, the Sobel edge detector,

octave:25> fspecial("sobel")
ans =

   1   2   1
   0   0   0
  -1  -2  -1

It is good at detecting edges in one direction. We can rotate this each way to detect edges all over the image.

mf = uint8(zeros(size(m)));
for i = 0:3
  mf += imfilter(m, rot90(fspecial("sobel"), i));
end
imshow(mf)

Test Image Edges

Happy Hacking with Octave!


The New Index

You may have noticed the new index available on the left of the page. It really needs one, because looking at older posts required paging through the archives. Even more so because I have yet to add search functionality. There were Blosxom plug-ins available to do these things, but they are long gone now. I looked into getting a Google search box dropped in there, but it requires a big advertisement to go around the box (and Javascript... bleh!). I'll figure out something sometime.

Blosxom doesn't have a fancy database behind it, but rather plain flat text files. I imagine that this doesn't scale well, but I haven't looked that carefully into Blosxom's code. I should never be having problems like that anyway as I doubt I will ever surpass even 1000 posts, nor will there ever be some huge flood of users. What I do get is simplicity: to make a new post, I just write a text file and drop it on the server. It also made writing the index code very simple.

Without further ado, here is the code (under the same license as Blosxom): blog-index.perl. As you can see, it doesn't integrate with the rest of the blog; it only produces an index. I need function, not form, so this is fine for me. You can use it on your own Blosxom setup, just as long as you set the parameters at the top of the file correctly.

The index is created dynamically, but only when it needs to be updated. It first checks the date of the pre-generated index against the date of the latest entry. If the entry is newer, the index is rewritten. In both cases, this index is then served up efficiently straight from a plain XHTML file. All the proper file locking is in place (I think), so if there was a flood of requests when the index needs to be rewritten, no two requests are trying to re-write the index at the same time.

So nothing terribly exciting this week, but I hope someone finds this code useful somewhere.


My Team Won the Robot Competition

Introduction:
This "news" is over two months old, simply because I had other more interesting things to write about first. Not that I am out of ideas: I have at least three more ideas lined up at the moment on top of several half-written entries that may never see the light of day. I just want to get it out of the way.

In December we held the robot competition, pitting against each other the robots that we spent the semester building. It was a double-elimination bracket with five teams. Teams competed by arranging the maze (within the rules) and deciding the initial position for their opponents. The robots do not get to know about the maze or where they are starting; they must figure this out on their own by exploring the maze.

To recap, there was an 8'x8' game area containing a 4'x8' maze of 1-square-foot cells. On the floor of the game area was a grid of white lines on black, where the white lines were about 7-inches apart. The robot started at an unknown position and orientation in the maze, which was also set up with a configuration unknown to the robot. In the non-maze open area, three small wooden blocks were placed at the intersection of white lines, with a steel washer attached to the top of each block.

Game Board

In short, the robot had to move all three blocks to the repository, a pre-programmed position in the maze.

At the end of the semester, our team's robot was the only one that could successfully complete this task. The other teams needed to play in a degraded mode: known maze configuration, known starting position, known block positions. The loser bracket played this degraded version of the game. Because of this, our team was able to sweep the tournament with a perfect run. All the robot had to do was successfully run the full game. The competition, not being able to do this, automatically lost.

The robots were mostly the same, except for one team who had a robot with 4 multi-direction wheels. Every other team made a "scooter bot" type of robot: two powered wheels (with casters for balance) and chassis with three levels. The first real separation of design was when it came to picking up blocks. Each team initially had a different idea. One team was going to build a pulley system to lift the blocks. Another was going to use sweeping arms to sweep in the block. Another was going to used a stationary magnet.

Our team went with a rotating wheel in front with magnets along the outside (see images below). Once a block was found, the robot would rotate a magnet over the block, then rotate the attached block out of the way. In the end, four of the five teams ended up using this design for their own robots (the last team stuck with the stationary magnet).

These pictures were taken about a month before the competition. The wiring job was still a bit sloppy and the front magnet wheel lacks tiny magnets attached to the outside. Other than that, this is what our final robot looked like. In that last month, we attached the magnets, cleaned up the wiring, and made a whole bunch of code improvements making the robot more robust.

Robot Image 1 Robot Image 2 Robot Image 3 Robot Image 4 Robot Image 5 Robot Image 6 Robot Image 7 Robot Image 8 Robot Image 9 Robot Image 10 Robot Image 11 Robot Image 12 Robot Image 13 Robot Image 14 Robot Image 15 Robot Image 16 Robot Image 17 Robot Image 18 Robot Image 19 Robot Image 20 Robot Image 21

I will now attempt to describe some of the things you see in these images.

On the bottom of the robot you can see two casters for balancing the robot (big clunky things). You can see an IR sensor, which is pointing at the blue surface attached to the other side of the robot. This was the block detection sensor, a home-made break-beam sensor. And finally, you can see three LED lights on top of a long circuit board. This is a line tracker, with three sensors that can see the white grid on the bottom of the game board. The line tracker is how the robot navigated the open area of the board. It went back-and-forth looking for blocks, using the line tracker to stay on the line.

Also attached to this bottom layer are the powered wheels, with blue rubbers for traction, and their wheel encoders. There are spokes on the inside of the wheels (encoder disks), and the wheel encoders send a signal to the micro-controller each time it sees a spoke. The software counts the number of spokes that passed, allowing the robot to keep track of how far that wheel has turned. This information is combined with IR distance sensors to give it a very accurate idea of its position.

On top of the bottom black layer, you can see four distance IR sensors for tracking walls in the maze. They checked to make sure the robot was going straight (that's why there are two on each side), as well as map out the maze as it travels long. Hanging down from the bottom of the red layer is another IR sensor facing forward, looking at walls in front of the robot. Mounted on the front is the block retrieval device (lacking magnets at this point).

On top of the red layer are two (empty) battery packs, which holds 9 AA rechargeable NiMH batteries. This actually makes two separate power systems: a 4-pack for motors and a 5-pack for logic (micro-controller et al). In the circuit, the motors, containing coils of wire, behave like inductors, which could cause harmful voltage spikes to the logic. Separate power systems help prevent damage.

On top is the micro-controller and all of the important connections. The vertical board contains the voltage regulator and "power strip" where all of the sensors are attached. It also contains the start button, which was connected to an interrupt in the micro-controller. The micro-controller had its own restart button, but once the system started up, initialized, and self-calibrated, it waited for a signal from the start button to get things going.

I was about to post this when I was reminded by my fiancee that she took pictures at the end-of-semester presentation, after the competition. Here are some images of the robot after it was completely finished. Yes, that is a little face fastened to the front.

Final Robot Image Final Robot Image Table of Robots More Robots Even More Robots Competition Poster My Team (Team #1)

If you are ever at Penn State and are visiting the IST building, you can see the robot. Because the robot won the competition, it is on display and will be for years to come. You can recognize it by its face.

I have made the final robot code available here: final-robot-code.zip. I was the software guy, handling pretty much all the code, so everything here, except interupt_document.c was written by me. It's probably not very useful as code, except for reading and learning how our robot worked. There are a few neat hacks in there, though, which I may discuss as posts here. It's not noted in the code itself, nor in the zip file, but I'll make this available under my favorite 2-clause BSD license.


The 3n + 1 Conjecture

The 3n + 1 conjecture, also known as the Collatz conjecture, is based around this recursive function,

Collatz function LaTeX screenshot

The conjecture is this,

This process will eventually reach the number 1, regardless of which positive integer is chosen initially.

The way I am defining this may not be entirely accurate, as I took a shortcut to make it a bit simpler. I am not a mathematician (IANAM) -- but sometimes I pretend to be one. For a really solid definition, click through to the Wikipedia article in the link above.

A sample run, starting at 7, would look like this: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. The sequence starting at 7 contains 17 numbers. So 7 has a cycle-length of 17. Currently, there is no known positive integer that does not eventually lead to 1. If the conjecture is true, then none exists to be found.

I first found out about the problem when I saw it on UVa Online Judge. UVa Online Judge is a system that has a couple thousand programming problems to do. Users can submit solution programs written in C, C++, Java, or Pascal. For normal submissions, the fastest program wins.

Anyway, the way UVa Online Judge runs this problem is by providing the solution program pairs of integers on stdin as text. The integers define an inclusive range of integers over which the program must return the length of the longest Collatz cycle-length for all the integers inside that range. They don't tell you which ranges they are checking, except that all integers will be less than 1,000,000 and the sequences will never overflow a 32-bit integer (allowing shortcuts to be made to increase performance).

The simple approach would be defining a function that returns the cycle length (Lua programming language),

function collatz_len (n)
   local c = 1
   
   while n > 1 do
      c = c + 1
      if math.mod(n, 2) == 0 then
	 n = n / 2
      else
	 n = 3 * n + 1
      end
   end
   
   return c
end

Then we have a function check over a range (assuming n <= m here),

function check_range (n, m)
   local largest = 0
   
   for i = n, m do
      local len = collatz_len (i)

      if len > largest then
	 largest = len
      end
      
   end
   
   return largest
end

And top it off with the i/o. (I am just learning Lua, so I hope I did this part properly!)

while not io.stdin.eof do
   n, m = io.stdin:read("*number", "*number")
   
   -- check for eof
   if n == nil or m == nil then
      break
   end
   
   print (n .. " " .. m .. " " .. check_range(n, m))
end

Notice anything extremely inefficient? We are doing the same work over and over again! Take, for example, this range: 7, 22. When we start with 7, we get the sequence shown above: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Eight of these numbers are part of the range that we are looking at. When we get up to 22, we are going to walk down the same range again, less the 7. To make things more efficient, we apply some dynamic programming and store previous calculated cycle-lengths in an array. Once we get to a value we already calculated, we just look it up.

I used dynamic programming in my submission, which I wrote up in C. You can grab my source here. It fills in a large array (1000000 entries) as values are found, so no cycle-length is calculated twice. When I submitted this program, it ranked 60 out of about 300,000 entries. There are probably a number of tweaks that can increase performance, such as increasing the size of the array, but I didn't care much about inching closer to the top. I would bet that the very top entries did some trial-and-error and determined what ranges are tested, using the results to seed their program accordingly. You could take my code and submit it yourself, but that wouldn't be very honest, would it?

So why am I going through all of this describing such a simple problem? Well, it is because of this neat feature of Lua that applies well to this problem. Lua is kind of like Lisp. In Lisp, everything is a list ("list processing" --> Lisp). In Lua, (almost) everything is an associative array (Maybe they should have called it Assp? Or Hashp? I am kidding.) An object is a hash with fields containing function references. There is even some syntactic sugar to help this along.

The cool thing is that we can create a hash with default entries that reference a function that calculates the Collatz cycle-length of its key. Once the cycle-length is calculated, the function reference is replaced with the value, so the function is never called again from that point. The function only actually determines the next integer, then references the hash to get the cycle-length of that next integer.

Now this hash looks like it is infinitely large. This is really a form of lazy evaluation: no values are calculated until they are needed (this is one of my favorite things about Haskell). We don't need to explicitly ask for it to be calculated, either. We just go along looking up values in the array as if they were always there. Here is how you do it,

collatz_len = { 1 }

setmetatable (collatz_len, {
   __index = function (name, n)
      if (math.mod (n, 2) == 0) then
         name[n] = name[n/2] + 1;
      else
         name[n] = name[3 * n + 1] + 1;
      end
         return name[n] 
   end
})

So we replace the collatz_len function with this array (and replace the call to an array reference) and we have applied dynamic programming to our old program. If I run the two programs with this sample input,

10 1000
1000 3000
300 500

and look at average running times, the dynamic programming version runs 87% faster than the original.

One problem with this, though, is the use of recursion. In Lua, it is really easy to hit recursion limits. For example, accessing element 10000 will cause the program to crash. This will probably get fixed someday, or in some implementation of Lua.

I thought there might be a way to do this in Perl, by changing the default hash value from undef to something else, but I was mildly disappointed to find out that this is not true.

Here is the source for the original program and the one with dynamic programming (BSD licenced): collatz_simple.lua and collatz.lua


Optimizing, Multi-threading Brainfuck to C Converter

wbf2c converts an esoteric programming language called brainfuck into C code which can be machine compiled. Several optimizations are done to make the resulting program extremely fast an efficient. The converter supports both a static (standard 30,000 cells) array or a dynamically-sized array. It also supports many different cell types, from the standard char to multi-precision cells using GMP.

The converter can also run several brainfuck programs on the same memory array at once by running each one in a thread. To make sure each brainfuck operation is atomic, each cell gets a mutex lock. The only other multi-threading brainfuck implementation I know of is Brainfork.

For an example of some brainfuck code I wrote,

+>+<
[
[->>+>+<<<]>>>
[-<<<+>>>]<<

[->+>+<<]>>
[-<<+>>]<<
]

This program fills the memory with the Fibonacci series. Make sure you use the dynamically sized array, along with the bignum cell type. After two or three seconds of running, my laptop (unmodified Dell Inspiron 1000) can calculate and spit out a 140MB text file containing the first 50,000 numbers in the series. I used the -d dump option to see this output.

Download information, as well as some more examples, including a multi-threaded one, are on the project website.


Some News

Sorry for the lack of new material. I was apartment hunting with my fiancee (the most beautiful and wonderful girl in the world) this past weekend in the Baltimore area. I found a nice place in Columbia. Tomorrow morning I am getting some oral surgery done and will be hospitalized for awhile. So nothing new here for possibly a couple of weeks.

However, once I have recovered, I will probably make a post about Lua, another about a robot game I wrote this past summer, and another with pictures of my tournament-winning robot from last semester.

If you are reading this post months from now and there has been no new material, I guess you can assume something went bad with the surgery (unlikely). According to this book I have called Scam Proof Your Life (from the AARP), preventable medical errors in US hospitals results in 195,000 deaths. I am not worried, however.


Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.