Gmail has a nice feature: when delivering e-mail, everything including
and after a + in a Gmail address is ignored. For example,
mail arriving at all of these addresses would go to the same place if
they were Gmail addresses,
Thanks to this feature, when a user acquires a Gmail account, Google
is actually providing about a googol (as in the number
10100) different e-mail addresses to that user! Quite
appropriate, really.
I have seen other mailers do similar things, like ignoring everything
after dashes. A nice advantage to this is when registering at a new
website I can customize my e-mail address for them by, say, throwing
the website name in it. Because I have a google of e-mail addresses
available, it is impossible to run out, so I can give every person I
meet their own version of my address. The custom address can come in
handy for sorting and filtering, and it will also tell me who is
selling out my e-mail address. This, of course, assumes that someone
isn't stripping out the extra text in my address to counter the Gmail
feature.
However, in my personal experience, most websites will not permit
+'s in addresses. This is completely ridiculous, because
it means that virtually every website will incorrectly invalidate
perfectly valid e-mail addresses. Even major websites, like
coca-cola.com, screw this up. They see the + in
the address and give up.
In fact, if I do a Google search for "email validation regex" right
now, 9 of the first 10 results return websites with regular
expressions that are complete garbage and will toss out many common,
valid addresses. The only useful result was at the fifth spot (linked
below).
For the love of Stallman's beard, stop writing your own e-mail
address validators!
Why shouldn't you even bother writing your own? Because the proper
Perl regular expression for RFC822 is over 6
kilobytes in length! Follow that link and look at that. This is
the smallest regular expression you would need to get it right.
If you really insist on having a nice short one and don't want
to use a validation library, which, again, is a stupid idea and you
should be using a library, then use the dumbest, most liberal
expression you can. (Just don't forget the security issues.) Like
this,
.+@.+
Seriously, if you add anything else you most almost surely make it
incorrectly reject valid addresses. Note that e-mail addresses can
contain spaces, and even more than one @! These are
valid addresses,
"John Doe"@example.com
"@TheBeach"@example.com
I have not yet found a website that will accept either of these, even
though both are completely valid addresses. Even MS Outlook, which I
use at work (allowing me to verify this), will refuse to send e-mail
to these addresses (Gmail accepts it just fine). Hmmm... maybe having
an address like these is a good anti-spam measure!
So if your e-mail address is "John Doe"@example.com no
one using Outlook can send you e-mail, which sounds like a feature to
me, really.
So, everyone, please stop writing e-mail validation regular
expressions. The work has been done, and you will only get it wrong,
guaranteed.
I would like to share one of my Dungeons and Dragons stories. It won't
be anywhere as interesting as the two stories I mentioned, but I have
gotten some laughs out of people I tell it to.
Back in my college days, a friend was running a 3.5 campaign and asked
me to join in. The other players had rolled up their characters and
done a couple practice sessions before I joined, so I was jumping into
a group that had already had their start.
It was an evil campaign. Everyone in the group was either evil or
neutral. We were serving some kind of evil sorcerer. I felt that being
evil allowed much more variety of play and usually gave us more
freedom in the game world than being good. If things started to get
dry, someone would do something stupid like burn down the nearest town
inn. The lawful evil characters weren't so happy about this, though.
I rolled up a chaotic evil half-orc barbarian named Gnohkk. I was
lucky on my rolls and had some wonderful stats, so he was very big and
strong. Gnohkk started with a two-handed axe that contained a magic
fire gem in its head. After some training he was able to cause a fiery
explosion at will centered at the location of the fire gem. If an
untrained character wielded the axe, upon striking anything he would
have to make a saving throw to prevent an explosion.
Because Gnohkk would also suffer from the explosion, this wasn't
terribly useful at first, except against targets particularly
vulnerable to fire, like trolls and magic trees. The training was
mainly to make the axe usable.
At some point during the campaign the party got a hold of a ring of
fire resistance, which was then used by Gnohkk. With this ring he was
invulnerable to his own explosions, making the axe very powerful. The
ring also provided a handful of simple fire spells that made combat
more interesting.
Sadly, not long after the ring was found the fire gem axe chipped
broke. Before discarding the broken axe, Gnohkk carefully removed the
fire gem for safe keeping.
After an intense battle against some type of monsters with big horns,
Gnohkk found himself a nice metal helmet. He also managed to scrounge
up 10 horns from the monster corpses, which gave him a great
idea. When the party got to the next town, Gnohkk visited the
blacksmith, and requested to have him affix the 10 horns around the
outside of the helmet. In the front of the helmet he wanted to have a
socket created and the fire gem embedded inside it. The blacksmith
said it would take a couple of days.
I was hoping this giant helmet would make Gnohkk more intimidating. So
far in the campaign, every time he had attempted to intimidate an NPC
he failed his rolls and came off looking stupid instead.
So, this was pretty cool: a giant horned helmet with a gem in the
front. With the helmet on, Gnohkk could explode into a ball of flames
at will and without warning. What a wonderful toy for a chaotic evil!
When Gnohkk returned to the blacksmith a couple days later he found
the shop was destroyed. Some of the walls remained, the roof was gone,
and weapons and tools were thrown all over the area. The blacksmith's
charred body had been dead for some time. However, the helmet
was completed and basically intact. Gnohkk shrugged, grabbed
the helmet and left, all without having to pay a single copper to the
now dead blacksmith.
We figured out later what happened (the GM probably just told
us). Gnohkk failed to tell the blacksmith about the properties of the
gem. When the blacksmith tapped the gem into the socket, the fire gem
activated and exploded, killing him and destroying his shop.
Some time ago I was watching through the entire series of Deep Space
9. It was a Star Trek television show about a space station that
rests next a
wormhole that connects to the other side of the galaxy (The Delta
quadrant).
The Delta quadrant is ruled by a group called the Dominion, and they
are looking to conquer the Federation side of the galaxy (the Alpha
quadrant). At one point during the series, the Federation needs to
temporarily disable the wormhole to prevent Dominion ships from
crossing through. They do this by
mining the wormhole with identical, cloaked, self-replicating
mines.
If a mine is destroyed, the neighboring mines will replicate a
replacement. The minefield repairs itself. This makes removing the
minefield within a reasonable amount of time difficult to
impossible. If even a single mine is left behind, it can replicate the
entire minefield again.
The most interesting question here is this:
When the Federation returns and wants to remove the minefield, how
would they do it? What would stop the Dominion from doing the same
thing?
The first thing that comes to mind is having a kill signal, but what
would this signal be? It could simply be a plain "kill" command, but
the Dominion could also broadcast such a signal to disable the
minefield. Consider that the Dominion could capture a single mine and
study everything about its workings. The minefield itself could
therefore hold no secrets whatsoever. This leaves out any possibility
of a secret kill command stored in the mines.
Here's what I would do, assuming that humans or aliens have not yet
discovered some giant breakthrough in factoring in the Star Trek
universe. I would randomly generate two very large prime
numbers. Today, two 1024-bit primes should be more than enough, but in
350 years even larger numbers would probably be necessary. Then, I
multiply these two number together and store this number in the mine
software. To disable the minefield, I simply broadcast these two
numbers into the minefield. The mines would be programmed to take the
product of any pairs of numbers it receives. If the product matches
the internal number, the mine shuts down.
Voila! A method for shutting down the minefield. The enemy can know
everything about every single mine's construction, including the
software and data stored on every mine, but will be unable to disable
the minefield without factoring a very large composite number, which
would presumably be difficult or impossible (within a reasonable
amount of time).
Another possibility would be using a hash. Come up with a strong
passphrase, then use a hashing algorithm like SHA-1 or MD5, or
whatever is available and appropriate in 350 years, to hash the
passphrase. Store the hash in the mines. When you want to disable the
minefield, broadcast the passphrase. These mines will hash the
broadcast and compare it to the stored hash. It's really the same
solution as before: a one-way function. This is also similar to how
passwords are stored inside a computer today.
If we wanted more commands, like "don't blow up any ships for awhile"
or "increase minefield density", we could generate more composites
corresponding to each command. However, once a command is issued, the
secret -- the two prime numbers -- is out, and it cannot be used
again. In this case, I would go into the realm of public key
cryptography.
I would issue a command, along with a timestamp, and maybe even a
nonce that could double as a global identifier for the command, and
sign the whole deal using my private key. On each mine I would store
the public key. When a command is received, the mines would check the
signature before executing the command. I could then issue repeat
commands, as the timestamps would change each time. An adversary
learns nothing when a command is issued, because the time stamps would
make any replay attacks useless.
Minefields just like this exist today all over the Internet, as botnets. Thousands of
computers all around the world become infected with malware and come
under the control of a single individual or group. Individual machines
in the botnet could be taken out, but removing the entire botnet is
difficult as it grows and repairs itself. Any security researcher
could disassemble the botnet malware and learn anything about it, so
the malware can store no secrets. How does a malicious person control
the botnet, then, without someone else taking control? Public key
cryptography, just as described above.
Patience. NetHack is all about patience. Never be too hasty, as the
game is extremely unforgiving. If you let your guard down for just a
few turns, you can easily lose everything. Death is permanent
(with
some exceptions), so dying means starting it all over with a new
character.
I got into NetHack a couple of
years ago. I play it in cycles, playing heavily for a couple of months
at a time, then take a break for a couple months after I tire of
getting stuck at the same point each game. I generally return a better
player. The game is very complex, taking probably a hundred hours of
play just to nail down the basic gameplay and techniques.
This is what my typical session looks like. I like it almost as simple
as it comes. By default, color is off, but I like to have it on (a lot
more information is available with color). The graphics may seem
crappy, but
it is said,
While the graphics may seem primitive by today's standards,
today's gameplay seems primitive by NetHack standards.
Like most rogue-like games, playing NetHack is a rewarding experience,
so if you never tried it out before, I suggest taking a look at
the NetHack
Guidebook and firing it up for yourself! If you don't want to
install it, but you have a telnet client available, you can connect to
the nethack.alt.org server
to play and watch others play, which is a good way to learn
more. Building it from source can be a little tricky (took me a little
while to figure it out the first time), so using your package manager
is advised if you do choose to install it to try it out.
Another good resource is the NetHack
wiki Wikihack. However, if you
don't want to "spoil" yourself, don't go there. Learning things about
the game outside of the game is not considered cheating, but rather
spoiling. I don't mind spoiling myself. I will run into plenty of
things I have not yet spoiled myself on. The game is hard enough as it
is!
There is a theory that if someone ever completely beats NetHack under
the hardest conditions, the source code will be thrown out and
replaced with something even harder and more unforgiving. There is
another theory which states that this has already happened.
When play for the first time, start with a Valkyrie or a Barbarian as
these classes are pretty tough against baddies, simpler to play, and
more foolproof than some of the squishier classes, like wizards.
WARNING: SPOILERS FOLLOW
To give an idea of some of the interesting gameplay, I have a couple
examples.
I am the @ symbol, which is how NetHack represents the
player. I am entering the room with two
awakened nymphs
(n) and a
yellow light
(y). Behind me is my cat who fights alongside me and
assists me. Nymphs do not do damage, but rather steal your
equipment. They approach you, charm you into giving them something
from your inventory, which includes weapons and armor, and teleport
away. This means that you generally want to kill them before they
close in.
Yellow lights also do no damage. They run up to you and explode,
causing you to be blind for a number of turns. Normally, this can be a
really bad situation if you do not have the right tools available to
you, which is typical for early in the game. In the worst case, which
also happens to be the most likely, while I work on the nymphs the
yellow light will run up to be blinding me. Now, since I cannot see
the nymphs, I cannot attack them at range and they run up to me each
stealing one item. If I remain blind for long enough, they may find me
again and steal something else while I am helpless. They would rob me
blind! It would be up to my cat to dispatch them.
However, I did have some good tools available. I happened to get
extremely lucky
(the Random Number God
was nice to me that day) and find both
a cloak
of displacement and
jumping
boots at a store at dungeon level 1. The cloak makes me appear to
be in a different place than I really am while the jumping boots let
me travel several squares in a single turn. I was also playing a
rogue, so my main attack was already a ranged one: throwing daggers.
The yellow light ran up to my displaced image and exploded, causing no
blindness to me. One down. Next, I threw my plentiful supply of
daggers at the nymphs, keeping my distance by jumping around the
room. I managed to prevent a deadly situation, being stuck naked with
no weapon, by applying my resources well. With NetHack, I had plenty
of time to plan out each move I made. There is no timer. NetHack moves
at my pace.
There was a another situation a couple months ago (which I will not
bother drawing) where I was attacked in the middle of a room by a
werewolf. The
werewolf summoned help immediately and I was surrounded by winter
wolves and coyotes and such. Winter wolves can be dangerous because
they shoot deadly frost bolts, and I had not yet have the
cold
resistance intrinsic.
I cut a hole through the canines towards the hallway where I could
fight them one at a time, and I lost most of my health in the
process. Then
I prayed to recover
my health. Then I starting hacking away at the canines. Again, I was
low on health and running out of options. The winter wolves were
firing bolts down the hallway at
me. Engraving Elbereth
on the floor was not going to save me from the ranged frost
bolts. Things were looking quite grim for me. The end seemed near.
I did the only thing I could think to do at this point: I reached down
and took a bite into one of the dead frost wolves that was resting on
the floor beneath me.
Take a moment and picture this. Frost bolts are zipping down a dark
cold hallway at a nearly dead Valkyrie. There are shattered potions
and frozen liquids all over the floor. The winter wolves have caused
the temperature to plummet. Her breath hangs visibly in the air. She
quickly crouches down and shoves her face into a dead wolf, taking a
nasty bite right into its furry, bloodied side. Between her clenched
teeth she tears off a piece dripping with blood. Desperately, she
stuffs her face with the fresh kill, trying to eat as much as fast as
she can as she dodges more frost bolts.
At this moment, the Random Number God blessed me with good fortune,
and I was granted the cold resistance intrinsic just before the next
frost bolt was going to kill me, allowing me to harmlessly absorb it
and continue hacking away the remaining canines. Her reward for
victory was a canine feast!
So, yes, the simple looking NetHack can be quite exciting. :-)
At work they recently moved me to a new project. It is a Matlab-based
data analysis thing. I haven't really touched Matlab in over a year
(the last time I used Matlab at work), and, instead, use GNU Octave at
home when the language is appropriate. I got so used to Octave that I
found a pretty critical feature missing from Matlab's implementation:
treat an expression as if it were of the type of its output.
Let's say we want to index into the result of a function. Take, for
example, the magic square function, magic(). This spits
out a
magic square
of the given size. In Octave we can generate a 4x4 magic square and
chop out the middle 2x2 portion in one line.
octave> magic(4)(2:3,2:3)
ans =
11 10
7 6
Or more possibly clearly,
octave> [magic(4)](2:3,2:3)
ans =
11 10
7 6
Try this in Matlab and you will get a big, fat error. You have to
assign the magic square to a temporary variable to do the same
thing. I kept trying to do this sort of thing in Matlab and was
thinking to myself, "I know I can do this somehow!". Nope, I
was just used to having Octave.
Where this really shows is when you want to reshape a matrix into a
nice, simple vector. If you have a matrix M and want to
count the number of NaN's it has, you can't just apply
the sum() function over isnan() because it
only does sums of columns. You can get around this with a special
index, (:).
So, to sum all elements in M directly,
octave> sum(M(:))
In Octave, to count NaN's with isnan(),
octave> sum(isnan(M)(:))
Again, Matlab won't let you index the result of isnan()
directly. Stupid. I guess the Matlab way to do this is to
apply sum() twice.
Every language I can think of handles this properly. C, C++, Perl,
Ruby, etc. It is strange that Matlab itself doesn't have it. Score one
more for Octave.
Stream
ciphers are one of the two types
of
symmetric key algorithms, which is when the same key is used for
encryption and decryption. They follow this simple concept: take a
good pseudo-random number generator and combine, usually by XOR, its
output with your plaintext stream. This is very similiar to
the
one-time pad, but the random pad is pseudo-random rather than
truly random. The key is the seed (or part of one) for
the PRNG.
Probably the most well known stream cipher is the extremely simple,
yet cryptographically
strong, Arcfour
algorithm. The official name is actually RC4, which comes from RSA
Security where it was developed. It was a trade secret until someone
anonymously leaked the algorithm to the public. The name RC4 is still
trademarked, though, so Arcfour is generally used instead, meaning
"Alleged RC4" (alleged because RSA Security never confirmed the
algorithm as being RC4). You have almost certainly used the
cipher yourself, because it is used in applications such as WEP and
SSL.
The algorithm has two parts: the key schedule algorithm and
pseudo-random generation algorithm. The key schedule uses the key,
and possible a non-secret initialization vector, to set up the state
of the PRNG. The state is an array of length 256 holding all of the
values from 0 to 255 in some order, along with two integers
(initialized to 0 after the key schedule). The algorithm looks like
this,
for i from 0 to 255
S[i] := i
endfor
j := 0
for i from 0 to 255
j := (j + S[i] + key[i mod keylength]) mod 256
swap(S[i],S[j])
endfor
The PRNG deals with one byte at a time, emitting a stream of bytes,
i := 0
j := 0
while GeneratingOutput:
i := (i + 1) mod 256
j := (j + S[i]) mod 256
swap(S[i],S[j])
output S[(S[i] + S[j]) mod 256]
endwhile
If you implement this in C and use the char type, you can
toss the modulus parts because they will just work automatically.
Now you just XOR your message with the output of the PRNG. The
Wikipedia article
probably explains it better than I can, so check it out if you still
don't follow.
Now, Arcfour has some flaws. Specifically, the algorithm itself
doesn't specify how an initialization vector is applied, which is
important. Using a plan key twice is bad it allows an adversary to get
information easily. For example, Lets say you have two
messages A and B. You use the same
key k, which will produce the same keystream K. Now, you
create your two ciphertexts CA
and CB
CA = A ^ K
CB = B ^ K
But notice if the adversary has both ciphertexts,
CA ^ CB = A ^ K ^ B ^ K = A ^ B
They are left with your two original messages XORed together. Let me
illustrate: we have two messages as bitmap images (here as PNGs for
the web),
Encrypt them using the same key. In this case, my key was "somekey".
If the adversary has both of these second "images", she can XOR them
together (having no knowledge of the key!) and get this,
An initialization vector (IV) is a set of bytes we combine with the
key. The IV is not a secret, as it is sent plaintext with the
ciphertext. If different IVs were used above with the same key, XORing
the ciphertext would result in nothing, because the keystreams are
totally different for each message.
However, the way the IV is combined is important too. Simply
concatenating the IV and the key can lead to weaknesses due to the way
the key schedule algorithm works. Something more secure would be a
cryptographic hash of the key and the IV. The reason WEP is broken is
because in its design the IV wasn't used properly.
Another weakness is that the initial bytes of the keystream have low
entropy. That is, some bits tend to be 0's or 1's consistently, which
can leak information to an adversary. This can be countered by tossing
the first few bytes of the keystream. Often the first 768 bytes are
dropped, but a conservative amount would be 3072 bytes. Another way to
deal with this is running the key schedule algorithm 10 or 20 times
(not reinitializing the S array between them of course) rather than
just once, which is the way
CipherSaber-2 does it.
Yet another weakness is that the keystream
becomes
distinguishable from random data after about a gigabyte of
output. That is, after about a gigabyte, the entropy of the overall
stream can become too low and compromise the security of the
message. A solution might be to change the IV each gigabyte.
I wrote an implementation of Arcfour in C, which you can get from my
Git repostiory with,
It is written as a library that can be used in different
applications. Included are a couple programs that make use of it. I
strongly suggest writing your own implementation. It is really easy to
do, and you will automatically have the algorithm memorized once you
do it. The Wikipedia article has some test vectors you can use to test
it.
My fiancee, Kelsey, claimed that she could tell the difference between
Coke and Pepsi. I wanted to put this to the test. Since there were
only two of us, arranging the test wasn't a simple matter of asking
someone else to pour some cups. I also wanted to do this right:
testing must
be
double-blind. I devised a little scheme that allowed us to perform
two different tests.
The first test was seeing if Kelsey or I could determine which
beverage was Pepsi and which was Coke. The second was determining if
there was any distinction in taste between the two drinks at all,
which consisted matching two different samples together. The second
test also acts as a check on the first test.
We bought one bottle of each at CVS. Next, we labeled six different
cups with the numbers one though six. Each odd number is paired with
the following even number. Kelsey, who was alone, used a die with an
even number of sides (this includes a something as simple as a coin
toss) to put one beverage in cup #1 and the other in cup #2. In this
case, we used my 20-sided die I use for Dungeons and Dragons, because
using it for this purpose was just full of win.
The die is important here as a random number generator. If it is left
up to a human to decide what drinks go where, we may bias the
setup. For example, I may be more likely to put Pepsi in an
odd-numbered cup than an even-numbered one.
It is difficult for human beings to behave randomly. Try generating a
list of 50 coin tosses yourself. I mean, without a coin. Just type a
series of 50 H's and T's. If you examine your list of flips, you will
find that you often generate very improbable series of flips
(excessive heads or tails) and exhibit patterns. We need dice or
coins to make decisions for us in this experiment.
To do it right, the beverage must be chosen before the roll: "I will
be pouring Pepsi now.". Roll the die. If it rolls an odd number, pour
the drink into the odd cup (#1). Write this information down and keep
it secret.
Next, after allowing the foam to calm down (which might accidentally
reveal information to me), Kelsey leaves the room and I enter. I
perform a similar procedure to distribute the drinks into cups #3 and
#4, then #5 and #6. I keep track of what drink, #1 or #2, goes into
what cup. I keep it secret. These last two cups are for the purpose of
the second experiment.
At this point Kelsey knows what was in cups #1 and #2, but not where
they went. I know where they went, but not what was in the
cups. Together we know everything, but individually we know
nothing.
In order to make the second test double-blind, and allow me to
participate, I leave the room. Kelsey rolls the die. If it rolls odd
she switches the label on cups #5 and #6. It is important that these
cups are identical. One flaw potential, however, is that the liquid in
each cup may look unique. One may be more fizzy or one cup a little
more full. Noticing this may happen subconsciously, which is the whole
point of doing double-blind tests.
Ok, we didn't actually do this last part because I didn't think of it
till later.
We sample all four cups (#3 - #6) in pairs, alone, making notes on
what beverage we think is in what cup. Once we are both done we share
our secrets and see how well we did.
Our results? Neither of us could tell the difference between Coke and
Pepsi.
I was at my fiancee's parent's house over Fourth of July weekend. Her
family likes to leave plenty of reading material right by the toilet,
which is something fairly new to me. They take their time on the john
quite seriously.
While I was in there I saw a large book
of Sudoku
puzzles. Since the toilet is a good spot to think (I like to call it
my
"
thinking chair"), I thought out an algorithm for solving
Sudokus. I then left the bathroom and implemented it in order to
verify that it worked.
The method is trial-and-error, which it does recursively: fill in the
next available spot with a valid number as defined by the rules
(cannot have the same number in a column, row, or partition), and
recurse. The function reports success (true) when a solution was
found, or failure (false), which means we try the next available
number. If no more valid numbers are available for testing at the
current position, then the puzzle is not solvable (we made an error at
a previous position), so we stop recursing and return failure.
More formally,
Find an open position.
Look at that position's row, column, and partition to find valid
numbers to fill in.
Fill the position with one of the valid choices.
Recurse using the new change.
If the recursion reports a problem (returns false), try the next
valid number and repeat.
If recursion reports success (true), stop guessing and return
success.
If the list of valid numbers is exhausted, return failure (false).
Note that the recursion depth does not exceed 81, as it only recurses
once per blank square. The "game tree" is broad rather than deep. It
doesn't have to duplicate the puzzle matrix in memory either because
all operations can be done in place.
Here is the implementation in C I typed up just after I left the
bathroom,
int solve (char matrix[9][9])
{
/* Find an empty spot. */
int x, y, i, j, s = 0;
for (i = 0; i < 9 && !s; i++)
for (j = 0; j < 9 && !s; j++)
if (matrix[i][j] == 0)
{
x = i; y = j; s = 1;
}
/* No empty spots, we found a solution! */
if (!s)
return 1;
/* Determine legal numbers for this spot. */
char nums[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
for (i = 0, j = y; i < 9; i++)
nums[(int)matrix[i][j]] = 0; /* Vertically */
for (i = x, j = 0; j < 9; j++)
nums[(int)matrix[i][j]] = 0; /* Horizontally */
for (i = 0; i < 3; i++)
for (j = 0; j < 3; j++)
nums[(int)matrix
[i + ((int)(x / 3)) * 3]
[j + ((int)(y / 3)) * 3]
] = 0; /* Within the partition */
/* Try each possible number and recurse for each. */
for (i = 1; i < 10; i++)
if (nums[i] > 0)
{
matrix[x][y] = i;
if (solve (matrix))
return 1;
}
/* Each attempt failed: reset this position and report failure. */
matrix[x][y] = 0;
return 0;
}
I assumed that it would be slow solving the puzzles, having to search
a wide tree, but it turns out to be very fast. It solves normal
human-solvable puzzles in a couple of milliseconds. Wikipedia has a
near-worst case Sudoku that is designed to make algorithms like mine
perform their worst.
On my laptop, my program solves this in 15 seconds, which means that
it should take no more than 15 seconds to solve any given Sudoku
puzzle. This provides me a nice upper limit.
There is a way to "defeat" this particular puzzle. For example, say an
attacker was trying to perform a
denial-of-service (DoS) attack on your Sudoku solver by giving it
puzzles like this one (making your server spend lots of time solving
only a few puzzles). However, these puzzles assume a certain guessing
order. By simply randomizing the order of guessing, both in choosing
positions and the order that numbers are guessed, the attacker will
have a much harder time creating a difficult puzzle. The worst case
could very well be the best case. This is very similar to how
Perl
randomizes its hash array hash functions.
Now suppose we kept our guess order random then "solved" an empty
Sudoku puzzle. What we have is a solution to a randomly generated
Sudoku. To turn it into a puzzle, we just back it off a bit. A Sudoku
is only supposed to have a single unambiguous solution, so we can only
back off until just before the point where two solutions becomes
possible. If you imagine a solution tree, this would be backing up a
branch until you hit a fork.
Normally, Sudokus are symmetric (in the matrix sense), but completely
randomizing the position guessing order won't achieve this. To make
this work, the randomizing process can be adjusted to only select
random points on the upper triangle (including the diagonal). For each
point it selects not on the diagonal, the mirror point is
automatically selected next. This will preserve symmetry when
generating puzzles.
One issue remains: there seems to be no way to control the difficulty
of the puzzles it generates. Maybe a number of open spaces left behind
is a good metric? This will require some further study (and another
post!).
A co-worker asked me a question today about C/C++ pointers,
If a pointer is declared inside a function with no explicit
initialization, can I assume that the pointer is initialized
to NULL?
We were down in the lab and, therefore, he had no Internet access to
look it up himself, which is why he asked. When I code C, it is just a
sort of mental habit to not use a non-static function variable without
first initializing it, but is this accurate? I knew the answer
was "no", but I wanted to be able to explain the "why".
Anyway, I quickly recalled some of my experimental C programs and
thought carefully about the mechanics of what is going on
behind-the-scenes, allowing me to confidently give him a "no"
answer. I then threw this together in a few seconds to prove it,
#include <stdio.h>
void a ()
{
int * x = (int *) 0x012345FF;
double y = -63454;
(void) x;
(void) y;
}
void b ()
{
int * x;
double y;
printf ("%p, %f\n", x, y);
}
int main ()
{
a ();
b ();
return 0;
}
When you compile it, make sure you don't use the optimization options
(-O, -O2, or -O3
for gcc) because they change the inner-workings of the
program. It might do things like make those functions inline (so they
won't be on the stack as I am intending), or even toss
out a(), as it appears to do nothing. The compiler sees
that, even though I "used" variables in a() by casting
them to void, nothing is really happening,
so a() can be ignored. We can probably get around this
with a tacked
on
volatile declaration, which you might see a lot of in a
micro-controller program. In a micro-controller, some memory addresses
are mapped to registers external to the software, so, from the
compiler's point of view, access to these locations may look like
nothing is really happening. Optimizing away variables that point to
these memory locations will lead to an incorrect binary, so your robot
or laser guided shark or whatever won't work.
Anyway, compiling with optimization will break my example! So don't do
it here.
When compiling, you should get some warnings about using uninitialized
variables, which is kind of the point of my example. Ignore it. That
warning alone gives away the answer to the main question, really, but
this example is a bit more fun!
Before you run it, study it and think about what the output should
look like. When a() is called, its stack frame goes into
the call stack, which contains the two declared variables. These
variables are then assigned as part of the function
execution. When a() returns, the frame is popped off the
stack. Then b() is called, and, as the variable
declarations are exactly the same, it will fit right over top
of a()'s old stack frame, and its variables will line
up. x and y are not assigned any value, so
they pick up whatever junk was lying around, which happens to be the
values assigned in a().
When you run the program, this is the output,
0x12345ff, -63454.000000
The pointer is not initialized
to NULL. If x is passed back uninitialized
under the assumption that a NULL is being passed, some
other poor function that handles the return value may dereference it,
resulting in possibly
some
nasal demons, but most likely an annoying segmentation
fault. Worse, this error may occur far, far away from where the actual
problem is, and even worse than that, only sometimes (depending on the
state of the call stack at just the right moment).
Note here that I am talking about non-static function variable
declarations. Global variables and static function variables will not
be on the stack. They are placed in a fixed location (in the data
segment), and their values are implicitly initialized to 0
at compile time.
In a
previous post I discussed one-time pads. Note: this is about a
cryptosystem, not some kind of menstruation disaster. The information
for this post comes
from
a little gem I saw on the xkcd
forums. I generally don't lurk around there because it suffers
from the same disease most non-self-moderating forums suffer from:
anal retentive, power-abusing, whiny moderators. I lucked out this
time.
The user Berengal posed the question,
One thing I've been wondering about is if you can use a single
one-time-pad to encrypt other one-time-pads to send around.
My first though was, "Hey, why didn't I think of this?". Then,
after a moment, I realized that it was the sort of thing that is too
good to be true. This is along the lines of ideas that break the laws
of thermodynamics. We are looking at a perpetual motion machine
here. Just as you cannot create or destroy energy, neither can you
use a one-time pad to distribute more than one other equal length
one-time pad. Let's make it formal,
In any closed one-time pad cryptosystem, the total number of
one-time pad bits in the system remains the same.
Also, if something like this could work, people would be using it
already everywhere. Before I got a chance to think too much about it,
user AJR spoiled it with an excellent proof on why it won't
work.
Note below that I use ^ to indicate
exclusive or
(XOR).
Suppose you have a master pad, K, that you use to
distribute two message pads, k1
and k2. You have two
messages, m1
and m2. We then have four transmitted texts:
two encrypted message pads e1
and e2, and two
ciphertexts c1
and c2. We define these as,
e1 = K ^ k1e2 = K ^ k2c1 = k1 ^ m1c2 = k2 ^ m2
Suppose an adversary is able to record all four transmitted texts. He
can apply them like so,
e1 ^ c1 = K ^ k1 ^ k1 ^ m1 = K ^ m1e2 ^ c2 = K ^ k2 ^ k2 ^ m2 = K ^ m2
This cancels the original message keys and effectively leaves you
encrypting two messages with the same pad, which is exactly the wrong
thing to do when using one-time pads. Once that is done, the adversary
can do tricks like,
e1 ^ e2 ^ c1 ^ c2 = m1 ^ m2
What is left is the two messages XORed together with no keys/pads
involved, which, depending on the messages, might reveal
something. Don't make this mistake.
Update: My friend Gavin thought he discovered a way to make
one-time pads useful, but I was able to show him that it simply
reduces to the same situation here. His
website details the algorithm and the crack that breaks it.
In a
previous post I discussed one-time pads. The information for this
post comes from Bruce
Schneier's
Applied Cryptography (section 10.8).
One-time pads are great for something
called
plausible deniability. With plausible deniability, when a person
holding encrypted data is coerced into decrypting their data, the
interrogator will not be able to tell if the person is complying with
the decryption order or not. For example, the victim could provide an
alternate key that decrypts the ciphertext into some harmless dummy
plaintext. To make this more plausible, the plaintext would probably
be something potentially embarrassing, such as pornography or secret
love letters.
We have a one-time pad K, a plaintext P, a
dummy plaintext (the pornography or love letters) D, a
dummy key K', and a ciphertext C. Below, I
denote XOR with ^.
To encrypt our plaintext, its the normal one-time pad algorithm,
P ^ K = C
Bob and Alice share K, so decryption works like,
C ^ K = P
However, the secret police come along with
their
thumbscrews and demand that Alice and Bob give them the one-time
pad K. Instead, they will provide K'. How is
K' defined? Like this,
K' = C ^ D
Because K is a one-time pad and is randomly generated,
there is no way to prove that K' is not the real
key. Alice and Bob give up K'. The secret police decrypt
it,
C ^ K' = C ^ C ^ D = D
"See? We were just keeping our love affair a secret from our spouses!"
Example code:
otp.c,
Makefile
(this Makefile may be useful)
The one-time
pad (OTP) is an encryption algorithm that provides perfect
secrecy, provided it is implemented properly. The pad, which is
essentially a key that is the same length as the message, must be used
only once (hence the name), kept secret, and, the hard part, generated
using some truly random
method. It works by XORing the pad and the message
together, which makes the ciphertext indistinguishable from randomly
generated data.
I want to be clear with the terms here. A one-time pad is not
"random". Numbers aren't random. It's just not a property numbers can
have. To be pedantic, what is random is the method by which the
numbers are generated. Think "random number-generators", not
"random-number generators".
A pseudo-random number generator (PRNG), like what your computer
normally uses to randomly generate numbers for games, will not work
here. Your computer is mindlessly deterministic and is not capable of
being a true random number generator by itself. PRNGs exhibit patterns
and eventually loop, spitting out the same numbers in the same order
as it did when it started. When the pad is generated randomly, each
bit in the pad will be
statistically independent from every other bit. That is, one value
of one bit has no affect on the value on other bits. Because of this,
the ciphertext encrypted from a one-time pad is equally likely to
decipher into any possible plaintext of the same length. This makes it
immune to brute-force attacks. The bits in a PRNG are not
statistically independent, making them vulnerable to attack. If you
know the value of one bit of the key, you also know something about
other bits in the key.
Generally, two parties wanting to communicate using a one-time pad
will exchange a large pad, such as a CD filled with randomly generated
data, in person before one of them travels somewhere, say another
country, where data needs to be sent back securely. As bytes from the
pad are used, they are destroyed and never used again. If the pad is
100kB in length, only up to 100kB in data can be transmitted
securely. Like most forms of encryption, key management is one of the
trickiest parts.
I have provided a simple little C program for encrypting a message
using a one-time pad (download link at the top). It
encrypts stdin using the one-time pad provided in the
file indicated as the first argument. Ciphertext is sent
to stdout. As an added bonus, when a second argument is
provided, the used bits of the key are written to the file provided in
the second filename. I will show you why I added this in a moment.
I have provided an example pad for you, generated from my
own /dev/random,
which provides a true random number generator (RNG).
Here is how you put it to use in encrypting the source code for this
program,
otp random.data < otp.c > otp.c.otp
If you shared the pad random.data with your friend, whom
you are trying to communicate with, she can decrypt it with a similar
command,
otp random.data < otp.c.otp > otp.c
random.data is only 207kB, so your message cannot be any
longer than that. Additionally, once 207kB has been sent, you must
exchange a new pad before more ciphertext can be transmitted. If you
re-use your pad, you compromise the security of the new ciphertext as
well as the old ciphertext.
The reason I provided the second option was this: you can
use /dev/random directly without losing the pad (assuming
you have a operating system that has a /dev/random, as
all decent operating systems do).
This is only academic, as the one-time pad is generally exchanged
before the plaintext exists. Otherwise you can just exchange the
plaintext when you would have normally exchanged the one-time pad!
Also, /dev/random is slow. It generates numbers
using environmental noise, which is only available at a trickle. If
you want to encrypt a 1MB message, it could take days. With shorter
messages you can usually speed things up by moving the mouse around or
mashing the keyboard (this can be fun).
More on one-time pads next. In particular, I will present two tricks:
one that works and one that does not work.
I was in an elevator the other day and I noticed something about the
digital numbers. I was on the 7th floor going down. Inside the
elevator was the standard 7-bar number display indicating the current floor,
However, from the angle I was standing, the top right bar was not
visible. Like this,
The arrow was indicating that the elevator was traveling downward, but
with that bar in the way the numbers can appear to be going upwards,
not downwards. It is an ambiguity, as without that bar, 6 is
equivalent to 8 and 5 is equivalent to 9. One version is covered, the
other is not.
At first I thought this was simply some kind of quirky behavior of the
elevator. "Why is it going up when the arrow points down? Seems like
bad design to me!", I thought -- until we got to floor 4, where it was
obvious what was going on.
At work I am currently maintaining/adding features to a rather large
piece of software. It takes over two hours just to build (but this is
mostly because C++ takes a long time to compile, being the ugly mess
C++ is). I wondered if there would be any, let's say "interesting",
comments in there. With all that code there had to be something! I ran
this to find out,
And I found maybe a hundred matches. A couple of us got some good
laughs for awhile reading some very creative comments and, in a few
cases, strings for debugging log output. If you want some code to run
this on yourself
this
2.4.1 release of Pidgin works fairly well. This is my favorite
comment, which is followed by a piece of code handling just what it is
talking about,
/* holy crap. who the fuck would transfer gigabytes through AIM?! */
I had written
in a
previous post about a neat feature of Lua. I found out later that
this is simply a form
of Memoization. The
idea is that you trade memory for speed by only doing calculations
once and keeping track of previously calculated values. I had even
complained about Perl hashes not being flexible enough (not true,
thanks
to Tie::Hash). Perl
actually has something even cooler, which is the
Memoize
module.
The module can memoize any function, although it is only useful on
"pure" functions: functions with no side effects and not dependant
external data that will change. The official documentation contains a
nice example demonstrating a recursive implementation of a Fibonacci
sequence generator. My example is a little program I wrote the other
day where the memoize module came in handy.
You have coins valued at 1, 2, 5, 10, 20, 50, 100, and 200. How many
different ways can 200 be made using any number of coins. A simple
recursive solution is this: stick in each coin one at a time and ask
the same question again. So, we use a coin worth 1, now the question
is how many ways can we make change for 199. Then 198, then 195, then
190, etc. Because the order of the coins is not important these two
sets are identical: (1 1 5) (5 1 1). So, to avoid counting the same
set twice, we also want to tell the function the largest size coin to
use from then on. Our function may look like this now (Perl),
use List::Util qw(sum);
sub count {
my ($s, $m) = @_;
return 1 if ($s == 0);
my @valid = grep {$_ <= $s and $_ >= $m} @coins;
return 0 if ($#valid == -1);
return sum map {count($s - $_, $_)} @valid;
}
Where it is called as count(total, max_coin_value).
However, we will be calculating the same value twice many times
over. For example, lets say we start filling the first 10 of 200 like
this: (1 1 1 1 1 5) or (5 5). The next call to count will
be count(190, 5) for both cases. Just like the recursive
Fibonacci implementation, we are spending an enormous amount of time
repeating ourselves. Running this for a value of 200 will take
minutes. Running it for a value of 2000 may take days! Memoization to
the rescue!
We will now add this,
use Memoize;
memoize('count');
The module has now transparently installed a new version of the
function over our original. If we ever pass the same arguments that we
already have passed, the module will look up the original calculated
value and return it instead of calling the real function. It now can
calculate the number of ways to make change for a value of 2000 in a
couple seconds rather than days. That's how much redundant work the
function was doing. Here is the whole thing,
#!/usr/bin/perl
use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(sum);
use Memoize;
memoize('count');
my @coins = (1, 2, 5, 10, 20, 50, 100, 200);
print count(200, 1);
print "\n";
sub count {
my ($s, $m) = @_;
return 1 if ($s == 0);
my @valid = grep {$_ <= $s and $_ >= $m} @coins;
return 0 if ($#valid == -1);
return sum map {count($s - $_, $_)} @valid;
}
Now, to apply it to the Collatz problem from my previous post we get a
nice simple little program,
#!/usr/bin/perl
use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(max);
use Memoize;
memoize('collatz');
while (<>) {
my ($n, $m) = split;
printf("$n $m %d\n", max map { collatz($_) } ($n..$m));
}
sub collatz {
my $n = shift;
return 1 if ($n == 1);
return 1 + collatz(3*$n+1) if ($n & 1);
return 1 + collatz($n/2);
}
This exercise partly comes from a couple different chapters in the
book The
Little Schemer. The book is an introduction to the Scheme
programming language, a dialect of Lisp. The purpose to to teach basic
programming concepts in a way that anyone can follow along just as
well as someone with a degree in, say, computer science. It is still
very useful for us programmer types because there are some good
practice you get from reading and playing along.
First of all, Lisp is famous (infamous?) for lacking syntax. Any Lisp
program is simply
an S-expression,
put simply, a list of lists. There is no operator precedence because
operators are treated just like functions. This leads to prefix
notation for mathematical expressions,
(+ 4 5)
=> 9
where the => indicates the result of evaluating the
expression. We can apply as many operands as we want,
(+ 2 3 4 5 10)
=> 24
We can put another list right in there as an operand,
(+ 3 (* 2 5) 4)
=> 17
You get the idea. In a function, the value of the last expression is
the return value. For example, here is the square
function in Scheme, which squares its input,
(define (square x)
(* x x))
Then we can use it,
(+ (square 2) (square 5))
=> 29
There are three important list operators to understand as
well: car, cdr,
and cons. car returns the first element in a
list. In the example below, the ', a single quote, tells
the interpreter or compiler that the list is to be treated as data and
not to be executed. This is shorthand, or syntactic sugar, for
the quote operator: (quote (stallman
moglen)) is the same as '(stallman moglen).
(car '(stallman moglen lessig))
=> stallman
cdr returns the "rest" of a list (everything but
the car of the list). When passing a list with only one
element cdr returns the empty list: ().
And finally, for lists, we have cons. This function
allows us to build a list. It glues the first argument to the front of
the list in the second argument,
Now, for this exercise we will pretend that the basic arithmetic
functions have not been defined for us. Instead all we have
is add1 and sub1, each of which adds or
subtracts 1 from its argument respectively.
(add1 5)
=> 6
(sub1 5)
=> 4
Oh, I almost forgot. We also have the zero? function
defined for us, which tells us if its argument is 0 or not. Notice
that functions that return true or false, called predicates, have
a ? on the end.
(zero? 2)
=> #f
(zero? 0)
=> #t
To make things simple, these definitions will only consider positive
numbers. We can define the + function (for only two
arguments) in terms of the three basic functions shown above. It might
be interesting to try to write this yourself before you look any
further. (Hint: define it recursively!)
;; Adds together n and m
(define (+ n m)
(if (zero? m) n
(add1 (+ n (sub1 m)))))
If the second argument is 0 we are done and simply return the first
argument. If not, we add 1 to n + (m -
1). The - function is defined similarly.
;; Subtracts m from n
(define (- n m)
(if (zero? m) n
(sub1 (- n (sub1 m)))))
Multiplication is the act of performing addition many times. We can go
on defining it in terms of addition,
(define (* n m)
(if (zero? m) 0
(+ n (* n (sub1 m)))))
(We'll leave division as an exercise for the reader as it gets a
little more complicated than I need to go in order to get my overall
point across.)
We will leave math behind for a moment take a look at
The Roots of
Lisp. In that link is an excellent paper written by Paul Graham
about John McCarthy, the inventor (or perhaps discoverer?) of Lisp,
and how Lisp came to be. It turns out that in order to have a fully
functional Lisp engine we only need seven primitive operators:
operators defined outside of the language itself as building blocks
for the language. For Lisp these seven operators are (Scheme-ized for
our purposes): eq?, atom?, car,
cdr, cons, quote, and
if.
Notice how none of these are math operators. You may wonder how we can
possibly perform mathematical operations when we lack these
facilities. The answer: we have to define our own representation for
numbers! Let's try this, define a number as a list of empty lists. So,
the number 3 is,
'(() () ())
And here is 0, 2, and 4,
'()
'(() ())
'(() () () ())
See how that works? Before, when we wanted to define addition and
subtraction, we needed three other
functions: zero?, add1,
and sub1. With our number representation, how could we
define add1 with our seven primitive operators? Our
numbers are defined as lists, so we can use our list operators. To add
1 to a number, we append another empty list. Hey, that sounds a lot
like cons!
(define (add1 n)
(cons '() n))
Subtraction is removing an element from the list, which sounds a lot
like cdr,
(define (sub1 n)
(cdr n))
And to define zero? we need to check for an empty
list. Notice this will also be the definition for null?.
(define (zero? n)
(eq? '() n))
And now we are back where we started. In fact, you can use the exact
definitions above to define +, -,
and *. Our entire method number representation depends on
how we define add1, sub1,
and zero?. Let's try it out,
Pretty cool, huh? We just added arithmetic (albeit extremely simple)
to our basic Lisp engine. With some modifications we should be able to
define and operate on negative integers and even define any rational
number (limited by how much memory your computer's hardware can
provide).
Now, thank goodness this isn't how real Lisp implementations actually
handle numbers. It would be incredibly slow and impractical, not to
mention annoying to read. Normally, numbers and math operators are
primitive so that they are fast.
It has been a few weeks since I made my last post. I just started my
new job at Johns Hopkins University Applied Physics Laboratory, which
has reminded my just how exhausting starting a new job can be. There
is so much to learn and balance all at once, and when I get home from
work I don't feel like doing anything constructive. This is why I
haven't posted anything recently.
However, I have been working on the problems (25 completed so
far) over at
Project Euler. If you like
programming and/or math, I strongly suggest you check it out. It would
also be handy to be familiar with a language or system that natively
supports (ruby, python, scheme, common lisp, bc, etc.) or
transparently supports (perl with the bignum module) multi-precision
math. Many of the "tricks" of the problems involve very large numbers,
well beyond normal 32-bit or 64-bit integers.
I would also suggest you follow the "rules": any program you write
should have a running time of less than a minute. All the problems can
be solved with programs that don't need to run overnight. And don't
search Google for answers until you have solved it yourself. If you do
this, you will be missing out!
Anyway, hopefully I will get something interesting written up soon. I
want to get back to my once a week schedule. And happy Pi Day
and Talk Like a Physicist
Day!
I added the fix and updated my e-mail address in the source tarball,
which
is in the
same location as before. There are no version numbers involved, so
I guess this is version "now" and the old one is version "then", with
names subject to change.
I have gotten several e-mails lately about using GNU Octave. One
specifically was about blurring images in Octave. In response, I am
writing this in-depth post to cover spatial filters, and how to use
them in GNU Octave (a free implementation of the Matlab programming
language). This should be the sort of information you would find near
the beginning of an introductory digital image processing textbook,
but written out more simply. In the future, I will probably be writing
a post covering non-linear spatial and/or frequency domain filters in
Octave.
If you want to follow along in Octave, I strongly recommend that you
upgrade to the new Octave 3.0. It is considered stable, but differs
significantly from Octave 2.1, which many people may be used to. You
will also need to install
the image
processing package
from Octave-Forge. To get
help with any Octave function, just type help
<function>.
The most common linear spatial image filtering
involves
convolving a filter mask, sometimes called a convolution
kernel, over an image, which is a two-dimensional matrix. In the
case of an RGB color
image, the image is actually composed of three two-dimensional
grayscale images, each representing a single color, where each is
convolved with the filter mask separately.
Convolution is sliding a mask over an image. The new value at the
mask's position is the sum of the value of each element of the mask
multiplied by the value of the image at that position. For an example,
let's start with 1-dimensional convolution. Define a mask,
5 3 2 4 8
The 2 is the anchor for the mask. Define an image,
0 0 1 2 1 0 0
As we convolve, the mask will extend beyond the image at the
edges. One way to handle this is to pad the image with 0's. We start
by placing the mask at the left edge. (zero-padding is underlined)
Mask: 5 3 2 4 8
Image: 0 0 0 0 1 2 1 0 0
The first output value is 8, as every other element of the mask is
multiplied by zero.
Output: 8 x x x x x x
Now, slide the mask over by one position,
Mask: 5 3 2 4 8
Image: 0 0 0 1 2 1 0 0
The output here is 20, because 8*2 + 4*1 = 20;
Output: 8 20 x x x x x
If we continue sliding the mask along, the output becomes,
Output: 8 20 18 11 13 13 5
Here is the correlation done in Octave interactively,
(filter2() is the correlation function).
The same thing happens in two-dimensional convolution, with the mask
moving in the vertical direction as well, so that each element in the
image is covered.
Sometimes you will hear this described as correlation
(Octave's filter2) or convolution
(Octave's conv2). The only difference between these
operations is that in convolution the filter masked is rotated 180
degrees. Whoop-dee-doo. Most of the time your filter is probably
symmetrical anyway. So, don't worry much about the difference between
these two. Especially in Octave, where rotating a matrix is easy
(see rot90()).
Now that we know convolution, let's introduce the sample image we will
be using. I carefully put this together
in Inkscape, which should give
us a nice scalable test image. When converting to a raster format,
there is a bit of unwanted anti-aliasing going on (couldn't find a way
to turn that off), but it is minimal.
Save that image (the PNG file, not the linked SVG file) where you can
get to it in Octave. Now, let's load the image into Octave
using imread().
m = imread("image-test.png");
The image is a grayscale image, so it has only one layer. The size
of m should be 300x300. You can check this like so (note
the lack of semicolon so we can see the output),
size(m)
You can view the image stored in m
with imshow. It doesn't care about the image dimensions
or size, so until you resize the plot window, it will probably be
stretched.
imshow(m);
Now, let's make an extremely simple 5x5 filter mask.
This filter mask is called an averaging filter. It simply
averages all the pixels around the image (think about how this works
out in the convolution). The effect will be to blur the image. It is
important to note here that the sum of the elements is 1 (or 100% if
you are thinking of averages). You can check it like so,
sum(f(:))
Now, to convolve the image with the filter mask
using filter2().
ave_m = filter2(f, m);
You can view the filtered image again with imshow()
except that we need to first convert the image matrix to a matrix of
8-bit unsigned integers. It is kind of annoying that we need this, but
this is the way it is as of this writing.
ave_m = uint8(ave_m);
imshow(ave_m);
Or, we can save this image to a file
using imwrite(). Just like with imshow(),
you will first need to convert the image to uint8.
imwrite("averaged.png", ave_m);
There are a few things to notice about this image. First there is a
black border around the outside of the filtered image. This is due to
the zero-padding (black border) done by filter2(). The
border of the image had 0's averaged into them. Second, some parts of
the blurred image are "noisy". Here are some selected parts at 4x zoom.
Notice how the circle, and the "a" seem a little bit boxy? This is due
to the shape of our filter. Also notice that the blurring isn't as
smooth as it could be. This is because the filter itself isn't very
smooth. We'll fix both these problems with a new filter later.
First, here is how we can fix the border problem: we pad the image
with itself. Octave provides us three easy ways to do this. The first
is replicate padding: the padding outside the image is the same as the
nearest border pixel in the image. Circular padding: the padding from
from the opposite side of the image, as if it was wrapped. This would
be a good choice for a periodic image. Last, and probably the most
useful is symmetric: the padding is a mirror reflection of the image
itself.
To apply symmetric padding, we use the padarray()
function. We only want to pad the image by the amount that the mask
will "hang off". Let's pad the original image for a 9x9 filter, which
will hang off by 4 pixels each way,
mpad = padarray(m, [4 4], "symmetric");
Next, we will replace the averaging filter with a 2D Gaussian
distribution. The Gaussian, or normal, distribution has many wonderful
and useful properties (as a statistics professor I had once said,
anyone who considers themselves to be educated should know about the
normal distribution). One property that makes it useful is that if we
integrate the Gaussian distribution from minus infinity to infinity,
the result is 1. The easiest way to get the curve without having to
type in the equation is using fspecial(): a special
function for creating image filters.
f_gauss = fspecial("gaussian", 9, 2);
This creates a 9x9 Gaussian filter with variance 2. The variance
controls the effective size of the filter. Increasing the size of the
filter from 9 to 99 will actually have virtually no impact on the
final result. It just needs to be large enough to cover the curve. Six
times the variance covers over 99% of the curve, so for a variance of
2, a filter of size 7x7 (always make your filters odd in size) is
plenty. A larger filter means a longer convolution time. Here is what
the 9x9 filter looks like,
Notice the extra argument "valid"? Since we padded the
image before filtering, we don't want this padding to be part of the
image result. filter2() normally returns an image of the
same size as the input image, but we only want the part that didn't
undergo (additional) zero-padding. The result is now the same size as
the original image, but without the messy border,
Also, compare the result to the average filter above. See how much
smoother this image is? If you are interested in blurring an image,
you will generally want to go with a Gaussian filter like this.
Now I will let you in on a little shortcut. In Matlab, there is a
function called imfilter which does the padding and
filtering in one step. As of this writing, the Octave-Forge image
package doesn't officially include this function, but it is there in
the source repository now, meaning that it will probably appear in the
next version of that package. I actually wrote my own before I found
this one. You can grab the official one
here:
imfilter.m
With this new function, we can filter with the Gaussian and save like
this. Notice the flipping of the first two arguments
from filter2, as well as the lack of converting
to uint8.
imfilter() will also handle the 3-layer color images
seamlessly. Without it, you would need to run filter2()
on each layer separately.
So that is just about all there is. fspecial() has many
more filters available including motion
blur,
unsharp, and edge detection. For example,
the Sobel edge
detector,
You may have noticed the new index available on the left of the
page. It really needs one, because looking at older posts required
paging through the archives. Even more so because I have yet to add
search functionality. There were
Blosxom plug-ins available to do these things, but they are long
gone now. I looked into getting a Google search box dropped in there,
but it requires a big advertisement to go around the box (and
Javascript... bleh!). I'll figure out something sometime.
Blosxom doesn't have a fancy database behind it, but rather plain flat
text files. I imagine that this doesn't scale well, but I haven't
looked that carefully into Blosxom's code. I should never be having
problems like that anyway as I doubt I will ever surpass even 1000
posts, nor will there ever be some huge flood of users. What
I do get is simplicity: to make a new post, I just write a text
file and drop it on the server. It also made writing the index code
very simple.
Without further ado, here is the code (under
the same license as
Blosxom):
blog-index.perl. As you can see, it doesn't integrate with the
rest of the blog; it only produces an index. I need function, not
form, so this is fine for me. You can use it on your own Blosxom
setup, just as long as you set the parameters at the top of the file
correctly.
The index is created dynamically, but only when it needs to be
updated. It first checks the date of the pre-generated index against
the date of the latest entry. If the entry is newer, the index is
rewritten. In both cases, this index is then served up efficiently
straight from a plain XHTML file. All the proper file locking is in
place (I think), so if there was a flood of requests when the index
needs to be rewritten, no two requests are trying to re-write the
index at the same time.
So nothing terribly exciting this week, but I hope someone finds this
code useful somewhere.
Introduction: This "news" is over two months old, simply because
I had other more interesting things to write about first. Not that I
am out of ideas: I have at least three more ideas lined up at the
moment on top of several half-written entries that may never see the
light of day. I just want to get it out of the way.
In December we held the robot competition, pitting against each other
the robots that
we spent the semester building. It was a double-elimination
bracket with five teams. Teams competed by arranging the maze (within
the rules) and deciding the initial position for their opponents. The
robots do not get to know about the maze or where they are starting;
they must figure this out on their own by exploring the maze.
To recap, there was an 8'x8' game area containing a 4'x8' maze of
1-square-foot cells. On the floor of the game area was a grid of white
lines on black, where the white lines were about 7-inches apart. The
robot started at an unknown position and orientation in the maze,
which was also set up with a configuration unknown to the robot. In
the non-maze open area, three small wooden blocks were placed at the
intersection of white lines, with a steel washer attached to the top
of each block.
In short, the robot had to move all three blocks to the repository, a
pre-programmed position in the maze.
At the end of the semester, our team's robot was the only one that
could successfully complete this task. The other teams needed to play
in a degraded mode: known maze configuration, known starting position,
known block positions. The loser bracket played this degraded version
of the game. Because of this, our team was able to sweep the
tournament with a perfect run. All the robot had to do was
successfully run the full game. The competition, not being able to do
this, automatically lost.
The robots were mostly the same, except for one team who had a robot
with 4 multi-direction wheels. Every other team made a "scooter bot"
type of robot: two powered wheels (with casters for balance) and
chassis with three levels. The first real separation of design was
when it came to picking up blocks. Each team initially had a different
idea. One team was going to build a pulley system to lift the
blocks. Another was going to use sweeping arms to sweep in the
block. Another was going to used a stationary magnet.
Our team went with a rotating wheel in front with magnets along the
outside (see images below). Once a block was found, the robot would
rotate a magnet over the block, then rotate the attached block out of
the way. In the end, four of the five teams ended up using this design
for their own robots (the last team stuck with the stationary magnet).
These pictures were taken about a month before the competition. The
wiring job was still a bit sloppy and the front magnet wheel lacks
tiny magnets attached to the outside. Other than that, this is what
our final robot looked like. In that last month, we attached the
magnets, cleaned up the wiring, and made a whole bunch of code
improvements making the robot more robust.
I will now attempt to describe some of the things you see in these
images.
On the bottom of the robot you can see two casters for balancing the
robot (big clunky things). You can see an IR sensor, which is pointing
at the blue surface attached to the other side of the robot. This was
the block detection sensor, a home-made break-beam sensor. And
finally, you can see three LED lights on top of a long circuit
board. This is a line tracker, with three sensors that can see the
white grid on the bottom of the game board. The line tracker is how
the robot navigated the open area of the board. It went back-and-forth
looking for blocks, using the line tracker to stay on the line.
Also attached to this bottom layer are the powered wheels, with blue
rubbers for traction, and their wheel encoders. There are spokes on
the inside of the wheels (encoder disks), and the wheel encoders send
a signal to the micro-controller each time it sees a spoke. The
software counts the number of spokes that passed, allowing the robot
to keep track of how far that wheel has turned. This information is
combined with IR distance sensors to give it a very accurate idea of
its position.
On top of the bottom black layer, you can see four distance IR sensors
for tracking walls in the maze. They checked to make sure the robot
was going straight (that's why there are two on each side), as well as
map out the maze as it travels long. Hanging down from the bottom of
the red layer is another IR sensor facing forward, looking at walls in
front of the robot. Mounted on the front is the block retrieval device
(lacking magnets at this point).
On top of the red layer are two (empty) battery packs, which holds 9
AA rechargeable NiMH batteries. This actually makes two separate power
systems: a 4-pack for motors and a 5-pack for logic (micro-controller
et al). In the circuit, the motors, containing coils of wire, behave
like inductors, which could cause harmful voltage spikes to the
logic. Separate power systems help prevent damage.
On top is the micro-controller and all of the important
connections. The vertical board contains the voltage regulator and
"power strip" where all of the sensors are attached. It also contains
the start button, which was connected to an interrupt in the
micro-controller. The micro-controller had its own restart button, but
once the system started up, initialized, and self-calibrated, it waited
for a signal from the start button to get things going.
I was about to post this when I was reminded by my fiancee that she
took pictures at the end-of-semester presentation, after the
competition. Here are some images of the robot after it was completely
finished. Yes, that is a little face fastened to the front.
If you are ever at Penn State and are visiting the IST building, you
can see the robot. Because the robot won the competition, it is on
display and will be for years to come. You can recognize it by its
face.
I have made the final robot code available
here:
final-robot-code.zip. I was the software guy, handling pretty much
all the code, so everything here,
except interupt_document.c was written by me. It's
probably not very useful as code, except for reading and learning how
our robot worked. There are a few neat hacks in there, though, which I
may discuss as posts here. It's not noted in the code itself, nor in
the zip file, but I'll make this available under my
favorite 2-clause BSD
license.
The 3n + 1 conjecture, also known as
the Collatz
conjecture, is based around this recursive function,
The conjecture is this,
This process will eventually reach the number 1, regardless of which
positive integer is chosen initially.
The way I am defining this may not be entirely accurate, as I took a
shortcut to make it a bit simpler. I am not a mathematician (IANAM) --
but sometimes I pretend to be one. For a really solid definition,
click through to the Wikipedia article in the link above.
A sample run, starting at 7, would look like this: 7, 22, 11,
34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. The sequence
starting at 7 contains 17 numbers. So 7 has a cycle-length of
17. Currently, there is no known positive integer that does not
eventually lead to 1. If the conjecture is true, then none exists to
be found.
I first found out about the problem when I saw it
on UVa Online
Judge. UVa Online Judge is a system that has a couple thousand
programming problems to do. Users can submit solution programs written
in C, C++, Java, or Pascal. For normal submissions, the fastest
program wins.
Anyway, the way UVa Online Judge runs this problem is by providing the
solution program pairs of integers on stdin as text. The
integers define an inclusive range of integers over which the program
must return the length of the longest Collatz cycle-length for all the
integers inside that range. They don't tell you which ranges they are
checking, except that all integers will be less than 1,000,000 and the
sequences will never overflow a 32-bit integer (allowing shortcuts to
be made to increase performance).
The simple approach would be defining a function that returns the
cycle length (Lua programming
language),
function collatz_len (n)
local c = 1
while n > 1 do
c = c + 1
if math.mod(n, 2) == 0 then
n = n / 2
else
n = 3 * n + 1
end
end
return c
end
Then we have a function check over a range (assuming n <= m here),
function check_range (n, m)
local largest = 0
for i = n, m do
local len = collatz_len (i)
if len > largest then
largest = len
end
end
return largest
end
And top it off with the i/o. (I am just learning Lua, so I hope I did
this part properly!)
while not io.stdin.eof do
n, m = io.stdin:read("*number", "*number")
-- check for eof
if n == nil or m == nil then
break
end
print (n .. " " .. m .. " " .. check_range(n, m))
end
Notice anything extremely inefficient? We are doing the same work over
and over again! Take, for example, this range: 7, 22. When we start
with 7, we get the sequence shown above: 7, 22, 11, 34, 17, 52,
26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Eight of these numbers
are part of the range that we are looking at. When we get up to 22, we
are going to walk down the same range again, less the 7. To make
things more efficient, we apply
some
dynamic programming and store previous calculated cycle-lengths in
an array. Once we get to a value we already calculated, we just look
it up.
I used dynamic programming in my submission, which I wrote up in
C. You can grab my
source
here. It fills in a large array (1000000 entries) as values are
found, so no cycle-length is calculated twice. When I submitted this
program, it ranked 60 out of about 300,000 entries. There are probably
a number of tweaks that can increase performance, such as increasing
the size of the array, but I didn't care much about inching closer to
the top. I would bet that the very top entries did some
trial-and-error and determined what ranges are tested, using the
results to seed their program accordingly. You could take my code and
submit it yourself, but that wouldn't be very honest, would it?
So why am I going through all of this describing such a simple
problem? Well, it is because of this neat feature of Lua that applies
well to this problem. Lua is kind of like Lisp. In Lisp, everything is
a list ("list processing" --> Lisp). In Lua, (almost)
everything is an associative array (Maybe they should have called it
Assp? Or Hashp? I am kidding.) An object is a hash with fields
containing function references. There is even
some syntactic
sugar to help this along.
The cool thing is that we can create a hash with default entries that
reference a function that calculates the Collatz cycle-length of its
key. Once the cycle-length is calculated, the function reference is
replaced with the value, so the function is never called again from
that point. The function only actually determines the next integer,
then references the hash to get the cycle-length of that next integer.
Now this hash looks like it is infinitely large. This is really a form
of lazy
evaluation: no values are calculated until they are needed (this
is one of my favorite things about
Haskell). We don't need to explicitly ask for it to be calculated,
either. We just go along looking up values in the array as if they
were always there. Here is how you do it,
collatz_len = { 1 }
setmetatable (collatz_len, {
__index = function (name, n)
if (math.mod (n, 2) == 0) then
name[n] = name[n/2] + 1;
else
name[n] = name[3 * n + 1] + 1;
end
return name[n]
end
})
So we replace the collatz_len function with this array
(and replace the call to an array reference) and we have applied
dynamic programming to our old program. If I run the two programs with
this sample input,
10 1000
1000 3000
300 500
and look at average running times, the dynamic programming version
runs 87% faster than the original.
One problem with this, though, is the use of recursion. In Lua, it is
really easy to hit recursion limits. For example, accessing element
10000 will cause the program to crash. This will probably get fixed
someday, or in some implementation of Lua.
I thought there might be a way to do this in Perl, by changing the
default hash value from undef to something else, but I
was mildly disappointed to find out that this is not true.
Here is the source for the original program and the one with dynamic
programming (BSD licenced):
collatz_simple.lua and
collatz.lua
wbf2c converts an
esoteric programming language
called brainfuck
into C code which can be machine compiled. Several optimizations are
done to make the resulting program extremely fast an efficient. The
converter supports both a static (standard 30,000 cells) array or a
dynamically-sized array. It also supports many different cell types,
from the standard char to multi-precision cells
using GMP.
The converter can also run several brainfuck programs on the same
memory array at once by running each one in a thread. To make sure
each brainfuck operation is atomic, each cell gets a mutex lock. The
only other multi-threading brainfuck implementation I know of
is
Brainfork.
This program fills the memory with the Fibonacci series. Make sure you
use the dynamically sized array, along with the bignum cell
type. After two or three seconds of running, my laptop (unmodified
Dell Inspiron 1000) can calculate and spit out a 140MB text file
containing the first 50,000 numbers in the series. I used
the -d dump option to see this output.
Download information, as well as some more examples, including a
multi-threaded one, are on the project website.
Sorry for the lack of new material. I was apartment hunting with my
fiancee (the most beautiful and wonderful girl in the world) this past
weekend in the Baltimore area. I found a nice place in
Columbia. Tomorrow morning I am getting some oral surgery done and
will be hospitalized for awhile. So nothing new here for possibly a
couple of weeks.
However, once I have recovered, I will probably make a post
about Lua, another about a robot
game I wrote this past summer, and another with pictures of my
tournament-winning robot from last semester.
If you are reading this post months from now and there has been no new
material, I guess you can assume something went bad with the surgery
(unlikely). According to this book I have called Scam Proof Your
Life (from the AARP), preventable medical errors in US hospitals
results in 195,000 deaths. I am not worried, however.
Don't stop here! This isn't everything. Check out the archives
(on the left) for more posts. Or just have a look at
the index.