null program

Movie Montage Poster

I wanted to try making one of those movie montage things I wrote about earlier into a nice poster that could be hung on a wall. Now, I prefer a Spartan environment whenever possible, so I really did not want to have my own poster. No decorations for me, please. I just wanted to make one. My solution? Make one for my sister, who has lots of junk and would enjoy having one. Her favorite movie is Pirates of the Caribbean, so I used this movie, which she conveniently already had on DVD.

As before, I used mplayer to rip all of the frames I needed. To get a poster-quality version I would need better resolution. To do this, I changed the frame output image size to 160x90 (100 times bigger than before).

[...] framestep=30,scale=160:90 [...]

Next, I used my own montage script to put these frames together. My script took about 2 minutes to put together one of these larger montages. Finally, I used the GIMP to add a black border and simple title at the top. Since I don't have any more than 512MB of memory on my computers, I actually had to scale the image down to 1/4 of its original size to do this. Before scaling, the GIMP was spending hours just adding the border because it was thrashing the hard drive. It needed about 2G of memory and it was using the hard drive to get it.

I took my giant image to FedEx Kinko's where they printed it to a 2'x2' poster for 30 bucks. Here are the results, taken using my sister's crappy Kodak camera (never buy Kodak digital cameras, as they all suck!). To help see what is going on, I provided a glare version and a non-glare version. Each shows different details.

Glare Poster Non-Glare Poster

Here is the normal version from before for comparison.

Pirates of the Caribbean


A Faster Montage

I had written a previous post called Movie DNA where I described a simple way of distilling an entire movie down to a single frame. It involved the use of two tools, with no intermediate code or software in the middle to glue things together.

The first tool, mplayer was used to dump all of the frames we needed. This took about the running length of the movie to do, which wasn't so bad. There may be a way to speed this up by giving mplayer some extra hints. I have not yet figured this part out.

The real time cost was in ImageMagick's montage tool, which made the final montage out of the images. This took between 6 and 10 hours to do this, depending on the length of the movie. The process seemed to be non-linear for some reason, with long movies taking unproportionally longer to process (One could always dig around the montage source to find out why). I knew there had to be a way that this could be improved!

Well, I wrote a Perl script last night, dubbed gdmontage to speed up the montage process. It was even faster than I thought it would be, taking only 12 seconds on the same machine as before. It uses the GD Graphics Library via Perl's GD module, which you would need to install to use this. It also uses the Term::ProgressBar, if it's available, to provide a progress bar and ETA.

Like the original montage program, the script recognizes file globs, so you can provide the files through a glob in order to avoid the limits on command line arguments.

$ ./gdmontage.pl "frames/*"

It is a bit unfair to call my code a "faster montage" because it only covers a tiny subset of the original montage. It makes some big assumptions in order to be faster; specifically, it assumes that every image is the same size. The original montage must look at every image before it even starts in order to determine the dimensions and placement of the final image.

It is also geared towards the Cinema Redux thing, doing only 60 images per row. This can be changed internally (no command line arguments for this) by adjusting the parameters at the top of the script. The script could probably be easily expanded to include most of the features of ImageMagick's montage, but I am sure this Perl script would be much faster when it comes to creating large montage's. (Why is montage so slow?)

Anyway, you can get the script here: gdmontage.pl (GPLv3)


Unsharp Masking

Moon image Shaprened moon image
Original image courtesy NASA.

While studying for my digital image processing final exam yesterday, I came back across unsharp masking. When I first saw this, I thought it was really neat. This time around, I took the hands-on approach and tried it myself in Octave. It has been used by the publishing and printing industry for years.

Unsharp masking is a method of sharpening an image. The idea is this,

  1. Blur the original image.
  2. Subtract the blurred image from the original, creating a mask.
  3. Add the mask to the original image.

Here is an example using a 1-dimensional signal. I blurred the signal with a 1x5 averaging filter -- [1 1 1 1 1] * 1/5. Then I subtracted the blurred signal from the original to create a mask. Finally, I added the unsharp mask to the original signal. For images, we do this in 2-dimensions, as an image is simply a 2-dimensional signal.

Example signal

When it comes to image processing, we can create the mask in one easy step! This is done by performing a 2-dimensional convolution with a Laplacian kernel. It does steps 1 and 2 at the same time. This is the Laplacian I used in the example at the beginning,

Laplacian

So, to do it in Octave, this is all you need,

octave> i = imread("moon.png");
octave> m = conv2(i, [0 -1 0; -1 4 -1; 0 -1 0], "same");
octave> imwrite("moon-sharp.png", i + 2 * uint8(m))

i is the image and m is the mask. The mask created in step 2 looks like this,

Unsharp Mask

You could take the above Octave code and drop it into a little she-bang script to create a simple image sharpening program. I leave this as an exercise for the reader.


Simple Hash Table in C

Download: hashtab.tar.gz (3.9KB)

I needed a hash table written in C for a project I was working on and I didn't like any the free hash table code that was out there. Plus, I have NIH syndrome and I really wanted to write my own for fun. My goal was to make it extremely generic, so that it would work with any data of any size. A small front end could be placed on top to make it work cleanly for strings (no need to pass the length of the data).

It uses open hashing, with linked lists to store the data. The only penalty from collisions is a slightly slower lookup. This also is what alows the variable entry sizes. I could definitely fix up a few things to make it more efficient. The default hash function uses modulus, which is quite expensive.

It works just fine with all my tests so far, so it suits my needs for the project. The code includes a demo of the hash table (main.c). If you want to read about some better hashing functions and use one with this hash table, take a look at A Hash Function for Hash Table Lookup.


Movie DNA

Brendan Dawes has this interesting idea he calls Cinema Redux. A entire film is distilled down to a single image. You take one frame from each second of the movie, shrink that frame down to 8x6 pixels, then line them up in a montage with 60 frames per row. Each row then represents one minute of film. There are 8 examples on his website.

I was interested in trying this for myself, but I couldn't find any of his code, which he had written in Java, to do it myself. Then it hit me: I really don't need to write anything to do this! Here is how you can make your own using only two tools: mplayer and ImageMagick.

Originally I thought that I may need to write a small Perl script to glue these two things together, but found, after digging though man pages, that this was completely unnecessary. There are two steps involved and each tool does one step: grab all of the frames, and second, make a montage out of those frames. Grabbing the frames is one call to mplayer,

mplayer -vo jpeg:outdir=frames -ao dummy -vf framestep=30,scale=8:6 \
        video_file

What we are doing here is dumping every 30th frame (assuming 30 frames-per-second) into a directory named frames. These images will be named by consecutive 8-digit numbers. These frames are also resized down to 8x6 pixels. If you are converting a video with a different aspect ratio, such as a wide-screen movie without letter-boxing, you will need to adjust this. A wide-screen film would be 16x9.

Next, we glue these frames together with ImageMagick,

montage -geometry +0+0 -background black -tile 60x "frames/*jpg" \
        montage.jpg

This will create the montage in the file montage.jpg . There is something important to note here. See how the file glob is quoted so that the shell will not expand it? Thats because listing 7000 frames pushes the limits of the system in passing command line arguments. ImageMagick knows about file globs and will do this internally.

And that's it! I put these together into a handy shell script that will also remove the frames after the montage has been successfully created. The process takes between 6 and 12 hours, depending on the length of the movie. It takes the movie running time to produce all the frames file, then it spends the rest of the time creating the montage, which is disappointingly slow. (Maybe I could write a Perl script that does it faster?) The script will create a montage out of just about any video you throw at it, thanks to mplayer. Example usage for DVDs,

$ ./cinrdx.sh dvd://

I did it on three movies so far: Gladiator, Tron, and The Matrix. I did it to Tron and The Matrix because I wanted to see if these movies have a dominant color scheme.

Gladiator
Gladiator
Tron
Tron
The Matrix
The Matrix

To inspect the coloring of these films, I took a hue histogram. Tron is very obvious: lots of blues and cyans dominate,

Tron Hue Histogram

I was expecting to see a lot of green show up in The Matrix, but was a little bit disappointed,

The Matrix Hue Histogram

To get these histograms, I loaded the images into GNU Octave, converted it to HSV so that the red channel is really the hue channel. Then I had the GIMP make the histograms by providing the histogram of the "red" (read hue) channel. I dropped an HSV color bar below with some image editing.

octave> m = imread("movie.jpg");
octave> [x map] = rgb2ind(m);
octave> map = rgb2hsv(map);
octave> imwrite("movie-hsv.jpg", x, map);

See if you can find some really interesting things to do with this.


South Park Downloader

Notice: This software is not affiliated with the South Park Zone nor South Park. In fact, this bypasses all of the South Park Zone's advertising, so they may not like it. I have do not have anyone's approval, nor do I know about any of the legal implications of this tool. Use at your own risk.

Get the program: spd.pl (3.8KB) GPLv3
Requires Perl and GNU wget

This was an afternoon project the other day. I found out about a website called the South Park Zone. This website has information on where to fetch South Park episodes from different hosts, such as MySpace and YouTube. These videos are stored in the Flash video format. You see, I don't have the proprietary Flash software installed and the free versions aren't quite ready for use yet. I use other means of watching videos on sites like YouTube. This means that I normally have to either use a tool to find the .flv file or search through the site's HTML code manually.

In this case, I couldn't find an existing tool that could grab the videos pointed to by the South Park Zone. I was able to find it manually, but trying to watch several videos takes a bit of digging. Anyway, if I really want to become a hacker sometime (I feel I am not yet worthy of this name), what do I do? Well, I write a program to do it for me!

I stretched my Perl legs a bit and wrote a script that will grab the video location from the South Park Zone and download the video. The only thing you need other than Perl is GNU wget. This is the program that does all the hard work. My Perl script just scans the HTML and XML for the data.

Here is an example session,

$ spd 807
Getting mirror number ... 
Getting episode URL ...
Downloading episode: The Jeffersons (125)
--20:21:21--  http://content.movies.myspace.com/0008289/24/25/828935242.flv
           => `South Park - 0807 The Jeffersons.flv'
Resolving content.movies.myspace.com... 204.16.34.216
Connecting to content.movies.myspace.com|204.16.34.216|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43,350,604 (41M) [application/octet-stream]

100%[====================================>] 43,350,604   890.43K/s    ETA 00:00

20:22:14 (823.14 KB/s) - `South Park - 0807 The Jeffersons.flv' saved [43350604/43350604]

The program accepts episode numbers, episode ranges, and search terms. For example, assuming you have installed the program and removed the .pl extension, this will download season 2, episode 3 and season 10, episode 10,

$ spd 203 1010

You can also provide a range of episodes with a dash,

$ spd 405-410

With that command, season 4, episodes 5 through 10 will be fetched. The range will not work across seasons! This is because I didn't see a convenient way to find out how many episodes were in each season. Right now the tool should be fine until season 100 (3 digit seasons) or South Park Zone goes away. It knows nothing about South Park: it only fetches the data on request.

And finally, one of the coolest parts, it will do searches for you too. Right now, it just grabs the first search result, but this seems to work pretty well. For example, this grabs the famous Make Love, Not Warcraft episode,

$ spd warcraft

I have used it a bit myself, but there may still be bugs. It also assumes that all of these videos are .flv files, but the South Park Zone is prepared for these things to be in any format. If its popular enough, perhaps I will turn it into a "project" and version it. It could also use a better front end for parsing option flags. Perhaps a quiet -q flag,

$ spd -q 506-510 708 710

Traveling Salesman Problem by Genetic Algorithm

Download Source: genetic.tar.gz (6.19KB)

Here is another project for my artificial intelligence class. I wrote a generic genetic algorithm class in C++ and then applied that class to the traveling salesman problem. A genetic algorithm more tuned to the traveling salesman problem would work better.

This particular implementation can use up to 16 points defined in the weight matrix stored in travel.dat. This weight matrix can either be defined by hand or generated using gendat.m from a list of points stored in points.txt. A chromosome is 64 bits wide, which is 16 points with 4 bits each. To make sure that every possible chromosome is a valid solution, the points are selected out of a circular queue. Every 4 bits describes how far along the queue to walk before pulling out a point. With the circular queue, the chromosome could be as short as 50 bits, but I was trying different things and 64 bits is the simplest way to represent a solution. Here are some sample chromosomes:

1101100010100000011001010101010111000110101100111111000010111010, 9315
0000110100000001000010010101010111011000101100010110000010111010, 9339
1010001100010001010010011001010010100000101100011111000010111110, 9410
0101001000000011010010011001010011010100101011100111000111011110, 9355
0101001001000000010110100001010011010110101101000001000101000110, 9349
0101011100010011000010011011011011010101101100011111010010111110, 9311
0000111101000001000010011001010010110100101100011111100010000010, 9350
1000001111100000010110101011011011100000111101010111000010111010, 9428
0000111011000001000010010111011111111110101100010111000001000111, 9448

The second number is the fitness value of the chromosome (10000 - path length). Below is a path found after 20000 iterations:

gnuplot plot

It takes many iterations to find a reasonable solution and it never finds a *really* good solution. This is because each node in the chromosome depends on every single node before it. This is terrible for a genetic algorithm, but it's really the best I could think of when using a generic genetic algorithm class like this.

A much better method would have the genetic algorithm actually know about the problem at hand, working with nodes rather than bits. Breeding would make cuts on nodes. Mutations would swap single nodes. Perhaps this can be written another time.


Noise Fractals and Clouds

I was reading about fractal terrains and wanted to give it a shot. As usual, I like to try things out in GNU Octave first by writing up a quick prototype, and, if it is still interesting enough, I will move to another language for better performance (Octave can be frustratingly slow sometimes). Additionally, when it comes to matrix manipulation, the Matlab language tends to be the most concise and powerful. Most functions and operators work on entire matrices or vectors at a time, avoiding many iteration loops, and, more notably, matrix notation built right into the language. This is powerful in the same way regular expressions are built into Perl. The problem is when your data shouldn't be in matrix form, and the language, lacking any other kinds of data structures (especially hash tables! (ignoring the struct hack)), forces you to fit your problem into matrices. Also, the implementations of the Matlab language tend to be very slow.

</rant> Anyway, back to noise.

So, the first and easiest noise algorithm I found was the diamond-square algorithm. Basically, it is noisy interpolation applied recursively. All of the noise adds up to provide something that may be good for height maps, possibly providing procedurally generated terrain for a game. Here is an example/pretty picture. Imagine this as being terrain,

heightmap

You can see we have a sort of mountain going on in the middle. Here is the "plasma fractal" view of our noise.

bird's eye view

These were generated using this Octave code. I believe that this is both the fastest and most concise way to do this in Octave. You can see we are throwing away a lot of random values that we pulled, but at the same time avoiding loops. You will need at least version 2.9 of Octave because, as far as I know, Octave 2.1 doesn't have interp2. And, unfortunately, interp2 is much slower than it probably could be.

function m = diamond_square (m, i, c)
  if isempty (m)
    m = zeros (2);
  end
  
  for k = 1:i
    m = interp2 (m, 1);
    
    ## Define points we want to randomize
    gridmap = ones (size (m));
    gridmap(1:2:end, 1:2:end) = 0;
    gridmap = find(gridmap);       # Makes Octave happy
    
    ## Define random values to be added
    randmap = c * randn (size (m));
    
    m(gridmap) += randmap(gridmap);
    
    c = c / 2;
  end
end

Something of note here compared to other implementations of diamond-square, my code above adds Gaussian noise (with Octave's randn) rather than uniform noise. The scaling variable c is actually adjusting the standard deviation of our noise and not the limits of the noise. I imagine this makes the terrain more natural, as Gaussian noise tends to approximate real noise well.

The first argument provides a base terrain to work from. With this you can define a mountain, islands, hillside, etc. When an empty matrix is provided, the terrain will be grown from flat ground. The second argument decides how many iterations we are going to run. The above example performs 6 iterations on flat terrain. The last argument decides how dramatic the terrain is, which doesn't have much meaning when starting from flat terrain. The example above was produced with two calls like this (one for each image),

octave> mesh(m=diamond_square([], 6, 0.1))
octave> imagesc(m); colormap(gray);

We can add some water, or perhaps lava or something, by choosing a water height and chopping off everything below it. I will choose -0.1 as our water level.

octave> w = m;
octave> w(w < -0.01) = -0.01;
octave> mesh(w)

Water

Here is an example of providing some base terrain. Let's put a mountain in the corner of the map,

octave> a = [0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 1 0; 0 0 0 0 0]
octave> mesh (b = diamond_square (a, 4, 0.15))

Mountain base

Becomes,

Mountain noise

Octave-Forge has a surf function that should make this look nicer, but it doesn't. If we want to render this to make it look nice, we will need another program to do it. Let's do that another time.

Another way to make noise is Perlin noise. Generally, this noise will be better and more useful than diamond-square noise (examples later). I personally like this person's approach: Perlin Noise. It is easy to implement and understand, though it doesn't have the all advantages you get from generating Perlin noise the normal way.

If you don't feel like clicking through and see the nice introduction to Perlin noise there, here is the idea of this approach: we create different frequencies of noise and mix them all together, giving more weight to lower frequency noise than high frequency noise. Below, you can see that I generated (Gaussian) noise in different frequencies, smoothing by spline interpolation. Then we add this all together to get our final Perlin noise.

Noise + Noise + Noise + Noise + Noise + Noise = Noise

Here is the Octave code I am using to generate Perlin noise,

function s = perlin (m)
  s = zeros(m);    # output image
  
  w = m;           # width of current layer
  i = 0;           # iterations
  while w > 3
    i = i + 1;
    d = interp2(randn(w), i-1, "spline");

    s = s + i * d(1:m, 1:m);
    w -= ceil(w/2 - 1);
  end
  
end

The first and only argument gives the side length of the image you want to generate - it only generates square images. This is an extremely simple approach, as there are no parameters to adjust to define the nature of the noise you want to generate. Fortunately, what I am going to try next works fine without these parameters (or with the default, hard-coded parameters if you wish).

Continuing with using some ideas from Hugo Elias (I think that's his name), I am going to use this noise to attempt to create some realistic looking clouds.

First, we will generate some noise. Let's generate some Perlin noise,

octave> n = perlin (200);

Perlin noise

Now, this doesn't look too much like clouds. To get clouds, we can apply an exponential function to the data and set anything below 0 to 0 (contrast stretching). If we scale our noise between 0 and 1, the function will look like this,

Exponential function

To adjust the clouds, we can move this function left and right across the x-axis. We can adjust the "time constant" of the function to change the sharpness of the clouds. Here is the function I wrote to convert the diamond-square or Perlin noise into cloud cover. The output is scaled between 0 and 255 to aid in image output.

function a = get_clouds (a)
  ## Scale a between 0 and 1
  a = a - min(a(:));
  a = a / max(a(:));
  
  ## Parameters
  density = 0.5;
  sharpness = 0.1;
  
  a = 1 - e.^(-(a - density) * sharpness);
  a(a < 0) = 0;

  ## Scale between 0 to 255 and quantize
  a = a / max(a(:));
  a = round(a * 255);
end

Now run this on our noise,

octave> c = get_clouds(n);

Perlin Clouds

Well, that looks a bit more like cloud cover. We just need to apply a colormap to this. I wrote this colormap function for this purpose,

function c = cloud_cmap ()
  c = [0.25 0.25 1];
  for i = 2:256
    c(i, :) = (i-2)/256 * (1 - c(2, :)) + c(2, :);
  end
end

Apply the colormap,

octave> imwrite("clouds.png", c+1, cloud_cmap)

Clouds

Wow, that looks pretty good now. We could improve this by fading to transparent rather than blue. Then do a projective transformation on the clouds and lay them over top a blue gradient. You could do this with image editing software such as GIMP, or you can continue to use Octave like I did at the end of this post.

So, in about 30 lines of Octave (including code on the interactive command line) code we could generate some kind-of realistic looking clouds. Here, I will use the cloud demo to show one particular advantage of Perlin noise over diamond-square (or at least my implementation). Here are some diamond-square clouds,

Clouds Clouds Clouds

And here are some Perlin clouds, which I think look a bit better,

Clouds Clouds Clouds

Notice the straight lines in the diamond-square clouds? You can see it right in the middle of the first image. This comes from the way that the 2-dimensional interpolation stretches the noise in the vertical and horizontal directions, drawing these lines out. This may only be apparent in my implementation, as I am probably missing the "diamond" part of the algorithm. Oh well.

Anyway, to take the clouds a bit further, you can use Octave's imperspectivewarp to apply a perspective transformation to the cloud images. I put some code together that does this transformation as well as adds a gradient,

function i = pers_clouds (n)
  w = size(n, 1);
  c = get_clouds (n);
  t = -pi/32;
  P = [cos(t) sin(t) 0; -sin(t) cos(t) 0; 0.001 0.002 1];
  
  ## Perspective transformation
  pc = imperspectivewarp (c/255, P, "cubic");
  pw = size(pc, 1);
  ph = size(pc, 2);
  
  ## Create and combine background gradient
  [dump back] = meshgrid(1:ph, 1:pw);
  i = pc * 4 * pw + back;
  
  ## Fit between 0 to 255 for image
  i = i - min (i(:));
  i = round (i / max (i(:)) * 255);
  i(isnan(i)) = 0;
end

Provide either Perlin noise or diamond-square noise and it will return an image that you can write out to a file,

octave> n = perlin (1000);
octave> imwrite ("clouds.png", pers_clouds(n) + 1, cloud_cmap)

After cropping the image with something like kolourpaint (as I did below), you get,

Clouds by Perspective

Then we could try sticking it in an image,

Eiffel Tower Before => Eiffel Tower After

Update: Sebastian Schaetz used my code above in a random map generator. Check it out.


Neural Network Blackjack Game

I have a really nice post coming up soon (it is mostly written up too!), but I want to spend more time on it to make good. Since I need something to post this week, and I still have material to move here, I have dug this up from my old website.

Get the source code (C++): neural.tar.gz (16.27KB) (GPL)

This is a neural network I wrote for an artificial intelligence class I took about a year ago. It includes a stand alone neural network class you can easily use in your own C++ program. Built around this neural network is a simple version of Blackjack (hit or stand only). You can play the neural network at Blackjack after it has finished training, which can take up to a minute or so.

My implementation of the neural network is a really simple one, using only back propagation, but it still has some neat surprises in it. When I was working with it, I would use a simple GNU Octave script to watch what was going on in real time.

Neural
       Network Plot

The x-axis is the number of iterations in tens of thousands and the y-axis describes how often the neural network plays exactly the same way a cheater would at the same seat. A cheater is defined as someone who knows what card is next and can play perfectly. Note: the neural network itself is not cheating. At the end, it agrees with the cheater about 83% of the time. This is the script that reads the neural network output. For those who don't recognize the she-bang, save this to the file plotdat and set the execution permission (chmod +x plotdat).

#!/usr/bin/octave -qf
# Usage: 
#  plotdat filename [loop] [last index]

dat = dlmread(argv{1});
x = length(dat);
loop = 0;

if (length(argv) > 1)
  if (argv{2} == 'loop')
    loop = 1;
  else
    x = argv{2};
  end
end

plot(dat(1:x));

while(loop)
  dat = dlmread(argv{1});
  plot(dat);  
  sleep(1);
end

pause;

When you run the program, a blank, dumb neural network is created. It doesn't know how to play blackjack. In fact, just as it seems with my sister at times, it doesn't know how to do anything at all. It's like a newborn baby, but without any instincts. To make this neural network useful, it is taught to play blackjack by running it very quickly through about one million games, all tutored by a teacher who gets to cheat by looking ahead in the deck. This allows the teacher to always know the proper move.

The training works by giving the network an input and the desired output (determined by the cheating). The network adjusts its internal weights depending on the error between what its current output for the input and the desired output. In doing this, the neural network picks up on the statistical nature of its inputs and will pick up on patterns.

When setting up the neural network in your own program, the tricky part is determining how to encode the inputs so that patterns will be found. In this case, I provide 3 integers to the network: its lower bound score, its best score, and the visible opponent score. By lower bound and best score, I am talking about Aces. All aces are treated as 1's in the lower bound and ideal (highest without bust) values in the best score. Scores never exceed 31, so we can encode this in 5 bits each for a total of 15 bits. We only need 1 output bit: hit (0) or stand (1). The network looks something like this, but with 30 hidden layer neurons and a lot more connections.

Neural
  Network

Here is an example run of the blackjack game,

Creating network ... created!
Training ...
Win, lose, total, c%, win%: 1536, 7640, 10000, 42.1168% , 15.36%
Win, lose, total, c%, win%: 3728, 14438, 20000, 55.8875% , 21.92%

... (lots of output for a few seconds) ...

Win, lose, total, c%, win%: 203744, 700338, 980000, 82.0455% , 20.3%
Win, lose, total, c%, win%: 205849, 707461, 990000, 82.5488% , 21.05%
Win, lose, total, c%, win%: 207981, 714515, 1000000, 82.3059% , 21.32%

Begin game:
Computer is dealt: (hole) 5 
Computer is dealt: (hole) 5 4 
Computer stands.
Your hand: 2 8 
Hit? (y/n): y
Your hand: 2 8 3 
Hit? (y/n): y
Your hand: 2 8 3 10 
You bust with 23
Computer wins with 19 against your 23
Play again? (y/n) : y

Begin game:
Computer is dealt: (hole) 4 
Computer is dealt: (hole) 4 10 
Computer stands.
Your hand: 10 A 
Hit? (y/n): n
Your hand: 10 A 
You win with 21 against computer's 20
Play again? (y/n) : n

Computer wins, losses, push: 1 (50%), 1 (50%), 0 (0%)

If you decide to try using the neural network in your own program, be sure to play with different sized networks to see what works best. In my implementation, a small network would be 1 layer with 3 nodes and a large network would be 3 or 4 layers with hundreds of nodes. Bigger networks may not work well at all, and there is no way to know what size is best other than trial and error.


Iterated Prisoner's Dilemma

Animation

I was reading about the prisoner's dilemma game the other day and was inspired to simulate it myself. It would also be a good project to start learning common lisp. All of the common lisp code is available in its original source file here: prison.lisp. I have only tried this code in my favorite common lisp implementation, CLISP, as well CMUCL.

In prisoner's dilemma, two players acting as prisoners are given the option of cooperating with or betraying (defecting) the other player. Each player's decision along with his opponents decision determines the length of his prison sentence. It is bad news for the cooperating player when the other player is defecting.

Prisoner's dilemma becomes more interesting in the iterated version of the game, where the same two players play repeatedly. This allows players to "punish" each other for uncooperative play. Scoring generally works as so (higher is better),

Player A
coopdefect
Player Bcoop (3,3)(0,5)
defect(5,0)(1,1)

The most famous, and strongest individual strategy, is tit-for-tat. This player begins by playing cooperatively, then does whatever the its opponent did last. Here is the common lisp code to run a tit-for-tat strategy,

(defun tit-for-tat ()
  (lambda (x)
    (if (null x) :coop x)))

If you are unfamiliar with common lisp, the lambda part is returning an anonymous function that actually plays the tit-for-tat strategy. The tit-for-tat function generates a tit-for-tat player along with its own closure. The argument to the anonymous function supplies the opponent's last move, which is one of the symbols :coop or :defect. In the case of the first move, nil is passed. These are some really simple strategies that ignore their arguments,

(defun rand-play ()
  (lambda (x)
    (declare (ignore x))
    (if (> (random 2) 0) :coop :defect)))

(defun switcher-coop ()
  (let ((last :coop))
    (lambda (x)
      (declare (ignore x))
      (if (eq last :coop) 
	  (setf last :defect) 
	  (setf last :coop)))))

(defun switcher-defect ()
  (let ((last :defect))
    (lambda (x)
      (declare (ignore x))
      (if (eq last :coop) 
	  (setf last :defect) 
	  (setf last :coop)))))

(defun always-coop ()
  (lambda (x) 
    (declare (ignore x))
    :coop))

(defun always-defect ()
  (lambda (x) 
    (declare (ignore x))
    :defect))

Patrick Grim did an interesting study about ten years ago on iterated prisoner's dilemma involving competing strategies in a 2-dimensional area: Undecidability in the Spatialized Prisoner's Dilemma: Some Philosophical Implications. It is very interesting, but I really wanted to play around with some different configurations myself. So what I did was extend my iterated prisoner's dilemma engine above to run over a 2-dimensional grid.

Grim's idea was this: place different strategies in a 2-dimensional grid. Each strategy competes against its immediate neighbors. (The paper doesn't specify which kind of neighbor, 4-connected or 8-connected, so I went with 4-connected.) The sum of these competitions are added up to make that cell's final score. After scoring, each cell takes on the strategy of its highest neighbor, if any of its neighbors have a higher score than itself. Repeat.

The paper showed some interesting results, where the tit-for-tat strategy would sometimes dominate, and, in other cases, be quickly wiped out, depending on starting conditions. Here was my first real test of my simulation. Three strategies were placed randomly in a 50x50 grid: tit-for-tat, always-cooperate, and always-defect. This is the first twenty iterations. It stabilizes after 16 iterations.

(run-random-matrix 50 100 20 '(tit-for-tat always-coop always-defect))

Animated GIF

White is always-cooperate, black is always-defect, and cyan is tit-for-tat. Notice how the always-defect quickly exploits the always-cooperate and dominates the first few iterations. However, as the always-cooperate resource becomes exhausted, the tit-for-tat cooperative strategy works together with itself, as well as the remaining always-cooperate, to eliminate the always-defect invaders, who have no one left to exploit. In the end, a few always-defect cells are left in equilibrium, feeding off of always-cooperate neighbors, who themselves have enough cooperating neighbors to hold their ground.

The effect can be seen more easily here. Around the outside is tit-for-tat, in the middle is always-cooperate, and a single always-defect cell is placed in the middle.

(run-matrix (create-three-box) 100 30)

Animated GIF

The asymmetric pattern is due to the way that ties are broken.

The lisp code only spits out text, which isn't very easy to follow whats going on. To generate these gifs, I first used this Octave script to convert the text into images. Just dump the lisp output to a text file and remove the hash table dump at the end. Then run this script on that file: pd_plot.m. The text file input should look like this: example.txt. You will need Octave-Forge

The script will make PNGs. You can either change the script to make GIFs (didn't try this myself), or use something like ImageMagick to convert the images afterward. Then, you compile frames into the animated GIF using Gifsicle.

See if you can come up with some different strategies and make some special patterns for them. You may be able to observe some interesting interactions. The image at the beginning of the article uses all of the listed strategies in a random matrix.

I will continue to try out some more to see if I can find something particularly interesting.


Polynomial Interpolation

This is something I learned the other day that I thought was interesting.

So you have some data from experimentation or from a function that is difficult to solve.

Equation 1

Suppose you want to estimate a polynomial curve to that fits the data. Then you could interpolate values between the data points. Let p(x) be the polynomial. The equation for the polynomial we will fit to the data will look like this,

Equation 2

The a's are the coefficients in our polynomial. We know x and we want to satisfy the condition,

Equation 3

which, when we want to solve it will take the form,

Equation 4

Where the a's are our unknowns for which we are solving. Notice something? This is the linear system,

Equation 5

We just have to solve for the a vector to get our coefficients. I quickly wrote this GNU Octave code to try this out,

function a = npoly (x, y)  
  X = repmat (x', 1, length(x));
  
  for i = 1:length(x)
    X(:,i) = X(:,i) .^ (i - 1);
  end
  
  a = X \ y';
end

This is just an extremely simple and slow version of Octave's polyfit function, except for the order of the coefficients solution vector. I also wrote this function that will take the coefficient vector and a value and do the polynomial interpolation at that point (Octave's polyval),

function v = psolve (x, a)
  v = zeros (size (x));

  for i = 1:length(a)
    v = v + a(i) * x.^(i-1);
  end
end

Here is an example of my polynomial interpolation function recognizing a parabola,

octave:88> x = 0:.5:3;
octave:89> y = x.^2;
octave:90> a = npoly(x, y) 
a =

   0.00000
  -0.00000
   1.00000
  -0.00000
   0.00000
  -0.00000
   0.00000

See how only the quadratic component is left because the zeros cancel out everything else? Here is an example with some added Gaussian noise, imitating data that might be pulled from a scientific experiment,

octave:142> x = 0:.5:3;
octave:143> y = x.^2 + randn(size(x))*0.5;
octave:144> a = npoly(x, y);
octave:145> plot(x, y, "b*");
octave:146> hold on
octave:147> x_test = 0:.05:3;
octave:148> y_test = psolve(x_test, a);
octave:149> plot (x_test, y_test, "r")

Example plot

You can see that the order of our polynomial is too high for the data we are using. The main problem, however, it that the linear system is ill-conditioned. The condition number of the generated X matrix above is 151900, meaning small changes in x result in large changes in the solution. If we step out a bit you can see the polynomial quickly diverge from the given data,

Example illplot

So, I definitely wouldn't use this for extrapolation.

All of the equation images were produced by this LaTeX document: pfit.tex


Chess AI Idea

So, I had this idea using a genetic algorithm to optimize the parameters of a program that plays chess. Now, the genetic algorithm wouldn't be used at all during a game, but rather to optimize the board evaluation parameters beforehand. I don't know much about writing board game AI programs, as I have only written a few of them for fun (tic-tac-toe, connect 4, Pente). For this chess program, I am taking a simple approach because I am more interested in seeing the genetic algorithm at work than seeing the chess playing AI do well against other chess AI or people.

The program would search the game tree using the minimax algorithm, with some possible optimizations added afterward. Tree searching is just a matter of generating all possible moves and looking at them. The hard part is the board evaluation function, which evaluates the a particular board's score based on the arrangement of the pieces. Parameters to this evaluation function would be, for example, the piece values. The pawn would be locked in at a value of 1, which anchors the other values and provides a base unit to work from.

Again, the parameters would not change during a game. We use the genetic algorithm ahead of time to determine the parameters.

For the genetic algorithm, the set of parameters strung together makes up a single chromosome. We maintain a pool of different chromosomes, i.e. different sets of parameters, and breed these chromosomes together to improve our parameter sets. We start out with a random pool made of parameters that are most likely pretty terrible.

To evaluate the chromosomes, we need a fitness function, which evaluates each chromosome for its level of "fitness" deciding if it breeds or not. To do this we simply play the chromosome we are evaluating against some base chromosome, which may just be parameters chosen intuitively by the programmer. Or, the base chromosome could be random too. Starting with a better base chromosome would be a good head start, though. The fitness of the chromosome is how often it wins against the base chromosome in, say, a few hundred games.

The most fit chromosomes are bred by taking a few parameters from each to make a new chromosome. Mutations are occasionally added in order to keep the chromosome pool from getting stuck in a local maximum. A mutation involves changing one or more parameters in a chromosome slightly in some random way. Mutations are rare and will usually be detrimental to the chromosome quickly killing it off, but will occasionally cause a good change that will be spread to other chromosomes in the next generation, improving the gene pool.

We iterate this until either the maximum fitness level in the pool is stuck for several iterations (we aren't getting anywhere and mutations aren't helping), or the chromosomes are so good they always beat the base chromosome, making the fitness algorithm meaningless. When this happens, we replace the base chromosome with the best chromosome in the pool and start over from scratch again with a random, or mostly random, pool.

As you would expect, I have looked into parallelizing this process to take advantage of a cluster. This is easy for several reasons. First, evaluating chromosomes can be done simultaneously. No evaluation depends on another chromosome's evaluation. Second, the minimax game tree search can be parallelized so that several different processes search the game tree and give their results back to the parent process. This works very well because the data being sent back to the parent will be a single integer. No need to send large amounts of data around the network.

I spent an afternoon hacking at this, but its still too crude to share yet. I got the non-parallel version of the chess engine built but I am still working on the evaluation function. The genetic algorithm hasn't been started. The only parameters at the moment are piece values. The board evaluation function just adds up the piece values on the board completely ignoring their positions. This makes the computer play extremely aggressively, capturing the opponent's pieces whenever it can. This makes for a somewhat interesting bloodbath where the board goes empty after just a few moves.

My problem right now is finding a good way to represent piece movements so that it can be recycled. That is, I want to represent movements when generating my search tree, verifying the legality of a move, and evaluating the board all the same way so that I don't have to program in piece movements several times.


Robot Version 1

Update: There is a followup post to this post.

Here is what my team has been working on for the last couple weeks. The end goal for this robot is to escape a maze, collect blocks, and find a repository in which to drop those blocks. Someone suggested we call it Pac-man. Here it is (without batteries),

Robot

Robot

And here it is being worked on. We added a third level to make more room for the batteries and extra sensors.

Robot

Here is the game board. It is 8x8 feet with a 4x8 foot maze.

Maze

Maze

Building a robot is an interesting experience, but a stressful one. Especially when you are doing it for a class. So many things could go wrong and you can spend hours tracking down a bad soldering job, which we once found inside an IR sensor. It was a poor manufacturing soldering job.

So, as of this writing, the robot uses 3 infrared (IR) sensors to look at walls and two wheel encoders for tracking the distance traveled by each wheel. You can see the disk encoder on the inside of the wheel in the second robot image. The robot uses 9 rechargeable nickel-metal hydride (NiMH) AA batteries: 5 for the Freescale 68HC12 micro-controller and sensors, and 4 for the continuous rotation servo motors. It is a competition, so I don't want to give too many details at the moment in case another team is reading.

Right now it limps along in the maze and gets around for awhile before drifting into a wall. This will get fixed this weekend, as our grade depends on it. We just need to make better use of our sensors. I.e., it is a software issue now.

Eventually, I will put some code up here we used in the robot. It is all done in C, of course.


Java Animated Maze Generator and Solver

Maze screenshot

Download source:

git clone http://git.nullprogram.com/RunMaze.git

Compiled (2009-07-07): RunMaze.jar (5.4 kB)

In preparation for showing off the maze-navigating robot I made last week, I give you this program I wrote last year.

When I started to learn Java, I wrote this little program for practice. It features a GUI and multi-threading. It generates a maze and then slowly solves the maze so you can watch it go. I wrote this with the GNU project's Java implementation (never used the official Sun one). The Makefile is therefore set up for gcj and gij. The Makefile can build the byte-compiled .jar file as well as a natively compiled version. I have updated it to use Ant, and I also use Sun's OpenJDK these days.

The maze generation algorithm is what I believe to be called the "straw man" algorithm: the maze starts as a matrix of individual cells. Break down the walls between two cells that aren't already connected and move into it. Choose another wall at random and repeat. If there are no walls left to break down, go back a step. Lather, rinse, repeat. If you are back in the starting cell with no more walls to break down, you are done.

Solving the maze is done in a similar way to generation. Go forward, right, or left if you can. If not, go back to the previous cell. This is the same algorithm I employed in the maze-navigating robot I built, which has kept me pretty busy. I will post pictures of it later.

You can specify the maze height, width, and cell pixel size on the command line when you run it. This runs with the defaults,

java -jar RunMaze.jar

Or if you grabbed the source,

ant run

And, for a tiny 100 by 100 maze,

java -jar RunMaze.jar 100 100 5

where,

java -jar RunMaze.jar <width> <height> <cell-size> 

So, why learn Java? I still don't like Java, but I had missed out on an interesting research opportunity because I had zero Java experience. I probably wouldn't use it on my own for anything but practice, as my own projects don't really need to be super-portable. Java is ugly, bulky, and slow, but I don't want to miss any more opportunities. Right after I was turned down because of my lack of experience, I bought a Java textbook and read it cover to cover on vacation. More importantly, I wrote simple little Java programs (like the one presented here) as I learned core concepts.

Update 2009-07-07: I use Java all the time at work now. There is really no escaping it. Except maybe with something like Clojure ...


Mandelbrot Set on a Cluster

This the the second part of my post about clusters. Finally, here are some more pretty pictures and video to look at! It is an extremely parallel Mandelbrot set generator written in C.

This image was generated by my program on a cluster at my university, and my favorite image generated so far. It is part of the sequence in the videos below.


Mandelbrot set zoom

The reason I built my own cluster was to run this program. The generator forks off an arbitrary number of jobs (defined by a config file) to generate a single fractal, or a fractal zoom sequence. The cluster then automatically moves these jobs around to different nodes, making the fractal generation fast.

I wrote it with two goals in mind. I wanted it to be parallel so that it could easily take advantage of a cluster. I also wanted it to not use any external libraries. This is because a cluster is often a shared resource. Programs and libraries may only be available if installed by an administrator, meaning that extra libraries like libpng may not be available. For inter-process communication the generator uses simple pipes. So all you need here is a POSIX interface to the operating system, rather than some MPI implementation.

I used Andy Owen's handy bitmap library for writing out the bitmaps. I don't know how I could have done without it!

The only thing you need in order to run the fractal generator is a C compiler and a POSIX interface (GNU/Linux, *BSD, and other Unix-like systems). Extra capabilities are available if gzip is installed.

If you can't see the video above, here are two links to videos.

mandel-zoom.ogg - 27M, Ogg Theora (Free, unencumbered codec)
mandel-zoom.avi - 40M, Xvid

I use GNU Xaos to find good locations in the set for zoom sequences. It lets you zoom in real-time. Once I find a good spot, I tell my generator to render some nice images as a zoom sequence to it. Sometime I hope to write in an algorithm for auto-zooming. This way the generator could create zoom sequences automatically for hours on end. Xaos already has this capability for its real-time zooming.

An interesting thing I discovered: for these fractals at least, the gzipped bitmaps are (barely) smaller than the equivalent PNG versions. For the image above, the PNG version (produced by ImageMagick defaults) is 11586185 bytes. The gzipped bitmap version is 11586074 bytes. Plus, gzipping is faster. On my laptop, BMP to PNG (ImageMagick's convert) took 13.678s. Gzipping (default options) took 5.167s. I am as surprised as you are.

The project page with source code and documentation can be found at projects/mandel.


Walk Text Adventure Game (Perl and MATLAB)

I wrote this thing back in September 2005 when I had to learn MATLAB for work. This writeup was done about year later.

Download the code (GPL licensed) at download/walk.tar.gz (33.72KB). There are both Perl an MATLAB versions.

This was a text-adventure engine I wrote when I was starting to learn MATLAB programming, which means the MATLAB code here is pretty horrendous. It may or may not work in GNU octave, I didn't bother checking. That's because I later wrote a Perl version that does the same thing (when I was still starting with learning Perl, so not much better). Included is a sample "world" to explore (in English and pig latin!). The format for the "world" is a really simple one, probably missing a lot of features of a real text adventure scripting language. Have fun!

This is what it looks like,

Welcome to Walk. Just type in plain English into the 'Action:' line what you
want to do.

Type 'inv' to display the inventory.
Type 'clc' to clear the screen
Type 'look' or 'ls' to get the current description of the room.
Type 'save' to quicksave the session.
Type 'load' to restore the quicksave session.
Type 'doc' or 'help' to see this message again.
Type 'quit' to quit.

 Action: look

 You appear to be in a small bedroom. There are no windows. Light is
provided by a small ceiling lamp. There is a brown wooden door, a small bed
with no sheets, a small box which is laying on the bed.

 Action: _

Converting MediaWiki Markup to LaTeX

Today I had a large document written in MediaWiki markup. The document was a simple one consisting only of paragraphs, all possible heading levels, and flat lists. There were no mathematical expressions or images -- nothing fancy whatsoever. I wanted this document available in LaTeX markup so that I could have a nice, professional looking printout.

After some brief searching I couldn't find a practical, simple script to convert the document to LaTeX markup (I didn't look very hard). Everything I found wanted to do the conversion the opposite way that was needed. It ended up that writing and debugging my own Perl script to do the job only took me about 30 minutes. You can see it here (BSD-style license),

wiki2latex.pl

Here are the main guts to give you the gist of it,

while (<>) {
    # Sections
    s/======([^=]+)======/\\subparagraph{$1}/g;
    s/=====([^=]+)=====/\\paragraph{$1}/g;
    s/====([^=]+)====/\\subsubsection{$1}/g;
    s/===([^=]+)===/\\subsection{$1}/g;
    s/==([^=]+)==/\\section{$1}/g;
    
    # Special characters
    s/([&#_])/\\$1/g;
    
    # Lists
    if (m/^\* / && !$listmode) {
	print "\\begin{itemize}\n";
	$listmode = 1;
    }
    if (!m/^\* / && $listmode) {
	print "\\end{itemize}\n";
	$listmode = 0;
    }
    s/^\* /\\item{} /;
    
    print;
}

Here is some sample input markup and output,

sample.wiki
sample.tex
sample.pdf

The script will write out all the header and footer markup (including title page and table of contents, which is what I needed) so that the output can go right into LaTeX for processing. The script is very far from being complete in terms of a "MediaWiki to LaTeX converter", and I have no intention on making it any more complete either. It does only what I needed it to do: handle a few special characters, convert headings, and create lists. I am providing it here in case someone finds it useful or interesting. Perhaps it may serve as a stepping stone for creating something more complete.


Proposal for a Free Musical

An idea I had some time ago would be for a free (as in speech) musical (or play). It could be licensed under the GPL or something like it (the GNU Free Musical License, GFML, perhaps?). For example, think of the scripts and music scores as the "source code" for the musical, where these documents must be provided to ticket-holders upon request in digital form, such as on a CD, or printed. Just like free software, we are mostly concerned with preserving the freedom of the user/audience. These are free musicals.

Have you ever watched a great musical and your skin tingled at the orchestra's crescendo during the hero's solo? Or perhaps you choked up at a dramatic scene? These moments should not be locked up so that they cannot be shared. Free musicals would be another step towards a free culture where these scenes are not lost, where everyone is free to share his or her own culture. This freedom does not exist fully today. For instance, I can write a story about vampires but I can't write one about Jedi.

Just as free software developers don't have to starve, neither do free musical composers and writers. The word free refers to freedom, not price. A free musical author can distribute the musical to a producer for any price he or she wants.

You see, I was involved in my high school musicals growing up and I remember how they had to pay some steep royalties to put these shows on. One year, to help pay for the show we had collected spare change from students during lunch periods. Even for all this expense, we weren't even permitted to make copies of the music and scripts as needed for use in the production (these cost extra). We were being dominated by the musical's publisher.

This, of course, did not stop these extra copies from being made. It's just another bad law which should have been and was ignored. Remember, the act of breaking laws itself is not wrong. You get to decide right and wrong for yourself. No one, especially a politician, can do this for you. I would wager that, in the US at least, just about everyone breaks some law at least once a month. Okay, back to free musicals.

If there were free musicals from which to choose, once a high school (or anyone) obtained a copy of a musical's source code they would be free to put on a production without paying any special or additional per-seat or per-ticket royalties. They could make as many copies of scripts and scores as needed without having to break any laws. They could even send copies to other schools.

Let's say a choir teacher (or whoever is directing things) goes out and sees a free musical somewhere. She enjoys the show so much she wants to have her students perform it as the next year's production. As a ticket-holder, she requests and receives the source code for this show. That's it! She can put on this show for the cost of a single ticket. In other cases, someone might be feeling generous and make the source code available to anyone at no cost to anyone who asks.

Since I have had the idea, I have dreamed of writing a free musical. Unfortunately, my writing skills are poor and my music skills are even worse. I have arranged music for a marching band in the past, but have done no serious composition. Maybe some day I will be good enough to write one. I would like to learn GNU LilyPond sometime and writing a free musical would be good practice.

Of course, an existing musical could always be liberated (expensively, without doubt) and turned into a free musical.


BCCD Clusters

I have recently really gotten into clustering with excitement to write something that can really take advantage of one. The first thing I had to do was figure out how to cluster my computers. The second was to write something that could really use my personal cluster to its full potential. This post is about the first step.

A live CD was the obvious choice (and still is) to begin clustering: no installation and quick setup. I started with dyne:bolic, a GNU/Linux distribution geared for editing and creating multimedia while strictly using free software (already off to a great start!). I had read that dyne:bolic had clustering capabilities with openMosix, which was my reason for looking at it in the first place. I found out later (after digging deep into documentation a dyne:bolic mirror) that dyne:bolic doesn't include openMosix anymore. No good for me.

Next I found BCCD, a live CD distribution based on LNX-BBC, the same distribution used by the Free Software Foundation for its business-sized CD member cards. This one is fantastic! I think this is due to its goal as an educational tool, making it easy to learn about clusters while using it. It comes with several programming examples and full MPI development capabilities.

The only objection I have about BCCD is that the boot/setup process is too interactive and manual. When booting off a CD, you must create and confirm a password (good!), hit enter through several dialogs to detect network cards (annoying), plus a dialog about DHCP (not bad). Then you have to broadcast a heartbeat and enable the openMosix detection daemon. I also have a problem with it occasionally trashing the fat16 file system my USB thumb drives. Fortunately, this only causes trivial, and small data loss as the drive contains only data generated by the cluster.

I own two computers and have a access third one (a laptop) borrowed from my school for use in a robot design lab. I have not yet employed my roommate's computer in my cluster (only a matter of time!). Since my NETGEAR router has only 4 ports, 4 computers is the max for my cluster. Booting up 4 computers with BCCD can be annoying enough, but imagine booting 30 (in some computer lab) by yourself! I bet they pictured a lab full of students all handling their own boot/setup sequence when the BCCD team made these decisions.

As I said before, BCCD has great MPI facilities (which their example programs take advantage of), including support for synced directories. I have yet to employ the MPI, however, as I find fork() and pipe() to be much more convenient in terms of coding and execution (run and forget). If you can tell me why MPI is better than a regular POSIX pipe when it comes to clusters, send me an e-mail. I haven't figured why yet, if it is even true in the first place.

I have a working version of a program that can take advantage of a cluster. I will write about this later as a sort-of "part 2".


Memory Allocation Pool

I wrote this code and writeup some time ago and have used it in several projects for building parse trees quickly. Using the pool is faster than making a system call with malloc each time you add a new node. In fact, one of my programs ran 25% faster when I dropped in the this memory allocation pool. This code is very solid and reliable.

Download the code (GPL licensed) at download/alloc-pool.tar.gz (2.58KB).

I read about memory allocation pools in the Subversion manual and decided to write one for fun. So here it is. Included is a small driver I ran over night to test for memory leaks. lint will complain about memory leaks on this code, however. I also added thread safety, but I don't suggest you use it.

A memory allocation pool is good for speeding up a program that needs to make many small memory requests quickly (many system calls). Instead, one system call is made in place of many.

It is also useful for semi-automatic memory management. Let's say you build some large tree structure somewhere in your program. If you want to free all this memory used by the tree, you will need to traverse it to take it down. This takes time and code. If you use a memory pool, you can free all of the memory at once by freeing the entire memory pool.

The pool works by allocating a large chunk of memory and dishes it out as requested (a subpool). If a request is too large to take out of the current chunk, it allocates another chunk twice as large as the previous one (another subpool). This doubling allows the pool to quickly scale up to whatever size is needed. Memory will still be allocated from the old pool until that pool has too many misses in a row (hard coded to 10 in my sources). Once this happens, the subpool remains untouched and your pool will have some slight internal fragmentation.


PNG Archiver - Share Files Within Images

This is one of my projects. The original idea for this project came from Sean Howard's Gameplay Mechanic's #012. The basic idea here is that image files are the second easiest type of data to share on the Internet (the first being text). Sharing anything other than images may be difficult, so Why not store files within an image as the image data? This is not steganography as the data is not being hidden. In fact, the data is quite obvious because we are trying to make the data as compact as possible in the image.

My "PNG Archiver" is usable but should still be considered alpha quality software. I am adding support for different types of PNGs (currently it does 8-bit RGB only), but I have found that using the libpng library gives me headaches. The archiver can actually only store a single file (just as gzip doesn't know what a file is). This is because I do not want to duplicate the work of real file archivers like tar. To store multiple files, make a "png-tarball".

Here are a couple examples. Below is the US Constitution (gzipped),
PNG Archive
And this is an audio recoding I made (encoded in Ogg Vorbis),
PNG Archive

The PNG Archiver stores a checksum in the image that allows it to verify that the data was received correctly. This also allows it to automatically scan the image for data. When it reads in a piece that fulfills the checksum it assumes that it found the data you are looking for. You can decorate the image with text or a border and the archiver should still find the data as long as you didn't disturb it. (examples of this on the project page)

You can find the project page with detailed information (duplicating some of what I said here) under projects/pngarch.


YouTube with Free Software

Update 2009-6-30: Thanks to HTML 5 and the video tag I will be self-hosting videos from now on. This is information is only historical.

As I have stated previously, I love free software and I try to use free software exclusively whenever I can (it is very difficult to find employment in computer engineering where no proprietary software is used). This can pose a problem when I want to watch YouTube videos because I do not use the proprietary, non-free Flash player. The free Flash players currently either handle YouTube poorly or not at all. I also find Flash annoying enough that I am not interested in using these free Flash players anyway (fewer ads automatically!).

Like everyone else with an e-mail address, I get links to videos on YouTube from my friends. I also post videos there myself under the name "throwaway0" as it is convenient not only for me, but also for anyone who wants to watch the videos. Now, if the only way to watch these videos was with proprietary software, I would not encourage this. In fact, not too long ago this was true of most online video, which was limited to a "choose your poison" type situation between the proprietary, worthless Windows Media and QuickTime formats. No poison for me, thanks.

I have discovered several solutions to watching YouTube with free software. There are two steps involved: getting the video and playing the video. For the first you have youtube-dl and Firefox Fast Video Download.

youtube-dl is a Python script that you can easily install on your system. You just give it a YouTube URL and it does all the work. It feels a bit like wget. The Firefox add-on will add a nice little icon like this,
Icon screenshot
Clicking this icon will download the video. With either tool, you will have this .flv file somewhere that you want to watch.

To watch the video file, you can use mplayer (without the w32 codecs) or VLC. As far as I know, these videos are handled with free software only when using these players. They play fine (except without seeking) on my system. I use Debian GNU/Linux so I am pretty confident that my system is strictly free software.

Now you can watch YouTube without having to fall victim to proprietary software.


Mandelbrot with GNU Octave

In preparation for another project idea I have (to be posted in the future), I wrote a Mandelbrot fractal generator in Octave. Octave is great for just trying things out and prototyping your algorithms. It is very slow, however. Here is the code,

function mandel_img = mandel ()
  ## Parameters
  w = [-2.5 1.5]; # Domain
  h = [-1.5 1.5]; # Range
  s = 0.005;      # Step size
  it = 64;        # Iteration depth
  
  ## Prepare the complex plane
  [wa ha] = meshgrid (w(1):s:w(2), h(1):s:h(2));
  complex_plane = wa + ha * i;
  
  ## Preallocate image
  mandel_img = zeros( length(h(1):s:h(2)), length(w(1):s:w(2)));
  
  ## Generate mandelbrot
  for wi = 1:size(mandel_img, 2)
    for hi = 1:size(mandel_img, 1)
      
      z = 0;
      k = 0;
      while k < it && abs(z) < 2
	z = z^2 + complex_plane (hi, wi);
	k = k + 1;
      end
      mandel_img (hi, wi) = k - 1;
      
    end
    ## Display progress
    waitbar (wi/size(mandel_img, 2));
  end
  
end

You may need to comment out the waitbar line if you do not have Octave-Forge installed properly (as is the case with Octave 2.9 on Debian as of this writing) or at all. You will also need Octave-Forge if you want to use the image functions described below.

You can find the same code all over the Internet for many different languages. The advantage with Octave is that it knows about complex numbers so that this can be expressed directly with z = z^2 + c and abs(z). For an introduction on how the Mandelbrot is generated, check out the Wikipedia article.

Now, this code just generates a matrix of the escape iteration numbers for each pixel. To visualize this, you will need to use the image functions. The simplest thing to do is view the data as a boring greyscale image.

octave> m = mandel(); # Generate the data
octave> imshow(m);

You should see something like this,
Greyscale Mandelbrot
You can save this as an image with imwrite,

octave> imwrite("mandel.png", m*4)

The *4 part is because the iteration depth was set to 64. The image being written will have values between 0 and 255. This allows the data to use the full dynamic range of the image.

If you want more interesting images, you can apply a colormap. Octave-Forge has two handy color maps, hot and ocean (cool). To make the inside of the fractal black, which are the points that are part of the set and never escaped, stick black on the end of the colormap. This can be done like this (viewing and saving),

octave> cmap = [hot(63); 0 0 0];  # The colormap
octave> imshow(m + 1, cmap);
octave> imwrite("mandel.png", m + 1, cmap);

Hot Mandelbrot

m is between 0 and 63. We add one to it to put it between 1 and 64. Then we take the colormap of length 63 and stick black on the end. If you substitute ocean for hot, you will get a nice blue version.

Cool Mandelbrot

You can modify the code above to try to get different fractals. For example, try z = z^4 + c instead,

Cool Mandelbrot

More on fractals another time.


null program is Alive

Welcome to null program. This is my new home as my student location will no longer exist in a few months. Here is where I will share my projects and ideas, even if no one is looking, reading, or caring.

The name for this website comes from a phrase spoken in several places in my favorite book, The Moon is a Harsh Mistress. If you have not read this book, you are missing out. I recommend you go to the nearest library immediately and read it. Amazon lets you read the first chapter online if you want to get a taste of it. When I started reading it, I was hooked right away. Why is it a great book? Because it is about an American Revolution style libertarian revolution on the moon, and the hero character is a computer engineer. What book can get better than that?

I would like to post something interesting here about once a week or so. I am taking classes right now, so getting ideas should be relatively easy. The first few posts will probably be repeating things I have already published at my old student location. This will include little intros to my old projects and a bunch of code snippets.

Also, separate from this blog thing are my projects, such as the PNG Archiver and binitools. These have a permanent home located under projects/.

I am a big free software fan and you will probably see the GNU project being mentioned a lot around here. I also use Emacs for just about everything I can. Here is more advice: learn to use Emacs. You will become much more productive. After using Emacs for awhile, you will begin loathing word processors and mice (the computer kind). More on this later.


Post Index

There are 130 posts.


Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.