This is something I learned the other day that I thought was
interesting.
So you have some data from experimentation or from a function that is
difficult to solve.
Suppose you want to estimate a polynomial curve to that fits the
data. Then you could interpolate values between the data points. Let
p(x) be the polynomial. The equation for the polynomial we will fit to
the data will look like this,
The a's are the coefficients in our polynomial. We
know x and we want to satisfy the condition,
which, when we want to solve it will take the form,
Where the a's are our unknowns for which we are solving. Notice
something? This is the linear system,
We just have to solve for the a vector to get our
coefficients. I quickly wrote this GNU Octave code to try this out,
function a = npoly (x, y)
X = repmat (x', 1, length(x));
for i = 1:length(x)
X(:,i) = X(:,i) .^ (i - 1);
end
a = X \ y';
end
This is just an extremely simple and slow version of
Octave's polyfit function, except for the order of the
coefficients solution vector. I also wrote this function that will
take the coefficient vector and a value and do the polynomial
interpolation at that point (Octave's polyval),
function v = psolve (x, a)
v = zeros (size (x));
for i = 1:length(a)
v = v + a(i) * x.^(i-1);
end
end
Here is an example of my polynomial interpolation function recognizing
a parabola,
octave:88> x = 0:.5:3;
octave:89> y = x.^2;
octave:90> a = npoly(x, y)
a =
0.00000
-0.00000
1.00000
-0.00000
0.00000
-0.00000
0.00000
See how only the quadratic component is left because the zeros cancel
out everything else? Here is an example with some added Gaussian
noise, imitating data that might be pulled from a scientific
experiment,
octave:142> x = 0:.5:3;
octave:143> y = x.^2 + randn(size(x))*0.5;
octave:144> a = npoly(x, y);
octave:145> plot(x, y, "b*");
octave:146> hold on
octave:147> x_test = 0:.05:3;
octave:148> y_test = psolve(x_test, a);
octave:149> plot (x_test, y_test, "r")
You can see that the order of our polynomial is too high for the data
we are using. The main problem, however, it that the linear system is
ill-conditioned. The condition number of the generated X matrix above
is 151900, meaning small changes in x result in large changes in the
solution. If we step out a bit you can see the polynomial quickly
diverge from the given data,
So, I definitely wouldn't use this for extrapolation.
All of the equation images were produced by this LaTeX
document:
pfit.tex
So, I had this idea using a genetic algorithm to optimize the
parameters of a program that plays chess. Now, the genetic algorithm
wouldn't be used at all during a game, but rather to optimize the
board evaluation parameters beforehand. I don't know much about
writing board game AI programs, as I have only written a few of them
for fun (tic-tac-toe, connect 4, Pente). For this chess program, I am
taking a simple approach because I am more interested in seeing the
genetic algorithm at work than seeing the chess playing AI do well
against other chess AI or people.
The program would search the game tree using
the minimax
algorithm, with
some possible
optimizations added afterward. Tree searching is just a matter of
generating all possible moves and looking at them. The hard part is
the board evaluation function, which evaluates the a particular
board's score based on the arrangement of the pieces. Parameters to
this evaluation function would be, for example, the piece values. The
pawn would be locked in at a value of 1, which anchors the other
values and provides a base unit to work from.
Again, the parameters would not change during a game. We use the
genetic algorithm ahead of time to determine the parameters.
For the genetic algorithm, the set of parameters strung together makes
up a single chromosome. We maintain a pool of different chromosomes,
i.e. different sets of parameters, and breed these chromosomes
together to improve our parameter sets. We start out with a random
pool made of parameters that are most likely pretty terrible.
To evaluate the chromosomes, we need a fitness function, which
evaluates each chromosome for its level of "fitness" deciding if it
breeds or not. To do this we simply play the chromosome we are
evaluating against some base chromosome, which may just be parameters
chosen intuitively by the programmer. Or, the base chromosome could be
random too. Starting with a better base chromosome would be a good
head start, though. The fitness of the chromosome is how often it wins
against the base chromosome in, say, a few hundred games.
The most fit chromosomes are bred by taking a few parameters from each
to make a new chromosome. Mutations are occasionally added in order to
keep the chromosome pool from getting stuck in a local maximum. A
mutation involves changing one or more parameters in a chromosome
slightly in some random way. Mutations are rare and will usually be
detrimental to the chromosome quickly killing it off, but will
occasionally cause a good change that will be spread to other
chromosomes in the next generation, improving the gene pool.
We iterate this until either the maximum fitness level in the pool is
stuck for several iterations (we aren't getting anywhere and mutations
aren't helping), or the chromosomes are so good they always beat the
base chromosome, making the fitness algorithm meaningless. When this
happens, we replace the base chromosome with the best chromosome in
the pool and start over from scratch again with a random, or mostly
random, pool.
As you would expect, I have looked into parallelizing this process to
take advantage of a cluster. This is easy for several reasons. First,
evaluating chromosomes can be done simultaneously. No evaluation
depends on another chromosome's evaluation. Second, the minimax game
tree search can be parallelized so that several different processes
search the game tree and give their results back to the parent
process. This works very well because the data being sent back to the
parent will be a single integer. No need to send large amounts of data
around the network.
I spent an afternoon hacking at this, but its still too crude to share
yet. I got the non-parallel version of the chess engine built but I am
still working on the evaluation function. The genetic algorithm hasn't
been started. The only parameters at the moment are piece values. The
board evaluation function just adds up the piece values on the board
completely ignoring their positions. This makes the computer play
extremely aggressively, capturing the opponent's pieces whenever it
can. This makes for a somewhat interesting bloodbath where the board
goes empty after just a few moves.
My problem right now is finding a good way to represent piece
movements so that it can be recycled. That is, I want to represent
movements when generating my search tree, verifying the legality of a
move, and evaluating the board all the same way so that I don't have
to program in piece movements several times.
Here is what my team has been working on for the last couple
weeks. The end goal for this robot is to escape a maze, collect
blocks, and find a repository in which to drop those blocks. Someone
suggested we call it Pac-man. Here it is (without batteries),
And here it is being worked on. We added a third level to make more
room for the batteries and extra sensors.
Here is the game board. It is 8x8 feet with a 4x8 foot maze.
Building a robot is an interesting experience, but a stressful
one. Especially when you are doing it for a class. So many things
could go wrong and you can spend hours tracking down a bad soldering
job, which we once found inside an IR sensor. It was a poor
manufacturing soldering job.
So, as of this writing, the robot uses 3 infrared (IR) sensors to look
at walls and two wheel encoders for tracking the distance traveled by
each wheel. You can see the disk encoder on the inside of the wheel in
the second robot image. The robot uses 9 rechargeable nickel-metal
hydride (NiMH) AA batteries: 5 for the Freescale 68HC12
micro-controller and sensors, and 4 for the continuous rotation servo
motors. It is a competition, so I don't want to give too many details
at the moment in case another team is reading.
Right now it limps along in the maze and gets around for awhile before
drifting into a wall. This will get fixed this weekend, as our grade
depends on it. We just need to make better use of our sensors. I.e.,
it is a software issue now.
Eventually, I will put some code up here we used in the robot. It is
all done in C, of course.
In preparation for showing off the maze-navigating robot I made last
week, I give you this program I wrote last year.
When I started to learn Java, I wrote this little program for
practice. It features a GUI and multi-threading. It generates a maze
and then slowly solves the maze so you can watch it go. I wrote this
with the GNU project's Java implementation (never used the official Sun
one). The Makefile is therefore set up for gcj and
gij. The Makefile can build the byte-compiled
.jar file as well as a natively compiled version.
I have updated it to use Ant, and I also use Sun's OpenJDK these days.
The maze generation algorithm is what I believe to be called the
"straw man" algorithm: the maze starts as a matrix of individual
cells. Break down the walls between two cells that aren't already
connected and move into it. Choose another wall at random and
repeat. If there are no walls left to break down, go back a
step. Lather, rinse, repeat. If you are back in the starting cell with
no more walls to break down, you are done.
Solving the maze is done in a similar way to generation. Go forward,
right, or left if you can. If not, go back to the previous cell. This
is the same algorithm I employed in the maze-navigating robot I built,
which has kept me pretty busy. I will post pictures of it later.
You can specify the maze height, width, and cell pixel size on the
command line when you run it. This runs with the defaults,
So, why learn Java? I still don't like Java, but I had missed out on
an interesting research opportunity because I had zero Java
experience. I probably wouldn't use it on my own for anything but
practice, as my own projects don't really need to be
super-portable. Java is ugly, bulky, and slow, but I don't want to
miss any more opportunities. Right after I was turned down because of
my lack of experience, I bought a Java textbook and read it cover to
cover on vacation. More importantly, I wrote simple little Java
programs (like the one presented here) as I learned core concepts.
Update 2009-07-07: I use Java all the time at work now. There is
really no escaping it. Except maybe with something like Clojure ...
This the the second part of my post about clusters. Finally, here are
some more pretty pictures and video to look at! It is an
extremely parallel Mandelbrot set generator written in C.
This image was generated by my program on a cluster at my university,
and my favorite image generated so far. It is part of the sequence in
the videos below.
The reason I built my own cluster was to run this program. The
generator forks off an arbitrary number of jobs (defined by a config
file) to generate a single fractal, or a fractal zoom sequence. The
cluster then automatically moves these jobs around to different nodes,
making the fractal generation fast.
I wrote it with two goals in mind. I wanted it to be parallel so that
it could easily take advantage of a cluster. I also wanted it to not
use any external libraries. This is because a cluster is often a
shared resource. Programs and libraries may only be available if
installed by an administrator, meaning that extra libraries
like libpng may not be
available. For inter-process communication the generator uses simple
pipes. So all you need here is a POSIX interface to the operating
system, rather than some MPI implementation.
I used Andy Owen's handy
bitmap library for writing out the bitmaps. I don't know how I
could have done without it!
The only thing you need in order to run the fractal generator is a C
compiler and a POSIX interface (GNU/Linux, *BSD, and other Unix-like
systems). Extra capabilities are available
if gzip is installed.
If you can't see the video above, here are two links to videos.
I use GNU Xaos to find
good locations in the set for zoom sequences. It lets you zoom in
real-time. Once I find a good spot, I tell my generator to render some
nice images as a zoom sequence to it. Sometime I hope to write in an
algorithm for auto-zooming. This way the generator could create zoom
sequences automatically for hours on end. Xaos already has this
capability for its real-time zooming.
An interesting thing I discovered: for these fractals at least, the
gzipped bitmaps are (barely) smaller than the equivalent PNG
versions. For the image above, the PNG version (produced
by ImageMagick defaults) is
11586185 bytes. The gzipped bitmap version is 11586074 bytes. Plus,
gzipping is faster. On my laptop, BMP to PNG (ImageMagick's convert)
took 13.678s. Gzipping (default options) took 5.167s. I am as
surprised as you are.
The project page with source code and documentation can be found
at
projects/mandel.
Don't stop here! This isn't everything. Check out the archives
(on the left) for more posts. Or just have a look at
the index.