I had written
in a
previous post about a neat feature of Lua. I found out later that
this is simply a form
of Memoization. The
idea is that you trade memory for speed by only doing calculations
once and keeping track of previously calculated values. I had even
complained about Perl hashes not being flexible enough (not true,
thanks
to Tie::Hash). Perl
actually has something even cooler, which is the
Memoize
module.
The module can memoize any function, although it is only useful on
"pure" functions: functions with no side effects and not dependant
external data that will change. The official documentation contains a
nice example demonstrating a recursive implementation of a Fibonacci
sequence generator. My example is a little program I wrote the other
day where the memoize module came in handy.
You have coins valued at 1, 2, 5, 10, 20, 50, 100, and 200. How many
different ways can 200 be made using any number of coins. A simple
recursive solution is this: stick in each coin one at a time and ask
the same question again. So, we use a coin worth 1, now the question
is how many ways can we make change for 199. Then 198, then 195, then
190, etc. Because the order of the coins is not important these two
sets are identical: (1 1 5) (5 1 1). So, to avoid counting the same
set twice, we also want to tell the function the largest size coin to
use from then on. Our function may look like this now (Perl),
use List::Util qw(sum);
sub count {
my ($s, $m) = @_;
return 1 if ($s == 0);
my @valid = grep {$_ <= $s and $_ >= $m} @coins;
return 0 if ($#valid == -1);
return sum map {count($s - $_, $_)} @valid;
}
Where it is called as count(total, max_coin_value).
However, we will be calculating the same value twice many times
over. For example, lets say we start filling the first 10 of 200 like
this: (1 1 1 1 1 5) or (5 5). The next call to count will
be count(190, 5) for both cases. Just like the recursive
Fibonacci implementation, we are spending an enormous amount of time
repeating ourselves. Running this for a value of 200 will take
minutes. Running it for a value of 2000 may take days! Memoization to
the rescue!
We will now add this,
use Memoize;
memoize('count');
The module has now transparently installed a new version of the
function over our original. If we ever pass the same arguments that we
already have passed, the module will look up the original calculated
value and return it instead of calling the real function. It now can
calculate the number of ways to make change for a value of 2000 in a
couple seconds rather than days. That's how much redundant work the
function was doing. Here is the whole thing,
#!/usr/bin/perl
use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(sum);
use Memoize;
memoize('count');
my @coins = (1, 2, 5, 10, 20, 50, 100, 200);
print count(200, 1);
print "\n";
sub count {
my ($s, $m) = @_;
return 1 if ($s == 0);
my @valid = grep {$_ <= $s and $_ >= $m} @coins;
return 0 if ($#valid == -1);
return sum map {count($s - $_, $_)} @valid;
}
Now, to apply it to the Collatz problem from my previous post we get a
nice simple little program,
#!/usr/bin/perl
use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(max);
use Memoize;
memoize('collatz');
while (<>) {
my ($n, $m) = split;
printf("$n $m %d\n", max map { collatz($_) } ($n..$m));
}
sub collatz {
my $n = shift;
return 1 if ($n == 1);
return 1 + collatz(3*$n+1) if ($n & 1);
return 1 + collatz($n/2);
}
This exercise partly comes from a couple different chapters in the
book The
Little Schemer. The book is an introduction to the Scheme
programming language, a dialect of Lisp. The purpose to to teach basic
programming concepts in a way that anyone can follow along just as
well as someone with a degree in, say, computer science. It is still
very useful for us programmer types because there are some good
practice you get from reading and playing along.
First of all, Lisp is famous (infamous?) for lacking syntax. Any Lisp
program is simply
an S-expression,
put simply, a list of lists. There is no operator precedence because
operators are treated just like functions. This leads to prefix
notation for mathematical expressions,
(+ 4 5)
=> 9
where the => indicates the result of evaluating the
expression. We can apply as many operands as we want,
(+ 2 3 4 5 10)
=> 24
We can put another list right in there as an operand,
(+ 3 (* 2 5) 4)
=> 17
You get the idea. In a function, the value of the last expression is
the return value. For example, here is the square
function in Scheme, which squares its input,
(define (square x)
(* x x))
Then we can use it,
(+ (square 2) (square 5))
=> 29
There are three important list operators to understand as
well: car, cdr,
and cons. car returns the first element in a
list. In the example below, the ', a single quote, tells
the interpreter or compiler that the list is to be treated as data and
not to be executed. This is shorthand, or syntactic sugar, for
the quote operator: (quote (stallman
moglen)) is the same as '(stallman moglen).
(car '(stallman moglen lessig))
=> stallman
cdr returns the "rest" of a list (everything but
the car of the list). When passing a list with only one
element cdr returns the empty list: ().
And finally, for lists, we have cons. This function
allows us to build a list. It glues the first argument to the front of
the list in the second argument,
Now, for this exercise we will pretend that the basic arithmetic
functions have not been defined for us. Instead all we have
is add1 and sub1, each of which adds or
subtracts 1 from its argument respectively.
(add1 5)
=> 6
(sub1 5)
=> 4
Oh, I almost forgot. We also have the zero? function
defined for us, which tells us if its argument is 0 or not. Notice
that functions that return true or false, called predicates, have
a ? on the end.
(zero? 2)
=> #f
(zero? 0)
=> #t
To make things simple, these definitions will only consider positive
numbers. We can define the + function (for only two
arguments) in terms of the three basic functions shown above. It might
be interesting to try to write this yourself before you look any
further. (Hint: define it recursively!)
;; Adds together n and m
(define (+ n m)
(if (zero? m) n
(add1 (+ n (sub1 m)))))
If the second argument is 0 we are done and simply return the first
argument. If not, we add 1 to n + (m -
1). The - function is defined similarly.
;; Subtracts m from n
(define (- n m)
(if (zero? m) n
(sub1 (- n (sub1 m)))))
Multiplication is the act of performing addition many times. We can go
on defining it in terms of addition,
(define (* n m)
(if (zero? m) 0
(+ n (* n (sub1 m)))))
(We'll leave division as an exercise for the reader as it gets a
little more complicated than I need to go in order to get my overall
point across.)
We will leave math behind for a moment take a look at
The Roots of
Lisp. In that link is an excellent paper written by Paul Graham
about John McCarthy, the inventor (or perhaps discoverer?) of Lisp,
and how Lisp came to be. It turns out that in order to have a fully
functional Lisp engine we only need seven primitive operators:
operators defined outside of the language itself as building blocks
for the language. For Lisp these seven operators are (Scheme-ized for
our purposes): eq?, atom?, car,
cdr, cons, quote, and
if.
Notice how none of these are math operators. You may wonder how we can
possibly perform mathematical operations when we lack these
facilities. The answer: we have to define our own representation for
numbers! Let's try this, define a number as a list of empty lists. So,
the number 3 is,
'(() () ())
And here is 0, 2, and 4,
'()
'(() ())
'(() () () ())
See how that works? Before, when we wanted to define addition and
subtraction, we needed three other
functions: zero?, add1,
and sub1. With our number representation, how could we
define add1 with our seven primitive operators? Our
numbers are defined as lists, so we can use our list operators. To add
1 to a number, we append another empty list. Hey, that sounds a lot
like cons!
(define (add1 n)
(cons '() n))
Subtraction is removing an element from the list, which sounds a lot
like cdr,
(define (sub1 n)
(cdr n))
And to define zero? we need to check for an empty
list. Notice this will also be the definition for null?.
(define (zero? n)
(eq? '() n))
And now we are back where we started. In fact, you can use the exact
definitions above to define +, -,
and *. Our entire method number representation depends on
how we define add1, sub1,
and zero?. Let's try it out,
Pretty cool, huh? We just added arithmetic (albeit extremely simple)
to our basic Lisp engine. With some modifications we should be able to
define and operate on negative integers and even define any rational
number (limited by how much memory your computer's hardware can
provide).
Now, thank goodness this isn't how real Lisp implementations actually
handle numbers. It would be incredibly slow and impractical, not to
mention annoying to read. Normally, numbers and math operators are
primitive so that they are fast.
It has been a few weeks since I made my last post. I just started my
new job at Johns Hopkins University Applied Physics Laboratory, which
has reminded my just how exhausting starting a new job can be. There
is so much to learn and balance all at once, and when I get home from
work I don't feel like doing anything constructive. This is why I
haven't posted anything recently.
However, I have been working on the problems (25 completed so
far) over at
Project Euler. If you like
programming and/or math, I strongly suggest you check it out. It would
also be handy to be familiar with a language or system that natively
supports (ruby, python, scheme, common lisp, bc, etc.) or
transparently supports (perl with the bignum module) multi-precision
math. Many of the "tricks" of the problems involve very large numbers,
well beyond normal 32-bit or 64-bit integers.
I would also suggest you follow the "rules": any program you write
should have a running time of less than a minute. All the problems can
be solved with programs that don't need to run overnight. And don't
search Google for answers until you have solved it yourself. If you do
this, you will be missing out!
Anyway, hopefully I will get something interesting written up soon. I
want to get back to my once a week schedule. And happy Pi Day
and Talk Like a Physicist
Day!
Don't stop here! This isn't everything. Check out the archives
(on the left) for more posts. Or just have a look at
the index.