null program

Memoization

I had written in a previous post about a neat feature of Lua. I found out later that this is simply a form of Memoization. The idea is that you trade memory for speed by only doing calculations once and keeping track of previously calculated values. I had even complained about Perl hashes not being flexible enough (not true, thanks to Tie::Hash). Perl actually has something even cooler, which is the Memoize module.

The module can memoize any function, although it is only useful on "pure" functions: functions with no side effects and not dependant external data that will change. The official documentation contains a nice example demonstrating a recursive implementation of a Fibonacci sequence generator. My example is a little program I wrote the other day where the memoize module came in handy.

You have coins valued at 1, 2, 5, 10, 20, 50, 100, and 200. How many different ways can 200 be made using any number of coins. A simple recursive solution is this: stick in each coin one at a time and ask the same question again. So, we use a coin worth 1, now the question is how many ways can we make change for 199. Then 198, then 195, then 190, etc. Because the order of the coins is not important these two sets are identical: (1 1 5) (5 1 1). So, to avoid counting the same set twice, we also want to tell the function the largest size coin to use from then on. Our function may look like this now (Perl),

use List::Util qw(sum);

sub count {
    my ($s, $m) = @_;
    return 1 if ($s == 0);
    
    my @valid = grep {$_ <= $s and $_ >= $m} @coins;
    return 0 if ($#valid == -1);
    
    return sum map {count($s - $_, $_)} @valid;
}

Where it is called as count(total, max_coin_value).

However, we will be calculating the same value twice many times over. For example, lets say we start filling the first 10 of 200 like this: (1 1 1 1 1 5) or (5 5). The next call to count will be count(190, 5) for both cases. Just like the recursive Fibonacci implementation, we are spending an enormous amount of time repeating ourselves. Running this for a value of 200 will take minutes. Running it for a value of 2000 may take days! Memoization to the rescue!

We will now add this,

use Memoize;
memoize('count');

The module has now transparently installed a new version of the function over our original. If we ever pass the same arguments that we already have passed, the module will look up the original calculated value and return it instead of calling the real function. It now can calculate the number of ways to make change for a value of 2000 in a couple seconds rather than days. That's how much redundant work the function was doing. Here is the whole thing,

#!/usr/bin/perl

use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(sum);

use Memoize;
memoize('count');

my @coins = (1, 2, 5, 10, 20, 50, 100, 200);

print count(200, 1);
print "\n";

sub count {
    my ($s, $m) = @_;
    return 1 if ($s == 0);
    
    my @valid = grep {$_ <= $s and $_ >= $m} @coins;
    return 0 if ($#valid == -1);
    
    return sum map {count($s - $_, $_)} @valid;
}

Now, to apply it to the Collatz problem from my previous post we get a nice simple little program,

#!/usr/bin/perl

use strict;
use warnings;
no warnings qw(recursion);
use List::Util qw(max);

use Memoize;
memoize('collatz');

while (<>) {
    my ($n, $m) = split;
    printf("$n $m %d\n", max map { collatz($_) } ($n..$m));
}

sub collatz {
    my $n = shift;
    return 1 if ($n == 1);
    return 1 + collatz(3*$n+1) if ($n & 1);
    return 1 + collatz($n/2);
}

I really do love Perl.


Lisp Number Representations

This exercise partly comes from a couple different chapters in the book The Little Schemer. The book is an introduction to the Scheme programming language, a dialect of Lisp. The purpose to to teach basic programming concepts in a way that anyone can follow along just as well as someone with a degree in, say, computer science. It is still very useful for us programmer types because there are some good practice you get from reading and playing along.

First of all, Lisp is famous (infamous?) for lacking syntax. Any Lisp program is simply an S-expression, put simply, a list of lists. There is no operator precedence because operators are treated just like functions. This leads to prefix notation for mathematical expressions,

(+ 4 5)
=> 9

where the => indicates the result of evaluating the expression. We can apply as many operands as we want,

(+ 2 3 4 5 10)
=> 24

We can put another list right in there as an operand,

(+ 3 (* 2 5) 4)
=> 17

You get the idea. In a function, the value of the last expression is the return value. For example, here is the square function in Scheme, which squares its input,

(define (square x)
  (* x x))

Then we can use it,

(+ (square 2) (square 5))
=> 29

There are three important list operators to understand as well: car, cdr, and cons. car returns the first element in a list. In the example below, the ', a single quote, tells the interpreter or compiler that the list is to be treated as data and not to be executed. This is shorthand, or syntactic sugar, for the quote operator: (quote (stallman moglen)) is the same as '(stallman moglen).

(car '(stallman moglen lessig))
=> stallman

cdr returns the "rest" of a list (everything but the car of the list). When passing a list with only one element cdr returns the empty list: ().

(cdr '(stallman moglen lessig))
=> (moglen lessig)
(cdr '(stallman))
=> ()

We can ask if a list is empty or not with null?. #t and #f are true and false.

(null? '(stallman moglen lessig))
=> #f
(null? '())
=> #t

And finally, for lists, we have cons. This function allows us to build a list. It glues the first argument to the front of the list in the second argument,

(cons 'stallman '(moglen lessig))
=> (stallman moglen lessig)
(cons 'stallman '())
=> (stallman)

And one last function you need to know: eq?. It determines the two atoms are the same atom,

(eq? 'stallman 'moglen)
=> #f
(eq? 'stallman 'stallman)
=> #t

Now, for this exercise we will pretend that the basic arithmetic functions have not been defined for us. Instead all we have is add1 and sub1, each of which adds or subtracts 1 from its argument respectively.

(add1 5)
=> 6
(sub1 5)
=> 4

Oh, I almost forgot. We also have the zero? function defined for us, which tells us if its argument is 0 or not. Notice that functions that return true or false, called predicates, have a ? on the end.

(zero? 2)
=> #f
(zero? 0)
=> #t

To make things simple, these definitions will only consider positive numbers. We can define the + function (for only two arguments) in terms of the three basic functions shown above. It might be interesting to try to write this yourself before you look any further. (Hint: define it recursively!)

;; Adds together n and m
(define (+ n m)
  (if (zero? m) n
      (add1 (+ n (sub1 m)))))

If the second argument is 0 we are done and simply return the first argument. If not, we add 1 to n + (m - 1). The - function is defined similarly.

;; Subtracts m from n
(define (- n m)
  (if (zero? m) n
      (sub1 (- n (sub1 m)))))

Multiplication is the act of performing addition many times. We can go on defining it in terms of addition,

(define (* n m)
  (if (zero? m) 0
      (+ n (* n (sub1 m)))))

(We'll leave division as an exercise for the reader as it gets a little more complicated than I need to go in order to get my overall point across.)

We will leave math behind for a moment take a look at The Roots of Lisp. In that link is an excellent paper written by Paul Graham about John McCarthy, the inventor (or perhaps discoverer?) of Lisp, and how Lisp came to be. It turns out that in order to have a fully functional Lisp engine we only need seven primitive operators: operators defined outside of the language itself as building blocks for the language. For Lisp these seven operators are (Scheme-ized for our purposes): eq?, atom?, car, cdr, cons, quote, and if.

Notice how none of these are math operators. You may wonder how we can possibly perform mathematical operations when we lack these facilities. The answer: we have to define our own representation for numbers! Let's try this, define a number as a list of empty lists. So, the number 3 is,

'(() () ())

And here is 0, 2, and 4,

'()
'(() ())
'(() () () ())

See how that works? Before, when we wanted to define addition and subtraction, we needed three other functions: zero?, add1, and sub1. With our number representation, how could we define add1 with our seven primitive operators? Our numbers are defined as lists, so we can use our list operators. To add 1 to a number, we append another empty list. Hey, that sounds a lot like cons!

(define (add1 n)
  (cons '() n))

Subtraction is removing an element from the list, which sounds a lot like cdr,

(define (sub1 n)
  (cdr n))

And to define zero? we need to check for an empty list. Notice this will also be the definition for null?.

(define (zero? n)
  (eq? '() n))

And now we are back where we started. In fact, you can use the exact definitions above to define +, -, and *. Our entire method number representation depends on how we define add1, sub1, and zero?. Let's try it out,

;; 3 + 4
(+ '(() () ()) '(() () () ()))
=> (() () () () () () ())

;; 5 - 2
(- '(() () () () ()) '(() ()))
=> (() () ())

;; 2 * 2
(* '(() ()) '(() ()))
=> (() () () ())

;; 3 + 4 * 2   bolded for clarity
(+ (* '(() () () ()) '(() ())) '(() () ()))
=> (() () () () () () () () () () ())

Pretty cool, huh? We just added arithmetic (albeit extremely simple) to our basic Lisp engine. With some modifications we should be able to define and operate on negative integers and even define any rational number (limited by how much memory your computer's hardware can provide).

Now, thank goodness this isn't how real Lisp implementations actually handle numbers. It would be incredibly slow and impractical, not to mention annoying to read. Normally, numbers and math operators are primitive so that they are fast.


Rumors of My Death Have Been Greatly Exaggerated

It has been a few weeks since I made my last post. I just started my new job at Johns Hopkins University Applied Physics Laboratory, which has reminded my just how exhausting starting a new job can be. There is so much to learn and balance all at once, and when I get home from work I don't feel like doing anything constructive. This is why I haven't posted anything recently.

However, I have been working on the problems (25 completed so far) over at Project Euler. If you like programming and/or math, I strongly suggest you check it out. It would also be handy to be familiar with a language or system that natively supports (ruby, python, scheme, common lisp, bc, etc.) or transparently supports (perl with the bignum module) multi-precision math. Many of the "tricks" of the problems involve very large numbers, well beyond normal 32-bit or 64-bit integers.

I would also suggest you follow the "rules": any program you write should have a running time of less than a minute. All the problems can be solved with programs that don't need to run overnight. And don't search Google for answers until you have solved it yourself. If you do this, you will be missing out!

Anyway, hopefully I will get something interesting written up soon. I want to get back to my once a week schedule. And happy Pi Day and Talk Like a Physicist Day!


Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.