null program

Converted to HTML 5

I converted this website to HTML 5 from its original XHTML 1.0 Transitional. Because I was already valid, it was actually pretty easy. Well, I first converted it to XHTML 1.0 Strict by shifting some things into CSS that should have been that way in the first place (results of laziness). Then when that validated I just changed the DOCTYPE to HTML 5, added a new meta tag, checked that validation and that was it. Really simple.

One of the main reasons was so that I could use the new video tag. From now on any posts with videos will display them as in-line videos (Ogg Vorbis), provided you are using a browser with HTML 5 support. In fact, I already did it with two older posts: Knight's Tour and the Mandelbrot set zoom videos. Here, check it out,

HTML 5 is still a moving target, so I may have to do some keeping up over time. Probably not.

Video is finally a first class web citizen and it is quite exciting.


Browser URL Mangling

Public Domain angry face. All but one of the major browsers do special URL mangling as a "feature", but I think of it as bad behavior. Specifically, Firefox gets it right while IE, Opera, Safari, and Chrome all do it wrong. The last three are likely replicating the broken behavior of IE. I would rather not do this.

I ran into this specific issue work. An IT person had sent out a URL to an important internal website, which contained links to other resources. However, this person, despite being in IT, wasn't very tech-savvy and wrote all of the links with backslashes instead of slashes. When I followed with Firefox I got a 404 Not Found because Firefox asked for a filename with a backslash in it -- proper behavior.

As first I thought the page was out of date with dead links, but it was obviously a brand new page. That's when I noticed that the links had the wrong slashes. A quick check with IE showed the links as working, which is why the IT person didn't notice his error. IE was automatically, and silently, converting the backslashes to slashes.

According to RFC 1738, all URLs must encode their backslashes because of buggy transport agents (and, years later, because of buggy browsers like IE). So, if you want to use a backslash in a URL you are supposed to encode it anyway, but if you don't most major browsers will silently mangle the URL. I really, really don't like this. I know what I am doing, Mr. Browser.

It's done to replicate IE behavior, and IE did it because it's a common error for Windows users to use backslashes when they shouldn't (Windows directory separators), so it is anticipating user error (and assuming it knows best!). Let's not cripple our software to bring it down the same level as the major shortcomings of Windows, the toy OS.

I wrote a little test suite for testing if your browser does this annoying mangling. You can try it with your browser below,

URL Mangling Test

Other than the browsers listed above, the only other browser I know that fails is Links (not Lynx). Note that Opera passes the "automatic" test, but fails the "manual" test, because it doesn't mangle the http-equiv URLs used in the automatic tests.

So beware using the browsers I listed at the beginning. Not only may they be sending you to the wrong URL, but they are crippling the web for the sake of Microsoft's errors.


The Emacs Calculator

Did you know that Emacs comes with a calculator? Woop-dee-doo! Call the presses! Wow, a whole calculator! Sounds a bit lame, right?

Actually, it's much more than just a simple calculator. It's a computer algebra system! It is officially called a calculator, which isn't fair. It's an understatement, and I am sure has caused many people to overlook it. I finally ran into it during a thorough (re)reading of the Emacs manuals and almost skipped over it myself.

Ever see that demonstration by Will Wright for the game Spore several years ago? The player starts as a single-cell organism and evolves into a civilization with interstellar presence. When he started the demo he showed a cell through what looked like a microscope. No one had any idea yet what the game was about, so every time he increased the scope, from bacteria to animal, animal to civilization, civilization to space travel, interplanetary travel to interstellar travel, there was a huge reaction from the audience. It was like those infomercials: "But that's not all!!!"

As I made my way through the Emacs calc manual I was continually amazed by its power, with a similar constant increase in scope. Each new page was almost saying, "But that's not all!!!" Like an infomercial I'm going to run through some of its features. See the calc manual for a real thorough introduction. It has practice exercises that shows some gotchas and interesting feature interactions.

Fire it up with C-x * c or M-x calc. There will be two new windows (Emacs windows, that is), one with the calculator and the other with usage history (the "trail").

First of all, the calculator operates on a stack and so its basic use is done with RPN. The stack builds vertically, downwards. Type in numbers and hit enter to push them onto the stack. Operators can be typed right after the number, so no need to hit enter all the time. Because negative (-) is reserved for subtraction an underscore _ is used to type a negative number. An example stack with 3, 4, and 10,

3:  3
2:  4
1:  10
    .

10 is at the "top" of the stack (indicated by the "1:"), so if we type a * the top two elements are multiplied. Like so,

2:  3
1:  40
    .

The calculator has no limitations on the size of integers, so you work with large numbers without losing precision. For example, we'll take 2^200.

2:  2
1:  200
    .

Apply the ^ operator,

1:  1606938044258990275541962092341162602522202993782792835301376
    .

But that's not all!!! It has a complex number type, which is entered in pairs (real, imaginary) with parenthesis. They can be operated on like any other number. Take -1 + 2i minus 4 + 2i,

2:  (-1, 2)
1:  (4, 2)
    .

Subtract with -,

1:  -5
    .

Then take the square root of that using Q, the square root function.

1:  (0., 2.2360679775)
    .

We can set the calculator's precision with p. The default is 12 places, showing here 1 / 7.

1:  0.142857142857
    .

If we adjust the precision to 50 and do it again,

2:  0.142857142857
1:  0.14285714285714285714285714285714285714285714285714
    .

Numbers can be displayed in various notations, too, like fixed-point, scientific notation, and engineering notation. It will switch between these without losing any information (the stored form is separate from the displayed form).

But that's not all!!! We can represent rational numbers precisely with ratios. These are entered with a :. Push on 1/7, 3/14, and 17/29,

3:  1:7
2:  3:13
1:  17:29
    .

And multiply them all together, which displays in the lowest form,

1:  51:2842
    .

There is a mode for working in these automatically.

But that's not all!!! We can change the radix. To enter a number with a different radix, which prefix it with the radix and a #. Here is how we enter 29 in base-2,

2#11101

We can change the display radix with d r. With 29 on the stack, here's base-4,

1:  4#131
    .

Base-16,

1:  16#1D
    .

Base-36,

1:  36#T
    .

But that's not all!!! We can enter algebraic expressions onto the stack with apostrophe, '. Symbols can be entered as part of the expression. Note: these expressions are not entered in RPN.

1:  a^3 + a^2 b / c d - a / b
    .

There is a "big" mode (d B) for easier reading,

          2
     3   a  b   a
1:  a  + ---- - -
         c d    b

    .

We can assign values to variables to have the expression evaluated. If we assign a to 10 and use the "evaluates-to" operator,

          2
     3   a  b   a             100 b   10
1:  a  + ---- - -  =>  1000 + ----- - --
         c d    b              c d    b

    .

But that's not all!!! There is a vector type for working with vectors and matrices and doing linear algebra. They are entered with brackets, [].

2:  [4, 1, 5]
1:  [ [ 1, 2, 3 ]
      [ 4, 5, 6 ]
      [ 6, 7, 8 ] ]
    .

And take the dot product, then take cross product of this vector and matrix,

2:  [38, 48, 58]
1:  [ [ -14, -18, -22 ]
      [ -19, -18, -17 ]
      [ 15,  18,  21  ] ]
    .

Any matrix and vector operator you could probably think of is available, including map and reduce (and you can define your own expression to apply).

We can use this to solve a linear system. Find x and y in terms of a and b,

x + a y = 6
x + b y = 10

Enter it (note we are using symbols),

2:  [6, 10]
1:  [ [ 1, a ]
      [ 1, b ] ]
    .

And divide,

          4 a     4
1:  [6 + -----, -----]
         a - b  b - a

    .

But that's not all!!! We can create graphs if GNU Plot is installed. We can give it two vectors, or an algebraic expression. This plot of sin(x) and x cos(x) was made with just a few keystrokes,

Sample plot of given functions.

But that's not all!!! There is an HMS type for handling times and angles. For 2 hours, 30 minutes, and 4 seconds, and some others,

3:  2@ 30' 4"
2:  4@ 22' 13"
1:  1@ 2' 56"
    .

Of course, the normal operators work as expected. We can add them all up,

1:  7@ 55' 13"
    .

We can convert between this and radians, and degrees, and so on.

But that's not all!!! The calculator also has a date type, entered inside angled brackets, <> (in algebra entry mode). It is really flexible on input dates. We can insert the current date with t N.

1:  <6:59:34pm Tue Jun 23, 2009>
    .

If we add numbers they are treated as days. Add 4,

1:  <6:59:34pm Sat Jun 27, 2009>
    .

It works with the HMS format from before too. Subtract 2@ 3' 15".

1:  <4:56:32pm Sat Jun 27, 2009>
    .

But that's not all!!! There is a modulo form for performing modulo arithmetic. For example, 17 mod 24,

1:  17 mod 24
    .

Add 10,

1:  3 mod 24
    .

This is most useful for forms such as n^p mod M, which this will handle efficiently. For example, 3^100000 mod 24. The naive way would be to find 3^100000 first, then take the modulus. This involves a computationally expensive middle step of calculating 3^100000, a huge number. The modulo form does it smarter.

But that's not all!!! The calculator can do unit conversions. The version of Emacs (22.3.1) I am typing in right now knows about 159 different units. For example, I push 65 mph onto the stack,

1:  65 mph
    .

Convert to meters per second with u c,

1:  29.0576 m / s
    .

It is flexible about mixing type of units. For example, I enter 3 cubic meters,

       3
1:  3 m

    .

I can convert to gallons,

1:  792.516157074 gal
    .

I work in a lab without Internet access during the day, so when I need to do various conversions Emacs is indispensable.

The speed of light is also a unit. I can enter 1 c and convert to meters per second,

1:  299792458 m / s
    .

But that's not all!!! As I said, it's a computer algebra system so it understands symbolic math. Remember those algebraic expressions from before? I can operate on those. Let's push some expressions onto the stack,

3:  ln(x)

       2   a x
2:  a x  + --- + c
            b

1:  y + c

    .

Multiply the top two, then add the third,

                2   a x
1:  ln(x) + (a x  + --- + c) (y + c)
                     b

    .

Expand with a x, then simplify with a s,

                 2   a x y              2   a c x    2
1:  ln(x) + a y x  + ----- + c y + a c x  + ----- + c
                       b                      b

    .

Now, one of the coolest features: calculus. Differentiate with respect to x, with a d,

    1             a y             a c
1:  - + 2 a y x + --- + 2 a c x + ---
    x              b               b

    .

Or undo that and integrate it,

                       3      2                  3        2
                  a y x    a x  y           a c x    a c x       2
1:  x ln(x) - x + ------ + ------ + c x y + ------ + ------ + x c
                    3       2 b               3       2 b

    .

That's just awesome! That's a text editor ... doing calculus!

So, that was most of the main features. It was kind of exhausting going through all of that, and I am only scratching the surface of what the calculator can do.

Naturally, it can be extended with some elisp. It provides a defmath macro specifically for this.

I bet (hope?) someday it will have a functions for doing Laplace and Fourier transforms.


United States Hamiltonian Paths

Awhile ago I wanted to find every Hamiltonian path in the contiguous 48 states. That is, trips that visit each state exactly once. Writing a program to search for Hamiltonian paths is easy ( I did this already). The most time consuming part was actually putting together the data that specified the graph to be searched. I hope someone somewhere finds it useful. Here is a map for reference,

Map us the lower 48 United States

It took me several passes before I stopped finding errors. I think I have it all right now, but there could still be some mistakes. If you see one, leave a comment and I'll fix it here. Here is the graph as an S-expression alist; the car (first) element in each list is a state, and the cdr (rest) is the unordered list of states that can be reached from it.

((me nh)
 (nh vt ma me)
 (vt ny ma nh)
 (ma ri ct ny nh vt)
 (ny pa nj ma ct vt)
 (ri ma ct)
 (ct ri ma ny)
 (nj pa ny de)
 (de md pa nj)
 (pa nj ny de md wv oh)
 (md pa de va wv)
 (va md wv ky tn nc)
 (nc va tn ga sc)
 (sc nc ga)
 (ga fl sc al nc tn)
 (al ms fl ga tn)
 (ms la ar tn al)
 (tn ms al ga nc va ky mo ar)
 (ky wv va tn mo il in oh)
 (wv md pa oh ky va)
 (oh pa wv ky in mi)
 (fl al ga)
 (mi wi oh in)
 (wi mn ia il mi)
 (il in ky mo ia wi)
 (in oh ky il mi)
 (mo il ky tn ar ok ks ne ia)
 (ar mo tn ms la tx ok)
 (la ms ar tx)
 (tx ok nm ar la)
 (ok ks mo ar tx nm co)
 (ks ok co ne mo)
 (ne sd ia mo ks co wy)
 (sd nd mn ia ne wy mt)
 (nd mt sd mn)
 (ia ne mo il wi mn sd)
 (mn wi ia sd nd)
 (mt id wy sd nd)
 (wy id ut co ne sd mt)
 (co ne ks ok nm ut wy)
 (nm co ok tx az)
 (az nm ut ca nv)
 (ut nv id wy co az)
 (id mt wy ut nv or wa)
 (wa or id)
 (or wa id nv ca)
 (nv or id ut az ca)
 (ca az nv or))

Note that all paths must start or end in Maine because it connects to only one other state.


Javascript Distributed Computing

Diagram of Javascript/browser computing grid. I'm not the first to come across this idea: the browser could be used as part of a distributed computing system. A web server hands out Javascript and the browser runs the script and reports the results back to the server. The browser is a portable, widely available platform so just about anyone can easily contribute, possibly without even knowing.

Browsers aren't really expecting this sort of thing. They will complain if a script is running for too long. If you tell Firefox to continue running the script anyway, it will lock up until the script is done (or when it complains again). This can be worked around by writing a simple scheduler with setTimeout().

This could also potentially be used as an alternative to advertising. Instead of selling advertising space, a website operator could sell visitor's CPU time by including a little snippet of code. This may be more successful, because most visitors would be unaware of it, making it less intrusive. It will be less likely to be blocked. Of course, there ethical issues about this. In fact, there is already a company doing this with secret Java applets.

There are two serious constraints on using Javascript in a browser as a distributed computing platform:

Low bandwidth. There isn't a lot of opportunity to transfer data between the server and the node, and nodes can't talk to each other. The data needed by a node must be small. The results data must also be small.

Short computational units. The Javascript in the browser has no way to store its running state between browser sessions, so it must rely on the server for this. This means that the units of work must be able to be completed within a short period of time. A few minutes at the most on a normal computer.

A lot of problems won't fit inside these limitations. One that I thought might was a Mersenne prime search. A Mersenne prime is a prime of the form,

2n - 1

So even though the largest known Mersenne prime has nearly 13 million digits (about 5 MB just to store the entire number), it can be described by it's exponent, 43,112,609, which is small enough to fit in a 4-byte integer. The result of a calculation, a probabilistic primality test, is a "yes" or "no". One bit. It fits the first constraint very well.

However, the smallest amount of work a node can do is an entire primality test. If we break it down any further, the prime will have been expanded and we will not fit the first constraint. There will be too much data. To see how possible it might be, I implemented it, which you can try out here,

/download/mersenne/ (sloppy code warning!)

I modified an existing Javascript bigint library which allowed me to get it up running quickly. After you receive the page, your browser will run the Miller-Rabin primality test on 2^9941 - 1. You can edit the source HTML to try a different exponent. Running it on several of my computers on different browsers it took anywhere from an hour to 8 hours. And that's only with an exponent of 9941. It's an unsuitable problem.

It would be neat to see a browser computing grid in action, but I can't think of a problem to solve with it.


E-mail Obfuscater Perl One-liner

If you look at the page sources around here you might notice that there are no bare e-mail addresses around. This is because I obfuscate them into a series of HTML entities. So far this has been pretty effective at hiding from address-scraping, web-crawling spam bots. They don't seem to try very hard at decoding HTML entities.

When I added the comment system, I needed to obfuscate addresses automatically. I quickly realized that this is yet another perl one-liner (and implemented as one line in the comment system). It can be used on the command line to obfuscate a file/pipe containing a list of addresses.

perl -lpe '$_ = join "", map {"&#" . ord() . ";"} split //'

All of the spaces are really only there for us humans,

perl -lpe '$_=join"",map{"&#".ord.";"}split//'

I keep running into these one-liners.


Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.