I converted this website to HTML 5 from its original XHTML 1.0
Transitional. Because I was already valid, it was actually pretty
easy. Well, I first converted it to XHTML 1.0 Strict by shifting some
things into CSS that should have been that way in the first place
(results of laziness). Then when that validated I just changed the
DOCTYPE to HTML 5, added a new meta tag, checked
that validation and that was it. Really simple.
One of the main reasons was so that I could use the new
video tag. From now on any posts with videos will display
them as in-line videos (Ogg Vorbis), provided you are using a browser
with HTML 5 support. In fact, I already did it with two older posts:
Knight's Tour and the Mandelbrot set zoom videos. Here, check it
out,
HTML 5 is still a moving target, so I may have to do some
keeping up over time. Probably not.
Video is finally a first class web citizen and it is quite exciting.
All but one of the major browsers do special URL mangling as a
"feature", but I think of it as bad behavior. Specifically, Firefox
gets it right while IE, Opera, Safari, and Chrome all do it wrong. The
last three are likely replicating the broken behavior of IE. I would
rather not do this.
I ran into this specific issue work. An IT person had sent out a URL
to an important internal website, which contained links to other
resources. However, this person, despite being in IT, wasn't very
tech-savvy and wrote all of the links with backslashes instead of
slashes. When I followed with Firefox I got a 404 Not
Found because Firefox asked for a filename with a backslash in
it -- proper behavior.
As first I thought the page was out of date with dead links, but it
was obviously a brand new page. That's when I noticed that the links
had the wrong slashes. A quick check with IE showed the links as
working, which is why the IT person didn't notice his error. IE was
automatically, and silently, converting the backslashes to slashes.
According to RFC
1738, all URLs must encode their backslashes because of buggy
transport agents (and, years later, because of buggy browsers like
IE). So, if you want to use a backslash in a URL you are supposed to
encode it anyway, but if you don't most major browsers will
silently mangle the URL. I really, really don't like this. I know
what I am doing, Mr. Browser.
It's done to replicate IE behavior, and IE did it because it's a
common error for Windows users to use backslashes when they shouldn't
(Windows directory separators), so it is anticipating user error (and
assuming it knows best!). Let's not cripple our software to bring it
down the same level as the major shortcomings of Windows, the toy OS.
I wrote a little test suite for testing if your browser does this
annoying mangling. You can try it with your browser below,
Other than the browsers listed above, the only other browser I know
that fails is Links (not
Lynx). Note that Opera passes the "automatic" test, but fails the
"manual" test, because it doesn't mangle the http-equiv
URLs used in the automatic tests.
So beware using the browsers I listed at the beginning. Not only may
they be sending you to the wrong URL, but they are crippling the web
for the sake of Microsoft's errors.
Did you know that Emacs comes with a
calculator? Woop-dee-doo! Call the presses! Wow, a whole
calculator! Sounds a bit lame, right?
Actually, it's much more than just a simple calculator. It's a computer
algebra system! It is officially called a calculator, which isn't
fair. It's an understatement, and I am sure has caused many people to
overlook it. I finally ran into it during a thorough (re)reading of
the Emacs manuals and almost skipped over it myself.
Ever see that demonstration by Will Wright for the game Spore
several years ago? The player starts as a single-cell organism and
evolves into a civilization with interstellar presence. When he
started the demo he showed a cell through what looked like a
microscope. No one had any idea yet what the game was about, so every
time he increased the scope, from bacteria to animal, animal to
civilization, civilization to space travel, interplanetary travel to
interstellar travel, there was a huge reaction from the audience. It
was like those infomercials: "But that's not all!!!"
As I made my way through the Emacs calc manual I was continually
amazed by its power, with a similar constant increase in scope. Each
new page was almost saying, "But that's not all!!!" Like an
infomercial I'm going to run through some of its features. See the
calc manual for a real thorough introduction. It has practice
exercises that shows some gotchas and interesting feature
interactions.
Fire it up with C-x * c or M-x calc. There
will be two new windows (Emacs windows, that is), one with the
calculator and the other with usage history (the "trail").
First of all, the calculator operates on a stack and so its basic use
is done with RPN. The stack builds vertically, downwards. Type in
numbers and hit enter to push them onto the stack. Operators can be
typed right after the number, so no need to hit enter all the
time. Because negative (-) is reserved for subtraction an
underscore _ is used to type a negative number. An
example stack with 3, 4, and 10,
3: 3
2: 4
1: 10
.
10 is at the "top" of the stack (indicated by the "1:"), so if we type
a * the top two elements are multiplied. Like so,
2: 3
1: 40
.
The calculator has no limitations on the size of integers, so you work
with large numbers without losing precision. For example, we'll
take 2^200.
But that's not all!!! It has a complex number type, which is entered
in pairs (real, imaginary) with parenthesis. They can be operated on
like any other number. Take -1 + 2i minus 4 +
2i,
2: (-1, 2)
1: (4, 2)
.
Subtract with -,
1: -5
.
Then take the square root of that using Q, the square
root function.
1: (0., 2.2360679775)
.
We can set the calculator's precision with p. The default
is 12 places, showing here 1 / 7.
Numbers can be displayed in various notations, too, like fixed-point,
scientific notation, and engineering notation. It will switch between
these without losing any information (the stored form is separate from
the displayed form).
But that's not all!!! We can represent rational numbers precisely with
ratios. These are entered with a :. Push
on 1/7, 3/14, and 17/29,
3: 1:7
2: 3:13
1: 17:29
.
And multiply them all together, which displays in the lowest form,
1: 51:2842
.
There is a mode for working in these automatically.
But that's not all!!! We can change the radix. To enter a number with
a different radix, which prefix it with the radix and a
#. Here is how we enter 29 in base-2,
2#11101
We can change the display radix with d r. With 29 on the
stack, here's base-4,
1: 4#131
.
Base-16,
1: 16#1D
.
Base-36,
1: 36#T
.
But that's not all!!! We can enter algebraic expressions onto the
stack with apostrophe, '. Symbols can be entered as part
of the expression. Note: these expressions are not entered in RPN.
1: a^3 + a^2 b / c d - a / b
.
There is a "big" mode (d B) for easier reading,
2
3 a b a
1: a + ---- - -
c d b
.
We can assign values to variables to have the expression evaluated. If
we assign a to 10 and use the "evaluates-to" operator,
2
3 a b a 100 b 10
1: a + ---- - - => 1000 + ----- - --
c d b c d b
.
But that's not all!!! There is a vector type for working with vectors
and matrices and doing linear algebra. They are entered with
brackets, [].
Any matrix and vector operator you could probably think of is
available, including map and reduce (and you can define your own
expression to apply).
We can use this to solve a linear system. Find x
and y in terms of a and b,
x + a y = 6
x + b y = 10
Enter it (note we are using symbols),
2: [6, 10]
1: [ [ 1, a ]
[ 1, b ] ]
.
And divide,
4 a 4
1: [6 + -----, -----]
a - b b - a
.
But that's not all!!! We can create graphs if GNU Plot is
installed. We can give it two vectors, or an algebraic
expression. This plot of sin(x) and x cos(x)
was made with just a few keystrokes,
But that's not all!!! There is an HMS type for handling times and
angles. For 2 hours, 30 minutes, and 4 seconds, and some others,
3: 2@ 30' 4"
2: 4@ 22' 13"
1: 1@ 2' 56"
.
Of course, the normal operators work as expected. We can add them all up,
1: 7@ 55' 13"
.
We can convert between this and radians, and degrees, and so on.
But that's not all!!! The calculator also has a date type, entered
inside angled brackets, <> (in algebra entry
mode). It is really flexible on input dates. We can insert the current
date with t N.
1: <6:59:34pm Tue Jun 23, 2009>
.
If we add numbers they are treated as days. Add 4,
1: <6:59:34pm Sat Jun 27, 2009>
.
It works with the HMS format from before too. Subtract 2@ 3'
15".
1: <4:56:32pm Sat Jun 27, 2009>
.
But that's not all!!! There is a modulo form for performing modulo
arithmetic. For example, 17 mod 24,
1: 17 mod 24
.
Add 10,
1: 3 mod 24
.
This is most useful for forms such as n^p mod M, which
this will handle efficiently. For example, 3^100000 mod
24. The naive way would be to find 3^100000 first,
then take the modulus. This involves a computationally expensive
middle step of calculating 3^100000, a huge number. The
modulo form does it smarter.
But that's not all!!! The calculator can do unit conversions. The
version of Emacs (22.3.1) I am typing in right now knows about 159
different units. For example, I push 65 mph onto the stack,
1: 65 mph
.
Convert to meters per second with u c,
1: 29.0576 m / s
.
It is flexible about mixing type of units. For example, I enter 3
cubic meters,
3
1: 3 m
.
I can convert to gallons,
1: 792.516157074 gal
.
I work in a lab without Internet access during the day, so when I need
to do various conversions Emacs is indispensable.
The speed of light is also a unit. I can enter 1 c and
convert to meters per second,
1: 299792458 m / s
.
But that's not all!!! As I said, it's a computer algebra system so it
understands symbolic math. Remember those algebraic expressions from
before? I can operate on those. Let's push some expressions onto the
stack,
3: ln(x)
2 a x
2: a x + --- + c
b
1: y + c
.
Multiply the top two, then add the third,
2 a x
1: ln(x) + (a x + --- + c) (y + c)
b
.
Expand with a x, then simplify with a s,
2 a x y 2 a c x 2
1: ln(x) + a y x + ----- + c y + a c x + ----- + c
b b
.
Now, one of the coolest features: calculus. Differentiate with respect
to x, with a d,
1 a y a c
1: - + 2 a y x + --- + 2 a c x + ---
x b b
.
Or undo that and integrate it,
3 2 3 2
a y x a x y a c x a c x 2
1: x ln(x) - x + ------ + ------ + c x y + ------ + ------ + x c
3 2 b 3 2 b
.
That's just awesome! That's a text editor ... doing calculus!
So, that was most of the main features. It was kind of exhausting
going through all of that, and I am only scratching the surface of
what the calculator can do.
Naturally, it can be extended with some elisp. It provides a
defmath macro specifically for this.
I bet (hope?) someday it will have a functions for doing Laplace and
Fourier transforms.
Awhile ago I wanted to
find every Hamiltonian path in the contiguous 48 states. That is,
trips that visit each state exactly once. Writing a program to search
for Hamiltonian paths is easy ( I did this
already). The most time consuming part was actually putting
together the data that specified the graph to be searched. I hope
someone somewhere finds it useful. Here is a map for reference,
It took me several passes before I stopped finding errors. I
think I have it all right now, but there could still be some
mistakes. If you see one, leave a comment and I'll fix it here. Here
is the graph as an S-expression
alist; the car (first) element in each list is a state, and the
cdr (rest) is the unordered list of states that can be reached from
it.
((me nh)
(nh vt ma me)
(vt ny ma nh)
(ma ri ct ny nh vt)
(ny pa nj ma ct vt)
(ri ma ct)
(ct ri ma ny)
(nj pa ny de)
(de md pa nj)
(pa nj ny de md wv oh)
(md pa de va wv)
(va md wv ky tn nc)
(nc va tn ga sc)
(sc nc ga)
(ga fl sc al nc tn)
(al ms fl ga tn)
(ms la ar tn al)
(tn ms al ga nc va ky mo ar)
(ky wv va tn mo il in oh)
(wv md pa oh ky va)
(oh pa wv ky in mi)
(fl al ga)
(mi wi oh in)
(wi mn ia il mi)
(il in ky mo ia wi)
(in oh ky il mi)
(mo il ky tn ar ok ks ne ia)
(ar mo tn ms la tx ok)
(la ms ar tx)
(tx ok nm ar la)
(ok ks mo ar tx nm co)
(ks ok co ne mo)
(ne sd ia mo ks co wy)
(sd nd mn ia ne wy mt)
(nd mt sd mn)
(ia ne mo il wi mn sd)
(mn wi ia sd nd)
(mt id wy sd nd)
(wy id ut co ne sd mt)
(co ne ks ok nm ut wy)
(nm co ok tx az)
(az nm ut ca nv)
(ut nv id wy co az)
(id mt wy ut nv or wa)
(wa or id)
(or wa id nv ca)
(nv or id ut az ca)
(ca az nv or))
Note that all paths must start or end in Maine because it connects to
only one other state.
I'm not the first to come across this idea: the browser could be used
as part of a distributed computing system. A web server hands out
Javascript and the browser runs the script and reports the results
back to the server. The browser is a portable, widely available
platform so just about anyone can easily contribute, possibly without
even knowing.
Browsers aren't really expecting this sort of thing. They will
complain if a script is running for too long. If you tell Firefox to
continue running the script anyway, it will lock up until the script
is done (or when it complains again). This can be worked around by writing a simple
scheduler with setTimeout().
This could also potentially be used as an alternative to
advertising. Instead of selling advertising space, a website operator
could sell visitor's CPU time by including a little snippet of
code. This may be more successful, because most visitors would be
unaware of it, making it less intrusive. It will be less likely to be
blocked. Of course, there ethical issues about this. In fact, there is already a
company doing this with secret Java applets.
There are two serious constraints on using Javascript in a browser as
a distributed computing platform:
Low bandwidth. There isn't a lot of opportunity to transfer data
between the server and the node, and nodes can't talk to
each other. The data needed by a node must be small. The results data
must also be small.
Short computational units. The Javascript in the browser has no
way to store its running state between browser sessions, so it must
rely on the server for this. This means that the units of work must be
able to be completed within a short period of time. A few minutes at
the most on a normal computer.
A lot of problems won't fit inside these limitations. One that I
thought might was a Mersenne prime
search. A Mersenne prime is a prime of the form,
2n - 1
So even though the largest known Mersenne prime has nearly 13 million
digits (about 5 MB just to store the entire number), it can be
described by it's exponent, 43,112,609, which is small
enough to fit in a 4-byte integer. The result of a calculation, a
probabilistic primality test, is a "yes" or "no". One bit. It fits the
first constraint very well.
However, the smallest amount of work a node can do is an entire
primality test. If we break it down any further, the prime will have
been expanded and we will not fit the first constraint. There will be
too much data. To see how possible it might be, I implemented it,
which you can try out here,
I modified an
existing Javascript bigint library which allowed me to get it up
running quickly. After you receive the page, your browser will run the
Miller-Rabin primality test on 2^9941 - 1. You can
edit the source HTML to try a different exponent. Running it on
several of my computers on different browsers it took anywhere from an
hour to 8 hours. And that's only with an exponent of
9941. It's an unsuitable problem.
It would be neat to see a browser computing grid in action, but I
can't think of a problem to solve with it.
If you look at the page sources around here you might notice that
there are no bare e-mail addresses around. This is because I obfuscate
them into a series of HTML entities. So far this has been pretty
effective at hiding from address-scraping, web-crawling spam
bots. They don't seem to try very hard at decoding HTML entities.
When I added the comment system, I needed to obfuscate addresses
automatically. I quickly realized that this is yet another perl
one-liner (and implemented as one line in the comment system). It can
be used on the command line to obfuscate a file/pipe containing a list
of addresses.