Christopher Wellons
mosquitopsu@gmail.com
Public GPG Key
Post Index - see a list of all of the entries.
Born and raised in Pittsburgh, I am currently a happily married computer engineer working at the Johns Hopkins University Applied Physics Laboratory in Maryland.
Most of the topics you will find here are about hobby computing and programming with Free software.
Local resources:
- Git repositories
- pastebin @ nullprogram
- Custom RSS feeds
- Greasemonkey scripts
Projects:
- Brainfuck Compiler
- PNG Archiver
- BINI Tools
-
Parallel Mandelbrot Generator
Blogs I follow:
- Good Math, Bad Math
- God Plays Dice
- Twenty Sided
- Terminally Incoherent
- The Universe of Discourse
- Friendly Robot Overlord
- /dev/random
- ReWired for Sound
Web Comics I follow:
- Cyanide and Happiness
* DM of the Rings
- Darths & Droids
* A Modest Destiny
- Order of the Stick
- xkcd
- Abstruse Goose
* Perry Bible Fellowship
- White Ninja
* complete or stagnant
Archives:
2007
- September 2007
- October 2007
- November 2007
- December 2007
2008
- January 2008
- February 2008
- March 2008
- April 2008
- June 2008
- July 2008
- August 2008
- September 2008
- December 2008
2009
- January 2009
- February 2009
- March 2009
- April 2009
- May 2009
- June 2009
- July 2009
This site is powered by blosxom (modified), pollxn (modified), Perl, Debian GNU/Linux, DreamHost, and the pinnacle of human achievement: Emacs.
Earlier this year I implemented the RinkWorks fantasy name generator in Perl. I think lisp lends itself even better for that, and so I have a partial elisp implementation for you.
What stands out for me is that the patterns can easily be represented as a S-expression. We represent substitutions with symbols, literals with strings, and groups with lists. For example, this pattern,
s(ith|<'C>)V
can be represented in code as,
(s ("ith" ("'" C)) V)
I want a function I can apply to this to generate a name. First, I set up an association list with symbols and its replacements,
(defvar namegen-subs '((s ach ack ad age ald ale an ang ar ard as ash at ath augh aw ban bel bur cer cha che dan dar del den dra dyn ech eld elm em en end eng enth er ess est et gar gha hat hin hon ia ight ild im ina ine ing ir is iss it kal kel kim kin ler lor lye mor mos nal ny nys old om on or orm os ough per pol qua que rad rak ran ray ril ris rod roth ryn sam say ser shy skel sul tai tan tas ther tia tin ton tor tur um und unt urn usk ust ver ves vor war wor yer) (v a e i o u y) ... (d elch idiot ob og ok olph olt omph ong onk oo oob oof oog ook ooz org ork orm oron ub uck ug ulf ult um umb ump umph un unb ung unk unph unt uzz)) "Substitutions for the name generator.")
Since we will need this in a couple places, make a function to randomly select an element from a list,
(defun randth (lst) "Select random element from the given list." (nth (random (length lst)) lst))
A function for replacing a symbol,
(defun namegen-select (sym) "Select a replacement for the given symbol." (if (null (assoc sym namegen-subs)) (throw 'bad-symbol (concat "Invalid substitution symbol: " (format "%s" sym))) (symbol-name (randth (cdr (assoc sym namegen-subs))))))
And finally, the generator. Find a string, pass it through, find a symbol, substitute it, find a list, pick one element and recurse on it.
(defun namegen (sexp) "Generate a name from the given sexp generator." (cond ((null sexp) "") ((stringp sexp) sexp) ((symbolp sexp) (namegen-select sexp)) ((listp sexp) (concat (if (listp (car sexp)) (namegen (randth (car sexp))) (namegen (car sexp))) (namegen (cdr sexp))))))
That's it! We can apply it to the expression above,
(namegen '(s ("ith" ("'" C)) V))
-> "rynithi"
But that's really the easy part. The hard part would be converting the original pattern into the S-expression, which I don't plan on doing right now.
Something else to note: this is thousands of times faster than the Perl version I wrote earlier.
I threw the code in with the rest of my name generation code (namegen.el),
git clone http://git.nullprogram.com/fantasyname.git
S-expressions are handy anywhere.
coralized permalink
short permalink
long permalink
permanent link
I converted this website to HTML 5 from its original XHTML 1.0
Transitional. Because I was already valid, it was actually pretty
easy. Well, I first converted it to XHTML 1.0 Strict by shifting some
things into CSS that should have been that way in the first place
(results of laziness). Then when that validated I just changed the
DOCTYPE to HTML 5, added a new meta tag, checked
that validation and that was it. Really simple.
One of the main reasons was so that I could use the new
video tag. From now on any posts with videos will display
them as in-line videos (Ogg Vorbis), provided you are using a browser
with HTML 5 support. In fact, I already did it with two older posts:
Knight's Tour and the Mandelbrot set zoom videos. Here, check it
out,
HTML 5 is still a moving target, so I may have to do some keeping up over time. Probably not.
Video is finally a first class web citizen and it is quite exciting.
coralized permalink
short permalink
long permalink
permanent link
All but one of the major browsers do special URL mangling as a
"feature", but I think of it as bad behavior. Specifically, Firefox
gets it right while IE, Opera, Safari, and Chrome all do it wrong. The
last three are likely replicating the broken behavior of IE. I would
rather not do this.
I ran into this specific issue work. An IT person had sent out a URL
to an important internal website, which contained links to other
resources. However, this person, despite being in IT, wasn't very
tech-savvy and wrote all of the links with backslashes instead of
slashes. When I followed with Firefox I got a 404 Not
Found because Firefox asked for a filename with a backslash in
it -- proper behavior.
As first I thought the page was out of date with dead links, but it was obviously a brand new page. That's when I noticed that the links had the wrong slashes. A quick check with IE showed the links as working, which is why the IT person didn't notice his error. IE was automatically, and silently, converting the backslashes to slashes.
According to RFC 1738, all URLs must encode their backslashes because of buggy transport agents (and, years later, because of buggy browsers like IE). So, if you want to use a backslash in a URL you are supposed to encode it anyway, but if you don't most major browsers will silently mangle the URL. I really, really don't like this. I know what I am doing, Mr. Browser.
It's done to replicate IE behavior, and IE did it because it's a common error for Windows users to use backslashes when they shouldn't (Windows directory separators), so it is anticipating user error (and assuming it knows best!). Let's not cripple our software to bring it down the same level as the major shortcomings of Windows, the toy OS.
I wrote a little test suite for testing if your browser does this annoying mangling. You can try it with your browser below,
Other than the browsers listed above, the only other browser I know
that fails is Links (not
Lynx). Note that Opera passes the "automatic" test, but fails the
"manual" test, because it doesn't mangle the http-equiv
URLs used in the automatic tests.
So beware using the browsers I listed at the beginning. Not only may they be sending you to the wrong URL, but they are crippling the web for the sake of Microsoft's errors.
coralized permalink
short permalink
long permalink
permanent link
Did you know that Emacs comes with a calculator? Woop-dee-doo! Call the presses! Wow, a whole calculator! Sounds a bit lame, right?
Actually, it's much more than just a simple calculator. It's a computer algebra system! It is officially called a calculator, which isn't fair. It's an understatement, and I am sure has caused many people to overlook it. I finally ran into it during a thorough (re)reading of the Emacs manuals and almost skipped over it myself.
Ever see that demonstration by Will Wright for the game Spore several years ago? The player starts as a single-cell organism and evolves into a civilization with interstellar presence. When he started the demo he showed a cell through what looked like a microscope. No one had any idea yet what the game was about, so every time he increased the scope, from bacteria to animal, animal to civilization, civilization to space travel, interplanetary travel to interstellar travel, there was a huge reaction from the audience. It was like those infomercials: "But that's not all!!!"
As I made my way through the Emacs calc manual I was continually amazed by its power, with a similar constant increase in scope. Each new page was almost saying, "But that's not all!!!" Like an infomercial I'm going to run through some of its features. See the calc manual for a real thorough introduction. It has practice exercises that shows some gotchas and interesting feature interactions.
Fire it up with C-x * c or M-x calc. There
will be two new windows (Emacs windows, that is), one with the
calculator and the other with usage history (the "trail").
First of all, the calculator operates on a stack and so its basic use
is done with RPN. The stack builds vertically, downwards. Type in
numbers and hit enter to push them onto the stack. Operators can be
typed right after the number, so no need to hit enter all the
time. Because negative (-) is reserved for subtraction an
underscore _ is used to type a negative number. An
example stack with 3, 4, and 10,
3: 3
2: 4
1: 10
.
10 is at the "top" of the stack (indicated by the "1:"), so if we type
a * the top two elements are multiplied. Like so,
2: 3
1: 40
.
The calculator has no limitations on the size of integers, so you work
with large numbers without losing precision. For example, we'll
take 2^200.
2: 2
1: 200
.
Apply the ^ operator,
1: 1606938044258990275541962092341162602522202993782792835301376
.
But that's not all!!! It has a complex number type, which is entered
in pairs (real, imaginary) with parenthesis. They can be operated on
like any other number. Take -1 + 2i minus 4 +
2i,
2: (-1, 2)
1: (4, 2)
.
Subtract with -,
1: -5
.
Then take the square root of that using Q, the square
root function.
1: (0., 2.2360679775)
.
We can set the calculator's precision with p. The default
is 12 places, showing here 1 / 7.
1: 0.142857142857
.
If we adjust the precision to 50 and do it again,
2: 0.142857142857
1: 0.14285714285714285714285714285714285714285714285714
.
Numbers can be displayed in various notations, too, like fixed-point, scientific notation, and engineering notation. It will switch between these without losing any information (the stored form is separate from the displayed form).
But that's not all!!! We can represent rational numbers precisely with
ratios. These are entered with a :. Push
on 1/7, 3/14, and 17/29,
3: 1:7
2: 3:13
1: 17:29
.
And multiply them all together, which displays in the lowest form,
1: 51:2842
.
There is a mode for working in these automatically.
But that's not all!!! We can change the radix. To enter a number with
a different radix, which prefix it with the radix and a
#. Here is how we enter 29 in base-2,
2#11101
We can change the display radix with d r. With 29 on the
stack, here's base-4,
1: 4#131
.
Base-16,
1: 16#1D
.
Base-36,
1: 36#T
.
But that's not all!!! We can enter algebraic expressions onto the
stack with apostrophe, '. Symbols can be entered as part
of the expression. Note: these expressions are not entered in RPN.
1: a^3 + a^2 b / c d - a / b
.
There is a "big" mode (d B) for easier reading,
2
3 a b a
1: a + ---- - -
c d b
.
We can assign values to variables to have the expression evaluated. If
we assign a to 10 and use the "evaluates-to" operator,
2
3 a b a 100 b 10
1: a + ---- - - => 1000 + ----- - --
c d b c d b
.
But that's not all!!! There is a vector type for working with vectors
and matrices and doing linear algebra. They are entered with
brackets, [].
2: [4, 1, 5]
1: [ [ 1, 2, 3 ]
[ 4, 5, 6 ]
[ 6, 7, 8 ] ]
.
And take the dot product, then take cross product of this vector and matrix,
2: [38, 48, 58]
1: [ [ -14, -18, -22 ]
[ -19, -18, -17 ]
[ 15, 18, 21 ] ]
.
Any matrix and vector operator you could probably think of is available, including map and reduce (and you can define your own expression to apply).
We can use this to solve a linear system. Find x
and y in terms of a and b,
x + a y = 6 x + b y = 10
Enter it (note we are using symbols),
2: [6, 10]
1: [ [ 1, a ]
[ 1, b ] ]
.
And divide,
4 a 4
1: [6 + -----, -----]
a - b b - a
.
But that's not all!!! We can create graphs if GNU Plot is
installed. We can give it two vectors, or an algebraic
expression. This plot of sin(x) and x cos(x)
was made with just a few keystrokes,
But that's not all!!! There is an HMS type for handling times and angles. For 2 hours, 30 minutes, and 4 seconds, and some others,
3: 2@ 30' 4"
2: 4@ 22' 13"
1: 1@ 2' 56"
.
Of course, the normal operators work as expected. We can add them all up,
1: 7@ 55' 13"
.
We can convert between this and radians, and degrees, and so on.
But that's not all!!! The calculator also has a date type, entered
inside angled brackets, <> (in algebra entry
mode). It is really flexible on input dates. We can insert the current
date with t N.
1: <6:59:34pm Tue Jun 23, 2009>
.
If we add numbers they are treated as days. Add 4,
1: <6:59:34pm Sat Jun 27, 2009>
.
It works with the HMS format from before too. Subtract 2@ 3'
15".
1: <4:56:32pm Sat Jun 27, 2009>
.
But that's not all!!! There is a modulo form for performing modulo arithmetic. For example, 17 mod 24,
1: 17 mod 24
.
Add 10,
1: 3 mod 24
.
This is most useful for forms such as n^p mod M, which
this will handle efficiently. For example, 3^100000 mod
24. The naive way would be to find 3^100000 first,
then take the modulus. This involves a computationally expensive
middle step of calculating 3^100000, a huge number. The
modulo form does it smarter.
But that's not all!!! The calculator can do unit conversions. The version of Emacs (22.3.1) I am typing in right now knows about 159 different units. For example, I push 65 mph onto the stack,
1: 65 mph
.
Convert to meters per second with u c,
1: 29.0576 m / s
.
It is flexible about mixing type of units. For example, I enter 3 cubic meters,
3
1: 3 m
.
I can convert to gallons,
1: 792.516157074 gal
.
I work in a lab without Internet access during the day, so when I need to do various conversions Emacs is indispensable.
The speed of light is also a unit. I can enter 1 c and
convert to meters per second,
1: 299792458 m / s
.
But that's not all!!! As I said, it's a computer algebra system so it understands symbolic math. Remember those algebraic expressions from before? I can operate on those. Let's push some expressions onto the stack,
3: ln(x)
2 a x
2: a x + --- + c
b
1: y + c
.
Multiply the top two, then add the third,
2 a x
1: ln(x) + (a x + --- + c) (y + c)
b
.
Expand with a x, then simplify with a s,
2 a x y 2 a c x 2
1: ln(x) + a y x + ----- + c y + a c x + ----- + c
b b
.
Now, one of the coolest features: calculus. Differentiate with respect
to x, with a d,
1 a y a c
1: - + 2 a y x + --- + 2 a c x + ---
x b b
.
Or undo that and integrate it,
3 2 3 2
a y x a x y a c x a c x 2
1: x ln(x) - x + ------ + ------ + c x y + ------ + ------ + x c
3 2 b 3 2 b
.
That's just awesome! That's a text editor ... doing calculus!
So, that was most of the main features. It was kind of exhausting going through all of that, and I am only scratching the surface of what the calculator can do.
Naturally, it can be extended with some elisp. It provides a
defmath macro specifically for this.
I bet (hope?) someday it will have a functions for doing Laplace and Fourier transforms.
coralized permalink
short permalink
long permalink
permanent link
Awhile ago I wanted to find every Hamiltonian path in the contiguous 48 states. That is, trips that visit each state exactly once. Writing a program to search for Hamiltonian paths is easy ( I did this already). The most time consuming part was actually putting together the data that specified the graph to be searched. I hope someone somewhere finds it useful. Here is a map for reference,
It took me several passes before I stopped finding errors. I think I have it all right now, but there could still be some mistakes. If you see one, leave a comment and I'll fix it here. Here is the graph as an S-expression alist; the car (first) element in each list is a state, and the cdr (rest) is the unordered list of states that can be reached from it.
((me nh) (nh vt ma me) (vt ny ma nh) (ma ri ct ny nh vt) (ny pa nj ma ct vt) (ri ma ct) (ct ri ma ny) (nj pa ny de) (de md pa nj) (pa nj ny de md wv oh) (md pa de va wv) (va md wv ky tn nc) (nc va tn ga sc) (sc nc ga) (ga fl sc al nc tn) (al ms fl ga tn) (ms la ar tn al) (tn ms al ga nc va ky mo ar) (ky wv va tn mo il in oh) (wv md pa oh ky va) (oh pa wv ky in mi) (fl al ga) (mi wi oh in) (wi mn ia il mi) (il in ky mo ia wi) (in oh ky il mi) (mo il ky tn ar ok ks ne ia) (ar mo tn ms la tx ok) (la ms ar tx) (tx ok nm ar la) (ok ks mo ar tx nm co) (ks ok co ne mo) (ne sd ia mo ks co wy) (sd nd mn ia ne wy mt) (nd mt sd mn) (ia ne mo il wi mn sd) (mn wi ia sd nd) (mt id wy sd nd) (wy id ut co ne sd mt) (co ne ks ok nm ut wy) (nm co ok tx az) (az nm ut ca nv) (ut nv id wy co az) (id mt wy ut nv or wa) (wa or id) (or wa id nv ca) (nv or id ut az ca) (ca az nv or))
Note that all paths must start or end in Maine because it connects to only one other state.
coralized permalink
short permalink
long permalink
permanent link
I'm not the first to come across this idea: the browser could be used
as part of a distributed computing system. A web server hands out
Javascript and the browser runs the script and reports the results
back to the server. The browser is a portable, widely available
platform so just about anyone can easily contribute, possibly without
even knowing.
Browsers aren't really expecting this sort of thing. They will
complain if a script is running for too long. If you tell Firefox to
continue running the script anyway, it will lock up until the script
is done (or when it complains again). This can be worked around by writing a simple
scheduler with setTimeout().
This could also potentially be used as an alternative to advertising. Instead of selling advertising space, a website operator could sell visitor's CPU time by including a little snippet of code. This may be more successful, because most visitors would be unaware of it, making it less intrusive. It will be less likely to be blocked. Of course, there ethical issues about this. In fact, there is already a company doing this with secret Java applets.
There are two serious constraints on using Javascript in a browser as a distributed computing platform:
Low bandwidth. There isn't a lot of opportunity to transfer data between the server and the node, and nodes can't talk to each other. The data needed by a node must be small. The results data must also be small.
Short computational units. The Javascript in the browser has no way to store its running state between browser sessions, so it must rely on the server for this. This means that the units of work must be able to be completed within a short period of time. A few minutes at the most on a normal computer.
A lot of problems won't fit inside these limitations. One that I thought might was a Mersenne prime search. A Mersenne prime is a prime of the form,
2n - 1
So even though the largest known Mersenne prime has nearly 13 million
digits (about 5 MB just to store the entire number), it can be
described by it's exponent, 43,112,609, which is small
enough to fit in a 4-byte integer. The result of a calculation, a
probabilistic primality test, is a "yes" or "no". One bit. It fits the
first constraint very well.
However, the smallest amount of work a node can do is an entire primality test. If we break it down any further, the prime will have been expanded and we will not fit the first constraint. There will be too much data. To see how possible it might be, I implemented it, which you can try out here,
/download/mersenne/ (sloppy code warning!)
I modified an
existing Javascript bigint library which allowed me to get it up
running quickly. After you receive the page, your browser will run the
Miller-Rabin primality test on 2^9941 - 1. You can
edit the source HTML to try a different exponent. Running it on
several of my computers on different browsers it took anywhere from an
hour to 8 hours. And that's only with an exponent of
9941. It's an unsuitable problem.
It would be neat to see a browser computing grid in action, but I can't think of a problem to solve with it.
coralized permalink
short permalink
long permalink
permanent link
If you look at the page sources around here you might notice that there are no bare e-mail addresses around. This is because I obfuscate them into a series of HTML entities. So far this has been pretty effective at hiding from address-scraping, web-crawling spam bots. They don't seem to try very hard at decoding HTML entities.
When I added the comment system, I needed to obfuscate addresses automatically. I quickly realized that this is yet another perl one-liner (and implemented as one line in the comment system). It can be used on the command line to obfuscate a file/pipe containing a list of addresses.
perl -lpe '$_ = join "", map {"&#" . ord() . ";"} split //'
All of the spaces are really only there for us humans,
perl -lpe '$_=join"",map{"&#".ord.";"}split//'
I keep running into these one-liners.
coralized permalink
short permalink
long permalink
permanent link
I've been using elisp a lot lately, but unfortunately it's missing a lot of features that one would find in a more standard lisp. The following are some features I wish elisp had. Many of these could be fit into a generic "be more like Scheme or Common Lisp". Some of these features would break the existing mountain of elisp code out there, requiring a massive rewrite, which is likely the main reason they are being held back.
Closures, and maybe continuations. Closures are one of the
features I miss the most when writing elisp. They would allow the
implementation of Scheme-style lazy evaluation with delay
and force, among other neat tools. Continuations would
just be a neat thing to have, though they come with a performance
penalty.
Closures would also pretty much require Emacs switch to lexical scoping.
Arbitrary precision. Really, any higher order language's
numbers should be bignums. Emacs 22 does come with the Calc
package which provides arbitrary precision via
defmath. Perl does something like this with the bignum
module.
Packages/namespaces. Without namespaces all of the Emacs
packages prefix their functions and variables with its name
(i.e. dired-). Some real namespaces would be useful for
large projects.
C interface. This is something GNU Emacs will never have because Richard Stallman considers Emacs shared libraries support to be a GPL threat. If Emacs could be dynamically extended some useful libraries could be linked in and exposed to elisp.
Concurrency. If some elisp is being executed Emacs will lock up. This is a particular problem for Gnus. Again, Emacs would really need to switch to lexical scoping before this could happen. Threading would be nice.
Speed. Emacs lisp is pretty slow, even when compiled. Lexical scoping would help with performance (compile time vs. run time binding).
Regex type. I mention this last because I think this would be really cool, and I am not aware of any other lisps that do it. Emacs does regular expressions with strings, which is silly and cumbersome. Backslashes need extra escaping, for example. Instead, I would rather have a regex type like Perl and Javascript have. So instead of,
(string-match "\\w[0-9]+" "foo525")
we have,
(string-match /\w[0-9]+/ "foo525")
Naturally there would be a regexpp predicate for checking
its type. There could also be a function for compiling a regexp from a
string into a regexp object. As a bonus, I would also like to use it
directly as a function,
(/\w[0-9]+/ "foo525")
I think a regexp price would really give elisp an edge, and would be entirely appropriate for a text editor. It could also be done without breaking anything (keep string-style regexp support).
There is more commentary over at EmacsWiki: Why Does Elisp Suck.
coralized permalink
short permalink
long permalink
permanent link
I wanted an elisp macro that could measure the running time of a block of code. Specifically, I wanted it to work like this,
(measure-time ... body ...)
And it would return the running time as seconds in floating point. Well, here's a macro that does it!
(defmacro measure-time (&rest body) "Measure and return the running time of the code block." (let ((time-start (make-symbol "time-start")) (time-end (make-symbol "time-end")) (sec (make-symbol "sec")) (usec (make-symbol "usec"))) `(let* ((,time-start (current-time)) (,sec (cadr ,time-start)) (,usec (cadr (cdr ,time-start)))) ,@body (let ((,time-end (current-time))) (+ (- (cadr ,time-end) ,sec) (/ (- (cadr (cdr ,time-end)) ,usec) 1000000.0))))))
It's only good for up to around 18 hours, then the time integer overflows. If only Emacs had arbitrary precision numbers. Here it is in action using my binomial function from last week.
(measure-time (nck 20 10) (nck 30 7))
Which, just now, returned 3.643713 seconds when executed.
coralized permalink
short permalink
long permalink
permanent link
A Knight's
Tour is a path on a chess board such that a knight moving
according to the rules of chess will visit each square exactly
once. Here is a program that solves the Knight's Tour for various
board sizes.
git clone http://git.nullprogram.com/knight.git
I wrote it purely for speed, and so in cases where the algorithm doesn't blow up (taking possibly thousands of years to complete) it only takes a few milliseconds to complete. The algorithm is recursive and goes like so,
The purpose of going towards targets with few opportunities is so that we don't leave any stranded spaces that will not be detected until much later. When it does run into a dead end, it usually only has to back out a short ways and try again.
The algorithm breaks down on certain board sizes. For example, try a 120 by 120 board. Instead of nearly instantaneous solution like 120 by 119 would get, it takes enormous amounts of time. I don't yet know why.
Also included in the repository is a Perl script for creating frames
for a video. Here are two videos, one for an 8x8 board, one for a
80x80 board. They are in Ogg
Theora. Update: if your browser supports HTML 5 you can now
watch these videos inline thanks to the video tag.
It's interesting how on the large one the behavior seems to be random at some places, but all it's doing is following that simple algorithm.
coralized permalink
short permalink
long permalink
permanent link
Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.