I've been using elisp a lot lately, but unfortunately it's missing a
lot of features that one would find in a more standard lisp. The
following are some features I wish elisp had. Many of these could be
fit into a generic "be more like Scheme or Common Lisp". Some of these
features would break the existing mountain of elisp code out there,
requiring a massive rewrite, which is likely the main reason they are
being held back.
Closures, and maybe continuations. Closures are one of the
features I miss the most when writing elisp. They would allow the
implementation of Scheme-style lazy evaluation with delay
and force, among other neat tools. Continuations would
just be a neat thing to have, though they come with a performance
penalty.
Closures would also pretty much require Emacs switch to lexical
scoping.
Arbitrary precision. Really, any higher order language's
numbers should be bignums. Emacs 22 does come with the Calc
package which provides arbitrary precision via
defmath. Perl does something like this with the bignum
module.
Packages/namespaces. Without namespaces all of the Emacs
packages prefix their functions and variables with its name
(i.e. dired-). Some real namespaces would be useful for
large projects.
C interface. This is something GNU Emacs will never have
because Richard Stallman considers Emacs shared libraries support to
be
a GPL threat. If Emacs could be dynamically extended some useful
libraries could be linked in and exposed to elisp.
Concurrency. If some elisp is being executed Emacs will lock
up. This is a particular problem for Gnus. Again, Emacs would really
need to switch to lexical scoping before this could happen. Threading
would be nice.
Speed. Emacs lisp is pretty slow, even when compiled. Lexical
scoping would help with performance (compile time vs. run time
binding).
Regex type. I mention this last because I think this would be
really cool, and I am not aware of any other lisps that do it. Emacs
does regular expressions with strings, which is silly and
cumbersome. Backslashes need extra escaping, for example. Instead, I
would rather have a regex type like Perl and Javascript have. So
instead of,
(string-match "\\w[0-9]+" "foo525")
we have,
(string-match /\w[0-9]+/ "foo525")
Naturally there would be a regexpp predicate for checking
its type. There could also be a function for compiling a regexp from a
string into a regexp object. As a bonus, I would also like to use it
directly as a function,
(/\w[0-9]+/ "foo525")
I think a regexp price would really give elisp an edge, and would be
entirely appropriate for a text editor. It could also be done without
breaking anything (keep string-style regexp support).
I wanted an elisp macro that could measure the running time of a block
of code. Specifically, I wanted it to work like this,
(measure-time
...
body
...)
And it would return the running time as seconds in floating
point. Well, here's a macro that does it!
(defmacromeasure-time (&rest body)
"Measure and return the running time of the code block."
(declare (indent defun))
(let ((start (make-symbol "start")))
`(let* ((,start (float-time)))
,@body
(- (float-time) ,start))))
It's only good for up to around 18 hours, then the time integer
overflows. If only Emacs had arbitrary precision numbers. Here it is
in action using my binomial function from
last week.
(measure-time
(nck 20 10)
(nck 30 7))
Which, just now, returned 3.643713 seconds when executed.
A Knight's
Tour is a path on a chess board such that a knight moving
according to the rules of chess will visit each square exactly
once. Here is a program that solves the Knight's Tour for various
board sizes.
I wrote it purely for speed, and so in cases where the algorithm
doesn't blow up (taking possibly thousands of years to complete) it
only takes a few milliseconds to complete. The algorithm is recursive
and goes like so,
Given a board marked with visited spaces and a position create a
list of valid moves.
If the list is empty, return failure.
Sort of the list ascendingly according to the opportunities of
the target space. That is, sort by the number of valid moves
that are available from the target space.
Mark the current space as visited, then recurse from the new
position with the new board starting with the first target in
the list.
If the recursion returns success, return success.
If the recursion returns failure, recurse with the next target
in the list, and so on. If all the targets on the list have been
tried, return failure.
The purpose of going towards targets with few opportunities is so that
we don't leave any stranded spaces that will not be detected until
much later. When it does run into a dead end, it usually only
has to back out a short ways and try again.
The algorithm breaks down on certain board sizes. For example, try a
120 by 120 board. Instead of nearly instantaneous solution like 120 by
119 would get, it takes enormous amounts of time. I don't yet know
why.
Also included in the repository is a Perl script for creating frames
for a video. Here are two videos, one for an 8x8 board, one for a
80x80 board. They are in Ogg
Theora. Update: if your browser supports HTML 5 you can now
watch these videos inline thanks to the video tag.
It's interesting how on the large one the behavior seems to be random
at some places, but all it's doing is following that simple algorithm.
Many comment/discussion systems get previews wrong. This even includes
major sites like Boing Boing and Slashdot. Sometimes they feed back a
different comment in the textarea, so repeated previews slowly degrade
the comment. Other times the comment preview isn't the same thing as
the final result. A comment actually has four states,
The raw comment is the unfiltered string of bytes from the
user. This is not safe to give directly back to the user, as it could
be exploited to feed an arbitrary page to an innocent user.
The escaped comment is created from the raw comment by
filtering it through the escapeHTML() function. This
function creates HTML entities out of some of the characters, like
< and >. A browser will interpret the escaped comment as a
simple string, and is safe to give back to the user. This function is
actually provided by perl's CGI module, so perl programmers need not
implement this.
Note that escapeHTML() is reversible, though the server
side won't need to reverse it. The browser does.
The stripped comment is created from the raw comment by
filtering it through stripHTML(), which removes
non-whitelisted HTML tags. It also strips non-whitelisted attributes
from allowed tags. It should probably add a
rel="nofollow" to links. It also runs escapeHTML()
on attribute values and content outside tags. This is safe to give
back to the user because only safe tags are left.
If your comments use markup other than HTML, like BBCode, this
function should strip all HTML (your whitelist is empty) and do
the conversion from your markup to HTML.
It might also be a good idea for it to produce well-formed HTML. This
will allow your comments/discussion pages to be XHTML compliant.
stripHTML() is irreversible because it dumps information.
The stored comment is the encoding of the comment in the
system. This depends entirely on the storage system. In some cases it
may be identical to the stripped comment (and store is
the identity function). If the comment is going through SQL into
database, some characters may need to be escaped as to not cause
problems. It could even be a base 64 encoding.
store() must be unambiguously reversible, and the server
should have an unstore() to do this. It should probably
also be able to convert any arbitrary string of characters into a safe
encoding for storage.
There should only be one version of all these functions for both
previews and final posting of comments.
When doing a comment preview both the escaped comment and the stripped
comment are given back to the user. The stripped comment is dropped in
as HTML, and the escaped comment is put into the textarea of the
form. It would probably be convenient for the user if you give them
back any other form information, including the same captcha and their
answer to it (or not charge them with a captcha for that comment
anymore).
You may be tempted to store the raw comments (safely with
store()) and do HTML stripping on the fly. This would
allow you to upgrade your HTML stripping function in the future to
"better" handle user input. I don't recommend it. That's extra
processing for each page request, but worse, it breaks the concept of
the preview, because the comment formatting is subject to change in
the future.
The hardest function to implement is probably stripHTML()
because it needs to be able to handle poorly formed HTML. If you are
using perl, you will probably want to use the HTML::Parser module,
which is what I did. This does everything noted above and also
auto-links anything that looks like a URL, forces proper comment
nesting, automatically makes paragraphs from blank-line-separated
chunks, and almost produces well-formed HTML.
The unquoted let form evalues to 20, not 35,
because foo evaluates to the symbol bar,
which is bound to 10. The unquoted version is used to select variable
names dynamically.
This isn't very useful in a lexically scoped lisp because the
variables that can access it are decided, well, lexically. With
dynamic scoping we can use it to temporarily mask variables and select
what variables to mask dynamically. I found a use for it in one of my
projects.
I had a recursive function that searched a graph made of symbols. The
symbols globally stored the state of the node. The current node was
passed into the function, which would change the state of that node,
and recurse. Here's how I was doing it,
I've been using lisp on and off for the past few years. I read some
lisp books, went through The Little
Schemer and some of SICP. But
I could never really think in lisp. When I needed to write some
code, I would prefer another language first. I was writing imperative
code for 10 years before I saw lisp, and I was used to it.
Recently, I have found myself wanting to use lists (including alists,
plists, etc.) as data structures for everything, even when I'm not
even writing lisp. I think I finally
got lisp and now I want to use lisp for everything.
There is a 75% chance of rain on any given day in the next
week. What is the probability that it will rain on at least 5 of
the 7 days?
The purpose of the question was to point out a neat coincidence in the
problem (explained at the link). I used a program to solve the
problem and there are two reasons for this.
First, I wanted to use the program to explore the problem and find the
"special" property. With a program, I could quickly try different
parameters, which would take longer, and be more error prone, by hand.
Second, which is similar to the first, I hate evaluating a large
expression by hand. It's slow and error prone. Writing a program to do
the same work is faster and mistakes are easier to catch. Also, I can
quickly try different parameters to make sure my program's output is
reasonable. In this case, for any reasonable input, the output, a
probability, shouldn't be greater than 1.
Well, let's see, this is a Bernoulli
experiment: each day is independent, so it is like flipping a coin
seven times and counting the heads. That means we need the choose
function.
My first thought was Octave, as this is a simple program and Octave
already provides nchoosek() for me.
# Rain at least n days
function sum = rain (n)
sum = 0;
for i = n:7
sum += nchoosek(7,i) * 0.75 ^ i * 0.25 ^ (7 - i);
endend
Simple, but I actually made a couple little mistakes working it out,
and it took me a little longer than it should have. If you asked
someone to write this program in any imperative language, it would
probably look a lot like this.
I then made a lisp version (elisp), but I first needed to define the
binomial coefficient function since there wasn't one provided.
(defunnck (n k)
"The binomial coefficient."
(if (or (= k 0) (= k n)) 1
(+ (nck (- n 1) (- k 1)) (nck (- n 1) k))))
This is the recursive version, based on Pascal's
rule, so it doesn't need factorials.
In lisp, recursion is preferred to iteration, so that's the way I
approached the program.
(defunp-rain (n)
"Probability of rain on exactly n days."
(* (nck 7 n) (expt 0.75 n) (expt 0.25 (- 7 n))))
(defunrain-at-least (n)
"Probability it will rain on at least n days."
(if (> n 7) 0
(+ (p-rain n) (rain-at-least (+ 1 n)))))
I like the recursive version, and this code, much better. It presents
the solution in a more straight forward way. And I got it right the
first time, too. Yes, it was written after the Octave version,
but I still think it counts for something.
The lisp version is also less complex. The Octave version has to use
some temporary variables, i and sum, which
is extra conceptual overhead. Sure, it could also be written
recursively, but this is not really the way Octave is meant to be
written.
Eh, not a great example, really. Lisp has so many powerful features,
like macros (the powerful lisp kind), symbols, low-level access into
the interpreter, and the REPL, that allow the programmer to do some
really cool things. Many of these features are unique to lisps.
At work right now I am using two different machines. One is a 32-bit
SPARC and the other is a very powerful SMP x86-64. Sometimes data are
generated on one machine and used in a simulation on the other. There
is a problem of byte-order, though. The SPARC is big-endian and the
other is little-endian, and the programs on both sides don't pay
attention to that small detail.
Luckily, the data are all 4-byte aligned. That's perfect for a Perl
one-liner byte order conversion,
Perl is really great for concise hacks. I really like how this
one-liner almost reads like a natural language sentence. Is there any
other language that can do powerful one-liners like Perl?
To all other non-emacsen text editors, can your text editor do that?!
Ha! Even though elisp is a slow, closure-less, dynamically scoped,
ugly cousin of more popular lisps, it's still a lot of fun to write.
To fire it up, load it into Emacs and run the extended command
(M-x) httpd-start. By default it will serve
files from "~/public_html". To change this, change the
variable httpd-root to the desired web root. You can stop
the server with httpd-stop.
It's about 200 lines of code and can serve static websites made of
small, static files. I say small files because it serves files from
buffers, meaning it has to read the entire file in first.
For a simple, text editor based server it can hold up to a pretty
decent load. At one point I hit it with 8 wget instances all making
rapid recursive downloads and my manual navigation wasn't slowed down
noticeably. Despite running in the slow elisp interpreter, I think it
can have much better performance by caching commonly served files in
buffers.
It should run, unmodified, anywhere a modern Emacs can run, so
I expect that it's already very portable. I can imagine it being
useful in a situation where someone needs to temporarily host some
files but there isn't a web server on the machine. Just grab this
script and throw it at Emacs.
Well, it only does IPv4 right now, though I expect IPv6 only requires
changing one number (namely, 4 to 6). I don't have any IPv6 systems to
test it on.
When writing it I also had security in mind so, as far as I know, it
should be safe to use. It cleans up the GET from the
client so that no files underneath the serving root can be accessed.
The server log is lisp itself. Here is an example log starting the
server, serving one request, and halting,
The log is alists of alists, making a hierarchical tree structure that
can be explored with some simple lisp functions. Normally this sort of
thing is done with XML, but lisp already has its own structured
format: lists!
When GET is a directory, it looks for
"index.html" and serves that if it exists. More indexes
can be added to the variable httpd-indexes. This can
actually be done in a special ".htaccess.el" file.
If a ".htaccess.el" exists in the directory from which a
file is being served, Emacs will first load/execute it. You see, it's
just a lisp program. If you wanted to add a new index file name, the
hypertext access file could contain this,
(add-to-list 'httpd-indexes "0001.html")
It's a bit like a .emacs file.
But I think one of the coolest things about having a lisp-based server
is that the server can be modified in place without disrupting or
restarting it. In my Emacs web server, the only change that requires a
restart is changing the server port. In fact, I wrote most of it while
the server was running and tested my changes from a browser right as I
made them -- all on the same instance of the server.
If you want to look into the AI side of this, the server could modify
its own code in response to its use.
I also had the idea of creating dynamic websites with elisp, in the
same way PHP or Perl does. If a .el file (or
.elc) is accessed, the server would pass the
GET/POST arguments as an alist to a function
in the elisp file. The server would also provide some nifty HTML
generation macros. A dynamic script might look like this,
I finally have a comment system, thanks to pollxn, a blosxom comment system that
actually works. There is a link to it, indicating the number of
comments, in the bottom of each post. Try it out and say hello.
Unfortunately, pollxn doesn't have any sort of anti-spam or CAPTCHA system. If
you look around the Interwebs where other people are using pollxn, you
will see everyone has their own little CAPTCHA thing. Well, I am not
different. I hacked together my own to keep away automated spammers.
It selects words from the dictionary (of 40,000 words in this case)
and encrypts them with Blowfish in CBC mode, with a unique IV each
time. This is to passed to the user, who passes it to an image
generator which decrypts the word and uses GD in Perl to render it,
apply some transforms, and drop a line randomly over it. The user
submits the guess of the image along with the encrypted version
(hidden field), which is decrypted and compared on the other end. The
same encrypted ID cannot be used twice, but thanks to the IV the same
word can be used twice.
Here are some samples. If you hit refresh, they will render
differently.
It's not a great CAPTCHA, but it should be good enough for the low
volume of traffic I see here. As I inevitably collect small amounts of
spam (by spammers manually passing the CAPTCHA), I will gradually
create the needed tools to combat it. I can also easily update the
CAPTCHA image algorithm without disrupting the functioning of the
website.
I'm sure I will be making improvements to the comment system over time
as well. I should make it obfuscate e-mail addresses, for one. Maybe
add a preview. And better blosxom integration.
So say hello below! I am excited to finally have a real blog.
I have recently been playing around with Greasemonkey, a great Firefox
add-on that gives users a lot of control over how a website is
displayed. It works by having the user providing a bit of Javascript
that runs when a website is rendered. The user doesn't actually have
to write the script, but can find them in various script repositories.
When I looked around, I couldn't find scripts to do some things I
wanted, so I started writing my own. Now that I have started that
habit I see uses for user scripts all over the place. Suddenly I can
fix anything I find annoying on the web. It's very empowering.
Of course, Firefox add-ons can do anything a user script can do. But
Greasemonkey user scripts are lightweight, more secure (due to being
less powerful), easier to write, and don't require a browser restart
to install and uninstall.
I posted my scripts on
userscripts.org so that people could find them easily, but I
always like to host these things locally, too. You can find it under
/userscripts here. Don't forget to review
the source before you install it! That is, unless you automatically
trust me and my website's security. I, or an infiltrator, could slip
something sneaky in there.
I actually first used Greasemonkey back in 2005, but they had some
very serious security issues back in those days. It was bad enough
that I just uninstalled it, which was actually recommended by the
Greasemonkey people themselves. So, four years later I am back to
check it out.
The first user script I wrote was in response to a "feature" on TV Tropes. In addition to the overall
cruddiness of the website, they started adding "folders" to the
information on long pages. It folds up all of the information behind
little clickable widgets. It uses CSS to hide the information and
Javascript to reveal it by adjusting those styles on the fly. If you
have a browser with CSS support but not Javascript (or have it
disabled like I did), you won't ever be able to see the information in
the browser. As of this writing the
Jumping the Shark article uses this. Take a look.
This is an awful idea! It critically breaks the usability of the
page. And what's the point? We already have a vertical scrollbar to
control the display. Unfortunately, a lot of clueless people seem to
like this sort of behavior -- because it's flashy -- so we will
probably only see more of it on the web in the future.
My TV Tropes user script is simple: it scrapes off some of the
CSS. Specifically, the "folder" CSS. That's it! Someone who already
knows how to make user scripts could probably put this together in
less than 15 minutes.
If you want to tackle web annoyances, learn Greasemonkey!
Here's something interesting I saw on Wikipedia. There is a map of the
US with states colored according to the spread of the H1N1 flu
epidemic. Take a look:
H1N1_USA_Map.svg. The interesting part is actually at the bottom
of that image page.
Wikipedia, being a wiki, versions everything. It has to. No one
actually changes a page, they just add a new version that is a
derivative of the "current" version. Because of this, Wikipedia has
incidentally created a time-lapse version of the map in its version
control system. In case you can't see it when you are reading this,
here's what part of it looks like,
That's the timestamp and the state of the spread at that time,
presented in an extremely useful way. And this was created by
accident. Pretty cool, eh?
Don't stop here! This isn't everything. Check out the archives
(on the left) for more posts. Or just have a look at
the index.