null program

Clay Klein Bottle

A few years ago I made my wife -- girlfriend at the time -- a Klein bottle (well, the three-dimensional projection of one) out of clay. Since I hadn't worked with clay before, I had some help from my dad. Here's how it was done,

[Images: bottle diagram, bottom diagram]

As you can see, it's not quite the same as the generally depicted Klein bottle. The form you see here was easier to make with clay. After it was done, we baked it in a kiln. It's a bad idea to put sealed items in a kiln because they will burst as they heat. It took some time to convince the staff that our Klein bottle was actually unsealed.

Here are some pictures,

[Pictures: front, side, bottom, top]


Arcfour and CipherSaber in Emacs Lisp

I have previously talked about arcfour and CipherSaber and provided implementations in C. For fun, I made an implementation of arcfour in Emacs Lisp (elisp), and built upon that to make a CipherSaber implementation in elisp. Check it out with Git,

git clone http://git.nullprogram.com/arcfour.git

If you don't have Git (yet), you can follow that link to use the web interface. The relevant files are arcfour.el and ciphersaber.el. There are some test vectors in there that I won't show here. Here is the arcfour implementation,

(defun rc4-init-state ()
  "Initialize the arcfour state vector."
  (interactive)
  (setq rc4-state (make-vector 256 0))
  (setq rc4-i 0)
  (setq rc4-j 0)
  (let (i)
    (dotimes (i 256 rc4-state)
      (aset rc4-state i i))))

(defun rc4-swap (i j)
  "Swap two elements in the state vector."
  (let ((temp (aref rc4-state i)))
    (aset rc4-state i (aref rc4-state j))
    (aset rc4-state j temp)))

(defun rc4-key-sched (key)
  "Arcfour key-scheduler: initialize state from key."
  (interactive "sEnter key: ")
  (let ((j 0) i)
    (dotimes (i 256 rc4-state)
      (setq j (% (+ j 
                    (aref rc4-state i) 
                    (aref key (% i (length key)))) 256))
      (rc4-swap i j))))

(defun rc4-gen-byte ()
  "Generate a single byte."
  (interactive)
  (setq rc4-i (% (1+ rc4-i) 256))
  (setq rc4-j (% (+ rc4-j (aref rc4-state rc4-i)) 256))
  (rc4-swap rc4-i rc4-j)
  (aref rc4-state (% (+ (aref rc4-state rc4-i) 
                        (aref rc4-state rc4-j)) 256)))

For the sake of simplicity, it uses global variables to store the cipher state. It would be better if the state were returned as a list, continuation, or closure. That way we could run several different ciphers at the same time.
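
To make the closure idea concrete, here is a minimal sketch in Python rather than elisp. The names make_rc4 and rc4_crypt are made up for this illustration and are not part of arcfour.el. Each call to make_rc4 captures its own state vector and indices, so several ciphers can run side by side without touching any globals.

```python
def make_rc4(key: bytes):
    """Return a function that produces one arcfour keystream byte per call."""
    state = list(range(256))

    # Key schedule: mix the key into the state vector.
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Generator indices live in the closure, not in global variables.
    i = j = 0

    def gen_byte() -> int:
        nonlocal i, j
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        return state[(state[i] + state[j]) % 256]

    return gen_byte


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data (the operation is its own inverse)."""
    gen = make_rc4(key)
    return bytes(b ^ gen() for b in data)
```

Two generators made from different keys can be interleaved freely, which is exactly what the global-variable version cannot do.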

The following interactive functions can be called by a user directly on the buffer being edited, for example with M-x rc4-buffer.

(defun rc4-region (start end key)
  "Encrypt/decrypt region with arcfour using given key."
  (interactive "r\nsEnter key: ")
  (rc4-init-state)
  (rc4-key-sched key)
  (save-excursion
    (let (c)
      (goto-char start)
      (while (< (point) end)
        (setq c (char-after))
        (delete-char 1)
        (insert-char (logxor c (rc4-gen-byte)) 1)))))

(defun rc4-buffer (key)
  "Encrypt/decrypt entire buffer with arcfour."
  (interactive "sEnter key: ")
  (rc4-region (point-min) (point-max) key))

You may run into encoding issues with encrypted regions. The CipherSaber implementation below gets around this problem by turning off multi-byte encoding with set-buffer-multibyte.

In ciphersaber.el we apply these functions. The CipherSaber functions can be called with M-x cs-encrypt-buffer and M-x cs-decrypt-buffer. Note that this is only CipherSaber-1, and I leave CipherSaber-2 as an exercise for the reader :-P.

(defun cs-encrypt-buffer (key)
  "Encrypt buffer with CipherSaber-1."
  (interactive "sEnter key: ")
  (set-buffer-multibyte nil)
  (let* ((iv (cs-generate-iv))
         (cs-key (concat key iv)))
    (rc4-buffer cs-key)
    (goto-char (point-min))
    (insert iv)))

(defun cs-decrypt-buffer (key)
  "Decrypt buffer with CipherSaber-1."
  (interactive "sEnter key: ")
  (let* ((iv (cs-retrieve-iv))
         (cs-key (concat key iv)))
    (rc4-buffer cs-key)))

(defun cs-generate-iv ()
  "Generate a 10-byte initialization vector."
  (let ((iv "") i)
    (dotimes (i 10 iv)
      (setq iv (concat iv (char-to-string (random 256)))))))

(defun cs-retrieve-iv ()
  "Remove initialization vector from buffer and return it."
  (goto-char (point-min))
  (let ((iv "") i)
    (dotimes (i 10 iv)
      (setq iv (concat iv (char-to-string (char-after))))
      (delete-char 1))))

I didn't bother with save-excursion because I didn't think it would be important where the point is in the middle of an encrypted file. Feel free to add it, though!

These functions could be used to make a minor mode to transparently encrypt and decrypt CipherSaber files. It could also be modified to take advantage of Emacs' batch mode to handle CipherSaber processing right from the shell. But those are other projects!


CipherSaber

If you are a crypto-anarchist type like me, you should definitely take a look at CipherSaber. It is an extremely simple encryption protocol that even beginner programmers can implement. The protocol can also easily be memorized and quickly implemented from memory on the fly. If cryptography were ever completely outlawed, CipherSaber would let its users continue to communicate privately.

I think the name is just perfect and captures everything CipherSaber is about. Here is the description right from the CipherSaber page,

In George Lucas' Star Wars trilogy, Jedi Knights were expected to make their own light sabers. The message was clear: a warrior confronted by a powerful empire bent on totalitarian control must be self-reliant.

CipherSaber is based on the arcfour stream cipher, but goes beyond it by defining the use of an initialization vector (IV) and how it is stored with the ciphertext. There are actually two versions: CipherSaber-1 and CipherSaber-2. The second one exists because of vulnerabilities in the first. The difference between them is small.

You want to make sure you generate a long enough passphrase for your encryption key. A normal password isn't good enough because an adversary will be able to throw all his available processing power at your ciphertext. Using Diceware would be a good idea here.

Here is the protocol.

CipherSaber diagram

Generate a 10-byte random IV. This need not be done using a very strong random number generator; it is only important that the same IV is never used more than once. Concatenate a secret user-selected key (i.e., a passphrase) with the IV and use that concatenation as the key for an arcfour cipher. Encrypt the message using the cipher. Concatenate the IV and the arcfour ciphertext to create the CipherSaber ciphertext.

To decipher, remove the first ten bytes of the ciphertext and use them as the IV. Concatenate the secret passphrase with the IV, and use the result as the key for an arcfour cipher. Decrypt the remaining ciphertext with that cipher.

Because of vulnerabilities in the arcfour cipher, CipherSaber-2 is an updated version that runs the arcfour key scheduler at least 20 times. The exact number of times is a secret that the sender and receiver must agree on. Notice that CipherSaber-1 is CipherSaber-2 with only 1 key schedule iteration.

Using a large number of iterations could be considered a form of key strengthening. An adversary who is making a brute force attack on the ciphertext has that much more work to do for each passphrase trial.
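
The whole protocol fits in a few lines. What follows is only a sketch in Python, not the C implementation from the repository; the function names (rc4_keystream, cs_encrypt, cs_decrypt) and the use of os.urandom for the IV are my own choices for illustration. Passing rounds=1 gives CipherSaber-1, and a larger value gives CipherSaber-2.

```python
import os

IV_LENGTH = 10  # CipherSaber uses a 10-byte IV


def rc4_keystream(key: bytes, rounds: int = 1):
    """Yield arcfour keystream bytes, repeating the key schedule `rounds` times."""
    state = list(range(256))
    j = 0  # j carries over between key schedule rounds
    for _ in range(rounds):
        for i in range(256):
            j = (j + state[i] + key[i % len(key)]) % 256
            state[i], state[j] = state[j], state[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def _xor(data: bytes, key: bytes, rounds: int) -> bytes:
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key, rounds)))


def cs_encrypt(passphrase: bytes, plaintext: bytes, rounds: int = 1, iv=None) -> bytes:
    """Return IV + ciphertext. A fresh random IV is generated unless one is given."""
    if iv is None:
        iv = os.urandom(IV_LENGTH)
    if len(iv) != IV_LENGTH:
        raise ValueError("IV must be exactly 10 bytes")
    return iv + _xor(plaintext, passphrase + iv, rounds)


def cs_decrypt(passphrase: bytes, ciphertext: bytes, rounds: int = 1) -> bytes:
    """Split off the leading IV and decrypt the rest."""
    iv, body = ciphertext[:IV_LENGTH], ciphertext[IV_LENGTH:]
    return _xor(body, passphrase + iv, rounds)
```

Both sides have to use the same rounds value, since decrypting with a different number of key schedule iterations produces garbage.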

You should really implement your own, but here is one of my implementations, written in C. I put it in with the rest of my arcfour stuff. Get it with git,

git clone http://git.nullprogram.com/arcfour.git

You can use it as a reference to make sure your first implementation is correct. You can use these two ciphertexts to test your implementation as well,

ciphersaber.png.cs
ciphersaber.png.cs2

This is the diagram image above (ciphersaber.png) encrypted with the key "nullprogram". The first one is CipherSaber-1 and the second is CipherSaber-2 with 20 key schedule iterations.


Emacs Htmlize and Highlighted Source Code

In the past I have always posted my code plainly without any sort of syntax highlighting. Boring! I finally got around to using the Emacs htmlize tool written by Hrvoje Niksic. So from now on I will try to provide syntax highlighting for inline code.

Htmlize leverages the syntax highlighting of Emacs by converting the faces into HTML. It's more or less what I see in my Emacs buffers. The exact output format can be in HTML + CSS, HTML with inline CSS, or HTML with old font tags. It can provide highlighting for any language that Emacs supports, which is just about everything.

I threw the CSS stuff in my main stylesheet so the XHTML of my posts remains simple. Here are some examples in action. Perl,

## Find the maximum total from top to bottom of the triangle.

use warnings;
use strict;
use List::Util qw/reduce max/;

my @triangle;
open my $datafile, '<', 'data67.txt' or die "Cannot open data67.txt: $!";
while (<$datafile>) { push @triangle, [split/\s+/] }

my $result = reduce {
    for my $i (0..$#{$b}) {
        $b->[$i] += max($a->[$i], $a->[$i+1]);
    }
    $b
} reverse @triangle;

print $result->[0] . "\n";

Scheme,

(define (sqrt x)
  (sqrt-iter 0 1.0 x))

(define (sqrt-iter last-guess guess x)
  (if (good-enough? last-guess guess)
      guess
      (sqrt-iter guess (improve guess x)
                 x)))

(define (good-enough? last-guess guess)
  (< (/ (abs (- last-guess guess)) guess) 0.001))

(define (improve guess x)
  ;; Newton's method step: average the guess with x/guess.
  (/ (+ guess (/ x guess)) 2))

Octave,

# npoly()
function a = npoly (x, y)  
  X = repmat (x', 1, length(x));
  
  for i = 1:length(x)
    X(:,i) = X(:,i) .^ (i - 1);
  end
  
  a = X \ y';
end

C,

#define SWAP(a, b) if (a ^ b) {a ^= b; b ^= a; a ^= b;}

/* Initialize a keystream.  */
void init_keystream (keystream *k, int n)
{
  k->i = 0;
  k->j = 0;

  int i;
  for (i = 0; i < 256; i++)
    k->S[i] = i;

  int s;
  byte j = 0;
  for (s = 0; s < n; s++)
    for (i = 0; i < 256; i++)
      {
        j += k->S[i] + k->key[i % k->keylen];
        SWAP(k->S[i], k->S[j]);
      }
}

elisp,

(defun count-words-buffer ()
  "Print a message with the number of words in the current buffer."
  (interactive)
  (save-excursion
    (let ((count 0))
      (goto-char (point-min))
      (while (< (point) (point-max))
        (forward-word 1)
        (setq count (1+ count)))
      (message "%d words." count))))

And even HTML,

<html>
<head>
<title>Sample HTML</title>
</head>
<body>
<h1>Hello World!</h1>
<!-- This is a comment. -->
<p>
  This is my HTML sample.
</p>
</body>
</html>


Custom Webcomic RSS Feeds

I actually didn't start using RSS feeds until recently, and I wish I had started earlier. However, when I added my favorite webcomics to my feed I noticed most of them didn't actually include the comic itself. Personally, I would rather read the comic right in the feed.

Well, I later came across an old post on a blog I frequent: Terminally Incoherent. Luke Maciak posted some code to generate an RSS feed by scraping the comic's website. He did it in order to provide feeds for webcomics that don't have one. I took it, added a web front-end to it, and made the feeds available to anyone. You can find them under /feeds/ or at the link in the main left bar.

The scraper works by searching the comic's "latest comic" page for image sources matching a regular expression. So to teach the scraper about a new comic, all that needs to be provided is a URL and a regular expression matching only the image of the comic.
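
The matching step can be sketched like this. It is written in Python for illustration and is not the code in grab.pl; the function name find_comic_image and the example pattern are hypothetical. It collects every img source on a page and returns the first one that matches the comic's regular expression.

```python
import re
from html.parser import HTMLParser


class _ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_comic_image(page_html: str, pattern: str):
    """Return the first image source matching `pattern`, or None."""
    collector = _ImgSrcCollector()
    collector.feed(page_html)
    regex = re.compile(pattern)
    for src in collector.sources:
        if regex.search(src):
            return src
    return None
```

With a pattern such as `/comics/\d+\.png$`, the site logo and ad banners are skipped and only the comic image comes back. Fetching the page and caching the result are left out of this sketch.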

My front-end has a database of URL/regex pairs and takes care of caching results. This way it only queries the comic's website once every few hours at the most.

Here's where you can find the code,

feeds.pl
grab.pl

I also use an .htaccess file to rewrite the links so they look much cleaner,

RewriteEngine on
RewriteRule ^([\w\/]+)$ feeds.pl?q=$1
RewriteRule ^$ list

There are some comics that are immune. Specifically, some comics don't allow hot-linking, which includes links from an RSS feed. There is nothing I can really do about these.

If you enjoy these comics, go ahead and add these feeds if you want to read them right in your RSS reader.


A Not So Stupid C Mistake

I was reading through a website of "computer stupidities" today when I came across this,

if (a)
  {
    /* do something */
    return x;
  }
else if (!a)
  {
    /* do something else */
    return y;
  }
else
  {
    /* do something entirely different */
    return z;
  }

This was quickly dismissed as being an obvious beginner mistake. I don't think this can be dismissed so quickly without thinking it through for a moment. Yes, in the example above we will never reach the last condition where we return z, but consider the following,

if (a < b)
  printf ("foo\n");
else if (a > b)
  printf ("bar\n");
else if (a == b)
  printf ("baz\n");
else
  printf ("faz\n");

The same quick dismissal might drop the last "faz" print statement as being an impossible condition. Can you think of a situation where the program would print "faz"?

Our final condition will be reached if a and b are floating-point values and either one is NaN ("not a number"), which is defined by the IEEE 754 floating-point standard. In C99 it is available as the NAN macro from math.h. Every one of the comparisons above evaluates to false when either operand is NaN.
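
The same behavior is easy to see in any language with IEEE 754 floats. Here is a small Python sketch of the chain above; the function name compare is just for illustration.

```python
import math


def compare(a: float, b: float) -> str:
    """Mirror the if/else chain above, returning what it would print."""
    if a < b:
        return "foo"
    elif a > b:
        return "bar"
    elif a == b:
        return "baz"
    else:
        # Only reachable when no ordering holds, i.e. a or b is NaN.
        return "faz"
```

Calling compare(math.nan, 1.0) falls all the way through to the last branch, because NaN is neither less than, greater than, nor equal to anything, including itself.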

So don't be so quick to dismiss code like this.


SWF Decompression Perl One-liner

Flash seems to be the popular way of playing videos online. This is a bit better than the bad old days of online video where a user had to select from a few buggy media player plug-ins. Things have improved.

However, if you don't use Flash, or if you want to watch the videos in your own media player, you are stuck. A download link for the video is almost never provided. The video is always somewhere, though, to be fetched via HTTP. I mentioned this before for downloading YouTube videos using youtube-dl.

The trick is finding the URL. Sometimes you can derive it from the HTML code; sometimes you have to dig a little deeper by inspecting the Flash player itself. The strings utility can be invaluable here.

There could be an extra layer of stuff to work out, which is explained below. My main reason for posting this is so I can refer back to it later when I need to do it again.

So, the other day I ran into a Flash video player that contained part of the URL of its video. I began by studying the embed tag in the HTML, which gave me some information about where to find the video (the video ID number). I downloaded the Flash player SWF file for the purpose of running strings on it.

I ran into a problem here. I wasn't finding any non-garbage strings inside the file. The file command told me it was compressed.

$ file player.swf
player.swf: Macromedia Flash data (compressed), version 9

Searching online quickly revealed that a compressed Flash file is just zlib compression after an 8-byte header. Decompression can actually be done with a Perl one-liner,

perl -MCompress::Zlib -0777 -e \
      'print uncompress substr <>, 8;' player.swf > player
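
For comparison, here is roughly the same thing as a Python function. This is a sketch, and the name swf_body is made up. It also checks the three-byte signature at the start of the header: "CWS" marks a zlib-compressed file and "FWS" an uncompressed one.

```python
import zlib


def swf_body(data: bytes) -> bytes:
    """Return the SWF contents that follow the 8-byte header, decompressed."""
    signature = data[:3]
    if signature == b"CWS":
        # Compressed: everything after the header is a zlib stream.
        return zlib.decompress(data[8:])
    if signature == b"FWS":
        # Already uncompressed: just skip the header.
        return data[8:]
    raise ValueError("not an SWF file")
```

Reading player.swf in binary mode and passing its contents to swf_body should give the same bytes the one-liner writes to player.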

I ran strings and grepped for "http", revealing the location of the video. That was it!

I actually came across a Java program that does the same thing. It is 115 lines of code. Java programs always seem to be bloated like this.

I hope you find this useful!


URL Shortening

There has been a lot of talk online about the fragility of URL shortening services, particularly in relation to Twitter and its 140-character limit on posts (based on SMS limits). These services create a single point of failure and break mechanisms of the web that we rely on. Several solutions have been proposed, so over the next couple of years we get to see which ones end up getting adopted.

There are many different URL shortening services out there. They take a large URL, generate a short URL, and store the pair in a database. Several of these services have already shut down in response to abuse by spammers who hide fraudulent URLs behind shortened ones. If these services ever went down all at once, these shortened URLs would rot, destroying many of the connections that make up the world wide web. This is called the link rot apocalypse, and it has some people worried.

I am not very worried about this, though. I don't use Twitter, or any other service that puts such ridiculous restrictions on message sizes. Nor do I think information on Twitter is very important. Also, this mass link rot will occur gradually, slowly enough to be dealt with.

In any case, short URLs may be useful sometimes, especially if a URL needs to be memorized or if the URL is extremely long. Or, it could be used to get around a design flaw in an inferior browser.

One idea that I have not yet seen implemented is simple data compression. When a short URL is needed, a user can apply a compression algorithm to the URL. The original URL can be recovered from this alone, so we don't have to rely on third parties to store any data.

I have doubts this would work in practice, though. Generic compression algorithms cannot compress such a small amount of data because their overhead is too large relative to the input. Go ahead, try pushing a URL through gzip. It will only get longer. We would need a special URL compression algorithm.
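
Here is a quick way to try it from Python instead of the shell. The helper name gzip_sizes and the example URL are placeholders of mine.

```python
import gzip


def gzip_sizes(url: str):
    """Return (original length, gzip-compressed length) in bytes."""
    raw = url.encode("ascii")
    return len(raw), len(gzip.compress(raw))
```

The gzip container alone adds a 10-byte header and an 8-byte trailer, so a URL of a few dozen characters with little repetition comes out longer than it went in.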

For example, I could harvest a large number of URLs from around the web, probably sticking to a single language, and use it to make a Huffman coding frequency table. Then I use this to break URLs into symbols to encode. The ".com/" symbol would likely be mapped to one or two bits. Finally, this compressed URL is encoded in base 64 for use. The client, who already has the same URL frequency table, would use it to decode the URL.
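
A toy version of that idea might look like the sketch below. It is simplified on purpose: it builds the Huffman table from single characters rather than multi-character symbols like ".com/", it leaves out the final base 64 step, and the names build_code, encode, and decode are my own. A URL containing a character that never appeared in the corpus cannot be encoded by it.

```python
import heapq
from collections import Counter
from itertools import count


def build_code(corpus):
    """Build a Huffman code (char -> bit string) from a list of sample URLs."""
    freq = Counter("".join(corpus))
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(n, next(tiebreak), ch) for ch, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tiebreak), (left, right)))

    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a single character
            code[node] = prefix or "0"

    walk(heap[0][2], "")
    return code


def encode(url, code):
    """Encode a URL as a string of '0' and '1' characters."""
    return "".join(code[ch] for ch in url)


def decode(bits, code):
    """Invert encode() using the same table."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a symbol")
    return "".join(out)
```

Comparing len(encode(url, code)) against 8 * len(url) gives a rough idea of how much a shared frequency table could save before any encoding overhead is added back.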

URLs don't seem to have too many common bits, so I doubt this would work well. I should give it a shot to see how well it works.

We probably need to stick with lookup tables mapping short strings to long strings. Instead of using a third party, which can disappear with the valuable data, we do the URL shortening at the same location as the data. If the URL shortening mechanism disappears, so does the data. The URL shortening loss wouldn't matter thanks to this coupling. Getting the shortened URL to users can be tricky, though.

One proposal wants to set the rev attribute of the link tag to "canonical" and point to the short URL.

<link rev="canonical" href="http://example.com/FbVT">

To understand this one must first understand the rel attribute. rel defines how the linked URL is related to the current document. rev is the opposite, describing how the current page is related to the linked page. To say rev="canonical" means "I am the canonical URL for the linked page".

However, I don't think this will get far. Several search engines, including Google, have already adopted rel="canonical" for regular use. It's meant to be placed with the short URL and will cause search engines to treat it as if it were a 301 redirect. This won't help someone find the short URL from the long URL, though. It is also likely to be confused with the rev attribute by webmasters.

The rev attribute is also considered too difficult to understand, which is why it was removed from HTML5.

Another idea rests in just using the rel attribute by setting it to various values: "short", "shorter", "shortlink", "alternate shorter", "shorturi", "shortcut", "short_url". This website does a good job of describing why they are all not very good (misleading, ugly, or wrong), and it goes on to recommend "shorturl".

I went with this last one and added a "short permalink" link in all of my posts. This points to a 28-letter link that will 301 redirect to the canonical post URL. In order to avoid trashing my root namespace, all of the short URLs begin with an asterisk. The 4-letter short code is derived from the post's internal name.

I also took the time to make a long version of the URL that is more descriptive. It contains the title of the post in the URL so a user has an idea of the destination topic before following through. The title is actually complete fluff and simply ignored. Naturally this link's rel attribute is set to "longurl".

Keep your eyes open to see where this URL shortening stuff ends up going.


Hashapass Password Management

The author of a tool named Hashapass contacted me some time ago to bring it to my attention. It is a way to mitigate the problem of having to memorize and generate many different passwords.

Good security practice is for users to have a different password for each web site and system they use. Should one of them be compromised, your other accounts will still be safe. The problem is that passwords tend to be both hard to remember and difficult to generate.

Hashapass allows a user to have just one password (ideally, passphrase) that is used to generate many different passwords. Provide the master passphrase and the name of the website (parameter) needing a password and Hashapass generates an 8-character password worth 48 bits.

The website works entirely in JavaScript, so you don't have to worry about transmitting your password or master passphrase. This also makes it easy to see how the hashing is done. If this were a secret, I wouldn't recommend using it.

It works by applying HMAC, with the SHA-1 hash, to the parameter and passphrase so as to stir them together into a hash. Then it outputs the 48 most significant bits in base 64 as the password.
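
In Python the scheme can be sketched in a few lines. One detail here is an assumption on my part: I use the master passphrase as the HMAC key and the parameter as the message, so check this against the Hashapass source before relying on it. The function name hashapass is just for illustration.

```python
import base64
import hashlib
import hmac


def hashapass(master: str, parameter: str) -> str:
    """Derive an 8-character password from a master passphrase and a parameter.

    Assumes the passphrase is the HMAC key and the parameter is the message.
    """
    digest = hmac.new(master.encode("utf-8"),
                      parameter.encode("utf-8"),
                      hashlib.sha1).digest()
    # The first 6 bytes are the 48 most significant bits; in base 64 that is
    # exactly 8 characters with no padding.
    return base64.b64encode(digest[:6]).decode("ascii")
```

The same passphrase with a different parameter gives an unrelated-looking password, which is what lets one passphrase cover many sites.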

I mentioned before that you should really use a master passphrase instead of a master password, because a compromised hashed password can be brute forced to reveal the master password. Unfortunately, the Hashapass website says "password" instead of "passphrase".

I made a Hashapass password cracker to test how practical this attack would be. You can grab it with Git,

git clone http://git.nullprogram.com/hashapass.git

The idea is that if a malicious website operator peeked at your password, knew you used Hashapass, and properly guessed the parameter (which isn't a secret), he could use a tool like this to brute force attack the password to retrieve the master passphrase. A short master password could easily be discovered.

Running on one machine with one instance of the program, my tool can break any password with five or fewer characters in a matter of hours. A 6-character password could take a month or two. A 7-character password would take a decade. Each additional character in the password increases the search time by a factor of about 100, since there are 95 printable ASCII characters to try.

If multiple computers/cores/processors are put to use on the attack, these times can be shortened: 2 computers would halve the time, for example. The attack is easy to parallelize.

My tool assumes a strong, but short, master password was chosen, as it checks against all printable ASCII characters. If a weaker password was used, and the attacker assumed this, the above time table would be much shorter.

So, for the master passphrase, use at least 8 characters generated using a strong random number generator. I recommend generating the passphrase with Diceware using 5 words.


Brainfuck Halting Problem

On my brainfuck compiler project, I proposed pre-calculation as an optimization technique. The idea can work, but it has an issue that will always be unsolvable: how do you know that the pre-calculation will halt? This is called the halting problem, and it has been proven impossible to solve in general.

The idea was that the compiler would run the brainfuck program up until the first input operation -- if there even was one. It would record all output and the final state of the memory. Instead of compiling the code that was run, it would compile code that would print all of the output and set the memory to the final state.

I had mistakenly assumed that it would be possible to detect a non-halting program and avoid doing pre-calculation on it. I described how it would be done and left it at that. Recently, someone kindly sent me an email containing only 5 characters:

+[--]

This defeated my ill-conceived idea.

Because brainfuck is Turing complete, it is actually impossible to determine, in general, whether or not an arbitrary brainfuck program will halt. A computer can't do it. A human brain (a fancy computer) can't do it either. It cannot be done, at least not in this universe.

So, if implemented, this pre-calculation measure will always be flawed.
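
One pragmatic way to live with the flaw is to give the pre-calculation a step budget and fall back to ordinary compilation when it runs out. This does not decide halting; it only gives up. The sketch below is in Python and is not taken from my compiler. The function name precalculate, the 30000-cell tape, and the wrapping 8-bit cells are all assumptions made for the illustration.

```python
def precalculate(program: str, max_steps: int = 100_000):
    """Run until the program halts, reaches input, or exhausts the budget.

    Returns (status, output), where status is "halted", "input", or "budget".
    """
    # Match brackets up front so loops can jump directly.
    jump, stack = {}, []
    for pos, op in enumerate(program):
        if op == "[":
            stack.append(pos)
        elif op == "]":
            if not stack:
                raise ValueError("unmatched ]")
            start = stack.pop()
            jump[start], jump[pos] = pos, start
    if stack:
        raise ValueError("unmatched [")

    tape = [0] * 30000  # assumed tape size
    ptr = pc = steps = 0
    output = bytearray()
    while pc < len(program):
        op = program[pc]
        if op == ",":
            return "input", bytes(output)   # stop at the first input operation
        if steps >= max_steps:
            return "budget", bytes(output)  # give up: might never halt
        steps += 1
        if op == ">":
            ptr = (ptr + 1) % len(tape)
        elif op == "<":
            ptr = (ptr - 1) % len(tape)
        elif op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ".":
            output.append(tape[ptr])
        elif op == "[" and tape[ptr] == 0:
            pc = jump[pc]
        elif op == "]" and tape[ptr] != 0:
            pc = jump[pc]
        pc += 1
    return "halted", bytes(output)
```

The program +[--] runs out of budget no matter how large the budget is, because the cell is always odd when the loop condition is checked. A program that simply takes a long time gets the same "budget" answer, which is why this is a fallback and not a fix.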


The Lazy Fibonacci List

In a project I am working on, I want to implement a large list using lazy evaluation in Scheme. The list is large enough to be too unwieldy to store entirely in memory, but I still want to represent it in my program as if it were. The solution is lazy evaluation.

One use of lazy evaluation is allowing a program to have infinitely sized data structures without going into the impossible task of actually creating them. Instead, the structure is created on the fly as needed. As a prototype for getting it right, I made an infinitely long list in Scheme that contains the entire Fibonacci series.

This function, given two numbers from the series, returns the lazy list. It uses delay to delay evaluation of the list.

(define (fib f)
  (cons (cadr f)
        (delay (fib (list (cadr f)
                          (apply + f))))))

Notice the recursion here has no base case, so without lazy evaluation it would continue along forever without halting. Now run it,

> (fib '(0 1))
(1 . #<promise>)

The rest of the list is stored as a promise, which will later be teased out using force. This forces evaluation of the promise. Here is a function to traverse the list to the nth element and return it. Notice, this does have a base case.

(define (nth-fib f n)
  (if (= n 1) (car f)
      (nth-fib (force (cdr f)) (- n 1))))

Here it is in action. It is retrieving the 30th element.

> (define f (fib '(0 1)))
> f
(1 . #<promise>)
> (nth-fib f 30)
832040

If you examine f, it contains the first 30 numbers until running into an unevaluated promise. This behavior is very similar to memoization, as calculated values are stored instead of being recalculated later.

These two functions are also behaving as coroutines. When nth-fib reaches a promise, it yields to fib, which continues its non-halting definition. After producing a new value in f, it yields back to nth-fib.

The way I called these functions above, however, can lead to problems. We are storing all the calculated values in f, which can take up a lot of memory. For example, this probably won't work,

> (nth-fib f 1000000)

We will run out of memory before it halts. Instead, we can do this,

> (nth-fib (fib '(0 1)) 1000000)

Because nth-fib uses tail recursion as it traverses the list, unneeded calculated values are tossed (which the garbage collector will handle) and no additional function stack is used. The Scheme standard requires implementations to optimize tail calls in this way. This will continue along until it hits the millionth Fibonacci number, all while using a constant amount of memory.
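
For readers who don't speak Scheme, here is a rough Python analogue of the two functions. It is only a sketch: the Promise class is a hand-rolled stand-in for delay and force, and nth_fib uses a loop because Python does not eliminate tail calls.

```python
class Promise:
    """A memoizing stand-in for Scheme's delay/force."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._thunk()
            self._forced = True
            self._thunk = None  # let the closure be collected
        return self._value


def fib(a, b):
    """Return a lazy list as a (value, promise-of-rest) pair, like (fib '(a b))."""
    return (b, Promise(lambda: fib(b, a + b)))


def nth_fib(cell, n):
    """Walk the lazy list to its nth element (1-based) and return it."""
    while n > 1:
        cell = cell[1].force()  # rebinding drops our reference to the old cell
        n -= 1
    return cell[0]
```

As in the Scheme version, keeping a variable bound to the head of the list keeps every forced cell reachable, while passing a fresh fib(0, 1) straight into nth_fib leaves nothing holding on to the cells already visited.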

It turns out that Scheme calls this type of data structure a stream, and some implementations have functions and macros defined so that they are ready to use.

So there you go: memoization, lazy evaluation, and coroutines all packed into one example.


Apartment Balcony Gardening

My wife was interested in growing a garden this summer. She has never done it before and wanted to learn. However, we live in an apartment so we don't exactly have a yard. Instead, we got some pots and started growing a balcony garden.

I thought we were being clever, but it turns out this is a common practice called container gardening. Searches online for "balcony gardening", "apartment gardening", and "container gardening" will bring up lots of useful information.

I think it's more convenient than regular gardening, as the plants are practically in the apartment. In fact, it's about 4 feet from our bed, through the balcony door. We can move the plants around to the other balcony, or even inside, if needed.

Gardening is a lot of fun, really.

This year, we are growing (or trying to grow) carrots, peppers, strawberries, catnip, and dactylis (cat grass). We'll see how things turn out in a couple months. Here's how it looks right now,

[Pictures: balcony garden, top balcony garden, Calvin inspection]


Vimperator Firefox Add-on

I recently learned about an excellent Firefox add-on called Vimperator, which I have been using for a few days now. It creates an extremely efficient Vim-like interface to Firefox. One of the main functions is to be able to browse completely mouseless.

Why mouseless? Because the mouse is a bad input device for many uses of a computer. It's a good choice for many games, like first-person shooters, or graphic design, like Inkscape or GIMP. But for tasks like text editing, word processing, and data entry, the mouse is one of the worst kinds of input device. The less you touch the mouse, the better.

Vimperator's argument is that the browser is better as a pure keyboard interface.

I am an Emacs person myself. I use it for text editing, file management, and IRC, but I appreciate the vi/Vim interface and accept it as being almost as good. Most of my vi experience actually comes from NetHack and Less. My main use for vi is editing my Debian sources.list so I can install Emacs.

Vimperator removes your toolbar, menu bar, and address bar. Then it transforms the status bar into the standard Vim status lines. This is because you don't need any of this stuff anymore with the Vim interface. It's traded for more browser real estate. This also creates the fun situation of watching your friends try to use your browser. At first, it really is pretty disorienting.

There is handy built-in documentation, found by pressing F1 or calling the :help command. You'll want to read through these before trying to do anything.

My problem right now is breaking my old Firefox keyboard muscle memory. Before Vimperator, my browsing was already fairly mouseless. I used keyboard shortcuts for everything. I had the Mouseless Browsing add-on installed and occasionally used it. When not using Mouseless Browsing, it worked out well because my right hand did the mouse, while most of the keyboard shortcuts could easily be performed with my left hand (C-tab, C-S-tab, C-t, C-w).

I think Vimperator has the potential to be even more efficient than that.

Probably one of the biggest adjustments is following links without a mouse. Like the Mouseless Browsing add-on, Vimperator assigns numbers to the links to be typed out. It is less intrusive though, because it doesn't reformat the page to show the numbers. It has a "hint" mode you go into for that. This mode displays the numbers over the links as red markers.

But even better than that, you don't generally even need those numbers. You can enter hint mode and begin typing out the text of the link. As soon as you reach a unique string prefix, it follows the link. This is the primary way I follow links, and I started doing this completely by accident. I wasn't even aware this was possible until I did it. Vimperator was completely natural in this respect.

Probably my favorite feature so far is automatic page advancement. I use it all the time now. One set of commands is C-a and C-x. These increment and decrement the last number in a URL, handy for those annoying multi-page articles. If they number the pages in the URL, this should handle it automatically. The other form of page turning is [[ and ]]. These search for links labeled "next", ">", "prev", "previous", and "<" and follow them. This works in Google searches and many web comics.
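
The URL trick is simple enough to sketch. This is my own Python illustration of "change the last number in a URL", not Vimperator's code; the name bump_last_number is made up, and details such as leading zeros or going below zero may well be handled differently by the add-on.

```python
import re


def bump_last_number(url: str, delta: int = 1) -> str:
    """Add `delta` to the last run of digits in `url`, if there is one."""
    matches = list(re.finditer(r"\d+", url))
    if not matches:
        return url  # nothing to increment
    last = matches[-1]
    value = max(int(last.group()) + delta, 0)  # assumption: never go negative
    return url[:last.start()] + str(value) + url[last.end():]
```

Calling it with delta=1 plays the role of C-a, and delta=-1 the role of C-x.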

A potential use for macros is quick data scraping. You can write a macro to automatically perform a series of actions, like save the current page and move to the next one, and have them repeat a number of times. It could also help in rapidly filling out the same form over and over, leaving only the CAPTCHA for manual input, if you were up to something mischievous.

For example, here is a macro to open in a new tab the first result of a Google search on the current page, then move to the next page. If you repeat it, it will open the first result on page 1, then the first result on page 2, and so on.

q s F 2 8 ] ] q

Note, the "28" may be different for you. To open the first result on the next 15 search result pages,

1 5 @ s

It is pretty cool watching it work away.

It's not perfect, though. Like Vim, you can prefix commands with numbers to repeat them, but this won't work with many commands, such as the page turning one above. You can get around it sometimes by placing the command in a macro.

Also, Vimperator still requires a mouse for many actions, like saving images. The worst part about it is these actions cannot be used as part of a macro. Hopefully Vimperator will improve in the future and fix this.

Give it a shot sometime. Like learning a good text editor for the first time, after you are set up, move your mouse out of reach so you are forced to use the keyboard. It slows you down in the short run, but you will be very fast later on down the road.

