Archived Comments

Pth and ncurses

(no comments)

Pendulum Waves

possiblywrong


Very cool! I remember first seeing this video on Richard Wiseman's blog (see here). He has a lot of interesting visual effects, illusions, etc.

Finder999

Is it possible to have the source code of this applet, and to have the possibility to reduce or raise the length of each pendulum (individually or all together), or to reduce or raise the speed of the set of pendulums?

Christopher Wellons

The source is linked at the bottom of the post (at "git clone"). All of those values are constants (directly or indirectly) in the Display class, so they're easy to tweak and play with.


Flip Foolflim the Dragon Traitor

(no comments)

Football Parity

Chris Wellons


My data have 253 games for 2008 and 255 games for 2009, so it appears I might be missing a few games. I probably also have some score errors. I basically just copy/pasted from the first source, then did some spot checking and filled in missing values using the second source, which appears to be more reliable.


I don't see any 2009 data on the website you linked, and the 2008 data lists a full 256 games. Where did 231 and 244 come from?


Also, thanks for that link. Those data should be interesting to toy with.

possiblywrong


After some discussion, I realized that I made too strong a statement in my earlier comment about tournaments. It is *not* sufficient to simply have no team win/lose all of its games for there to exist a Hamiltonian cycle. Strong-connectedness is the right equivalent condition, and it did not occur to me that there might be tournaments that are *not* strongly connected despite having no zero in/out-degrees. There are... but a minimal example requires 6 teams.

possiblywrong


You're right, Rasp only has data for seasons up through 2008. And when I found discrepancies I went to ESPN and NFL's sites to settle them.


The 231 and 244 were from the graph2008.el and graph2009.el extractions, which were just the edge counts in the corresponding graphs, the result of eliminating multiple edges when team A beats team B twice.

possiblywrong


The original graphic and a lot of the reddit comments seem to use this as evidence for "parity" within the league. But I don't buy it; it is not clear (to me, at least) whether Hamiltonian cycles might be extremely common, as long as no team wins/loses all of its games (A). For example, the NBA and NHL both have "tournament" schedules (in the graph-theoretic sense of the word), in which case there *always* exists a Hamiltonian cycle in any regular season satisfying (A) above.


After poking around with this problem, starting with your data, then going to http://www2.stetson.edu/~jrasp/data.htm for earlier seasons, the data didn't look right. The NFL currently has a 16-game season, so there should be 256 total games. The data here indicate 231 and 224 games, resp., in 2008/2009.


A closer look at 2008 suggests that the differences were due to (1) eliminating multiple edges when team A beats team B twice, which is not relevant to this problem; (2) two incorrect games, Cleveland beating Buffalo and Tampa beating Minnesota (actually, Rasp's data is wrong too, indicating that Seattle beat Washington); and most interestingly, (3) including Cinci beating Philly when they in fact tied! I made the same error in my initial extraction.


Interesting problem...

J.R. Parsons

Chris - is there a quick way to prove or estimate the existence of a Hamiltonian path in a data set of that size?  In particular, if we know that by (say) week 8, no teams remain winless or undefeated, is there a back-of-the-envelope way to take 32 teams with 4 incoming/outgoing paths on average, and arrive at an approximate probability of the existence of the Hamiltonian?

Does the NFL's scheduling setup improve the odds of a Hamiltonian existing? In particular, we can say with certainty that Team A will play its division rivals B, C, and D twice each, and that A will have 11 distinct opponents in common with B, C, and D? Also, those other eight rivals (E-H and I-L) are themselves divisional rivals?

Christopher Wellons

Heh, while I have a good understanding of the rules of football, I know little about its implementations. I've watched maybe three NFL games on television and only one in person -- and that's only because I was in the halftime show. I don't know how they determine the who, where, and when, so I can't say how likely it is to occur.

It would seem to be a lot easier to prove the absence of a cycle than the existence of one. Finding a Hamiltonian cycle on an arbitrary graph is NP-complete, so, until more progress is made on the P=NP problem, the only way to prove a cycle exists would be to find one.

As far as estimation goes, I guess you could use any of the NP search techniques for quickly finding local maxima -- as measured by path length -- and try to make some meaningful statement about the result. I can't think of anything other than that. My personal guess just from searching the 2009 dataset would be that in the absence of a winless or undefeated team, a cycle would most probably exist. Of the tiny space I searched, I found millions of cycles in the 2009 dataset, though they all shared a large common subset path; a local maximum, I guess.

J.R. Parsons

Okay, so off the top of my head: assume the "any given Sunday" adage holds true and that each team has a 50% chance to win a game.  That means that each team has a 75% chance of having an edge that points toward a divisional opponent (3 of the 31 other teams) and a 50% chance of having an edge that points toward a non-divisional opponent.  So for any randomly-chosen list of all 32 teams, there is a 6.4 x 10^-21 chance that the list happens to be a Hamiltonian cycle.

If your estimate of the search space (2^118) is correct, then that points to a very large number of paths probably existing.  

If win probability for each of 32 teams is drawn from a win probability distribution like this one (http://www.advancednflstats... and I take into account duplicate edges from divisional games, the probability of finding a Hamiltonian with a random draw from all possible match-ups drops to 2e-25.  Note that this is taken from the larger draw space (2^118) but it still looks like a full exploration of a randomized season should tend to have something like a few hundred million valid Hamiltonian cycles.

Probability and graph theory are not my strong suits -- I haven't taken a formal class in either -- but I think I'm a lot closer to understanding how these things come to occur.  Thanks for doing a lot of the legwork!

J.R. Parsons

Oh - on further reflection, your discussion of local maxima and winless teams makes me think that searching for a handful of most-likely minimum cases could rule out the most likely scenarios under which no Hamiltonian cycle would exist.  Check right away for the likelihood of a winless or undefeated team; then add the likelihood of two near-perfect divisional opponents going 15-1 or 1-15 so that each has only the other upstream, and admits no third node upstream.  Then look at the vanishingly-small probability of three teams going 14-2 such that each has only the other two upstream.

Conversely, the likelihood that a 15-1 or 1-15 team will exist without being locked down as above actually helps find a Hamiltonian cycle, because that team has only one incoming or outgoing edge on one side, and so it can be considered a supernode with its neighbor... yeah?  And every Hamiltonian cycle would have to pass through that node the same way.  


Knight's Tour

Gavin Black


I think this is actually a problem with the code, running it on another machine in Cygwin I get: "Error while dumping state(Probably corrupted stack) segmentation fault(core dumped)".

Gavin Black


Is the n×n Knight's Tour always solvable? I couldn't find online whether it is or not, and if it's not, it could explain why some of them hang.

Chris Wellons


Cygwin gets really weird. If I run it with a 128x128 board (and other various sizes) it drops out without completing and without reporting any errors. When I say "without completing", I mean that it doesn't print a board or "No solution." and one of those should always be output. It also returns success to the shell.


I can't get the error you reported, though the same thing might be happening, just silently.


If I run it with a 55x55 board it locks up, though on GNU/Linux it spits out an answer after a few milliseconds. No idea why it hangs.


Now check this code out,


#include <stdio.h>

void recurse(int x)
{
    if (x == 0)
    {
        printf("%p\n", &x);
        return;
    }
    recurse(x - 1);
}

int main()
{
    recurse(129273);
    printf("Done.\n");
}



With Cygwin, gcc with no options, on my system here at work this works fine. But change that 129273 to 129274 and it will crash as described above. The stack frame size here is 16 bytes. So this means after 2068368 bytes (~2020kb) of stack something bad is happening.


If I add the -mno-cygwin option, it can go slightly further. On GNU/Linux I can take it to around 500000 before it segfaults.


Is it running into the text segment here? Note the size/value of the printed pointer. Pretty low.


My conclusion is that the crashing is the program running out of stack space. A 128x128 board needs 16,384 (+ some overhead) calls on the stack. Multiply that by the stack frame size of the knight's tour program and it goes beyond the roughly 2MB of stack space made available.


Is there a compiler option for providing a bigger stack? The only available course of action might be to use iteration and build a stack manually in the heap, where I have GBs to work with rather than a couple of MB.


This has been a little lesson on the limitations of C for me.

Chris Wellons


I really don't know. I assume after a certain size that any board has a solution, because it can be broken down into smaller boards that have solutions.


Software Serial Codes

Gavin Black


I would think the easiest attack against anything that doesn't phone home would be to get a serial key from someone who legitimately has one and just use that. Still interesting that brute forcing is minimized with an even split between the checksum and the serial number.

Chris Wellons


In the case where the software talks to other instances of itself (a multiplayer game, for example), it might compare serial codes and, if they match, refuse to talk to each other. You would have to (in order of general difficulty) either purchase another serial code, figure out how to make your own serial codes, or modify the software's behavior. Also, future versions of the software may reject published serial codes. Share a serial code too much and it stops working.


Brute forcing might not necessarily be minimized by an even split. I suspect my implementation might benefit from a larger checksum and smaller code, because it's harder to iterate over the code segment than the checksum segment: not every checksum has a matching code, but the reverse is always true. But in general the brute force attack will be iterating over the smaller set. An even split maximizes the size of both sets.


Emacs UUIDs

Ron


Case in point of not relying on data being available on the internet: Martin Blais' site is down.

Chris Wellons


Thanks for pointing that out, Ron. Perfect example of what I was talking about. I've added a local mirror of Martin's code next to the original link.

Ron


djcp has some info on getting IP data from win32 here (as well as Linux, and probably other Unixes including Mac OS and Cygwin).


http://emacs-fu.blogspot.com/2009/05/getting-your-ip-address.html


So a hack that I derived from that can be found here


http://gist.github.com/399236


I currently don't have any Windows machines to test, and the win32 info is only based on googled screenshots. I would be curious to know if it works on an actual win32 build. And rereading your post suggests other hardware resources are needed. Well, the MAC address I'll probably need at some point anyway.


Chris Wellons


Yes, that works in Windows, but with the caveat that ipconfig uses dashes instead of colons. You must know since the string-match pattern is tuned to it. I actually had a problem in Linux because ifconfig is in /sbin, which isn't in my regular user path. I changed it to an absolute path in the code.
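If you ever wanted the pattern to be portable, a hypothetical one-liner could normalize the dashes first (mac-string is just a made-up name for the matched address):

(replace-regexp-in-string "-" ":" mac-string)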


So you've solved it for GNU/Linux and Windows, so now you have to solve it for Macs and the other dozen operating systems Emacs runs on. :-)


I think I prefer the pseudo-random methods anyway, as 128-bits from a decent PRNG with proper seeding should always be practically unique.
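A minimal sketch of what I mean (not the post's code; the function name is made up), assuming (random t) has been called once to seed the PRNG from system entropy:

(defun my-random-uuid ()
  "Return a version-4-style UUID string built from Emacs' built-in PRNG."
  (format "%04x%04x-%04x-4%03x-%04x-%04x%04x%04x"
          (random 65536) (random 65536)     ; time_low
          (random 65536)                    ; time_mid
          (random 4096)                     ; time_hi, version nibble fixed at 4
          (logior #x8000 (random #x4000))   ; clock_seq, variant bits forced to 10
          (random 65536) (random 65536) (random 65536)))  ; node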


Elisp Function Composition

Gavin Black


Just have to point out that Haskell has implicit partial application and function composition ;)


Something like:

foo x = (+) x
bar x y = x - y
faz = foo . bar 2
faz 3 4




is perfectly valid
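
For comparison, a rough Emacs Lisp sketch of the same idea, assuming lexical binding and a made-up my-compose helper (Emacs has no built-in composition operator, which is what the post is working around):

(defun my-compose (f g)
  "Return a function that applies G to its arguments, then F to the result."
  (lambda (&rest args) (funcall f (apply g args))))

;; Partial application has to be spelled out with `apply-partially':
(funcall (my-compose #'1+ (apply-partially #'- 10)) 3)  ; => 8, i.e. (1+ (- 10 3))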

Mark

Works great. Thanks for sharing.

Yu0

Nowadays it is probably better to use the dash.el function -> or the more convenient macro -->. E.g. (--> 13 (prin1-to-string it) (concat it "::" it)) will give the string "13::13".

Seems like that function was added to dash.el around 2012 [1].

[1] https://github.com/magnars/...


Sudoku Applet

Chris Wellons


Yeah, I like cranking up all the static code checks on all my (personal) projects, no matter the language. With Java, I use Checkstyle and compile with the -Xlint switch, which issues a warning when a serialVersionUID is not provided when extending a Serializable class. Emacs generates the actual UID for me, so it's very little effort to do.


But I'm not actually doing any serialization in this particular project. I'm just dotting my i's and crossing my t's for the bureaucratic Java libraries.

Gavin Black


Interesting, I was thrown by the serialVersionUID at first since I'd never seen it before. I never knew there was a standard way to do serialization in Java.



It seems unfortunate this isn't inherent to objects themselves like it is in most other languages (i.e. JSON in Javascript, memcpy for C structs, print/read in Lisp, and read/show in Haskell ;). Actually looking at Wikipedia it looks like the considerations were: some classes are tied to JVM state, might have incompatible class versions, and it would allow access to private members. To me these seem like complete cop-out reasons though :P


Java Applets Demo Page

Chris Wellons


Hmmm, the model for a double pendulum looks pretty complicated. It's going to take some time for me to understand it.

Mike Abraham


This is awesome. I propose another simple physics simulation that demonstrates chaos: the double pendulum.


Introducing Java Mode Plus

(no comments)

A Modest Chess Engine

possiblywrong


This is very cool, thanks for posting it! I just tried it out and was promptly beaten.


I like the idea, as you suggest, of pitting AIs against each other. I have wanted to do this with a couple of recent projects of mine (Pente and Keryo-Pente, and Carcassonne; see links below), but I also have not been at all interested in climbing around in an icky mediator API.


There is another way, though, at least for "one-offs" against a particular opponent. A handy approach that I used in both games was to *not* distinguish any player as human or computer ahead of time, but instead, let the user on any given turn choose to play a specific move with the mouse, or press space to "do what the computer would do."


This way, I can pit my AI against some other AI (e.g., Mark Mammel's WPente), by running both versions simultaneously. I play WPente as a human; when the WPente AI moves, I make the same move in *my* version manually (e.g., like a human). I then use my AI to make the next move, which I "sneakernet" over to WPente.


Thanks again!


http://possiblywrong.wordpress.com/2010/05/31/pente-from-the-apple-to-today/


http://possiblywrong.wordpress.com/2010/09/06/carcassonne-update/

Michael Abraham


I actually came by here to dig up a copy of MATBOTS. This chess engine is cool stuff, Chris! I beat the snot out of it but it is still impressive.

Roberto Munter


Fantastic!!

Chris Wellons


I did do some sneakernet play between GNU Chess and my AI, which is how I know with certainty that my AI isn't very strong. That's a good idea, stepping the computer moves manually. That would make it easier to analyze the game rather than the rapid back-and-forth that would normally happen.


Those two posts you linked are very interesting, and sound similar to my own experience.

Chris Wellons


I found some more information on engine strength and search depth: Question for people who know something about computer strengths and human ratings...

Folkert van Heusden

Consider implementing the UCI/XBoard protocol in it. You can then interface it to internet chess servers, let it battle, and find out what Elo rating it has.


Lorenz Chaotic Water Wheel Applet

(no comments)

Jump to Java Documentation from Emacs

Timo

This is wonderful, thank you! I especially like the new automatic importing.


Emacs Set Window to 80 Columns

Max

Thanks for this post. Exactly what I was looking for.

Ted Mielczarek

I realize this is a very old post, but I couldn't get this to work in Emacs 24. Your code would give me the error "No window on the right of this one". I poked around a bit and I think maybe it's trying to resize the minibuffer window in modern Emacs? In any event, I rewrote it to resize the current frame instead and it works now. Here's my code:



(defun set-80-columns ()
  "Set the selected frame to 80 columns."
  (interactive)
  (set-frame-width (selected-frame) 80))

Christopher Wellons

In the article when I say "window" I'm using the Emacs term, which is that pane inside the OS window (Emacs frame). There needs to be another window to the right to fill in the space beyond 80 columns, which is what Emacs was complaining about. It still seems to work as intended (equally as crudely) with the latest Emacs here in 2017. It sounds like the frame version you figured out is what you really wanted in the first place. Thanks for sharing it!


SampleJavaProject

(no comments)

Emacs Find All Files

Oliver


Handy, thanks!


Elisp Higher-order Conversion to Interactive

E Sabof

There is a small error in your post. `compose' isn't part of standard Emacs.


Throw Up a Quick HTTP Server

possiblywrong


You saved me some time, too. I found this post deeply embedded in a Google search while trying to tweak my x64 Ubuntu 10.2 installation without an internet connection. (My work desktop is Windows, but I have another KVM-switched standalone Linux box).


I think the problem is even worse than you describe. I was using the CD (not the DVD) burned from the .iso, which mounts as some long volume name... but apt seems to hard-code the location where it expects the CD to be (/media/cdrom, I think). Serving the CD from Python works great, though.


Thanks!

Chris Wellons


That exactly describes the problem I was having with the DVD, but I had assumed it was only due to it being the less-popular DVD. I'm a little surprised to see it's totally broken for both types of media. Looks like nobody is testing it.


Off topic question: you say you found this post deep in a Google search? I noticed you have a post linking to the Howard County Times. I currently live in Howard county, and if you do too that's an interesting coincidence. Or perhaps Google does something with search results varying by locality.

Michal Zuber

Yeah, really cool stuff, added to my bash aliases:
alias webshare='echo "Visit http://$(hostname):8080/"; python3 -m http.server 8080'


Middleman Parallelization

(no comments)

Java is Death By A Thousand Paper Cuts

Chris Wellons


JRuby is something I've been interested in, but I haven't really looked into it. The JVM language I'd really most like to get into is Clojure, Lisp on the JVM. The problem is the serious lack of support: my package manager doesn't handle it; Clojure Slime is a huge pain to set up, has no documentation, and is half-unmaintained; and Clojure contrib is even worse.

Luke Maciak


Oops... You already mentioned Clojure. Oh well. :P

Luke Maciak


I feel your pain man. It's even worse when I switch to C++.


Then again Java was what I used all throughout college so I tend to forgive a lot of its flaws due to nostalgia and stuff like that.


I guess it's probably too late for this, but next time you need to write for the JVM maybe try Clojure? It is a Lisp dialect written for the JVM.


I have only played with it a little bit and I think it is a bit different from say Common Lisp but it is worth giving a shot if Java is driving you nuts. :)

Jeff


Have you considered Jython and Nailgun, or maybe JRuby? I never got along well with Java.


Distributed Computing with Emacs

(no comments)

Elisp Memoize

(no comments)

GIMP Painting

(no comments)

GIMP Space Elevator Drawing

(no comments)

Emacs Byte Compilation

(no comments)

Elisp Running Time Macro

Chris Wellons


I've realized that this function could be a lot shorter, cleaner, and more correct just by using float-time. So I fixed it. Enjoy!
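
For reference, a minimal sketch of the float-time approach (not necessarily the exact code now in the post; the macro name is made up):

(defmacro my-measure-seconds (&rest body)
  "Evaluate BODY and return the elapsed wall-clock time in seconds."
  `(let ((start (float-time)))
     ,@body
     (- (float-time) start)))

;; Example: (my-measure-seconds (sit-for 0.5))  ; => roughly 0.5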

Chris Wellons


Yeah, I love Emacs. It's a really fantastic editor/IDE and portable programming environment. And thoroughly self-documented. You mentioned running code in any buffer: if you fire up Emacs info and open one of the elisp manuals, you can execute the examples right inside the manual as you read it. Very convenient.


I keep using elisp, rather than other lisps, because I don't even need a separate interpreter -- it's built right into the editor -- and all of the documentation is right there. What does this function do? C-h f function. Want more depth? M-x elisp-index-search function.


Now if only elisp had closures, bignums, modules/namespaces, a C interface, continuations, and was reasonably fast (i.e. it was more like Common Lisp or Scheme) it could be used as a serious programming language. Without these, it's still good for writing little experiments. I suspect it will gain most of these features someday.


While I am wishing, I wish it had a flexible regex type, like Javascript or Perl, too. Putting regular expressions in strings is silly.

Lukasz Grzegorz Maciak


Nice! I'm bookmarking this for future reference. I'm pretty sure this will become useful at some point.


Btw, all these recent elisp posts made me start messing around with Emacs again and I love the fact that I can type a function into a buffer, and then run it on that very same buffer. :)


Emacs ParEdit and IELM

Chris Wellons


Yup, I've used eshell a bit, mostly at work (to avoid the horrible cmd.exe). But since it doesn't live in a real terminal emulator it has limited usefulness. It's also lacking documentation at the moment, so it can be hard to figure out how to make it do some of the cool things it can do, such as redirect input/output to/from buffers.

Joseph Gay


Hi, you may already be aware of eshell. I just thought I'd mention that it supports lisp input.

Egarrulo

Thank you for this useful snippet of code.

Regarding Eshell's lack of documentation, have a look at this tutorial:
http://www.masteringemacs.o...

What do you mean by "since it doesn't live in a real terminal emulator it has limited usefulness"? Maybe because it doesn't work with console programs which take up all the screen (like vim, mutt, etc.)?

Christopher Wellons

Yup, you've got it. It can't do curses type stuff because it doesn't interpret escape sequences (ANSI, VT100, etc.) like a real terminal would. This isn't eshell's fault, so it's not a complaint from me. Emacs *does* have the ability to be a terminal emulator (ansi-term), but it unavoidably interferes with Emacs' own bindings.

essay-on-time writing service

This is really wonderful and useful for those Emac users who wanted to make use of this extension. It has a unique feature that can give a person some output that they need.

Andrew Kirkpatrick

I know this is a little late, but you can put that same advice on 'inferior-emacs-lisp-mode in order to have the parens appear at the first prompt.
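
For anyone setting this up from scratch, a minimal way to get ParEdit into IELM buffers in the first place is a plain hook (not necessarily the post's advice-based approach, and the RET behavior may still need the post's tweak):

(add-hook 'ielm-mode-hook #'paredit-mode)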


Elisp Printed Hash Tables

(no comments)

Identifying Files

(no comments)

Neat Random Inn Generator

(no comments)

LZMA and XZ Binaries

Gavin Black


Looks like it was mysteriously added to lzma and xz utils pages :P

Chris Wellons


Ah, it does, and I just tested at work to confirm. Thanks for pointing that out. I've never used the beta versions before so I didn't know.

Jeff Borck


7-zip now seems to support XZ, as of 9.04 beta.

Gavin Black


Nice job! You ought to add that to the Wiki page or somewhere visible, since I think you have the only native Windows xz utility.

Chris Wellons


Eh, I don't want to put my own link on Wikipedia. That's bad form and bad etiquette. But if you wanted to do it I can't stop you! ;-)

Midzuki


Thanks!


Yes, yet another useful item for my software collection.


^__^


Scheme Live Coding

(no comments)

Scheme Neural Network

baoxiang pan

cool!!!!

amz3

thx!


Neural Network Blackjack Game

abcd

Stumbled upon this. Interesting method and nice write up. Thanks.


Emacs cat-safe

Kelsey Wellons


I wonder whose cute little paw that is? :)


Function Pointers are Special

Chris Wellons


NealAbq, I think that's exactly what's going on, at least in the case of x86. The x86 ISA has different models for object code, medium and compact, where the function pointers are different sizes than data pointers. I found a discussion of it, including an example of this in action, on Stack Overflow: Can the Size of Pointers Vary Depending on what’s Pointed To?. It also links to this comp.lang.c FAQ question: Question 4.13.


I'm going to add those links to my post now.

NealAbq


Thanks. I'd almost forgotten about small/medium/compact/large, although they still matter sometimes if you're doing embedded systems.


Apparently Intel gives us 6 memory models, although only medium and compact have skewed-up pointers:


http://en.wikipedia.org/wiki/Intel_Memory_Model

NealAbq


When are function pointers different? Is it a "near" vs "far" pointer thing, where you tell the compiler to assume data pointers are all "near" unless explicitly marked "far", and all function pointers are "far"?

Gavin Black


Finally something to replace giving printouts of Whitespace programs to interviewees and asking them to debug it :-D


Common Lisp Quick Reference

_

Greatness is here 0 COMMENTS!?!!?!?!?!?!!?!?

celestine virgen

Interesting article . I Appreciate the facts , Does someone know if my business might be able to acquire a fillable HUD-1 document to edit ?


Wisp Screencasts

(no comments)

Wisp Lisp

Chris Wellons


Hmmm, that can only mean that something is passing a size of 0 to realloc() at some point. It returned NULL and it was thought to be an error.


That's the kind of stuff I'm working on ironing out right now. The number of defined functions will expand once I have the core pretty solid. I look forward to the point where I'm just adding interesting language features, exposing interesting library interfaces, and optimizing things.


I'm also nearing the point where I can start benchmarking its performance. Right now it seems to be on par with non-compiled Emacs lisp. However, I have the advantage of much fewer language features.


Setting up a Common Lisp Environment

(no comments)

Magick Thumbnails

(no comments)

Game of Life in Java

(no comments)

Tweaking Emacs for Ant and Java

Gavin Black


A very similar effect can be achieved with a real editor using ant_menu.vim plugin. You simply copy it to $VIM_HOME/plugins and then: ,b to build; ,f to find; ,t to build a target; etc. All of which go into a separate buffer on a split screen.

You can run the program using :!whatever_command. Or if you want to be fancy and run it in its own buffer you could do :e name then :%!whatever_command.


Chris Wellons


As an addendum:


What's also nice about this, and Emacs in general, is that the setup is portable. That is, they make me use Windows at work for some stuff (i.e. management making decisions they didn't quite understand), but I can still set up a nearly identical development environment as the one I enjoy on GNU/Linux. If you have javac and ant in your path, you're all ready to use my stuff in the post.


However, you might want to adjust the "/" part to suit the weird filesystem layout Windows uses. Is there a generic elisp function for detecting the filesystem root?

ckoch

You can also set compile-command, which defaults to "make -k", to "ant -emacs -f" by adding the following to your init.el: (setq compile-command "ant -emacs -f")


First Maryland Snow of 2009

(no comments)

Emacs Web Servlets

Rick Hanson

This is freakin' awesome!  Thanks for sharing.


Your BitTorrent Client is Probably Defective by Design

Ahmed Fasih

Alas, as of 2016, BEP 27 Private Torrents is officially listed as “accepted”: https://github.com/bittorre...

Christopher Wellons

Honestly I'm surprised it only happened that recently. It was already a de facto standard more than a decade earlier. I just wish open source clients allowed it to be overridden/ignored by the user, just as how most of the open source PDF readers ignore PDF DRM flags (no-print, etc.).


Comments Upgrade with Avatars

stephy


The picture of spam totally cracked me up. I know nothing about code and computer speak but the Spam did resonate with me. :)

Matt Stine


hawt.

Chris Wellons


I actually swiped that image right from Wikipedia, which is indicated, as required, by mousing over the image to see the image's title attribute. Thanks to the Free licensing Wikipedia uses I can do this without risking some kind of Fair Use/copyright infringement thing.

Chris Wellons


Check out that sexy avatar to the left!


Unorderable Sets

Gavin Black


Hmmm, it is pretty hard to come up with items that don't imply order and are easy to compute with. Your word idea works, but you'd have to use something expensive like a hashmap.



You could also use Unicode to get non-alphabetical symbols (mathematics, punctuation, etc.), which would still have an underlying number associated with them. I don't think Asian languages have an ordering associated with their characters either, since they are essentially ideograms.

Luke Maciak


I submit that it is impossible to create such a set. As soon as you list the members, it will become possible to design a hash function that will transform that set into a linearly orderable set of hash values. This is of course only necessary if there is no explicit presentation order.


Ordering and indexing things is just something that we humans do, and we do it fairly well. So I would say that any attempt at enforcing un-orderability of a set will fail, as sooner or later someone will come up with the ordering.


But if you want a set without an inherent ordering, why not just assign each person in your village a GUID. They still can be ordered and indexed quite easily, but this ordering is less intuitive than that of social security numbers, for example.

Chris Wellons


@Gavin: I thought that maybe another culture would have some kind of assumed unordered set. I'll have to check up on that.


@Luke: Yup, anyone can easily define an order to a finite set (I think any infinite set would be ordered by its very definition), so I guess the best we could hope for is designing the set such that defining an order is difficult and the defined order is unintuitive.


I like the GUID idea. It's similar to my mention of assigning large numbers in a semi-random way. The assignment process could be a hash of (typically) unchanging physical properties like height, eye color, hair color, skin color, date of birth, blood type, handedness, length of right index finger, etc. If villagers understand how naming assignments work, smaller identifiers could be used and no one would assume a meaningful ordering. (Not that I condone imprisoning innocent people in an Orwellian village.)

Tino


I think (but am not sure) that every set can be assigned an order. It is an interesting question though. My first idea for avoiding ordering was a continuous 2D surface of points. But even there you can create an order by, say, the distance to the origin and the angle to the x-axis. My guess is that any finite or infinite set can be mapped to points in an N-dimensional coordinate system, and then a similar ordering in N dimensions can be used.


But it seems you are also hinting towards a more cultural question; if there is any "set" in our culture that we think of as inherently unordered. For that, how about a set of colors?


I think most humans think of colors as an unordered set, even though it is of course *possible* to order them in many ways. A good test would be to give a number of colored rectangles to people and the task to "please order these". With colors you would probably get widely different results, while with e.g., names, words, and social security numbers, almost everyone would just sort them alphanumerically.


SumoBots Programming Game

jb


Interesting, thanks for sharing !


null program Turns Two Years Old

Gavin Black


"beautilluminate"


Wow a word that has no results on google. I guess it will now, or will fade into crastelanimousity

David Engel


Congratulations! I'm just on the starting end of my blog, but I have to agree: Blosxom is a great choice.

Someone


beautilluminate

Chris Wellons


@Gavin: Hmm, I thought I read somewhere about the blog photograph claim. I can't find a source for it to prove it, though. So maybe I am wrong. In any case, Ferrari Guy reminds me a bit of Limozeen.


And the "preview" goes in the "preview" area. Notice the gap above it? ;-) That's why.


@David: Where's your blog, David? Drop us a link.

Gavin Black


Congrats on 2 years of having your site :)


I'd be interested on seeing the study on portraits on websites. I have the opposite intuition, the reader can make up the image of you they want and not pass judgment based on looks. I think I feel that way because when I was little I saw ESR after reading the Jargon File, and it was like being hit with a bat :)


Actually I take that back after seeing Ferrari Guy, site portraits FTW!


Gavin Black


I just noticed something kinda weird. When you preview a comment it is put under the others, but when you submit it goes on top :p

Matt Stine


Congratulations, Chris! I'm barely consistent enough to sign on to Jabber once a week, let alone write a blog post every seven days! Here's to another two years! :)

Chris Wellons


I had also checked that against Google and noticed it had no results, so I don't know what Someone was intervocabmunicating. I'm going to have some unique search terms pointing here now.


An Entropy-Efficient Virtual Dice Rolling Algorithm

Elsie Howard

Nice post. Cristopher, your project is awesome, this algorithm is perfect. I like your work. Thanks for sharing here.


Lossless Optimizers

Kelsey Wellons


How about MS Paint? :)


IRC Random Number Generator

Chris Wellons


"You could inject small amounts of other random data periodically to deter anyone listening in."


Perhaps with Fortuna?

Gavin Black


The idea of network based RNG doesn't seem like a bad idea. You could inject small amounts of other random data periodically to deter anyone listening in.


Or better yet if you try to find the delta as a value of CPU clock cycles you'll get clock-drift randomness too.


Not sure why you picked IRC though, since it's so slow (/b/ would be good if you could rely on it being up).
Seems that even network latency itself could be used to generate a lot of data pretty quickly, e.g. repeatedly fetch from an ad server (since they often serve up different content each time) and measure the number of cycles that occurred.


Flying Spaghetti Monster in the HTML 5 Spec

(no comments)

Web Pages Are Liquids

JKirchartz


Definitely true, webpages should be liquid and accept changes in font-size for accessibility and so that they don't look terrible on screens of different resolution (iPhone to HDTV)


But typography is important in that line length is really closely related to readability. It's easier to read on a per-line basis than per-word.


To be liquid and follow those rules change all pixel measurements to em. http://en.wikipedia.org/wiki/Em_(typography) has some good links and data.

David Engel


I agree, to a point. I don't like non-liquid designs generally, especially because of the issues lower resolutions have, but on machines with very high resolution, the extended line length becomes a usability issue - it is accepted that paragraphs that are very wide - I think more than 20 words wide - are difficult to read for the average person. That's why I am in favor of setting a maximum width based on the font size, which will leave the empty space you are noting.


Ad-blocking and the Regrettable URL Format

Gavin Black


So you're advocating the Java package naming convention :)


I don't think it would be too hard to write something that translates URLs back and forth between the two, since the first 3 are the only reordered ones.


Also I'm pretty sure http://*.whatever.com/* is supported in newer versions of adblock. I have several entries like that and just tested my doubleclick one.

Chris Wellons


Yup, just because Java did it doesn't necessarily make it bad. (But that's a good rule of thumb! :-P)


And yeah, that pattern works in Adblock Plus just fine. I'm just tired of typing it. Adblock Plus provides a list of likely patterns that is one item too short. Let's say the single-pixel tracker was at "http://beacon7.example.com/tracker/pixel.gif ". Adblock Plus would present me,


http://beacon7.example.com/tracker/pixel.gif
http://beacon7.example.com/tracker/*
http://beacon7.example.com/*
Text box for a custom filter



I'd like another quick option available for me so I don't have to type it out every time,

http://*.example.com/*



That's what I do use, and it's the most common pattern in my blocklist. This pattern is less efficient than the alternate-universe URL where we need only one glob. It's also erroneous, as in the case of, say, a wiki documenting this as a bad website,


http://example.net/wiki/beacon7.example.com/screenshot.png


Our screenshot would be blocked. To resolve this, Adblock Plus has some new kind of pattern that only matches domain names. I haven't used it yet, though.


You mentioned translating the URL, an idea which was also in the back of my mind. Adblock Plus could work on translated URLs, potentially simplifying, optimizing, and disambiguating patterns (and translations should be cheap). It introduces an ambiguity because this maps many-to-one.

http://wiki.example.net/
http://example.net/wiki/


Both translate to,

http://net/example/wiki/


Now that I think about it, the alternate URL scheme would have some of its own issues: instead of DNS resolving subdomains to different machines, it would have to be handled during the HTTP handshake by the server. Instead, maybe something like this one-to-one mapping would be more desirable for the above two URLs,

http://net.example.wiki/
http://net.example/wiki/



(Oh, and RFC2606 suggests using reserved names, like example.com or example.net, for testing and documentation. ;-) )


Three Rivers Stadium Implosion

(no comments)

Television Commercials

(no comments)

Dry Ice Potato Gun

Luke Maciak


This is officially one of the more awesome things I have seen lately. :)

Matt Stine


Chris,


http://hackaday.com/2009/08/11/dry-ice-cannon/
is a slightly more advanced version of our original invention:
http://nullprogram.com/blog/2009/07/20


Just thought I'd share, and I have to wonder if these people saw your blog post when coming up with their idea. :)


Matt


PS, I just saw your awesomely hideous "OMFG YOU USE INTERNET EXPLORER" banner, since I loaded your blog from work using IE6. :O Nice touch!

Chris Wellons


@Luke: Thanks!


Looking more closely at the video I see that the camera shakes when the gun fires. I think this adds a nice, subtle visual effect that makes it seem more than it is.


Lisp Fantasy Name Generator

anonymous


I use mine to create names for my, uhhhhh , World of Warcraft characters :o


Converted to HTML 5

(no comments)

Browser URL Mangling

(no comments)

The Emacs Calculator

Chris Wellons


Oops, I forgot to include errors and intervals!

anonymous


wow, thanks, i've only used calc for very basic stuff apparently; that unit conversion stuff and the symbolic forms are great :)

Guest

Actually, as early as 1994 I used Emacs calc for probability density function calculations for queueing theory. I made a small extension to Emacs to do Laplace transforms, and it worked beautifully. It's a wonderful tool.

Penguin Pete Trbovich

Although I've known about this for a while, this was an awesome tutorial to help explain Emacs calculator mode to my kids.

Roy Schiramael

Here is a bunch of helpful tips from the Google + Emacs community: https://plus.google.com/com...

Also see this guide : http://blog.everythingtaste...


United States Hamiltonian Paths

Kelsey Wellons


I found a path by hand in a minute!

Chris Wellons


I accidentally posted this earlier today after I only wrote the title, which some RSS readers may have picked up on. This is the real deal now.


Javascript Distributed Computing

Gavin Black


Oops I forgot a zero: Mn - a*x <= 0

Gavin Black


Okay, so I've been thinking about this. I have an idea but it takes a ton of space on the host's side.



1. Give the user an offset number(x) and have an offset constant(a)

2. Have the user check if any of the numbers [Mn - a*x, Mn - (a-1)*x ) are divisors of Mn.

3. If a divisor is found return back x and the location found, else return x and something like -1 meaning no divisors in that set.

4. If a divisor is found by foo distant IP addresses the host tries the next number.

5. If you get to the point where Mn - a*x is <= then it is prime



The main issue is the host is required to store foo*(Mn/a) database entries for each test, which would be quite large.



P.S. the ol, ul tags don't work in comments. Not sure if you want them to

Chris Wellons


If Mn - a*x <= 0, then a and x together are about the same size or larger, in terms of computer storage, as Mn. I think I can prove that. But they won't have the nice small form to represent them. Both a and x have to be sent to the node.


I also think it would be so computationally costly that testing even a single number couldn't be completed by all the browsers in the world in any reasonable amount of time. These are gigantic numbers we are talking about: 13 million base-10 digits.


And wouldn't Mn/a database entries probably require more storage than the whole universe could provide? Unless a is extremely large (and therefore can't be reasonably transferred to a node or iterated over).


Maybe I am misunderstanding your idea.


P.S. If you notice the note above the comment box, ol and ul (and therefore li) tags aren't in the approved list. I guess I could add them sometime (they are a bit more special than some other tags).

Alix Axel


Well finding Mersenne primes with the Lucas-Lehmer test will never be appropriate for a distributed computing JavaScript environment; there are only a few problems that I'm aware of that are suitable for JSDC:


1) Factoring Composite Numbers
2) Testing the Collatz Conjecture aka "3n + 1 Problem"
3) Generating Rainbow Tables

Gavin Black


I didn't explain myself very well the first time, and I may not again today due to lack of sleep, but here goes an example.


I want to test the number 2^6969 - 1 this shall henceforth be known as M
I have in my database c entries where c is some constant number
Now assume c is something that has a small representation, for this example let c = 2^6919
So (2^6969 - 1)/(2^6919) ~= 10^15, and this is what the user gets.


But anyway you give to the user an offset value(s) between [0,10^15] and M and have them step through each number(n)
(M - c*s) to (M - c*s + s) and return back if it found an n s.t. n | M


So the transfer between the user and server is pretty small, but the space requirements for the server are horrendous. Also the computational time would be awful. Except that I would think, but have absolutely no proof, that most numbers that aren't prime wouldn't take too long to find a divisor for if you start from small numbers, and you could maybe do a better local test once you've narrowed down a probable number.


These were just thoughts, not saying they are at all feasible...but it does cut down on the communication between the server and the client.

Alex

There is one project about distributed javascript calculations.

https://zlelik.blogspot.com...


E-mail Obfuscater Perl One-liner

Kelsey Wellons


Here's a good one-liner: Those that forget the pasta are doomed to reheat it.

Chris Wellons


What about array_map()? This actually seems like it would actually be even longer, because you would have to use create_function(). And I thought "lambda" was a long keyword/function name (the language Arc calls it fn to be nice and short).


Wow, look at create_function()! It takes in two strings which have to be parsed and compiled at run time. And being in strings it messes up working with the code. Yuck! Good luck making anything but a trivial anonymous function with that.


Luckily you PHP people get real anonymous functions, with closures to boot, when 5.3.0 eventually comes out. Until then you have to make do with functions in strings. I didn't realize how bad PHP programmers had it. :-P


Elisp Wishlist

Anonymous Coward


See Doug Hoyte's "Let Over Lambda" for an implementation of some of your regexp-related wishes in Common Lisp.


Doing Comment Previews the Right Way

Chris Wellons


Fixed the margin thing: it's all CSS now.

Gavin Black


main()

{

Indented_Code_line(1);

Indented_Code_line(2);

}

Chris Wellons


Incidentally, the short URL for this post is "cows".

Chris Wellons


That last comment would be my wife messing with me. :-D

Gavin Black


Preview seems to be working pretty good...except I can now blow your margins, not sure if you care since your comments are in a different location than the actual story. Although it makes it irritating because even legitimate text like this long paragraph will go off screen and require using the horizontal scrollbar to see.


BlooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooownMaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaargins





Also it seems like there isn't an automagic newline to BR.


Chris Wellons


I don't want to ban long lines because someone might drop some long code inside a set of pre tags, but I also don't want normal paragraphs to not wrap either, like that is happening above. The problem is that pollxn puts the comments in a table, and your long line stretches the cell out. I need to fix pollxn again.


I won't write a sophisticated engine for detecting abuse because it will never be complete. I can just manually remove posts that are abusive (like ones that post a whole bunch of <br> in a row to create giant gaps).


I don't quite want automagic \n -> br because I want this to be as close to HTML as possible, but still permit someone unfamiliar with HTML to separate their paragraphs. If you look at the code, it separates paragraphs by putting <p></p> in place of two newlines ("\n\n"). This allows a regular person to easily make paragraphs. You might not have seen this because I actually broke that by accident during the time you made your post. It is fixed now.


If you really needed a br you can drop one in yourself. It is one of the allowed tags.

anonymous


It looks good. I am actually writing a block of code similar to this for my blog. I was having trouble, but after seeing your code, I realized what I had been doing wrong.

Chris Wellons


As I said, I want it to be almost like HTML, so here you use <pre>. But you already knew that. The only way I want to depart from HTML is with the paragraph thing.


main()
{
    Indented_Code_line(1);
    Indented_Code_line(2);
}



Closed: NOT A BUG :-P


Unquoted Let

(no comments)

Getting Lisp

Chris Wellons


Yeah, I know, this post is a bit lame, but I really like that code! To make up for it I have two interesting posts queued up for the next two days. (I only post one per day at most.)


Another Perl One-liner -- Byte Order

(no comments)

Emacs Web Server

Gavin

I just found this project http://elserv.sourceforge.net/ which looks similar, but my $DEITY that page is hard to read.

Luke Maciak

Is there anything Emacs can't do? LOL

Chris Wellons

Yeah, I also came across that before I started my own version. That was written before Emacs had a sockets API (added around Dec. 2003?), so it's a whole different, more complicated beast.

I haven't found another one that makes use of the Emacs 22 make-network-process function yet.
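
As a minimal sketch of what make-network-process gives you (not my server's actual code; my-demo-filter is a made-up name):

(defun my-demo-filter (proc string)
  "Hypothetical filter: ignore the request in STRING and send a fixed reply."
  (process-send-string
   proc "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\r\n")
  (process-send-eof proc))

;; Listen on TCP port 8080 and hand each connection's input to the filter.
(make-network-process
 :name    "demo-httpd"
 :service 8080
 :server  t
 :family  'ipv4
 :filter  #'my-demo-filter)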

DaveLoyall

I Finally Have Comments

anonymous


This is a test. -Chris

Chris Wellons

So you built up your own from scratch? Nice.

Why no comments? Don't want people to be able to comment, or you don't think anyone would be interested?

JKirchartz

HUZZAH!! I recently developed a blogging system for work, but since it's for car dealerships I'll never have to worry about coding comments...


Greasemonkey User Scripts

(no comments)

Wikipedia Flu Time-lapse

(no comments)

Clay Klein Bottle

Chris Wellons

Yes, it's to the left of the text.

Gavin

Shouldn't it have an opening on the bottom?

Gavin Black

Ohh, my bad. Since it's all white it looked like it was a hole that had been covered. That and I was mainly posting any random thing to play with the commenting system :p


Arcfour and CipherSaber in Emacs Lisp

(no comments)

CipherSaber

neigy

Superb, except I know nothing about programming. All this stuff, even the simplest programming, is beyond me right now.


Emacs Htmlize and Highlighted Source Code

(no comments)

Custom Webcomic RSS Feeds

(no comments)

A Not So Stupid C Mistake

(no comments)

SWF Decompression Perl One-liner

Chris Wellons


Ah, I see. I didn't notice it before. Thanks for pointing it out. Glad you found your way to the non-damaged version!

Ozgur


This is a pretty neat trick, especially if one (has Perl already available and) doesn't want to (or even can't in a particular set up) install utilities to this job.


I found your page from the comment you posted on that Java program's page initially. The problem there however was that the less-than/greater-than symbols were lost in that page's comment system (as HTML tag delimiters I guess), and the one-liner displayed there doesn't work "as is" of course. You may want to drop another comment there so that your neat one-liner there doesn't frustrate people who can't fix it right away! :)


Thanks,


Ozgur


URL Shortening

Gavin Black


Looks like some spam finally got through your captcha :)




Unless you have some weird(er) friends :p

Chris Wellons


They also posted junk on the IRC RNG post, which I just caught before the feed picked it up (it's cached). If you noticed before I deleted it, the link had a rel="nofollow" attribute set, so their spam wouldn't have accomplished much. All comment links get this.


Also, seeing that they never got the captcha wrong (according to my logs) they were manually entered. No way to stop that kind of spam, but it's expensive for spammers to do.


If my captcha did seem to be broken updating it would be pretty easy to do.


Hashapass Password Management

(no comments)

Brainfuck Halting Problem

Andrew

The halting problem is unsolvable, but that doesn't mean that it's useless to ask questions about halting. Most usefully, there are algorithms with bounded running time that only return true for programs that halt, but return false for all programs that don't halt and some programs that do. An obvious example would be to simply run the program for N time steps and record whether it halted by that time, but there are others. The point being, your idea isn't completely torpedoed as an optimization; it's just an optimization that won't work every time.
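
A generic sketch of that bounded check, with bf-initial-state and bf-step as made-up names for a hypothetical single-step Brainfuck interpreter (bf-step returning nil once the program has halted):

(defun bf-halts-within-p (program limit)
  "Return non-nil if PROGRAM halts within LIMIT steps, nil if unknown."
  (let ((state (bf-initial-state program))
        (steps 0))
    (while (and state (< steps limit))
      (setq state (bf-step state))
      (setq steps (1+ steps)))
    (null state)))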

Christopher Wellons

That's a very good point. That's essentially Firefox's approach to detecting scripts that may not halt. Though, unfortunately, Firefox measures by wall clock time rather than something like CPU ticks or VM cycles (which I imagine is hard to nail down once JIT is involved), meaning when Firefox is suspended for awhile and woken back up, it complains about any scripts that happened to be running when it got suspended.


The Lazy Fibonacci List

(no comments)

Apartment Balcony Gardening

(no comments)

Vimperator Firefox Add-on

leeleblanc


Will attempt macros now; believe it or not I nearly completely forgot about them. I've spent a lot of time building a custom _vimperatorrc, which once set up makes Vimperator even more addictive.


Avoid Zip Archives

(no comments)

LZMA Tarballs Are Coming

(no comments)

GNU Screen

Michal Zuber

I use it, too. Could you provide a link to your settings and the rest of your usage workflow? How do you copy/paste, and how do you handle dynamic window naming and terminal title naming?


Distributed Issue Tracking

(no comments)

Creating Simple Dice with GIMP

Daniel Doc

Such an old tutorial and no comment?
What a pity, because this is simply great and very easy to follow!
Thanks so much!

Pavo real

Exactly what I was looking for, thank you!


Diceware Passphrases

Neigy

Brilliant! Now if I only knew how to create my own cipher. It's difficult being so new to all of this. Thank you for your post, by the way.


Shamus Young's Twenty Sided Tale

(no comments)

AES Random Number Generator

Chris Wellons


Sorry, somehow I missed your post.


I'm using libgcrypt, so that would only happen if libgcrypt supported AES-NI, which it does as of two weeks ago. So if you're using a libgcrypt fresh from the repository, yes.

Alex Fihman


Do you have version that supports new Intel instruction set AES-NI?

udc

Well, if you think about it, what is really so great about this? What is really so random? All this PRNG does is take a perfectly linear incremental sequence of 128-bit numbers (like 1, 2, 3...) and encrypt it with AES using a 256-bit key and 128-bit IV.

So the original data (its format) is pretty much known, the algorithm generating such "random" data from it is also pretty much known, the only thing that is somewhat random here is the first number of the whole linear sequence and the AES key and IV. In other words, no matter how much "random" data you are generating this way (be it 100 MB, 100 GB or 100 TB), of all that amount of data only 64 bytes of random numbers were really used, all the rest was just mathematically calculated. Would you dare to call something like that even a pseudo-random generator?

Now, what does Gibson's website say?
"The deterministic binary noise generated by my server, which is then converted into various displayable formats, is derived from the highest quality mathematical pseudo-random algorithms known. In other words, these password strings are as random as anything non-random can be."

Not only is this quite a ridiculous stretch (if not a pure lie), but it doesn't even make any sense. The guy was probably trying to impress everybody so hard that he got it backwards. Noise can't be derived from an algorithm; that's logical nonsense. What he might have meant was that mathematical algorithms were using the binary noise, only he forgot to mention how much of that noise was actually used and how much was merely calculated. But Gibson is known for making overkill statements so this is not really a surprise.

But let's just think about this idea a little bit more. Libgcrypt has several functions to generate random data with the fastest one called gcry_create_nonce(). Yet this function is described in the documentation as generating "unpredictable" numbers. Now if we look in the source code we find that this function generates random numbers by creating 16 bytes timestamps and then encrypting them with AES. In fact, this procedure is standardized in the document called “NIST-Recommended Random Number Generator Based on ANSI X9.31 Appendix A.2.4 Using the 3-Key Triple DES and AES Algorithms”.

This tells us three things:
1) generating random numbers by encrypting some sequence of numbers (a timestamp series sounds like a sequence, right?) by AES is obviously not such a bad idea,
2) no, Gibson definitely didn't invent any of this,
3) Gibson's version is clearly inferior to libgcrypt's gcry_create_nonce().

Why is Gibson's version inferior? Because a timestamp, whatever it happens to be, is definitely less linear (if enough precision is used) than a series of incremental numbers. Actually, anything would be less linear than 1, 2, 3...

But that's not all. In libgcrypt the 128-bit timestamp actually consists of seconds and microseconds from gettimeofday() plus 3 counters, all mixed together in a way that makes the resulting timestamp not linear at all. Compared to that, Gibson's 1, 2, 3... is not only inferior but actually total garbage.

In other words, if you remove everything from your code and simply generate random data using gcry_create_nonce(), you will get much higher quality random data than by imitating Gibson's nonsense.

Once more again from Gibson's website:
"In other words, these password strings are as random as anything non-random can be."

Now talk about the credibility of that person..

By the way, if you use a 256-bit key and thus AES256, then you should specify the cipher name as GCRY_CIPHER_AES256, not GCRY_CIPHER_AES, which is an alias for AES128. Currently it works anyway because libgcrypt doesn't care about the names and takes the keylen as the decisive parameter to set the rounds, but the documentation clearly states that the keylen must match the required length of the algorithm, so it would be wise to stick with that, as they may fix it anytime. Also, you shouldn't use atoi() in connection with the file size.

Christopher Wellons

Would you dare to call something like that even a pseudo-random generator?

That's the very definition of a PRNG: it's seeded with an initial state and generates weakly correlated numbers deterministically. It's very fast and the output is indistinguishable from a true RNG using known tests, so I think it's a valuable one. Besides this, repeatability is required for some applications, so determinism is an essential feature of PRNGs.


Now if we look in the source code we find that this function generates
random numbers by creating 16 bytes timestamps and then encrypting them with AES. In fact, this procedure is standardized

Using Gibson's generator, I can accomplish the same thing if I just mix in a struct timeval on each iteration. Doing a quick test now, I see this slows down the generator by about 3x. It would be worse if I had hardware AES support. This may or may not be worth the trade-off, depending on the application. If I was using this generator for a Monte Carlo analysis, I wouldn't mix in the timestamp.
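
A rough sketch of the mixing step being described here, with illustrative names rather than the program's real code: XOR the current struct timeval into the 16-byte counter block before each encryption.

#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/* Illustrative only: fold the current time of day into the counter
 * block before it is encrypted, as discussed above. */
static void mix_timestamp(unsigned char block[16])
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    for (size_t i = 0; i < sizeof(tv) && i < 16; i++)
        block[i] ^= ((unsigned char *)&tv)[i];
}

int main(void)
{
    unsigned char block[16] = {0};
    mix_timestamp(block);
    for (int i = 0; i < 16; i++)
        printf("%02x", block[i]);   /* varies from run to run */
    putchar('\n');
    return 0;
}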


By the way, if you use 256 bit key and thus AES256 then you should
specify the cipher name as GCRY_CIPHER_AES256, not GCRY_CIPHER_AES which
is an alias for AES128.

Gibson's generator uses AES128 and I wanted to copy their exact algorithm. I don't know why they use a 256-bit key.


Also you shouldn't use atoi() in connection with filesize.

Good point. At the time I wrote this article this would have overflowed after 2 minutes. Using atoll() would take 20,000 years to overflow.
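
For concreteness, a tiny hypothetical example of the difference (not the generator's actual argument handling):

#include <stdio.h>
#include <stdlib.h>

/* Parsing a byte count into an int caps it near 2^31 (and the behavior
 * past INT_MAX is undefined), while atoll() gives a 64-bit range. */
int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    int       narrow = atoi(argv[1]);   /* breaks past 2147483647 */
    long long wide   = atoll(argv[1]);  /* fine up to ~9.2e18 bytes */
    printf("atoi: %d\natoll: %lld\n", narrow, wide);
    return 0;
}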

John

Great tip! I got similar results with Ent on a C# version.. thanks!


Fantasy Name Generator -- Request for Patterns

(no comments)

Don't Write Your Own E-mail Validator

(no comments)

The Fire Gem

(no comments)

Controlling a Minefield

Ron

It should be noted that MD5 is broken today, but this is a cool post, backlinked from before I followed.


Play NetHack

(no comments)

A GNU Octave Feature

Chris Wellons


It's nice to meet another Chris Wellons! I've actually never met another Wellons to whom I wasn't related. Are you originally from Colorado? All of my extended Wellons family is from the south east, around Georgia and Florida. I was the first one, inside the last several generations anyway, to be born north of the Mason-Dixon line -- as far as I know.


You may have noticed that it wouldn't let you use "Chris Wellons" (capitalized) as a comment name here, as it's specifically reserved for me. :-)

chris wellons


Interesting; I did a background check on myself and found another Chris Wellons who is a computer engineer. To top it all off, my senior design project is a MATLAB data analysis program for smart grid systems at the University of Denver.

Ahmed Fasih

<sad panda>: as of 2013a:

>> magic(4)(2:3,2:3)
Error: ()-indexing must appear last in an index expression.


The Arcfour Stream Cipher

(no comments)

Two-Man Double Blind Coke vs. Pepsi Taste Test

Covarr

I'm really glad somebody did a double-blind Pepsi Challenge; this has a lot less potential for bias than the usual single-blind challenge as performed by Pepsi representatives and most home copycats.

Sadly, I don't know that a sample size of two participants can produce meaningful results, unless the goal is simply to learn about the individual participants. Two people isn't enough to extrapolate to a larger population.

Still, an entertaining read, even a decade after it was written :)

Christopher Wellons

Blast from the past! Yeah, this is less about the actual result and more about how happy I was figuring out how to do a double-blind test with only two participants. I haven't actually drunk a non-diet soda like this since 2009, so the result soon became irrelevant to me anyway.


Sudoku Solver

(no comments)

Variable Declarations and the C Call Stack

(no comments)

A One-Time Pad Mistake

(no comments)

One-Time Pads and Plausible Deniability

(no comments)

One-Time Pads

(no comments)

Up is Down

(no comments)

Memoization

(no comments)

Lisp Number Representations

(no comments)

Simple Hash Table Correction

(no comments)

Linear Spatial Filters with GNU Octave

Safy Saeed


Hello,


This is a very nice explanation of applying a mask filter in the spatial domain.


I am making a similar program for applying an average (blur) filter,
but I am using two FOR loops to apply the mask pixel by pixel.
I am having a problem with zero padding: I am using a 3x3 mask, and when the mask tries to convolve outside the boundary of the input image, MATLAB gives the error "Subscript indices must either be real positive integers or logicals."


I would be very thankful if you could help me with this or send me the code of a working sample program.


Regards,
Sh.Safy

Safy Saeed


Thanks for the reply.


I will work on it.

Philip Lacombe


Very clear. Helped me a lot.
Thanks a bunch.

Chris Wellons


Glad you enjoyed it.


Octave/Matlab won't magically pad your array for you when you're indexing it. You have to check, in your loops, if the index is outside the bounds of the array, and if so, substitute 0 (for zero padding). Or you could modify the array to add the padding beforehand, and only convolve on the middle; the padarray function I had mentioned can do this for you.
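
Although the thread is about Octave/MATLAB, the same bounds check can be sketched in C (illustrative code, not anything from the article): indices outside the image simply read as zero, which is exactly zero padding.

#include <stdio.h>

#define W 5
#define H 5

/* Read a pixel, treating anything outside the image as zero. */
static double at(double img[H][W], int y, int x)
{
    if (y < 0 || y >= H || x < 0 || x >= W)
        return 0.0;                    /* the zero padding */
    return img[y][x];
}

int main(void)
{
    double img[H][W] = {{0}};
    img[2][2] = 9.0;                   /* one bright pixel */
    double out[H][W];

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            double sum = 0.0;
            for (int dy = -1; dy <= 1; dy++)   /* 3x3 averaging mask */
                for (int dx = -1; dx <= 1; dx++)
                    sum += at(img, y + dy, x + dx);
            out[y][x] = sum / 9.0;
        }

    /* The corner silently used zero padding instead of erroring out. */
    printf("%g %g\n", out[0][0], out[1][1]);   /* prints: 0 1 */
    return 0;
}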

bbalaraj

Thanks a lot

Marcos

thank you, that was helpful!


My Team Won the Robot Competition

(no comments)

The 3n + 1 Conjecture

(no comments)

Optimizing, Multi-threading Brainfuck to C Converter

(no comments)

Some News

(no comments)

Movie Montage Poster

(no comments)

Unsharp Masking

(no comments)

A Faster Montage

(no comments)

Simple Hash Table in C

skleroza

Interesting, but why not store the size of the objects in the blob?

Christopher Wellons

The blob size *is* stored, but there's no way to retrieve it via the interface. It's up to the caller to keep track of its size, which is almost always the case anyway. The hash table only needs to know because it makes a local copy (which it also probably shouldn't be doing).

I originally wrote this library 8 years ago as a student. I've learned a whole lot more since then, and I could easily whip up something much better in like 30 minutes, one better tailored to the situation. So, honestly, this hash table really isn't useful for much. There are lots of undesirable issues with it.


Movie DNA

(no comments)

South Park Downloader

(no comments)

Traveling Salesman Problem by Genetic Algorithm

(no comments)

Noise Fractals and Clouds

(no comments)

Iterated Prisoner's Dilemma

Gavin Black


Very cool! I haven't looked through your old posts before; it looks like there is a lot of neat code. After the semester is over I'll have to try to run some of these :)


Polynomial Interpolation

Ruben

This is indeed a neat trick. You can also use Newton divided differences or Lagrange polynomials for polynomial interpolation to avoid having to solve a linear system to find the coefficients. However, polynomials don't work very well for interpolation. Piecewise polynomials or splines have much better interpolation behavior. If you have a function instead of just some points, L2 projection is also an interesting technique. If you do have n points and want to find a polynomial of degree m < n - 1 that almost interpolates these points (formally, it minimizes the norm ||Ax - b||), you can still generate the linear system Ax = b, which you gave. It is not square, but you can solve the normal equations transpose(A) * A * x = transpose(A) * b and this will give x for which ||Ax - b|| is minimal.

Christopher Wellons

I agree with you about splines, and these days I'd just use splines for this sort of interpolation. Ten years ago when I wrote this article I didn't know any better. ;-)


Robot Version 1

(no comments)

Java Animated Maze Generator and Solver

Chris Wellons


Just updated some stuff here to be more reasonable (and so I look slightly less foolish). Man, I really sounded like a weenie back then! :-P


In two years will I look back at now and still call myself a weenie? Perhaps.

Gavin Black


Ha ha, I used to think the same thing about Java. And I too now use it for most of my work projects :p



Ohh well, it will still be a cold day in hell before I touch Ruby!


Narendra

Hey, how do you use Emacs with Java?

I am just about to switch to Intellij for Java Programming at work.

Christopher Wellons

I haven't programmed in Java for a few years now, and I intend to avoid it as much as possible. However, I did write two packages to make Java less annoying: javadoc-lookup and ant-project-mode.

https://github.com/skeeto/j... https://github.com/skeeto/a...

The former lets you quickly access package documentation, and it can even do some import management since it's aware of packages. The latter supports a very opinionated workflow oriented around Ant.

None of this gives you anything close to intelligent completion. If you want something more elaborate, I've heard good things about IDEE, but I've never used it.


Mandelbrot Set on a Cluster

(no comments)

Converting MediaWiki Markup to LaTeX

Hunniger

Actually I made a complete solution for this problem

 http://de.wikibooks.org/wik...


Walk Text Adventure Game (Perl and MATLAB)

(no comments)

Memory Allocation Pool

(no comments)

PNG Archiver - Share Files Within Images

(no comments)

YouTube with Free Software

(no comments)

Mandelbrot with GNU Octave

A. Jorge Garcia


Yeah, my school has IE and I can do nothing about that except in my Linux lab! Thanx for the prompt reply.


You may want to look at the Pelican HPC site (just google it) in addition to the MPITB site. I recall you did a lot with fork() and pipe() but I thought that was under openMOSIX?


BTW, you inspired me to start up my own blog about my students' experience clustering and some other topics relating to teaching and learning with technology. If you are interested, just visit my website:


http://calcpage.tripod.com


and you will find a new link near the top of my home page for the blog. It's called "Shadowfax Cluster Rant!"


Enjoy,
AJG

A. Jorge Garcia


BTW, my email address is calcpage@aol.com - sorry it looks mangled on the comments page....


TIA,
AJG

A. Jorge Garcia


Hi!


I hope you get notified whenever you get a comment as this comment will otherwise be buried in your archives!


Anyway, I don't know if you remember me, but you helped me and my students come up with some fractal code for a cluster programming project a couple of years ago. Thanx so much!


So, I have a new question for you. My current cluster team wants to try MPI (we've been using openMOSIX for a while, but that project has ended). We found a bootable CD that sets up MPI to work with Octave. I'm wondering if you've had any experience with this. There's a library called MPITB (MPI ToolBox) for Octave that's included. The CD is available at the old ParallelKnoppix website, which is now called Pelican HPC.


TIA,
AJG

Chris Wellons


Yup, I remember you. Welcome back!


I've never used MPI in Octave, or in any other language, so I don't think I can really help you. I was always working on a cluster, so fork() and pipe() did everything I needed. One thing you'll have to work out is setting up a process to fire off a process on each machine, each with the appropriate configuration. The Octave MPITB page seems to be down right now, so I can't take a look at it.


Good luck, though!


P.S. There's a comments RSS feed I set up and follow so I know when any comments have been posted anywhere. Oh, and your e-mail address appears mangled because you are probably using Internet Explorer which has CSS issues/bugs and can't display this page correctly.


null program is Alive

(no comments)

null program

Christopher Wellons

There were some DNS issues during the transition, with DNS reporting both the old and new addresses. You may have seen half of the old site and half of the new site on the same load. It was also completely unavailable for a couple hours while other changes were propagating. Everything should be fine now.

Luke

Oh wow... This is actually pretty cool. I used GitHub Pages before for some of my projects, but I was mostly sticking to single static pages and keeping it simple. It's nice to see you can actually run an entire blog like this.

Marco Gualtieri

do your comments get saved into plain text files as well?

Christopher Wellons

The comments go into some Disqus database offsite, completely independent of GitHub or any version control. I *can* export them from Disqus into XML if I needed to.

Diego Sevilla

This is a neat selection of tools and hosting. I went a similar route myself. To learn Common Lisp and Emacs Lisp, I wrote my own static blog generation engine, also used Disqus for comments, and generated HTML "conformant" with WordPress so that I could use nice WordPress templates (which are mostly CSS). The last thing left is to host it via GitHub. I wasn't sure, you know, about my code, my posts, etc. But then I thought: people can download it completely anyway! Now that I see you also opted for that route, I think I'm going for it.


Infinite Parallax Starfield

Brian

Just watched this video again: Hypernova was really cool visually and internally. It would be neat to take it to the next level.

ambethia

Hey, you could embed a soft-synth and still have the sounds be rendered from MIDI. Thanks for the tutorial.

arrogant.gamer

Thanks so much for this awesome technique! I'm still working out a few bugs in my implementation, which is in Lua (using the fabulous LOVE2d framework).

In my initial attempt, the stars jumped around inconsistently. They would always be in the same places if I returned somewhere, but instead of drifting away when you move they would vanish and be replaced with different stars as though I was hopping vast distances.

I solved this with the following change:

hash = mix(STAR_SEED, i - xoff, j - yoff)

This gives me nice slow drifting stars. Since I'm not sure what xoff and yoff were in your original implementation, I can't be sure I am passing the right things in as those variables: I'm using the player's offset from origin.

Here is a gist: https://gist.github.com/var...
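
For anyone following along, the hash-per-tile placement under discussion can be sketched in C roughly as follows; the mix() function, seed, and tile size here are arbitrary stand-ins, not the article's actual implementation.

#include <stdio.h>
#include <stdint.h>

#define STAR_SEED 0x9e3779b9u   /* arbitrary illustrative seed */
#define TILE_SIZE 256

/* An arbitrary integer hash combining a seed with tile coordinates. */
static uint32_t mix(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t h = a;
    h ^= b + 0x9e3779b9u + (h << 6) + (h >> 2);
    h ^= c + 0x9e3779b9u + (h << 6) + (h >> 2);
    return h;
}

/* One star per tile, at a position derived entirely from the hash, so
 * the same tile always yields the same star without storing anything. */
static void tile_star(int tile_x, int tile_y, long *px, long *py)
{
    uint32_t h = mix(STAR_SEED, (uint32_t)tile_x, (uint32_t)tile_y);
    *px = (long)tile_x * TILE_SIZE + (long)(h % TILE_SIZE);
    *py = (long)tile_y * TILE_SIZE + (long)((h >> 16) % TILE_SIZE);
}

int main(void)
{
    long x, y;
    tile_star(3, -2, &x, &y);  /* same tile in, same star out, every time */
    printf("%ld %ld\n", x, y);
    return 0;
}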

arrogant.gamer

I think I may have spoken too soon!

Graham Toal

If you finish the game and release source, we'd be interested in porting it to the "PiTrex" when it is ready for release. That's a pi-based add-on processor for the Vectrex vector console. Potential ports will be listed here: http://computernerdkev.heli... (I found your pages when looking for starfield code, quite a few things here of interest)

Christopher Wellons

Here's the source: https://github.com/skeeto/H...

I never finished it, and at this point I never will. PiTrex is a pretty interesting project, so thanks for sharing.

Graham Toal

Appreciate the link, thanks! Even if it isn't playable or portable to our system, exposing the vector programming community to more source code is always a good thing. There may well be parts of your code we can reuse in some way. Thanks.


Movie Montage Comparison

possiblywrong


This is cool-- interesting timing, too, since I just saw TRON: Legacy last night.


After reading the title label of one of these images, I can generally recognize the trends in color. I wonder if this ever works the other way around; that is, are there movies so distinctive that their "redux" images are identifiable (with some high probability) even without their label? (Or at least, a set of movies that you might be able to match with their corresponding set of labels.)


The Great Tab Mistake

Chris Wellons


"You must learn to respect our differences."


I drifted away from tiling window managers because I still had to frequently switch out to normal windows to do certain things, such as making a video recording or running something particularly graphical. They also lose some usefulness on smaller displays. My laptop -- which is 7 years old and will not be replaced until it is completely dead -- only goes to 1024x768. That's just a little too skinny for side-by-side anything, so overlapping windows are still useful for me.


One day when I've upgraded to larger resolution screens perhaps I'll take a look at tiling window managers again.


Now, you assume that I'm using the mouse to perform all this navigation, but this is untrue. Outside of initial positioning, I don't use the mouse. I don't think I've actually clicked to focus a Fluxbox tab yet. I bound C-tab and C-S-tab to move between tabs. 99% of my navigating between windows is done by keyboard. In fact, this grouping has made keyboard navigation more effective for me, because the usual M-tab only switches between groups of windows, skipping over tabs. One current flaw in Fluxbox is that the mouse is required to combine two windows into a tabbed window -- and the default binding is especially clumsy for laptops. I would like to see this corrected someday.


I also rarely use the mouse to move between tabs in Firefox. With Pentadactyl I have the w and e keys bound to this, as well as its Emacs-style buffer switching (typing out part of the tab name). I'd also love to see that buffer-style switching in Fluxbox.


You are right that I don't often need tabbed windows. Currently I'm only using the Fluxbox tabs to combine 2-4 xterms, usually with one sshed off somewhere. If we relied on the WM for tabs, this is also how I would be using Firefox. If Emacs didn't have its divine buffer interface I might be using tabs there. In the future, I should attempt some more creative tab uses, such as combining Emacs and Firefox when using my java-docs extension.

Matt Stine


The window manager-level tabbing was always my favorite feature of Fluxbox, other than the fact that it was the only modern window manager to run reasonably well on my IBM T20.


I don't have any experience with Chromium yet, but the appropriately-inclusive tab content seems to be a move in the right direction for browsers.


Gavin Black


I'm sorry that you've fallen under the sway of a heathen WM, hopefully you'll adopt the Tiling civic again. To me it's just moving the "Start" bar of most WMs from the bottom (The ones that group at least). Since I started with raw VTs, I'd make 8 desktops and treat each desktop as my way to organize content (1-IM, 2-Internet, 3/4-Development, 5-Usenet/Filesharing, ...), which seems less clunky than visually grepping for the right window, dragging the mouse down the list, and clicking the intended victim tab. Thankfully XMonad lets me re-enforce the memorized groupings I've used since high school, deftly avoiding UI natural selection (Replying on Meta-2 now ;)


Also if you haven't heard of it there is Ion which is a tabbed tiling WM, although I know the tabbing is not why you originally switched.


Besides the browser (Which I'd argue needs a stack of some sort more than tabs) how often do you have dozens of windows open that you are actively using? You can pipe any sort of status to a common /dev/pts. I'm probably just being an old fogy though, enjoy your new fangled mouse interactive WMs, I shan't be needing them >:|



Feedback Loop Applet

possiblywrong


This is pretty addictive to play with. I'm not sure I understand where (at least on my display) the "five-ness" comes from in the applet? That is, the mouse input seems to result in five equally spaced copies of stuff.


You might like Hofstadter's "I Am A Strange Loop," where in one section he describes some interesting experiments and perspectives on video feedback. (Even after GEB, this is a pretty interesting book. He is fun to read even if I don't quite agree with all of his ideas.)


BrianScheme Update: Bootstrapping and Images

Gavin Black


Really cool so far.
I think the acronym makes it sound like a government project though :P


Hopefully you can get the FFI going at a reasonable speed, would be neat to see GD/SDL or whatever bindings so that it can do some graphics.

Matt Stine


Nice work. I've taken a look at the project's GitHub page, and I'm impressed by the rate at which you're generating new, substantial commits.


Now if I were only familiar with Scheme so that I could play around with what you are creating. :)


Chess AI Idea

(no comments)

Fake Emacs Namespaces

(no comments)

Torrent File Strip

(no comments)

A Fractran Short Story

Anon

The empty list is the identity function in fractran.

Weux

Fractran programs care more about the exponents of the primes in the factorization than anything else, that is where the Turing-completeness lies. So the program you are looking for would be something that takes input 2^n and produces the prime factorization of n encoded in the exponents. Also, the identity function can be implemented as the null list (as someone else said (it is quite an elegant solution)) or as the (1/3) with an input coded 2^n. Since the input is not divisible by any denominator on the list, it becomes the output.
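
For readers new to Fractran, a minimal interpreter sketch (hypothetical C, not from the story) makes both points concrete: with an empty program the loop never fires, and the single fraction 1/3 never divides an input of the form 2^n, so in both cases the input falls straight through as output.

#include <stdio.h>

/* Fractran step loop: fractions are numerator/denominator pairs; at each
 * step the first fraction whose product with n is an integer replaces n.
 * With an empty program the body never fires, so the input passes
 * through unchanged -- the identity function mentioned above. */
static unsigned long long
fractran_run(unsigned long long n,
             const unsigned long long *num,
             const unsigned long long *den,
             int count)
{
    for (;;) {
        int fired = 0;
        for (int i = 0; i < count; i++) {
            if (n % den[i] == 0) {
                n = n / den[i] * num[i];  /* divide first, avoid overflow */
                fired = 1;
                break;
            }
        }
        if (!fired)
            return n;  /* no fraction applies: halt */
    }
}

int main(void)
{
    /* The single-fraction program {1/3} on input 2^5: the input is never
     * divisible by 3, so it halts immediately with the input as output. */
    unsigned long long num[] = {1}, den[] = {3};
    printf("%llu\n", fractran_run(1ULL << 5, num, den, 1)); /* prints 32 */
    return 0;
}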


BrianScheme

(no comments)

The Problem with String Stored Regex

(no comments)

October's Chess Engine Gets Some Attention

(no comments)

BCCD Clusters

(no comments)

October Chess Engine Updates

(no comments)

Pen and Paper RPG Wishlist

(no comments)

Java Hot Code Replacement

John Doe

Ever heard of ObjectWeb ASM or BCEL? You can instrument bytecode on the fly.

Christopher Wellons

I've heard of both, and I've used BCEL indirectly for static code analysis (FindBugs), but otherwise I don't know too much about them. I haven't coded in Java for a couple of years now, but next time I do I'll have to give these a look.


Readline Wrap

(no comments)

A Faster Montage

Kavinda Jayasinghege Don

Hey Chris,

I know that this is an old post but do you still have the script? Could you fix the link for us?

Thanks!

Christopher Wellons

Thanks for pointing out the problem! Apparently the script was lost when I changed hosts four years ago, but no one seems to have noticed until now. I just wrote a much-improved replacement (Perl sucks!) in C that does the same job in a mere 50ms. It's now linked at the top of the article. It should work in Windows too (untested), if that's your thing. See the program's header for usage info.

Apparently ImageMagick's montage is a whole lot more reasonable than it was back in 2007. It can do the same montage in only 24 seconds on my current machine, so a custom script is no longer necessary. I originally wrote my script because I was tired of having to wait overnight for results.

Kavinda Jayasinghege Don

Thanks Chris!

I'm using Windows but I'm having to wait ages (nearly 2 hours) for ImageMagick to form a montage of 34 images consisting of 110 tiles pieced together into a vertical strip. IM can produce a montage very quickly when you process images vertically but not horizontally for me!

Thanks again for updating the link!


Git is Better

Miroslav

Maybe by skipping Fossil you didn't realize how good it is compared to Git.


Halftone

jfrio

Very cool... I work in R&D for a major printer manufacturer, so it's close to true that I do halftoning algorithms for a living, mostly FM, not AM, though. If you ever want to give it a try, drop me a line and I can point you to freely available information.

Brian

Wow. Nice result for not-a-lot of code.


Emacs Configuration Repository

Brian

I'm now a happy user of your configuration. It was a really good starting point. I just added clojure-mode and a few other things. http://github.com/netguy204...

Thanks!

heliofrota

Thanks !

I'm now another happy user of your configuration.

Deric Bytes

Thanks, got some helpful tips


Some Cool Shell Aliases

Chris2048

Have you tried 'tmux'? I much prefer it over screen now.

Christopher Wellons

Hmmm, `tmux` does seem to be more capable than `screen`. I'll start using it and see if it sticks. Thanks!

Michal Zuber

Tried the 'httpd' alias, but it was interfering with Apache, so I needed to change it to 'webshare' :)

deathbullet

Ah, the good old days when you used to bash vi!
Sad to see you like it more than Emacs now!

Christopher Wellons

When it comes to text editor bindings, you could say I've seen the light. ;-) Now that I've used both, I can confidently say that Vim/vi's text editing model is definitely superior to Emacs' defaults.

I'm now using both Emacs and Vim regularly, and I expect to keep using both like this indefinitely. Initially I could switch between Emacs bindings and Vim bindings with ease, but my Emacs muscle memory has totally atrophied at this point. When I'm running short tests in an unconfigured Emacs instance (package testing, foreign machine, etc.), I switch on Viper since otherwise I'd go nuts. It's no Evil, but it works in a pinch.

Emacs is a whole lot more flexible and extensible than Vim, and so I actually don't do much to customize Vim. Instead I bend Emacs to be more like Vim.


Introducing NativeGuide

(no comments)

Proposal for a Free Musical

(no comments)

Try Out My Java With Emacs Workflow Within Minutes

Jocimar Lopes

GREAT!!! Thank you, very much for all the time you spent sharing this.

fistofsenn

just found this article. I created an emacs package to integrate Eclipse with Emacs. If you are interested you can find more details here: http://blog.senny.ch/blog/2...

Albert

Thanks for this. I still like Java, but I want to use Emacs more and more, and it looks like you've developed a best practice with Emacs here, using other leading-edge stuff like Magit.

Wagner Marques

many many thanks...
 

Sugi.

Thank you. I spent so many hours setting up Emacs for Java, and your work saved me a lot of time. Thanks a lot for your kind heart in sharing such time-consuming work with others. Thanks again.

emacsgeek

i love you!


Poor Man's Video Editing

(no comments)

Rumors of My Death Have Been Greatly Exaggerated

(no comments)

CSS Variables with Jekyll and Liquid

legendlee

surprised to see this! 


Silky Smooth Perlin Noise Surface

Brian

Pretty!


Cartoonish Liquid Simulation

DucLoi

Can you port this to Android?

Christopher Wellons

The Java version wouldn't run well as-is on Android. Blurring is done using the CPU, which will be far too slow on a mobile device. It would need to be ported to OpenGL ES to get decent performance on typical Android devices. And that's exactly what I ended up doing: at the top of the article is a link to my WebGL port. Android supports WebGL, so you should be able to run that one in your browser. Firefox works best by far because I'm using asm.js for the physics simulation, and Firefox runs that much faster than any other browser.

Ruben

I remember doing something similar in QBASIC years and years ago. That program used a 320x200 resolution and used the CPU to loop over all pixels. I don't remember the details, but the trick was to have black pixels (background/empty), white pixels (foreground/object), and blue pixels (water). Then you could update the pixels in the screenbuffer by moving the blue pixels down one pixel at a time, plus a random horizontal offset (but of course the pixel should only be moved if the target position is free). This is not fantastic (for example, the water falls with a fixed speed and looks weird if you eyeball individual pixels), but I remember it working remarkably well for such a simple 'algorithm'.

Christopher Wellons

If you ever figure out exactly what you did in that old QBASIC program, I'd be interested in seeing the effect and hearing how it worked.

Ruben

Basically it's just a simplified particle engine where you use an array to store the type of pixel at each pixel location. So you do something like "if (buffer[particle.x, particle.y + 1] == free) { particle.y++; }". Combining that with a small chance to move sideways (if that pixel is free) yields an OK water simulation. I thought it was possible to do this in screen space as well (and it probably is), but when I tried yesterday I got an asymmetric simulation due to the order in which the pixels are processed. I tried to circumvent this with some tricks but couldn't find an easy solution. Maybe updating the pixels in a semi-random order would work, but this would kind of defeat the purpose of doing the 'simulation' in screen space.

Anyway, for some source (QBASIC, so it should be easy to read), see http://www.qb45.org/rate.ph.... Since this program is written in QBASIC and uses VGA, it probably won't work in newer versions without porting. You can set it up in DOSBox, but to give you an impression I captured and uploaded a run with 150 particles here: https://youtu.be/ZVmBFQgLT0I

I now realize this would not work well on higher resolutions without hardware acceleration (but I'm not sure how this would work).


Lisp Let in GNU Octave

possiblywrong

Thanks, I learned something today.  I had to stare at your code for a while to convince myself that it generates random unit vectors that are indeed uniformly distributed on the sphere.  I'm used to seeing a slightly different "trick" for this, that is relatively simple to describe thinking in spherical coordinates, but isn't as compact in Octave code: pick a "longitude" uniformly at random, then instead of picking a "latitude" (which doesn't work, as you point out), pick a *z-coordinate* uniformly at random.
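
A small sketch of that trick in C, purely for illustration (the post itself is about Octave): pick the longitude uniformly and the z-coordinate uniformly in [-1, 1], and the result is uniform on the unit sphere.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Uniform point on the unit sphere: uniform longitude, uniform z. */
static void random_unit_vector(double v[3])
{
    double pi    = acos(-1.0);
    double z     = 2.0 * rand() / RAND_MAX - 1.0;    /* uniform in [-1, 1] */
    double theta = 2.0 * pi * rand() / RAND_MAX;     /* uniform longitude  */
    double r     = sqrt(1.0 - z * z);
    v[0] = r * cos(theta);
    v[1] = r * sin(theta);
    v[2] = z;
}

int main(void)
{
    double v[3];
    random_unit_vector(v);
    printf("%f %f %f (norm %f)\n", v[0], v[1], v[2],
           sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]));
    return 0;
}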

Ahmed Fasih

I almost always prefer `bsxfun` to `repmat`, i.e., `@(v) v ./ repmat(sqrt(sum(abs(v) .^ 2, 2))` becomes `@(v) bsxfun(@times, v, 1 ./ sqrt(sum(abs(v) .^ 2, 2))` (and I use `times` there only because I can never remember the proper difference between `rdivide` and `ldivide`). Note that Numpy and R and other sane environments don't have an explicit function like Matlab/Octave's `bsxfun` because they do array broadcasting automatically!

Christopher Wellons

I wasn't aware of bsxfun() until now. Looks useful! Looking back at my original expression now, I don't know why I put the abs() in there. I'm wondering if there was a particular reason.

After seeing your comment, I just came up with another one today:

p = (@(v) v ./ cellfun(@norm, num2cell(v, 2)))(randn(n, 3));


Rumor Simulation

possiblywrong

Neat problem, I'll have to stash this in my bag of "cool exercises" :).  I have to imagine that the intended solutions were supposed to be observed simulation results and not exact derivations, because the solutions here are, well, not exactly nice (and only about 40 years old).

There is a closed form solution, though-- sort of.  The model is similar to the SIR model of the spread of disease in epidemics; the ignorants are "S"usceptible to the disease, the spreaders are the "I"nfected, and the stiflers are "R"ecovered and do not continue to pass along the disease.  Analysis in that context is usually "continuous" (i.e., differential equations), so I didn't expect a nice solution in this discrete model, but I poked around with it anyway for way too long :).

There is a nice recent summary of results about this and similar models here:
http://arxiv.org/pdf/1003.4... 

Short version: the fraction of the population that hears the rumor is, in convenient Mathematica notation, 1+ProductLog[-2*E^-2]/2, or 0.7968121300200199, where ProductLog[z] is the solution to the equation z == w*E^w.
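
Written out as an equation, with f for the fraction of the population that eventually hears the rumor, that closed form is the solution of:

f = 1 - e^{-2f}   =>   f = 1 + W(-2 e^{-2}) / 2 \approx 0.7968121

where W is the Lambert W function (Mathematica's ProductLog).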

Christopher Wellons

I figured this one would tickle your curiosity.

Seeing that paper makes me feel better about not having figured anything out. The solution is a lot more complicated than I imagined.

Luke

Are you allowing for repeated meetings, or are pairs that have already met excluded from the next selection?

It would also be interesting to try to recast this problem as a cellular automaton. I vaguely remember the fish and sharks model, in which you had cells randomly moving on a grid, that could be adapted to this. After all, spatial factors play a huge part in rumor spreading.

Christopher Wellons

Following the problem description strictly, yes, repeated meetings are allowed. So A could tell B the rumor, then immediately attempt to tell B again turning them both into stiflers.

I never heard of the fish and sharks model, but I just looked it up. It sounds neat, so I'll give it a shot.


Making Your Own GIF Image Macros

(no comments)

Pseudo-terminals

possiblywrong

This actually came up at work for me as well-- and it was during a field test, so we needed it *quick* :).  There is actually a decent open-source solution for Windows, called com0com (link below).  With it, you can create (pairs of) virtual ports, and there is also a handy add-on tool called com2tcp that adapts a serial connection to a TCP socket.

The original project was for Win32, but I did get things working on Win7 as well, after some device driver signing issues that I would have to check my notes to remember :).

http://com0com.sourceforge....

Ron

Could something like an LD_PRELOAD hack work as well? Does LD_PRELOAD work on a Windows machine?

Christopher Wellons

I had to look it up and it seems that DLL injection is the Windows equivalent. I'm not sure if it can be used to override the serial port functions. It's something to try out. If it is possible, then this would be another solution, so good idea!

Michael_mhr

How can I use these functions to read data from the serial port, modify it, and then send it to a virtual serial port?
I'm very new to C and serial ports. I need this for a school project and I don't know how to do it.

Christopher Wellons

If the devices already exist under /dev, you should open(2) each, read(2) from one, make your changes, and write(2) to the other.


Why Do Developers Prefer Certain Kinds of Tools?

Gavin Black

I think you are describing a very certain type of programmer.  There are lots of developers who rarely use a shell and are comfortable letting huge IDEs manage everything for them (Netbeans, Visual Studio, Eclipse), and using WYSIWYG layout tools. I have even seen console level git diffs confuse developers used to fancier tools.

I actually feel the difference comes down to people who are comfortable (and probably learned to program) in a *nix environment and those who aren't. That might be an unfair bias, but I've never seen someone who both prefers to use Linux for development and likes to have everything hidden away by their IDE.

Christopher Wellons

I'm talking about true programmers, of course! :-)

Ron

For images I've been experimenting with keeping an original and doing the transforms via scripts. It can be a little annoying if the watermark needs to be changed on some images.


Presentations with Jekyll and deck.js

(no comments)

SSH Honeypots

(no comments)

Perlin Noise With Octave, Java, and OpenCL

Mopzar

Hi,

Nice OpenCL script!

I think you are not using OpenCL in the right way, though; check this link to see how to use the parallelism properly ;)

http://www.drdobbs.com/para...

Christopher Wellons

Thanks for the link. I must have gotten *something* right, though, since it's much faster than any CPU-only implementation.


SSH and GPG Agents

Fergus

Nice post, thanks!
For those with security concerns, you could add the --clear flag, which keeps the daemon running but assumes each new login is an intruder requiring validation.


Versioning Personal Configuration Dotfiles

Ron

And if your OS is not too stupid, it will actually implement fsync() (looking at you, OS X). The POSIX spec allows (not implies; it's explicit) null implementations of fsync().

int fsync(int fd)
{
    return 0;
}

is valid.

jasonkarns

One of the drawbacks of versioning the entire home directory has a workaround that you mentioned: you noted `$GIT_DIR`. However, you can also use a file in the repo named `.git` that contains the path to the git directory. I have said file named `git` (note the lack of the leading '.'). While the file is named `git`, the home directory is not considered a git repo. I simply change the filename to `.git` when I need to do any git operations, which is almost never thanks to the following...

I have my actual git_dir cloned as ~/dotfiles. Within dotfiles, I have the core.worktree set to `../../`. Thus the repo is ~/dotfiles, but the dot-files themselves are checked out to $HOME. This allows me to avoid the problems you mentioned above. My $HOME is not considered a git repo, so no inadvertent git commands are possible. (Git commands must be executed under ~/dotfiles to apply to the repo). The only two scenarios that don't handle the worktree setting are the Mac gui tool GitX and the use of the git-submodule command. (The only submodule is vundle for bootstrapping my vim plugins.) In order to run either of these two commands, I simply rename ~/git to ~/.git (as described above) and presto.

While it's a bit more tedious, I enjoy the fact that any changes or *new* files in my home directory are picked up by my dotfiles repo.

Patrick

I found gnu-stow to be the perfect program for managing the symlinking of dotfiles from a repo...


Publishing My Private Keys

Philip Jägenstedt

Very interesting. In the year since you wrote this, have you had any reason to reconsider?

Christopher Wellons

Over the last year having my signing key available whenever I need it has been very convenient. Being part of my dotfiles repo, it's automatically present in my live system, which is probably the safest place for me to use my key -- a fresh, read-only, throwaway system that doesn't write to any permanent storage.

Security-wise, nothing has changed. Brute-forcing my passphrase is still much more expensive than other possible attacks, like compromising one of my computers with a software or hardware keylogger. Occasionally someone opens up an issue on GitHub to warn me that my private keys have been committed, but I just redirect them here.

Philip Jägenstedt

Cool. I found this while trying to find sane ways to store secret information publicly accessible, but reckon that unless I need the GnuPG part (haven't used it in years) then just encrypting it using AES256 will be simpler... although I suppose the --s2k-count makes your solution more expensive to brute force.

Krishna

Hello. Thank you for this article. I followed it and ran this command over my 4096-bit RSA key:
gpg --s2k-cipher-algo AES256 --s2k-digest-algo SHA512 --s2k-mode 3 --s2k-count 65000000 --edit-key keyid
I entered 'passwd' for a new passphrase, then quit the prompt with 'quit' and gave 'y' to save the changes.

However, the --list-packets output doesn't show "SHA512 protection"; it still says "SHA1 protection". Can you please tell me where I'm going wrong?

Christopher Wellons

I *think* "SHA1 protection" is some kind of integrity check. It's not part of key derivation. The important part is the "hash: 10" describing the key derivation algorithm. This means it used algorithm number 10, which is listed as SHA512 in RFC4880 (section 9.4). Before you edited the key, this was probably "hash: 2".

https://tools.ietf.org/html...

The "algo: 9" part refers to "AES with 256-bit key" (section 9.2), the algorithm encrypting your private key.

Krishna

Thanks, the output has "hash: 10" and "algo: 9".

I have another question: did this method also change the settings for both my signing and encrypting keys, or just for the encrypting key alone?
Thank you!

Christopher Wellons

That's something I'm not sure about. I think it does both, but you'll need to inspect --list-packets closely to be sure.

Krishna

Hi, I tried inspecting list-packets, but I couldn't figure it out yet. Since you published your secring.gpg, I think it is not safe until you know if both keys go under the same....

simonszu

Hi,

I have stumbled over this post while searching for a convenient way to sync my GnuPG details. Since the .gpg files are encrypted, git treats them as binary files, so whenever I have merge conflicts I cannot properly merge them. I thought you might have the same problem, and maybe a solution for it. The only thing I came up with is "push and pull the repository consistently when you start or leave a session at a certain computer," but maybe there is something better?

Christopher Wellons

Sorry, I don't have any solution for this. I just don't check in any changes to my keyring. One, because it's a pain to merge as you're finding. Two, it makes it a little harder for others to get a list of my contacts. I doubt GnuPG will ever make a version-control-friendly keyring format because they really don't want people using it this way.
The good news is that, for better or worse, the web of trust captures the most important information in my keyring, outside of my private key, that is. If I trust a key, I sign it and publish my signature to a keyserver. Later on when I need to validate that key again (on another machine, etc.), I can fetch back my own signature. The WoT is my cloud backup.

Pablo Olmos de A. C.

Since GnuPG 2.1[1], there's no `secring.gpg`. How can I achieve somewhat the same results?

[1]: https://www.gnupg.org/faq/w...

Christopher Wellons

I haven't had to deal with this yet, but I believe everything here should work the same. When GPG does the one-time update to the files in your .gnupg/ directory, you will need to inform Git about the change. The constraints on things like the number of iterations and the hash algorithm selection are set by the OpenPGP format, not by GnuPG's internal format.
However, unless I'm missing something, I don't think this issue applies to GnuPG "classic" (i.e. 1.4.x)? I couldn't find much information about it. My own use of GnuPG is limited to the "classic" version.

Pablo Olmos de A. C.

Hm, I don't think so. The edit-key parameters look like they are being ignored, since the keys now live in the private-keys-v1.d directory. After doing this and changing the passphrase, the "key" in that directory has the same checksum (which makes me believe the key hasn't changed at all) and is exactly the same file as before, right? AFAIK, if the key itself is encrypted, the key file should change along with the passphrase?

Also, list-packets doesn't work with those files, so I don't know which algorithm was used.

Christopher Wellons

Ok, so I just grabbed the latest "modern" version of GnuPG (2.1.1) to have a look at what changed. I'm not entirely happy with what I'm seeing.

To inspect your secret keys regardless of how GnuPG is storing them internally, export them (--export-secret-keys) and run list-packets on the output. That will put the keys into the OpenPGP format for list-packets. On 2.1.1 you'll have to enter your passphrase to do this because the keys are decrypted and re-encrypted on the way out.

Second, there's a bug in the latest release of GnuPG 2 that's causing it to ignore all of the --s2k* options, even when exporting keys. This is pretty serious, so I'll report it upstream. Perhaps these options are now meant to apply to GnuPG's internal encryption only, but I'm not seeing evidence of this. Either way this is bad.

When I use edit-key, all the files under private-keys-v1.d/ change as expected, so I don't know why yours aren't changing. Even when the s2k parameters are untouched, the salt changes anytime the passphrase is changed with passwd, even (especially!) when the same passphrase is entered again.

The good news is that 2.1.1 uses AES-128 for s2k-cipher-algo and a high number of s2k iterations by default! Neither is the strongest available, but it's good enough. The real problem is that the default choice of s2k-digest-algo is still SHA1 and the user can no longer change it. If you're using GnuPG like I do, I'd stay away from GnuPG "modern" (2.1.1) until this is fixed.

Pablo Olmos de A. C.

Ok, sorry, you are right, the checksums actually change, even with the same passphrase. I don't know why it was different before; when I was testing I had more than one directory.

You are right about list-packets; it works on an exported armored key:

iter+salt S2K, algo: 7, SHA1 protection, hash: 2, salt: xxxx

But the passphrase was changed after the s2k* options, so you are right that they are being ignored. Can you tell me where you reported this so I can follow it?

Christopher Wellons

Sorry, I should have linked this when I created it:

https://bugs.g10code.com/gn...

cryptoluks

any updates on this? Is there a way to force the agent to change the s2k export options?

Christopher Wellons

Nothing new as far as I know. However, the "legacy" version of GnuPG, which is what Debian and its derivatives still use for /usr/bin/gpg, works correctly with these s2k options. So if you stick with that version, everything here will still work as intended.


Moving to Openbox

Andrew Stine

Neat. I should try that trick with feh.

wet sock

I guess if I had been a KDE user I would have tried XFCE as a lighter alternative. I have never used either of them, though. I myself went to Openbox as a lighter desktop, but then decided it was a bit heavy too and tried DWM and Snapwm. DWM was great, but Snapwm is just a tad lighter, and that is what I use now.


Literal Arrays and Vectors in Lisp

(no comments)

A Few Tricky C Questions

J B

The people most skilled in the use of a system are the ones that know every obscure way to break it.

Nidhy

The result of the program with the register variable depends on the compiler used. Some compilers only produce a warning.

Christopher Wellons

According to the spec this should be an error because this situation is not permitted.

"[W]hether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register may not be computed"

"The operand of the unary & operator shall be either a function designator or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier."

http://flash-gordon.me.uk/a...
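
A minimal (deliberately non-conforming) example of the constraint being quoted; whether a given compiler rejects it outright or merely warns is exactly the point of the question above:

#include <stdio.h>

int main(void)
{
    register int x = 42;
    int *p = &x;   /* constraint violation: the address of an object
                      declared 'register' may not be taken, so a
                      conforming compiler must issue a diagnostic */
    printf("%d\n", *p);
    return 0;
}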

aswanth b

Thank you.......
And please add more like this.......

aman punj

I want to print the whole array, but to print it I call a function with two parameters, i.e. a pointer and an int size.
Instead of passing the base address in the pointer, I pass the address of an arbitrary element in the array.
How do I print the whole array?
Will you please help me with this?

Mazhar MIK

Good collection


Viewing Java Class Files in Emacs

Andrew Hyatt

FYI, I wrote a java decompiler as part of the JDEE project awhile ago: http://jdee.svn.sourceforge...

Very slow compared to what you can do in Java, but it does work.

Christopher Wellons

Wow, impressive disassembler. Is that how JDEE learns semantic information about classes? This could be really handy. Currently my java-mode extension scans Javadoc output to learn about the classes rather than parse the class files themselves.

Andrew Hyatt

This particular class was replaced with a Java-based disassembler, so I don't think it is currently used for anything right now.

Oleksandr Gavenko

Here is https://github.com/hiredman... and http://www.bytopia.org/2015...

Both use find-file-hook. file-name-handler-alist is much steeper ))

Christopher Wellons

Thanks for the link!

Jonathan Gibbons

If it helps, javap will accept a jar: URL to a .class file in a .jar archive.

Christopher Wellons

Oh, good thinking. If I ever revisit this problem I'll keep that in mind.


Programmatically Setting Lisp Docstrings

(no comments)

The Physical Analog for Encryption is the Hyperdrive

(no comments)

Switching to the Emacs Lisp Package Archive

Andrew Stine

The MELPA "link s-exp" setup reminds me of the old Common Lisp package "manager" ASDF-Install, where links to repos were inserted into a publicly editable wiki. There were issues with that approach, needless to say. This one seems to at least be a little more secure, so I might look into swapping Marmalade with it in my Emacs setup.


Elisp Unit Testing with ERT

(no comments)

simple-httpd and impatient-mode

caoyuanqi

I think this mode is amazing; I use it a lot. But recently I found that it can no longer recognize changes to the CSS file. I ran httpd-start and enabled impatient-mode in both the CSS and HTML buffers. When I change the content of the HTML it works fine; when I change the CSS file it doesn't. I can open a URL like http://localhost:8181/imp/live/style.css/ and it will refresh, but not for the linked HTML file.

Christopher Wellons

I wonder if it's a caching issue. My memory is really rusty on how this stuff works in impatient-mode, but I remember there being an imp-related-files buffer-local variable that's supposed to keep track of this. If you're willing to get your hands dirty to figure this out, in the CSS buffer check if this variable lists the HTML file. If it's not, then somehow it failed to detect that these buffers are related. If it is, then it's some caching issue.


Implemented Is Simple Data Compression

Gavin Black

Blog get hacked? Failed AI experiment? Late onset schizophrenia? Either I've finally lost the remainder of my sanity or this post makes absolutely no sense :P


Markov Chain Text Generation

JKIrchartz

Markov is fun! I was going to make a twitter bot, that generated random text from wikiquotes, based off my @yinzbot code...

csar

Yes, it is fun :) I forked the git repo for some modifications. Now you can provide your own "random" function (this is for using a seeded random generator, so you can reproduce the results). And I'll add a feature to generate not sentences from words, but words from letters (like name generation). It is available at https://github.com/csarn/ma....

Christopher Wellons

This is a good idea. If all you're interested in is selecting a seed, you can easily seed Emacs PRNG by passing it a string.

(progn
  (random "my seed")
  (markov-text-generate 10))

On my 32-bit laptop running Emacs 24.3 I get this string:
"Cards, I glanced at Sola, who had already raised the skirts
of Mr. Jaggers's"


Fractal Rendering in Emacs

Davef

cool

Bojohan

There's an "svg-clock" package in ELPA.

Christopher Wellons

 Wow, that's great! Someone's already way ahead of me.

bin

awesome!

emacsomancer

I want Penrose tiles in Emacs.


Revisiting an N-body Simulator

(no comments)

Elisp Recursive Descent Parser (rdp)

csar

Wow, nice work! I'm still a bit hesitant to implement a parser on my own, so I think I am going to use this :) Anyway, you can add another "fix typo" commit to the repo: in the comments of rdp.el you call the functions "rpd-parse" and "rpd-parse-string".

Christopher Wellons

Thanks, I just fixed the typos. I kept making that mistake. If you can show me I'd like to see what you make with it!

csar

I'm writing a program to simplify logical expressions. I started in Common Lisp, but hopefully it will be easy to port to Elisp. So far I have the simplification part, which needs a lot of refactoring before I can show any of it in public. But it should be able to parse those expressions out of strings; that's where your rdp might help.


Programs as Elisp Macros

jonnay

This is super cool! What about syntax like (git :checkout branch) or (ls :-l file)?

Christopher Wellons

Yeah, keywords would make for a good API. However, if you want an argument that starts with a colon, you need some way to express it. For example, deleting remote branches with Git requires an argument starting with a colon.

Klaus

Somewhat offtopic:
The definition of make-shell-macro is a nice example of why I find the default indentation rules abhorrent...

Any idea how that can be improved? Using `-*- lisp-indent-offset:2 -*-' works out of the box, but it again results in unfavorable situations such as weird indentation of `let' bindings.

Christopher Wellons

Yeah, it moves towards the right pretty quickly. Sorry, I don't know a good way to deal with this other than to restructure the code to avoid it. I've certainly restructured functions just to work around indentation issues, and if I were to rewrite make-shell-macro today (i.e. 6 years later), I'd structure it differently for this reason.

An especially annoying case is call-process-shell-command. When indenting arguments, the next level of indentation is based on the length of the function name, so long function names exacerbate the issue. On top of that, this function's arguments also tend to be long, so I always seem to run out of horizontal space.

But the most annoying of all is the behavior of square brackets (vectors) with emacs-lisp-mode's indentation. These indent like lists of code (e.g. function calls, etc.), but vectors (should) only hold data, not code, so this is inappropriate. It makes formatting s-expression data structures containing vectors annoying.


Emacs Abnormal Termination

(no comments)

Emacs visual-indentation-mode

Ahmed Fasih

Huge noob question (sorry). I cloned your .emacs.d repo and then added your visual-indentation-mode as a submodule into it. I see that ~/.emacs.d/visual-indentation-mode is in my emacs load-path, and visual-indentation-mode shows up and is correctly identified via M-x locate-library, but as I type "M-x visual-indentation-mode" emacs says "[No match]" and can't find it unless I explicitly load-file its .el file. Is this a subtle git issue or a not-so-subtle emacs issue? Many thanks for this and the excellent .emacs.d repo, I'm enjoying both.

Christopher Wellons

Yeah, this is one of the confusing things beginners have to overcome once they start manually configuring packages. The behavior you desire is called autoloading.

The load-path is where require looks when asked to load a package. Being in the load-path by itself is not enough to make a file's functions available. It still needs to actually be loaded.

At this point you have a couple of options. One is to simply have your configuration load the package at startup. You'd add this to init.el.

(require 'visual-indentation-mode)

Or, if you're concerned with keeping the Emacs start time minimal, set up an autoload. Nothing is actually loaded until you attempt to enable visual-indentation-mode.

(autoload 'visual-indentation-mode "visual-indentation-mode" nil t)

When you install a package through package and ELPA, an autoloads file is set up and loaded automatically. If visual-indentation-mode was installed this way, you wouldn't have to do either of the above. You can manually create an autoloads file (just once),

(package-generate-autoloads
"visual-indentation-mode" "~/.emacs.d/visual-indentation-mode/")

But in order to use it you would need to load the autoloads file yourself anyway, so there's not much advantage to doing this -- at least not for this particular package.

(require 'visual-indentation-mode-autoloads)

I hope that helps.


Debian Bugtracker Data

(no comments)

Skewer: Emacs Live Browser Interaction

Marco Paolo Valerio Vezzoli

truly awesome!
Next time I have to develop something with a web interface I will check it!

boothead

Would there be an easy way to hook this up over a WebSocket?

Christopher Wellons

WebSockets are a possible future for skewer-mode. Someone's already written a library for it so half the work is done. Personally I'm still waiting for the dust to settle before I rely on WebSockets for anything: better browser and proxy support. Right now skewer-mode works with any browser that supports XHR, and I'm unaware of a browser that supports JavaScript but not XHR.

Since skewer-mode is really only intended for local use, the overhead from the XHR long polls are negligible. There doesn't seem to be any advantage for WebSockets in this case.

boothead

I was thinking of the ability to do bi-directional stuff, but long polling fills this requirement too I guess.

BTW, how do you actually use skewer? I looked at your boids example, but I couldn't see where the skewer.js is being included or the connection to emacs being opened?

Christopher Wellons

Skewer is a servlet for one of my other Emacs packages: simple-httpd. That takes care of all of the network connections for Skewer.

None of Skewer's files are served directly. The file skewer.js is served by the servlet installed at /skewer. There's a hook for injecting extra code in here, for use by extensions to Skewer, so the file isn't necessarily served literally. The file example.html is served by /skewer/demo, which will automatically be visited by M-x run-skewer.

The boids repository doesn't actually pull in Skewer on its own. You'll need to either add the script tag yourself or use the userscript to inject it.
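
For illustration, manually injecting it might look something like the following, assuming simple-httpd is listening on its default port of 8080 (adjust the URL to match your httpd-port):

// A minimal sketch: load Skewer's JavaScript from the /skewer servlet
// mentioned above into the current page.
var script = document.createElement('script');
script.src = 'http://localhost:8080/skewer';
document.body.appendChild(script);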

quasi

could you elaborate? Or simply, could you possibly make a small tutorial on using skewer on your own files (or setting up the boids demo)? I could not really understand how to use it except for the REPL.

quasi

never mind :) got it. I had to mount the web root of the httpd server and lo and behold I was done. haha. thanks for this awesome thing. SLIME is the best and I shall miss it less with this mode.

VincentToups

Sweet!  I'm going to use this to enable interactive development with Gazelle! (https://github.com/VincentT...

Christopher Wellons

Definitely interested in seeing this. I found out about Gazelle when you posted it in r/lisp in December. I thought it would be really quick to hook up to Skewer, but I only saw functions for compiling files, nothing for compiling from strings or s-expressions.

Note: Disqus included the closing paren on your link, so it's broken.

jasonm23github

Great work Christopher, just one point: the httpd processes aren't linked to buffers, so closing *httpd* (or is it *http*?) doesn't affect the process, and the same goes for skewer-repl.

Seems like a stupid question, but how do we quit?

Christopher Wellons

Good idea. Perhaps I should associate the server process with the *httpd* buffer. Right now you can stop the server with M-x httpd-stop, the opposite of httpd-start.

There is no single process for skewer-repl to be associated with. That buffer is just a receiver of messages from various short-lived TCP connection processes. Because of long-polling and HTTP keep-alive, even after shutting down simple-httpd you may have lingering access to the browser with skewer until that particular TCP connection dies. You won't be able to connect any new pages to Emacs with the server down.

jasonm23github

Oh, when I did M-x list-processes, skewer-repl showed as an orphan (buffer-less) process (the repl buffer was already killed).

Christopher Wellons

That's a consequence of building on top of comint-mode. It requires some process to exist, so I create a dummy pty process for that task. IELM does the same thing, but it uses hexl.
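
Roughly, the trick looks something like this (a sketch, not skewer-repl's actual code; the process name and program choice are illustrative):

;; comint insists on a live process in the buffer, so attach an inert
;; one. IELM does the same thing with the bundled hexl program.
(require 'comint)
(unless (comint-check-proc (current-buffer))
  (start-process "dummy" (current-buffer) "hexl"))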

However, it looks like skewer-repl doesn't properly clean up when the buffer is killed. I'll have to investigate that. Thanks!

jasonm23github

cheers, hth

Piotr Pałek

Hey, maybe I'm a little late with my comment, but do you think it would be possible to develop ExtJS apps with skewer? I mean something like changing a panel and only reloading the code necessary, so you don't have to reload the whole page to see the changes. It would probably need a lot of integration with the framework's code, huh?
PS. I was also thinking about developing for ExtJS but with the new MVC model.

Christopher Wellons

As far as I know, ExtJS can do AJAX just the same, so it should work fine with Skewer. See the CORS/bookmarklet section of the Skewer README. You just need to add a script element with a src pointing to Emacs.

Murph

Is there some reason that skewer-mode wouldn't work with Emacs 24.3.1? When I do M-x package-install ENTER skewer-mode, I only get the following output to the minibuffer: Package `emacs-24.1' is unavailable.

I put an issue on the Github page as well, in case this is better dealt with there.

shevchuk

Does skewer-repl support popping previous / next inputs (like slime-repl-previous-input) ?

Christopher Wellons

Yes, skewer-repl is derived from comint, so just use the comint commands (comint-previous-input, comint-next-input, etc). I've mapped these to the arrow keys in my config for all comint modes.
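
The exact config isn't quoted here, but such a binding might look like this, applied to all comint-derived modes through comint-mode-map:

;; Map the arrow keys to comint history navigation.
(require 'comint)
(define-key comint-mode-map (kbd "<up>") #'comint-previous-input)
(define-key comint-mode-map (kbd "<down>") #'comint-next-input)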

shevchuk

great, thanks a lot!

shevchuk

Is there any chance of seeing auto-complete in the skewer repl?

Christopher Wellons

The closest thing Skewer has to autocomplete is ScottyB's ac-js2, but it doesn't work in the REPL. The main obstacle for this is probably the "js>" prompt. js2-mode wants to parse the entire buffer as JavaScript, but those prompts get in the way. The easiest thing to do would probably be to change this to some sort of valid JS, but since this isn't an itch for me, I haven't spent any time looking into it.


JavaScript's Quirky eval

anonymous

The perl example isn't properly quoted. If you escape the dollar, bar is set to 5 as expected.

Christopher Wellons

You're right! It's been a couple of years since I've touched Perl so I completely forgot about string interpolation. This behavior makes a lot more sense. I just fixed the post. Thanks!

anonymous

Sorry, that's not what I meant...

sub foo {
    my $bar = 10;
    eval "\$bar = 5";
    return "$bar";
}
print foo();

That prints 5 for me. So from my understanding (I didn't read the perl docs), eval *does* allow me to set (albeit not declare) variables...


JavaScript Strings as Arrays

Gavin Black

You'd love typeclasses then :) All the generalization, without crossing your fingers that a function doesn't do anything that ruins it. 

Luke

Heh, I was about to say you could "inject" these methods into the String prototype, but yes - it is something that will make future maintainers cranky. :)
 


JavaScript Debugging Challenge

(no comments)

Raising the Dead with JavaScript

(no comments)

JavaScript Truthiness Quiz

(no comments)

A Use For Macrolet

(no comments)

Elisp Weak References

(no comments)

How a simple-httpd Vulnerability Slipped In

(no comments)

An Emacs Pastebin

Lorenz

Cool

Luke

Thanks for the link. The series ain't over yet though. There are at least two more parts to it. I just took a break so as not to inundate people with PHP. :)

Btw, I never cease to be amazed at the flexibility of Emacs. I think Steve Yegge once described it as an ELisp interpreter which coincidentally also happens to be pretty good at editing text. :P


(no comments)

Clojure and Emacs for Lispers

Lorenz

Thanks

Ahmed Fasih

Christopher, when I open a .clj file with your current .emacs.d (from Github), its parens and brackets are all white, until I do "M-x clojure-mode", i.e., paren-face seems to not recolor parens and brackets upon clojure-mode first initializing. This doesn't seem to be an issue with elisp files---their parens are grayed out as soon as they open.

(PS. Thanks for the autoload tip!)

Christopher Wellons

That's just a small parenface bug. The first time the mode runs, parenface adds the new highlighting rules to the mode's syntax table. Emacs unfortunately doesn't make use of the new rules until the next time the mode is activated. It hasn't bothered me enough to invest time in fixing it. I run Emacs for weeks at a time (--daemon), so it's a fairly rare annoyance.

Christopher Wellons

Ahmed, could you e-mail me at my home e-mail address? I just realized I don't have your home e-mail address and I have something to show you.


Flu Trends Timeline

Guitreize

Could you explain what colors mean what?
Because as far as I'm concerned, I could be looking at a US map colored at random.

Christopher Wellons

As far as I can tell, the exact meaning of the trend numbers is a secret that Google holds closely. It's some normalized count of queries on the flu. So take the coloring as relative to other states and dates. Also, note that the color is scaled by the logarithm of the raw data.

For example, for last week the value for NY, a state of 20 million people, was 12k. If that's the number of total flu queries it seems like a low-ish number, but maybe not. That's an entire order of magnitude more queries than one year ago. Strangely, 11k of those were from NYC itself, so it's as if the rest of the state barely uses Google.


Turning Asynchronous into Synchronous in Elisp

(no comments)

Parameter Lists in Common Lisp and Clojure

Foo

(defn make-name
  ([first last]
     {:first first, :last last})
  ([[first] [last]]
     {:first first, :last last}))

Above already fails. The patterns are different, but we are not allowed to use the same arity. Huh? 

Christopher Wellons

Yup, overloading is by arity only, not by pattern. When I was first learning Clojure I had hoped I could do this, too. It would be neat to dispatch Haskell-style, like x:xs. I can understand why this sort of thing isn't provided so it doesn't bother me.

Gonzih

It's not pattern matching, it is destructuring. Overloading is done by arity only. If you're looking for pattern matching, see core.match.

Anders

Hi, I'd be really grateful if you could explain the syntax of the height parameter in the sample that you gave:
(defn height-opinion [name & {height :height}]

I don't understand how this function can be called like (height-opinion "Chris" :height 6.25). I would expect (height-opinion "Chris" {6.25 height}).

Anders

Sorry, (height-opinion "Chris" {6.25 :height}).

Christopher Wellons

The ampersand in the parameter list says to treat all of the remaining arguments as "rest" arguments, to be passed to the function as a collection. In this case that collection will be a map, but normally it will be a sequence of some sort. The parameter that accepts the "rest" arguments is a map destructuring bind specification, which will disassemble the "rest" collection as though it were a map.

z0ltan

Short and sweet. Well done, my good man!


The Limits of Emacs Advice

Hero Workshipper

"I'll figure something out."

So brave! A real hero!

Jason Aeschliman

use an idle-timer

Christopher Wellons

Yeah, that's another possibility but it's kind of messy. I need to worry about cleaning up the timer when the buffer is killed and it would need to poll frequently in order to be useful. I prefer to avoid polling when I can.

yuriko

it does help me

yuriko

could you teach me more about other types of advice, such as around advice? or introduce a little more about the advice system, please?

Christopher Wellons

The "Advising Emacs Lisp Functions" section of the Emacs Lisp Reference Manual covers most of what there is to know about advice. The manual comes with Emacs (M-x info). It explains how advice works and provides examples to get you started. Note that, as of this week with the release of Emacs 24.4, the defadvice macro is considered nearly obsolete, replaced by the very similar advice-add function.

PythonNut

What about the atomic sledgehammer known as post-command-hook?

Christopher Wellons

I don't remember if I tried it at the time, but it looks like post-command-hook isn't invoked by narrow-to-region or widen.


Live CSS Interaction with Skewer

top essay writing services

Learning the advanced methods of computer programs with the use of the CSS is really useful to us web developers. This helps us to do different platforms in computers that make our designs to be more responsive when it comes to computer designs and coding.


Web Distributed Computing Revisited

Gavin Black

Very cool, I remember doing those initial setTimeout experiments and how lackluster they were. Now we can have purely browser based botnets ;)

Ahmed Fasih

This is a really neat use of browsers, but did you tag this as 'emacs' because you attached the V8 engine to emacs?

Christopher Wellons

Oops, well at least this tells me people really do pay attention to tags! I added those tags before I started writing the post, and I was originally intending to compare and contrast with this old Emacs post in addition to the other JavaScript post. Thanks, I just removed the "emacs" tag.

Ahmed Fasih

I came looking for ars emacsica but your enthusiasm for Javascript, so generous in this post, is infectious. Thank you!

Ahmed Fasih

Could someone have undermined you by editing the Javascript before running it to send back false results (either obviously false or subtly incorrect)?

Christopher Wellons

That's a general concern for this type of system, but it wasn't a problem here. When the client found a higher-scoring key, it only reported the key itself, not even bothering to report the score. Upon receiving a key, Emacs would check it for validity (contains each letter of the alphabet exactly once), compute its score, and then decide what to do with it. So, conveniently, this particular problem was impossible to cheat this way.
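
A minimal sketch of that validity check (a hypothetical helper, not the post's actual code) could be as simple as:

;; A reported key is accepted only if it contains each letter of the
;; alphabet exactly once.
(defun key-valid-p (key)
  (equal (sort (string-to-list (downcase key)) #'<)
         (number-sequence ?a ?z)))

(key-valid-p "qwertyuiopasdfghjklzxcvbnm")  ; => t
(key-valid-p "qqwertyuiopasdfghjklzxcvbn")  ; => nil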

In my last experiment this was an issue. I had clients doing primality tests and there was no way to verify any result without repeating the computation.

Ciurmy

Hi, a solution to the problem has been implemented in the Ciurmy platform.

If you are interested in JavaScript distributed computing, look at www.ciurmy.com, where users can sell the computing power of their devices, using the browser for the calculation.
We are searching for JavaScript coders to implement algorithms in JS for the analysts and makers that need computing power.

Gianluca Conti

Alex

There is one project in which everyone can participate to take one step toward the Unified Field Theory, which Einstein tried to find.
https://zlelik.blogspot.com...


Emacs Javadoc Lookups Get a Facelift

Brian

Nice work. This really makes Javadoc way more accessible from Emacs. 


How to Make an Emacs Minor Mode

Dan Doherty

Christopher,

Thanks for this nicely done tutorial. Just what I needed. Nice formatting too.

Regards,

psachin

Hi Christopher,

Indeed a nice tutorial which I got to read just when I needed it. I wrote some network-status elisp code which appends an up/down arrow to the mode line depending on the network connection. You can browse my code at https://github.com/psachin/.... I want to make this feature into a minor mode. My code uses 'make-network-process' to check the network connection, and every time I have to kill the process so that there won't be a queue of processes. Now my question is: how can I kill the process when the mode is disabled (toggled)? So far it just removes the ':lighter' string from the mode line but does not stop the process. Any pointer will be of great help.

Thanx.

Christopher Wellons

The body of your minor mode definition is run both when the mode is toggled on and off. In that body you want to check if your mode is being disabled: the variable of the same name as the mode will be nil. If so, kill the network process at that time. If it's being enabled, this is where you start the network process.

I'm guessing you're intending this to be a global minor mode rather than buffer local, so that's all you really need to worry about. If your mode was buffer local, you'd also want to put a cleanup function in the buffer local kill-buffer-hook. Minor modes are *not* toggled off before the buffer is killed (this was probably a design mistake), so you would need an action to perform cleanup. But, again, if it's a global mode then this step isn't needed.

If you're really worried about correctness, you'll also want to support unloading. You'll want to kill the network process if your feature is unloaded, probably using a _feature_-unload-function.
http://www.gnu.org/software...
Honestly, though, I don't think anyone really uses feature unloading, so I wouldn't worry about this detail unless it sounds interesting.
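
To make the first point concrete, here's a rough sketch of that pattern (the names and the probe are illustrative, not the actual network-status code):

(defvar netstatus--process nil
  "Process used to probe the network connection.")

(define-minor-mode netstatus-mode
  "Toggle a network status indicator in the mode line."
  :global t
  :lighter " net"
  (if netstatus-mode
      ;; Just enabled: start the probe process.
      (setq netstatus--process
            (make-network-process :name "netstatus"
                                  :host "example.com"
                                  :service 80))
    ;; Just disabled: clean up the process.
    (when (process-live-p netstatus--process)
      (delete-process netstatus--process))
    (setq netstatus--process nil)))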

psachin

I was able to implement my script into a minor mode.

Thanx Christopher.


JavaScript "Map With This"

mgsk

"However, I sometimes find myself wishing there was a map-like function that used the element as the context of the function. Then I could apply a method to each of the elements in the array rather than be limited to functions."

Could you clarify this? What do you mean by context?

Christopher Wellons

When a JavaScript function is invoked, all of its explicit parameters are bound to values (the provided arguments) and then the function's body is evaluated. There's also an implicit parameter, this, sometimes called the context of the function. Under normal function invocation, the context is the global object (window) or undefined (in strict mode). When a function is called as a method -- that is, it's called as the property of an object -- it is automatically bound to that object during the evaluation of the function's body. This happens even for "static" methods like JSON.parse() because JavaScript is a prototype-based language.

The first argument to the Function methods call() and apply() is the context to use for that function call. This context can be locked in for a function permanently, no matter how the function is called, using the bind() method.

function foo(x) { return this + x; }
foo.call(10, 2);  // => 12
var bar = foo.bind(4);
bar(2);  // => 6

What I was saying was that I wanted a map method that bound this to each element rather than the first explicit parameter.
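
A minimal sketch of that idea (mapThis is a made-up name, not a real API): like map, but each element becomes `this` for the given method.

function mapThis(array, method) {
    return array.map(function (element) {
        // Invoke the method with the element itself as the context.
        return method.call(element);
    });
}

mapThis(['foo', 'Bar'], String.prototype.toUpperCase);  // => ['FOO', 'BAR']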


Memory Leaks with XMLHttpRequest Objects

Brian

Great explanation. Thanks for doing this. Now I have somewhere to send people when this question comes up!

Ben

This post is very speculative. I can't get this to show ... at least in Chrome.  Theoretically, if what you're saying is true, I should be able to go into the developer console and enter this: 

(function () { for(var i = 1000000; i--;) { new XMLHttpRequest(); } })();

And see my memory footprint spike and well... stay spiked. It goes up, for just about as long as the function takes to execute.

I think what you're saying *might* be true, but only on a browser per browser basis... and so far I haven't seen anything non-anecdotal supporting your theory.

Christopher Wellons

Yeah, instead of actually digging into browser source code I'm speculating and treating it like a black box. It's lazy but a lot more fun. :-) I was able to demonstrate the XHR specialty, so it proved to be useful anyway.

The memory spike itself is the artifact I was demonstrating. Substitute a different type, like my Point prototype above, and you won't see any spike in Chrome. The heap never needs to grow like it does with the long-living XHRs. I almost crashed my computer again just trying to double-check this with XHRs.

Roland

I believe that if you remove the load event handler or delete xhr.onload just after calling the callback you'll allow everything to be garbage collected. This will allow the garbage collector to remove the event handler function thus allowing it to clean up the xhr reference.

Greg Reimer

Couldn't it be that `xhr.onload` is a setter, and the function assigned to it is simply stored in some asynchronous queue, thus preventing GC since it can reach `xhr`? As a trivial example:
function Foo(){
    this.__defineSetter__('onload', function(f){
        setTimeout(f, 1000);
    });
}

(function(){
    x = new Foo();
    x.onload = function(){...}
})();

// x still lives!!

king

Great! Nice line of thought.

Timothy

Finally someone can explain this bug. But I'd love to see a more detailed explanation of your pooling solution.

Ross

Is the memory leak on this page intentional? :)

Barney Carroll

Fantastic insights. This only became a problem for me recently when we did a lot of work on caching for huge time series databases. Despite far-future expiry cache control headers and a UI which otherwise did its best to destroy obsolete references, I noticed (thanks again, chrome dev tools) *colossal* garbage collection operations on XHR invocation – and even then memory usage continues to grow until the process dies. The network tab confirms that no new requests are being made.

So do you think this is something XHR API wrappers should take care of? I'm wondering if judicious memoization could work… The only way of doing it right would be to effectively replicate the browser's internal logic as to cache control headers, which is not a task I particularly relish… But this problem is critical, so even this conceptually ugly chore might be worth considering.

Bugsy

So what you are suggesting is keeping, say, an array of already-used XHR objects, and when a new request is to be made, using one of those existing objects instead of creating a new one?

Christopher Wellons

Basically yes, or use a library that does this for you. However, this issue is becoming less and less relevant as older browsers are gradually phased out, so it may not be worth the trouble in new applications. (Example: IE7 doesn't work for the biggest parts of the Internet anymore, so there's no reason to bother supporting it and all its bugs.)
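
A minimal sketch of the pooling idea (the helper names here are made up, and real code would also want error handling):

var xhrPool = [];

function getXHR() {
    // Reuse an idle XHR if one exists, otherwise allocate a new one.
    return xhrPool.pop() || new XMLHttpRequest();
}

function releaseXHR(xhr) {
    xhr.onreadystatechange = null;  // drop the handler so nothing keeps it alive
    xhrPool.push(xhr);
}

function pooledGet(url, callback) {
    var xhr = getXHR();
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4) {
            callback(xhr.responseText);
            releaseXHR(xhr);
        }
    };
    xhr.open('GET', url);
    xhr.send();
}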

Srinu

Looks like this issue is still there in modern browsers like IE 10 & 11. We are encountering a memory leak in both of these browsers, but not in Chrome and Firefox. The memory snapshots (IE) do not show any memory leaks, but the IE process memory simply increases and eventually it crashes after 3 to 4 rounds of ajax calls.

Hervé Stern

I am facing the same problem.
Did you find any workarounds?
Thx

Barney Carroll

I encountered this problem when making dozens of XHR requests synchronously. The most modern browsers would intermittently freeze as sudden spikes occurred. The solution that worked for me was to wrap my XHR calls in a function which kept count of the number of currently active requests. When that number exceeded 50, I would return a promise, and only bind that promise to a new XHR request once the concurrent requests had diminished (running a check whenever a request resolved).

Christopher Wellons

Were you making these requests to different domains? By default, browsers will limit the number of parallel connections per server to between 2 and 8 (the HTTP/1.1 spec explicitly says 2; adjustable in Firefox in about:config; and hardcoded to 6 in Chrome). After that limit they *should* get queued internally and even pipelined within individual HTTP/1.1 connections. If yours were all to the same domain and the browser was still choking, then I guess that's more evidence that XHR is a weird object!

Barney Carroll

No, all same domain. The HTTP throttling was happening under the hood, but simply initializing any significant number of XHR objects (and this is with latest Chrome & Firefox) causes huge memory consumption.

CM

When you created your instances with makeMany, they were all not yet opened, which according to the spec about XHR garbage collection, means they should get garbage collected, but maybe the browsers only implemented the XHR GC to occur *after* the request is completed? However, the usual use case is that you're only actually doing a few requests at a time, which means it would be more effective to see how long they stay around after completing. (It could be smart enough to notice that open was never called, hence waiting for the request to complete is pointless, but perhaps that's an optimization the browser doesn't make?)

What I'm getting at is that your makeMany test is not really representative of how XHRs are typically used, so maybe garbage collection works fine when you use them "normally" (i.e. creating a few at a time, having them completed, creating more).

All that said, thank you - your insight and testing was very useful, and you're a good writer.

Spec here: https://xhr.spec.whatwg.org...

peter

Thanks for posting this. It was the only article I came across mentioning this strange behaviour. It has helped me to better understand why my program continued to work even though I had forgotten to retain a reference to my xhr object. The trick of removing the onreadystate handler on completion seems to make things a bit better. As can be demonstrated by removing the handler before completion, say when readyState is only 2 or 3, the callback for readyState 4 never happens, so I am guessing the object is at that point up for GC.


Fast Monte Carlo Method with JavaScript

Foo

Mostly you are testing the speed of the random number generator.

possiblywrong

I don't think the "programmer route" is lazy at all, particularly since in this case the referenced description of the "mathematical" solution is rather spectacularly wrong.  (That is, the final answer is right, but the derivation of it is essentially nonsense.)  Which is unfortunate, since this is a really nice problem.

The blog author doesn't even ask the question correctly, referring to draws of "random numbers [distributed] exponentially between [0, 1];" an exponential distribution isn't confined to the unit interval.

Christopher Wellons

Interesting. This is where I originally came across the problem and it was called out on reddit by a few people as well,

http://www.reddit.com/r/mat...

possiblywrong

Cool.  The top commenter's "volume of cube minus simplex" solution is how I remember approaching this problem... but I really like the internet_poster's elegant/short/simple functional equation solution.  Thanks for pointing this out!

Brian

Try marking the trial function in C inline. I'd hope -O3 would see the inlining opportunity on its own but maybe it didn't.

Christopher Wellons

I'm seeing no difference with or without the explicit inline. I'm also running this on a different computer now -- Intel Pentium Dual CPU T3200 -- same exact software except that it's 32-bit, and C is 3x slower than JavaScript here!

Helio Frota

Hi, cool post!

I tested with Java and got 11 seconds:

public class MC {

    static double trial() {
        double count = 0d;
        double sum = 0d;
        while (sum <= 1) {
            sum = sum + java.lang.Math.random();
            count = count + 1;
        }
        return count;
    }

    static double monteCarlo(int n) {
        double total = 0;
        for (int i = 0; i < n; i++) {
            total = total + trial();
        }
        return (total / n);
    }

    public static void main(String args[]) {
        long start = System.currentTimeMillis();
        monteCarlo(100000000);
        System.out.println(System.currentTimeMillis() - start);
    }
}

Christopher Wellons

For the record, on the same machine that I ran the other programs I'm getting 6.3 seconds for your program. Faster than SBCL and slower than C, which is where I would expect it to be.

Jon

The difference is that in the C version you have put "/ (double) RAND_MAX" in the trial function, so you are doing an extra 100 million divisions, which is expensive.

Christopher Wellons

Great point! I assumed GCC would optimize this into a multiplication by the inverse, but that's not so. I just checked it now by disassembling the output of GCC 4.9.2 and it's doing a division. If I turn on -ffast-math it becomes a multiplication, shaving 8% off the running time. This makes sense because multiplying by the inverse is less accurate, due to the inverse being an extra intermediate value.

Paul Canning

I'm struggling to understand exactly what this does. What I can see is that it is adding random numbers together to get a result that is greater than 1, then giving the average count (over x number of runs) as a final result? So the number at the end can't be seconds?

Christopher Wellons

This is one of my worst articles, and I should probably put a disclaimer at the top about it, so don't put too much thought into it. I didn't do the benchmark right. The 2.7185 (i.e. close to e) you're seeing is the result of the calculation, not the running time. However the JavaScript version just happened to complete in 2.7 seconds.


Serializing JavaScript Objects

Markus Roth

This post answered the question I had in my head when I started searching the web perfectly. Thanks a lot for posting it!


7DRL 2013 Complete

(no comments)

Applying Constructors in JavaScript

Terrence Watson

Thanks. This was exactly what I was looking for.

Ray Foss

Genius way to contain the ugly of "new" and now "class" without having to resort to a full blown overhaul.


A Seedable JavaScript PRNG Sampling Several Distributions

JOE SIXPACK

Thanks :)


JavaScript Fantasy Name Generator

(no comments)

Precise JavaScript Serialization with ResurrectJS

bora

how do you handle a nested object tree? can the tree be fully resurrected?

Christopher Wellons

Yup, it will handle trees just fine. Arbitrarily nested objects are one of the primary use cases. This includes the situation where the same branch or leaf is referenced from multiple places in the tree, even circularly, since identity is maintained.


Prototype-based Elisp Objects with @

Eduardo Hao

Heya! Just found this post, although I have been lurking on your blog. It contains a lot of helpful info.

I've looked for this but found nothing; by any chance do you know if operators can be overridden? Ex:


(setf vector (@extend :_x :_y))
(setf v1 (@! @vector :new))
(setf v2 (@! @vector :new))
(setf v3 (+ v1 v2))

Thanks in advance

Christopher Wellons

Currently Emacs Lisp doesn't allow you to override functions for different types, neither with compile-time nor run-time polymorphism. It *does* support single-dispatch through the CLOS-like library, EIEIO. However, very few functions are written to support this. For the sake of performance, it's highly unlikely math primitives like + will ever be generic.

Instead you could define your own + operator as a method within @. There are a number of problems with your small example code, like the first line being "vector" instead of "@vector". Hopefully the following displays correctly:


(defvar @vector (@extend :x 0 :y 0))

;; constructor, called via :new
(def@ @vector :init (x y)
  (@^:init)
  (setf @:x x @:y y))

;; addition operator, mutates "this"
(def@ @vector :+ (v1)
  (prog1 @@
    (cl-incf @:x (@ v1 :x))
    (cl-incf @:y (@ v1 :y))))

;; export to Elisp vector
(def@ @vector :export ()
  (vector @:x @:y))

(let ((v1 (@! @vector :new 1 2))   ;; (:x 1 :y 2)
      (v2 (@! @vector :new 3 4)))  ;; (:x 3 :y 4)
  (@! (@! v1 :+ v2) :export))
;; => [4 6]

Eduardo Hao

Corrected vector to @vector

It seems like I missed the setters when reading; that's why I was using underscored variables (":_x :_y"), believing it wasn't dynamic, so I actually ended up doing something like


(def@ @fasor :x (&optional x)
  (when x
    (progn (setf @:_x x)
           (setf @:_magnitud (sqrt (+ (* @:_x @:_x) (* @:_y @:_y))))
           (setf @:_radianes (atan @:_y @:_x))
           (setf @:_grados (rad2grad @:_radianes))))
  @:_x)

^It had to update other variables based on the new value

I am making an object called a phasor [1] that is used in electronics to represent currents and voltages. It eases my calculations because you don't have to manually convert rectangular to polar coordinates and vice versa to perform addition and multiplication.

Here's my full code:
https://github.com/LaloHao/...

I will however update it to use :set instead, since it can do all I (wrongly) scripted, right?

[1] https://en.wikipedia.org/wi...

Christopher Wellons

Here are my notes, from top to bottom:

* The performance of @ is absolutely terrible, so only use it if this isn't important and you like the @ API. :-)

* Don't use load-file. The require function will already use it if needed.

* You should prefix your global definitions with a package prefix. For example: fasor-grad2rad, fasor-rad2grad. For a @ value, @fasor is already an appropriately prefixed name, though some Elisp "linter" tools might find it confusing.

* You don't need to use progn with "when". You would if you used "if" but that's what "when" is there for anyway.

* Don't make fasor-temporal a global variable. Use a local let binding. Also, don't use setf at the top level in a package: Use defvar instead.

Otherwise you've got the right idea about how to use @! Very cool.

edcrypt

Have you considered using Self-like "maps", as described by the PyPy folks here: https://morepypy.blogspot.c... ?

Christopher Wellons

That's a neat technique. I don't remember if I mention it in my article, but with @ I did experiment with caching fields/methods to cut down on the costs of dynamic lookups. It actually made performance even worse. I think Elisp is just too high-level for the serious performance issues of @ to be solved within it. Though ... perhaps the new Emacs 25 dynamic modules could be used to power something better.

I've used this map technique in another context (C, with dynamic structs) and had huge gains.


Tracking Mobile Device Orientation with Emacs

Marco Paolo Valerio Vezzoli

Nice! To me this looks interesting mainly for video games (even if someone used a wiimote to plot a chart of the holes in the street while riding a bike).


Userspace Threading in JavaScript

faruzzy

This is pretty slick! I'll try to recode it myself based on your work! Great work!


Disc RL in the Media

(no comments)

Inventing a Datetime Web Service

Marco Paolo Valerio Vezzoli

:)

Brian

Awesome hack!

Ray

So, if your single page app makes a lot of xhr requests, you could conceivably hook into the xhr object and update the server time with every request. Awesome.


JavaScript Function Statements versus Expressions

(no comments)

Load Libraries in Skewer with Bower

Sindre Sorhus

Thanks for the article. Just wanted to clear up some points.

> 113 (5%) of them have unreachable or unresponsive repositories. About half of these are due to invalid repository URLs.

That's mostly because of missing validation which is a high priority.

If you have a way to come up with a list of these it could potentially be fixed: https://github.com/bower/bo...

> 1,830 (83%) have no bower.json metadata file. This means the client has to guess at the metadata.

Bower recently renamed the metadata file from `component.json` to `bower.json`, so make sure your stats also account for the `component.json` file. In addition, a metadata file is only required when a lib has dependencies; otherwise it will use the git tag version, which ultimately makes this percentage useless.

Christopher Wellons

Good idea, I'll post a list all of those invalid URL packages in that issue.

Interesting, I had no idea it used to be component.json. The note about this was added to the Bower README just hours after I made this post, so I didn't realize this when gathering statistics. Looking at component.json is actually one of the ways my tool guesses at metadata.

The README says, "You must create a JSON file -- bower.json by default -- in your project's root," so that's why I'm assuming bower.json is required. I really wish the libraries that had reasonable endpoints would specify them in their bower.json. It's essential information for my particular situation.


Should Emacs Packages Self Configure?

Steve Purcell

A possible solution in the case of skewer might be to provide either a "skewer-setup.el" file, or an autoloaded "skewer-setup" function, which users could optionally use to perform a bunch of standard set-up. If that setup code doesn't quite work for them, they could then make their own version of it.


Skewer Gets HTML Interaction

Ahmed Fasih

The selector mechanism reminds me of enlive!: https://github.com/swannode...


A Handy Emacs Package Configuration Macro

Meatball

Nice post, only thing is that the package-helper.el link is broken.

Christopher Wellons

Thanks for the warning. The link was originally correct, but I had since moved that file within the repository. It's fixed now.

耀华 王

Seems really cool... BTW, what color theme are you using? It's so beautiful :)

Christopher Wellons

That's the wombat theme and it comes with Emacs. I have a slight eye problem that makes reading light-on-dark a lot easier than dark-on-light, so I always prefer darker themes.

耀华 王

Well, thanks for your reply. I was hanging around your blog, and I found it's quite a good place to learn some Emacs. Nice work :)


Emacs Mouse Slider Mode for Numbers

(no comments)

Long Live WebGL

Guest

Chris,

This is fantastic work! I was psyched to see this uploaded on melpa, can't wait to play with it. I already have a question, even though I haven't dug into the source yet, because I'm impatient:

- Have you considered integrating the storage format with org-mode at all? You could leverage org-feeds (a part of org core now) to store feed data, and org-sync (currently a separate library) to facilitate syncing between machines via git or dropbox.

I'm working elfeed into my .emacs this evening, I'll let you know how I fare.

Thanks for sharing this with us,
Chaz

tuxflo

Hey! Thanks a lot for that article! I searched a lot to find out why it is not possible to use the fixed pipeline stuff (which they teach us at the university) in WebGL. Now I understand things better and I will focus on learning the new API and forget about the old stuff.
Greetings,
tuxflo

Yobi3D

Thanks to WebGL, we are able to make our 3D search engine fast and interactive.

https://www.yobi3d.com

Haben Girmay

Would you please give me some hints or links for the earthquake using a shaking table? Thank you.

hi

Totally agree about low-level understanding being important! This mozilla post might be useful to others too: https://hacks.mozilla.org/2...


Life Beyond Google Reader

Josh Gunderson

I initially dug NewsBlur and paid $12 for a year (must've raised the price since then), then I grabbed a Digital Ocean sub and installed tt-RSS, which is decent, but within the past couple of weeks I've found InoReader, which is nearly perfect for me.

I've tried dozens of others, nothing else really stands out as being a replacement for me. Feedly is decent once you whip it into shape with Greasemonkey. And their Android app is really nice. The two-way sync with Reader is worth A LOT of points.

I'll continue to use Google Reader as my main reader until the day it dies, then take a final snapshot through TakeOut and move to InoReader probably.


Small HTML5 Canvas Design Pattern

(no comments)

Personal OS Configuration Live System

cal

Regarding your Firefox configuration annoyance, the user preferences file 'user.js' loads early enough to accomplish what you want.

As an example, you can prevent the Adblock first run tab with the following in 'user.js':

user_pref("extensions.adblockplus.currentVersion", "2.1");

For a 'Live OS' environment, it's a workable solution.

Christopher Wellons

Thanks for this idea. However, this still won't work due to Firefox's outdated profile concept. The user.js has to be placed in the user's profile directory. This has two issues:

The profile directory is a randomly generated directory name: it's unpredictable. I don't know where to put the user.js until *after* I've already run Firefox, which is too late.

Providing my own profiles.ini allows me to specify the profile directory. Unfortunately a directory isn't actually a real profile directory until Firefox has initialized it. As before, this means I still have to run Firefox once before I can make use of a user.js. This is, again, too late to stop the initial extension popups.

cal

In my experience, Firefox will use a preexisting profile folder for an uninitialized profile.

What I mean by that is, one could create the necessary directory structure and populate it with the customized files (e.g. ~/.mozilla/firefox/profiles.ini, ~/.mozilla/firefox/$DIRECTORY-SPECIFIED-IN-CUSTOMIZED-PROFILES.INI/user.js), presumably from the '/etc/rc.local' script. When FF runs the first time, it will simply add the other config files and miscellany to that folder which already exists. Since user.js is not one of those files, it doesn't get clobbered.


Liquid Simulation in WebGL

Ahmed Fasih

If Nvidia CUDA performance for 2D convolution is any guide to WebGL's capabilities, your underwhelming GPU throughput makes sense. [1] is one of the early CUDA SDK white papers, and the code ships with the CUDA installer. It helps explain how memory latency and architectural limitations such as the maximum number of threads per GPU multiprocessor combine to make 2D convolution, even when separable, a hard problem (I assume you used separable kernels in your code). I wish the white paper included a CPU comparison, but the code sample [2] (also included in the CUDA installer) will include a CPU version that could be tried. This CPU implementation will be used only for testing the GPU's output for correctness and is probably not as optimized as it could be, but might give you a responsible comparison for the window sizes you're looking at.

In the past I spent some time squeezing performance out of GPUs for general purpose computing, and it's hard! One of my personal goals is to understand how much of CUDA intuition applies to WebGL on CUDA-capable hardware, and what the differences are for non-CUDA-capable hardware like your tablet.

[1] http://docs.nvidia.com/cuda...
[2] http://docs.nvidia.com/cuda...

Christopher Wellons

I'm glad my conclusion makes sense. Thanks for the links. Also, congratulations, you're now my local GPU expert when I have questions! :-)

Ahmed Fasih

I also just read this detailed article on the difficulties of memory management on Android (and other mobile devices) that reminded me of this post: http://sealedabstract.com/r... (via John Carmack's Twitter feed).


Live Systems are Handy

(no comments)

Coining Autoism

Jay Dugger

This idea had a long discussion in an early Clifford Pickover book, Mazes of the Mind, perhaps?


JavaScript Function Metaprogramming

Ivan

Hi, you can also use this form of the Function constructor:
var f1 = new Function('x,y', 'return x + y');
f1(1, 2);


Ivan

Oops, sorry for the mess, this disqus thing is a strange beast when pasting. You can edit/remove the post if you like.

Christopher Wellons

Very interesting. I took a look at the spec for this just to double check. Section 15.3.2.1 describes how the arguments are collected into a string. Not only does the spec allow this to happen pretty much by accident, but it even makes an explicit note that this is allowed. Extra whitespace is therefore valid, too:

new Function('x\n', 'y', 'return x + y')

new Function(' x ', ' y\t', 'return x + y')

Cool trick.

It would also seem that my pure JavaScript implementation of Function is more valid than I thought. If I switched to a global eval it would be 100% correct.


Leaving Gmail Behind

Samask

Re: notmuch vs mu4e

Notmuch uses a very robust threading algorithm, while mu4e has a nicer Emacs interface. The former is developed by quite a few developers, while the latter is mainly the work of one. How I wish I could use the more advanced core of notmuch with the nicer interface of mu4e! Ah, choices, choices...

andschwa

You've inspired me to finally move to hosting my own mail, but with it all managed via Puppet. I've looked into it several times, and did move from Gmail to Namecheap to Zoho for hosting, but I really want to run my own. I already have my server on Digital Ocean, and it has more than enough juice to spare to be an email server as well.

Is there any way to live-test my email server without switching all my MX records? I don't want to risk losing email if I don't get it running right the first time.

NicolasPetton

This looks like a very nice setup!

One question though: what about your gmail contacts?

Christopher Wellons

I've been using it nearly a year now and e-mail has never been better! To address your question, follow through to the "Hacker's Guide" article I linked and look at the address lookup section. Daniel provides an addrlookup.c program that provides notmuch with a list of addresses harvested from my very own email archives. I never had to export contacts or anything like that since this makes it automatic.

w

One suggestion for your email address needs with your DO account would be to use your cell phone provider's email->sms gateway feature. e.g. use 9195554312@vtext.com if you have verizon. Most of the time these are receive-only email accounts and don't require anything from you. They're also usually not publicly available.

Christopher Wellons

That's not a bad idea, thanks! I've used the AT&T SMS/email gateway a bit, enough to have found and reported a security issue, but it never occurred to me to use it as a backup email address.

larryw

Glad it's working for you. I felt the same way about Google and privacy, but the reality is that a lot of the people I communicate with are still on Gmail. I did set up my own email server as well and had to immediately bulk up security (fail2ban and other tools) to combat the increased hacking attempts, and STILL got hacked by a script that managed to hijack the mail server to send spam. At that point I realized this was becoming a time sink and opened up a PAID email account with mykolab. So far it's nearly a year and NO complaints from me, and a couple of folks that I communicate with have now also switched out from gmail. Note I did NOT drop gmail; it's just used for anything that's non-sensitive or has the potential for becoming crapware (sorry).


Introducing Elfeed, an Emacs Web Feed Reader

chazu

Poopsticks, I seem to have accidentally put my comment containing glowing praise of your work into another article on your blog. Nevertheless, bravo!

Christopher Wellons

I presume you're talking about this comment.

Thanks! I was actually completely unaware of org-feed. Taking a look at it now, I think Elfeed could integrate with some of it. For example, if elfeed-feeds is empty it would use the feeds listed in org-feed-alist. It could also pass entries over to org-feed or hook into it to receive entries from it.

However I wouldn't want to share database storage with org-feed. Its format is definitely written for human consumption rather than performance. One of the goals of Elfeed is to be performant with large numbers of feeds (targeting hundreds at a time). I'll write about the Elfeed database in a future post soon but, in short, it's very much oriented around fast filtering, saving, and loading. I want it to feel instantaneous even with hundreds of thousands of entries at once, which is where Elfeed is at right now. What I could do instead is provide an export/import to/from the org-feed format. That could possibly cover syncing the Elfeed database between machines, assuming I could get the tag metadata in there.

You said you were psyched to see it in Melpa. Does this mean you were aware of it before I got it added to Melpa?

I'm glad to hear I have another user. I think you're the third one now, if it works out for you. Look in my .emacs.d/etc/feed-setup.el to see my current config.

Samrat Man Singh

Looks great. I am getting this error when running `elfeed-web-start`: https://www.refheap.com/18394

Christopher Wellons

I added a few things to simple-httpd that makes writing things like the Elfeed web interface much easier. You just need to update simple-httpd.

next-user-here

Hey Christopher,
I have been a daily org user since 2006. I used Opera 12 for feed reading; since Opera has abandoned the reader in their new version I've looked for a replacement. I tried out the Emacs built-in 'newsticker', but it wasn't customizable (filtering) enough for me. I've been using your elfeed package for 5 days now and I'm very pleased.
Thanks a lot!

P1: Please make the view size of the feed title customizable (or better dynamically growing depending on title size)
P2: Please add also the time to the showing date
Q1: I 'd like to change the showing feed name 'Hacker News' to HN,
how?
Q2: Can I remove feeds older than 2 day automatically from the
database?

Thanks again
Greets
NUH

Christopher Wellons

Thanks! To generally answer your questions/proposals first: I intentionally designed Elfeed to be cleanly extensible, so ideally all these things you're asking can already be done without requiring changes to Elfeed itself, and your extensions today should work forever.

> P1: Please make the view size of the feed title customizable (or
> better dynamically growing depending on title size)

There are the variables elfeed-search-title-max-width, elfeed-search-title-min-width, and elfeed-search-trailing-width. I just noticed that they were not declared as defcustom, making them harder to find. Does setting these accomplish what you want?

Keep in mind you will need to clear the display cache after you set these values in order for the display to really start using them. It's as simple as evaling this code:

(clrhash elfeed-search-cache)

> P2: Please add also the time to the showing date

You can redefine/advise elfeed-search-format-date to display whatever you like and I *think* this will Just Work. For example:

(defun elfeed-search-format-date (date)
  (format-time-string "%Y-%m-%d %H:%M" (seconds-to-time date)))

> Q1: I 'd like to change the showing feed name 'Hacker News' to HN,
> how?

This one's a little trickier since Elfeed tries to keep the feed title in sync with the feed itself. You can set it yourself at any time by grabbing the feed's struct with elfeed-db-get-feed and setting the title on it.

(let ((feed (elfeed-db-get-feed "https://news.ycombinator.com/rss")))
  (setf (elfeed-feed-title feed) "HN"))

The main problem is making this change stick after updates. Probably the best way is to add this as "after" advice to elfeed-update (defadvice).

> Q2: Can I remove feeds older than 2 day automatically from the
> database?

An important concept is that Elfeed must never forget about entries. It needs to keep track of them -- remembering that it's seen them -- so that they don't appear to be new entries the next time it's parsing their parent feed. To accomplish what you want you should just put "@2-days" in your default search filter.

(setq-default elfeed-search-filter "@2-days +unread")

You can also set up a tagger to mark older entries as read. (This tagger only applies to newly-discovered entries, not known entries.)

(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :before "2 days ago"
                              :remove 'unread))

If you know Elisp well enough you should be able to make Elfeed do pretty much anything you need.

next-user-here

Wow, that was a fast response, merci.
- than[x] P1
- than[x] P2
- Q1: didn't work. I'm an elisp novice, so I tried:
(defadvice ad-elfeed-update (after elfeed-update activate)
(let ((feed (elfeed-db-get-feed "https://news.ycombinator.co...")))
(setf (elfeed-feed-title feed) "HN"))
)
- Q2: I want to use elfeed for years, with many many feeds, but I don't want to keep such a huge amount of data. An ability to delete entries on time criteria would be nice for me.
- Q3 (a new one ;-): is it possible to avoid image downloading for specific feed sources (advertising)?

Thanx for your support!

Christopher Wellons

The name right after the defadvice is the name of the function being advised, not a name you make up yourself (i.e. "ad-elfeed-update"). Also, I forgot that elfeed-update is asynchronous, so advising it won't work. Instead, try putting the advice on elfeed-search-update as "before" advice, so the name is set properly just before displaying it. You might need to use that clrhash expression before you'll see it working. Also, make sure the URL you use with elfeed-db-get-feed is *exactly* the same as listed in elfeed-feeds.

This worked for me to rename the title of my blog in Elfeed:

(clrhash elfeed-search-cache) ;; clear display cache first

(defadvice elfeed-search-update (before nullprogram activate)
  (let ((feed (elfeed-db-get-feed "http://nullprogram.com/feed/")))
    (setf (elfeed-feed-title feed) "NP")))

;; then go to the elfeed buffer and hit "g" to refresh the view

The size of the database should be rather small. I have 6,000 entries in my database right now and the index file is only 3.5 MB. The content database after garbage collection, which is the data/ directory under ~/.elfeed/, with these 6k entries is 17MB. When I run M-x elfeed-db-compact (experimental!) it drops down to 1.8MB. That's less than 1 kB per entry in the DB. It's also less than my personal Liferea database of roughly the same amount of content (~15MB) before I wrote Elfeed.

If your concern is DB performance, the DB is very efficient at filtering entries by date. As long as you keep a date filter ("@2-days") in your search filter, filtering should be speedy even for extremely large databases. In fact, I can run Elfeed on my Raspberry Pi with very little interface latency. The slowest part is by far parsing all the XML during updates, which is something I can't optimize.

If this really becomes a demonstrable issue, in the future I may add a function for deleting entries. You would use it in your own with-elfeed-db-visit form to clear out entries matching whatever criteria you wish.

Images are actually downloaded by the Emacs HTML rendering engine, shr, when you view the post. You could try blocking these with your system's hosts file. Or you could try writing an elfeed-new-entry-hook function that filters the HTML content of newly-discovered entries to block advertising. That's a whole project of its own, so you're on your own for that one.

next-user-here

Hi Christopher,
- than[x] Q1
- Q2: I see, there is no delete functionality in the db implementation. If things get slow, I'll just remove ~/.elfeed/ after reading feeds and start a fresh db; that should be a workaround.
- Q3: Do you have a quick elisp skeleton for "an elfeed-new-entry-hook function that filters the HTML content of newly-discovered entries" solution, for a bloody elisp novice like me, please?

Thank you indeed
Have a nice Sunday
NUH

Christopher Wellons

I've turned our conversation into a tips and tricks post: Elfeed Tips and Tricks. That has some examples you might find useful.

csantosb

I just discovered elfeed and I'm quite confident about its utility as it may easily be adapted to any workflow. Now, I am wondering, first, what is the advantage of using elfeed instead of newsbeuter, for example; second, do you think there might soon be a way to get in sync with feedly, for example? Right now I'm using BeyondPod on a tablet, which syncs with feedly; it would be great to use elfeed as a desktop RSS reader provided it is aware of already-read posts. Thanks anyway for all this work, very good job!

Christopher Wellons

The primary advantage over other readers is extensibility. Elfeed is written entirely in Elisp, so any part can be changed on the fly, and all configuration is done through Elisp. You can add custom interface actions, custom filters, tagging tricks, etc. See the tips and tricks post.

Syncing with other readers, including other Elfeed databases, is something I've been interested in adding since the beginning. I just haven't personally had a need for it yet. In the case of Feedly, someone would need to write an extension that hooks into the Feedly API and performs the synchronization.

A sort-of alternative to syncing that I had started but didn't complete yet is a web interface to Elfeed. The web interface exists -- you can install it from MELPA -- but it's strictly read-only at the moment. The idea is that you could visit your desktop client from your mobile devices through this interface, so they're all using a single client.

csantosb

You're right. I am using it now with
(run-with-timer 0 120 'elfeed-update)
to auto-update the db as a possible extension to the base functionality.
Additionally, the :callback in 'elfeed-make-tagger allows calling external elisp code or anything else with 'start-process. Currently I have custom desktop alert messages when a given condition is met, but the possibilities are endless. Very good job. Looking forward to a feedly interface. Thanks again,

c.

andschwa

Thanks for Elfeed Chris! I've been using an instance of [Stringer](https://github.com/swanson/... hosted on Heroku, but it's just been too slow and out-of-the-way. I came across Elfeed a while ago at work, but stopped using it after I left.

Do you have any advice on making Elfeed usable from multiple computers? I'm thinking I could keep `~/.elfeed` stored on my server, using sshfs perhaps. The configuration (i.e. `elfeed-feeds`) isn't a problem using git, but keeping the database either synced or used remotely... well I'm just not sure how I'll go about it yet.

andschwa

Dang. I wish Disqus supported Markdown.

Shackra Seaslug

it supports HTML tags like <a></a> and <pre><code></code></pre>

andschwa

Markdown exists so I don't have to type all those brackets :)

Shackra Seaslug

Then, Disqus will make your life harder...

Christopher Wellons

The multiple computer thing is something I've been wanting to resolve, but I haven't sorted out a way to do it yet. You could try sharing the database directory via NFS, Samba, or sshfs. Keep in mind that accessing it from multiple instances of Emacs at the same time will clobber each other when saving (there's no locking), though, fortunately, it will never actually corrupt the database.

You could use rsync on the database directory, or copy just the index file manually. The content database isn't very important, so you could probably tolerate not synchronizing it. The content will mostly be filled in on the next update anyway.
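A minimal sketch of that copy-the-index approach (untested; "myhost:.elfeed/" is a placeholder destination and rsync is assumed to be installed):

(defun my-elfeed-push-index ()
  "Save the Elfeed index and copy it to another machine with rsync."
  (interactive)
  (elfeed-db-save)                       ; flush the in-memory index to disk
  (start-process "elfeed-rsync" nil "rsync"
                 (expand-file-name "index" elfeed-db-directory)
                 "myhost:.elfeed/"))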

I've got a *partially* working EmacSQL branch that would provide a safe, shared database solution via SQLite. I'm not sure if I'll ever get this into a working state, though.

If you ever have any ideas, please share.

andschwa

All right, I may give it a try when the need arises, and I'll let you know how it goes.

Guest

Hello Chris!
I need some help. I have added a few blogs to my elfeed using elfeed-add-feed that I don't need anymore. Is it possible to remove these feeds? And where is the database of added feeds stored?

Paul James Harper

Thanks for this wonderful tool. Combined with Remy Honig's elfeed-org it is unbeatable for reading RSS feeds. https://github.com/remyhoni...


The Elfeed Database

alexbenjm

Would SQLite be a viable solution? It's a self-contained serverless engine. Just drop a file in a directory and access it via some suitable Emacs package. A quick look shows several possible Emacs interfaces to SQLite. It looks like it may solve your issue of finding an external database program that doesn't require excessive setup on the user's part.

Christopher Wellons

I actually attempted to drive SQLite from Emacs about a year ago when I wrote an Emacs pastebin. My conclusion was that SQLite's command line output is definitely not intended to be consumed by software. By default the output is ambiguous, but asking for CSV output resolves this. To do it properly I figured I would need a native helper program to access SQLite and send well-formed output to Emacs.

I did recently discover esqlite (via MELPA) but I still need to spend some time investigating it as an option.
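Roughly what that CSV workaround looks like (a sketch, not what the pastebin actually used; it assumes the sqlite3 command-line shell is on PATH):

(defun my-sqlite-query-csv (db sql)
  "Run SQL against the SQLite database file DB and return the output as CSV."
  (with-temp-buffer
    (call-process "sqlite3" nil t nil "-csv" db sql)
    (buffer-string)))

;; (my-sqlite-query-csv "/tmp/test.db" "SELECT * FROM people;")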


Atom vs RSS

(no comments)

Atom vs. RSS

Dr. Azrael Tod

I couldn't agree more…
Sadly, these days we seem to have the problem "there is no feed" more often than "there is a broken feed" :-/

isofarro

Atom's design goal was to fix the problems developers were seeing in RSS 2.0 that proved incredibly difficult to fix there. So being better specified was a prime requirement.

RSS 2.0 was an evolution from RSS 0.94 (a pity about Mark Pilgrim's article "The Myth of RSS compatibility" / "9 versions of RSS", which identified a number of compatibility issues in RSS over the years).

RSS 1.0 is a different kettle of fish, one where a community decided to improve the existing RSS with a Semantic Web approach. The original version of RSS introduced by Netscape was based on RDF syntax (Resource Description Framework, then a new W3C recommendation). Versions driven by Dave Winer preferred simpler and more logical constructs.

RSS 1.0 (RDF Site Summary) brought RDF back, and introduced namespaced elements (notably Dublin Core, which brings in some very useful metadata elements, like publish dates, titles, and descriptions). That's why it looks so drastically different to previous RSS versions. I guess the Web wasn't ready for automated generation of RSS feeds. On the bright side, it did push RSS 2.0 to allow for namespaced elements, and the use of Dublin Core became more mainstream.

I know your pain in parsing RSS and Atom feeds. I wrote my own parser in PHP4 and its SAX XML processor, trying to support namespaces in a modular way. Nowhere near production capable, but a useful exercise at the time: https://github.com/isofarro...

Have you looked at Mark Pilgrim's Feed Parser http://code.google.com/p/fe... -- even if you are not using a Python stack, it's well worth seeing if you can reuse the extensive unit tests, they capture a lot of the issues you are seeing in dealing with RSS feeds.

Also, are you using Feed Validator http://feedvalidator.org to sanity check RSS and Atom feeds? That's at least a stick you can potentially use to get others to fix their feeds. But understandably, human endeavours open up human error, and a lot of RSS issues are human error driven.

There used to be a few proxy sites that converted RSS 2.0 feeds to Atom, but that runs into the issue of the proxy having to do the disambiguation and cover gaps like producing valid, non-changing ids.

Mads Fog

Hi Chris

Would it be possible for you to share your Jekyll Atom template?

I am specifically interested in the <id> part.

Thanks in advance.

Christopher Wellons

My entire site's original sources can be found on GitHub, so you can inspect any part you like any time. The specific template you're looking for is here:
https://github.com/skeeto/s...
The most complex part is getting the dates right. GitHub's (read: Rackspace's) servers are in the PDT timezone and I live in EDT. If you let Jekyll use the server's own timezone, as it wants to do by default (bad default!), you'll run into date mismatches that can break your site when posting articles whose date changes when pushed out onto a PDT server (e.g. the date baked into the filename doesn't match the date the timestamp falls on in that particular timezone). The _config.yml at the repository root sets the timezone to "GMT" (though this should probably be "UTC" instead?), which is required for this template to work properly, since it's got the hard-coded Z in the timestamp.

I see Jekyll now supports ISO 8601 dates (required by Atom) directly. At the time I wrote this template, that wasn't supported yet, so I had to hardcode UTC to work around the omission. I think the workaround is still required for my particular use case.

http://jekyllrb.com/docs/te...
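The underlying point, stated in Elisp terms for illustration: a hard-coded Z in the template is only honest if the timestamp itself is rendered in UTC, roughly like

(format-time-string "%Y-%m-%dT%H:%M:%SZ" nil t)  ; third argument t means UTC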

Mehdi Sadeghi

Thanks Chris for your insights. I was in the process of updating my website's feed and now for sure I go for Atom over RSS!

As a person whose language is written right-to-left, I have to play with the feed specs to see how I could achieve such a layout. So far I have not found any spec for it. Moreover, I compile my website locally because I come from the only country in the world whose official calendar is not the Gregorian one! The Ruby calendar library isn't supported by GitHub, so I compile locally and then push to the gh-pages branch (for each post I hard-code the publish date in the YAML header).

A Nonny Mouse

“Unlike the other RSS versions, the top level element is rdf:RFD. That’s not a typo.” I'm inclined to disagree here: shouldn't it be rdf:RDF?

Christopher Wellons

Whoops, you're right! I was referring to "rdf:RDF" being such a weird root element for RSS, and that was an unfortunate coincidence with an actual typo. I've fixed it.


My Grading Process

(no comments)

My Grading Process

possiblywrong

I remember (sometimes fondly, sometimes not so much :)) that bi-modal distribution of the time needed to grade a student's math homework/exams/etc. Usually, when a student has even a modest grasp of the material, grading his/her work is easy and quick, even if there are mistakes... because they are "sensible" mistakes. But when a student is clearly lost, then it's difficult to know where to even begin "correcting."

Christopher Wellons

Bi-modal is definitely the right word for this.

Ahmed Fasih

A thought: I've graded a class, and have had graders (obviously); I've also submitted papers to journals, i.e., wrangled with reviewers' responses, and been a reviewer for a couple of papers. The two tasks seem like they share a common goal, viz., judging quality, but whereas quantifying the badness of a homework "paper" in points is time-consuming, I think the binary response of an editor is often easier. First, you would need to replace points for homework with a binary accept/reject. And second, with a terrible submission, you would say "Here are the first three-odd things that are wrong with your submission, out of many more. Fix at least these and come back." (Unfortunately I guess you can't flat reject a homework submission like you can with a paper...)

(NB: I know some paper reviewers who do the same thing as you with bad submissions: exhaustively catalog all the things wrong with it, which is awfully time-consuming and depressing. I am a fan of the more brutal approach, i.e., listing a subset of the major problems, and rejecting if it seems unsalvageable.)

(NB 2: I never got a single paper accepted by an archival-grade journal, so feel free to ignore me completely :)


Elfeed Tips and Tricks

happyelffeduser

Great post with great tips.
I have two questions about improvements to my 5min code:

1. cycling filters

(defvar my-elfeed-counter 0)
(defun elfeed-search-filter-toggle ()
(interactive)
(let ((filters
'("@4-weeks-ago -junk +unread -video -yt -musicvideo"
"@4-weeks-ago -junk +unread -youtube +fast"
"@4-weeks-ago -junk +unread +spam"
"@4-weeks-ago -junk +unread +humor"
"@4-weeks-ago -junk +unread +musicvideo"
"@4-weeks-ago -junk +unread +video"
"@4-weeks-ago -junk +unread +youtube"
"@4-weeks-ago -junk +unread")))
(setq current-counter (+ my-elfeed-counter 1))
(setq my-elfeed-counter (mod current-counter (length filters)))
(elfeed-search-set-filter (nth my-elfeed-counter filters))))

(define-key elfeed-search-mode-map "c" 'elfeed-search-filter-toggle)

2. yanking

I changed 'PRIMARY to 'CLIPBOARD in elfeed-search-yank,
because PRIMARY has no effect for me.
(Arch Linux without Desktop Env., only XMonad)
Maybe some defcustom for this?

Thank you for elfeed.
Currently I'm using it for my 1k feeds :) and I'm extremely
happy that I can filter out entries with buzzwords like agile, nosql, php, ... ;P
or things like "X things that"... http://xkcd.com/1283/

Christopher Wellons

1000 feeds! Wow! That's the most I'm aware of at the moment for a single user.

Very good point about the clipboard selection. I just assumed PRIMARY would work for everyone. I just added the customization variable elfeed-search-clipboard-type since there's no clean way to set it otherwise. It will show up in MELPA on the next update.

I really like your elfeed-search-filter-toggle command. That's creative and clever! Be careful with current-counter because that's a global variable with the way you've written it. Declare it in that let.
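A sketch of that fix, with the counter bound inside the let (filter list abbreviated):

(defvar my-elfeed-counter 0)

(defun elfeed-search-filter-toggle ()
  (interactive)
  (let* ((filters '("@4-weeks-ago -junk +unread -video -yt -musicvideo"
                    ;; ... the remaining filters ...
                    "@4-weeks-ago -junk +unread"))
         (current-counter (1+ my-elfeed-counter)))  ; local, not global
    (setq my-elfeed-counter (mod current-counter (length filters)))
    (elfeed-search-set-filter (nth my-elfeed-counter filters))))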

next-user-here

Hi Christopher,

after two weeks using elfeed I'm very happy with your package, great work man!
Thanks a lot, also for your tips.
- than[x] Q2
- than[x] Q3

I love the way I can customize everything I want.
Here some examples:

;----
; Ad killer

(defun my-elfeed-no-ads (entry)
(let ((entry-feed-id (elfeed-entry-feed-id entry)))
(cond
; heise & golem
((or (string= entry-feed-id "http://www.heise.de/newstic..." )
(string= entry-feed-id "http://www.golem.de/rss.php..." ))
(let* ((original (elfeed-deref (elfeed-entry-content entry)))
(replace (replace-regexp-in-string "<img.*" "" original)))
(message (elfeed-entry-link entry))
(setf (elfeed-entry-content entry) (elfeed-ref replace))))
)))

(add-hook 'elfeed-new-entry-hook #'my-elfeed-no-ads)

;----
; read all key

(define-key elfeed-search-mode-map "r"
'(lambda ()
(interactive)
(save-excursion
(mark-whole-buffer)
(elfeed-search-untag-all-unread))))

;----
; comment links in hacker news

(defun my-elfeed-browse-hn-comment ()
"In an HN entry, open comments link"
(interactive)
(search-forward "comments")
(backward-char)
(execute-kbd-macro [ return ?q ]))

(define-key elfeed-search-mode-map "v"
'(lambda ()
(interactive)
(beginning-of-line)
(let ((hn? (re-search-forward "[(,]hn\\b" (line-end-position) t))) ; is it an HN entry?
(call-interactively 'elfeed-search-show-entry)
(if hn? ; yes, open comments link
(my-elfeed-browse-hn-comment)))))

(define-key elfeed-show-mode-map "c" 'my-elfeed-browse-hn-comment)

;----
; handle pods via emms

(defun my-elfeed-play-pod ()
"Play selected feed item (should be pod) via emms"
(interactive)
(elfeed-show-yank)
(let ((mp3-url (x-get-selection 'PRIMARY)))
(if (string-match "[mM][pP]3$" mp3-url)
(emms-play-url mp3-url)
; not an mp3 link, search for it
(re-search-forward "enclosure:[^h]+http")
(setq mp3-url (get-text-property (point) 'shr-url))
(if (numberp (string-match "[mM][pP]3$" mp3-url))
(emms-play-url mp3-url)
(message "no mp3 link found!")))))

(define-key elfeed-search-mode-map "m"
'(lambda ()
(interactive)
(call-interactively 'elfeed-search-show-entry)
(my-elfeed-play-pod)))

(define-key elfeed-show-mode-map "m" 'my-elfeed-play-pod)

One more question ;-)
- Q4: I want to learn elisp better, any tips beside
+ Introduction to Programming in Emacs Lisp
+ Emacs Lisp Reference Manual

Greetings,
NUH

next-user-here

something went wrong with the code snippets, sorry for that
I've pasted it here:
https://gist.github.com/ano...

One more question ;-)
- Q4: I want to learn elisp better, any tips beside
+ Introduction to Programming in Emacs Lisp
+ Emacs Lisp Reference Manual

Greetings,
NUH

Christopher Wellons

I like your EMMS trick. That's a good one.

Sorry, I didn't notice you had a question at the bottom until now. Those two manuals are definitely the place to start. Any Common Lisp material is also good because Elisp is very similar, especially with cl/cl-lib. If you want another free resource to check out, look at Practical Common Lisp.
This one's not free but "Object-Oriented Programming in Common Lisp" by Keene is a surprising gem of a book. While Elisp has a CLOS-like object system called "eieio," the lessons don't transfer to Elisp as much as other Lisp resources, but it's still a really good example of Lisp in action.
Other than that, read other people's Elisp code and don't hesitate to play around with ideas in your scratch buffer. I've written tons of throwaway experiments that way.

Yukang

Hi, nice work. I have a question: do we need to manually press "G" to fetch new content, or will it fetch automatically in the background? If it does fetch in the background, do we have an option to control the time gap?

Christopher Wellons

Right now you have to manually trigger the update. Automatic updating is a sort-of long term goal, but I see it being done more as a separate extension to Elfeed. I recently added metadata to Elfeed structs (the setf-able elfeed-meta function) that would help support updating. It could be used to track the last time a particular feed was updated, or log extra information about a feed's update frequency.
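As a rough illustration of what that metadata could record (untested; :last-new-entry is a made-up key):

(defun my-elfeed-record-entry-time (entry)
  "Note the time at which ENTRY's feed last produced a new entry."
  (setf (elfeed-meta (elfeed-entry-feed entry) :last-new-entry)
        (float-time)))

(add-hook 'elfeed-new-entry-hook #'my-elfeed-record-entry-time)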

Two things hold me back. One, I haven't really worked out how often and when feeds should update. I need to read more on this topic. Ideally it should adjust based on the feed's history: update frequency, typical update times, etc. That's why I see it as being a separate extension. If the update logic is elaborate, the complexity is kept separate from Elfeed. Two, updating can be very disruptive. Parsing the XML of a single large feed can lock up Emacs for as long as two seconds. It would probably be annoying to have that happen while you're working.

Now that I think about it, over the last 4.5 months I've accumulated a sizable Elfeed database. I could use that as a base for an update frequency algorithm, to see if it can use the first half of the database to accurately predict the second, later half.

Yukang

Hi Christopher, thanks for your response.

Yes, if Emacs automatically updates the feeds, it may lock up Emacs for some time; even if the time is short, it will be annoying if users can feel it. A good update frequency algorithm may save some time, but it cannot solve the problem totally.

I suggest you could implement it as a separate daemon process (it could even be implemented in another language, Python/Ruby etc.). This process would be responsible for fetching feeds on time and putting them into the DB file, and Emacs would just be responsible for reading the DB file and showing it.

Jean-Michel

Very useful - I'm coming from Liferea, but much prefer an Emacs-integrated approach!

A question:
I have feeds with entries referring to an mp3 file, like this:
"Enclosure: http://podcast.rcf.fr/sit[...]PRISOIR_20140227_1915.mp3"

I'd like to automatically download the mp3 files, which would turn elfeed into a kind of podcaster. Ideally, I'd like to be able to choose if I download the mp3 files in the elfeed database or somewhere else.

Is that feasible ?

Thks

Christopher Wellons

There are a couple of ways to do this and they'll each take a bit of work to get right.

One, you can accomplish this with an elfeed-new-entry-hook. I'd just run wget or curl on each of the entry's enclosures since Emacs doesn't really have a clean way to download files on its own. An enclosure is actually a list -- laziness, sorry, should have made it a defstruct! -- of (url, type, length).

I didn't actually test this, but it should look something like the following. You can use elfeed-make-tagger's other key arguments to filter to what sort of entries you act on.

(add-hook 'elfeed-new-entry-hook
(elfeed-make-tagger
:callback (lambda (entry)
(dolist (e (elfeed-entry-enclosures entry))
(start-process "wget" nil "wget" (car e))))))

The second, and my preferred, method is to initiate the download from the entries listing, or when viewing the entry, with a key binding (i.e. "d" for download). I do this with YouTube entries and youtube-dl. Take a look at my config:

https://github.com/skeeto/....
https://github.com/skeeto/....
It's complicated by process management stuff, but all of that isn't strictly necessary.
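Stripped of the process management, that keybinding approach might look roughly like this (untested sketch; the download directory is just an example):

(defun my-elfeed-show-download-enclosure ()
  "Download the first enclosure of the displayed entry with wget."
  (interactive)
  (let ((enclosure (car (elfeed-entry-enclosures elfeed-show-entry))))
    (if enclosure
        (start-process "wget" nil "wget"
                       "-P" (expand-file-name "~/Downloads")
                       (car enclosure))             ; the enclosure's URL
      (message "This entry has no enclosure"))))

(define-key elfeed-show-mode-map "d" #'my-elfeed-show-download-enclosure)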

sgtpep

A snippet for periodically updating feeds (3 minutes after Emacs starts, then every half hour):
(run-at-time 180 1800 (lambda () (unless elfeed-waiting (elfeed-update))))

Christopher Wellons

Good one! One of my long-term goals is to make an extension to Elfeed that does smart, automatic feed updates -- i.e. feeds that update frequently get checked more frequently.

sgtpep

That could be awesome! Especially on Emacses that have no proper gnutls integration.

Clément

Hi,

I just started using elfeed to replace the complicated scheme I had set up with rss2email, and this made me realise that many feeds do not use a proper pubDate tag, when they have one at all. In those cases it seems that elfeed changes the date with every update, instead of keeping the date of the first update. Is that expected behaviour? Or am I doing something wrong?

Christopher Wellons

Do you happen to be using the version in Marmalade (1.1.0)? What you're describing is how Elfeed used to work, but that was fixed in May this year (09349d6). However, the version in Marmalade doesn't have this fix yet because I forgot to update it.

Clément

I got the package from MELPA, which I believe is the latest one (it's tagged 20141005.730). I dug a little, and it seems that the feeds with no pubDate at all get the correct time, but the ones with a faulty pubDate (i.e. not respecting RFC 822) are modified with every update.

Christopher Wellons

Interesting. Could you give me examples (URLs) of a couple of feeds with broken dates? I want to see just how broken they get, possibly using them as test cases.

Clément

Sure. I have just emailed them to you (with subject elfeed: broken RSS feeds).

Dan LaManna

Thanks for the great article - just discovered elfeed and it's been a huge help!

I understand "deleting" entries isn't viable, but ignoring/hiding certainly is. I've used the new entry hook for example to :add 'junk and :remove 'unread for any entries containing certain strings.

What I'm uncertain how to accomplish however is ignoring *duplicate* entries, for example a news site has news and politics as 2 separate RSS feeds, but occasionally have overlapping articles, so I'll end up with 2 identical title/url/content entries which in addition have different tagsets.

I don't particularly care too much about tagsets - but having 40 unread entries with 10 duplicates is frustrating - any thoughts on how to handle such a situation?

Thanks again!

Christopher Wellons

The duplicate entries thing is due to an early design decision now that I regret. I was worried about RSS's shortcomings (non-unique entry IDs) and chose to be prepared for that situation rather than gracefully merge identical entries from different feeds. In short, an entry is keyed by a tuple of feed URL and entry ID. If it was just keyed by entry ID, you wouldn't see duplicates.

Unless I'm mistaken, this also can't be fixed by a new entry hook (tagger), because by the time the hook is called Elfeed has already decided the entry is new and unique, so you can't fixup the entry key.
Here's an untested idea: in a new entry hook, search for a duplicate from the other feed already in the database. If it's already there, mark the new one as read/junk.
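Roughly like this (untested, using with-elfeed-db-visit and elfeed-db-return from elfeed-db; it scans the whole database on every new entry, so it isn't cheap):

(defun my-elfeed-junk-duplicates (new-entry)
  "Tag NEW-ENTRY as junk if some other feed already delivered the same link."
  (let ((link (elfeed-entry-link new-entry)))
    (with-elfeed-db-visit (entry _)
      (when (and (not (eq entry new-entry))
                 (string= (elfeed-entry-link entry) link))
        (elfeed-untag new-entry 'unread)
        (elfeed-tag new-entry 'junk)
        (elfeed-db-return)))))            ; stop visiting once a match is found

(add-hook 'elfeed-new-entry-hook #'my-elfeed-junk-duplicates)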

Sorry about this! It's turned out that RSS feeds in the wild aren't quite as broken as I had feared and Atom's feed merging capabilities come up more frequently (including my own blog) than I expected.

Noorul Islam Kamal Malmiyoda

I am trying out elfeed and I see that it is not displaying entire content. Currently I am using feedly. For the same entry feedly displays everything.

Christopher Wellons

This is because the feed author is truncating the content in their feed, and that's what Elfeed displays. I'm guessing Feedly uses the entry URL to go fetch the actual entry content. You can get the same effect by using eww (Emacs' built-in browser) as your browser (see "browse-url-browser-function").
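That is, something like:

(setq browse-url-browser-function #'eww-browse-url)  ; render entry links with eww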

surfingandlaughin

Elfeed <3

Hello Christopher,
I must be very dumb as I can't find how to set:
elfeed-search-title-max-width
elfeed-search-title-min-width
elfeed-search-trailing-width

I'm using elfeed 20150404.545 (melpa) on Emacs 24.1
Regards!

David Feest

First of all: Thanks for this great package! I especially love the endless possibilities to configure it. Still, I'm really an amateur with elisp. Therefore my question is rather simple. Since I am using instapaper, I have integrated it into elfeed using the package instapaper.el:

(require 'instapaper)

(defun url2instapaper (message)
(call-process "url2instapaper" nil nil nil message))

(defun elfeed-url2instapaper ()
(interactive)
(let ((entry (elfeed-search-selected :single)))
(instapaper-add (elfeed-entry-link entry)))
)

(define-key elfeed-search-mode-map "i" #'elfeed-url2instapaper)

This works perfectly. But is there a possibility to have the tag changed from "unread" to "instapaper" within the same defun? I tried using "elfeed-make-tagger" but I don't know how to use it on just one entry.

Thanks in advance!

Greetings,

David

Christopher Wellons

You're off to a good start! Note that elfeed-make-tagger is for tagging entries as they're discovered, not for changing the tags later interactively. The functions you're looking for are elfeed-tag and elfeed-untag. Call them right after instapaper-add, so that if it fails the tags won't change.

David Feest

Hi Christopher,

thanks for the super fast answer! And sorry about my elisp illiteracy! I added:

(elfeed-untag entry 'unread)
(elfeed-tag entry 'instapaper)

to my defun, making it:

(defun elfeed-url2instapaper ()
(interactive)
(let ((entry (elfeed-search-selected :single)))
(instapaper-add (elfeed-entry-link entry))
(elfeed-untag entry 'unread)
(elfeed-tag entry 'instapaper)))

Yet, the tag does not change, although I don't get an error message. I must be missing something very obvious, but what is it?

Best wishes,

David

Christopher Wellons

I believe the tag is being correctly applied, you're just seeing a stale display. Add either an elfeed-search-update-line or elfeed-search-update-entry call after updating the tags. This makes the display redraw the line (elfeed-tag and elfeed-untag are low level database functions and are unaware of the entry listing).

I also like to add (forward-line) as the last action to these sorts of functions so that it automatically advances to the next entry. For example, when you hit "r" or "u" to mark read/unread, it moves to the next line so that you can make a decision about the next entry.
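Putting that together, the command might end up looking something like this (untested):

(defun elfeed-url2instapaper ()
  (interactive)
  (let ((entry (elfeed-search-selected :single)))
    (instapaper-add (elfeed-entry-link entry))
    (elfeed-untag entry 'unread)
    (elfeed-tag entry 'instapaper)
    (elfeed-search-update-entry entry)   ; redraw this line in the listing
    (forward-line)))                     ; and move on to the next entry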

Also take a look at elfeed-search-tag-all and elfeed-search-untag-all, which are like elfeed-tag and elfeed-untag but are higher level and aware of the entry listing. Perhaps these are what I should have recommended to you in the first place.

Sorry, this is probably more complicated than it should be!

David Feest

Thanks a lot, I made the adjustments and the defun works great now! And don't worry, this is only complicated for a newbie like me! And I got more than I bargained for anyhow! Actually, I was only looking for a RSS-feed client for Emacs that would simply do its work. Now I have even integrated Instapaper and so many more possibilities! Thanks again for your help and for this great package!

Daniele Giglio

Hi, I'm playing with this awesome program in order to find an effective replacement for Feedly.
I've noticed that some feeds, like http://www.notebookcheck.ne... , carry an enclosure tag for images. These are shown in the article buffer as links to the remote files. Is there a way to automatically show them in the buffer?
Thanks in advance.

Christopher Wellons

There's no quick-and-easy way to get this right now, but you can do it without mucking with Elfeed directly (e.g. it would work well in your configuration). There's a variable elfeed-show-refresh-function that decides how an entry gets displayed. You could define your own refresh function to wrap the default refresh function such that it inserts the image somewhere in the buffer after the default function has rendered the content.
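An untested sketch of such a wrapper, assuming the default refresh function is elfeed-show-refresh--mail-style, that image enclosures carry an image/* type, and that shr can fetch the URL:

(require 'shr)

(defun my-elfeed-show-with-enclosure-images ()
  "Render the entry as usual, then append any image enclosures below it."
  (elfeed-show-refresh--mail-style)              ; default rendering first
  (let ((inhibit-read-only t))
    (save-excursion
      (goto-char (point-max))
      (dolist (enclosure (elfeed-entry-enclosures elfeed-show-entry))
        (when (string-match-p "^image/" (or (cadr enclosure) ""))
          (insert "\n")
          (shr-insert-document `(img ((src . ,(car enclosure))))))))))

(setq elfeed-show-refresh-function #'my-elfeed-show-with-enclosure-images)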

JohnKitchin

Has anyone tried implementing some kind of scoring method for entries? or conditional tags, e.g. to add tags to entries by a particular person, or on titles that contain particular words?

JohnKitchin

To answer my own question: I tried it here http://kitchingroup.cheme.c... in case anyone else is interested in trying it.

Christopher Wellons

Interesting approach, thanks for sharing! I added a link to your article to the Elfeed README.

JohnKitchin

Thanks. Elfeed is a great application, thanks for making it!


Thanksgiving and Hanukkah

(no comments)

Emacs Lisp Reddit API Wrapper

(no comments)

Clojure-style Multimethods in Emacs Lisp

(no comments)

Emacs Lisp Printable Closures

Marcin Cieslik

R shares the printable-closure property as well:

> a = function(x) {x*2}
> b = eval(print.function(a))
function(x) {x*2}
> a(1)
[1] 2
> b(1)
[1] 2

Christopher Wellons

R doesn't quite do it either. It's in the same bucket as JavaScript where the function body is printed but the lexical environment is not. If you actually close over a variable in your closure you'll get an unreadable memory reference.


> myIdentity = function(x) { function() { x } }
> myIdentity(1)
function() { x }
<environment: 0x20feb18>

Marcin Cieslik

Thanks, I see your point now.

Constantine

Interesting stuff! Thanks for sharing.

Note that "It takes on two forms depending on weather the closure is compiled or not." should read as "It takes on two forms depending on whether the closure is compiled or not." ("whether" instead of "weather").

Christopher Wellons

Oops, thanks for pointing out the typo. Fixed.

ascii

I think Erlang also shares this property of serializable closures. It's similar:

> X = 10.
> F = fun(Y) -> Y + X end.
> Serialized = erlang:term_to_binary(F).

... you can save Serialized to disk, send it across the network to another running program, etc, and then you can get your F back again:

> Deserialized = erlang:binary_to_term(Serialized).
> Deserialized(5).
15

Pretty cool.

Christopher Wellons

Oh, interesting. In that case it's now two languages I know about with serializable closures.

Wilfred Hughes

Fascinating! If I'm not mistaken, a smarter byte-compiler could actually remove the temp-buffer variable, as it's not used in the body of the closure. Is that correct?

(The problem would still exist, I think, if the closure body contained a reference to the closed-over variable.)

Christopher Wellons

Actually, the byte compiler is already smart enough to eliminate unreferenced variables, including temp-buffer. This is only a problem in the interpreted code, where that sort of analysis isn't (and can't reasonably be) done. I elaborate more on this in a later article:

http://nullprogram.com/blog...
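A quick way to see the difference in a lexical-binding buffer (printed forms shown approximately):

;; -*- lexical-binding: t; -*-
(let ((unused (make-string 5 ?x)))
  (lambda () :hello))
;; interpreted => (closure ((unused . "xxxxx") t) nil :hello)

(byte-compile (let ((unused (make-string 5 ?x)))
                (lambda () :hello)))
;; compiled    => #[0 "\300\207" [:hello] 1]   ; unused is gone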


Emacs Byte-Code Internals

rgiar

Great stuff, thanks, I've always wanted to trace through the bytecode interpreter -- after your article, I'm ready to skip forward a step and start thinking big :)

Lars Brinkhoff

My current Common Lisp compiler emits Emacs Lisp. I plan to write a new one that generates lapcode instead.

Christopher Wellons

You should have linked your project because it's very interesting! I'll do it for you:

https://github.com/larsbrin...

What do you think the advantages of targeting lapcode would be? So far I think the Elisp compiler does a great job already and using Elisp as a target language takes full advantage of it. Do you think you could beat it, or would it just be for purity or fun?

Lars Brinkhoff

Sorry, didn't see your answer until now.

The translation of Common Lisp -> Emacs Lisp -> bytecode isn't very good. There seem to be a lot of redundancies. I imagine having full access to every low-level detail of the bytecode virtual machine will make it possible to generate much better code.

I guess most higher-level languages will be more restrictive than a low-level assembly language. E.g. there are things you can do in machine code that you just can't do in C, even though C is touted as a high-level assembly language.

Vladimir Kazanov

Thank you for the article. There aren't too many docs on the Emacs VM out there, apart from the bytecomp.el code itself.

As for the project ideas of mine: a) implement a small Python-like language on top of the VM; or b) try out a few interesting IR-lang compiling techniques (think "Compiling with continuations").

Rocky Bernstein

A couple of things. First, I've started a bytecode decompiler. See https://github.com/rocky/el...

The other thing is that I think it's about time a full reference manual for bytecode be written. Python, for example, does a pretty good (but not great) job of this. So I've started that as well. See https://github.com/rocky/el... When that doc is more fully completed, I hope to get this incorporated into the Emacs Lisp Reference. So please contribute.

Also, it would be great if relevant parts of this were added to that. Is that okay?

Christopher Wellons

Those are interesting and ambitious projects. I look forward to seeing where they go. At the footer of my blog is a public domain dedication, so all of my articles are in the public domain by default (with certain specially-marked exceptions when I've used other peoples' images). You're free to use my article and transform it however you like, for any purpose, including your byte-code document. No strings attached.

I believe part of the reason Emacs' byte-code hasn't been formally documented is so that it's not locked down to any particular semantics. It's not part of Emacs' formal API, so interfacing with it is perilous. It's already the case that byte-code isn't compatible across releases, allowing the Emacs devs to change things as needed and without warning. And they do take advantage of this fact. For example, the byte-code generated for throw/catch changed significantly in Emacs 24.5 with the addition of a new "pushcatch" opcode. This was only documented in source code comments.

Rocky Bernstein

Many thanks. Will make sure to give credit where credit is due. The situation in Emacs is no different than say Python. In fact, Python is probably worse as the bytecode changes every year, and sometimes very drastically.

But note a couple of things. First, in contrast to LAP, Python does document the bytecode for each release and indicates when a bytecode got added, removed, or changed semantics. To see what has changed between say Python 3.5 and Python 3.6 see https://github.com/rocky/py...

Second, I've still been able to write a decompiler for the 20 or so releases and variants.

I do intend in this document to indicate when bytecodes enter, leave, or change. I have a query out already in emacs-lisp for this. See https://lists.gnu.org/archi...

But I know someone will have to do a bit of digging to get this information. Some of it is in https://www.emacswiki.org/e... for very old stuff.

But if you know of other changes, please let me know as this will save time.

Again, thanks

Klaus

Another nice detail which I wasn't aware of before experimenting based on this article: lexical variables that are captured by closures are encapsulated in cons cells, and accessed through a "car-safe" or "setcar" operation, with the reference to the cons cell being stored in the constants vector.

I wonder if the possible optimization of also using the CDR of the cons was omitted purely because of the added complexity for the byte-compiler (vs. presumably small gains)?

Christopher Wellons

Yup, I talked about that aspect of closures in this later article:

What's in an Emacs Lambda
http://nullprogram.com/blog...

In closures, the cons cell is being used to "box" the value so that different function objects can share and set the same variable by sharing this cons cell. The CDR just doesn't serve any purpose in this case.
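A small demonstration of the sharing this enables (lexical binding assumed):

;; -*- lexical-binding: t; -*-
(defun make-counter ()
  "Return an incrementer and a reader closed over the same variable."
  (let ((n 0))
    (list (lambda () (setq n (1+ n)))    ; both closures share n's box
          (lambda () n))))

(let ((counter (make-counter)))
  (funcall (car counter))
  (funcall (car counter))
  (funcall (cadr counter)))              ; => 2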


Measure Elisp Object Memory Usage with Calipers

(no comments)

Measure Elisp Object Memory Usage with Calipers

(no comments)

Emacs Lisp Object Finalizers

(no comments)

Introducing EmacSQL

abo-abo

Hi,

What's your indent setup for Elisp? How did you make vectors indent properly?

Christopher Wellons

That had been annoying me so much that I added an interactive function, emacsql-fix-vector-indentation, to fix it. It's described in the README (link below). I haven't sorted out all the edge cases, so it's still not exactly right, but it should help a lot for now.

https://github.com/skeeto/e...

Not many people write vector literals in Elisp so it's one of those broken things in emacs-lisp-mode.

abo-abo

Thanks, defadvice makes sense now. I patched `calculate-lisp-indent` instead and was wondering if there's a chance it could be accepted upstream.

Anyway, I want to select stuff from the db with a regex, like here: http://stackoverflow.com/a/.... It seems that emacsql doesn't support this. So I've tried to open the db created with emacsql from the shell, but it says "Unable to open database "semantic.db": file is encrypted or is not a database". Any thoughts on this?

Christopher Wellons

EmacSQL comes with a custom build of SQLite, so whatever you install on your system doesn't matter. I didn't want to also embed a regex library, so it's disabled in the EmacSQL build. The full-text search is enabled, though, which should generally be more useful.

As for the issue of opening the database, I don't know what would cause that. EmacSQL comes with SQLite 3.8.2. If your system's SQLite shell program is old enough, perhaps it can't read the file produced by EmacSQL. With FTS4 turned on, I think there could be backwards compatibility problems with older SQLite builds.

maxxcan

Hi,

I'm learning Elisp and I started programming a simple form with the widget library and EmacSQL, but I don't fully understand how to implement this in an elisp function with variables.

Could you write some examples?

Thank you very much for this and for all your work with Emacs, an editor that I love. I hope to soon translate the documentation of your Emacs packages into my own language.


Reimaging a VM from the inside with Debian

Gavin Black

This is almost exactly how you install Gentoo, except instead of debootstrap you just unpack a stage tarball... that and compiling everything at each step :P Also, on the makedev step, could you not just replace it with "mount --rbind /dev /chrootDir/dev" and "mount --rbind /sys /chrootDir/sys" before chrooting?

Christopher Wellons

Oh yeah, I kind of remember Gentoo's stage tarballs.

The linked Debian manual appendix says this about bind mounting /dev:

note that the postinst scripts of some packages may try to create device files, so this option should only be used with care


So I just followed the recommended procedure to avoid any issues.

Deepti

Nice post!!
I have a Debian wheezy VM on Ubuntu. I want to convert the VM to a persistent bootable USB live image or a bootable ISO so that it can be easily distributed.
Can you please guide me on this issue?
I have tried, but the created ISO is either read-only or boots with errors.


The Julia Programming Language

PuercoPop

Nice overview of Julia. One question I'm left with: how do generic functions interact with keywords and optional parameters? Do they exist in Julia?

Christopher Wellons

Julia has keyword arguments,
http://julia.readthedocs.or...
The manual doesn't say how this interacts with generic dispatch so I experimented. It looks like it doesn't use keyword arguments when dispatching. I defined two methods with different keyword types and the function had only one method, meaning it overwrote the first definition.
Optional arguments are one of those things Julia does especially elegantly and I should have mentioned them in my post. Optional arguments are actually syntactic sugar for multiple method definitions: a separate method is defined per optional argument. Done this way, optional arguments aren't a special feature but rather a side effect of multimethods. This also means that optional arguments' types are considered in generic dispatch, though only when supplied.

Johan Sigfrids

Regarding strings I believe both Go and Rust behave the same way. A char/rune is a 32 bit integer and indexing strings gives you a byte, not a character. I don't know why it was designed this way, but if the designers of three separate, modern languages have reached the same design, I would assume there is a compelling reason.

Matt B

There's a very compelling reason: algorithmic complexity!

Indexing is generally thought to be an O(1) operation, but with UTF8/UTF16 it's O(n). What's cool, though, is that Julia's iteration protocol jumps from valid character to valid character. So `for char in str` only gives you valid characters. If you really just want the nth character, you can just ask for `str[chr2ind(n)]`. It's easy to use an O(1) algorithm in the development of more complicated functions… but were it O(n) by default you'd have to be careful how many times you index into the string. If you naively iterated over all the characters like this from `str[1]` to `str[end]`, you suddenly have O(.5*n*(n-1)). Julia's solution to use next with external state is very simple and lightweight. For a performance-driven language, their choice is a no-brainer.

Ahmed Fasih

Eeek, anytime I read '"Matlab done right"' I think of Linus Torvalds' scathing comment on Subversion ('Subversion used to say CVS done right: with that slogan there is nowhere you can go. There is no way to do CVS right.' 2007). It looks like Julia went back further than Matlab to start doing things right :)

The sealing of modules is a roadblock only to interactive development of modules, right? Using modules to prototype algorithms interactively or just do calculations isn't impaired by it?

I'm now going to go look at its C++ interop.

Christopher Wellons

It's only a problem when you want to modify the module without destroying your program's running state. If changes are made, the entire module has to be reloaded, obsoleting any instances of types made from the previous module definition even if the type's definition didn't change.

Ahmed Fasih

I should probably know the answer but how does Python deal with this? I often use the 'autoreload' plugin with Ipython where it reimports all imported modules precisely for situations where I'm editing the modules as I'm experimenting interactively.

Christopher Wellons

I only know a little bit about Python, so there might be something I'm unaware of. To try to answer this -- since I was really curious, too -- I did some research and asked around. It looks to me that the Python situation is slightly better.

In Python there's no way to switch namespaces into a module to modify it at run time. Reloading a module has the same effect as reloading a Julia module: all types (classes) are overwritten such that old instances are obsoleted. Since Python doesn't have multimethods, this part isn't so bad. Only methods are affected by the class update (old instances still call old methods) and functions will continue to work without surprises (functions applied to old objects run the new version of the function).
Something curious I noticed is that I couldn't find anyone anywhere asking how to switch the "current" module in the manner of Clojure (in-ns) or Common Lisp (in-package). It seems no one using Python cares about this.

The autoreload documentation suggests that reloading modules is a messy process, and they have some hacks in place to make it nicer:

"Reloading Python modules in a reliable way is in general difficult, and unexpected things may occur. %autoreload tries to work around common pitfalls by replacing function code objects and parts of classes previously in the module with new versions."

Where Python has a distinct advantage over Julia is that you're actually allowed to redefine a class. This way you can still live-develop a single module per Python instance, where the current "global" namespace is implicitly your module.

In Julia, the only way to redefine a type is to reload the module it came from. If it wasn't from a module then it's permanent for the remainder of the program. It makes for a lot of restarting, like writing in C, C++, or Java.

slocklin

I'm curious what you think octave did right that matlab didn't, other than giving it away for free. They seem to be interoperable to me. Even the modern octave IDE is starting to look like Matlab's.
You've definitely identified one of Julia's big problems with not being able to redefine things in modules. An even bigger (related) problem is, if you redefine things at the top level, and there are references to the named function elsewhere, those changes won't propagate through, and it won't tell you what just happened. That's pretty amateur hour for something which purports to be usable as an interpretor. Even makefiles prevent that sort of error in C. Try it and see:

g = function() {return 9}

f = function() {return g()}
g = function() {return 2}
f()

I said "rotsa ruck" when I saw that atrocity.

Christopher Wellons

Matlab's parser is like PHP's parser. It's just an ad-hoc mess written by an amateur. I'm amazed they still ship it the way it is. Lots of expressions don't parse (throwing an error) that really should parse, particularly expressions involving compound matrix indexing. It's a regular source of annoyance for me. Here's my favorite example:

http://nullprogram.com/blog...

In contrast, the Octave folks have developed a proper grammar so that everything parses pretty reasonably. They took MathWorks' own language and implemented it far better -- something open source has always been far superior at doing. What Octave is missing is all the fancy toolkits, the huge selection of functions (though they're gradually catching up), the newbie-friendly IDE, and the rich plotting API.

Wilfred

> There's no opening the module back up to define or redefine new functions or types.

Good news: eval() lets you add to a module! http://docs.julialang.org/e...

Christopher Wellons

Thanks for the info. I've since found out that Julia evolves so fast that my article is quickly becoming dated, especially with 0.3.0 now out. It leaves me excited for the future.


Emacs Lisp Defstruct Namespace Convention

Daniel Hackney

This is exactly what I ended up doing in my "package.el" rewrite. It would've been nice if "cl-defstruct" did this by default, but the alternative isn't so bad.


Northbound 7DRL 2014

(no comments)

Duck Typing vs. Type Erasure

Dawid Midura

Very interesting. Seeing how the same concepts look in various languages is always entertaining.


Three Dimensions of Type Systems

someone

scoping is by no means related to type systems


An Emacs Foreign Function Interface

Kototama

Awesome. I think in general it could be useful if a programming language compiler supports incremental compilation through a library? Or maybe for tools like cscope if they have a library? Or you could build an interface on top of libptrace :-). Many cool ideas will come...

wasamasa

I can only think of silly things. Like doing a MOD player/tracker by using any of the appropriate libraries. Maybe extend that to do a non-puzzle game with either SVG or XBM/XPM for graphics and let the FFI part do sound. Or do a NES emulator instead of the game. And while I'm at it enhance the display engine to use OpenGL.

For less silly things, I'd love to replace the approach of building a binary that uses Clang to analyze code with an Emacs Interface to libclang. And avoid baking in crypto libraries maybe.

Christopher Wellons

Oh, crypto is a good idea.

wasamasa

Another thing that occurred to me would be the creation of thumbnails. image-dired shells out to ImageMagick and creates its thumbnails in a temporary directory, which is fine if it's just one or two, but slow for larger directories you want to preview. It would be cool if one could use FFI and imlib2 to just hand over the thumbnail directly to Emacs, which should be both faster and avoid temporary files.

x4da

We desperately need a C FFI for writing a Tox client for Emacs. http://tox.im
For now I can't see how to handle callbacks with your FFI. Is it possible?

Christopher Wellons

I only got far enough for it to be a proof-of-concept, which didn't include callbacks. Callbacks are technically feasible, especially since I'm using libffi, but would require reworking the backend protocol a bit. Currently the FFI subprocess is a slave that can only respond to requests by Emacs and it cannot initiate any action.

huix

I use ffi-call, but it produces an error; the Emacs minibuffer shows "(error "Process ffi not running")". How can I resolve this problem?

JohnKitchin

I would use this to get ZeroMQ bindings so Emacs could talk to Jupyter kernels directly. Also to integrate bson and mongoc for native communication with MongoDB. Finally, I would integrate something like the GSL to get numerical methods into Emacs. Those are my current itches :)

Christopher Wellons

I've read your articles on accessing GSL and ZeroMQ via dynamic modules / FFI. Interesting stuff. What you've already done with the other native interfaces will work way better than anything you could do with my little (incomplete) FFI toy here.

JohnKitchin

I have more or less come to that conclusion for all the FFI options available so far. I am leaning towards building up some C macros and helpers to make writing the modules easier for me. Thanks for confirming that!


Digispark and Debian

Juan Manuel

Excellent! Thank you very much, it helped. Juan.

sdsds

Yes, oh yes, thank you!

FWIW your blog entry is now number seven on the list when google is queried with "digispark micronucleus udev rules"! ;-)


Emacs Buffers as String Builders

Jon O.

Nice post. It's not relevant to your main point, but you could also replace your first use of `with-temp-buffer` + `let` with the `with-output-to-string` macro.

IMO, one of the nicer things about (Emacs) Lisp is that although it's full of side-effects, it's generally easy to ensure that any side effects are contained strictly within a particular lexical scope using special forms / macros: `save-excursion`, `save-match-data`, `save-restriction`, `with-current-buffer`, etc. Just like `let` allows you to delimit a local scope (which I also miss in other languages), these forms give you something like a locally-limited scope of side-effects.

Christopher Wellons

Huh, somehow I was unaware of with-output-to-string. I'll have to use it more often. I know I've looked for with-input-from-string before, which doesn't exist in Elisp.

All those with-* and save-* macros are definitely handy. I often provide my own package-specific with-* and save-* macros in my packages.

Wilfred

Really interesting post. Since strings are mutable in Emacs, what do I gain from using buffers instead of strings directly?

Your article suggests that some operations are more efficient with buffers, but I'd be really interested to know what operations.

Christopher Wellons

Elisp strings are fixed length, so they don't support insertions or deletions. To get that you need to allocate a whole new string and copy the old contents. The one way they're mutable is that you can change the character at a specific position.

In contrast, buffers are implemented as gap buffers, which are really good at clustered insertions and deletions.
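So building output is just repeated insertion into the gap, with one copy out at the end. A small example:

(defun my-join-lines (strings)
  "Concatenate STRINGS, newline-terminated, using a temporary buffer."
  (with-temp-buffer
    (dolist (s strings)
      (insert s "\n"))                   ; cheap append, no reallocation per step
    (buffer-string)))                    ; one final copy out as a string

;; (my-join-lines '("foo" "bar" "baz"))  => "foo\nbar\nbaz\n"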

Artur

Good post. It's very easy to forget that temp buffers can be a lot more convenient than concat's and format's.
And apparently (point) is a valid place for setf? That's pretty cool. :-)

Christopher Wellons

The cl-lib package defines setf for lots of things. It's one of the best features of Common Lisp to come to Elisp. It cuts the size of APIs almost in half because there doesn't need to be a setter paired to every getter. The cl manual has a list of other setf places:
http://www.gnu.org/software...
I strongly prefer to use setf over specialized setter functions, especially now that setf is a core part of Emacs (gv). Also, if you think this is cool, be sure to look up cl-letf!
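A few of the places this covers, for example:

(let ((cell (cons 1 2))
      (items (list 'a 'b 'c)))
  (setf (point) (point-min))             ; same as (goto-char (point-min))
  (setf (car cell) 'new)                 ; same as (setcar cell 'new)
  (setf (nth 2 items) 'z)                ; no dedicated setter needed
  items)                                 ; => (a b z)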

Artur

Thanks for the link! I ended up writing about it.
http://endlessparentheses.c...

Christopher Wellons

Cool, nice work!


Upcoming Emacs Chat with Sacha Chua

PuercoPop

Nice, I hope you get to talk about skewer's innards!

Christopher Wellons

I'll try to remember to bring that up!

sachac

I'm definitely looking forward to asking you about skewer and impatient-mode, and probably a "*boggle* How does it... work?" =)


A GPU Approach to Voronoi Diagrams

Rekkal

Nice work. Kenny Hoff worked on something similar to this here (http://gamma.cs.unc.edu/VOR... back in the day. I think he was also able to extend it to three dimensions with slices. Very useful for fast motion planning.

Christopher Wellons

His SIGGRAPH presentation is really interesting. It covers my entire article, and a lot more. He even used a very similar red/blue/green cone image and the same phrasing as me to describe the exact algorithms: "complex and difficult to implement."

Flipping the cones over to get "farthest" is mind-blowing. When I initially got things working, my depth buffer was inverted and I was seeing an inverted display, but it didn't occur to me that this was useful information.

Oleg

awesome

Bertjan Broeksema

Pretty much what I've been playing with lately. You should have a look at: http://www.cs.rug.nl/svcg/D... and this paper as well: http://www.cs.rug.nl/~alext.... Telea actually describes the same approach as your cone-based one, but he uses a half-ball texture instead to get somewhat nicer shading.

Christopher Wellons

Interesting links. I do like the way the ball shading looks.

simon

The technique of drawing cones is mentioned in the original OpenGL red book:
http://www.glprogramming.co...

Christopher Wellons

I really need to read a recent version of the red book sometime. Since writing the article, something I'm discovering is that *lots* of people have had this cone idea years before me.

Rafael Martins

You can also simulate the "cone" using only GL_POINTS and a big gl_PointSize. Set each fragment's depth to its distance from the center (0.5, 0.5) and only the closest fragments to each seed will be drawn. The only limit is that gl_PointSize has a hardware-dependent maximum value (on my relatively high-end rig it's 200). For me it's not really a problem because I'm also discarding points that are too far from any seed (which makes them look even more like cones). Nice article!

Rafael Martins

Well, come to think of it, if you need to overcome that limit you could probably just draw a quad around each seed and send the center as attribute to each vertex. This would mean just 2 triangles for each seed instead of 64.

Christopher Wellons

In my experience a typical gl_PointSize limit is a mere 64. Even though you can kind of make it work, it's not really the right tool for the job. It also faces the same problem as your quad idea: in plain WebGL there's no gl_FragDepth. It's available as an optional extension, though.

When this article was discussed on reddit someone had an idea similar to this that involved two extensions, EXT_frag_depth and ARB_conservative_depth.

http://redd.it/273vtq

It would be faster because it would take better advantage of the fragment shader and depth testing.

Rafael Martins

Oh, right. Thanks for the clarification and the link. I was about to start porting my stuff to WebGL and now I can see I'll have some bumps -- I didn't know OpenGL ES 2.0 had no gl_FragDepth. Also I didn't know about conservative depth, pretty nice.

Ryan Kaplan

I enjoyed this article - thanks for writing it! I just wrote my own article on generating Voronoi diagrams on the GPU using a method called Jump Flooding. I thought you might find it interesting: http://rykap.com/graphics/s...


Emacs Chat with Sacha Chua

Malyshev Artem

The Magnar Sveen chat has many tips to discover too

Xah Lee

Hi Christopher. I'm doing the transcript for Sacha. There are a few places I couldn't make out. Sacha indicated you might be interested in looking at it. Would you be?

By the way, it was a fantastic Emacs chat. Quite a lot of very cool and heavy things there. I plan to play with, I think, all of them.


Per Loop vs. Per Iteration Bindings

sunny1304

Nice article. Thanks.


Tag Feeds for null program

mvanderboom

Are you planning on making elfeed 'a bit smarter' and use the atom UUIDs to avoid duplication?

Christopher Wellons

Someday Elfeed will get a database makeover in a big 2.0.0 release. There were mistakes that I'd love to correct, mistakes I made in both the database design and in my feed modeling, but it would require breaking things. I consider breaking things a really big deal -- it forces all my users to spend time fixing and debugging their own extensions -- so it's not something I'll do lightly. To help with the transition, I'd make it play nice with the new MELPA stable: users should be able to keep using the older version for awhile until they're ready.

The database update might be EmacSQL, or it might be a similar solution but in pure Elisp (a pure Elisp RDBMS?). I'm still gauging how disruptive EmacSQL is to users with its native binary requirement. Updating it is currently a little annoying, having to compile SQLite from scratch every time. I already wrote half of an EmacSQL port for Elfeed (there's an emacsql branch in the Elfeed repository), but I need to spend time *really* making sure I get the database schema and API just right.

mvanderboom

Thanks for the explanation. I didn't realize it would be that big of an undertaking.


A GPU Approach to Conway's Game of Life

Xah Lee

Are you on Linux or Windows? I'm on Linux now and I found that most JS libs for WebGL don't work (but they do when I boot to Windows, same machine).

My machine is a plain $500 PC desktop from 2012, sans dedicated GPU.
Is this due to my machine, or mostly due to Linux lacking GPU drivers?

By the way, I did Game of Life heavily in the 1990s. This post and much of your code are seriously interesting. (⁖ your Atom vs RSS post is right on the spot. Elfeed I'd be seriously interested to look at (not so much Gnus). Your Skewer mode is also very interesting, and EmacSQL, which to me is quite non-trivial. In particular, it seems you have a DSL there that maps to SQL... I meant to ask, did you invent that on the fly? (I haven't dug into them yet. They are very interesting to me both as usable tools and as elisp implementations.))

Thank you.

Christopher Wellons

I run Debian Sid (unstable), and both Iceweasel (Firefox) and Chromium support WebGL here. I believe these browsers also support WebGL in Wheezy (stable), but I'm not certain of that. This is despite me using only open source video drivers. It's not as fast as it could be, but, outside of performance, they work just as well as any proprietary drivers. If I were to guess why it's not working for you, it would be because your browser version is too old, WebGL is disabled (it's disabled by default in Chromium), or you don't have some important OpenGL library installed. In the very worst case where you had no real graphics card at all, you would be running pure software OpenGL with Mesa or something like that, so it would still work, just super slow.

In my last paragraph where I tested the performance, I used my wife's desktop, which runs Windows and has a GeForce GTX 760. (I drove the benchmark from my laptop via Skewer, so she didn't even have to step away from her machine for this. :-)

The EmacSQL DSL was a series of trial and error. It changed a whole lot over the month I was working on it, which the Git log reveals if you really dig into it. I was trying to follow the full SQL grammar very closely for a while, but found this to be both tedious and messy because I was trying to normalize it across SQLite, MySQL, and PostgreSQL. I ended up falling back to a simple ruleset which seems to cover almost everything. In the worst case you can always pass a plain old string of SQL if the DSL doesn't do what's needed.

jdashg

"In OpenGL ES, and therefore WebGL, texture dimensions must be powers
of two, i.e. 512x512, 256x1024, etc."

This is not true. WebGL has full support for NPoT textures, but:
* NPoT textures must use either NEAREST or LINEAR for min and max filters. (default for min filter is NEAREST_MIPMAP_LINEAR)
* NPot textures must wrap with CLAMP_TO_EDGE.

If you do wrapping in the shader, and use a non-mipmap filter, your NPoT textures should work fine.
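For illustration, a minimal OpenGL ES 2.0 / C sketch of the NPoT-safe parameters jdashg describes (not code from the article; any wrapping would then be done in the shader):

#include <GLES2/gl2.h>

/* Configure a texture so it is legal at non-power-of-two sizes:
 * non-mipmapped filters and CLAMP_TO_EDGE wrapping. */
void configure_npot_texture(GLuint texture)
{
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}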

Christopher Wellons

You're right, thanks! I just tweaked that statement to make it right. :-)

Rikkert Koppes

How would you arrive at 18,000 iterations per second? When I disable draw and reduce the interval to 0 in start(), I arrive at 200 fps max

Christopher Wellons

Don't limit yourself to one iteration per JavaScript interval. The JavaScript event loop will dominate the timing. Instead, put it inside a tight for/while loop within the same interval, and all inside a function (so you're not using a global variable for your loop counter).

Ultimately it's going to depend a whole lot on your hardware and drivers. Plus I don't remember exactly how I got that result!


Emacs Unicode Pitfalls

Alan

Regarding, "UTF-16 offers no practical advantages over UTF-8", many CJKV language characters in UTF-8 consume three or more bytes, while only two in UTF-16.
I'd consider that decrease in size to be a fairly practical advantage for many systems, especially embedded.

Christopher Wellons

The FAQ (#9) in the linked UTF-8 Everywhere article addresses this point. Real world CJKV data is usually smaller as UTF-8 because the markup and protocol data typically surrounding the text (XML, HTTP, file system paths, etc.) is plain old ASCII. Even with the markup stripped, the gains are small (21% in Japanese Wikipedia's case). And if compression is applied, as it often is, it makes no difference anyway.

anonymous

Perhaps, you may confuse "combining character sequence" and "grapheme
cluster".

'<base character> + <multiple combining characters>' doesn't mean
anything. At the least, `<base character> + <multiple marks (which is a
superset of combining characters), ZWJs or ZWNJs>` (combining character
sequence) is needed. There are characters and glyphs which are composed
of multiple base characters.

For example, try "M-x ucs-normalize-NFD-region" on "각" and see the
combining classes of decomposed characters with "C-u C-x =".

If you want 'reverse-string' to work with Emacs's cursor movement unit,
you may need to implement the "extended grapheme cluster" algorithm
described in Unicode Text Segmentation (UAX #29,
http://www.unicode.org/repo....

Perhaps it is an appropriate time to implement them in the Emacs kernel, so
that C-f/C-b works with this unit, or M-f/M-b works with the word unit
described in this standard, too.

Christopher Wellons

Yes, I was using "grapheme cluster" as a synonym for "combining character sequence," but I see now that that's outdated. I couldn't find in which version of the standard that extended grapheme clusters were introduced, but it must have been recently enough that the Unicode resources I was reading predate them.

Unfortunately, Emacs cursor movement doesn't always step by grapheme cluster boundaries, and the behavior even changes depending on the circumstances (i.e. interactive vs. non-interactive). I was going to mention it in my article, but I couldn't make sense of it. You're right that it would probably be best to fix it within Emacs itself, then lean on that for things like string reversal.


Feedback Applet Ported to WebGL

(no comments)

A GPU Approach to Path Finding

Tim

Well this is pretty awesome.

Christopher Wellons

Thanks!

Jesse Himmelstein

Fantastic work! The diagrams make it really clear how the algorithm works.

Quick question: how did you generate the GIFs?

Christopher Wellons

I used gif.js to gather up all the frames into a single, convenient "download." Unfortunately gif.js doesn't produce very optimized GIFs, even on the highest quality setting. So after I save it to my filesystem, I explode it with Gifsicle and put it back together at a fraction of the size. I talked about some of the process here:

http://nullprogram.com/blog...

And here's the gif.js website:

http://jnordberg.github.io/...

Steven Stewart-Gallus

One annoyance with this approach is that it finds a shortest path according to taxicab geometry and not Euclidean geometry. Of course the problem is even more complicated because in the real world one might want to find the shortest path according to many different types of non-Euclidean geometries. Also, in a game one probably wouldn't want to find the shortest path according to Euclidean geometry but in the sort of weird approximation to Euclidean geometry based off a grid that most games use.

Daniel Tebbutt

I imagine you could do it on a hex grid pretty easily, because it still holds that each cell is equidistant to all its neighbors and it maps onto a square grid (just need to change the definition of 'neighbor').

Christopher Wellons

You're right. It could just use offset coordinates on a normal square grid to convert it into a hex grid:

http://www.redblobgames.com...

anon

Awesome job!

Dan

No heuristics? Without heuristics this is not useful.

Christopher Wellons

What do you mean by heuristics?

Alan Wolfe

I think he means the feature of A* where you can give it some hints about better paths at branching, versus the brute force flood fill that is Dijkstra's algorithm. It's kind of odd though in this case to think of heuristics because you are running a program per pixel regardless, so you can't really "early out" based on heuristics as easily. Maybe you could quit out the pixel shader more quickly and then open up shading units for (possibly) more useful pixels to run, but dunno. Maybe the shortest path algorithm could try some heuristics? Not sure if that'd work. Anyways interesting read! There's a decent amount of this stuff on shadertoy.com too btw. Heck, for all I know you made those too though (:

mattbaker

I disagree, it's a great article on using the GPU to do computation in WebGL. If you're looking for path finding algorithms using heuristics I'm sure the author could point you in the right direction.

Dan

It's a good article on implementing an algorithm on the GPU, however I find the algorithm itself to be largely useless in terms of pathfinding as the CPU will beat the GPU for large terrains. The algorithm here is nothing more than Dijkstra's algorithm, which, of course, can be useful too. I would have preferred the article focus on more practical applications of Dijkstra's algorithm, like perhaps a harvester finding the actual closest resource to go and harvest. So long as the algorithm can halt when a resource is found, it should be very fast.

mattbaker

I've been struggling to figure out how to offload computation of a large number of particles to the GPU in WebGL. I understand the general problem of needing to pass state to successive render calls, but I would always get confused trying to deal with textures to store data.

Your GoL article as well as this one are a great resource. I have a renewed desire to chase this :) Thanks for providing source code.

Christopher Wellons

Cool! I hope you figure it all out.

If you want to render the particles as you compute them, you'll run into that bottleneck again with WebGL. In OpenGL ES 2.0 and WebGL 1.0 there's no way to copy a texture directly into a vertex buffer. Fortunately, the upcoming WebGL 2.0, based on OpenGL ES 3.0, has this feature (pixel transfer).

mattbaker

All I need is to compute, store, and retrieve coordinates so I can process them. I can set the vertices of the particles in the vertex shader. My understanding is the only way to do that without transform feedback is to pass data through the texture? Either way, thanks for writing these things up :)

Christopher Wellons

If you're just crunching numbers you don't need to do anything with vertices. You'll be running the fragment shader pixel-to-pixel between textures, so you just need a quad like I do with cellular automata.

Initially pass the particles in as a texture. Treat a single RGBA pixel as a 4D (or less) position vector for a single particle. If you want velocity, too, use a second texture as another per-entity 4D vector. Acceleration? Another texture, etc. You can really only write one texture per draw (though there's also depth and stencil to consider), so you'd need a separate draw to update each of position, velocity, acceleration, etc. When you're all done, pull them back out as image data (glReadPixels) and parse it back into your own data structures.

katopz

Great article! Really have fun reading along the way! :D

Christopher Wellons

Thanks!

Ahmed Fasih

Really awesome!!! Is generating im/perfect mazes as computationally juicy as solving them, or is that step much easier?

Christopher Wellons

Overall, solving a maze is probably about as difficult as generating one. In the case of a random depth-first search maze, it's the exact same process but with a slightly different halting state.

There's a cellular automation called Mazectric (B3/S1234) that generates patterns closely resembling mazes, but they're not perfect mazes. You can convert my Game of Life demo into a maze generator by changing one line of code in the fragment shader (the sum == 2 part). I haven't thought of any other way to generate mazes on a GPU.

The file maze.js in this project has three different maze generators: DFS, Kruskal, and Prim. The generators don't know anything about the GPU or the cellular automation, so they could be re-used anywhere. They each produce different styles of maze, enough so that you could recognize them by watching the cellular automation run on them. You can try this yourself by cloning the repository and editing main.js to choose a different maze type. DFS mazes have long winding paths, so the flood fill is far less dramatic. Prim mazes are like swiss cheese (while still being perfect), with the flood search spreading evenly. Kruskal is in between these extremes.

There are algorithms that can generate infinite perfect mazes, only keeping track of two rows at a time. This was popular back when printer paper was attached continuously (perforated). The printer could keep going and going and going printing out an unbroken maze until it either ran out of ink or paper.
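As a rough C sketch of the one-line rule change mentioned above (the demo actually does this in its fragment shader, so this only illustrates the B3/S1234 rule; "sum" counts live neighbors, excluding the cell itself):

int next_state(int alive, int sum)
{
    if (alive)
        return sum >= 1 && sum <= 4;  /* Mazectric survival: 1-4 neighbors */
    else
        return sum == 3;              /* birth on exactly 3, as in Life */
}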

ftfish

Thank you for this interesting post! I'm trying to implement this in OpenCL :)
BTW, it is called a cellular automaton (pl. automata), not automat*I*on.

Christopher Wellons

Thanks! Somehow my whole life up until now I had always read it as cellular "automation" in my head without realizing it. I don't know how I messed that up so consistently all this time.

Another recent example of this for me is depreciated vs. deprecated. For a long time I didn't realize they were two different words.

Mubbasir

Excellent post!

Have you seen these two papers that tackle the problem in a similar fashion?

Dynamic Search on the GPU
http://ieeexplore.ieee.org/...

For adaptive resolution grids:
http://people.inf.ethz.ch/k...

Christopher Wellons

I wasn't aware of these. So, thanks, I'll take a look at them!

Gary Frost

Very Cool. I know this is old, but just came across it today. I was planning on writing a track layout app using GLSL compute shaders. I never considered using a FSA.

I was involved in VLSI routing at one stage (when 3 level metal at 3 micron was 'state of the art' - so we are talking 1980s) and track/maze routing was solved using either Channel Routers or Flood Style routers. The FSA approach above looks like a specialization of Lee's routing algorithm (https://en.wikipedia.org/wi... which is a brute force Flood router. With flood routers, one approach to halving the time taken was to flood from both ends. I wonder how hard it would be to modify your algorithm to do this.

Again a great read, and very well explained. Kudos.


A GPU Approach to Particle Physics

philogb

Amazing article. Thanks so much for writing these! One question: have you considered using floating-point textures instead of "regular" ones? That way you don't need packing / unpacking.

Christopher Wellons

Floating point textures would definitely be pretty handy for this sort of thing, but, unfortunately, WebGL doesn't support them. If I were to port this to desktop OpenGL, that's what I'd be using.

philogb

Oh they're supported as an extension: http://www.khronos.org/regi... and apparently 92% of people using WebGL have them: http://webglstats.com/ (look for WebGL extensions) :)

Christopher Wellons

This is perfect, thanks! I bet the 8% that doesn't support it overlaps almost entirely with the same devices that don't permit textures in vertex shaders, so that downside is moot.

So far my OpenGL learning has been entirely from the WebGL specification, the linked man pages, the OpenGL ES specification, and occasional questions for a fairly knowledgeable friend. (Google searches are generally fruitless because not many people are talking about WebGL yet.) None of these resources cover WebGL extensions, so I need to spend some time learning them.

philogb

Also.. I wonder if there's an automated way to create the normals for the obstacles, using maybe some distance field generation? Would it be easy to automate that?

Christopher Wellons

What do you mean by automated? Right now it's handled entirely by a really simple fragment shader (ocircle.frag), which is probably the most straightforward way to do it. It only runs when the obstacles change, so it doesn't need to be very fast.

Fredrik Nordeng

Can you change every particle into waves?


An RC4 Password Hashing Function

possiblywrong

Very interesting read. A question occurred to me about the practice you mention of slowing down a hash by iterating a relatively fast hash function, viewed as mapping its output space S back into itself (i.e., f:S->S). I wondered if, by doing this, there was any potential danger of making that hash space significantly smaller (thus making inverting easier), from the amount of "non-one-to-one-ness" of f. And in practice, do we know how "non-one-to-one" are typical hash functions in common use?

After some poking around, I found what looks like an answer here, suggesting that this isn't a big deal... but I didn't quite understand some of the argument in the top answer, particularly the following:

"The lengths of both the “tail” (the values you get through before entering the cycle) and the “cycle” itself are, on average, 2^(n/2)."

This is an interesting general result, that didn't appear obvious to me. After some more searching, I don't think it is :). The following paper addresses just the cycle length (not the tail):

http://www.ams.org/journals/tran/1968-133-02/S0002-9947-1968-0228032-3/S0002-9947-1968-0228032-3.pdf

Adr

The way the salt was added is very weak; basically the salt is known and you shuffle S in a predictable way, so not much added value, cryptanalysis will be similar.


Making C Quicksort Stable

Russell Borogove

The glibc suggestion of sorting on address as a fallback key works if you pass qsort an array of pointers rather than an array of structures; the pointers will be swapped in place, instead of the things they point to. The swaps themselves will be faster in the (common) case where the structure is larger than the platform pointer size as well.

possiblywrong

My attempt at markdown seemed to make a mess. Let's try again:

But would this actually be stable in the desired practical sense? You would need a[i] < a[j], comparing memory addresses, for all i < j, before even calling qsort(). This would only be the case if you started with a contiguous array of structures, and added an extra level of indirection solely for the sorting. Allocating your individual structures via individual malloc() calls, which seems more likely in this scenario, would make the resulting pointer comparisons not even defined.

Christopher Wellons

Disqus swallowed and barfed possiblywrong's comment making it hard to read, but he's right. It wouldn't work with malloc()-allocated structures for two reasons. 1) Because you'd be falling back on whatever order malloc() happened to put the structs in memory. 2) More importantly, as possiblywrong said, comparing pointers returned by malloc() is undefined behavior (C11 6.5.6, 6.5.8). Only pointers within the same array can be compared this way. As an example why pointer subtraction is tricky: the distance between two arbitrary pointers can easily be larger than any representable negative integer value.

Suppose your structs were allocated in an array and you create a secondary array of pointers to these structs for sorting. You're right that it would probably be faster if the structs are large. However, when you're done with a sort, you need to move all the structs around in the original array to reflect the new pointer array order, which is basically another whole sort if you wanted to do it in-place. Ultimately you would be doing the same thing as my "order" field but storing it in a separate array in the form of pointers.
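For illustration, a minimal C sketch of the order-field tie-break being discussed (the struct and field names here are hypothetical, not the article's code):

#include <stdlib.h>
#include <stddef.h>

/* "key" is what we sort on; "order" records the original index so ties
 * keep their input order. */
struct item {
    int key;
    size_t order;
};

static int cmp_stable(const void *pa, const void *pb)
{
    const struct item *a = pa, *b = pb;
    if (a->key != b->key)
        return a->key < b->key ? -1 : 1;
    /* Tie: fall back to original position, making the sort stable. */
    return a->order < b->order ? -1 : 1;
}

void sort_items(struct item *items, size_t n)
{
    for (size_t i = 0; i < n; i++)
        items[i].order = i;             /* stamp original positions */
    qsort(items, n, sizeof(items[0]), cmp_stable);
}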

Rob Thorpe

I believe Glibc qsort documentation says this because it implements qsort using mergesort, not quicksort. It's not an in-place sort; the input array stays where it is and an output array is built separately.

However, I seem to remember that it reverts to quicksort for large arrays, so it may still be wrong.

Christopher Wellons

Yup, the quicksort fallback is what makes that not true even for glibc. Plus it's not a good idea to rely on glibc's particular implementation anyway. Using a GNU extension (e.g. qsort_r()) is different because it will be an obvious compile-time error when used with an incompatible libc, but subtly using qsort() incorrectly won't be so obvious.


C11 Lock-free Stack

Chris Ryland

I'd think it would be clearer to use

lstack->node_buffer[i].next = &lstack->node_buffer[i + 1];

than

lstack->node_buffer[i].next = lstack->node_buffer + i + 1;

Christopher Wellons

Eh, I don't have a strong opinion either way. My one (weak) argument against your version is that, conceptually, it dereferences then creates a reference.

petiepooo

The top version is eminently more readable, and they likely both compile to the exact same assembly. While C supports pointer arithmetic, it smells bad to many, and is a barrier to comprehension, likely due to how infrequently it's used.

Sergey

In your description of ABA problem, it's not really clear why the head->node->next pointer would be "pointing somewhere completely new". If thread B did nothing but a single pop() and then a single push(), then it would be correct for thread A to succeed with its CAS upon waking up. I think at least one other pop() or push() operation should be performed to cause the problem. Do I understand correctly?

Christopher Wellons

You're right. There would need to be at least one additional operation, either pop or push, before the node gets recycled for the ABA problem to truly show itself.

Example: say the top three nodes are A, B, C. Pop A, pop B, recycle-push A. One thread saw ABC but it's now AC and CAS still succeeds. Or, pop A, push X, recycle-push A. The thread saw ABC but now it's AXBC and CAS still succeeds.

petiepooo

I must be missing something.

In your pop primitive, you copy the contents pointed to by head into the orig and next structs before entering the do/while loop. If the call to atomic_compare_exchange_weak(...) fails, and the loop repeats, would you not want to reload orig and next structs with fresh contents from a new call to atomic_load(head)? Without that, assuming it's failing because another thread has successfully changed the contents of head, what you originally copied from head will never match, and the loop would repeat endlessly. Right?

If I'm not missing something, I believe the same logic would apply to the push primitive as well.

Christopher Wellons

The call to atomic_compare_exchange_weak() updates "orig" automatically as a side effect in the case of a failure. This is because any reasonable call to atomic_compare_exchange_weak() would need to do this anyway as part of retrying.
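A small C11 sketch of the retry-loop shape this enables (a hypothetical example, not the stack code itself): on failure, the latest value lands back in "expected", so the loop can retry without an explicit re-load.

#include <stdatomic.h>

void counter_add(_Atomic long *counter, long delta)
{
    long expected = atomic_load(counter);
    /* On failure, atomic_compare_exchange_weak() writes the value it
     * actually observed back into "expected" before the next iteration. */
    while (!atomic_compare_exchange_weak(counter, &expected, expected + delta))
        ;
}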

Eugene

I was thinking about using a very basic bounded (preallocated) stack to keep track of my thread ids in correct LIFO order, so I was wondering if my implementation is thread safe when called from multiple threads:

// we use maximum 8 workers
size_t idle_ids_stack[8];
std::atomic_uint_fast8_t idle_pos(0);

// this function is called by each thread when it is about to sleep
void register_idle(size_t thread_id) {
    std::atomic_thread_fence(std::memory_order_release);
    idle_ids_stack[idle_pos.fetch_add(1, std::memory_order_relaxed)] = thread_id;
}

// this function can be called from anywhere at anytime
void wakeup_one() {
    uint_fast8_t oldStatus(idle_pos.load(std::memory_order_relaxed));
    std::atomic_thread_fence(std::memory_order_acquire);
    size_t id;
    do {
        if (oldStatus == 0) return; // no idle threads in stack; exit
        id = idle_ids_stack[oldStatus - 1];
    } while (!idle_pos.compare_exchange_weak(oldStatus, oldStatus - 1, std::memory_order_acquire, std::memory_order_relaxed));
    // if we got here it means we can wake up the thread
    signal_thread(id);
}

Christopher Wellons

You have a race condition in register_idle(). You atomically grab a new index, but the write to the stack is done in a separate operation *and* without synchronization. What could happen is that you increment the stack pointer to "allocate" an index. Then, before you write the actual stack element, another thread pops off an element -- reading garbage, because you hadn't written the element yet. Then, to make matters worse, yet another thread goes to push, increments the stack counter, and writes its element. Finally the first thread writes its element, overwriting the newly pushed element.

This is the exact same fundamental problem faced by a theoretical many-writers, lock-free queue (imagine using a circular buffer). I'm not convinced there's an actual solution for this. I've never seen one.

Your idea could work if there was only a single writer and you change the operation order: write the new element first, then increment the stack pointer atomically+synchronously.
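A minimal C11 sketch of that single-writer ordering (names are hypothetical; this only covers publishing an element safely, not concurrent poppers):

#include <stdatomic.h>
#include <stddef.h>

static size_t idle_ids[8];
static atomic_size_t idle_count;

/* The one and only writer: store the element first, then publish the new
 * count with release ordering. */
void register_idle(size_t thread_id)
{
    size_t n = atomic_load_explicit(&idle_count, memory_order_relaxed);
    idle_ids[n] = thread_id;
    atomic_store_explicit(&idle_count, n + 1, memory_order_release);
}

/* A reader that loads the count with acquire ordering is guaranteed to
 * see a fully written element. */
size_t peek_idle(void)
{
    size_t n = atomic_load_explicit(&idle_count, memory_order_acquire);
    return n ? idle_ids[n - 1] : 0;
}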

hmijail

Thank you for a very interesting post. But it has left me with some doubts that maybe you could answer.

You seem to imply that C11's atomic support will only link if the processor's ISA does provide the needed atomic instructions, like cmpxchg16 in x64.
However, cmpxchg16 is in fact not atomic by itself! (see the answers at https://stackoverflow.com/q... , which include warnings from Intel docs). The way to force the access to be atomic is by adding a LOCK prefix to the instruction - but of course then it's no longer lock-free.

So this would mean that C11 atomics are lock-free as far as the C abstract machine is involved, but the underlying implementation is free to actually use locks. And so, a platform incapable of natively doing atomic accesses of a given width, could just implement them through locks.

Is that right or am I missing something?

(Granted, this is mostly language lawyer territory, but still can be important when discussing the merits of different algorithms...)

Christopher Wellons

You're right, it's not atomic on its own. However, this instruction has always supported the LOCK prefix (being its primary use case), so if you tell GCC cmpxchg16 is available, you're also indicating the atomic form is available.

The LOCK prefix doesn't take away its "lock-free" status. It's a simple overloading of terms and is a perfect example of why this technology has been misnamed. (This isn't mentioned in the article since I wasn't yet able to articulate it.) Lock-free would be much better named "wait-free." That is, progress always moves forward without waiting or blocking. The LOCK prefix isn't blocking until some specific event on another thread, it's just forcing the instruction to be atomic. It could have been named ATOMIC instead (though I'm not saying that would be more accurate).

Calling malloc() isn't wait-free because it may have to wait on another thread to release a mutex that locks the heap, or have to wait on the OS to allocate virtual memory for the process, which itself may involve swapping to disk, etc. If the thread holding the mutex dies, we will end up waiting forever. Compare that to a LOCK-prefixed cmpxchg, which will only loop to try again if another thread made progress in the same instant. It doesn't have to wait on anything.

hmijail

>The LOCK prefix isn't blocking until some specific event on another thread,

> [a LOCK-prefixed cmpxchg] ... will only loop to try again if another thread made progress in the same instant. It doesn't have to wait on anything.

Well, you *have* to wait until the other thread unlocks. And if there are N processors all trying to lock on the same region, you might end up having to wait for the N-1 operations to end - or even more, if there is no protocol against livelock/starvation.

How is this not blocking or waiting?

Anyway, the overarching question would be: can you point to any concrete reference which says that C11's atomics have to be implemented lock-free? What I had read up to now is that it's not terribly well defined in the standard.

Christopher Wellons

The hardware guarantees that you won't wait forever and progress will be made regardless of the activity of the other processors. It's definitely *not* the sort of lock that concerns lock-free programming. Lock-free programming is concerned with operations involving the OS scheduler (semaphore, mutex, read/write), which won't reschedule (i.e. block) a thread/process until another thread/process has completed some action.

Jason

Thank you for your post, it's very helpful to me.
I would like to ask about the concept, for example:
If usr1 wants to call Push() on the stack while, at the same time, usr2, usr3... want to call Push() or Pop() on the same stack, then all users except usr1 have to wait until Push() finishes.
How does lock-free improve the performance of the stack data structure if those users need to wait for each other's operations to finish?

Thanks in advance.

liblfds admin

The responsibility for memory allocation can be shifted to the caller, rather than being hard coded in the library itself.

Have the user pass in a pointer to a stack element structure, and operate on that. The user is responsible for allocation, and for pre-allocation. Typical expected use is to embed one such structure into the structure which is being pushed to the stack.

This is in fact vital, because with NUMA, a single contiguous allocation of stack elements is not by any means guaranteed to be the correct choice.

One other note - backoff in the face of contention is in fact *ESSENTIAL* to performance. Backoff should vary exponentially. A random selection within the current backoff's maximum possible period is not necessary - it makes no difference to performance (prolly because the threads themselves are random enough in the first place). Optimal backoff varies by the number of threads and load, so a fixed backoff period is not really a solution at all. Without backoff, performance is no better than lock-based stacks.

liblfds admin

One other note - the use of a struct for the stack element is risky in that there is an assumption there will be no padding between the structure members.

Note also that ARM supports DWCAS.

kk

I think there is a problem with your pop() function. You should modify 'orig' again in the loop; IIUC 'orig' is set only once, and hence if 'head' moves ahead due to updates performed by other threads, you would be performing the pop() with stale values. Please correct me if I am wrong.

Christopher Wellons

That's a good observation, but atomic_compare_exchange_weak() handles this automatically. It's why the "expected" parameter is a pointer. This is what I meant by, "If not, it reports a failure and updates the expected value to the latest value."

Bob

Fascinating stuff. I've been playing with your code and so far I have no problems when used with posix threads. Currently I'm trying to figure out how to implement the lock-free queue. Have you figured it out?

Christopher Wellons

I never figured out a general many-writer, many-reader lock-free queue, and I currently don't think it's possible with the currently-available primitives. But I did figure out a single-writer, many-reader lock-free queue, which is particularly useful as a work queue:

https://github.com/skeeto/l...

Bob

I was able to work out a many-writer / many-reader concurrent queue. However it's not perfectly lock-free since I ended up designing a lock using atomics. If you or anyone would like to take a look then shoot me a line with a way to send the demo source over. It would be good to thoroughly test it before using it in any place important.

Paulo Torrens

According to the C11 standard, §6.5.2.3-5, accessing a member of an atomic struct/union is undefined behavior, and you seem to do that on your lstack_init() function. Well, it's not reasonable that something wrong would happen there, but it's still possible nevertheless...

Christopher Wellons

Thanks, you're right. I've fixed the code to initialize without accessing individual members, though the resulting generated code is exactly identical anyway.


The Billion Pi Challenge

(no comments)

Global State: A Tale of Two Bad C APIs

(no comments)

Emacs Autotetris Mode

woodrat

emacs23 needs cl-lib to run it, but it works well on emacs24

Jordon Biondo

Wow on its first run it beat my high score!


C Object Oriented Programming

Gavin Black

Good write-up, and I'm curious if any of the *nix kernels do any OOP style in C nowadays? From the little bit of prodding I've done, it doesn't look like it. Also I'm sure you've probably already hit upon the ooc project and semi-related book (http://www.cs.rit.edu/~ats/... as well. They put almost everything in macros to enforce standards it seems.

Lately I've been tending to write my more complicated C-code in functional style using function pointers and then making small functions that behave like map/foldl. Although that probably wouldn't be maintainable for big projects, I have had poor interns take over my code without too many problems...yet :P

Christopher Wellons

Linux uses this sort of thing all over. Here are a couple of articles,
https://www.kernel.org/doc/... http://lwn.net/Articles/444...

I actually did read OOC recently and I was actually intending on linking to it, but forgot. It goes further than I think is reasonable, getting a little bit *too* fancy, but it's still interesting.

A good reference counter "class," like kref referenced above, is really useful for functional style code. You can use it to manage shared data structures, as is often the case for functional style. Objects will know how to free themselves when the counter hits 0.

Michael Terry

Great post. Don't know C, don't have a CS degree, so this shone light on some issues which had been hazy to me.

manvscode

Great post!

kuangdash

Great Post!+1

Charles Lehner

Great article. Here is another container_of macro that I have been using. It embeds a check that ptr is of the same type as (type)->member. I haven't checked if it compiles to the same object code as the one using offsetof, but it works and offers some more type safety.

#define container_of(ptr, type, member) \
((type *)( ((ptr) - &((type *)0)->member) * sizeof(*ptr) ))

Christopher Wellons

If I'm not mistaken, finding the offset using 0/NULL like that is undefined behavior. So while that will work fine on every compiler and platform I care about, it's *technically* not portable. That's why C99 blessed us with offsetof(). I do like the tiny bit of extra type safety in your version, though.
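For reference, a sketch of the offsetof()-based idiom being contrasted here (this is the common textbook form, not necessarily a verbatim copy of the article's macro):

#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Example: recover the containing struct from a pointer to its member. */
struct list_head { struct list_head *next; };
struct node { int value; struct list_head link; };

static struct node *node_from_link(struct list_head *lh)
{
    return container_of(lh, struct node, link);
}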


LZSS Quine Puzzle

Shalom

Excellent article! Well explained.

Andrew

If you ever encounter a copy of Games People Play, let me know! I also enjoyed it in my youth, but of course my copy of the CD is also long lost. I would love to get my hands on it again.

Christopher Wellons

I managed to find a copy a few months back, and I created an ISO image. Here's a BitTorrent magnet link:

magnet:?xt=urn:btih:e3b64178dc6d8e551eb168a7abb03471b23baab1

Since this is all shareware and intended to be shared (though you probably couldn't order/register any of these games anymore), it's a perfect example of a legal torrent.

Andrew

Thank you, that's wonderful! Unfortunately, my client doesn't seem to be having any luck finding the torrent using DHT.

Christopher Wellons

Give this a shot:

http://skeeto.s3.amazonaws....

If that fails for you, too, I'll just put it directly on S3.

Andrew

That works nicely, thank you for going to the trouble. And good article, by the way. I've read and enjoyed a number of things on here over the years :)


How to build DOS COM files with GCC

possiblywrong

Very interesting-- and nostalgic-- read. I'll have to try playing this when I get home.

Joe Jamison

This method of tweaking gcc is perfect! I wanted to do the same for a
similar project of my own, but I would like to use MinGW, and am having
trouble getting the linker to produce my binary. I have tried some
tweaks, but can't seem to get a functioning executable to come out of
the toolchain.

You mentioned that you got it working in MinGW, "with an extra objdump step". What exactly did you do here? Thanks!

Christopher Wellons

Yeah, MinGW is a little wonky when it comes to this. Produce a normal executable by removing the OUTPUT_FORMAT directive from the linker script, and instead use "objcopy -O binary" to extract the contents into a headerless COM file (not objdump as I had originally stated). I tested it now with both MinGW and MinGW-w64 and it works fine for my game.

Joe Jamison

Awesome! I ended up using this in conjunction with the --noinhibit-exec flag set on ld so I could compile & run in one go.

David E Jones

I seem to be having a similar issue with mingw. Do you have the compile, link, and objcopy commands that worked for you in the end?

EDIT:
Ah, never mind, I've got it now. I think I was overcomplicating things. This is what worked for me on mingw32 if anyone else needs it: https://gist.github.com/dav...

Zirias

Very interesting read, gave me the idea to port a little curses-based snake game I wrote recently to DOS realmode. You might be interested that llvm/clang as well as gcc now support a "-m16" option that renders this a bit more convenient :) I had to resort to clang though because gcc dislikes clobbering of "ebp" which happens with buggy BIOSes when using int 10h 02h for scrolling ... if you're interested, here's my code so far: https://github.com/Zirias/c...

So far, there do not seem to be any optimizer issues with clang. When optimization changed behaviour, it was always something like accidentally counting on a register being 0 etc ...

Christopher Wellons

It's important to note that gcc's -m16 option doesn't do quite what one might reasonably expect it to do. GCC can't emit straight 16-bit 8086 code, instead using its ".code16gcc" hack to allow its 32-bit 80386 code to run in real mode.

Looks like it's a similar story for clang. Its -m16 option still uses some 32-bit registers and addresses, but, judging by the output, it does a *far* better job than gcc. Very interesting. I wonder how much the optimizer gets in the way (if at all?). That's a constant struggle with using GCC this way. When I originally wrote this article, I don't think this clang feature was available to me (running Debian Wheezy, with some 2012 version of LLVM/clang), so I hadn't considered it.

GCC breaking on ebp clobbering makes sense. That's an essential part of its expected calling convention.

Your libdos project is very interesting. It's got a surprising amount of stuff implemented! You should be aware that ELKS has a complete 16-bit standard C library (https://github.com/jbruchon.... There's a "Bruce's C compiler" (bcc) out there that can build pre-ANSI C programs and will link against it. I was poking around with these tools first before figuring out how to bend GCC to do the same job.

Zirias

Indeed, clang does the same thing using -m16: emitting 32bit code for real mode. So no way to have the clang-compiled binary run on anything prior to 80386. But who cares. If you look at this bugreport: https://gcc.gnu.org/bugzill... another reasoning for -m16 seems to be that the compiler should be aware of real mode and NOT doing optimizations that break when running in real mode. From what I tried so far, GCC gets it wrong sometimes, clang does a great job. Unfortunately it WILL introduce bugs, too, when attempting to compile one unit at a time, linking with ld.gold and -flto. So I was forced to change my Makefile to build the entire binary in a single compiler run -- linking without -flto would probably work, too, but given you have only 64K, it's unacceptable.

About the ebp clobbering: Not sure how clang handles this, I didn't bother to check the assembly because it "just worked", while GCC refused to compile.

Btw, thanks for the link. But what I'm trying to do here is just implementing enough runtime-stuff so my curses game will run (maybe plus a little extra when it's easy to implement, but minus curses, I'll just create an alternative I/O module using my custom conio interface). About bcc, I heard of it before -- well after reading your article, I didn't think it would be something I'd want to use :)

So, just thanks for this cool article that gave me the initial idea. It was a great starting point, too, and finding that clang seems to do a better job, I thought that was worth mentioning (it also does a nice job optimizing e.g. a whole lot of bit-shifting and masking C code to just 2 or 3 equivalent instructions -- I was amazed to see the assembly created with -Os for my dosversion() in core.c).

Zirias

As my project does some progress (I'm nearly there supporting my curses game, "just" a curses compatibility layer and an sqrt() implementation are missing) -- I wanted to let you know that I linked this blog post in my README.md for credit. I hope that's ok, if you're opposed please let me know.

Christopher Wellons

Impressive work! I bet a bunch of small curses software (roguelikes, etc.) could be ported to real mode on top of your project.

Zirias

Thanks Christopher. Now I can finally say: I'm done (so far). My curses-based game (https://github.com/Zirias/c... compiles and runs as a DOS .COM file -- great :) There were quite some surprises on the way, the biggest one being: although clang seems to do a great job optimizing the code, it messes up at the assembly stage (see http://stackoverflow.com/qu... .. Not sure whether only the LEA instruction is messed up, but that's what I found ...). My workaround for now is let clang output just assembly source and handle the actual assembly using GNU as. I'll try to create a minimal example for a bugreport to clang soon. Nevertheless, it works, so thanks again for this great blog post that initially gave me the idea :) For porting other curses-software, well, so far my curses implementation is merely a "shim" providing just what my game needs....

Zirias

Posting one more comment because it could save your readers quite some though debugging: It appears the linker will not occupy *any* space for .bss (uninitialized variables of static storage in C) using the binary output format, but just assume .bss uses anything appearing directly after the file contents. EDIT: what I posted here before is NOT the solution! Still trying to figure out ...

FIXED IT: space occupation for .bss was ok, the problem was COMMON symbols didn't get any space in the output and so "competed" with .bss AND the heap, leading to all kinds of data corruption. Fixed linker script: https://github.com/Zirias/c...

bsa

Link is broken :(

Jan Minar

s/DOS will so most of the setup/DOS will do most of the setup/

Christopher Wellons

Fixed, thanks!

Андрей Иванов

Very interesting reading, thanks :) And there is a little typo: "COM files are limited to 65,279 bytes in side." (side -> size).

Christopher Wellons

Thanks! It's been corrected.

Free bee

gcc throws an error: /tmp/ccjO1BhE.o: In function `boot_entry':
boot2.c:(.text+0xb): undefined reference to `_GLOBAL_OFFSET_TABLE_'

How do I deal with this, sir?

Christopher Wellons

Looks like your C compiler produces position-independent code by default, which is definitely inappropriate here. Add "-fno-pic" to the build arguments and that should fix it. Are you compiling on a BSD system? I'm curious whose compiler does this by default.

Ryan Reno

I was recently following along with another person's blog writing a simple x86 bootloader that called C code.

In it I had to use the -fno-pic flag with both gcc 7.2.1 and clang 5.0.1 on linux 4.14. Looks like pic is enabled by default on the most recent versions of both compilers.

Thanks for this post by the way. It's really interesting how you worked around gcc doing its best to be clever.

Christopher Wellons

Yeah, it looks like -fpic is the default these days. I just added -fno-pic to the build to avoid these issues, but I probably should have done this years ago.

René Rebe

After SoundBlaster DMA and such, I'm currently going more crazy w/ S3 2D graphic acceleration, and the next video tomorrow will even be S3/Virge hardware 3D accelerated triangles: https://www.youtube.com/wat...

Christopher Wellons

Thanks for keeping me updated, René. It's impressive how far you've taken this. I always enjoy having a really solid specification / reference like that s3d manual. Opens up so many possibilities.

René Rebe

Thank you for starting this great hack ;-) I mostly do this also for some general education and to inspire people to work on low level / driver stuff as part of my YT channel. I just went further and implemented a hardware z-buffer for this S3/Virge acceleration https://www.instagram.com/p... and will make the usual video about this z-buffering and 3d engine details in the next days ;-) PS: it only flickers because of double-buffering (Yet ;-)


Hot Code Replacement in C

Arseny Kapoulkine

> Since we can't ask dlopen()about the inode of the library it opened, we can't know.

Of course you can. Just stat the file again - and repeat the reloading process if the inode changed (that'll trigger in some cases when dlopen already opened the new version but it's much better than missing updates).

Also note that file locking and updating inode are separate - e.g. you could dlopen the file even if the linker updated .so in place (and preserved the inode in the process). In fact you're relying on the exact details of linker behavior when doing the inode check as opposed to timestamp (granted, it's unlikely that a modern linker will write the file in-place).
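For illustration, a rough POSIX C sketch of the stat-and-recheck idea Arseny describes (names are hypothetical; as the reply below explains, inode reuse means this still isn't airtight):

#include <dlfcn.h>
#include <sys/stat.h>

struct lib {
    void *handle;
    ino_t ino;
};

/* Reload the shared object whenever the path's inode no longer matches the
 * one recorded at the previous load. Returns 1 on reload, 0 if apparently
 * unchanged, -1 on error. */
int lib_reload(struct lib *lib, const char *path)
{
    struct stat sb;
    if (stat(path, &sb) < 0)
        return -1;
    if (lib->handle && sb.st_ino == lib->ino)
        return 0;
    if (lib->handle)
        dlclose(lib->handle);
    lib->handle = dlopen(path, RTLD_NOW);
    if (!lib->handle)
        return -1;
    lib->ino = sb.st_ino;  /* best guess at what was actually loaded */
    return 1;
}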

Christopher Wellons

Since inodes are reused, the ABA problem is present and the race condition still exists. I'm pretty confident there's no way around it without either carefully writing to unique filenames or using some not-yet-existing extension to dlopen() to reveal more information about the loaded shared object file.

For example, start by loading version 1 with inode 1. Later version 2 is written to inode 2. The wrapper notices "libgame.so" now points to inode 2 so it unloads version 1 freeing inode 1, but, before loading, version 3 is written using inode 3, freeing inode 2 (it was unlinked and unopened). Next version 3 is loaded since that's the current file called "libgame.so", but before calling the second stat() to check the inode again, version 4 is written reusing inode 2. (Imagine we've got a *really* fast, unlucky programmer here!) Since stat() reports inode 2, it looks like we've got the latest version, but version 4 will never be loaded. Version 3 is loaded (inode 3) and the wrapper thinks it's got a hold on inode 2.

You're right that the linker could update in place via truncation, but I'm fine with relying on the linker to not do that. I believe such behavior would be incompatible with dlopen() for this project anyway, as specified by POSIX. It says: "Only a single copy of an object file is brought into the address space, even if dlopen() is invoked multiple times in reference to the file, and even if different pathnames are used to reference the file." That suggests to me that a shared library modified in place (i.e. the same file) can never be loaded a second time with the modified contents. It depends on what exactly they mean by "file", though.

jp

To do this reliably you might want to use inotify, which has an event type for when a file opened for writing is closed, which is exactly what you want.

Christopher Wellons

Not a bad idea. That would certainly eliminate polling and probably serve as a smarter, though more complex, solution. I've only used inotify a tiny bit, but couldn't inotify handling itself still be pre-empted by another file write/update?

The main disconnect in the API is that dlopen() takes a path (a filesystem link) rather than a file descriptor. That link can be updated/changed just before dlopen() reads it, and libdl won't tell us exactly what file (read: inode) actually opened/loaded.

Miguel Lechón

(thanks for the post)
My take on a simple version that avoids polling is to notify the binary of a rebuild by sending it a SIGUSR1 as the last step of the libgame.so Makefile target.

Christopher Wellons

Oh, using a signal is a good idea! I think that's what I'll probably do when I use this for real.
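A minimal C sketch of that signal-driven approach (the reload() call is a stand-in for whatever actually swaps the library):

#include <signal.h>

static volatile sig_atomic_t reload_requested;

static void on_sigusr1(int signum)
{
    (void)signum;
    reload_requested = 1;  /* async-signal-safe: only set a flag */
}

void install_reload_handler(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sigusr1;
    sigaction(SIGUSR1, &sa, 0);
}

/* In the main loop:
 *     if (reload_requested) { reload_requested = 0; reload(); }
 */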

Arseny Kapoulkine

Re: ABA - that's a good point, I haven't thought about that. This suggests that timestamps are possibly more reliable. The common case when timestamps fail to update is copying, so maybe timestamp+inode is a good combination...

Michael Terry

Great post.

Mr Go

Hehe, I'm wondering how this would look in go (golang), as shared libraries are rather hard to use with go.

Christopher Wellons

That's a good question. There's also the garbage collector to worry about.

behemoth

Interesting!

anonymous

> Due to Windows' broken file locking behavior, the game DLL can't be replaced while it's being used.

I feel like there is not enough investigation done here. What dwShareMode does LoadLibrary end up using when it opens the DLL? What dwShareMode do you use when you try to replace it? Have you looked into using handle based APIs in ntdll instead of filename based functions kernel32? That will give you more control over sharing flags.

It's not enough to say you don't understand the semantics therefore they are broken and not workable.

Christopher Wellons

Windows' default file locking behavior has wasted tens of hours of my time, for absolutely no benefit, and will continue to do so into the foreseeable future. It's the reason why Windows, even going on 2015, still has to reboot after every little trivial update. Perhaps there's *some* way to work around it sometimes using obscure system calls that only a handful of people know, but since no applications actually do this, it doesn't matter. I'm always tripping over unnecessary file locks in Windows. Most recent annoyance: at my workplace the mandated virus scanner (Symantec) locks my fresh builds for about 15 seconds for scanning, preventing me from rebuilding within that timeframe.

What makes it broken in this case is that it's locking the wrong thing. LoadLibrary() locks the link (the path) on the filesystem -- preventing deletion, and, in some cases, renaming -- when, if anything, it should be locking the file itself (as is the case on Linux). There's no technical reason for this. It's just a design mistake MS made a long time ago that they continue to do, I'm guessing, for the sake of backwards compatibility.

win_is_not_a_monster

What about something simpler? Move the DLL to a temporary location, load it from there, and monitor your original path for changes. When you catch a create&delete, write, or rename event on your DLL, you simply unload the old temp one, copy the new one to a new temp location, and load it again. Then goto 1. Then, if you have more time to invest in a nice solution, you could invent one that does not smell like a hack.

possiblywrong

Re the Symantec issue, we've also encountered this in closed environments, where it's configurable to keep Symantec out of designated folders. On your desktop, though, I'm not sure if you have enough control over Symantec to do the same thing. Agree that this is annoying :).

Christopher Wellons

You nailed it: this is happening to me in a closed environment. I had simply accepted it as an unavoidable annoyance of working on that platform, especially with the lock time being inconsistent. I'll have to ask about getting the configuration adjusted!

celeron504

Hello Christopher,

Love the post! Would like to get this working on my RHEL 6.2 system that has ncurses and ncurses-devel installed, yet when I attempt to run make, it fails miserably. I see there is an ncurses.h included in game.c.

What am I missing?

Christopher Wellons

Someone else mentioned having problems with it freezing when updating in Linux Mint, if that's what you're talking about. I've tested in Debian, CentOS 6.6, CentOS 7.0, and OpenBSD 5.6, but unfortunately I haven't been able to duplicate the issue so I don't really know how to debug it.

celeron504

Thanks for the reply! I cannot even get that far. When I attempt to build and run, the compiler spits out an error about -lncurses. Oh well, I will keep at it.

Muhahah

Windows allows dynamic load/reload of the DLL plus hotpatching of any PE image. Therefore, you just need to implement either of these two techniques.

plops

Thanks for this post. I just tried to create graphical output using libvncserver using this method and it works quite nicely -- even better than expected, really. I wrote a short description on http://fourierlisp.blogspot... and put the code on https://github.com/plops/ar...

Christopher Wellons

Interesting, thanks! It really is surprising how well it works.

Ahmed Fasih

If Python or Julia are interactive languages, this kind of approach is hyper-interactive: you can change values inside "the game loop" (one iteration of loading the shared library and updating your game state) and see the result right away. A "game loop" written in a standard interactive language wouldn't allow for this much granularity in changing values inside the loop at runtime.

Wow! What a neat idea!

Christopher Wellons

You're right about Python and Julia. It's why I prefer Lisp's packages, Clojure's namespaces, and JavaScript's nothing to Python and Julia style modules, which are locked away from runtime modification. Unless you intentionally break it yourself, the former three let you manipulate anything you want about the "game loop" while it's running, like the C version here.

Ahmed Fasih

Makes me want to learn erlang!

Robin Hack

I like this approach a lot. I see one more quirk: if you want to use threads, you must be careful.
What can happen is illustrated here:
https://github.com/marmolak...

Christopher Wellons

Yup, that's another tricky spot due to hidden global state. Since I wrote this article, multi-threading was added to Handmade Hero in the form of a work queue. The work queue is created by, and belongs to, the main program, not the game library. The game library is only swapped out when the work queue is empty, so it doesn't run into problems.

Playing around with your "badass" program, I can't figure out what specifically causes it to crash! I think it requires looking at the pthreads source. GDB is no help since the debugging info is removed along with dlclose(). I thought maybe it was similar to a problem I was having in my own example: ncurses was linked only with the game library, not the main program, and swapping the game library would unload and load ncurses along with it, without properly tearing down ncurses in between. This led to a crash and display issues on some systems. It could be fixed either by shutting down ncurses before unloading or by forcing (a smart linker might notice it's unused and not actually link it!) linking the main program with ncurses so that it doesn't unload. This doesn't solve the problem with your pthreads demo.

Robin Hack

What happens in "badass":
The badass binary maps libbadass.so into its own address space - you can check it via /proc/pid/maps.
The library has a constructor which creates a new thread. The thread (almost) immediately blocks on a sleep() call (or a long running I/O operation in the real world).
Back in the badass binary, right after dlopen() and the check, I call dlclose().
The thread is still stuck in sleep(). dlclose() removes the libbadass library from the badass address space and then just takes a nap via sleep(). And now the fun begins. We have a thread with an unmapped address space :)! So... the thread will wake up from sleep() and then try to write to the global variable x... which is in an unmapped area. Bang! :).
Actually I saw this in the wild with GLib and it's very hard to debug if you don't know that this can happen :).


BBC site Age check

(no comments)

Generic C Reference Counting

BruceDawson

Be aware that snprintf is not supported in VC++ up to VC++ 2013. Some clever people 'fix' this with "#define snprintf _snprintf" but this is dangerously invalid because _snprintf does not guarantee null-termination. This horror show goes away with VC++ 2015.

But I'm still not a fan of snprintf. As my blog post says, it still suffers from the requirement that the programmer explicitly specify the size, and programmers have been shown to get that wrong in at least half-a-dozen different ways.

It looks like atomic_fetch_add does a full memory barrier which is expensive, especially on ARM and PowerPC. A full memory barrier should not be necessary for a reference count. An explicit barrier might be needed after the count hits zero and before the call to free in order to ensure that all writes from other threads are visible to the freeing thread.
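For illustration, a C11 sketch of the reduced-ordering pattern Bruce describes (the struct mirrors the article's ref idea, but the ordering choices shown here are only a sketch, not the article's code):

#include <stdatomic.h>

struct ref {
    void (*free)(const struct ref *);
    atomic_int count;
};

/* Taking a reference needs no ordering of its own. */
static inline void ref_inc(struct ref *r)
{
    atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Dropping a reference releases; the acquire fence before freeing makes
 * the other threads' writes visible to the thread doing the cleanup. */
static inline void ref_dec(struct ref *r)
{
    if (atomic_fetch_sub_explicit(&r->count, 1, memory_order_release) == 1) {
        atomic_thread_fence(memory_order_acquire);
        r->free(r);
    }
}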

Christopher Wellons

I didn't realize until after writing this article that MSVCRT doesn't have snprintf(). I thought it did because MinGW does some tricks to provide a mostly-correct version (missing the same format flags as MSVCRT printf), despite linking against MSVCRT. That's the only Windows C compiler I ever use.

That's a good point about atomic_fetch_add(). I need to learn more about these low level atomic semantics.

newguy

Great post! I was using a simpler model for reference counting; this definitively beats it. I was wondering, why did you choose to use "static" for node_free?

Christopher Wellons

Thanks! I made node_free() static because it's not part of the "node" public API -- it wouldn't be listed in its header file. Only its function pointer, not the name, escapes the translation unit via ref's free field. Users of node linked lists would never call node_free() directly, instead going through ref_dec(), which will call node_free() through the function pointer (and not by name).

PerilousApricot

Great article. One thing. Even with the atomics, this seems not thread-safe. While executing node_pop()...

if (*nodes)

ref_inc(&(*nodes)->refcount);

....another thread could ref_dec and then free (*nodes) between the if and the ref_inc. I've spent some time pondering, but I can't figure a way out of it...

Christopher Wellons

You're right. That's definitely not thread safe, and what you pointed out is just one of several things that make it not thread safe. However, I never intended it to be thread-safe, just an example of linked lists sharing tails with little friction. Since I put the example after talking about atomic reference counting I can see why that would be misleading.

A few months prior I *did* write about a thread-safe, atomic, linked list stack:

http://nullprogram.com/blog...

Getting it right can be pretty hairy.

PerilousApricot

I was using node_pop as an example of how the refcounting could get weird, but I can't actually think of a counterexample where concurrent use of this refcounting would work correctly.

Every "toy" usage I try to think of fails because you can't both change the reference count and dereference the pointer atomically, and I can't think of how to get around it except for hazard pointers (guh).

Christopher Wellons

Here's an example of how the atomic reference counters could be used safely. Suppose you construct two linked lists, X and Y, in thread A and you make these linked lists share a tail somewhere along their length. Then you spawn threads B and C, and for each you increment the counters of X and Y and pass them to B and C, respectively. These threads are now partially sharing a data structure. Normally this would be tricky to clean up if we can't know when the other threads are done.

When using the atomic reference counters, any thread can safely free its linked list at any time without leaking. The first to call ref_free() will decrement the counters of its non-shared nodes to 0, free them, observe the first shared node as still being held, and do nothing more. The last to free the list will observe a decrement of all nodes to refcount 0 and free the remaining structure.

PerilousApricot

In your example, aren't there still windows where things can race? I guess this is the simplest test case I could think of:

obj *global_ptr = obj_new(); // ref initialized to 1

void thread_A() {
    // How can I be sure this will work?
    obj *local_ptr = global_ptr;
    if (local_ptr)
        ref_inc(&local_ptr->ref);
}

void thread_B() {
    ref_dec(&global_ptr->ref);
}

What keeps the following chronological ordering from happening?

if (atomic_dec(ref) == 0) //from ref_dec, refcount is zero.
atomic_inc(ref); // from ref_inc
ref->free(ref); // from ref_dec

Christopher Wellons

A thread that doesn't have a reference can't get one on its own. As you pointed out, it would be a race condition. However, a thread can safely be given a pre-incremented reference from another thread that already has a valid reference. That way there's no possible sequence of operations that would allow it to reach 0 early.


Goblin-COM 7DRL 2015

possiblywrong

This is very cool, particularly for its compactness! I like the addition of your own directives to panel_printf(); I remember once using vsprintf() to route the already-expanded string output to an LCD in an embedded application, but it never occurred to me to poke around *inside* the format string. Neat.

A question that might be more about the post than the program: how did you create the WebM video of the terminal?

Christopher Wellons

Something I discovered through this project is that with GCC you can tag your own format functions with the __attribute__ ((format ...)). That way you still get all the regular printf() compiler warnings. That was one of the main reasons I've hesitated to make vsprintf() passthroughs in the past, outside of simple macros.
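For illustration, a hypothetical printf-style wrapper tagged that way might look like this (the signature is made up; the real panel_printf() takes different arguments and supports custom directives):

#include <stdarg.h>
#include <stdio.h>

__attribute__((format(printf, 1, 2)))
void my_printf(const char *format, ...)
{
    va_list ap;
    va_start(ap, format);
    vprintf(format, ap);  /* a real version would route output elsewhere */
    va_end(ap);
}

With the attribute in place, GCC warns about mismatched format strings and arguments in calls to my_printf() just as it does for printf() itself.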

As for recording the video, ffmpeg has a handy screen capture capability:

https://trac.ffmpeg.org/wik...

I prefer to capture it as Y4M (raw, uncompressed) so that I can pass it through vpxenc to encode the WebM, then separately through x264 for the h.264 MP4 version.


A Basic Just-In-Time Compiler

Tom Galvin

Good read - interesting to see how you pulled off the solution!

Louis L.

Thank you very much. I tried few months ago to write a toy JIT, but I failed to solve the problem with executable memory. Your post is a very good point to start anew.

none
Christopher Wellons

I was going to mention GNU lightning at the end, but at this point I don't know why I'd use it instead of LLVM.

Ahmed Fasih

This reminds me that the best way to learn about a topic is sometimes (usually?) not to try and read about it directly but to see someone solve a real and readily-grasped problem using techniques from that topic. The whole "tell me, show me, let me" aphorism.

This really helped me understand what JITs (Julia, Matlab, JVM) do, but could you say a few words on the opposite: how do *interpreters* work? I'm trying to visualize what a small interpreter to solve this problem would look like: would it be a program that parsed the description and, instead of producing assembly, ran the appropriate arithmetic at each step? I don't think that's as enlightening an example for understanding interpreters as your code is for understanding JIT compilers. (I ask as someone who has never implemented a Scheme interpreter.)

Serious gratitude!

Christopher Wellons

You're right, an example is worth a thousand words. When I'm reading a specification, an example up front provides so much important context for the rest of the document. A lot of the time, an example is all I need in order to get something Good Enough for my needs.

An interpreter is a lot simpler than a compiler, which is why you'd usually start with writing one of those before getting into compilers. An interpreter steps through a program and does what it says as it walks the code. For example, when it sees a plus sign (or the AST equivalent), it performs an addition operation right on the spot against the runtime state of the program. Few major language implementations, especially newer ones, actually have interpreters like this, instead compiling to at least some sort of simpler bytecode representation at the last minute (then interpreting that instead!).

A compiler outputs another program, usually in a lower level language (x86_64 machine code in this article), that executes the original input program's instructions. For example, when it sees a plus sign, it outputs code to perform an addition.
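To make the contrast concrete, here's a minimal sketch of an interpreter for a made-up version of the problem (the article's actual input format differs): it reads operator/operand pairs and applies them to a running value on the spot, never generating any code.

#include <stdio.h>

int main(void)
{
    long x = 0;
    char op;
    long operand;
    /* Interpret: act on each operation immediately. */
    while (scanf(" %c %ld", &op, &operand) == 2) {
        switch (op) {
        case '+': x += operand; break;
        case '-': x -= operand; break;
        case '*': x *= operand; break;
        case '/': x /= operand; break;
        }
    }
    printf("%ld\n", x);
    return 0;
}

A JIT compiler for the same problem would instead emit, say, an add instruction for each '+' into executable memory and only run the result at the end.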


NASM x86 Assembly Major Mode for Emacs

invalid-id

Hi,
I haven't been programming with nasm for years, but your nasm-mode sounds elaborate.

Have you considered publishing the link to your nasm-mode repository on "http://www.emacswiki.org/em..."?

Nehal Patel

Hi -- Your emacs package is nice. Can you explain how to set the column location for right hand comments (I want them to be a little further to the right). Trying to look through your elisp code, I think it relies on (comment-indent) rather than having a column number set in your package, but I'm not totally sure how this all works.
Thanks

Nehal Patel

Actually it was easy enough to figure out. Using C-h f comment-indent shows the help, which has links that show how to set and save the column width.

Christopher Wellons

Glad to hear you got it figured out so easily. This is a perfect example of why there's value in building extensions on top of existing Emacs features: the extension inherits the user's existing configuration and the user may already know how to configure it. Your new understanding of comment-indent will directly carry over to other modes that also use it.

Ricardo

Your arguments against GAS are misleading. It is possible to use Intel syntax (-masm=intel) with it, and you can write shellcode perfectly well without glibc (-nostdlib).

Christopher Wellons

The GNU "-masm=intel" option produces/consumes assembly that looks like Intel syntax, but it's a second-class citizen. It's just a crummy version of Intel syntax with the wrong mnemonics. Also, it takes more than -nostdlib to write sophisticated shellcode. GAS doesn't allow for the necessary direct control over the assembly, particularly with addressing. It *really* wants to produce an object file to be consumed by a linker, leaving addressing details for the linker to work out. In order to get that sort of control, you need a custom linker script.

On the other hand, NASM handles all this stuff gracefully with a very comfortable flavor of Intel syntax. Its syntax is a bit too loose for automated code generation, and I wish it was a little stricter, but it's pretty great for hand-written assembly. Just beware of the GNU linker's braindead default of creating an executable stack unless the .note.GNU-stack "marker" section is included in every object file.


Raw Linux Threads via System Calls

Anon

>it takes less than 15 instructions to spawn a thread with its own stack

Well, on the userland side, yes you only write 15 instructions. But (sadly!) the kernel can't magically process your syscalls and spawn a thread in 0 instructions :)

Interesting read, though.

kant kodali

Nice article. So those user thread library calls will eventually get mapped to the clone() system call? Say, for example, pthread_create -- will that eventually get mapped to sys_clone() or clone()? Where can we see that mapping?

Christopher Wellons

Since the details of making a system call are specific to each architecture, and library authors generally want to minimize the amount of platform-specific code, a threading library would be encouraged to go through a portable syscall() function rather than hand code it in assembly, or call clone() in a standard library. The clone() function could be implemented on top of syscall(). However, taking a look just now at two of the Linux C libraries, glibc and musl, I can see they each have separate clone() written for each architecture. Here's where you can find that for x86_64 on each.

glibc: sysdeps/unix/sysv/linux/x86_64/clone.S

musl: src/thread/x86_64/clone.s

Ultimately, yes, no matter how you create a kernel thread on Linux you'll eventually make a clone() system call.
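As a rough illustration of the glibc clone() wrapper (the flags here are kept minimal; a real thread library passes many more, like CLONE_THREAD and CLONE_SIGHAND):

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int entry(void *arg)
{
    printf("hello from the clone()d task: %s\n", (char *)arg);
    return 0;
}

int main(void)
{
    size_t size = 1024 * 1024;
    char *stack = malloc(size);
    /* Share the address space; SIGCHLD lets the parent waitpid() on it. */
    pid_t pid = clone(entry, stack + size, CLONE_VM | SIGCHLD, "demo");
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}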

Bakka

Very nice article, thanks! Interesting read for sure!

Shamir Udi

Great article

wang xbing

Very impressive.

Ofir

Thank you for the fantastic post, as it makes a lot of sense of how thread creation works behind the scenes.
For completeness, I would add support for a thread function that receives an argument (similar to pthread's void* argument), by extending your thread_create function to take another argument.
This would probably need no more than two additional instructions:
push rsi
at the beginning of thread_create, and:
pop rdi
just before ret.

Christopher Wellons

Yup, you're exactly right. Otherwise it takes some synchronization acrobatics just to get that initial per-thread information to each thread.

Peter Cordes

Very nice writeup. One improvement, though: MAP_GROWSDOWN is considered harmful, and shouldn't be used even for thread stacks. The kernel doesn't reserve the address space below the mapping, so a future mmap can land right below the new stack, preventing any further growth. The only safe option is to allocate a big enough stack in the first place, and not depend on it being able to grow. (There's nearly no cost to mapping extra memory pages that you never touch, except for i386 code where virtual address space is limited.)

Ulrich Drepper even proposed removing it in Linux 2.6.29 (https://lwn.net/Articles/29.... See also https://stackoverflow.com/a... for more details.

MAP_GROWSDOWN also makes stack clash exploits easier, because the mapping can grow all the way into another mapping. See https://blog.qualys.com/sec... (more links at https://stackoverflow.com/q...

Christopher Wellons

Thanks for the heads up! From your links, I'm guessing the proper alternative is to reserve the entire stack at the beginning (no growing) and manually map a guard page/region below, preventing the kernel from picking those addresses for some other mapping. Though that's still subject to some kinds of stack clashing attacks unless the application is compiled with -fstack-check.

I did write about stack clashing recently, including linking the same article you linked: http://nullprogram.com/blog...
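In code, that reserve-plus-guard idea might look something like this sketch (sizes are illustrative and must be multiples of the page size; error handling omitted):

#include <stddef.h>
#include <sys/mman.h>

/* Returns the initial stack top, or 0 on failure. */
void *stack_create(size_t size, size_t guard)
{
    /* Reserve guard + size up front so nothing else can map in here. */
    char *p = mmap(0, guard + size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 0;
    /* Everything above the PROT_NONE guard region becomes usable. */
    mprotect(p + guard, size, PROT_READ | PROT_WRITE);
    return p + guard + size;  /* stacks grow down on x86 */
}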

Liam Huang

Could I translate it into Chinese and post it on my own blog?

Christopher Wellons

Sure, no problem. If you include a link back to this original article then I'll add your link to the translations list at the top.

Liam Huang

Sure thing!
I've finished the translation and put your name and the link to this post at the top of the translated copy.
The translation copy could be found at: https://liam0205.me/2018/04...

Thanks for your interesting post, again!

Christopher Wellons

I've added a link to your translation. Thanks!

Liam Huang

: )


Minimal OpenGL Core Profile Demo

esgames.org

Great article! Also don't forget to mention GLFW! It's more powerful than FreeGLUT (has swap control) but not as big as SDL.

http://glfw.org

Christopher Wellons

Oh, thanks! I actually intended to mention GLFW instead of SFML, but got them mixed up (LWJGL, SDL, SFML, GLFW, GLUT, GTK, FLTK ... too many similar names in this domain!). I might have used GLFW originally instead of FreeGLUT, but it wasn't yet in Debian stable at the time I made that choice, so I passed on it.

iongion

Dear zlohrd! You are so right with that enumeration!

kefeer

Thanks!

Nitpick: Either I am missing something, or there is a glBindBuffer(GL_ARRAY_BUFFER, 0) missing somewhere around line 189, just to be consistent.

Christopher Wellons

Good point, I just added that. That's a perfect example of keeping the OpenGL state tidy. It's a guard against accidentally modifying the state of an object left bound by another part of the program.

Oren Hazi

Nice write-up. I wrote a (mostly) 1-1 re-implementation of your demo in Rust to play with the gl and glfw bindings.

https://github.com/ohazi/op...

Christopher Wellons

This is great, thanks! As someone who doesn't know Rust (yet?), it's interesting to see how the idioms translate. I'm also impressed with Rust's OpenGL and GLFW bindings.


Mandelbrot Set with SIMD Intrinsics

Vasileios Anagnostopoulos

Could you expand in a later article in combined SIMD and multi-core?

Thank you.

siavashserver

It scales linearly: numCores*singleCoreSIMDPerformance

False data sharing and CPU cache pollution will have a negative effect.

Christopher Wellons

It was mostly summed up in siavashserver's comment: it scales linearly, with caveats about how those cores interact. The complexity of x86 makes it easier to write correct programs, but harder to reason about performance. It has out-of-order execution (instructions execute in a different order than written), but it has a strong memory model (memory reads and writes mostly maintain their original order). On the other hand, this makes it harder to tune performance, especially across microarchitectures. Then, if that's not complicated enough, throw in hyperthreading, where logical cores share a single physical core using CPU-level scheduling (underneath OS process/thread scheduling).
http://bartoszmilewski.com/...
http://preshing.com/2012051...

It's something I need to learn more about before I'd feel comfortable writing an article on it.

ttsiodras

You may find my own (similar) attempts - from a decade ago - interesting: http://users.softlab.ntua.g...

Christopher Wellons

Interesting, thanks for the link! I haven't actually written any SSE assembly by hand yet, just the intrinsics, so it's neat to see the more direct approach.

possiblywrong

Very interesting post. I have approximately zero experience with ARM, so if you continue further down that road I would love to read more.

Ahmed Fasih

Is there any way to simplify writing SIMD, so I don't need to write different code for sse2 vs avx? I.e., some way to convert intermediate-level SIMD-ish code into the intrinsics appropriate to the different architectures? I can imagine a half-baked source-to-source transpiler, but…

Also, did you find any indication that compilers can start auto-SIMDing things? It would be amazing if, say, Julia did this automatically.

Christopher Wellons

A question all the way from Ohio! :-P

There's certainly a lot of similarity between the different SIMD instruction sets. I haven't tried it yet, but perhaps what you're asking could be done via a high-level macro interface. Macros that expand to the target intrinsics and types (e.g. SIMD_ADD_FLOATS(), SIMD_MUL_FLOATS(), etc.). Some higher-level macros would expand to multiple intrinsics. For example, testing for the early bailout condition across all lanes is rather different between the different architectures, requiring multiple intrinsics in some instruction sets.
Unfortunately I've never seen compilers output SIMD instructions (outside of single-lane SSE) and I wouldn't expect it anytime soon. I can understand why: exploiting SIMD usually requires some significant architectural considerations (packed, homogeneous collections; ex: xxxxxx yyyyyy zzzzzz), which are often at odds with OOP (sparse heterogeneous collections; ex: xyz xyz xyz xyz xyz xyz).
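If someone wanted to experiment, the macro layer might start out as simple as this sketch (all the names here are made up; USE_AVX would be defined when building the AVX translation unit with -mavx):

#include <immintrin.h>

#if defined(USE_AVX)
typedef __m256 simd_float;
#  define SIMD_SET1(x)          _mm256_set1_ps(x)
#  define SIMD_ADD_FLOATS(a, b) _mm256_add_ps(a, b)
#  define SIMD_MUL_FLOATS(a, b) _mm256_mul_ps(a, b)
#else  /* baseline SSE */
typedef __m128 simd_float;
#  define SIMD_SET1(x)          _mm_set1_ps(x)
#  define SIMD_ADD_FLOATS(a, b) _mm_add_ps(a, b)
#  define SIMD_MUL_FLOATS(a, b) _mm_mul_ps(a, b)
#endif

/* An example kernel written once against the macros: elementwise a*b + c. */
static simd_float muladd(simd_float a, simd_float b, simd_float c)
{
    return SIMD_ADD_FLOATS(SIMD_MUL_FLOATS(a, b), c);
}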

Ahmed Fasih

Julia seems to have a @simd macro: https://software.intel.com/...

Christopher Wellons

Interesting ...

Ahmed Fasih

Having read your blog post, the Julia Intel article, *and* this article on fast numeric Haskell https://izbicki.me/blog/fas... all in the same day, I find myself really wondering if C/C++ isn't the way to go…? The Intel writer concludes that a dynamic prototyping language like Julia getting to within 2x stone's-throw from Fortran/C is “amazing”—and I would agree—but for the amount of knowledge, experimentation, and luck needed to get “smart” languages like Julia and Haskell to within that 2x, I wonder if those of us who need to get Real Work Done aren't better served just slogging through C… and that's not even considering the desire to squeeze blood from a stone and get to parity with a C implementation.

Christopher Wellons

Here's a very relevant video:

Fast as C: How to write really terrible Java https://vimeo.com/131394615

Summary: Getting high performance out of Java essentially requires giving up all the nice things about Java and writing some really ugly code in a narrow, C style.

I don't know what it is, but over the past year or so I haven't really had a desire to code in a language other than C (or C++ to a degree). I did a few things in JavaScript (supporting GUI) in the past year at work and it felt really fragile compared to the C work it was supporting. C's gotten so comfortable for me now (at least when I have near total project control!) that it doesn't feel at all like "slogging." About a year ago I also completely gave up on the Lisp family (except Elisp), in part because the runtime platforms are far too bulky across most of the implementations. I've been enjoying crafting small, tight binaries where I can account for nearly every byte of the result.

Ahmed Fasih

Now we have a nice API to SIMD in Rust too! Time to play.

notimportant

You may want to look into this:
ispc.github.io

The Intel SPMD Program Compiler exports to C/C++ and compiles to SSE2/4, AVX/2/512 and NEON. There is also a paper available.

Ahmed Fasih

Something to watch: SIMD.js: https://developer.mozilla.o...

Christopher Wellons

I saw a video and read about this a couple of years ago. I'm still skeptical about its usefulness.

WebGL operates asynchronously and, once everything is set up, only requires a couple dozen function calls per frame to perform a draw. The speed of the language making those calls is largely irrelevant.

SIMD on the other hand is not (truly) asynchronous. The heavy lifting happens within the context of, and is synchronous to, the JavaScript runtime. There's a lot of fanfare just to execute a single SIMD instruction. More so, I learned from this Mandelbrot Set project that NEON, unlike SSE and AVX, is really difficult to use effectively; it essentially requires hand-tuning the assembly. As mobile continues to grow, ARM is the platform that would benefit the most from better JavaScript performance, but, IMHO, NEON can't really be driven well by this sort of low-level API.

If something higher level, sort of like WebGL, was built-up around SIMD, something useful could be done with it. For example, on Linux, MESA will render using the CPU's SIMD capabilities when a GPU isn't available. For those platforms, that makes it a high-level API for SIMD which can be used effectively from JavaScript.

Diego Alonso Cortez

I get the code to compile, and if I do `./mandel.x86 > mandel.png` I see the beautiful picture. How do I visualize the render in real time, though?

Christopher Wellons

You would need to hook up some sort of windowing toolkit and make it continuously redisplay the image buffer as the SIMD threads run. Personally, I'd do this with OpenGL running in its own separate thread. To keep things simple for this article, I only made the program dump out the image buffer when done.

Note that the final output format isn't actually PNG. If that's what you're doing and it still works, your image viewer is simply sniffing the file header to detect format, ignoring the file extension (i.e. what should usually happen).

Diego Alonso Cortez

Great info, thanks! (Yes, a PNG dump works, and I had no idea why!)

Cameron Elliott

Beautiful work! Could you consider adding a LICENSE file to the repo? If you want suggestions, I would recommend the BSD or MIT license.

Christopher Wellons

Whoops, forgot to do that. I just added the UNLICENSE (public domain), which is my favorite. Thanks!

Maxim Chinyakin

Thank you for the explanation, a nice subject to exercise on SIMD intrinsics, showing why old and badly written programs can't use the modern CPU power to the full.
Note that if you use an integer accumulator register (__m256i mk instead of __m256 mk), you don't need to convert it with _mm256_cvtps_epi32 for storing, just scale it and select the bytes which you need to store with _mm256_shuffle_epi8.

Christopher Wellons

That's a good observation about an integer accumulator. I don't remember why I chose to do that as a float. Unfortunately _mm256_shuffle_epi8() is AVX2. I don't have access to any hardware with that instruction set, and I'd prefer to have a separate implementation just for AVX2 anyway.

Peter Cordes

The CPUID instruction is fairly slow, so you don't want to do it for every call. With a really expensive function, it won't make a measurable difference, but maybe you're vectorizing something that does something fairly fast on a 1k buffer. It's better to check CPUID once when your program starts, and keep the result in a global.

gcc / clang can do this for you with `__builtin_cpu_supports("avx")`. See https://stackoverflow.com/q....

Also, instead of branching on a global, you can set function pointers in an initialization function and then call through them later. An indirect branch still needs branch prediction at the runtime-dispatch site, but it takes less code. It also avoids a chain of branches to get to the final selected function, which is good if it's not the first one.

This is also good if you want more complicated logic for figuring out which function to use on which CPU, like x264 (the video encoder) does. (e.g. SSSE3 with slow shuffles vs. SSSE3 with fast shuffles). Or if your functions depend on multiple ISA extensions, like one version that uses SSE4.1 and popcnt. You need to check for both separately.
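A minimal sketch of that pointer-based dispatch with the GCC/Clang builtin (the mandel_* names are hypothetical stand-ins for real implementations):

#include <stdio.h>

static void mandel_avx(void)  { puts("AVX path"); }
static void mandel_sse2(void) { puts("SSE2 path"); }

static void (*mandel)(void);

int main(void)
{
    /* Query CPUID once at startup, then always call through the pointer. */
    if (__builtin_cpu_supports("avx"))
        mandel = mandel_avx;
    else
        mandel = mandel_sse2;
    mandel();
    return 0;
}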

Peter Cordes

When you compile, I'd suggest using `-O3 -march=native` to make a binary that's optimized to run on the host you built it on, and can use every instruction set extension it supports. `-march=haswell` for example sets `-mtune=haswell`, as well as enabling `-mavx -mavx2 -mfma -mbmi2 -mpopcnt` and so on. Enabling FMA lets the compiler fold multiply and add intrinsics into an FMA, which can help a lot for Mandelbrot.

`-mtune=haswell` is also fairly good, especially for AVX, because it disables splitting of unaligned 256b loads/stores into two 128b halves with vextracti128. See https://gcc.gnu.org/bugzill... for more details.

Christopher Wellons

I deliberately limited the architecture support along the main path, avoiding anything like -march=native for the translation unit containing main(). The idea is to build a binary that runs correctly with SIMD across different hosts by dynamically choosing the appropriate implementation at run-time rather than compile-time, like your example of x264 doing just this.

Good points about FMA and the advantages of tuning to a particular architecture (-mtune=haswell). I had sort of assumed -ffast-math would cover something like FMA, but that appears not to be the case. I should probably be using FMA intrinsics manually anyway.

To address your other comment: CPUID _is_ only executed once. It happens at the very beginning of the program when selecting an implementation. As noted in the article, __builtin_cpu_supports() wasn't yet supported by Clang two years ago, and even then it only arrived in Debian stable (my OS of choice) a couple of weeks ago. I chose an alternative that worked on both without changes.


Shamus Young's Twenty-Sided Tale

possiblywrong

> Unfortunately it’s not so much a technical problem as it is a social/educational one.

This is spot-on. A recent several weeks spent "group-editing" a technical document... using tracked changes in Word :), felt comically tedious in places where it didn't have to be. I think the problem is really two-fold: there isn't even widespread comfort with a more reasonable text document format (e.g., LaTeX), let alone use of source control to more efficiently "share" editing effort.

Samrat Man Singh

How do you find the experience reading CS/programming-related books? I already have a Kindle but I'm considering buying a tablet for reading PDFs.

Also, are you still using the Note you mentioned in your other post?

Christopher Wellons

Programming books and papers are mostly what I've been reading the last few years. I initially got a cheap Nook Tablet in 2012, which has a smaller screen and resolution than my Samsung Note. The Nook is just garbage, honestly. I totally regret getting it. (Worse, it's one of the unrootable varieties!) Even ignoring that it runs an awful not-quite-Android OS, the screen is simply too small for comfortably reading PDFs and was preventing me from reading many programming books.
With the Note from 2013 onward, I had a screen big enough to comfortably read PDFs, making a lot more books available. The aspect ratio (landscape) is also good for flipping through PDF presentations. At this point I prefer PDF programming books to paper books because: 1) the backlit screen means I can read in the dark; 2) they're not heavy and clunky like a real programming book; 3) they're searchable; 4) being comfortable to read on this screen, there's no real disadvantage; 5) quick, easy Internet access to look things up as I read (though this could be a distracting downside!).

Ahmed Fasih

Reading it now! Off topic, but my friend plays Crusader Kings II and tells me about the crazy events that happen when he's playing (setting up an anti-pope who usurps the Vatican, forcing the actual pope to flee to a Mediterranean island which becomes his much-reduced kingdom) and I always, always beg him to try and turn his games into stories—like this guy did: http://www.peteradkison.com... looks like Twenty-Sided Tale is along the same lines.

I've used Pollen (from the Racket ecosystem) recently, after tiring of writing custom Pandoc writers in Lua. I wrote a tutorial on Pollen: https://github.com/fasiha/p... but I think there's a lot of dimensions the state of the art in authoring hasn't explored.

Christopher Wellons

Ahmed, you and I are on the same wavelength! This is the core reason I consume YouTube in lieu of traditional television programs. I enjoy watching (and in some cases participating via comments as) people's game stories unfold, drawing on their insight and interpretation of the in-game events.

To share a similar story, just a few months ago I watched a long playthrough of Medieval II Total War by Damo2986. (I watch most of YouTube at 150% speed in VLC, so it doesn't take quite as long as it might look!)

https://www.youtube.com/pla...

Total War doesn't have quite the same depth as a grand strategy game like Crusader Kings II, but, despite this, in this game he had very similar interactions with the Vatican in an effort to control the Pope (to turn a blind eye to his conquests of Christian lands).

As far as computer games go, probably the best story engine of all is Dwarf Fortress. For the player's personal sandbox, it generates an entire Tolkienesque world with a history full of kingdoms, heroes, villains, and ancient beasts, simulated at the resolution of individual teeth, fingers, and claws. It becomes a rich, unique context for the player's own story. If you want something to read, I highly recommend you look up Boatmurdered. Or, for something shorter, Bronzemurdered. For video, Damo2986 also has a collaborative playlist on his channel where the group takes their time to dive into the history of the world and their dwarves.

Thanks for the tip about Pollen! I'll get up to speed using your tutorial.

essay editing service uk

It's wonderful that the power of technology lets people read this kind of thing; bringing all of that information into one handy place can help a lot of people.


Recovering Live Data with GDB

australianwritings.com reviews

A guide like this is helpful for someone who needs an effective recovery when they encounter a data loss problem and this kind of approach is the main solution.


Web Tips For Webcomic Authors

Ahmed Fasih

I hope an author or five take you up on this! To the webcomic authors reading this: Chris’ offer here isn’t like your little cousin Vinny’s offer to set you up with a Wordpress site or anything like that. Chris is a very skilled software engineer who loves to mix his profession with his passions and will very likely make you something beyond your dreams.

Christopher Wellons

Aww, shucks ...

Yu0

I was wondering: Do you also follow some webcomics irregularly (e.g. catching up only every few months), and on mobile devices?

I don't know the exact number of webcomics I follow, but between the likes of XKCD, SMBC, Dilbert... I have started following too many webcomics to keep up with reading the RSS feeds regularly. Because I follow content both on PC and on mobile devices, I am using feedly - but feedly and other hosted alternatives keep the "unread" status only for a limited time (a necessity to avoid database clutter) and hence are unsuitable for following a given webcomic on-and-off. This especially applies to story-heavy webcomics that suffer from viewing pages one at a time.

The best alternative I know of is comic-rocket, but on iOS it doesn't work too well: The dedicated app is broken and the webapp isn't mobile-friendly. Also many webcomics use redirect scripts to prevent embedding as a frame, which breaks the webapp. This last part is probably the only part directly related to

Christopher Wellons

Depends on the webcomic. If it's short, standalone content and doesn't have long story arcs, I'll subscribe right away via Atom/RSS. I'm picky, so I don't subscribe to enough comics that I have trouble keeping up. I think part of what helps is that I use my own web feed reader (Elfeed, in Emacs), which allows me to manage feeds very efficiently. However, I don't have a good way to access this via mobile, so I don't try. I read all my feed content on my desktop.

If I find a good comic that requires I catch up before subscribing (example: OOTS), I don't subscribe right away. I'll scrape the entire comic to date (wget, curl, etc.), turn it into a PDF/ebook, and read it on my tablet in the evenings as if it were a regular book. Once I'm caught up, then I subscribe.


Counting Processor Cores in Emacs

(no comments)

RSA Signatures in Emacs Lisp

possiblywrong

Interesting post. The calc-next-prime function doesn't really use Fermat's primality test, does it? Exercise 10 in the linked documentation even says, "If n is not a prime number, [x^(n-1)%n==1] will not be true for most values of x," which is unfortunately not true (there are some composites for which the equation holds for *all* x). However, the *solution* to that same exercise mentions the use of a "variant" of Fermat's test, which hopefully plugs this hole?

Christopher Wellons

Up until I wrote this article I assumed calc used Miller-Rabin like everyone else. Wanting to avoid making claims based on false assumptions, I dug into the code and found either Fermat's primality test or at least something that resembles it. See here, line 865.
http://repo.or.cz/emacs.git...
Unfortunately that function is a real mess and I'm not interested enough to spend time plucking out the exact algorithm. Based on what you found in the documentation, I'm guessing the author must have changed something about it to turn it into a "variant."

possiblywrong

Thanks for the link. This is essentially Miller-Rabin, with some extra special cases thrown in.

Yu0

You mentioned hashing files. Did you by any chance think about a hashing solution for files larger than the 512MB limit in 32bit Emacs or for files that are too big to be loaded into a buffer due to memory constraints?

Christopher Wellons

That's a good point. The interface to Emacs' secure-hash doesn't permit hashing anything larger than the Emacs limit because it doesn't expose a running hash state. The secure-hash interface should have something like this in addition:

(let ((context (secure-hash-create 'sha384)))
  (while (...)
    (secure-hash-update context more-content))
  (secure-hash-final context))

A similar effect can be achieved by hashing parts individually, then hashing the concatenation of the hashes, but that won't produce a *true* SHA-X hash. That would be important if you wanted signature verification by an independent implementation without needing to define a custom hash function.

Alternatively, when it's a large file you could call out to a command line program if it's available.

(defun sha384-file (file)
  (with-temp-buffer
    (call-process "sha384sum" file t nil)
    (buffer-string)))

A third alternative would be to implement the hash function in Elisp itself, but that would be *incredibly* slow -- just like this pure Elisp RSA stuff.

Yu0

Actually the problem doesn't even start only when calculating a hash. In 32bit emacs, any bytes of a file beyond the 512 MB limit are entirely inaccessible to elisp functions, so apparently even just for reading those bytes an external program like dd would be needed.

Which is a shame, since these are not available on Windows (also: Possibly slow).

(On a curious side note: Apparently calculating a file's sha512sum from a stream in python3 is faster than using the sha512sum shell command).


Quickly Access x86 Documentation in Emacs

(no comments)

9 Elfeed Features You Might Not Know

Omar

"Previously it faked a header by writing to the first line of the header." should probably be "Previously it faked a header by writing to the first line of the buffer."

Christopher Wellons

Fixed, thanks!

Fuco

Love the alist for faces. And it even composes! (unread + a tag face makes that tag face bold, which is very neat)

Daniel C Berman

Based on the inspiration of Ryan Rix and his "Body Computing System" as well as Howardism with "Emacs is my New Window Manager," I have been working on creating an integrated work environment inside emacs. Elfeed is intriguing me, but I am trying to figure out how to follow twitter posts and RSS feeds inside the same Emacs interface similar to what Inoreader does. (http://blog.inoreader.com/2... Is this something you have looked at or considered?

Christopher Wellons

Twitter used to host RSS feeds for all its content, so it would be trivial to follow anything on Twitter from a standard RSS reader. However, a few years ago they declared RSS an obsolete technology and completely dropped support in favor of their proprietary API.

One way to deal with this would be to add a new fetcher that goes through the proprietary Twitter API (requires an API key, etc). Since I don't use Twitter, this is unlikely to ever become part of Elfeed proper. But perhaps twittering-mode could be plugged into Elfeed as an extension.

A much simpler solution is to use a service that exposes Twitter feeds as RSS: https://twitrss.me/ Unfortunately this particular service doesn't support following hashtags/searches/etc. as your link mentions. That's where a custom Twitter API fetcher would have an advantage.

Daniel C Berman

Thank you for your thoughts!

Kai Arzheimer

Twitter-RSS-parse is a simple but reliable self-hosted solution https://github.com/jdelamat...

Kai Arzheimer

I love to work within Emacs as much as possible and I am intrigued by elfeed, but rely on a self-hosted instance of TT-RSS for all my feed needs (works in any browser, decent Android app, lots of goodies for filtering high-volume feeds). Last summer, you mentioned that syncing with a TT-RSS server could become a feature. https://github.com/skeeto/e...
Is that still on the cards?

Christopher Wellons

It's still in the cards, just hasn't been an itch for me to personally tackle yet.

If Android is important to you, there is an Android interface to Elfeed maintained by Toni Reina:

https://play.google.com/sto...
https://github.com/areina/e...

JohnKitchin

Is there a way to do custom sorting of entries? I am working on a scoring method for new articles again, and I wondered if there was a way to get them to show in the search window in sorted order, e.g. by descending score, or grouped by tags, etc. I saw elfeed-sort-order, but that seems limited to time according to the docstring. Maybe a user-defined sort function that takes two entries and returns non-nil if the first entry should sort before the second one would be suitable.

Christopher Wellons

You got the right impression. Currently you can sort by any property so long as it's time. That's because the database only has an index for time and not for any other property, and entries are naturally gathered up already sorted by time. During live filter editing this is exploited to speed up the displayed results by bailing out even earlier, making it feel more lively. There's not really any way around this without adding more indexes to the database, increasing its size and slowing down insertions (just like adding indexes to a real database).

Since you asked, I just committed b62a2d0 which adds elfeed-search-sort-function. It was pretty simple except that it doesn't interact properly with live filter editing. Since elfeed-sort-order already interacts badly with live filter editing for the same reason, I figured it was alright to add this, too. The problem is that during live filter editing the early bailout happens on the time-sorted version, not the custom-sorted version, which would be a lot slower.

While I was writing this, I just took a moment to estimate the time it would take to build a temporary index on the fly for live filter editing. Looks like the typical case would be around 500ms to 1000ms. Maybe this live editing startup cost is acceptable for correct results under non-default settings.

JohnKitchin

It works great, Thanks!


Small, Freestanding Windows Executables

lobster

"freestanding"

Ray Long

This is interesting. It is hard to imagine modern Windows development without .NET, or for the more hardcore, MFC. I'm going to re-read this a few times. Also, you mentioned that the LGPL licensing of a C library was "unfortunate." Personally, I'm a fan of the MIT License; both as a producer of code, and a consumer of code in a commercial setting with proprietary products under development. What is your preferred open source license, and why?

Christopher Wellons

The LGPL isn't bad when you're dynamically linking against it, but it creates a lot of friction under static linking. You either have to distribute source code or your unlinked object files, either of which can be impractical in many circumstances. The MIT license, used by musl libc, is far friendlier for a libc. A small copyright notice is easy to manage. I've used musl to make clean, standalone Linux binaries without the significant license hurdles.

Personally, I prefer a public domain dedication, both as a producer and a consumer, eliminating all legal burdens on the user. I specifically use the UNLICENSE on my open source projects. If you check the footer below these comments, you'll see this blog is also in the public domain. This isn't to say I don't want credit for my work, but rather I don't wish to enforce it legally. It's only a moral request.

Semih Masat

Looks like i really need to read about licensing and common open source licenses.

SonicJumper

"It is hard to imagine modern Windows development without .NET" - Absolutely not hard to imagine if you are aware of Delphi. Based on Object Pascal, has a visual form designer, can produce self-contained Windows executables in native code... has always been and still is a great way to develop for Windows. Microsoft actually poached one of Delphi's key architects (Anders Hejlsberg) to work on .NET and C#.

Ray Long

Embarcadero hasn't been the greatest steward of Delphi, and in the US, Delphi jobs are not in high demand compared to C#, C++, and Java.

SonicJumper

Even if that were true, it doesn't change the fact that Delphi is still a great non-.NET way for "modern Windows development". And there's also FreePascal...

Ray Long

Perhaps.

Jaime Lopez

Nice article. It is like seeing the world without all those disturbing metaphorical layers. I'm going to try your suggestions.

Christopher Wellons

I'm no fan of giant runtime frameworks, regardless if the domain is C/C++, Java, .NET, JavaScript, etc. It's led to so much awful, bloated software.

thebosz

Note that the link to phreedom.org trips Firefox's malware blocker.

Embedded Systems

Great article. I would like to let you know that it has been featured in the last embedded systems weekly newsletter http://embedsysweekly.com/e...

Christopher Wellons

Interesting newsletter, thanks. I just subscribed to the RSS feed. And one small correction: my last name ends with an s. :-)


Calling the Native API While Freestanding

Alex Elsayed

Hey, just an FYI - one can, in fact, implement memmove efficiently in standard C - or at least, more efficiently than using a temporary buffer.

Specifically:
- When dst and src are nonoverlapping, memmove is equivalent to memcpy
- When dst's tail overlaps with src's head, memmove is equivalent to "forward" memcpy
- When dst's head overlaps with src's tail, memmove is equivalent to "reverse" memcpy

One important thing to note is that in the latter two cases, dst and src must be pointers into the same object (allocation) according to the C standard - and thus, their pointers are comparable.

Thus, the trick is to distinguish the first case from the other two - if either of them is in effect, we're free to use the relational operators in order to decide which way the overlap goes.

This can be done trivially:

int overlapping(void* a, void* b, size_t len) {
    void* btail = b + (len-1);
    // For every offset in 'a'...
    for (void* cursor = a; cursor != a+len; ++cursor) {
        // ...check it for equality to the ends of 'b'
        if (cursor == b || cursor == btail) {
            return 1;
        }
    }
    return 0;
}

This is still O(n) - we have to iterate over the length of a - but does not require any allocation, and can be optimized by the compiler, which knows the target machine, and thus isn't bound by the C standard's restrictions regarding inter-object pointer comparisons.

Of course, this same function can - with no added complexity - also tell us the direction of the overlap, and thus fully implement memmove without using the relational comparisons at all:

void* memmove(void* dst, void* src, size_t len) {
    void* srctail = src + (len-1);
    // For every offset in 'dst'...
    for (void* cursor = dst; cursor != dst+len; ++cursor) {
        // ...check it for equality to the ends of 'src'
        if (cursor == src) {
            // 'src' begins inside of 'dst', and thus we'd
            // need a forwards memcpy
            return memcpy_forwards(dst, src, len);
        } else if (cursor == srctail) {
            // 'src' ends inside of 'dst' (and thus 'dst' began
            // inside of 'src'), so we do a reverse memcpy
            return memcpy_reverse(dst, src, len);
        }
    }
    // No match, safe to memcpy!
    return memcpy(dst, src, len);
}

Christopher Wellons

Oh, that's clever! In the time since I wrote the article, I had believed the situation was even worse: allocating the temp buffer could fail, and memmove isn't supposed to fail. Your idea technically solves the problem.

One important fix to your example: those pointers should be cast to char * before any pointer arithmetic. Arithmetic on void * isn't defined, though GCC supports it.

I tried your idea with gcc (4.9.2) and clang (3.5). Unfortunately neither compiler optimized the loop into an O(1) comparison. That would be pretty awesome if they did.

Alex Elsayed

I'm kind of curious what CompCert would make of it, to be honest.

Alex Elsayed

Thinking about what inference would be needed in order to implement O(1)-izing this, I think the following would be sufficient:

1.) dst <= cursor < dst+len for all cursor (can be inferred from bounds, and it only being updated via ++)
2.) (memcpy forwards branch) As cursor == src, dst <= src < dst+len (pointers can only be equal in the same object, so in this case this ordering relationship can be inferred from cursor)
3.) (memcpy reverse branch) As cursor == src+len, dst <= src+len < dst+len (same)
4.) (memcpy reverse branch) As dst and src are both pointers to char, they must have the same layout, thus the pairs (dst, dst+len) and (src, src+len) are different by the same amount
5.) (memcpy reverse branch) Therefore, src <= dst < src+len. This is disjoint from (2) _except_ in the case that src == dst, and that can be resolved by the ordering of the 'if's.

However, all of these are almost certainly already implemented - so where's the catch?

The trick is that the compiler, in order to lift these out of the loop, needs to prove the fallthrough case, which is easy to miss because it's never written explicitly - dst <= dst+len < src || src < dst <= dst+len.

At the C level, it cannot do this as it has no proof of same-object (because it didn't pass an equality test). That's why it's never written explicitly, in fact - testing for it would be UB, but testing for something that is its exact inverse is fine, and so we just allow ourselves to fall through.

As a result, the lifting requires the backend inform the optimizer that pointers to the same type form a total ordering, regardless of object.


Hotpatch a C Function on x86

Albert Zeyer

Very nice read. What compiler option is missing in Clang?

Christopher Wellons

Sorry, I should have elaborated. Clang doesn't (yet?) support the ms_hook_prologue function attribute. This has been a problem for Wine, since they rely on it to mimic parts of the Win32 API.

Doug Binks

I hadn't realised GCC supported the ms_hook_prologue function attribute, so glad to read this.

I've added links to your blog on a list of runtime compilation alternatives I keep on my Runtime Compiled C++ wiki:

https://github.com/RuntimeC...

Let me know if I need to correct the text there, and thanks for the information here - I'm mulling over a xplatf hotpatcher for C now...

Rick Gorton

This is not guaranteed to work - the largest write which is atomic (in terms of granularity) is a 4-byte write. That is, a write of all 5 bytes is not guaranteed to complete in the same cycle. What you have to do is stuff in a jmp to ., perform a cache flush, then do the 4-byte write of the displacement. This is documented in various Intel/AMD microarchitecture documents.

Christopher Wellons

The "Intel 64 and IA-32 Architectures Developer's Manual: Vol 3A" section 8.2.3.1 states that aligned 8-byte memory accesses are atomic:
"The Intel-64 memory ordering model guarantees that [...] the constituent memory operation appears to execute as a single memory access: [...] Instructions that read or write a quadword (8 bytes) whose address is aligned on an 8 byte boundary."

If this wasn't true, there would be a whole lot of broken software out there that assumes aligned pointer writes are atomic.
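For concreteness, the patch itself might look roughly like this sketch (it assumes the target was built with ms_hook_prologue so its first 8 bytes are aligned and replaceable, that the replacement is within a 32-bit displacement, and it sidesteps W^X concerns by leaving the page executable):

#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

void hotpatch(void *target, void *replacement)
{
    /* Make the page containing the prologue writable. */
    uintptr_t page = (uintptr_t)target & ~(uintptr_t)(getpagesize() - 1);
    mprotect((void *)page, getpagesize(), PROT_READ | PROT_WRITE | PROT_EXEC);

    /* Assemble "jmp rel32" plus nop padding into one 8-byte word. */
    uint8_t jmp[8] = {0xe9, 0, 0, 0, 0, 0x90, 0x90, 0x90};
    int32_t rel = (int32_t)((uintptr_t)replacement - (uintptr_t)target - 5);
    memcpy(jmp + 1, &rel, sizeof(rel));
    uint64_t code;
    memcpy(&code, jmp, sizeof(code));

    /* One aligned 8-byte store: atomic per the quote above. */
    __atomic_store_n((uint64_t *)target, code, __ATOMIC_SEQ_CST);
}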

Rick Gorton

I was thinking re: x86 (32-bit), not x86_64 - thanks for the correction.

Fanael

You forgot __attribute__((noclone)), otherwise GCC could make a clone of the function for optimization purposes.

Christopher Wellons

Yup, you're right. I noticed that this morning as I was experimenting with it more. I just added a paragraph for it. Even more, GCC needs to be convinced there's a side effect so that it doesn't optimize away future function calls.

chamibuddhika

This works since x86 guarantees atomicity for aligned reads and writes w.r.t. instruction fetch. The LOCK prefix can be used for unaligned reads/writes to ensure atomicity of data reads and writes. But the Intel SDM says (in 8.1.2, Volume 3A) "Locked instructions should not be used to ensure that data written can be fetched as instructions", which suggests instruction fetch is not atomic even with LOCK'd instructions, especially when the accesses are unaligned. The Intel SDM (8.1.3, Volume 3A) suggests a cross-modification protocol which requires global synchronization to ensure correct operation in such scenarios. Anyhow, we recently and unwittingly hit this limitation in our work, which led to some exploratory work on finding a way to relax the global synchronization requirement. Our work can be found at http://conf.researchr.org/e... if you are interested.

Christopher Wellons

Thanks, that's a very interesting paper! I'm learning a lot from it. I added it as a "further reading" link.

chamibuddhika

Thanks. Glad you found it useful.

Fanael

Another way of achieving thread-safety would be mmaping (RW) a completely new page, copying and patching the code there, and only then mprotecting it to RX and mremaping into the old place. Magic 8-byte alignment becomes unnecessary, is perfectly W^X safe, and it's trivially portable to architectures not supporting atomic writes of whatever is their instruction size.

Christopher Wellons

I was thinking more about this and I'm not convinced it's entirely reliable. Sure, you can "write" an entire page atomically, but what about instructions that span cache lines? The write is atomic, but the instruction fetch wouldn't necessarily be. Also, what if the target instruction spans pages? Can I rely on the OS to remap my entire request atomically or might other threads see one page remapped, then another?
However, I do believe your approach *does* cleverly solve the problem of W^X safety.


Mapping Multiple Memory Views in User Space

Konstantin Khlebnikov

I know one use for that: a ring buffer. You just mmap the buffer twice, so a message at the end overlaps into the beginning, but you can read the tail from the next mapping of the same buffer.

BTW, for CPUs with VIVT caches the ring buffer must have a particular size to avoid problems with aliasing.
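A sketch of that double-mapping trick on Linux (memfd_create needs a reasonably recent glibc; size must be a multiple of the page size; error handling omitted):

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

void *ring_create(size_t size)
{
    int fd = memfd_create("ring", 0);
    ftruncate(fd, size);
    /* Reserve 2*size of address space, then map the same pages twice. */
    char *p = mmap(0, size * 2, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mmap(p, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    mmap(p + size, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    return p;
}

A write that runs past p + size is visible starting at p, so reads of a wrapped message never need to be split in two.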

Nicholas Frechette

You have to be careful with things like this since it can have implications for the TLB and cache. You can read more about it here https://cseweb.ucsd.edu/cla...

Here be dragons :)

Christopher Wellons

Thanks, that's useful information!


Makefile Assignments are Turing Complete

awreece

Cool! I played around with this a while back -- if you (ab)use the `patsubst` function, you can get subtraction: http://codearcana.com/posts... This is a more direct approach than the pure variable substitution.

Christopher Wellons

Interesting, thanks for the link! You were a whole lot more formal about proving it.

possiblywrong

Very cool! I would like to share this with some students -- as another layer of bootstrapping, I'm wondering what you used to create the makefile (since I'm guessing you auto-generated most of it)?

Christopher Wellons

Yeah, I definitely wasn't writing that all by hand! I wrote a few Emacs Lisp functions in a scratch buffer to generate each chunk of the Makefile. But since 1) it was simple, 2) it was ugly, due to feeling my way around the problem as I went, and 3) I massaged the final result by hand, I didn't bother saving it. It should be easy to reproduce just from observing the patterns in my life.mak. Writing a program to generate a Makefile with arbitrary-sized grid and with a specific starting state would be a good exercise.

buskanaka

One of the original mal implementations (Clojure-inspired Lisp) was in GNU Make macro language: https://github.com/kanaka/m... One of the most challenging parts was implementing arbitrary precision numeric operations using string/boolean operations: https://github.com/kanaka/m...

Christopher Wellons

That's both impressive and horrifying at the same time! Very interesting. Thanks for the link.

Tom Ritchford

That is both brilliant and silly. :-)

Marcus2012

How can we handle a whole class of files in one rule?

Like, I want to compile all the .c files. Do I really need to list them all and then create a rule for them all, or can I just say "here's the source folder; all the *.c files inside it should be compiled to *.o in a different folder"?

Christopher Wellons

Unfortunately there's no way around this. One of the weaker points of make is that, for better or worse, it really is just a dumb tool. It's up to you to explicitly build the entire dependency tree for it. That's why I think it's appropriate for larger projects to use some other tool (script, CMake, etc.) to generate most or all of the Makefile. For a couple of my own projects (not open source ones, sadly) I've hand-written Bourne "configure" scripts to generate the Makefile from a template (via -MM/-MT).

But being a dumb tool means you can use it to build _anything_: C programs, LaTeX documents, ebooks, VM images, etc. It's just driving the dependency tree that it was given.


You Can't Always Hash Pointers in C

Kat Marsen

The reason for the insanity is because the "flat address model" we all take for granted didn't exist in the first place. For example, on the 8088, which x86 is based on, the instruction set allows for addressing by segment:offset, where each partially overlapped. "near" pointers only use the offset, but can not address all available memory. "far" pointers use the segment to reach further, but segment:offset cannot fit into a single machine word (an int). Casting a far pointer to an int would necessarily lose information.

Alan R Wolfe

That issue was still present on the 486 in 16 bit mode. That's when I "joined the fray" of programming, so remember it well (:

x3ns

Interesting stuff. You have a typo in your example function.

example(void *>>prt_a<<, void *ptr_b)

Christopher Wellons

Oops, thanks! Just pushed the correction.
https://github.com/skeeto/s...


Four Ways to Compile C for Windows

Mārtiņš Možeiko

If you compile your source with __USE_MINGW_ANSI_STDIO defined to 1, then mingw-w64 will support at least the C99 stdio library features (like the z modifier).
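For anyone trying this, the define has to appear before any standard headers are included (or be passed on the command line as -D__USE_MINGW_ANSI_STDIO=1); a quick test:

#define __USE_MINGW_ANSI_STDIO 1
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    size_t n = 42;
    printf("%zu\n", n);  /* the C99 "z" modifier now works under mingw-w64 */
    return 0;
}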

Christopher Wellons

Whaaa ... I've never heard of this before! It's entirely undocumented as far as I can tell --- it's not mentioned anywhere on either website, nor in any man page, nor is it explained in the headers --- but it certainly fixes all the printf woes I'm throwing at it. Do you know why this isn't the default when using -std=c99? This is incredibly useful information, thanks!

Mārtiņš Možeiko

Yeah, it's a very obscure feature. No idea why it's not default.
These wiki pages looks like official documentation on this:
https://sourceforge.net/p/m...
https://sourceforge.net/p/m...

Miguel Lechón

Thanks for taking the time to write this down.

FRex

I found out about Pelles C via this post and used it for a few toy C programs, and I'm not very happy with it.

I ran into a few errors related to warnings being wrong:
https://forum.pellesc.de/in...
https://forum.pellesc.de/in...
https://forum.pellesc.de/in...
https://forum.pellesc.de/in...

and into one that outright breaks code of stb_image on -O2:
https://forum.pellesc.de/in...

Christopher Wellons

Incorrect diagnostic messages is something I can live with, though the ones you found in Pelles C do suggest there might be more fundamental underlying issues. Every version of GCC I've ever used has edge cases where it produces false warnings or otherwise incorrect warning messages. Fortunately for GCC it really is restricted to rare, edge cases.

However, your last link showing Pelles C getting basic integer arithmetic wrong is scary and makes me very wary of trusting it. I was able to reproduce that bug myself and confirm it. Thanks for pointing that out! I've never caught anything this serious when running my test suites on binaries built by Pelles C.

FRex

It was hard and annoying to track since it's such a narrow case, requiring all of: -O2, a result that has a top bit of 1, the same type for both operands and the result, uchar or ushort, etc. Maybe you can note that in the post itself too, to warn someone who reads it but doesn't look at the comments not to rely on it, or at least to avoid or be careful with -O2 and these implicit integral promotions (2 or 3 of the 5 things I reported to them are related to these).

Funnily enough, if you inspect the asm it produces (pass -O2 and one of -Tx64-asm or -Tx86-asm to their cc.exe) it breaks consistently with itself. If the input values to (a + b) / 2 are known at compile time it's a single mov of a precomputed value (with the top bit already zeroed); if not, you can see an 'and al, 127' or similar in the asm after the operation, but the value is the same (top bit incorrectly zeroed out) in both cases.
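
A hypothetical minimal reproduction in the spirit of that description (not the exact reported test case) would look something like this; a conforming compiler promotes both operands to int and prints 200:

#include <stdio.h>

int main(void)
{
    /* Both operands and the result are unsigned char, and the correct
       result (200) has its top bit set, the pattern described above. */
    unsigned char a = 150, b = 250;
    unsigned char avg = (unsigned char)((a + b) / 2);
    printf("%d\n", avg);  /* expected: 200 */
    return 0;
}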

For me it came up in this small program: https://github.com/FRex/topng

Long story short, it was the average filter decoding (STBI__F_avg) that did such an operation. I was testing whether running topng a few times on the same image over and over produces the same result, as it should, but for some images it would instead progressively corrupt them, or corrupt them once, or corrupt them if I ran optipng on them first, etc.

I meant topng to be safe, robust, easy to use, run as a batch via some find + xargs + topng combo on many images, etc., so I quickly dug into the corruption once I saw it (thinking at first it must be some mistake I made or some edge case of STB) so as not to cause someone to lose their data by using it, especially since they might not notice if it corrupts one image in a batch of many, or corrupts only very slightly at a corner or edge.

I now also put a warning about this -O2 bug in the README of each of my small C utils, so that someone who sees my "here's a 32-bit Windows exe made with Pelles C" releases and looks up Pelles C themselves won't destroy their data by running my C program after compiling it with -O2 too.

I try not to be too harsh on gratis software made by a single person, and I don't run around saying it's bad like some nitpicker, but it's sadly not FOSS (that'd probably help find such bugs). And yesterday I ran into yet another warning bug, posted it, and got a reply from a moderator (which they updated after my next post) that sounded like they assumed I'm a C newb (despite all 5 of my threads on there being bug reports) and like they didn't even read the title and the text, just skimmed the code. That made me decide not to worry about sounding like an asshole spreading FUD and to warn you too.

FRex

For some reason a longer reply I left is gone (?). It was basically the story of the program in which I found the -O2 bug, and that if you look at the asm output (-Tx64-asm) you can see it being broken the same way whether it's precomputed at compile time (a single mov of a value with the top bit already zeroed) or calculated at runtime (it does an and of the result register with 127 to zero the top bit out after the operation), so it's consistent with itself in that way.

Christopher Wellons

Sorry, the Disqus spam filter got it. It should be visible now. Your response arrived to me via regular email just fine, so I read it without realizing there was an issue here on Disqus.

FRex

No problem, glad you "enjoyed" the -O2 uchar/ushort bug.

Graham Toal

tcc and lcc have been good to me in the past for quick to install and easy to use systems that don't drag you kicking and screaming into the Windows world like the Microsoft compiler does.


Elfeed, cURL, and You

Fuco

No longer will elfeed clog up and hang my emacs? I love you for doing this!

Christopher Wellons

If Elfeed normally clogs Emacs for you, then think of cURL as a cable auger.

Byron Schlemmer

As an avid user of elfeed, on Windows, *many* *many* thanks for the work and succinct post on the topic.

Erik Sjöstrand

I've been using Elfeed on Windows for quite a while, but I don't think it has crashed on me. Will try it out with curl though!

Seth Mason

Nice. This fixed a few of the mysterious problems that `url-retrieve` had with getting some of my feeds.

Thanks!


Stealing Session Cookies with Tcpdump

possiblywrong

Very interesting post-- maybe not entirely unrelated, I have noticed that Road Runner seems to be particularly aggressive *in the store* about asking for a lot of unnecessary personal information.

Do I understand your last section correctly-- setting cookies to both Secure and HttpOnly, would the online cart "fail" anytime you tried to search, since it would redirect to an HTTP page and thus show a logged-out view?

Christopher Wellons

The shopping cart state would be preserved and untouched once you got back to an HTTPS version of a page. If done correctly, you wouldn't be able to put items in your cart from the HTTP page. If you *could* still put items in the cart, then the page would also be vulnerable to cross-site request forgery (CSRF). If the logged-out HTTP site can supply a link capable of manipulating the cart, then so could an attacker from any other page, an e-mail, instant message, etc. (This may even be the case here, but I haven't checked.)

Visiting the HTTP version wouldn't log you out. That is, it would neither invalidate the cookie on the server, nor delete the cookie locally. That's the value of explicitly logging out of a service via a "log out" link/button rather than allowing the session to expire. Note: session expiration — the server forgetting old sessions — is different from cookie expiration, which locally destroys the cookie but leaves the session open on the server. An explicit log out (again, when done properly) tells the server to forget about the session: the session cookie is no longer useful to an attacker. This also generally deletes the cookie locally, but it's not strictly required since the server wouldn't recognize it and would ignore it. The cookie I used in my example was a real session cookie, but has since been invalidated.

The browser wouldn't send the Secure cookie for the HTTP page, so the server wouldn't know who you were, due to HTTP being stateless. You'd see the logged-out version but it wouldn't affect the session cookie. However, there is one caveat to all this: the server might respond to the lack of a session cookie by creating a new "logged-out session" overriding the existing, previous session (i.e. a cookie's identity tuple is [name, domain, path] and doesn't include its flags or protocol). Road Runner doesn't do this, so the Secure cookie session would survive the search page.


Const and Optimization in C

MasonBially

However, the most correct way to write it, in my opinion and in the coding standards of the projects I write, is:

int const*

This is because it is locally deterministic and reads correctly right to left.

Consider:

int const**
const int const**
int const* const*

The second is invalid in my coding style because the leftmost const is
modifying the type to its right, instead of its left, causing the old
const to change what it modifies. The third represents the correct way
to cause the same type change.

Also consider their spoken/written versions as read right to left.

> pointer to a pointer to a constant integer.
> pointer to a pointer, which is constant, to an integer, which is constant.
> pointer to a constant pointer to a constant integer.

Totozero

I do not understand what you mean after "Curiously, the specification allows...". It looks like there are some mismatches in your function names and some missing prototypes.

Christopher Wellons

Whoops, you're right! I got "foo" and "bar" swapped in all the assembly listings, as well as a few other places. These should now all be corrected.

Travis Downs

Good post. It could perhaps use a summary of the key finding: "In general, const *declarations* can't help the optimizer, but const *definitions* can."

In particular, const definitions are the only case where the optimizer really knows the value can't be changed. In addition to optimizing the code as your example points out, it also allows the compiler to put any globals (and function statics and class statics in C++) so declared in the .rodata section (as your last listing shows), which can be a big deal for big static arrays (they can be shared among all processes using the same binary) and prevents inadvertent modification at the hardware level.
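
A small sketch of the distinction (hypothetical names, not from the article): a const definition the compiler can see lets it place the storage in .rodata and fold loads, whereas a const-qualified pointer parameter only restricts access through that pointer and promises nothing about the object itself.

static const int table[4] = {2, 3, 5, 7};   /* definition: storage goes in .rodata */

int lookup(int i)
{
    return table[i & 3];        /* the value is known never to change */
}

int sum(const int *p, int n)    /* declaration: *p may still change elsewhere */
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += p[i];
    return s;
}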

Christopher Wellons

Very succinct, nice! I've quoted you at the end of the article.

jdefr89

const in relation to pointers acts as a “read only” marker: const int *ptr = &Val; tells the compiler that you will not mutate what is in Val through ptr.

Also I just found this website, I really enjoy your work. Thanks!


Appending to a File from Multiple Processes

Alex

There is no such thing as NFSv5; do you mean NFSv4.x?

Christopher Wellons

You're right, I must have misread something in my research. Maybe it was anticipated to be fixed in some future version of NFS. I've removed the parenthetical from the article.

nwf

I don't understand how you are interpreting the POSIX standard to allow two O_APPEND write(2) calls to interleave their bytes in the file. It says very clearly that *no file modifications* take place between the implied seek and write, which should exclude any other (partial) write, which would be a file modification.

Interleaving appears to be permitted by POSIX (IEEE Std 1003.1, 2013 Edition) only on FIFOs, in fact. What a mistake that is, though I guess I understand the implementation that gave rise to someone arguing for its inclusion in the standard. Boo. It'd be so easy to... not do that.

Now, that said, POSIX still isn't strong enough, there is plenty of room for write(2) to not write its entire buffer in a single atomic pass (e.g. signals), but AFAICT in those cases it will write a prefix and then return to userland indicating a short write, not interleave with other writes. That's still pretty bad from an application perspective, but it's not quite arbitrary interleaving.

Christopher Wellons

I interpret the lack of information on atomic or interleaving writes to files as being very open-ended (just as POSIX is with filesystems in general). It makes no statement about PIPE_BUF for anything other than pipes and FIFOs, so there's no reason to believe that value applies to files. It's also impractical to allow an application to perform an arbitrarily large, atomic append. If that's what was *really* intended by the standard, then that part of the standard doesn't matter because no real system works that way. They all interleave appends greater than some arbitrary threshold, often PIPE_BUF but not always, likely varying by filesystem.

For "no intervening," I read that as being entirely about the file offset. That is, the write may be broken arbitrarily into multiple writes (just like a write greater than PIPE_BUF to a pipe), but each will always append without overwriting. Again, that's what real systems do even with their most well-behaved filesystems.

Matrix AI

How do you deal with this if your CSV has rows longer than what a system allows as an atomic write/append? Explicit application synchronization?

Christopher Wellons

If your lines may be longer than what's allowed by writev(2), then, yes, you'll need explicit synchronization (e.g. database, etc.). That limit is 4MB on Linux, which would make for some _very_ long CSV lines, and, in that case, you should probably be using a database anyway.


Automatic Deletion of Incomplete Output Files

Bubihybi

There is also ReOpenFile on Windows. With it, it should be possible to clear FILE_FLAG_DELETE_ON_CLOSE.

Christopher Wellons

Interesting, I didn't know about ReOpenFile(). Apparently it's so rarely used by Windows applications that the Wine project has never bothered implementing it, and essentially nobody has noticed its omission. I tried to test this in Wine first, and it crashes in the ReOpenFile() stub with an "unimplemented" error.

This function does work in Windows 7 (and later) once I choose an appropriate sharing mode. Unfortunately it creates a new file handle. The old handle still exists and retains its FILE_FLAG_DELETE_ON_CLOSE flag, meaning the file is still doomed to be destroyed.


An Elfeed Database Analysis

(no comments)

Two Years as a High School Mentor

Brannock Device

Fantastic! How did you pick your student?

Christopher Wellons

Internally there's a big list of students who have applied and have not been assigned a mentor. I narrowed it down to those with an interest in computer science, then picked the only student who had something other than Microsoft Excel for past experience. That extra experience was Python (though still very little), which also fit well with the advice from Paul Graham's "The Python Paradox."

Timothy Dahlin

Thanks for sharing your mentoring experience.

By the way, I find the new look of your site to be cleaner.

Christopher Wellons

Thanks! I think I went overboard on my previous design (5 years ago). Check out the printed version, too. I fixed that one up a year ago.

possiblywrong

Great post. I really like the decision to start with C; even with much younger students, I have been very dismissive in the past of Scratch/Alice/etc., for similar reasons. But having experimented with both C and Java as a teaching language, I *have* found myself migrating toward Python as that "middle ground" between easy learning and production-ready. I'm only now getting old enough to wonder how dangerous this approach is? That is, like your mentee, I (and I'm sure you as well) started learning closer to the bare metal, where the correspondence between the lines of code and the complexity analysis is more one-to-one, so to speak. It's harder, I think, to squint at Python or similarly higher-level languages, than C or often even C++, and find performance bottlenecks. By not *starting* with that intuition for what's happening under the hood of all those easy-to-use data structures, is it harder to pick up later?

Christopher Wellons

Actually, I'm a counter-example to your guess. My elementary school's custodian (yes, I remember you, Mr. Jason, if you ever come across this), who had been studying programming in his free time, initially tutored me in QBasic, and on my own I soon moved on to Visual Basic 5 (bought at a computer expo from my allowance). I learned everything I knew about programming from reading the entire manual and experimenting with new things. That was the entirety of my programming experience up through my teens. It wasn't until college that I started learning anything else, including Linux, C, and C++.

My ad-hoc education in VB5 left me with a twisted impression of OOP, but otherwise I didn't have to unlearn too much. If I had a good mentor to guide me in my teens, I wonder how much further ahead I could be. (Part of why I write this blog is so I can track my improvements over time.) So I don't think starting high makes it harder to pick up low-level intuition later. However, I don't think it's possible to honestly *master* Python or other higher-level languages and frameworks without that low-level intuition. For the craftsman, they're productivity tools, not replacements for the pesky details.

There's one thing I've realized: My situation was easy. I only had one student to worry about, and we could always focus precisely on the next concept for him to grasp. If I was running a classroom of students all at different levels, Python would probably help a lot to level the playing field.

Joshua Katz

Is your student's next task to write MIKE?

I'm going through the book right now, one of the best!

I'd also like to say that I started out in CS like your student: taken under the wing of a retired professional. Now that I'm in college I'm so far ahead of the other students in my classes it's insane. Most of them can't write simple bash scripts, complete the labs, use data structures, or even write their own algorithms. These are things I was taught how to do at 12-14; I've ended up becoming a de facto tutor in my classes.

I wish we could teach CS like this rather than via a college environment. I always say CS is a trade, not a science in the traditional sense, because it is in reality more an understanding of how to use our surroundings and their limitations to solve problems. The only way I see to do that is via experimentation. Just like you, your student, and I have done.

Paul

Except computer science != programming. The author even says as much by saying that a lot of vocational experience (programming) was gained, but not so much computer science.

Yohaï Berreby

Hi. I'm self-taught and started at 11. I'll start college next year. Would you be so kind as to provide some advice to someone who is in a similar position to you a few years ago?

It all started out as one interest among others, but it grew into a passion, and today I know that I will work in software development. I've gained some experience along the way - worked on real-world, revenue-generating projects for relatives, read through a lot of technical books, and learned varying amounts of Python, Ruby, C, C++, JavaScript (client- and server-side), and Rust. As well as some HTML and CSS. I learned how to use the command line, how to administer a UNIX system, the basics of information security (just enough not to make overly stupid mistakes as a developer), among (many) other less important things.

Python was my first 'real' programming experience. I 'know' C, but I wouldn't trust myself to write production software in C without a lot more experience. I can read C code, and I understand most of the low-level details nearly down to register allocation, but writing a sizable amount of _correct_ C is not an easy endeavor and requires a level of care that I am not prepared to put in. I wrote a few production utilities in Ruby and recently started to dabble in Rails. I got interested in game development and read a lot on the matter, so I just _had_ to learn C++. I did for about four months before learning JavaScript seriously. I learned to use JS client-side, and server-side with Node.js, which I used to sell Steam trading bots for peanuts. More recently, I learned Rust, which has been my most pleasant experience so far (static type checking, memory safety, close to the hardware with C-level performance? Sign me in!), and the language in which I've invested the most. I contribute to open-source projects whenever I can, hung out a lot on the #rust IRC channel and on the /r/rust subreddit for some time, and used the language to write a few small production utilities (a few kLOC).

I like to think that this is a great track record for a 16-year-old. However, my perception of my own skill level is what I'd describe as unstable. Sometimes I'll read through some particularly technical HN threads and feel like a pretentious idiot who knows no more than the bare basics. At other times I'll read stories about college graduates and 'professional' programmers with 5 years of experience who don't seem _much_ more skilled than me, and I'll feel like I don't even need to go to college.

So I turn to you. Enlighten me. What new things will college teach me? How should I approach it?

Here's my GitHub profile in case you want to take a look: https://github.com/yberreby/

Thank you for your time.

Christopher Wellons

I'm not the person to whom you directed your comment, but I want to say this: You're on the right track. The key is to practice everything you learn immediately after you learn it. Invent projects for yourself that exercise the new knowledge. More than likely you'll be practicing several things at once. Keep doing this on a daily basis.

College will be great for exposing you to concepts you might not have come across on your own. Use the classroom concepts as a stepping stone to more advanced things they won't cover in class. The most obvious sign of a 100% self-taught programmer without the relevant college education is gaps in knowledge. They've naturally focused on practical matters and are very effective, but they don't have the more theoretical concepts that are sometimes useful. This doesn't apply to everyone, but I've noticed it often enough.

Christopher Wellons

My experience has changed my views on education in general. Like you said, experimentation is essential. Traditional classroom education doesn't practice or encourage this. I suspect the reason is that it's incredibly difficult to scale, and the resources simply aren't there, especially when broadening it to more students with differing ability and interests. It worked well in my mentorship since it was one-on-one with an eager student. I don't think I could do the same working with 10 students simultaneously, let alone 30 or 40.

This all plays even more into treating it like a trade. There are a limited number of these apprenticeship-style positions available due to a limited number of mentors, and it's not something that would or could be available to every student.

vfulco

Inspirational. Well done.


Read and Write Other Process Memory

Konstantin Khlebnikov

Since 3.2, Linux has new syscalls for that: process_vm_readv and process_vm_writev.
They work without stopping another process.
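
A minimal sketch (Linux 3.2+ with glibc, requiring the same permissions as a ptrace attach; the wrapper name is hypothetical):

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/* Read len bytes from another process's address space without stopping it. */
ssize_t peek_memory(pid_t pid, void *remote_addr, void *buf, size_t len)
{
    struct iovec local  = { .iov_base = buf,         .iov_len = len };
    struct iovec remote = { .iov_base = remote_addr, .iov_len = len };
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}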

Christopher Wellons

Thanks! I hadn't come across these. They're way better than mucking around with ptrace. I've added an update to the article.

Terrence Andrew Davis

TempleOS doesn't use paging, basically. It is a single address map, therefore one task has code at one location and another task has code at another. One task has its stack at one location and another task has its stack at another location. You'd be surprised how many people pretend to believe multitasking is impossible without paging.

Christopher Wellons

You've certainly proven that it's possible, but I think anyone who weathered DOS's terminate-and-stay-resident days would argue that virtual memory is far easier for mere mortals to use and manage.

FerminChe YT

Hi, what program did you use to create the Trap Engine?

Christopher Wellons

I don't know what the Trap Engine is, and an online search isn't providing any answers about it.


Inspecting C's qsort Through Animation

Thus Torler

It's interesting to see the different sort algorithms in action. Thanks for this!

GabeJCsapo

This is awesome, hands down coolest sort animation I have seen so far!

Aleksander Balicki

Some gifs don't work

Christopher Wellons

GitHub Pages, which hosts this blog, has been having hosting issues the past 12 hours or so. That might be the problem. Also, some mobile devices (including mine) choke up when too many GIFs are loaded on the page. GIFs below the fold don't seem to count until they're visible.


Modifying the Middle of a zlib Stream

(no comments)

Render the Mandelbrot Set with jq

(no comments)

Linux System Calls, Error Numbers, and In-Band Signaling

(no comments)

The Vulgarness of Abbreviated Function Templates

Paul Phoenix

Seriously, why?

You wrote this long blog post without actually putting forward a strong argument against AFT?

Can you please tell me what is *so* wrong with them? Isn't this concept a logical next step for auto? And doesn't it mirror what auto already does in other contexts? Generic lambdas do this already.

So, your reasoning may also be easily applied against the current status of auto. You can easily abuse auto now. Same with AFT and probably half of the features of C++.

It's just that sometimes things are simple enough to stay simple. Why ask the developer to type more code?

Christopher Wellons

My last paragraph sums it up. In C++11 the auto keyword was repurposed for type inference, which was both clever and reasonable. However, the semantics of "auto" in AFT are something fundamentally different from type inference. And, unlike type inference, it is not a zero-cost abstraction. It's nothing more than syntactic sugar for a template (hence the name). Not elaborated in the article: IMHO, templates aren't something to be used lightly, so they're not deserving of such syntactic sugar.

I actually haven't thought about AFT since writing this article, and I haven't seen any more uses of it. Revisiting the situation here in 2018, it seems that, because AFT has been so controversial, should it someday get accepted into C++ it will likely require an explicit template. In my view, this would take the surprising bite out of it, making for a pretty fair compromise. (Unfortunately for me, most of the discussion around Concepts requires a much deeper technical understanding of the C++ specification than I have, so I'm unable to really follow it.)


Small-Size Optimization in C

possiblywrong

You mention the Visual Studio functions _malloca() and _freea(); I have tended to stay away from Visual Studio entirely for C stuff, having the impression that although their C++ implementation has actually gotten pretty good, they seem to have ignored C99, so I haven't bothered playing with it much. Your thoughts?

Christopher Wellons

I discussed this some back in June in my article "Four Ways to Compile C for Windows."

http://nullprogram.com/blog...

You're right about C being a low priority for Visual Studio. Fortunately the situation dramatically improved in Visual Studio 2015, when it got C++11 support. Just as most of C99 got merged into C++11, so did many C99 features finally appear in Visual Studio's C compiler, almost by accident. For example, VLAs didn't make it into C++11, so they're also absent from Visual Studio's C. Many of its new C99 features are still considered extensions by the compiler, and you'll get warnings about using them. Despite the mediocre C support, since it inherits Visual Studio's C++ strengths, the code it generates is basically on par with gcc and clang.

So as of 2015, it's actually in a pretty usable state, at least for builds. Debugging is still an issue. Naturally it doesn't generate GDB-compatible debugging symbols, so you're basically stuck using Visual Studio as a debugger. So, keeping Visual Studio's limitations in mind (something I've been doing more and more), I can develop and debug comfortably on Linux, then pretty easily port to Visual Studio later for (non-MinGW) Windows builds.

Even though it's not the compiler I like to use, I think there's value in studying what Microsoft is doing with their tools. The idea of a function like _malloca() hadn't occurred to me before, and seeing/trying it out got the gears turning. Another example is Control Flow Guard (CFG), which, as far as I know, no one else is doing yet due to the (relatively) burdensome loader requirements. It's a ROP mitigation technique that, at run-time, validates all function pointers just before they're followed. If, say, the stack canary fails to detect a function pointer overwrite (i.e. the function epilogue hasn't been run yet), then this is the next line of defense. It also protects function pointers on the heap.
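
For reference, a sketch in the same spirit (hypothetical names, not from the article or the CRT): keep a small fixed buffer for the common case and fall back to the heap only for large requests, which is roughly the trade-off _malloca()/_freea() make between the stack and the heap.

#include <stdlib.h>

struct scratch {
    char  small[64];
    char *ptr;          /* points into small[] or at a heap block */
};

static void *scratch_get(struct scratch *s, size_t n)
{
    s->ptr = (n <= sizeof(s->small)) ? s->small : malloc(n);
    return s->ptr;
}

static void scratch_release(struct scratch *s)
{
    if (s->ptr && s->ptr != s->small)
        free(s->ptr);
}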

Denis Bakhvalov

Great explanation!
This inspired me to implement it in C++.
https://github.com/dendibak...


An Array of Pointers vs. a Multidimensional Array

Simon

This stuff fascinates me too, such a fantastic write up. This is the second time I have seen an article of yours shared so adding your blog to my feed reader :)

Christopher Wellons

Thanks! What was the first one? I'm going to guess the small-size optimization article.

Simon

haha, quite likely that it was the small-size optimization article, possibly from hacker news or reddit? In any case, I have since read a majority of your articles - all quite interesting.

Mike Zamansky

This is terrific -- would have loved it when I taught systems at Stuy - I'll have to share it with the current teacher.

world_turtle

Isn't it the same with arrays of structs? I suspect that this man got it right: http://www.the-adam.com/ada...
Will we discover in the future that the ultimate programming language is... FORTRAN 77 ??? :-)

Alf P. Steinbach

`int ncolors = sizeof(colors) / sizeof(colors[0])`, where `colors` is a pointer, is a classic error. In this case `colors` appears to be an array. I think that's worthy of being pointed out: don't do this if `colors` can be a pointer.

Denis Bakhvalov

That's interesting.
Such a detailed explanation. Thanks!
1. What if there are multiple instances of those arrays?
I suppose 2D arrays will consume more memory, because with pointer arrays each instance stores just pointers, not copies of all the strings. Am I right?
2. If there are only benefits, can compilers apply such an optimization by always choosing the multidimensional option? Because semantically they are the same. Did you have a chance to feed this code into different compilers?

Christopher Wellons

Yup, your thinking is correct about string sharing between arrays. The standard doesn't require string literals to be distinct objects (6.4.5-6), which is why writing to a string literal is undefined behavior. If multiple tables have common strings, an array of pointers may very well be the better option due to sharing.
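
For concreteness, the two layouts being compared, using a hypothetical color table: the first stores pointers to string literals, which the compiler may share between tables; the second is one contiguous block with fixed-width rows.

static const char *color_ptrs[]    = {"red", "green", "blue"};
static const char  color_rows[][6] = {"red", "green", "blue"};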

Compilers have restrictions related to sequence points and observable behavior/state. If the array is static and the compiler can see all accesses at once, and none of those accesses leak pointers, then I believe the compiler could choose any arbitrary representation. This could make debugging awkward, as optimization always does, but that's outside the scope of the language specification. For example, compilers don't need to allocate storage for variables whose address isn't taken.

I mostly look at GCC and Clang output, and on occasion MSVC (cl.exe). I didn't see anything unexpected from GCC or Clang with this code. Today's C compilers generally don't do anything as crazy as changing array representation to 2D. I think that's a good thing, since the programmer is likely to be operating with information the compiler doesn't have, so it can't properly make that decision. It's sort of like manual transmission getting better fuel efficiency than automatic: The driver operates on more information than the automatic transmission (upcoming curves and hills, intended route, turns, etc.).

However, I do know GCC will do interesting things with some algorithms, like convert non-tail-recursive functions into iterative functions:

http://ridiculousfish.com/b...

Graham Toal

Closely related to this is an old trick from the '60s for saving space in triangular arrays: it was more efficient to base the structure on a 2D array rather than an array of varyingly-sized slices: http://gtoal.com/src/triang... (after writing that recently I found an older reference to the technique in Atlas Autocode from 1965: http://history.dcs.ed.ac.uk... )


Emacs, Dynamic Modules, and Joysticks

VanLaser

Cool - your post just gave me the idea of trying to make (in the future...) a module for head-tracking in Emacs, to use with my TrackIR device! :) Nice work!

Christopher Wellons

Cool idea, I'd like to see that.

'a

Hello,
Could you expand on / give pointers about the closure pointer and how it can be used?

Christopher Wellons

I have an entire article about this topic: http://nullprogram.com/blog...

Summary: I'm using the term "closure" pretty casually in this context. Modules are a C interface, and C doesn't actually have closures. A closure isn't just an anonymous function. It's a function paired with its captured ("closed over") lexical environment. Since C only has function pointers, closures are usually simulated by pairing a function pointer with a void pointer. The void pointer sits in for the lexical environment, allowing the function pointer to behave like a closure, and sets one function pointer apart from another. Another place you can see this technique is in glibc's qsort_r().
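
The pairing described above, sketched as a tiny C fragment (names are hypothetical): the void pointer stands in for the captured environment, and the function pointer receives it on every call.

struct closure {
    void (*fn)(void *env, int event);
    void  *env;     /* the simulated lexical environment */
};

static void closure_call(struct closure c, int event)
{
    c.fn(c.env, event);
}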

'a

I guessed it was to be used as an environment store, but my confusion was about when it gets freed and by whom.

Christopher Wellons

As far as Emacs is concerned, it's just an opaque pointer without any particular meaning, and it's up to the module to decide its semantics. For example, the argument may not actually be a pointer, but instead an integer cast to a pointer, to be cast back to an integer before use, so dereferencing the pointer would cause a crash.


Zero-allocation Trie Traversal

(no comments)

Baking Data with Serialization

possiblywrong

Very interesting. I'm not sure I understand this comment: "The Right Way is to never dereference the data directly but rather memcpy() it into a properly-aligned variable. Byte-order is a non-issue." Can you describe this in more detail? I think the part that confused me is the last part about byte-order not being an issue-- or is the idea that it *is* one of several possible architecture-specific issues covered in bullet (1), but even on the same architecture, there are *compiler*-specific issues covered in (2), namely alignment?

Christopher Wellons

I've gone back and adjusted some of the wording. I threw on the sentence about byte-order much later, so maybe that's why it stuck out.

Byte-order is solved by (1), due to host and target being the same. But (1) doesn't address alignment. Once the data is dumped out, the alignment information is lost. It's now just a buffer. In the main program, this buffer is being treated as an extern variable by the source, but the linker that produced the data's object file (and therefore its storage) doesn't know any of its alignment requirements, and so it may be misaligned in the linked result.

Other than manually setting the alignment information in the data object file before linking, the safe way to access this data is to memcpy() from the global variable, treating it as little more than a buffer, into a variable that is known to be aligned properly for its type.
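
A sketch of the access pattern described ('baked' is a hypothetical symbol name for the linked-in blob): treat it as a plain byte buffer and memcpy each field into a properly-aligned local before use.

#include <stdint.h>
#include <string.h>

extern const unsigned char baked[];

static uint32_t baked_u32(size_t offset)
{
    uint32_t value;                          /* correctly aligned */
    memcpy(&value, baked + offset, sizeof(value));
    return value;
}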

Frank Ruben

happy to have followed the traces from medalled u/skeeto to this blog - this is seriously good stuff. You should be famous ;)


A Magnetized Needle and a Steady Hand

Miguel Lechón

Good bedtime story!

xem

Great read! There's a dead link though: http://www.x86-64.org/docum...

Christopher Wellons

Looks like x86-64.org is temporarily down again, which happens from time to time. That link should be back up eventually. If it's not, I'll replace it with another host. Maybe this one:

https://github.com/hjl-tool...

Thanks for the heads up!

ylluminarious

Nice article! Funny as it sounds, hypothetical scenarios like this drive me to learn stuff sometimes...


Portable Structure Access with Member Offset Constants

Mon_Ouie

Any reason to hardcode constants instead of using the offsetof macro? I believe the latter would be more portable, seeing as the size of some types (including types like uint16_t, on platforms where CHAR_BIT > 8) can be different depending on the target platform. Even if you don't intend to compile to such a platform, at least you can get rid of these magic numbers.

Christopher Wellons

Using offsetof relies on the host's structure alignment rules, which is one of the issues with structure overlaying. The magic numbers in my example would come straight from the formal format specification, making them less arbitrary than they might appear.

You're absolutely right about problems when CHAR_BIT > 8. But that was already a showstopper since such machines couldn't supply the optional uint8_t type (no padding bits allowed), and probably none of the other fixed-width types either. But I do think that's a decent argument in favor of "uint_least8_t" and friends for handling the extracted integers.

Additionally, it would be unclear exactly how an octet-oriented format maps onto an unusual byte width, and that detail would need to be hammered out with more specification. So there's not really any way to plan for that case anyway.
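
A small sketch of the style being defended, for a hypothetical little-endian format with a 16-bit field at octet offset 4: the offsets come straight from the format specification, and each field is assembled byte by byte, so host alignment and byte order never enter the picture.

#include <stdint.h>

#define EXAMPLE_WIDTH_OFFSET 4

static uint16_t example_width(const unsigned char *buf)
{
    return (uint16_t)(buf[EXAMPLE_WIDTH_OFFSET]
                      | (buf[EXAMPLE_WIDTH_OFFSET + 1] << 8));
}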


A Showerthoughts Fortune File

Toby Cubitt

Nice use of elisp!

It's actually more efficient to use a heap instead of an AVL tree for this particular task, i.e. for selecting the top-10,000 entries.

Intuitively, the reason a heap data structure is more efficient is because, whereas an AVL tree effectively fully sorts every element as it's added to the tree, including elements you later discard, a heap doesn't bother fully sorting elements until it has to, so avoids wasted effort sorting elements that later get discarded. That makes heaps more efficient in the asymptotic complexity sense for this application.

A heap will also be more efficient in practice, because it's a far simpler data structure than an AVL tree (or any other self-balancing binary tree, for that matter). On top of that, heaps can easily be implemented internally using arrays, which makes most operations very fast. Growing an array-based heap is slow when the internal array has to be resized (though still log(n) amortised complexity). But in your case you only need a fixed-size heap, so you get all the benefits of the array-based implementation whilst avoiding all the downsides.

(Even heaps don't give the most efficient algorithm - they're beaten by an algorithm that uses a partial quicksort. I'm not certain, but heaps may give the most efficient online algorithm, though. Which matters here because you have so much data. The partial quicksort algorithm would require *all* the data to be stored in memory during the algorithm, which would surely be prohibitive. The heap algorithm should beat it easily in practice here.)

Probably you knew all this already, and just used an AVL tree because there's already an elisp implementation of those bundled with emacs. But there's actually an elisp heap.el package in GNU ELPA (which is indeed array-based). I'd be interested to hear if using that gives you any noticeable speedup...
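
For illustration, the top-K logic sketched in C (the discussion above is about Elisp's heap.el, but the shape is the same; the constant and names are illustrative): a min-heap of the best K scores seen so far, where a new score only ever displaces the current minimum at the root.

#include <stddef.h>

#define K 10000

static double best[K];
static size_t nbest;

static void swap(size_t i, size_t j)
{
    double t = best[i];
    best[i] = best[j];
    best[j] = t;
}

static void consider(double score)
{
    if (nbest < K) {
        /* Heap not full yet: insert at the end and sift up. */
        size_t i = nbest++;
        best[i] = score;
        while (i && best[(i - 1) / 2] > best[i]) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    } else if (score > best[0]) {
        /* Better than the current minimum: replace the root and sift down. */
        size_t i = 0;
        best[0] = score;
        for (;;) {
            size_t l = 2 * i + 1, r = l + 1, m = i;
            if (l < nbest && best[l] < best[m]) m = l;
            if (r < nbest && best[r] < best[m]) m = r;
            if (m == i) break;
            swap(i, m);
            i = m;
        }
    }
}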

Christopher Wellons

You're exactly right about my reasons for picking an AVL tree. Also, since I use it for Elfeed's primary index, it was already familiar.

I figured you were right about the heap, so I gave it a shot using that ELPA heap. I found there was no measurable difference between the AVL tree and the heap. But I also found there was no measurable difference between either of those and using nothing at all, just throwing away the results as they're read. The time is completely dominated by parsing JSON, and limiting the working set to 10,000 objects keeps the object management time so small that it can't be measured.

Radon Rosborough

This is great! It wasn't immediately obvious to me how to get this working on OS X with Zsh, so hopefully this will help someone else:

$ brew install fortune
$ strfile showerthoughts
$ mkdir ~/.fortunes
$ cp showerthoughts* ~/.fortunes

And then in your .zshrc:

if [[ -o login && -z $HAS_FORTUNE_ALREADY_RUN ]]; then
  export HAS_FORTUNE_ALREADY_RUN=true
  fortune ~/.fortunes 2>/dev/null || true
fi

VanLaser

Nice :) And this goes very well with the "fortune-cookie" emacs package (which I use to feed the comments in my scratch buffer)

Rei Hino

Thank you, this is really nice!

Just one thing: if you want to place these fortunes in the offensive set, you need to convert them to ROT13:

rot13 < showerthoughts > showerthoughts-o
strfile -x showerthoughts-o

Then you get showerthoughts-o and showerthoughts-o.dat

Under Debian/Ubuntu/Mint, they are installed to /usr/share/games/fortunes/off/showerthoughts[.dat]
Under Slackware, they are installed to /usr/share/games/fortunes/showerthoughts-o[.dat]

Then the fortunes are available through
All including offensive: fortune -a
Offensive only: fortune -o (Debian/Ubuntu/Mint) or fortune all -o (Slackware).
Just Showerthoughts: fortune off/showerthoughts (Debian/Ubuntu/Mint) or fortune showerthoughts-o (Slackware)

If you're interested in packaging these with a Makefile, etc, I also make custom fortunes, all are CC0 unless noted:

YKYWTMSMW (Sailor Moon joke book, custom noderivatives statement): https://github.com/redblade...
Fifteen Thousand Useful Phrases: https://github.com/redblade...
Mark Twain: https://github.com/redblade...
Catholic Saints: https://github.com/redblade...
BitchX Kick & Quit Messages (New BSD): https://github.com/redblade...


Faster Elfeed Search Through JIT Byte-code Compilation

Clément Pit-Claudel

Wonderful post! Mind if I share this with emacs-devel? There are not that many examples of lexical scope making things faster in the wild (I think json.el is another one), and I'm sure many of the emacs-devel regulars would enjoy reading this :)

Christopher Wellons

Thanks! Not only do I not mind you sharing it, I encourage it. I've been preaching about better performance with lexical scope for 3 years now (since my "Emacs Lisp Readable Closures" article), though this is the first time I've gathered hard data to support it.

Phil

Very cool. Thanks for posting this (and your other articles) Chris -- always interesting and enlightening reading. I suspect that some nice improvements to the elisp manual could be derived from them as well, if that's something you'd be interested in? (I admit I've put zero actual thought or investigation into that comment, but I'll raise the idea anyway :) Even if the manual *does* discourage actively poking around with byte code, I see no reason for the details -- and the potential benefits of that knowledge -- not to be documented.)

Frank Ruben

Late to the game reading this, thanks anyway for another highly interesting post.

When you say "It’s as fast as Elisp can get.", there is one more trick that might help: Not language-specific, but as used by database-systems: because of the short-circuiting you can use static information about the distribution of the tags in the entries to build the (or (memq ...) ...) clauses, in that case so that the most-used tag comes first. For real-world data with an expected unequal distribution of tag values this should give a measurable speed-up.

Christopher Wellons

That's an interesting idea, thanks! I might try this out. If it seems like it would prove generally useful, I might want to (redundantly) track tag distribution in the database so that the information is readily available.


Some Performance Advantages of Lexical Scope

Phil

It's good advice, with the proviso that judicious use of dynamic variables ought to cover everything that a user of your code might conceivably want to override, even if the author did not originally anticipate the use-case. Using lexical scope by default has the potential to deny users the ability to leverage dynamic scope for their own benefit, so it's important to write and review your own code with that in mind (unless sharing your code with others is somehow not a possibility). Good code will indeed provide the best of both worlds, though.

Xah Lee

a quibble: i was asking about the lexical-binding directive/“file variable” vs a new special form, with respect to language design choice. Not about the existing macro in cl lib that happens to be called lexical-let (am aware of it but avoid using anything in cl lib, so far.).

Among your explanations, the one info you convinced me that a language directive is more proper is this:

“It applies not just to let, but to all forms that create local bindings, such as function parameters, condition-case, and any macros that expand to let.”

from this, i understand that a new special form won't work. It must be a language directive (e.g. file variable in emacs)

this one scratched my spot. :D and i learned many other things from your post!

Christopher Wellons

I primarily mentioned our discussion in order to indicate what sparked the article, using it as an excuse to describe how lexical-let works since it's a pretty neat hack. Regardless, I'm glad I was able to convince you of the value of lexical-binding! Thanks for the discussion topic.


Relocatable Global Data on x86

ner0x652

Great article! Thanks!

P.S. and uses cl (the lowest bit of rcx) in a shift operation. -> I know you meant lowest byte ;)

Christopher Wellons

Thanks! It's now been corrected to "byte."


Domain-Specific Language Compilation in Elfeed

Phil

> The remaining time is dominated by xml-parse-region, which is mostly out of my control.

Are you using libxml-parse-xml-region when it's available?

Christopher Wellons

Surprisingly, libxml-parse-xml-region isn't any faster than xml-parse-region, at least for syndication feeds. In fact, it's a tiny bit slower in my tests. I don't know why, but maybe its integration into Emacs isn't ideal, or maybe all the consing and interning is what actually dominates the time.

Also, since libxml2 is an optional dependency, Elfeed would need to support both. That's complicated because libxml-parse-xml-region is less configurable and its parsing results are subtly different. So even if it *was* faster, it would need to be fast enough to make it worth the extra complication.

Elfeed *does* use libxml-parse-html-region for elfeed-show since the xml package doesn't support HTML parsing. That means libxml2 is required for viewing feed content.

Phil

That is surprising -- but I was merely assuming it would be quicker. Not a low-hanging fruit after all, then.

Clément Pit-Claudel

Very nice. I doubt compiling regular expressions would help; parsed regexps are already cached by the regexp engine, and it's fairly well optimized.


C Closures as a Library

zephyrpellerin

It must be in the Zeitgeist,

I was working on something like this as well a few days ago -- a META II compiler that emits x86 assembly code for custom C 'closures' and partially applied functions. Right now it just basically rewrites the existing instruction constants, which seriously limits its functionality (so, for example, an 'add' partial function emits the instruction `add EAX, 0x0`, then later locates the address of that 0x0 in the function body and replaces it with the value of EAX).

I wrote this in Rust and put it on Github, but having something in C is just as well!

Something I'd be curious about if you happen to know is how to ensure that ld respects omagic options, or otherwise ensure that the .text section begins life as RWE (without mprotect).

Christopher Wellons

You could try using your own linker script.

Greg V

A more practical solution that already works on many operating systems (well, everywhere with LLVM) and gives you nice syntax: clang -fblocks -lBlocksRuntime

Boštjan Vesnicer

First of all I want to say that I really enjoy reading your posts/articles. They are very technical and informative.

Just a small remark. You are marking the distance function as inline (which I know is just a hint to the compiler), but when passing it as a callback to the qsort function the compiler can't possibly inline it.

This is in contrast to the C++'s std::sort function which is a function template so the compiler is free to inline the compare function if it wants. Inlining could of course make a performance difference.

Christopher Wellons

You're right about the qsort() callback not practically being inlined. (Though LTO can and does inline callbacks, and it could do this when libc is statically linked, achieving the same effect as std::sort, but this is not generally the case.) However, the callback is actually coord_cmp() / coord_cmp_r(), not distance(), and distance() would certainly be inlined into those functions regardless of my hint.

Boštjan Vesnicer

You're of course right.

proteansec

If C had support for nested functions, would defining closures be a rather simple task? All we'd have to do is write a wrapper with a nested function that calls the original function that doesn't accept a user pointer argument, right?

Basically, I would like to know whether closures depend on support for nested functions (in other programming languages as well), and whether, where such support exists, closures can be easily defined/used in an arbitrary programming language. If such support existed in C, would this whole task have been rather simpler?

Christopher Wellons

The syntax of nested functions would be an important part of integrating closures into the language proper, but one of the real difficulties is how closures are to be allocated since they have memory associated with them. You can see this with C++11 where closures are explicit in how things are captured, leveraging C++'s implicit memory allocation. C has always been explicit about memory allocation, and closures don't really fit that model. Ultimately they're not very C-like.

Nested functions themselves could be pretty C-like (and GCC even supports it already), and sticking to the explicit pointer argument mostly solves the closure problem in a C-like way.


Manual Control Flow Guard in C

Guest

>The size of the name buffer is 8 bytes, and peeking at the assembly I see an extra 8 bytes allocated above, so there’s 16 bytes to fill, then 8 bytes to overwrite the return pointer with the address of self_destruct.

What?

Christopher Wellons

In other words, the compiler allocates 16 bytes for "name," not just 8. In order to overflow the buffer, I have to write more than 16 bytes into the buffer. I knew about the extra 8 bytes from examining the disassembly for the function. In the end, that's 24 bytes in total for the exploit.

Guest

Oh, I see. Thank you for the reply!

psh3nka

"to make it harder for an attacker to manipulate the bitmap should he get the ability to overwrite it by a vulnerability." should he? it's not misprint?

Christopher Wellons

That's exactly what I intended to write. Imagine a comma appears after "bitmap" and it may help you parse the sentence.


How to Write Fast(er) Emacs Lisp

agumonkey

Upvote as usual. How often do you write perf-sensitive elisp code? I was under the impression you were kinda busy with other languages these days.

Thanks for your great insights!

Christopher Wellons

Thanks, agumonkey! These thoughts came out of the two big Elfeed optimizations I did last month (JIT and DSL articles). I've been using Elfeed on a daily basis for nearly 3.5 years now, and it's a vital part of my routine. I have a deeply vested interest in its smooth operation.

Then again, more recently someone found a bug deep inside simple-httpd (issue #13), in some awful code I wrote 8 years ago. I completely rewrote it, fixing the problem and making it dramatically faster, which subsequently gave Skewer a nice boost as well. You can really tell the difference in Skewer for large (100kB+) results.

And then last week, just for fun, I wrote a throwaway Killer Sudoku solver. For practice I spent a couple hours trying to make it as fast as possible. Applying the techniques in this article, plus a jump table hack (to be covered in a future article), I got it down from around 3 minutes to 12 seconds. This was without changing the algorithm. Some careful considerations in the right places can have dramatic results!

But otherwise you're right, since I've been focusing on a lot of C stuff lately.

agumonkey

Interesting, eager to see that future article.

progrock

This is fantastic. How did you get to this level? I see so many articles on emacs which are little more than basic functions on a hook. Seibel's Practical Common Lisp? What got you over the first hurdle, Chris? I'm aware there is no easy path, but as an experienced procedural programmer, how do I plot the course to intermediate elisp and beyond?

Christopher Wellons

My advice is always this: regular purposeful practice. In programming, experience really is the best teacher. When you've completed something useful in Elisp, consider how it could be better. Could it be faster? Could it use less memory? Could it be done with less code? (Not in the line count sense, but with fewer conceptual components.) Could it be clearer to the reader? These are often trade-offs with each other, and it takes experience and practice to strike that balance well.

anon

It's probably worth mentioning that the loop unrolling example is the same as just using cond.

Christopher Wellons

Oh yeah, you're right. That didn't occur to me, so it's a less optimal example than I intended. When I've applied unrolling to real code, it wasn't actually equivalent to a cond. For example, see my other article "Faster Elfeed Search Through JIT Byte-code Compilation."

Clément Pit-Claudel

I wouldn't word rule 3 so strongly — especially because it contradicts rule 2. Here are the timings I get for your three functions, on my machine:

0.6968s [slow]
0.6654s [fast]
0.6707s [cl]

This is for a 5000-element list; I used the following code to benchmark it:

(defvar ~/input
  (let ((lst nil))
    (dotimes (n 5000)
      (push n lst))
    (nreverse lst)))

(dotimes (_ 10)
  (with-timer "slow"
    (dotimes (_ 1000)
      (expt-list ~/input 2)))
  (with-timer "fast"
    (dotimes (_ 1000)
      (expt-list-fast ~/input 2)))
  (with-timer "cl"
    (dotimes (_ 1000)
      (expt-list-fast ~/input 2))))

with-timer is this:

(defmacro with-timer (message &rest body)
  "Show MESSAGE and elapsed time after running BODY."
  (declare (indent defun))
  `(let* ((start-time (current-time)))
     (prog1
         ,@body
       (message "%.4fs [%s]" (float-time (time-since start-time)) ,message))))

All in all, the lambda form isn't that bad :) (and it's much faster when the code isn't byte-compiled, though that's not a big concern)

Btw, did you hear about the byte-switch branch?

Christopher Wellons

I'm also seeing about the same 5% speedup from the "fast" version. It's not as dramatic as I thought it would be. Since mapcar is a built-in, I think this makes sense because the loop is implemented in C rather than byte-code. However if you reduce the list to something a lot shorter (I chose 20 elements) and increase the number of benchmark iterations to compensate, that gap opens up to 25% faster. Going in the other direction, up to around a million elements, the "fast" version actually gets much slower.

I speculated that the boost for short lists was garbage collection, since the closure allocation isn't amortized over as many list elements. However, disabling garbage collection (high gc-cons-threshold) makes the effect even more dramatic, up to 35%-40% faster.

The slowdown in the other direction may be in part due to cache misses on the second traversal. That's a lot of memory to go back and touch. However, the "slow" version also does a lot fewer garbage collections for huge lists. I don't know why this is. Maybe mapcar is doing something special internally? The "fast" version remains faster for huge lists when garbage collection isn't involved.

I made some changes to your benchmark. First, I added a manual garbage collection right before grabbing the start time. That way all tests start with the same heap state. I then added a garbage collection counter. I also dropped the "cl" version since it's literally just the nreverse version with an extra variable. Finally, I made the list length and number of iterations parameters and included them in the output (message, length, iterations).

Here are the results with 25.1. You can see the garbage collector was disabled in the second grouping. I did multiple iterations at each length/iteration pairing but only kept the last pair of each figuring it was sufficiently "warmed up" by then.

4.6893s 510gc [slow 20 1000000]
3.5296s 400gc [fast 20 1000000]
4.2031s 409gc [slow 200 100000]
3.7720s 400gc [fast 200 100000]
3.8490s 400gc [slow 2000 10000]
3.4427s 400gc [fast 2000 10000]
3.7330s 333gc [slow 20000 1000]
3.6769s 400gc [fast 20000 1000]
3.1441s 100gc [slow 200000 100]
5.1803s 346gc [fast 200000 100]
2.6902s 10gc [slow 2000000 10]
4.9232s 61gc [fast 2000000 10]

2.7241s 0gc [slow 20 1000000]
1.7058s 0gc [fast 20 1000000]
1.7881s 0gc [slow 200 100000]
1.5595s 0gc [fast 200 100000]
1.7724s 0gc [slow 2000 10000]
1.5482s 0gc [fast 2000 10000]
1.8012s 0gc [slow 20000 1000]
1.5467s 0gc [fast 20000 1000]
1.8160s 0gc [slow 200000 100]
1.6160s 0gc [fast 200000 100]
1.8444s 0gc [slow 2000000 10]
1.6827s 0gc [fast 2000000 10]

So the guideline is more complicated: For lists under 100k elements, use nreverse. Over 100k elements, use mapcar unless you're temporarily suspending garbage collection.

Wilfred Hughes

There's no reason why dash.el couldn't use an explicit loop in its macros, so I benchmarked this with the intention of sending a PR.

https://gist.github.com/Wil...

However, I'm struggling to find a scenario where mapcar is slower. Any ideas why?

The timings I got were:

Small list with mapcar (seconds): 1.0259000000000001e-05
Small list with loop (seconds): 2.8387e-05
Medium list with mapcar (seconds): 0.0027399819999999997
Medium list with loop (seconds): 0.007948615
Large list with mapcar (seconds): 0.7446324980000001
Large list with loop (seconds): 1.236648743

Vibhav Pant

Update: Emacs 26 has support for jump tables for Elisp code written with cond/cl-case/pcase.

Christopher Wellons

Thanks, I'll have to check that out. It's great that existing Elisp gets the free upgrade when it's byte-compiled for the newer release.


Asynchronous Requests from Emacs Dynamic Modules

Vladimir Kazanov

Chris, thanks for the article and useful ideas/feedback/contributions!

I'll add some background to the project's README and a link to the article, if you don't mind.

For blog readers: actually, there are a few caveats in the way Elfuse works right now. First, it does not support multiple FUSE filesystems per Emacs instance. Second, Emacs-side API is a bit awkward right now. I'll fix the limitations in the next iteration of work on Elfuse.

Meanwhile, I'm looking for interesting Elfuse-based demo project ideas :-) I have a few of my own but who knows...


OpenMP and pwrite()

Boštjan Vesnicer

Just another small remark from an interested reader :-)

Since you are using Windows' WriteFile in an asynchronous manner, wouldn't it be more correct to issue a GetOverlappedResult call after calling WriteFile to wait for the I/O operation to finish before returning from the write_frame function? Please correct me if I am wrong.

Christopher Wellons

I could be mistaken since I only know enough Win32 to be dangerous, but since standard output presumably wouldn't be opened in overlapped mode (FILE_FLAG_OVERLAPPED), it's not actually asynchronous. The "OVERLAPPED" argument is merely doubling as a plain old file position even for synchronous operations.
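
In sketch form — simplified and hypothetical, not the article's actual write_frame() — the synchronous positional write I'm describing looks like this:

#include <windows.h>

/* Synchronous positional write: the handle was NOT opened with
 * FILE_FLAG_OVERLAPPED, so WriteFile() blocks until the write
 * completes and the OVERLAPPED only supplies the file offset. */
static BOOL
write_at(HANDLE h, const void *buf, DWORD len, unsigned long long off)
{
    DWORD out;
    OVERLAPPED ov = {0};
    ov.Offset     = (DWORD)(off & 0xffffffffu);
    ov.OffsetHigh = (DWORD)(off >> 32);
    return WriteFile(h, buf, len, &out, &ov) && out == len;
}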


Why I've Retired My PGP Keys and What's Replaced It

Ryan C

Generating public keys from a passphrase is a catastrophically bad footgun - it makes the public key into a password-hash equivalent that can be cracked. For bitcoin, this was called a "brainwallet" and resulted in people - even very smart ones - losing large amounts of money to theft.

This is *exactly* why you're not supposed to roll your own crypto.

Christopher Wellons

The dangers with brainwallets are poor passphrases, address re-use (in the naive case), and poor key derivation (i.e. a single round of SHA-256). Electrum, my personal Bitcoin wallet client of choice, is doing just fine with passphrase-derived addresses because it's dealt with each of these issues.

With the default key derivation settings, keys derived by Enchive are very resistant to cracking (and probably *too* paranoid). As I mentioned in the article, each cracker guess requires its own 512MB random-access read+write memory buffer through a computationally intensive process. Guessing ain't cheap. It's still ultimately the user's responsibility to pick a good passphrase, which is why this feature isn't the default. I know my own Enchive passphrase is sufficiently long, so this feature will be useful.

Ryan C

You cite Electrum, which, as far as I know, no longer allows users to choose their own passphrase, then go on to say it's the user's responsibility to pick a good passphrase.

Electrum-style tools that generate the passphrase for the user and actively try to prevent people from choosing their own are fine if done right, but making the user responsible for their own security in this sort of tool is actively harmful. Most people simply do not have the necessary understanding to choose one that is secure enough (even with a heavy KDF).

Greg

Borg backup might be up your alley, for this use case or a similar need. I replaced a pile of rsync/reverse encfs stuff with it.

Michał Kosek

This doesn't build confidence: "Enchive uses an scrypt-like algorithm for key derivation". Why are you rolling your own key derivation instead of using scrypt? Scrypt has received a fair amount of attention from cryptographers and one can be relatively sure that it doesn't have obvious flaws, which is not the case with your algo.

Christopher Wellons

At least at the time I originally wrote Enchive, there were no embeddable libraries for either scrypt or Argon2. In a sense you could say the algorithms weren't yet mature enough to have a wide variety of implementations. Since then, I have developed my own embeddable Argon2 library, and it's what I use now in C applications that need a KDF. This is all in contrast to, say, SHA-256, for which I had half a dozen embeddable options. Avoiding dependencies was, and still is, a higher priority than using either scrypt or Argon2, so I went a different route.

Fortunately, key stretching is only necessary if you want to use a shorter passphrase. If you're using, say, a 10-word Diceware passphrase (worth ~129 bits), a single SHA-256 hash over your passphrase is a perfectly sufficient KDF for Curve25519. It wouldn't matter whether you're using scrypt, Argon2, or Enchive's custom KDF.
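
As a rough sketch (not Enchive's actual code; sha256() here stands in for whatever embeddable SHA-256 implementation is on hand), that whole KDF can be as small as:

#include <stdint.h>
#include <string.h>

/* Hypothetical helper: SHA-256 of buf[0..len-1] into out[32]. */
void sha256(uint8_t out[32], const void *buf, size_t len);

/* Derive a Curve25519 secret key directly from a high-entropy
 * passphrase: one hash, then the standard clamping. No stretching,
 * which is only reasonable when the passphrase itself carries
 * ~128+ bits of entropy (e.g. a long Diceware phrase). */
static void
secret_from_passphrase(uint8_t key[32], const char *passphrase)
{
    sha256(key, passphrase, strlen(passphrase));
    key[0]  &= 248;  /* clear low 3 bits       */
    key[31] &= 127;  /* clear top bit          */
    key[31] |= 64;   /* set second-highest bit */
}

The three bitwise operations at the end are just the standard Curve25519 secret-key clamping.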


How to Write Portable C Without Complicating Your Build

Nicholas Wilson

It may be a tool too far for your tastes, but CMake is becoming very popular as a makefile-generator and we use it very successfully for generating Unix makefiles + ninja rules + VS projects + Xcode projects and more. If your goal is a portable project, something like CMake is ultimately the way to go, rather than trying to get Unix makefiles to work on Windows.

Doug Cuthbertson

I see two separate problems for creating portable applications (regardless of whether they have a GUI component) - available APIs in each operating system, and the build tools. Your solution to the API issue, to create your own functions to isolate API differences, is great. Make the operating system adapt to your application. Your sleep example can even be extended to other OS-specific things like reading and writing files. Make wrapper functions that put data into a form that's internally consistent for your application, but read and written in an OS-specific way.

As far as build tools go, use what's appropriate and easiest/simplest for the build environment. On unix-like operating systems, it's probably make and makefiles and your favorite compiler. On Windows - well, I'm most familiar with Visual Studio. It makes it easy to create consistent builds in a variety of configurations (debug, release, 32-bit, 64-bit), and to add, move and remove components for an application as the software grows. Also, msbuild.exe reads the solution and project files, so you can execute a build from the command line just as easily as from the IDE.

CMake and other tools that generate build configurations are overly complex and are an impediment to the software development cycle. They hide details like compile and link options, or make you use their language/syntax to express those options, and don't help you avoid OS-dependent configuration issues. I just spent a couple of months removing CMake from a project at work. CMake was used to generate Visual Studio projects from several third party libraries, as well as the in-house, project-specific libraries and executables. It generated a lot of project files, and made it very difficult to see which projects built libraries and which were there just to kick off CMake so it could do some kind of internal dependency checking. Once CMake was removed, I was left with 14 projects, each with a clear purpose. A fresh build now runs in a couple of minutes instead of 17 minutes. When no changes were made, it would take Visual Studio and CMake 20-30 seconds to determine there was nothing to build. Now it takes a split second.

Make the OS a service to your app. Keep the build as simple as possible.

Edit: fix a typo

Christopher Wellons

That's a good point about just having a separate Visual Studio build, since that's the natural way to build on that platform. I've seen other projects do this, where there's a Makefile for unix-like systems, and in parallel a Visual Studio build. The only real downside I can see is having to maintain them both.

The main hesitation for me is that the Visual Studio build is more of an afterthought. I've learned its quirks and such (e.g. being _almost_ C99, but not quite), and it's easy to plan ahead to make my software Visual Studio friendly. Ultimately I only run the command line tools (vcvars.bat, then cl.exe, link.exe, etc.) and avoid firing up the whole thing if possible. If I took it more seriously, I'd probably follow your advice.

I agree with you on CMake. I've tried using it in a few projects and was never satisfied. Seeing LLVM make pretty effective use of CMake makes me hesitate to write it off completely yet, but maybe there's just some really heroic stuff going on behind the scenes to keep it working well.

Powerslave

There’s a much easier solution: Document that the application requires, say, C99 and POSIX.1-2001. It’s the responsibility of the person building the application to supply these implementations, so there’s no reason to waste time testing for it.

That's the most unprofessional paragraph I've seen in a good while.

Zeturic

Are you trying to say that "professionals" are responsible for ensuring that their programs can be built on any random machine?

Powerslave

Would you be happy if you had to hand-pick resource and library files for your favorite GTA/WoW/Blender/whatever because some lunatic developer insisted it's your responsibility to provide them? That'd pretty much be in the same vein.

Like it or not, real professionals are responsible for ensuring that their software can be built on any "random" system it is intended to be built on. That's absolutely part of being a professional (as opposed to hacking shit together and crossing fingers).
Providing a ubiquitous, automated way to resolve your product's dependencies is, give or take a few corner-case exceptions, always your responsibility as someone who writes said software. Trying to delegate this task to a downstream consumer is as unprofessional as it gets.

Zeturic

Here's the thing: if you're writing C99 code for a POSIX system, systems without a C99 compiler or POSIX support necessarily fall outside of the set of systems that you're intending your program to be built on.

Do you think that it is unprofessional that, for example, Chromium can't be built for the PDP-11?

Powerslave

"for a POSIX system"
That's exactly where one fails at portability.
Do you think it'd be professional to say a Win32 application is perfectly portable and it's the responsibility of the end user to provide the W32 API (e.g. have Wine installed)?

L Michaels

I'm not a fan of autoconf or the standard GNU way of building programs. It requires the Perl and m4 languages just to build something in C/C++. There are times when using #ifdef to test for the platform and/or compiler just isn't enough. I think CDetect makes a nice (C based) replacement for autoconf/configure. It's available at SourceForge. There have been a few forks of it as well, including mine, which adds support for cross-compilation and other features like working in conjunction with pkg-config/pkgconf.

I do quite a lot of cross-platform development. I use a custom build system and also use CDetect and pkgconf (when needed) and make. My build system can autogenerate parts of a makefile and deal with utility/tool differences on various platforms to make things more portable. I've been able to build applications that run on Windows, Linux, FreeBSD, Android. I sometimes build cross-platform programs that work on DOS as well. On Windows, I prefer MinGW to Visual Studio. I've used both and I've hit some bugs using Visual Studio when the same program passed various validation tests and worked fine with 4 other Windows compilers and a Linux compiler. If anyone wants a POSIX compatible Windows compiler, there are projects like Cygwin and midipix that supply them.

I'm currently investigating portable GUI options and working on writing a minimal one for my own use. There are several good graphics and/or GUI libraries that can be useful cross-platform, such as nano-X/microwindows (which works on a variety of platforms and has a subset of the X Window System and Win32 APIs), SDL, Allegro, PDCurses/ncurses, nuklear, and GLFW.


My Journey with Touch Typing and Vim

Nagora

I do find these sorts of posts very surprising. I've done relatively little modification of my Emacs keys (CapsLock->Ctl is the main one) and I write a LOT of text and code with it. I've just finished a 75,000-word novel (with week-long bursts of 2000 words every day) and I work on PHP, Perl, and Bash scripts as my day job (and a plethora of other languages for fun at home); touch-typing is absolutely required for the level of output I sustain. In 20 years of Emacs use I've never had any trouble doing this, so I can't see why touch-typing would be an issue to drive anyone to modal editing.

Perhaps your “Hmm, that’s not very convenient. I’ll change it.” approach has actually hampered you and switching to an unmodified Vim simply represents a "reset to sanity" opportunity? Or perhaps the keyboard itself places the Ctl and Alt keys in odd places? I know that the keyboard on my laptop is definitely harder to use than my work's HP or home IBM-M boards (although the laptop does give me a Super- key to play with).

Certainly, you would have to hold a gun to my head to make me use Vim for even 15% of my output, and I say this as someone who values having Vim on the various servers I have to access for routine maintenance. Modal editing was my introduction to Unix and I am very happy that I do not have to use it more than once or twice a day for short periods.

Christopher Wellons

Congratulations with the novel! That's a ton of text editing.

It could be that I still haven't figured out how to comfortably press some of the frequent Emacs bindings without my fingers getting lost. Unlike right SHIFT, right CTRL isn't really an option because it's badly placed and its position varies so much from one keyboard to the next. Between work and home, I do use a number of different keyboards. Being stuck with only left CTRL makes for awkward left-handed CTRL combinations (C-x, C-a, C-s). Using caps-lock as CTRL only helps a little bit. In contrast, modal editing has felt very natural while building touch typing muscle memory.

I don't suffer from RSI, but I know it's fairly common for sufferers to switch to modal editing to relieve their RSI problems. The most recent example I've seen is Casey Muratori (of Handmade Hero), who switched from Emacs 23 to 4coder (a modal editor). I've taken this overall trend as evidence that Emacs' default bindings really do cause strain under touch typing, and it's not just a matter of technique.

When I've changed some of the default Emacs bindings -- which I haven't actually done *that* much -- I don't think it's that I chose something worse. It's just that I chose something arbitrarily different. Then I built up muscle memory for the non-standard bindings. Since I'm starting fresh with Vim, I can avoid this by stopping it before it even starts.

VanLaser

As a person that started with Vim before Emacs, I can only admire Emacs people that are sufficiently intrigued by Vim to start studying it. There are a lot of 're-inventing the wheel' Emacs packages, or keymaps, or bits of elisp that try to solve problems that Vim solved "perfectly" a long time ago.

The "one" book to read afterwards IMHO, and before starting to move to 'leader keys' and all kind of Vim-oriented keymaps, is "Practical Vim" - it's an excellent distillation of 'good habits' and manners of working with Vim.

Christopher Wellons

I've observed an Emacs re-invented wheel myself: multiple-cursors. You create a region, then invoke its main function to put a cursor on each line in the region, allowing you to edit multiple lines at once, though it's not quite as powerful as the normal cursor. It's really slick, and it's visually impressive, but 99% of my uses can be done with Vim's visual mode and visual block mode. Technically multiple-cursors can do a bit more, but I never use it all. Vim's approach is better because it composes elegantly with the existing editing commands, and it doesn't have a bunch of little quirks.

Thanks for the tip on the book. The official user manual has all the information, but not a lot of opinion on how to use it. When I'm new to something, I really do want guidance on good habits.

VanLaser

Glad I could help a little! About multiple cursors, the topic is also periodically mentioned on vim reddit. Newish vim versions have the 'gn' text object to help with (an alternative way of) offering some of the same functionality (together with visual block etc.). This blog post covers it pretty nicely I think: https://medium.com/@schtoef...

Christopher Wellons

Interesting article, thanks! That's exactly the sort of thing I was talking about with multiple cursors. I didn't know about gn, either. With some research I see Evil has gn as well, though it requires some configuration to enable it.

Christopher Wellons

Following up on your suggestion, I'm happy to report that "Practical Vim" has been an excellent book, thanks! I should have started with it in the first place instead of diving into the official Vim manual. I'm going to add a note to the article about it.

VanLaser

YW! - glad I could help a little :)

VanLaser

(with the risk of posting too many "helpful" links)
Something that helped me a lot in order to go in the other direction (Vim -> Emacs) was this document: https://github.com/noctuid/... which provides a kind of informational "glue" between the two editors.

Anonymous

Evil is the best of both worlds, I'm sure you'll love it.

mirabilos

I just use ed(1) for that, it’s easy to use and pretty universal.

Well, enter GNU distributions. First Gentoo “oooh, GNU nano is sooo user-friendly¹, let’s install it as the default editor², oh and then we don’t need the standard³ editor⁴”, then others like Debian followed. FU!

① it’s not, I always configure my pine to never even TRY to show pico and to dump me straight into my favourite curses-based editor, and GNU nano’s just a pico clone

② it’s “editor” not “nanoitor” you imbeciles!
③ also ignoring the POSIX mandated install location of /bin/ed and not even providing a symlink… fortunately this got fixed after I bugreported it AND figured out where the relevant Linux standard says to follow POSIX in this regard… *sigh*
④ not installing the others by default; not nice

I mean, ed(1) is concise and easy to learn, so every serious Unix user ought to know it anyway.
Manpage: http://www.mirbsd.org/man1/ed
Tutorial: http://www.mirbsd.org/manUS...
Advanced: http://www.mirbsd.org/manUS...

Greg Graham

I like vim, and I learned the original vi in 1984. However, when I am using vim a lot, it causes me problems when I have to edit text in a non-vim environment, especially situations where ESC means cancel. As I get older, I have more trouble switching back and forth, so now I do my programming in Sublime Text.


Two Games with Monte Carlo Tree Search

Boštjan Vesnicer

Interesting post, as always. Keep posting ...

Shameless plug: I managed to win Connect4 on the first try. Here is the proof :-) https://uploads.disquscdn.c...

Christopher Wellons

Nice! I've got to figure out how to make it stronger. If you want to try something simple yourself, tweak the "AI parameters" weights at the top of the source. Recompile it and see if it plays better.

Ruben

I won't feel bad for that AI when it finally falls into a 'shallow trap'. It kicks my ass every time in Connect Four...
Is the only reason for saving the tree that the information can be re-used?

Christopher Wellons

Yes, it's just for re-use. The previous game states aren't considered, and so the tree could be constructed from scratch each time a decision is needed. The 500,000 playout limit is an additional 500,000 on top of the information carried forward in the reused branch, so reusing it effectively increases the number of playouts for free. Besides this, it may generally be faster to garbage collect the dead branches than to reinitialize the buffer for a fresh tree (e.g. reconstructing the entire free list). If that's true, then there's no benefit at all to starting from scratch each turn anyway.

Eric Boesch

Late comment after I stumbled onto your blog --

I would not assume UCT is a poor fit because of traps, but rather that basic UCT is not enough. The extra step of feeding the UCT tree back into the playouts to make that stage smarter can discourage traps, and I believe that's what rather quickly pushed Go programs from around KGS 2 dan to 7 dan, just shy of pro strength. (I have not looked into what AlphaGo did later to reach its current ridiculous strength of maybe 13 dan.) I do not think that extra step is easy, since it requires judging similarity of different positions. I had had the same idea -- it's kind of obvious -- but I did not figure out how I might implement it.


The Adversarial Implementation

Jan Tušil

There is a compiler based on a formal semantics of C, which detects every undefined behaviour: https://github.com/kframewo.... It could be used as an adversarial compiler.

Christopher Wellons

Thanks, this is an interesting project! My adversarial implementation concept is more of a thought experiment, a tool to reason about a questionable piece of code, but having a concrete compiler like this should be useful for testing.

Paddy3118

> However, this code is incorrect. The deterministic handle closing is an implementation behavior, not part of the specification.

Implementation details aren't necessarily incorrect. The spec may omit mentioning a detail (by accident or design).

Specs need to end. Expect some discrepancies found by this method to be accepted, but some to be rejected as being too pedantic or better *not* added to the spec.

Christopher Wellons

What I mean is that the small sample program is incorrect, not the implementation or specification. That's because the sample's correctness depends on the behavior of a particular implementation, and it's perfectly fine for implementations to deviate on that detail.


Web Scraping into an E-book with BeautifulSoup and Pandoc

Eric Burns-White

For the record? I literally hand-converted this to Markdown about four months ago so I could edit and prep the upcoming eBook and print editions. Your script would have made that 8000 times easier. :)

I used the hash for scene breaks specifically because I use hr differently, as you surmised. My wife mocks my HTML markup, it's worth noting.

Christopher Wellons

Thanks for chiming in, Eric! And thanks for sharing your book online. Knowing now that you've read my commentary about your markup, I feel like I was being a little too harsh, but, at the same time, I'm also vindicated by your wife's teasing. :-)

Converting an entire book to Markdown by hand is incredibly tedious, but I hope it at least paid off for you as an extra pass of proofreading. Pandoc can also convert HTML to Markdown, and if I had known about BeautifulSoup two years ago when I converted the other book, I would have used a script just like this to do the bulk of the conversion. Especially since that book is almost twice as long as yours.
If you ever need to bulk process your writing again in the future, feel free to get in touch with me for any technical assistance.

Eric Burns-White

Following up -- I've been using modifications of your script for scraping/converting other projects that are on Banter Latte proper, and they work like a charm.

Also -- when I mentioned all this to my wife, she pointed at me, shouting "See?! See?!" Hell hath no fury like a usability specialist vindicated. :-)

Eric Burns-White

Oh, and while I appreciate not handing out the ebook of the original (mostly so I can track readership), I don't have any problem with folks doing this themselves. It's not Creative Commons released, per se, but it's online for free and I sure as heck don't mind people adjusting things to their reading preferences.

Not that people would need me to thumb this up, per se, but no reason I shouldn't.

Cees Timmerman

if language == "Python": print("No need for braces in if.")

Christopher Wellons

You're right. Old C habits are hard to break. I've just fixed it:
https://github.com/skeeto/s...


Switching to the Mutt Email Client

B Brad

Kind of confused by the "Maildir setup (i.e. not IMAP)" comment. In any case, if you are using IMAP I suggest Dovecot. Very easy to set up, and if you use Dovecot's deliver (recommended and the default) you get Sieve for free. Dovecot's deliver is of course aware of Dovecot's index format, so everything stays in sync without reindexing. I greatly prefer Sieve's secure and human-friendly configuration files over procmail's insecure, line-noise-like configuration files.

Christopher Wellons

You're right, I could have both, but I prefer to just ssh into the server and work directly with Maildir. Currently, my procmailrc just sorts out spam, but if I ever needed something more complex, Sieve does look like the nicer, better option.

possiblywrong

I skimmed the RFC and maybe missed the appropriate details, but it isn't clear to me how source code is handled, particularly things like "raw" triple-quoted multi-line strings in Python, which might have trailing whitespace but that you wouldn't want to be flowed?

Christopher Wellons

Unfortunately I think there's just no way to express code with trailing whitespace under f=f. You need those hard newlines to prevent wrapping. That's a good point about Python triple-quoted strings, being one of the handful of legitimate places to require trailing whitespace in code. Another case is inline patches where the context has trailing whitespace.

Anonymous

You might be interested in mutt-sidebar, it's a patch but I believe Debian distributes a separate mutt package with the sidebar pre-patched (under the mutt-sidebar name).

Christopher Wellons

Thanks for the tip. I have looked into the sidebar patch, and Debian does indeed include the patch (Stretch actually ships Neomutt), but it's not something I want to use. I don't want to split the Mutt window down any further.


Building and Installing Software in $HOME

Nicolò Balzarotti

What about using the nix package manager? I think most of the problems would have been solved easily

Christopher Wellons

I was aware of nix but hadn't really considered it. I do see it supports home directory installs, which is great. However, it doesn't really fit either of my use cases. 1) I want to install specific versions or patches of software not available in nix (i.e. my multiple Emacs example). 2) I didn't mention it in the article, but when I'm not root, I also usually don't have direct internet access, so nix can't help me.
I had considered discussing pkgsrc, which I like better than nix, but it also doesn't fit the bill for the same reasons.

Nicolò Balzarotti

You can easily edit the nix expression to add patches, and it's just a
"patches = [./patch1.patch ./patch2.patch];"
away. And I think that having an offline repository is not that difficult, but I've never tried. With a small setup, I think it's possible to install everything needed from a USB drive.

OT: thanks for your posts, I always enjoy reading them

Anonymous

Have you tried GNU stow? It's well suited for this, though I haven't used it myself as my system (practically identical to yours) "just works".

Christopher Wellons

I don't really need all the bells and whistles that come with Stow. A few environment variables and my dotfiles install script have been perfectly sufficient for these past few years:

https://github.com/skeeto/d...

In all things, I'm very averse to increasing the number of dependencies from 0 to 1. That's a huge leap. Going from 1 to 2 ain't so bad.

Kaushal Modi

+1

It's a one-stop shop for managing dot files, installing multiple versions of libraries and packages. You simply set PATH, MANPATH, etc. to point to the stow target.

Wilfred Hughes

Have you looked at Gentoo Prefix?

I've used it to great success to build an entire Linux userland in $HOME. I was using a Red Hat 5 box with an ancient GCC, and Gentoo Prefix helps you bootstrap the compiler. It also provides a package manager, so you don't need to manually chase dependencies.


Stack Clashing for Fun and Profit

(no comments)

Rolling Shutter Simulation in C

Tudor Watson

nice :)

Christopher Wellons

No weight, just me pulling on the string slowly. Though using a weight would have been a good idea.

Tudor Watson

I thought that, then rescinded the thought, as I think it would accelerate over time, right?

Christopher Wellons

True, but at some point the acceleration would be negated by friction and the speed would become constant. Striking that balance within the available space would be the tricky part.

Tudor Watson

I think this will do the job, https://en.wikipedia.org/wi... maybe you can 3D print it too :)

Murielle Bolle

I am using Linux Mint 18.1 and the transparent encode/decode command in the article does not work for me:
lavf [error]: could not open input file
raw [error]: raw input requires a resolution.
x264 [error]: could not open input file `/dev/stdin' via any method!

but this works:
ffmpeg -i input.mp4 -f rawvideo - | x264 --input-res 1080x1080 -o output.mp4 -

Then adding the C program in the middle produces a segfault which tells me that the file format isn't what's expected...

Christopher Wellons

The "rawvideo" format is neither PPM nor YUV4MPEG. It's just a raw dump of video data and may not even include the video dimensions or pixel format. This definitely won't work with the provided C program. For decoding, you should use the command I provided in the article so that you specifically get a PPM stream.

However, it looks like the version of libavformat in Linux Mint 18.1 (libavformat 2.8.11, derived from Ubuntu 16.04) doesn't support PPM input, so you'll need to throw ppmtoy4m (from mjpegtools) in front to first encode it to Y4M. I've added a note to the article about this.
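
For reference, PPM frames are self-describing — each one starts with its own dimensions — which is exactly what the raw dump lacks. A minimal header read (a sketch, not the article's actual parser) looks roughly like:

#include <stdio.h>

/* Read one binary PPM ("P6") header from a stream. Unlike ffmpeg's
 * "rawvideo" output, every PPM frame carries its own width, height,
 * and depth. */
static int
ppm_header(FILE *in, long *width, long *height)
{
    long maxval;
    if (fscanf(in, "P6 %ld %ld %ld", width, height, &maxval) != 3)
        return 0;              /* not a PPM stream */
    return fgetc(in) != EOF;   /* consume the single whitespace byte */
}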

Murielle Bolle

Fantastic, works now!


Integer Overflow into Information Disclosure

possiblywrong

"The frightening takeaway is that this check is very easy to forget." Very interesting post. And the check was indeed apparently easy enough to forget that I wasn't able to find the cause of the problem, even when you put it right under my nose in the article.

Christopher Wellons

Being so easy to miss, I've probably left dozens of unchecked potential integer overflows throughout my C career so far. My primary saving grace is that I've mostly developed for 64-bit platforms, and so all the exploitable arithmetic has been performed as 64-bit integers (size_t), generally beyond the range of untrusted input.

Clément Pit-Claudel

Doesn't the fix introduce a new potential vulnerability? You don't seem to check that nmemb is greater than 0 anywhere, and nmemb seems to be taken directly from the image height (which is attacker-controlled), so there's a risk of dividing by 0, which is undefined behavior.

IOW, couldn't a malicious user cause your program to run into undefined behavior by crafting a zero-width image and thereby making your safety check divide by 0? (Real callocs actually do seem to check that nmemb != 0 — see e.g. https://code.woboq.org/user...)

Christopher Wellons

Whoops, you're exactly right! I neglected to include the first part of the overflow check. I've corrected the article, thanks!
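
For anyone following along, the combined guard being discussed looks roughly like this (a sketch, not the corrected article verbatim):

#include <stdint.h>
#include <stdlib.h>

/* calloc()-style overflow guard: refuse nmemb * size if it would
 * overflow size_t, checking nmemb != 0 first so the division can
 * never be by zero. */
static void *
checked_alloc(size_t nmemb, size_t size)
{
    if (nmemb && size > SIZE_MAX / nmemb)
        return NULL;  /* nmemb * size would overflow */
    return malloc(nmemb * size);
}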


Introducing the Pokerware Secure Passphrase Generator

Impossibly Stupid

Minor typo: "Suppose in step 2 you draw King of Hearts (KS" should clearly end with KH.

Otherwise, I always like seeing things mapped to word lists (I did it for locations on my own blog). Doing it for cards raises another interesting possibility: cheating! With the right tools (or just an astounding memory) two players in cahoots could use plain English to communicate their hand's value.

Christopher Wellons

Good catch, thanks! It's now fixed.

That's a neat idea about cheating. As you point out, that sure would be quite the memorization job. Reliably memorizing 10 random words from this list for a strong passphrase takes some effort. Memorizing thousands of these words along with their mappings would be a serious undertaking.

Impossibly Stupid

Keep in mind, though (and as I note in my discussion about creating a location phrase), once you set out to specifically make a "random" selection more meaningful, you would likely pick different encoding techniques and curate a different word list in order to more efficiently convey the info that matters. Especially if you want to slip it into the regular banter that might occur during a card game.

It also helps that dealt order doesn't generally matter for most card games, so you're working with combinations rather than permutations. And the "garbage" cards in a hand can probably be left out. Maybe 16 bits of information worth conveying in total.

Anon

But why?

Aaron Toponce

Hmm. Using playing cards as an RNG is an interesting idea. Unfortunately, as an RNG, it's very prone to error, both from the deck of cards itself and from human error.

First, you mention on your GitHub page that six to seven riffle shuffles is sufficient, and you cite https://possiblywrong.wordp... as a reference. However, it turns out that you need about 10-12 riffle shuffles before you've maximized the entropy in the deck.

Brad Mann, a mathematician at Harvard, shows mathematically why you need to get up to 11-12 riffle shuffles before you can say that a deck is thoroughly shuffled enough that each of the 52 cards is fully unpredictable. PDF here: http://www.dartmouth.edu/~c...

An answer on StackExchange also shows that by mimicking a riffle shuffle in software, and calculating the entropy of the deck after each shuffle, entropy isn't maximized until about 10-11 riffle shuffles. https://stats.stackexchange...

The problem with riffle shuffling, of course, is that paper-based cards get sticky after heavy use, and cards stick to each other during the riffle shuffle, lowering entropy during each shuffle. High quality plastic cards, like those from Kem or Copag, will not stick together, and thus keep the entropy maximized during each riffle shuffle.

As a human, you can also help maximize the entropy during each round by doing:

1. Riffle shuffle
2. Cut
3. 4-pile shuffle
4. Cut

You would still need to do 10-12 rounds, but at least then you can guarantee that by the end of step 3 in each round, every card has been separated from each other by a distance of 4. The problem with this approach, however, is that it's slow.

Personally, when I'm playing card games with my friends or family, much to their chagrin, I:

1. Riffle shuffle
2. Cut
3. Overhand shuffle
4. Cut
5. Weave shuffle
6. Cut
7. 4-Pile shuffle
8. Cut

I'll do this three times, at which point the deck has gone through 12 shuffles and cuts, and I've guaranteed that each card has been separated from the others. The riffle shuffle and overhand shuffles are providing me entropy, while the weave and pile shuffles are deterministic, but guaranteed to be separating cards.

Aside from execution time (it takes a few minutes), I've never had anyone complain about the deck shuffle during a game of Uno or Go Fish, for example. The deck is sufficiently shuffled that players aren't seeing repeats of the past game or round, or some combination of it.

That's why dice are so attractive as an RNG. No shuffling or cutting, and no waiting around. Just throw the dice, and see what falls out. Write down the response, and throw them again. The problem with dice as an RNG, however, is that they can be biased just due to poor manufacturing quality.

You can purchase precision dice, with either razor edges or floating edges, but they're pricey. Even then, it's difficult to guarantee that the surrounding environment isn't affecting the throw, like an unlevel table. Regardless, they provide fewer control variables than a deck of playing cards, and are less likely to exhibit bias in the system.

Christopher Wellons

Thanks for the additional references. I'll have to check them out.

In this application, the only thing that matters is that an attacker doesn't have sufficient information to take shortcuts in a brute force search. If it is true that 7 riffle shuffles has a slightly uneven result, it won't practically matter unless the attacker has a really solid idea of the original card permutation, and probably also a decent model of your personal shuffling quirks. It's likely the cards were last used to play a game, and therefore the deck started in a nearly shuffled state. A notable exception is solitaire, which leaves the deck sorted at the end of the game.

Second, only a small amount of entropy (about 5.5% of the total) is drawn from the deck for each word, and additional shuffles are made in between words, introducing even more entropy back into the deck. Even if the first word is drawn from a not quite perfectly shuffled deck, the deck will certainly be thoroughly shuffled for the remaining words.
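
For scale, a back-of-the-envelope check:

\[
\log_2 52! \approx 225.6 \text{ bits}, \qquad 0.055 \times 225.6 \approx 12.4 \text{ bits per word},
\]

which is consistent with drawing each word from a list of a few thousand entries (2^12.4 ≈ 5400).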

Aaron Toponce
> If it is true that 7 riffle shuffles has a slightly uneven result, it won't practically matter unless the attacker has a really solid idea of the original card permutation, and probably also a decent model of your personal shuffling quirks.

Playing cards maintain state, so unless you sort them in order, or sufficiently shuffle them, leaving them lying around will reveal that state, and could potentially expose your previously generated passwords.

Dice do not maintain state, so there is no indication of what past passwords were generated just by examining the dice. You could potentially get one word from the current state the dice are in on the table, but that's as far back as they can take you. Picking them up, and dropping them in a cup would remove that possibility.

> Even if the first word is drawn from a not quite perfectly shuffled deck, the deck will certainly be thoroughly shuffled for the remaining words.

This is a bias that you do not have with fair dice, and why using playing cards as an RNG is less-than-optimal. If you are not re-inserting the cards after every draw, then not every word is equally likely to be generated, and you've introduced a bias into the system.

Aside from the execution speed, which is horribly slow, you have to deal with these biases, or you weaken your result, as the words are not chosen truly randomly. Fair dice are fast and unbiased, both of which are extremely difficult to achieve with playing cards.

It's an interesting idea, but one that doesn't stack up under scrutiny, and it is cumbersome enough to prevent the general public from adopting it.

Penguin Cardsharp

Could you deal with the state issue by washing the cards for a minute before you start generating a passphrase, following the Pokerware procedure, and then washing the cards for another minute after generating a passphrase but before you put them back in the deck box?


A Tutorial on Portable Makefiles

Andrew

Two small nits:

1. "Fortunately there is, in the form of inteference rules" should be "... inference rules".

2. "A target is out-of-date if it is older than any of its prerequisites" is ambiguous (English can't make up its mind about distributing comparisons over quantifiers, so this can also be read as saying that the target has to be older than *all* of its prereqs). I would suggest "a target is out of date if any of its prerequisites are newer than it is" for safety.

Andrew

Sorry, forgot to say: thanks for a useful article!

Christopher Wellons

Thanks for the tips, Andrew! I've corrected the "interference" mistake (how silly of me), but I still prefer my original wording for your second suggestion.

Lucas

First of all, kudos for being one of the few people online who post about POSIX make, breaking the GNU-makefile tendency that reigns.

There's a detail about .POSIX and the usage you show: it should be the first non-comment line of the file, and not the first target. Otherwise, the implementation could do whatever it wants to.

Christopher Wellons

Oops, you're right! I got the .POSIX situation mixed up in my head at some point. The article has been corrected. Thanks!

Konstantin

Thank you! Because of your posts I want to write in C again!

Christopher Wellons

Perfect! My secret plan of writing lots of articles about C and its ecosystem in order to increase its popularity must be working. :-)

agumonkey

You have a talent for explanation. It just flows. And includes Hedberg jokes. What else could we want ..

Christopher Wellons

Thanks, agumonkey! You being such a long-time reader is a compliment on its own.

Ralph Corderoy

Hi Christopher, The common target for running tests is ‘check’ rather than ‘test’. Cheers, Ralph.

Christopher Wellons

You're right. I've seen both, so I added it to the article in addition to "test".

Ralph Corderoy

Thanks. I think ‘check’ is more common, probably because ‘test’ could be the name of a target, especially test(1). https://www.gnu.org/prep/st... says ‘check’ and it's then been popularised by the GNU autotools so ‘./configure && make all check install’ is common.

Joshua Barnett

So when writing a portable makefile, should you worry about checking and resolving dependencies required in order for your build to run (a compiler, an architecture, an OS, shared libraries, etc.)?

Should these be resolved out of Make in some prior step to configure the project?

Christopher Wellons

I've had the best results on small and medium projects from doing two things. First, I consider how much I *really* need to rely on some non-standard feature of the compiler, operating system, etc. Most of the time I don't think it's worth the cost. Anything non-standard that can be eliminated is another feature that doesn't need a configure-time / compile-time test.

A perfect example is byte order. Rather than test for the architecture's endianness in order to select an integer parsing implementation, just write clean, architecture-independent code. Modern compilers will figure it out and do the efficient thing anyway (see my "Portable Structure Access with Member Offset Constants" article for a hands-on example).
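
To make that concrete, the kind of code I mean (a sketch, not that article's exact example):

#include <stdint.h>

/* Byte-order-independent little-endian load: no endian detection
 * needed, and modern compilers reduce it to a plain 32-bit load
 * (plus a byte swap on big-endian targets). */
static uint32_t
load_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] <<  0 | (uint32_t)p[1] <<  8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}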

Conditionals in Makefiles are nearly always a mess, and it's very difficult to reason about the effects on the build. I've had a lot more success pushing the conditions down into the C preprocessor. Of course there's definitely a wrong way to do this, and you can just as easily end up with the same mess in the preprocessor. Make it a hard rule to *never* use an "#ifdef" within a function body, and avoid testing the same value more than once (e.g. do it all in one place).

Instead invent an abstracted API, and then use a single, large "#ifdef" to select the appropriate implementation of that API based on the operating system, configuration, etc. There's an example of this in my "OpenMP and pwrite()" article for the write_frame() abstraction. Do it well and the compiler will inline much of it, so that it's no less efficient than the traditional within-function "#ifdef" spaghetti, while still clean and readable.
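
In sketch form, with made-up names, that shape is:

/* One tiny abstraction, one #ifdef selecting a whole implementation,
 * never an #ifdef inside a function body. */
#if defined(_WIN32)
#include <windows.h>

static void
my_sleep_ms(unsigned ms)
{
    Sleep(ms);
}

#else  /* assume POSIX */
#include <time.h>

static void
my_sleep_ms(unsigned ms)
{
    struct timespec ts = {ms / 1000, (ms % 1000) * 1000000L};
    nanosleep(&ts, 0);
}
#endif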

If it's a configuration option for the user, have it configured via a documented "-D" option in CFLAGS. For an example of this, see "config.h" in my Enchive project. If the user doesn't explicitly set an option, it gets a default value in "config.h", and the rest of the program can assume it's always set. The user's interface looks like:

make CFLAGS='-O3 -DMAX_FOO=64'

This sort of thing composes well with the "-MM" feature for those projects large enough to require automated dependency management. The preprocessor will be doing the same work during dependency discovery.
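
The config.h half of that interface is tiny. With a made-up option it looks roughly like:

/* config.h -- sketch with a made-up option. The user may override it
 * with -DMAX_FOO=... in CFLAGS; otherwise this default applies, and
 * the rest of the program can assume MAX_FOO is always defined. */
#ifndef MAX_FOO
#  define MAX_FOO 16
#endif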

I should just write an entire article on this since it's hard to describe it properly here.

For a large project where the selection of libraries to be linked is configurable (and should even be discovered automatically), this all wouldn't work well. The user would have to coordinate LDLIBS with CFLAGS, which is far too complicated an interface. That's where you *really* need a layer above make to generate a configured Makefile — the "prior step" as you said. I'm a fan of hand-written configure scripts, but something like CMake would be reasonable, too. (I just hate Autoconf.) This layer would also run the "-MM" stuff.

int19h

Wouldn't "del /q" be the equivalent to "rm -f" on Windows?

Christopher Wellons

Hmm, I thought there was something about "del /q" that didn't work quite right, but I can't remember what that would have been. It complains when the files to be deleted don't exist, but that fact doesn't change the errorlevel and therefore doesn't communicate a failure to make. Maybe that's the part that threw me off. So, except for the pointless error message from del — which can be silenced with a stderr redirection — I guess it's essentially the same as "rm -f". Thanks for pointing that out!

Vincent Picaud

As usual high quality posts... Thanks for sharing.

James Wright

To avoid having the wrong make run your makefile, they should be named GNUmakefile or BSDmakefile.

Adam Chambost

Fantastic article Chris!

Your paragraph "Out-of-source builds" touches on an area of makefiles where I've yet to find a solution that is simple, concise, DRY, and portable.

The question I have for you is whether you think "immediate evaluation syntax" (::= or :=) is standard POSIX syntax.

Having read and re-read the 2016 Edition of the standard you link to [1], I'm fairly convinced the answer is "no, this syntax is (not yet) in the POSIX standard".

Yet an old article by David A. Wheeler [2] links to the POSIX authors' (the Austin Group) discussion forum [3] and claims that "I am happy to report that the POSIX committee accepted this proposal and has added support for immediate evaluation." The discussion timestamps are from 2011 and the article is from 2014.

The only conclusion I can make from these three documents is that while this was proposed and positively received back then, it ultimately never made it into the 2016 Edition of the standard.

Does this sound right to you? What are your thoughts on "immediate evaluation"?

I run my makefile against GNU and BSD make variants and both seem to support it (at least the := version, ::= seems to be problematic if I recall correctly). Personally I'm tending to adopt a custom POSIX+ImmediateEvaluation standard for my own Portable makefiles. In theory this may make them non-standard, but in practice it does not seem to impact their portability in all known test cases available to me. (The only portability concern I am aware of is that := is incompatible with SunOS make).

This discussion relates to my own "Out-of-source builds" solution. If you would like me to post very concise makefile examples with and without immediate evaluation that models this solution please let me know and I can provide them.

[1] http://pubs.opengroup.org/o...
[2] https://www.dwheeler.com/es...
[3] http://austingroupbugs.net/...

Christopher Wellons

Thanks, Adam. Yeah, I _really_ wish inference rules were just a little more capable so that they could cover out-of-source builds. Regarding your final sentence, where would you post these Makefiles? Have you written an article about this? I'm interested in seeing how you approach portable out-of-source builds.

That's a good question about immediate evaluation. I'm also not seeing it anywhere in the Open Group document. Adding another source to your list, the GNU Make manual claims it was standardized by POSIX in 2012. Yeah, I agree that it seems to have not actually made it into the standard yet.

There are two implementations I know about that don't support immediate evaluation with "::=". One is Microsoft's nmake — but that's got a bunch of bigger issues anyway. The other is Solaris / OmniOS, which is probably what you've seen with SunOS. I like to test stuff on OmniOS since it's the "weirdest" unix available to me. (It's funny that being closer to the original unix actually makes it look less like unix since I'm so accustomed to Linux and BSD.)

So while not yet formally standardized, as you noticed it seems like a fairly safe bet, especially since it's likely to be standardized eventually. In practice it's very portable, and that's what ultimately counts. You're not tying yourself to GNU Make by using it.

Yannick Duchêne

About `rm -rf` on Windows: MS-DOS has the `deltree /y` command.

Lifepillar

>When talking about files in subdirectories, just include the subdirectory in the name.

According to the standard: “Applications shall select target names from the set of characters consisting solely of periods, underscores, digits, and alphabetics from the portable character set […]”. It doesn't mention slashes. So I wonder: are paths in targets POSIX compliant?

Christopher Wellons

Good observation! I hadn't noticed that. I suppose that does mean using a path as a target isn't compliant. That's a really unfortunate detail in the spec since it means only recursive Makefiles can build subdirectory targets. Though, fortunately, every implementation I've ever seen supports slashes in targets.

Lifepillar

I find that quite surprising, as inference rules apparently do support paths (there are 'D' and 'F' modifiers for internal macros). It looks like an unnecessary restriction to me.


Vim vs. Emacs: The Working Directory

deathbullet

Where else would being aware of the "project root" be useful? I think with time it is losing its relevance.

Jan Ciger

Many places - e.g. search, revision control, debugging, etc.

However, a much better solution than these hacks is to use something like Projectile and cpputils-cmake to set up your environment. Then you don't need to bother with it - Projectile will correctly find the root of your project and a lot of tools integrate with it automatically.

Clément Pit-Claudel

I use the following instead:


(defun ~/compile ()
  "Same as `compile', with bells and whistles.
Add -C if command is make and a Makefile can be found, and always
run in comint mode. Written by Clément Pit-Claudel."
  (interactive)
  (let* ((command (eval compile-command))
         (makefile-dir (locate-dominating-file default-directory "Makefile"))
         (is-default-dir (or (null makefile-dir)
                             (string= (expand-file-name default-directory)
                                      (expand-file-name makefile-dir)))))
    (when (string-match "make \\(?:-C \\([^'\" ]+\\|\"[^\"]+\"\\|'[^']+'\\)\\)?" command)
      (setq command (replace-match (if (not is-default-dir)
                                       (format "make -C %S -j8 " makefile-dir)
                                     "make -j8 ")
                                   nil t command)))
    (let ((current-prefix-arg (cons 4 nil))
          (compile-command command))
      (call-interactively #'compile))))

The difference is that it doesn't change the working directory: instead, it gives make an appropriate -C argument (it also adds -j8, for convenience, and it runs compilation in shell mode to facilitate interaction).

Christopher Wellons

If you want to automatically choose the proper value for -j, I have an article about finding the number of processor cores in Emacs across different operating systems: http://nullprogram.com/blog...

jonEbird

If you haven't already checked out projectile[1], do it. There is a nice function `(projectile-project-root)` that works to find the root of your particular project for many project types. That is what I use to set `default-directory` in a let statement.

[1]: https://github.com/bbatsov/...

Christopher Wellons

I actually had the Projectile repository open in my browser while writing the article, but I ultimately didn't mention it since I've never actually gotten around to using it. As I understand it, it really is solving the same problem in a really smart way.

Nick F

Great comparison of the trade offs that each editor makes. I love the insight, hope to see more posts like this.

VanLaser

Interesting comparison (and I did find that automatic path change annoying when moving from Vim to Emacs). I wonder if a solution would be to use either `locate-dominating-file` or `projectile-project-root` (or both: if the 2nd fails, use the 1st) with a "new buffer" hook to set up the `default-directory` buffer-local variable?

jonEbird

I personally would just use (projectile-project-root) and configure it to support any special cases I might have with my projects. Easier path for maintainability, I'd think.

Oleksandr Gavenko

I've written "mymake", that looks directories up for build.xml, Makefile, build.gradle etc and run corresponding build tool. I have (setq compile-command "mymake ") and In order to move to error I need to do M-x cd sometimes...

Christopher Wellons

Looks like we've ended up on similar solutions. I've got an "smake" (search make?) script to do the same thing as your "mymake", though mine just looks for a Makefile:

https://github.com/skeeto/d...


Ten Years of Blogging

Vincent Bernat

Thanks for your blog!

I liked the part on the impact of technical blogging. The figures may miss some visitors like me, using the RSS feed and/or using an adblocker that prevents Google Analytics from working.

Christopher Wellons

I completely block Google Analytics, too. Since that's easy to do, I don't feel too bad about using it here. For feed analytics there's Feedburner, but I hate pretty much everything about Feedburner.

deathbullet

Thank you for the blog! (and for elfeed which I use to read the blog!)
This is my favorite place after lwn.net

Christopher Wellons

Elfeed primarily exists to scratch my own itch, but the secondary reason is that I _really_ want syndication feeds to continue being popular. They're vital to a properly decentralized internet. Some of the huge platforms (Facebook, Twitter, Google+) that can afford to operate as walled gardens either don't support feeds or eventually stopped supporting feeds, but the vast majority of sites publishing content are still expected to provide either RSS or Atom feeds, and I want to keep it that way.

NoonianAtall

The more I put into customizing Elfeed, the more I enjoy it. Thank you very much for making and supporting it!

Christopher Wellons

You're welcome! And thank you for all your efforts within the Emacs community, too, Adam.

compunaut

I learned a lot reading your blog! Somehow I read this post and thought: What if he stops blogging and this is some kind of goodbye message? Thankfully not! I will follow your future posts as I did in the past, and I'm quite happy that you are planning to continue blogging for such a long time!

A thankful reader.

Klaus

Isn't this discussion mixing up the concepts of "magazines" and "journals"? (Though I had to look up how the terms are formally used to even find the right words for the distinction, so please bear with me if the choice of words is bad.)

Your articles are interesting even for non-specialists like me (physicist with some education in computer science + private interest), but they are typically not fulfilling the novelty criteria expected of academic journals.

Additionally, formal publications are expected to be subject to long-term archiving. For scientific publications, it is desirable for original sources to still be available in a century; in physics I had several cases where I had to dig up original publications from the 1930s, and routinely reference original sources from the 1960s. A friend working in paleontology routinely works with sources from the 19th century, as far as I understood. On such time scales, a blog cannot be reliably kept available.

I would, however, agree that journals should incorporate some aspects of blogs; then "open xxx" things are mostly doable without dropping the advantages of journals, though there is probably a lot of conflict-of-interest stuff going on in that regard.

Christopher Wellons

I'm still new to the journal/magazine thing, so I could indeed be mixing them up. You're right that my content here isn't the same as what you'd find in a journal, but content that _would_ normally be published in a journal could be published in blog form with only some minor structural changes for the different format.

For example, suppose the ROP paper I linked — which did go through the whole journalistic peer review process — had been published as, say, a three-part series of blog posts here. It would have gotten a lot more exposure and at least an order of magnitude more people would have read it, on top of the other benefits I mentioned. Unfortunately since blogs aren't taken seriously, it would never be cited by anyone following the traditional route, only by other bloggers. That's a cultural issue rather than an issue inherent with blogs.

I've seen the archive argument before, and that is a really good point. Lots of stuff disappears from the internet every day, and links go dead far too often. I've probably got hundreds of dead links throughout my older articles, and the only place left to access the original content is on the Internet Archive, if they happened to catch it.

This is where open access on par with open source really pays off. Since my entire blog is in a git repository, it's ready to clone at a moment's notice. The content here has really good prospects for longevity. This is very much like open source software. While important websites do occasionally disappear without any sort of backup, it's basically unheard of for an open source project to disappear from the internet. Open source software with at least a handful of users is archived in many different places, such as by various OS distributions.

While journals are good at archiving articles, they're absolutely terrible at archiving data. It basically doesn't happen. That's why so many researchers have to scrape data from graphs/plots. Even for recently published articles, most of the time it's not possible to get the data (they almost never even respond to my emails). If scientific publishing followed the open source blogging model, the article and the associated scripts and data would be archived together as a single unit. Massive datasets (hundreds of GBs or more) still present a problem, though.

I agree with your last point. Authors wouldn't have to switch to blogging if journals became a lot more like blogs. :-)

Impossibly Stupid
URLs work better when they include human-meaningful context. Ideally I should be able to look at any one of my URLs and know what it’s about.

On this matter, you could do what I did to maintain backwards compatibility for my own blog: add a meaningful query string to clarify things. For most server side processing, it'll just be ignored if it isn't being used. For example:

What’s All This, Then?

Christopher Wellons

That's an interesting idea with the query string. I'll have to consider it.

One problem is that it's technically a different URL regardless of whether the server here happens to respond the same way. Other sites (reddit, etc.) can't know this and will treat the query URLs as if they were different articles from the query-less ones. On the other hand, fragments don't truly constitute a fundamentally different URL, so maybe that's a better way to do it.

Vincent Bernat

You can use link rel="canonical" to work around that. However, I don't know if reddit would understand it. Google does.

Impossibly Stupid

Keep in mind, though, that the fragment is something that the browser may act on by default. There probably won't be much chance of collision with element id's for the particular case of acting as a stand in for the page title, but the query string is more flexible for all kinds of other things (especially if you do want to do some JavaScript processing on the client side).

I guess it all comes down to who you want to annotate it for. I do it mainly to remind myself of what I'm linking to, and secondarily to give people an idea of what they're clicking through to. Deduplication isn't a big priority for me; I just never publish the URL without the query string.

Kaushal Modi
URLs are forever, and I don’t want to break all my old links. Consistency is better than any sort of correction.

I use Hugo (gohugo.io) as my static site generator. It has a feature called Aliases that allows setting multiple links (aliases) to point to the same post, and this can be easily scripted too. I won't be surprised if other static site generators support this too.

And keep your blog posts coming! They are very informative, whether they appeal to me (the post on writing your own minor mode) or not (the recent post campaigning against multiple cursors) :)

NoonianAtall

Well, gee, Chris, I was about to say, "Thanks, and keep up the great work!" but then you went and PROMISED to keep doing it for another DECADE. So what do I need to be nice for? :)


Gap Buffers Are Not Optimized for Multiple Cursors

Theldoria

It would be nice to have the ability in multiple-cursors to insert multiple cursors (I often place them by lines, by the same word/symbol, and manually), record a macro for the work to do, and apply the macro to all cursor positions.

This, of course, can easily be done by first recording the macro, then setting multiple cursors and applying it with C-x e.

Christopher Wellons

The typical way to handle this is to end the macro with a search (or just a movement) to the next position, allowing you to repeat the macro N times in a row without any intervention. If the movement fails cleanly after the last position, then you can safely overestimate N and the macro will stop firing when it runs out of things to do.

Aankhen

Do you work with a lot of large files? I never find Emacs to be the bottleneck in any sort of editing, whether using multiple cursors or not, but I also don’t work with a lot of large files, so perhaps that’s why our experience is different.

For what it’s worth, I normally use multiple cursors as an alternative to regexp search & replace that lets me think less.

Christopher Wellons

I occasionally use macros to uniformly modify large files with complex enough transforms that the other options either aren't viable (unix text utilities) or would take a lot longer (writing a script). For instance, when I made my r/Showerthoughts fortune file:

http://nullprogram.com/blog...

Ultimately I did end up writing some Elisp since I was expecting to process more of the same data in the future, but I initially did the work with a macro. Processing those 10,000 entries takes Emacs several minutes, and that's in the gap-buffer-efficient script/macro form. It would have been infeasible to do interactively with multiple cursors.

Aankhen

I see, that’s interesting. Thanks for the example.

Mike Zamansky

This is great!!! Now I can just point people here when I try to explain why I don't use multiple cursors much, if at all.

Christopher Wellons

The "You don’t need more than one cursor in vim" article I linked has been the article I've shared in the past when making this argument. That one probably works better for Vim folks since gap buffer argument doesn't apply to them.

Mike F

Good to see some commentary in the direction of less is more. You don't need a 100,000-piece mechanic's tool kit to do most home repairs. Most of the time a screwdriver, hammer, and pair of pliers will do. In the same way, many of the Emacs add-on packages (no names) can be overcomplicated (complicated to learn, use, and maintain) for what is typically needed to develop code. Maybe the principle of diminishing returns applies here somewhere.

Many of the same packages are very interesting and fantastic accomplishments by smart and generous people, to whom I am grateful for showing us what is possible, for sharing their work freely, and sometimes for personally supporting my use of their work for free.

Emacs user (elisp hacker) since 2011.

Alex V Koval

I get your points about efficiency, but in real life only a small fraction of situations are CPU-intensive, and what users need is a visual picture. With multiple cursors you get immediate feedback on screen, and it is better because of that.

agumonkey

I like multiple cursors a lot, even though I agree they mesh badly with gap buffers and have UI flaws. But I'd love for more ideas around the concept to pop up.

Reificator

Implementation A is obviously superior because Editor B uses it internally. Feature C does not mesh well with Implementation A, and therefore you should not use it.

Is that an accurate summary of this article?

Christopher Wellons

I didn't say gap buffers were "obviously superior" but that, despite being dead simple, they've been perfectly sufficient for 30 years, especially when used with long-established editing paradigms. However, multiple cursors—a relatively recent invention—interacts poorly with gap buffers, making this young feature even less valuable than it already is.

Reificator

Fair enough, that phrase was unfair on my part.

Still, an up-and-coming feature that some users seem to enjoy not meshing well with a particular implementation disproves the idea that the implementation is perfectly sufficient.

I take no issue with the rest of the article, as I don't have a lot of experience with macro based workflows, and I've seen people be quite effective with them.

But starting the article by saying that this implementation is perfectly sufficient, as a segue into why you shouldn't use a feature that it doesn't handle well, seems to me like putting the cart before the horse.

AreaMan

We can write a program (a macro) to process the data more efficiently. But writing and testing and documenting the program take time too. I find that after running a macro I need to inspect each spot visually and often run a diff of before and after. Letting macros run without inspection is no excuse for introducing bugs into source code.

Or we can use the new tools to get the same work done with less human effort. I might still need to inspect and diff the source code. But the simpler original task should reduce the incidence and effect of subtle macro bugs.

Longer running editor tasks will eventually provide sufficient psychological pressure on a coder somewhere to crack open a text on algorithms and data structures and figure out a new implementation that works better than gap buffers. Eventually the additional human effort will reduce the cost in both human and computer terms, if properly amortized over the many programmers using the improved editors.

Machines should work for us. We should not work too much for machines.

Besides, computer hardware is astonishingly cheap.

Stu

Interesting - it sounds like the underlying editor should be built on multiple cursors. These all seem to be side effects of multiple cursors being bolted on after the fact.

Christopher Wellons

Exactly!

Evgeny Panasyuk

1) Gap buffers are only one small part of the underlying editor.
2) With today's memory throughput (RAM: on the order of 10 GiB/s for a single core; cache: on the order of 100 GiB/s), sequential multi-tossing of a typical several-MiB gap buffer is not a bottleneck. It could be a bottleneck only for very large files.

Stu

This may be true; however, having an architecture based on multiple cursors from the very bottom could well open up different paradigms further up the stack, so it could be interesting nonetheless.

Evgeny Panasyuk

Yes, having them in mind from the beginning would result in better treatment of edge cases. This is also true for many other features Emacs has.

Nonetheless, the current implementation of multiple-cursors is pretty usable, and does provide practical advantages over keyboard macros in a range of use cases. Yes, there is some overlap, but each of them also has unique advantages. And by the way, there is also some overlap with visual-regexp.el, and this is OK; I use vr/* too.

Evgeny Panasyuk

I use both multiple cursors and keyboard macros. The difference is not only aesthetic, but practical as well.

With multiple cursors you have instant feedback on multiple edge cases, and more importantly, you can undo and fix a mistake immediately. With keyboard macros you have to either redo everything from scratch, edit the macro as text, or execute it stepwise, and none of these is as convenient as an instant multi-undo-fix.
Moreover, with multiple cursors the phases of selecting places and of actually editing are distinctly separated, while with keyboard macros you are doing both movement and editing in a single iteration, and it is possible to run into interference between place-movement and editing-movement.

On the other hand, keyboard macros are very powerful.
They are not limited to a "for each" workflow: the steps of an iteration can depend sequentially on one another.
They are also not limited to an "edit/manipulate multiple places" workflow. With them you can jump between buffers/windows/frames; for instance, I used this to emulate "debug step into" in source files based on extracting file:line from trace log output in a separate buffer.

Regarding performance: with Emacs I am usually working with source files, which are not big, and taking into account today's memory throughput (RAM: on the order of 10 GiB/s for a single core; cache: on the order of 100 GiB/s), sequential multi-tossing of a several-MiB gap buffer is not a bottleneck. Even if it were a real bottleneck, that would mean the bottleneck should be fixed, not that multiple cursors should be abandoned (for instance, it could be fixed by switching to another data structure such as a multi-gap buffer or a rope, or maybe by postponing execution for off-screen cursors).
That being said, I have to admit that in some cases Emacs is not fast at movement and editing; maybe it is related to the complexity of some major modes or some other under-the-hood stuff, and this affects both keyboard macros and multiple cursors. And here, what really matters performance-wise is that with keyboard macros you get fire-and-drink-coffee batch execution, while with multiple cursors you have to suffer the multiplied latency of every single step. But multiple cursors are so convenient that I am ready to pay the occasional price of switching to keyboard macros or even to shell-command-on-region.

I understand it can be appealing to stick to a single tool and a single approach, but in my experience it is counterproductive to artificially limit yourself to a single tool and to look for contrived excuses not to use another one that is more appropriate for a particular problem.
I use and even combine Linux and Windows; statically typed and dynamically typed languages; scope-based lifetimes and garbage collection; Emacs and Vim.
I use and even combine keyboard macros and multiple cursors.


Blowpipe: a Blowfish-encrypted, Authenticated Pipe

possiblywrong

"In CTR mode, blowfish_decrypt() is never called." If I understand this correctly, does this mean decryption is just re-applying the same encryption transformation-- the same sequence of XOR operations, with the same initialization, same incrementing counter, etc.?

Christopher Wellons

Yup, exactly right. CTR mode turns a block cipher into a stream cipher, which itself is a simulation of a one-time pad. Or another way to think about it is that it turns a block cipher into a CSPRNG. It almost seems wasteful to throw away literally half the block cipher algorithm!
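A minimal sketch of the idea; the blowfish_encrypt() prototype here is hypothetical, standing in for whatever Blowpipe actually uses, and the point is just that both directions only ever run the forward (encrypt) half of the cipher:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit block encryption (Blowfish has a 64-bit block). */
uint64_t blowfish_encrypt(const void *key, uint64_t block);

/* CTR mode: encrypt a counter to make a keystream, then XOR it with the
 * message. Running the same routine again undoes the XOR, so one function
 * serves as both encryption and decryption. */
void ctr_xor(const void *key, uint64_t counter, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i += 8) {
        uint64_t ks = blowfish_encrypt(key, counter++);
        for (size_t j = 0; j < 8 && i + j < len; j++)
            buf[i + j] ^= (uint8_t)(ks >> (8 * j));
    }
}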


Finding the Best 64-bit Simulation PRNG

Lugus

Just a typo, you write "Generic algorithms" but I think you mean "Genetic algorithm".

Christopher Wellons

Oops, thanks! You're right. It's fixed now.

possiblywrong

It would be interesting to see how xorshift1024* fares next to these others in your analysis; I'm not sure I agree with the xoroshiro128+ authors that "it [xoroshiro128+] is acceptable only for applications with a mild amount of parallelism" (I wouldn't call stepping on a 2^64 period, with a fast jump mode, "mild parallelism"). It seems fine at least for the Monte Carlo applications I typically encounter, and it seems more convenient than Mersenne Twister particularly in its ease of supporting parallel streams, which is quite a bit more awkward and less "standardized" in Twister.

Also curious that the speed difference between xoroshiro128+ and MT is so much greater here than in similar analysis referenced by the authors here. It's Mac and Swift instead of Linux and C, but the actual speed numbers seem otherwise pretty comparable.

Christopher Wellons

I just added xorshift1024* to the shootout. On the same machine and setup, I've measured it at 5620 MB/s (GCC) and 4090 MB/s (Clang). That puts it in third place for GCC and fifth place for Clang (behind both PCGs). It got 5 WEAK results and 0 FAILED.

I've completely ignored xorshift1024* in the past because of its large, heterogeneous state, and because it's cumbersome to seed. I figured I might as well use MT if I was actually considering xorshift1024*. However, it _is_ faster than I expected.

The speed difference between MT and xoroshiro128+ may be due to my aggressive unrolling and inlining. I can't find any code for their benchmark other than snippets in their paper, let alone see how they built it. Perhaps they put their PRNGs in separate compilation units, so they never got inlined. That would add function call overhead to each call and reduce the disparity.

It also depends a lot on the hardware. On an old Core 2 Duo (a 2009 laptop), the xorshift variants, xoroshiro128+, and both PCGs all run at essentially baseline speed in my shootout. MT is still only 1/3 the speed of xoroshiro128+.

Just to add some of my own wild speculating: MT's main generation algorithm has lots of dependencies between steps. Each shift+AND depends on the results of the previous shift+AND. That makes it harder to exploit the deeper and wider pipelines of more recent CPUs. In xoroshiro128+, s0 and s1 are loaded from the PRNG state and used more independently. Those operations can be re-ordered and even executed on different ports in parallel, especially when unrolled.
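For reference, here's xoroshiro128+ with its original 2016 constants (55, 14, 36); note how the loads of s0 and s1 and the output sum have few dependencies on the state update, unlike MT's chained shift+ANDs:

#include <stdint.h>

static uint64_t s[2]; /* seeded elsewhere; must not be all zero */

static uint64_t rotl(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

uint64_t xoroshiro128plus(void)
{
    uint64_t s0 = s[0];
    uint64_t s1 = s[1];
    uint64_t result = s0 + s1;   /* output computed independently of the update */
    s1 ^= s0;
    s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14);
    s[1] = rotl(s1, 36);
    return result;
}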

Aaron Toponce

You should try Threefish-1024. That gives you 16 consecutive non-overlapping 64-bit outputs, and it operates at about 8 cpb where Blowfish is around 20 cpb. I'm guessing it'll be a serious competitor in your shootout, although memorising the algorithm may not be feasible.

Pelle Evensen

If you have the same configuration available, it would be interesting to see what timings you'd get for my mixing function when used as a PRNG.
It's a relative of SplitMix but with much better behaviour w.r.t. sparse increments:
http://mostlymangling.blogs...

Christopher Wellons

I added rrmxmx() as a PRNG in the rrmxmx branch of the shootout repository. I randomly generated and hardcoded a gamma, though the specific value doesn't matter for benchmark purposes.

Using an identical setup as before, I'm getting 7100 MB/s, which is about 93% of the speed of splitmix64() (added after I wrote my article, so it's not in the table). That's about what I'd expect considering rrmxmx() replaces a shift with two rotates and an XOR (i.e. a net addition of two bitwise operations).

As I write this I'm also running it against dieharder just for completeness, but I'm not expecting to see any issues.

Pelle Evensen

I have exposed it to BigCrush of TestU01 (with a gamma of 1, multiple starting points) as well as PractRand, which it does fail after 2^44 bytes with a gamma of 1. As soon as the gamma gets less sparse (2 bits seems to be enough), it passes BigCrush and PractRand.

It might be of interest to note that SplitMix with sparse gammas fails BigCrush quite miserably. PractRand even more so. Sadly, when Steele et al tested SplitMix, they used the floating point interface of TestU01 which is quite flattering, if one expects to eventually use *all* the output bits.

I have a slightly different construction that (on my machine) is 5% faster than rrmxmx but which I have yet to put under the same scrutiny. More on that to follow. I am still pursuing a mixing function that passes PractRand with 1-bit gammas as well as with traversal in Gray code order. rrmxmx gets close but is not entirely there.

Jorge Fuentes González
Elliot

Blowfish is secure but slow. How about another block cipher? If it's good enough for military secrets, it's certainly good enough for a userspace PRNG.

AES-128 is extremely fast in hardware, and otherwise you have e.g. ChaCha20 and Speck. Speck has a shorter implementation than xorshift, and will absolutely beat the snot out of it regardless of any NSA backdoors. Furthermore, it can have an arbitrarily small or large state.

Christopher Wellons

My focus here was on simple, portable C code, but at one point I did compare AES-NI (CTR) with xoshiro256**: https://redd.it/ahbs47 The result was that AES-NI was ~50% slower than xoshiro256**. Considering that the quality is excellent, that's reasonably competitive.

Since I have my own software implementation of Speck128/128, I just added a "speck" branch to see how it competes. It's not competitive at all at the full 32 rounds: an order of magnitude slower (600 MB/s). Cutting to 8 rounds helps a lot (2400 MB/s), but it's still trailing behind: 3x slower than xoshiro256**. At 4 rounds it fails statistical tests.

I also just pushed a "chacha" branch with ChaCha8, and even at 8 rounds its also not meaningfully competitive (800 MB/s) with xoshiro256**.

Dr. Strangelove

Great blog post! Thank you for sharing the code for the shootout too; I was itching to try it out with a few lesser-known PRNGs! I cloned it down today and had no major problems getting it to run (except some MinGW shenanigans when using SIGALRM). Nice "final" choice, btw; xoshiro256** is a great PRNG. I have seen some discussion in Julia and rust-lang, and it seems there is a good chance we might be getting xoshiro in some form in both languages' standard libraries. Also, something cool:

I was thinking xoshiro256++ would be a great target for vectorization, and it seems I wasn't the only one that thought so: https://github.com/JuliaLan..., someone has already created a vectorized version of it. Performance is very good, and the assembly it spits out reads almost like poetry (on both gcc and clang). I implemented it for your test suite too, and comparing it to standard xoshiro gives me around twice the throughput (~5000 MB/s vs. almost 10000 MB/s on my testing system).

static void
xoshiro8x256pp(uint64_t s[4][8], uint64_t *r)
{
    uint64_t x[8], t[8]; /* Optimized for 8-way AVX/SSE instructions. */

    for (int i = 0; i < 8; ++i) x[i] = s[0][i] + s[3][i];
    for (int i = 0; i < 8; ++i) r[i] = ((x[i] << 23) | (x[i] >> 41)) + s[0][i];
    for (int i = 0; i < 8; ++i) t[i] = s[1][i] << 17;

    for (int i = 0; i < 8; ++i) s[2][i] ^= s[0][i];
    for (int i = 0; i < 8; ++i) s[3][i] ^= s[1][i];
    for (int i = 0; i < 8; ++i) s[1][i] ^= s[2][i];
    for (int i = 0; i < 8; ++i) s[0][i] ^= s[3][i];
    for (int i = 0; i < 8; ++i) s[2][i] ^= t[i];

    for (int i = 0; i < 8; ++i) s[3][i] = (s[3][i] << 45) | (s[3][i] >> 19);
}

I had to tweak the DEFINE_BENCH definition to consume eight uint64_t, but other than that, it was very straightforward to integrate into your shootout test suite. Some compilers are a bit stingy when vectorizing the loops of the code above (so take a peek at the assembly). I would recommend you pass in the "-ftree-vectorize" flag in GCC so that it does the right thing. For me 8-way vectorization gave me the best results (I assume this depends on your CPU architecture, so play around with that one).


A Branchless UTF-8 Decoder

Andrew

The self-synchronizing feature of UTF-8 isn't only good for error-recovery — it also means that functions like "strstr" work on UTF-8 without modification. As long as the needle and haystack are both valid UTF-8, strstr can't accidentally find the right sequence of bytes with the wrong alignment. It also can't fail to find the thing it's looking for because the prohibition on overlong forms guarantees a unique representation for each codepoint.

Of course, if you want "[LATIN CAPITAL LETTER A WITH ACUTE]" to match "[LATIN CAPITAL LETTER A][COMBINING ACUTE ACCENT]" and vice versa, you have to bust out the normalizer.
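A tiny demonstration of the strstr point (assuming the source file itself is saved as UTF-8, so the string literals below are UTF-8 byte sequences):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* "é" is the two-byte sequence 0xC3 0xA9 in UTF-8. */
    const char *haystack = "naïve café au lait";
    const char *needle = "café";

    /* Self-synchronization guarantees a match can't begin in the middle
     * of a multi-byte sequence, so byte-wise strstr() is correct here. */
    char *hit = strstr(haystack, needle);
    printf("%s\n", hit ? hit : "(not found)");
    return 0;
}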

Christopher Wellons

You're right. Since substring search is much more important than error recovery, I've updated the article to mention this. Thanks!

James Dunne

Have you considered taking this optimization into SIMD territory? SIMD code can often benefit from branchless algorithms. You could probably decode 4 groups of 4 characters simultaneously.

Christopher Wellons

That's an interesting idea to explore in the future.

Bryan Donlan

I was inspired by this article to give this a shot myself: https://github.com/bdonlan/... (see also the PEXT-based non-vector implementation in the previous commit)

There's probably still room for optimization - I'm doing a lot of loads of constant vectors for various utility masks and shuffle tables. It's interesting to note though that depending on CPU microarchitecture this approach can either be modestly faster than the non-vector implementation, or _very_ slow.

Christopher Wellons

Wow, nice work! That's impressive. And nice job commenting your assembly, too. I'm surprised how much the performance differs across microarchitectures.

Martin Dürst

Regarding SIMD, please make sure you check out the work of Robert D. Cameron (https://www.sfu.ca/computin.... He gave some very interesting talks, including on UTF-8 handling with SIMD, at one (actually two) of the last Internationalization and Unicode Conferences.

Christopher Wellons

Thanks for the heads up!

Tmr

This is pretty neat, but as you note, it's just a mental exercise. I've been spending the past week profiling and optimizing some big data processing pipelines, and UTF-8 encoding/decoding is never the bottleneck. You could make it 100x faster, and I would never notice.

It would be cool if you could somehow apply this obsessiveness to some software that needs it. There's lots of software out there that really needs it. ;-)

Christopher Wellons

Yup, I'd be surprised if UTF-8 was ever a bottleneck in practice. Especially a strict, error-checking version.

However, I _can_ imagine a situation where a text editor might have a giant buffer of UTF-8 text in memory and needs to traverse or operate on a large contiguous section of it. On the other hand, it's likely the buffer is assumed to be in a valid state and better shortcuts are available.

Guy Shaw

On modern architectures memory references are expensive. Many optimization techniques that involve table lookup and that were great advice in the 1960s, 1970s, and 1980s -- such as table lookup for trig functions -- are now pretty much obsolete. Replacing some or all of the table lookups with calculations, or with small tables kept in registers, could speed things up considerably. Also, not all if-then-else decisions coded in C lead to branches. Even on x86 hardware, some simple decisions can generate conditional moves instead of branches. And modern architectures can do much more with predication and nullification.
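A small illustration of the conditional-move point (a sketch; whether cmov is actually emitted depends on the compiler, target, and optimization level):

/* A simple ternary like this typically compiles to a branchless
 * conditional move (cmov) on x86-64 at -O2, rather than a branch. */
int clamp_min(int x, int lo)
{
    return x < lo ? lo : x;
}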

Christopher Wellons

These things are all true. Originally a few of those tables were computed values (shiftc and shifte in particular), but error checking became an annoying edge case that complicated the expressions (len=0), so I just switched to tables.

sneakyruds

I wonder if you can avoid "precomputing" next by declaring s and/or len as const?

Christopher Wellons

Pointer to const never helps the optimizer because it's not strong enough to prove anything useful. Const values can occasionally help the optimizer, such as when pointers to const objects escape. But generally const values don't improve performance either. I wrote an article on this topic last year: http://nullprogram.com/blog...
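A small sketch of why: the const only restricts writes through this particular pointer, so the compiler must still assume the pointee can change behind its back (process() here is just a hypothetical external function):

void process(int i);

/* Even though n is a pointer to const, process() may legally modify the
 * same int through some other, non-const pointer, so the compiler has to
 * reload *n on every iteration instead of hoisting it out of the loop. */
long run(const int *n)
{
    long count = 0;
    for (int i = 0; i < *n; i++) {
        process(i);
        count++;
    }
    return count;
}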

Impossibly Stupid

First, a typo: "to made" probably should be "to make".
Second, Rosetta Code lacks any sort of C implementation, so you might consider making your version available there:

http://rosettacode.org/wiki...

It was also interesting to hear that !len was a costly operation. On a whim I replaced it with (len==0) and saw an overall benchmark improvement of 1.4%. Exercise for the reader to compare if/how the assembly code differs for your preferred compiler.

And finally, your benchmark has a bug in that the random characters it fills the buffer with can be longer than the remaining space. My fix was to just stop short:


char *end = p + z - 4;

Christopher Wellons

Thanks for the heads up about those two mistakes. I fixed the typo in the article and added your fix to the benchmark. I think I did have a "- 4" in there at some point to deal with the overflow, but it got lost before I committed anything.

Michael

Given that only the most-significant four bits are significant (you can dispense with the 0 in 11110 for a 3-byte encoded value), can you go faster if you ROT 4 instead of ROT 3? Or is it absolutely identical?

Christopher Wellons

Some architectures can only shift 1 bit at a time, and so there could be an advantage to using smaller shifts. On x86, all static shift amounts cost the same (as far as I know), so the only benefit would be making the lengths table shorter.

However, that 5th bit _is_ important for error checking since it must be 0 for 4-byte sequences (11110xxx). For example, byte values 0xF5 through 0xFF never appear in a UTF-8 stream.

Martin Dürst

Interesting exercise, great writeup. However:

"the
caller must zero-pad the buffer to at least four bytes"
This is a pretty strong restriction, as it makes it impossible to integrate this directly into any existing library. One would have to write special copies e.g. of strcpy and friends. Removing the restriction that all bytes are null-padded may be possible, but that would not help with the danger that we hit a segment boundary (->segmentation fault). The problem could be fixed by making the indices depend on len, but that would make everything slower.

Christopher Wellons

Yup, I fully admit the interface is rather inconvenient! Even if this did turn out to be dramatically faster, the padding rule would still cause hesitation about choosing it.

Martin Dürst

One additional comment:

"Adding that !len is actually somewhat costly, though I couldn’t figure out why."
Well, !len is a hidden 'len==0'. And C-level comparison operations traditionally don't have directly corresponding machine-level instructions. The average instruction set uses a 'compare' instruction to set some flags inside the CPU, and then some conditional jumps. So !len may well hide some branching!

Christopher Wellons

Both GCC and Clang do use comparison instructions but consume the result from the flags register with a branchless SETcc (which is what I was counting on). I imagine other architectures would have to implement this as a branch, though.

Vincent Bernat

What about unlikely() (translated to GCC's __builtin_expect())? Is it translated into something sensible for the processor, or is it just a hint for the compiler? This would seem to be a good use case for error handling.

Christopher Wellons

On x86 there used to be branch prefixes to provide additional static prediction hints to the CPU. As far as I know, on x86 it now only affects the compiler's instruction scheduling. You're right about it potentially being useful to indicate that errors are not expected, or that most code points are only one byte long, though I doubt it would have a significant effect.

David Zmick

This is correct; the hints are only for the compiler. This is stated in the Intel optimization guide :)
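A minimal sketch of the hint being discussed, assuming GCC or Clang (the macro name and the function are made up for illustration):

/* __builtin_expect only guides the compiler (block layout, instruction
 * scheduling); on modern x86 it emits no special prefix for the CPU. */
#define unlikely(x) __builtin_expect(!!(x), 0)

int validate(int byte)
{
    if (unlikely(byte < 0)) {
        return -1;  /* error path kept off the hot, fall-through path */
    }
    return byte & 0x7f;
}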

Marcus2012

What about emojis? The U.S. flag emoji, for example, is composed of 2 code points, with no zero-width joiner between them.

That requires a table lookup as far as I can tell, and it's the reason I haven't tried writing my own UTF8 decoder.

Christopher Wellons

I don't know about that particular emoji, but that sounds just like a combining character, which is perfectly normal for Unicode. Dealing with those is not the UTF-8 decoder's job, which is merely to decode a sequence of code points. It's up to the renderer, or whoever is consuming those code points, to resolve combining characters into grapheme clusters and such. That's the complicated part of Unicode.

Marcus2012

It's not a combining character, the UTF32 codepoint is U+1F1FA for the U and U+1F1F8 for the S.

the whole flag is: U+1F1FA U+1F1F8

aka 🇺🇸

Feel free to look it over in a hex editor if you're confused.

in UTF8 it's: 0xF09F87BA 0xF09F87B8

Anyway, I was wrong, it doesn't take a table lookup, I didn't understand how UTF8 encoded the literal Unicode code point when I wrote that last comment.

and since I'm not writing a GUI it doesn't matter to me how it's parsed out, all I care about are code units and code points.

Quinten Lansu

Very interesting! The decoder I wrote for utf8rewind does use branches:

https://bitbucket.org/knigh...

The reason for this is edge cases like overlong sequences and not enough input data. The only problem with your decoder I can see is that you do not write U+FFFD on incorrect sequences. You can find all test cases I have here:

https://bitbucket.org/knigh...

Overall, very nice example of branchless code!

If you want a tougher challenge, see if you can optimize code point composition for normalization. ;)

Christopher Wellons

Thanks! As for U+FFFD, I think that's just out of scope for a plain UTF-8 decoder. The proper way to respond to an invalid byte sequence depends on the caller's needs. The caller may prefer to abort the decoding operation entirely — particularly in a security-sensitive context — rather than use a replacement character. Having the decoder silently emit a replacement character removes that control: it would be ambiguous whether the original stream contained that character or whether it was the result of a bad sequence.


Make Flet Great Again

NoonianAtall

I wish you had written this article a year ago; it would have saved me a bit of time. ;)

Seriously, though, I can recommend Nic Ferrier's noflet package. It works very well and adds some handy features, like being able to access the original function definition. I use it in some unit testing; it works great.

Phil

http://endlessparentheses.c... would have saved you a bit of time :)

Christopher Wellons

Thanks for the link. I had no idea Artur Malabarba wrote about this already. I usually avoid writing about topics that have already been covered elsewhere.

Phil

FWIW I'm glad you did write this, as the explanation about byte-compiled opcodes vs. flet/advice was new to me, and definitely useful to know.

Christopher Wellons

I've had a trivial bugfix PR open on noflet for over two years, so I've basically given up on it.

Raimon Grau

That's a great trick to emulate the effect of dynamic scope for mocking.

I guess problems arise if you use any kind of multithreading as #'symbol-function deals with a globally unique symbol table, but well... at least it's something :)

If anyone wants to turn that into a mocking library, here's a CL function with a similar approach where one can take ideas from https://github.com/Ferada/c... .

Christopher Wellons

Yeah, good observation. That's one of the big issues with adding threads to Emacs, even if those threads aren't preemptive. Even the coroutine-like accept-process-output can run into trouble with these temporary global bindings.


Render Multimedia in Pure C

possiblywrong

Very cool. And definitely radix sort for the aesthetic win.

The image describing the envelope tapering the amplitude of each tone looks like a parabola, but the code suggests a 6-th order envelope (comparison here), which I imagine sounds even cleaner?

Christopher Wellons

I actually like the radix sort in the original w0rthy video better. It's downright frightening. I was unable to match the effect with my own implementation.

Good eye spotting the parabola discrepancy. I'm not surprised it's you that noticed! I made my plot in a hurry and got sloppy. Mind if I steal yours to replace it?

Why 6th order? It's the result of experimentation. 2nd and 4th order were still leaving some artifacts. They went away with 6, so I stopped.

Vincenzo La Spesa

I see that, in the end, you didn't use Frei0r (I came from here https://stackoverflow.com/q... ).

In your opinion, is there a specific reason why FFmpeg doesn't allow loading external filters? Security reasons, maybe?

Christopher Wellons

Thanks for the info. I didn't know about Frei0r, so I'll have to check it out. What I still like about my PPM technique is that I can also interface with other tools in the middle, like ImageMagick.

Purely speculation since I've never looked into it, but I think you're right about security. Video data is generally untrusted (i.e. it's from the internet), so it must be handled very carefully. An external filter could have flaws that may allow ffmpeg to be exploited by carefully crafted video data.

Ciro Santilli

Here is a minimal C FFmpeg example that synthesizes video directly from buffers, without intermediate PPMs: https://stackoverflow.com/q...

Christopher Wellons

Interesting, thanks for the link!


Initial Evaluation of the Windows Subsystem for Linux

Ionel Cristian Mărieș

Note that Xming hasn't made a public release for 10 years. Use this instead: https://sourceforge.net/pro...

Christopher Wellons

Thanks for the tip! It's literally been about that long since I've run an X server on Windows, so I haven't kept up with developments.

J. Ryan Stinnett

As of Windows 10 Fall Creator's Update (shipped on 2017-10-17), you no longer need to enable developer mode to use WSL.

https://blogs.msdn.microsof...

Christopher Wellons

Thanks for the heads up! I've modified the article.

Szabolcs Szasz

Chris, for the poor fs performance, FYI:

"Interoperability with Windows

... DrvFs [the WSL component that lets you access your normal Windows drives or shares] does not store any [metadata]. Instead, all inode attributes are derived from information used in NT, by querying file attributes, effective permissions, and other information.

*DrvFs also disables directory entry caching* to ensure it always presents the correct, up-to-date information even if a Windows process has modified the contents of a directory. As such, there is no restriction on what Windows processes can do with the files while DrvFs is operating on them."

(https://blogs.msdn.microsof...


What's in an Emacs Lambda

(no comments)

Debugging Emacs or: How I Learned to Stop Worrying and Love DTrace

NoonianAtall

Thanks for the writeup, Chris. A couple of thoughts:

1. I'd like to cite this as an example of why apparently unreproducible bug reports should not be closed. Sometimes old bug reports linger a long time before things fall into place and they are finally solved--but if the bug report had been closed, it would likely have fallen by the wayside (unless someone else reported it, which would have duplicated effort and wasted time). Patience is the key. Obviously you know this, and some projects get it (e.g. Debian), but a lot of other software projects don't (e.g. Ubuntu).

2. Partly inspired by #1, I encourage you to go ahead and report this to the Emacs bug tracker. As you know, the Emacs devs are really smart, and they might have ideas for how to reproduce it, or they might have puzzle pieces floating around in their heads that might fall into place. If not, the bug already being reported might save a future reporter some time and help solve the bug by keeping evidence together.

Thanks very much for your work on Elfeed!

Christopher Wellons

Yup, I agree on point #1. Though often these bugs sit around so long they end up becoming irrelevant before they get fixed, due to something being rewritten or some component being swapped out. As for #2, I really do intend to report it, but I'd like to spend some more time narrowing things down. Since it's not urgent, I'd prefer a more complete report later than an incomplete report now.

Thanks!

Noam Postavsky

> but if the bug report had been closed, it would likely have fallen by the wayside (unless someone else reported it, which would have duplicated effort and wasted time).

Hmm, but even if a bug is left open, it'll be hard to find in the huge pile of open bugs, so probably someone else will report it and duplicate the effort anyway.

smitty

>I mostly program in Vim these days

Have you availed yourself of http://spacemacs.org/ ?

Christopher Wellons

Spacemacs doesn't interest me since I prefer to fully manage my own configuration, but I *have* been using Evil for about 10 months now. Evil is by far the best Vim clone I've seen in any editing environment. When I'm testing a package in an unconfigured Emacs, I turn on Viper since I've lost all my old Emacs muscle memory, but I wouldn't want to use Viper long term.

So, thanks to Evil, I still do some editing in Emacs, including all of my Emacs Lisp programming. It would be silly to do that in Vim. More importantly, Smartparens (and previously Paredit) is just too good to pass up for an s-expression language. However, Evil is sometimes a bit flaky, getting stuck in weird states or just not being Vim enough in some situations. And, through no fault of its own, it's ultimately a stranger in a strange land, with a jarring switch from an Evil buffer to a plain old Emacs buffer (due to major mode conflicts with Evil). If I can do something comfortably in Vim, then I just do it in Vim.

xiongtx

The Emacs Lisp manual is actually pretty explicit about this:

> If available, ptys are usually preferable for processes visible to the user, as in Shell mode, because they allow for job control (C-c, C-z, etc.) between the process and its children, and because interactive programs treat ptys as terminal devices, whereas pipes don't support these features.

> However, for subprocesses used by Lisp programs for internal purposes, it is often better to use a pipe, because pipes are more efficient, and because they are immune to stray character injections that ptys introduce for large (around 500 byte) messages.

https://www.gnu.org/softwar...

jcs

Yes, I noticed a dramatic increase in Elfeed's feed loading speed but didn't know what happened until I read this post. This is on macOS.

xiongtx

Good news on DTrace: Oracle is apparently GPL'ing it and doing a proper Linux port: https://gnu.wildebeest.org/...

Christopher Wellons

Awesome! Apparently Oracle was just waiting around for me to complain about it. :-)

xiongtx

Now you've got to be very strategic about your next target of complaint 🤣.


Inspiration from Data-dependent Rotations

possiblywrong

"In these cases, a data-dependent shift would require a loop." My initial interpretation of this was a loop iterating over the number of bit shifts, right? I wonder if this could be compacted into a tighter, loop-less algorithm that only used, say, a 256x64-element lookup table of *byte* shifts?

Christopher Wellons

Yup, that's exactly what I meant by a loop (or even something unrolled via Duff's Device). I hadn't thought of a table, though. While in theory a table reduces the operation to O(1), a table can still present a significant side channel due to cache effects. RC4's 256-byte s-box is large enough that researchers have demonstrated cache timing attacks to steal its contents. A huge lookup table will be even more vulnerable.

In a data-dependent shift lookup table, an attacker could flush parts of the table from cache (such as via cache collisions) and then measure the cipher's performance to discover the shift operands, revealing secret information about the cipher's internal state.

For the less security-sensitive PRNG case, an architecture that lacks an instruction for data-dependent shifts also probably can't afford the memory for lookup tables, especially if another PRNG would work fine.
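A sketch of the contrast in question: a data-dependent rotation is a single constant-time instruction on common desktop CPUs, while on a machine with only single-bit shifts the equivalent operation degenerates into a data-dependent loop (or a lookup table, with the cache-timing caveats above):

#include <stdint.h>

/* Compiles to a single rotate instruction on x86; runs in constant time. */
uint32_t rotl32(uint32_t x, int n)
{
    n &= 31;
    return (x << n) | (x >> (-n & 31));
}

/* On hardware with only 1-bit shifts, a variable shift becomes a loop
 * whose iteration count depends on secret data: a timing side channel. */
uint32_t shl_loop(uint32_t x, int n)
{
    for (int i = 0; i < (n & 31); i++)
        x <<= 1;
    return x;
}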


Options for Structured Data in Emacs Lisp

Yu0

The dynamic dispatch part has me somewhat worried.

One of the reasons Emacs Lisp has become my favorite programming language is discoverability with C-x h f, and that the same global name means the same thing anywhere in the system. It creates a sort of clarity that is obfuscated to some degree by namespace systems, which I regularly stumble over when using Python. Was .move os.move or shutil.move?

I can't quite imagine how this can translate into dynamic dispatch.

Christopher Wellons

I generally agree. Dynamic dispatch currently isn't very transparent, and it can hamper debugging. In my original draft I actually cautioned against it, noting that it makes a program more complicated. It's occasionally useful (such as how I use it in EmacSQL), but it should be applied judiciously.

Damien Cassou

I just finished a complete rewrite of mpdel, an MPD Emacs client, and made great use of cl-defstruct and cl-defgeneric (especially in libmpdel.el). I enjoyed writing my code with it and had the impression it made the logic simpler. I would love feedback: https://github.com/DamienCa....

Thanks for your great post.

Andrew Kirkpatrick

I really appreciate multiple inheritance and method combinations in CLOS; they give more options to interoperate with preexisting code without having to actually patch it. They must be used judiciously, but I'd not go so far as to say they are a bad idea.


Two Chaotic Motion Demos

Fred

If you want to avoid energy loss, you probably want a symplectic integrator. There's a symplectic version of Runge-Kutta, but a quick search didn't turn up any immediately usable code. It's described here: http://www.unige.ch/~hairer...

Christopher Wellons

Thanks for the tip! I'll have to study this.


Emacs Lisp Lambda Expressions Are Not Self-Evaluating

Clément Pit-Claudel

That's a nice write-up. I'm always scared of quoting lambdas, though; we ran into the following example recently in Flycheck:

(progn
  (defmacro m (f) `(function ,f))
  (message "%S" (m (lambda ()))))

(To see what's wrong with it, try evaluating it twice)

Andrew Kirkpatrick

That's intriguing, and you can reset the state to get the original result with (fmakunbound 'm). So it's an interaction between read time and/or macroexpand time and/or evaluation time. It doesn't happen in Common Lisp (SBCL).

Clément Pit-Claudel

That particular example turned out to be a bug that was fixed just days after I posted here; but there's now a lively discussion going on about with-eval-after-load on emacs-devel

Andrew Kirkpatrick

BTW at first I tried evaluating it twice in ielm but got the same result. I got the different results you're talking about by evaluating it in a lisp interaction mode buffer.

Phil

A valuable write-up as usual Chris, but I think it would be improved significantly if you showed how the example is "subtly broken" (i.e. displayed the error that results) much much earlier, in order that the reader has some context for the ensuing discussion.

The eventual mention of the (void-function closure) error is buried a long way into the text, and with no direct reference to the original code, so it's all too easy to miss what the article is all about.

Christopher Wellons

Thanks for the tip, Phil. I was intentionally delaying the "reveal" for the sake of telling a story, but, you're right, I probably obscured the issue too much by doing so.


A Crude Personal Package Manager

Ralph Corderoy

Have you come across GNU Stow? https://www.gnu.org/softwar... It maintains a symlink farm. In your case, your ~/.local/bin would end up with symlinks to ~/.local/stow/.../bin/... The stow directory can have many versions of a package; just one of them is symlinked in at a time. It's easy to switch between versions. It doesn't provide your `build from nothing but cc(1)' though, unless you build Perl too. :-)

Christopher Wellons

I had thought of Stow only as a "dotfiles installer," but since writing this article I've found out it's more general than that (since you're not the first to mention this). My little package manager is still probably better suited for my needs, but it seems I need to spend some time with Stow to see what other ways it could be useful.

Kaushal Modi

Can you share a wrapper script if you use any? Here's mine :) The link is to my wrapper script for Stow for dotfiles (I have separate groups of those, private and public), and then for various packages, versioned.

The only thing I don't like is the requirement to create that dir_struct that you see in that script around line 125.

oso2k

This sounds very similar to CRUX's package manager:

https://crux.nu/Main/Handbo...


Blast from the Past: Borland C++ on Windows 98

Kurt Jung

Thanks, Christopher, for the ride in the time machine. I've often thought about firing up a virtual DOS, circa 1988, to run a fairly large application I wrote in Turbo Pascal. When Lawrence Kesteloot discovered some old diskettes with graphics programs he and a friend wrote in 1989, he wrote a web-based Turbo Pascal compiler (https://www.teamten.com/law... ) to run them!

I had already spent a few years working with Turbo Pascal by the time I switched over to their C/C++ compiler. My editor at the time, some features of which I still miss, was Borland's editor toolbox with a bunch of home-grown extensions. It had everything I needed for software development so I never really got to know their packaged IDEs. With Pascal, Borland never considered itself constrained by standards. If some feature made sense, they implemented it. They borrowed so heavily from the best parts of Modula-2 that I am sure it was only marketing that compelled them to keep the name Pascal. It was a joy with which to work daily. My switch to C was dismal -- it felt slow and, with C's header files and macro language, ridiculously antiquated. I am grateful to Go for bringing back some of the magic of Turbo Pascal.

Of course, it was the C standard that allowed that language to flourish, and gives your program a lifetime that will be measured in decades. Nice work!

Christopher Wellons

Interesting story. Thanks for sharing, Kurt. I think I missed out on Pascal by a few years or so. By the time I got into programming it had already fallen off the radar.

possiblywrong

This was a really enjoyable and rather nostalgic read, both the OP and your comment. Most of my experience with Pascal was also via Turbo Pascal, and then initial exposure to C++ via Turbo C++. I still have a lot of Pascal source from 30+ year-old projects.

Mike Zamansky

This was a fun read. Back in the late 80's I was developing for Windows 2.1 / 3.0 which was really on top of DOS.

I used the Microsoft compiler and Epsilon which was (is?) an Emacs clone by Lugaru software. Things ran pretty well. The real trick was debugging and logging info and we handled that by having two displays on our PCs - one, a graphical display to run Windows and our application and another monochrome text only display where we could redirect logging info as our programs ran.

On the CLI, I used a product called the MKS toolkit which had a pretty functional shell along with many of the Unix command line tools we all know and love.

Christopher Wellons

Somehow I'd never heard of either Epsilon or MKS Toolkit, but I see they both have Wikipedia articles. Interesting stuff. Thanks, Mike!

Tim

I'm glad you found a better command history program, otherwise I'd have to tell you about F3 (last command) and F2 (interactive partial of previous command), and that famous backwards compatibility means they still work in Windows 10.


When the Compiler Bites

Davidbrcz

On a related subject, there is a paper on the effect of compilers' optimizations on program security: https://www.lightbluetouchp...

Christopher Wellons

Interesting paper, thanks! I haven't finished reading it yet, but I've enjoyed it so far.

justsomeguy

Interesting blog post, thank you for posting.

Irrelevant and possibly autistic nitpick, please ignore: For the "new_image" function, shouldn't it test for

w != 0 && h != 0 && h <= SIZE_MAX / w

instead of

w == 0 || h <= SIZE_MAX / w

?

Christopher Wellons

The "w == 0" check is to avoid division by zero. If one of the dimensions is zero, then the empty allocation decision is left to malloc(), either returning NULL or unique pointer: http://yarchive.net/comp/li...

justsomeguy

I knew about the division by zero, but returning an empty allocation if one or more of the dimensions are zero I did not anticipate. Also, thank you for the link. Reading through it, it does seem a bit specific to kernel development; Linus writes among other things that in the kernel, they use NULL for representing a specific case:


No. NULL really _is_ special.

We use NULL in tons of places for saying "we haven't allocated anything at all".

He also writes about returning a special case pointer that is not NULL to represent a different case, namely empty allocation:


That's *not* the same as saying "we have initialized this pointer, it just happens to point to a zero-sized object".

So, the way you are doing things here may be the best and optimal way given the style, idioms and chosen approaches in the code base. The usage of C here with a rendering project makes a lot of sense, since C is both extremely portable and also has the possibility for very good runtime performance. In a different language and different project where runtime performance would not be as important, returning a union type or a tagged union type (one case for successful result, one case for error) or similar might be more maintainable and less error-prone.

Anonymous

1.3f case felt really wrong, so I checked the standard. Unfortunately, the standard explicitly says (in the section "Characteristics of floating types") that FLT_EVAL_METHOD 1 and 2 should ignore the constant type.

Yury Schkatula

"Floating point precision" - you should never compare them by equality/inequality, regardless the compiler and Standard applied. Just rule of thumb :)

LilleCarl

Very interesting article!


When FFI Function Calls Beat Native C

Kreemy N. Almind

Great article as always! I learned so many new things. One small question regarding your code. In your benchmark code you use indirection to enable and disable the running flag for each benchmark https://github.com/skeeto/d...

Is there any specific reason for this? Only reason I see is because you have another running flag in your jit struct https://github.com/skeeto/d... and you free it in main() after the jit benchmark is over. Why not have one running flag and enable/disable it before/after each benchmark directly?

Christopher Wellons

I wanted one signal handler to cover all benchmarks. The PLT and Indirect benchmarks trivially share a "running" flag since they're all nearby in memory. They can (and do) both access the flag using RIP-relative addressing, per the "small" code model in the Eli Bendersky article.

However, the JIT code is intentionally placed adjacent to the shared object's base address, not the main program, so its address is almost certainly very far away (>2GB) from "running". The flag is not accessible by RIP-relative addressing, which has the same 2GB (signed 32-bit) range as the call instruction. To deal with this, the JIT code gets its own flag allocated adjacently in its own page. I mentioned this (without much explanation) in the paragraph about allocating two pages.
I could have used absolute 64-bit addressing to reach back to the original "running" flag, but this would make for a larger instruction, different from the other two benchmarks. Worse, it's inside the tight loop that I was measuring. I wanted each benchmark to be as comparable as possible, differing *only* by call style, accessing the running flag in an identical way.

Travis Downs

You could also consider static linking. Now, static linking isn't really an apples-to-apples comparison when talking about FFI, which kind of implies a shared object that other languages can call into, whereas static linking really only applies to C, C++, and other "link-level" compatible languages. Still, this type of call is the major type of non-inlined call found in most applications, even ones that don't statically link libraries, since it is the one used for calls between different translation units in the same binary.

AFAIK static linking will end up the same as the JIT example (a direct relative call, at least for the normal small code model) for executables including PIE executables, but for code inside a shared PIC object it's more complicated due to the possibility of interposition. Basically if you want to support interposition of calls within a library, interposable calls within the library have to jump through the same hoops as cross-library calls - but there are various tricks you can do to control this (linker maps, anonymous namespaces, etc).

Christopher Wellons

Yup, static linking gives you the equivalent of the "JIT" calls. If the static library was built with LTO (unusual) and you're using a compatible compiler, then in theory it could do even better and inline library functions.

Good point about interposition, too.


Emacs 26 Brings Generators and Threads

Clément Pit-Claudel

Neat article. I'm happy that we're gaining (cooperative) threads, mostly because it helps with getting out of callback hell. Note also that we already have an equivalent of web workers in Emacs, in the form of the emacs-async library: https://github.com/jwiegley... .

Noam Postavsky

FYI, generators were added in 25.1, not 26.1

Christopher Wellons

Doh! I feel stupid now. Thanks for the correction. It wouldn't be so bad if I didn't base the title around my mistake.

NoonianAtall

Great article, thanks, Chris!

Lawrence D’Oliveiro

Given that Elisp is Lisp, I wonder why they don’t implement a full continuation http://www.codecodex.com/wi... facility to begin with.

Christopher Wellons

As I mentioned, generators require no special run-time support. Adding them to Emacs came at no cost. But continuations typically require significant architectural run-time support, where stacks are first-class objects, managed directly by the run-time. This is one of the big hurdles when implementing Scheme, especially because its continuations are undelimited. An implementer typically can't simply use the implementation language's stack (unless that language supports continuations itself).

To add another complication: Continuations couldn't be safely combined with Emacs' dynamic modules. A dynamic module frame "sandwiched" inside a continuation would likely break or misbehave if control returns through it other than exactly one time. For example, imagine an intermediate frame in C++ that's using RAII.

Mike

Did you try printing and reading a generator? It does not work here:

ELISP> (iter-defun test-gen ()
         (while t
           (let ((inp (iter-yield nil)))
             (cond
              ((equal inp 'end)
               (signal 'iter-end-of-sequence nil))
              (t
               (message "Got %s" inp))))))
test-gen
ELISP> (read (prin1-to-string (test-gen)))
*** Eval error *** Invalid read syntax: "#"

It would be useful for a game (à la Lua's Pluto library, save-the-world).

Christopher Wellons

Since generators compile into a set of closures that reference and call each other, they form a circular data structure. So in order to print an iterator object properly, you need to bind print-circle to t when printing. (It doesn't matter when reading.) Without this, the printer still inserts special markers when it detects loops, but they can't be read back in.


Minimalist C Libraries

Roy

you should check out Sean's (nothings) stb libraries on github https://github.com/nothings...

Christopher Wellons

Oh, yes, I love Sean Barrett's stuff. stb has influenced my own minimalist libraries. In fact, the "growable-buf" library linked at the bottom of the article is my own take on "stretchy_buffer.h".

Klaus

What you describe is what I wish more Python libraries were like. It often feels as if the public APIs are overly complex and scattered. Mind, it still leaves Python often enough as the preferable choice (data analysis), but few libraries provide minimalist interfaces with the full power that would be possible.

Then again, the same seems to go for any language.

Christopher Wellons

I've noticed that about Python, too, even including some of the officially included modules. It seems its "only one obvious way to do it" philosophy never quite extended to API design.

Klaus

I'd argue that it doesn't even extend to the core language API, e.g. strings. We have ``s1.startswith(s2)``, ``s1.endswith(s2)``, but ``s2 in s1`` instead of ``s1.contains(s2)`` and ``len(s1)`` instead of ``s1.length()``.

Or for that matter, assignment syntax. We have ``NAME = VALUE`` as an assignment statement, but ``VALUE as NAME`` everywhere else. This can make with-statements quite hard to parse, as it requires placing the name *after* the closing parentheses of a potentially long expression.

It also violates the "explicit is better than implicit" mantra by not having explicit variable declaration (like JavaScript's untyped const/var/let), which can easily create weird errors due to shadowing or unintentional variable reuse. This is typically defended by saying "it's a name, not a variable", though it really just means "variable without a fixed type", which makes the distinction somewhat irrelevant.

*edit* Monday morning rant... >_<

James Lin

It's a very minor point, but your preprocessor macros should wrap the negative constants in parentheses so that there are proper compilation errors if someone somehow accidentally types something like `some_number UTF7_OK`.

Christopher Wellons

Good point, thanks! I'm usually pretty conscientious about wrapping my macro values but hadn't considered the unary negative being an issue.
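
A quick sketch of the failure mode James describes (the macro and variable names here are hypothetical):

#define UTF7_ERR        -2    /* unparenthesized */
#define UTF7_ERR_SAFE  (-2)   /* parenthesized */

int demo(int some_number)
{
    int a = some_number UTF7_ERR;        /* silently parses as some_number - 2 */
    /* int b = some_number UTF7_ERR_SAFE;   with parentheses this is a syntax error */
    return a;
}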

heden

cat article | sed 's/limites/limits/'

Christopher Wellons

Thanks! Looks like "limites" is a real word, which explains why the spell checker missed it.

Adam Kennedy

Would these be suitable for CCAN?

https://ccodearchive.net

Christopher Wellons

I never heard of CCAN before. That's pretty neat. Yeah, I'd expect these sorts of libraries would be very suitable there since they embed so easily.

Will Good

Nice examples. How would you distribute your minimalist c library? Also, would it be statically or dynamically linked?

Christopher Wellons

The libraries I discussed are all embed libraries. They're small, portable, and trivial to build (e.g. just include as a header library). So you copy them into your own project when needed. That's why it's especially important to get the bugs shaken out up front, since updating later isn't so smooth. Fortunately small interfaces with explicit state are easy to test!

This makes it most like static linking. I'd expect this from most typical minimalist C libraries, though I can imagine situations where embedding may make less sense.

Graham Toal

for some time I've been starting to think that operating systems are overkill and that a suitable environment is just a good library that can be bound with a program and run as bare metal! At least for a large number of applications. With computers like the Pi Zero W being so cheap, it shakes up the cost/benefit ratio and the relative value of the software to the hardware.

Christopher Wellons

Check out rump kernels and unikernels: https://en.wikipedia.org/wi... https://en.wikipedia.org/wi...

Graham Toal

Yep, that was pretty much what I was envisioning. So, not a crazy idea after all :)


Intercepting and Emulating Linux System Calls with Ptrace

Xiaoguang-Gary Wang

Nice article. I'm wondering whether we could use ptrace to trick a syscall execution. For example, can we trick SYS_read with our pre-defined input, so that the user will not have to type anything at the keyboard? Thanks!

Christopher Wellons

The "trick" you're talking about is exactly what I was getting at with emulating system calls. When the SYS_read arrives, you change the system call to an invalid system call number, let it run (and fail), but on the way out you service it yourself writing to the caller's buffer and returning the number of bytes written. The tracee won't know their system call *really* failed (due to your sabotage) since you emulated a success with a different result.

That's what I did in this little project, which intercepts entropy-gathering system calls: https://github.com/skeeto/k...
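
A rough sketch of that flow (assumptions: Linux x86-64, the tracee is already attached and stopped at a SYS_read entry via PTRACE_SYSCALL, error handling omitted):

#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* Copy len bytes into the tracee's memory, one word at a time. */
static void poke_buf(pid_t pid, unsigned long long addr, const void *src, size_t len)
{
    const unsigned char *p = src;
    for (size_t i = 0; i < len; i += sizeof(long)) {
        size_t n = len - i < sizeof(long) ? len - i : sizeof(long);
        long word = 0;
        if (n < sizeof(long))   /* preserve the tracee's bytes past the end */
            word = ptrace(PTRACE_PEEKDATA, pid, (void *)(uintptr_t)(addr + i), 0);
        memcpy(&word, p + i, n);
        ptrace(PTRACE_POKEDATA, pid, (void *)(uintptr_t)(addr + i), (void *)word);
    }
}

void emulate_read(pid_t pid)
{
    struct user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, 0, &regs);
    if (regs.orig_rax != SYS_read)
        return;
    unsigned long long buf = regs.rsi;   /* caller's buffer */
    unsigned long long len = regs.rdx;   /* caller's count  */

    regs.orig_rax = -1;                  /* sabotage: invalid syscall number */
    ptrace(PTRACE_SETREGS, pid, 0, &regs);
    ptrace(PTRACE_SYSCALL, pid, 0, 0);   /* run to the syscall exit stop */
    waitpid(pid, 0, 0);

    static const char canned[] = "fake input\n";
    size_t n = sizeof(canned) - 1;
    if (n > len)
        n = len;
    poke_buf(pid, buf, canned, n);       /* service the read ourselves */

    ptrace(PTRACE_GETREGS, pid, 0, &regs);
    regs.rax = n;                        /* emulate a successful read */
    ptrace(PTRACE_SETREGS, pid, 0, &regs);
}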


The Value of Undefined Behavior

Concerned Programmer

Honestly this triggered me a little. I hate the idea of exploiting such danger for a little performance gain - and an unreliable one at that! I'd rather write correct and maintainable code than strange "here be dragons" work. PS. Your example code is actually slower on Clang.

Christopher Wellons

Yeah, I don't know what's going on with Clang in my example. Neither function compiles as cleanly as GCC's code for sum_signed(). In the video I linked, Chandler is (of course) using Clang, and it produces much better code for int32_t than uint32_t in its bzip2 example.

PaX Team

your 'int i' is more of an example of why 'int' should be considered harmful. if you use a register sized type (long/unsigned long here) then all is well, you get the same efficient code in both functions:

movl (%rdi,%rsi,8), %eax
addl (%rdi,%rsi,4), %eax
ret

and if you extend this idea to the callers, you'll never force the compiler to juggle with sign extension, truncation, etc. Seriously, people should have retrained their muscle memory from 'int' to 'long' decades ago.
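
A sketch of the kind of function being discussed (the exact code from the article is guessed from the assembly above):

#include <stdint.h>

/* With a register-sized index, the compiler never has to sign-extend
   or truncate i before scaling it into an address on x86-64. */
int32_t sum(int32_t *x, long i)
{
    return x[i * 2] + x[i];
}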

renozyx

Replace long with size_t and I agree with you

Christopher Wellons

On x86-64, strictly using only 64-bit integers will lead to larger code and put more strain on the instruction cache. Even though it's a 64-bit platform, the default operand size is still just 32 bits. Most common instructions require a REX.W prefix byte (e.g. 0x48) to select 64-bit operands. That is, the "q" on "movq" typically makes the instruction one byte larger than its 32-bit "movl" counterpart.

To see this for yourself, hex dump some x86-64 code and note how often 0x48 shows up. It's actually a pretty handy trick for differentiating x86-64 code from arbitrary data using only your eyes.

It's not always larger: any use of r8–r15 already requires a REX prefix (the 4th bit of register selection comes from REX, and lacking a REX byte implies a 0 for this bit), so it doesn't cost anything extra to use these registers for 64-bit operands. But it's still not something you _always_ want on x86-64.

Travis Downs

Yeah, 64-bit code is larger, *everything else being equal* - but often it's not equal: here you save 1 or 2 instructions, which more than makes up for a couple of prefix bytes in size. I have often looked at otherwise "efficiently" written loops that mix 64-bit pointers and 32-bit values, and by extending everything to 64 bits in the right way, everything ends up much smaller and faster.

Shafik Yaghmour

I wrote a long article on strict aliasing here https://gist.github.com/sha... and one of the points I cover is whether int8_t and uint8_t are character types. You are correct that they do not have to be, although currently, on every implementation I know of, they are.

One point that is covered in a gcc bug report https://gcc.gnu.org/bugzill... is that changing them to not be char types would be an ABI break for C++.

Christopher Wellons

Great article! I've added a link to it. Also thanks for the links to that very interesting discussion on GCC's Bugzilla.

Brian

The problems I have with undefined behavior are twofold.

First, no user or programmer ever wants to invoke undefined behavior. The main point of C (AFAICT) is that it's easy to implement efficiently. Therefore, if an abstraction offered by the language can't be implemented in such a way, maybe it's just a bad abstraction for C. There are several new languages today which offer null-safety and also high performance. Undefined behavior could be seen as a red flag in language design. It usually is, in every other context. If there were some combination of buttons I could press on my toaster or my car to exhibit "undefined behavior", that would be bad design, without qualification.

Second, because nobody wants to invoke UB, it's essentially an unwritten "assert cant_happen;", except the user isn't required (or able) to say this, and the compiler has no means to report this to the programmer. As such, it's almost always a Surprise, and thus violates the Principle of Least. It's even unusual for C. In all other cases in the language, something that "can't happen" is enforced by the compiler. When I write a function that takes an int, for example, I know it won't be called with a double. Actually, in pre-ANSI C, you didn't have to declare parameter types, but eventually they made the programmer do extra work to declare all types to eliminate this type of surprise. Why not be consistent and do the same for undefined behavior, too?

Your article explains quite well why undefined behavior is beneficial to compiler writers, given a fixed language specification, but I don't understand why it was beneficial for the language designers to have left it that way. Given the other design changes that have been made since the 1970's, it really seems like one of the few remaining inconsistent warts that hasn't been fixed.

John Payson

Programmers may never want to perform actions which their compilers won't handle predictably, but they will often perform actions whose behavior is needed for their intended purpose, but which is not recognized by the Standard. Quality implementations which are suitable for the programs' intended tasks will support such behaviors whether the Standard requires them to or not; the fact that a program is targeted exclusively toward such implementations hardly means it's "broken".

For some reason, some compiler writers seem to view the Standard as a complete specification of everything programmers should expect. The authors of the Standard, however, not expecting it to be viewed that way, made no effort to define all the behaviors necessary to accomplish any particular task efficiently, if at all. According to the Rationale, they viewed as a quality of implementation issue the choice of what behaviors implementations should define beyond those the Standard requires. Because behaviors that are essential for some kinds of programming are useless for others, the authors of the Standard expected that the "marketplace" of programmers in various fields and compilers intended for use in those fields would be able to recognize what additional behaviors are appropriate in quality compilers intended for those fields.

For some reason, the authors of some compilers, who perhaps think "clever" and "stupid" are antonyms, have become fascinated with "optimizing" cases where quality implementations behave in constrained fashion (at worst choosing in unspecified fashion from a few possible behaviors) but where the Standard would allow them to negate laws of time and causality. A quality implementation which guarantees such constrained behavior (even though not required by the Standard), fed a program that exploits that, will be able to generate more efficient machine code than a more aggressive implementation fed code that is written to use only Standard-mandated behaviors. The notion that no programmers would want to let a quality implementation choose freely from among behaviors meeting application requirements is destructive and needs to be nixed.

SherekhudaHazratali.com

I have a hard time accepting the premise that a language is better because you have to hope that it does what you want instead of telling it, and for that reason it has become successful.

In fact, the ANSI C spec even states from your link that "Undefined behavior gives the implementer license not to catch certain program errors that are difficult to diagnose." That's just not an acceptable tradeoff anymore -- in fact I don't know it ever was. That's akin to shipping your org chart because you can't make your program do what everyone agrees it should do. You know, catch programming errors, in this case.

Travis Downs

Good article. The outcry over undefined behavior seems to have reached a peak lately, but the position in favor (of at least some cases) isn't often laid out. The idea that compiler authors are being malicious here is, I think, mostly wrong: many of the "surprising" optimizations are outcomes of generic optimization steps that also produce some very necessary and "obvious" optimizations.

Note "truncate in case of overflow" comment (the implication that the prior multiplication may overflow) isn't correct - there is no overflow here: the destination register is esi, which doesn't hold the result of the prior multiplication. What is happening here is actually a quirk of the SysV ABI: the high 32 bits of 32-bit or smaller arguments may contain garbage and the caller has to clear them if it relies on them being clear. Lower unused bits (e.g., bits 16-31 when a short is passed) are generally cleared, but this was up for debate for a while: clang expected them cleared, but icc didn't do it, so they were C ABI incompatible for years!

Some discussion here:
https://stackoverflow.com/q...

and

https://stackoverflow.com/a...

In general, I don't agree with the conclusion that signed is better (for performance) when used for indexes. Yes, your example shows a case where it wins, but I've seen a lot of cases where it also causes an extra sign-extension. The design of the x86-64 ISA is such that the zero-extension needed by unsigned values often comes for free, whereas the sign extension doesn't.

For example, here's a related example that shows the opposite case:

https://godbolt.org/g/2yG9Ra

The signed version has an extra instruction to do the sign extension.

Finally, note that on the exact same code you produced, clang generated worse code for the signed version (yes, it's the fault of clang, but still it belies the advice to always use signed):

https://godbolt.org/g/BAUwBw

Overall, I think it's roughly a wash when it comes to signed vs unsigned pointer indexing outside of loops. Both win sometimes. You really have to check your code to see what it's generating for the places you care about performance. A good guideline (as you mentioned) is just to use full width types like size_t or ssize_t if you can: this almost always generates equally good or better code. It doesn't mean you need to store bulk data in that format: you can use 32-bits there, but cast or assign to full size for the performance sensitive parts so pointer and index widths match up.

For loop counters used as indexes, signed has some additional advantages because in a structure like:

for (int i = x; i != y; i++)

the compiler knows it will be a simple incrementing iteration from x to y, taking (y - x) iterations. In the case of unsigned i, however, the possibility exists that x > y, so the iteration will wrap around, which can inhibit various loop optimizations, including vectorization.
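
A sketch of the contrast being described (the loop bodies are made up; only the loop shape matters):

void fill_signed(float *a, int x, int y)
{
    /* Signed overflow is undefined, so the compiler may assume the
       loop runs exactly y - x times with no wraparound. */
    for (int i = x; i != y; i++)
        a[i] = 0.0f;
}

void fill_unsigned(float *a, unsigned x, unsigned y)
{
    /* i may legally wrap past UINT_MAX back to 0, which the compiler
       must account for when reasoning about the trip count. */
    for (unsigned i = x; i != y; i++)
        a[i] = 0.0f;
}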

Christopher Wellons

You're totally right about my example. I misread the assembly and it wasn't showing what I thought it was showing. I've replaced the example with one lifted directly from the bzip2 example in the linked video since it actually does demonstrate a particular case of signed integers being more efficient. I was hoping to contrive an even simpler example.

Also, good points you've made on this topic. With all the trouble I've had trying to isolate a single, clean example that demonstrates the optimization in the same direction on both GCC and Clang, I'm now thinking it's not *quite* so clear cut that signed integers are faster in the general case. However, I still appreciate that, due to undefined signed overflow, it is a constraint I can influence, even if it takes more effort than a simple rule of thumb to exploit it.

Thanks for the input!

possiblywrong

Interesting post. Perhaps a simple question: you point out that char * can alias with anything; is this also true of unsigned char *? I ask because I frequently see this used as the type for a "byte buffer," and I've written it myself in the past as well.

Christopher Wellons

Yup, doesn't matter if it's signed or unsigned, the aliasing exception still applies.

cplusplus

Great article about Undefined Behavior. Can you now write an article about Unspecified Behavior?

Christopher Wellons

A couple of years ago I wrote an article on a specific instance of implementation defined behavior:

You Can't Always Hash Pointers in C https://nullprogram.com/blo...

John Payson

If multiple ways of performing an operation would have identical behavior in cases defined by the Standard, but differing behavior in many other cases, allowing implementations to choose among the various ways in Unspecified fashion will often facilitate useful optimizations. The value of such optimizations, however, will be much greater if programmers can exploit scenarios where all ways of performing an operation would yield acceptable results, than if programmers must avoid at all costs any situations where behavior isn't 100% rigidly defined.

When the Standard characterizes some situation as invoking Undefined Behavior, that means that nothing an implementation could do in that situation would make it non-conforming. That in no way implies that different ways of handling the situation won't render an implementation more or less suitable for various purposes. Further, when the Standard describes a category of actions as invoking UB, that doesn't imply any judgment as to whether commonplace implementations should be expected to process at least some of them consistently. Instead, according to the rationale, the authors expect that quality implementations intended for various purposes will interpret UB as an invitation to extend the semantics of the language in ways that usefully serve those purposes.

One thing that would allow the Standard to be straightforward and yet more useful for programmers and compiler writers alike would be if it allowed programmers to explicitly invite certain kinds of optimizations in certain cases, even when doing so might alter observable behaviors, rather than trying to characterize as UB all situations where the effects of useful optimizations might be observable. If applying various combinations of optimizations might result in a program behaving in many different ways, but all of those ways would be acceptable, letting programmers exploit that fact would make it possible for compilers to do likewise, and thus produce better code than would be possible if programmers had to avoid all such situations.

As an example, consider an invitation to treat automatic objects of integer types as holding "recipes" rather than values, and its effect on something like "x=y * 2468; ... z=x/1234;" with none of the intervening operations showing any evidence of affecting x or y. If the compiler keeps track of the fact that "x" was computed as "y * 2468", it could replace the latter assignment with "z = y * 2;". This may result in x and z holding inconsistent values if an overflow occurs during the multiply, or if the value of y changes unexpectedly, but if code doesn't care about the values being inconsistent in such cases, the implementation shouldn't either. If events that would disrupt computation are rare, and the program is prepared to throw out the results from a series of computations if they are found to have occurred, the costs of computations that get thrown out may be less than the cost of using "volatile" qualifiers within a tight loop.


Prospecting for Hash Functions

Nand Xorsson

Sloshing effect = confusion and diffusion

Google 'confusion and diffusion' to find out more. I think Wikipedia has an article on it.

Confusion and diffusion is created with substitution and permutation.

Jan De Kock

I wonder what the results would be when using signed numbers, as Java only has these. More for strengthening existing hash codes. I might play around with your source code :)

Christopher Wellons

With two's complement, the bitwise result of multiplication is the same regardless of the operands' signedness. That's why Java implementations of these hash functions work just fine despite the lack of unsigned integers. Division is another matter, though, which is why it has a separate divideUnsigned() method.
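
A small sketch of that fact on a two's-complement machine (the values are arbitrary; the signed product is widened to 64 bits only to avoid undefined overflow in C):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint32_t a = 0xdeadbeef, b = 0x45d9f3b;
    uint32_t uprod = a * b;   /* unsigned multiply wraps mod 2^32 */
    /* Reinterpret as signed, multiply, then truncate back to 32 bits. */
    uint32_t sprod = (uint32_t)((int64_t)(int32_t)a * (int32_t)b);
    assert(uprod == sprod);   /* the low 32 bits match either way */
    return 0;
}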

Jan De Kock

Thanks! The rules for two's complement have fled my brain unfortunately... Time to put those on my reading list again :)

Thomas Mueller

The function with the 0x45d9f3b constants: yes, I found those, see also https://stackoverflow.com/q...

At the time I tested it, I found it is slightly faster to use the same constant twice. That was with Java, a few years ago... Not sure if that's still the case. Did you also test performance?

(By the way, I didn't test Murmur3, not sure why, maybe Murmur3 didn't exist back then.)

Christopher Wellons

Ah, thanks for clearing up! I did come across your Stack Overflow answer while researching the origin (via searching on the 32-bit constant). It seemed like it was yours, but you didn't explicitly take credit.

I did wrap my overall evaluation function with a high precision clock so that I could measure the performance if necessary. However, for x86 I expect there to be absolutely no difference in performance just from tweaking the constants (of course, outside of certain, obviously special values like 0 and 1). Performance should be exactly identical.

The C version compiles to the same code, just with different immediates, and reusing the 32-bit constant doesn't change this fact (the value is repeated). Looking more closely at those performance measurements, I'm not seeing any differences above the noise. Perhaps there was a difference in Java due to its bytecode constant pool, and using the same constant takes pressure off the pool. I have no idea what HotSpot's JIT output would look like.

It looks like the MurmurHash3 beta was announced 2010-11-04, and it was finalized 2011-03-01. That's about 19 months before your Stack Overflow answer (2012-10-21). You _do_ mention MurmurHash's finalizer and even specifically link to MurmurHash3, suggesting that _is_ the function you tested.

If I'm understanding it correctly, your hash evaluation in H2 is using the Monte Carlo method just like my estimate. However, I found that the best I could get from a Monte Carlo bias estimation with 2^18 samples was between 5 and 6 digits of precision, which is just _barely_ enough to differentiate the three 32-bit functions shown in my article (including yours). Throwing lots more samples at it was just noise. This is in line with the (disappointing) Monte Carlo pi estimates I've seen/tried. So I had to switch to the exhaustive, exact evaluation to be sure. This is what's really limiting my ability to improve on the existing 64-bit hashes.

Joshua Corbin

Some thoughts on the transition from integer hash to string hash:
- the FNV family of hash functions are, in my view, about the simplest answer to the prompt "how would you turn an LCG into a hash function?": FNV just xors a byte of input into the hash, then twists the LCG (technically it's a degenerate Multiplicative Congruential Generator, but I like to view it as a choice in Linear Congruential Generator space); see the sketch after this list
- I came by this viewpoint by, after being inspired by M.E. O'Neill's PCG construction, rediscovering how to convert a simple integer-based PRNG into a hash function; maybe you too will find some inspiration in PCG's permutation framework
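
For reference, a sketch of 32-bit FNV-1a, the xor-then-multiply shape described in the first bullet (the constants are the standard FNV offset basis and prime):

#include <stddef.h>
#include <stdint.h>

uint32_t fnv1a32(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t h = 0x811c9dc5;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];             /* fold in one byte of input */
        h *= 0x01000193;       /* FNV prime: "twist" the state */
    }
    return h;
}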

Pelle Evensen

You may want to check this out:
http://mostlymangling.blogs...

I'm using a primitive that's not on your list but that, as far as I can measure, is very fast, comparable to x ^= x shift a:
x ^= ror(x, a) ^ ror(x, b), which is invertible for 64-bit x's and a != b.

It would be interesting to see what bias you detect. When I designed it, I used an energy function that considered higher order avalanche. Here is the version I posted a few weeks ago:

#include <stdint.h>

static inline uint64_t ror64(uint64_t v, int r) {
    return (v >> r) | (v << (64 - r));
}

uint64_t rrmxmx(uint64_t v) {
    v ^= ror64(v, 49) ^ ror64(v, 24);
    v *= 0x9FB21C651E98DF25L;
    v ^= v >> 28;
    v *= 0x9FB21C651E98DF25L;
    return v ^ v >> 28;
}

Bulat Ziganshin

CRC-n on n-bit values is also reversible since it's the remainder of polynomial division by some order-n polynomial

Christopher Wellons

Interesting, I didn't realize that about CRC-n.

C. A. Ferraris

I think there are other fast invertible primitives you may be able to make use of, e.g. BSWAP.

See also http://programming.sirrida.... for many additional candidates (sure some of these primitives are fairly complex, but you already included XORL and XORR so I'm not sure where to draw the line).

Christopher Wellons

Oh, that's a good idea! I was focused only on primitives that can be expressed efficiently in portable C, so I hadn't considered it. But both GCC and Clang recognize a manual byteswap operation and emit a single bswap instruction for it. It's essentially the same situation as rotate. I'll have to try this out!
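
A sketch of the byteswap idiom in question (GCC and Clang typically pattern-match this and emit a single bswap, just like the rotation idiom):

#include <stdint.h>

uint32_t bswap32(uint32_t x)
{
    /* Portable 32-bit byte swap written in plain C. */
    return ((x & 0x000000ff) << 24) |
           ((x & 0x0000ff00) <<  8) |
           ((x & 0x00ff0000) >>  8) |
           ((x & 0xff000000) >> 24);
}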


Brute Force Incognito Browsing

a

In chromium you can simply click on the new tab button to open a new tab and search or open whatever you have in your selection clipboard.

Christopher Wellons

Yeah, that's what I end up doing. I'm mostly complaining about having to make that extra click every time I start a fresh instance. :-) Mentally it feels like starting it up should work like opening a new empty tab, though I completely understand why this can't be.

Maxolasersquad

For these cases I simply open a "Guest Window" in Chrome. It's effectively an incognito window, but acts more like a regular window.

NoonianAtall

Very handy, thanks!

NoonianAtall

BTW, the HTML of this article is not correctly structured. There is no HTML nor HEAD element. It renders correctly in Chrome, but I noticed that Pandoc can't process it, because it acts as if the whole page were in a HEAD element with an empty BODY.

Ash

That's allowed in the HTML5 spec. The Google Style Guide even recommends omitting them. https://google.github.io/st...

Christopher Wellons

Yup, exactly right. I only just learned about this part of the HTML5 spec in February, after which I removed the unnecessary elements:
https://github.com/skeeto/s...
If you wanted to run my articles through Pandoc, you're better off working directly from the Markdown source. You can find it in that repository.

NoonianAtall

Well, I'm using org-web-tools, which downloads the URL and passes it directly through Pandoc. I can capture any web page's content into Org by copying the URL to the clipboard and pressing "<f1> w r". (It also makes use of "eww-readable" to extract only the "readable" portion of the page.) Leaving out those tags breaks Pandoc, which breaks that simple workflow.

It also inevitably breaks many other HTML-consuming utilities which don't expect major structural tags like <html> and <head> to be left out, and those kind of utilities tend to not be updated for years, if ever. I can't even find any issues on the Pandoc tracker about omitted or optional HTML tags. If I posted a bug, it could be months or years before anyone fixes it, and then months or years more before the new Pandoc version hits Debian stable. (I finally had to give up and install a binary package directly from Pandoc, because the version of Pandoc in my Ubuntu system is so outdated.)

I've been doing HTML for over 20 years, and while I don't keep up with all the latest stuff anymore, this is the first I've heard of leaving out <html> and <head>. I don't think the breakage is worth it. What's gained, other than complying with Google's arbitrary style guide? If it breaks simple workflows with simple tools for Emacs users, is it worth it? Having to go dig out a markdown file from a git repo is a lot of extra trouble. :(

EDIT: Upon further investigation, it appears that it's not Pandoc that's the problem, but, probably, shr-dom-print in Emacs. After parsing the HTML to a DOM object, since the original HTML lacks <html> and <head>, shr-dom-print puts the whole page in a <head>, so Pandoc sees an empty body. And to be quite honest, I'm not interested in digging into shr-dom-print, fixing the bug, and submitting a patch to emacs-devel.

This is an example of the kind of stuff that breaks when basic expectations are not fulfilled. I think it's a very bad idea for those tags to be made optional in HTML5. Please consider replacing them.

Ash

Back when I needed something similar, I installed a copy of my browser from PortableApps.com, did all the first time configuration, and put the installation into a git repo.

To reset the browser back to its completely clean state, it was as easy as running a .bat file containing:
git reset --hard
git clean -fdx

Granted, this is Windows-specific, but something similar could be done in Linux.

Anon

Nice. If you want to treat the environment variables as untrusted, better switch the quote types in
trap "rm -rf -- '$TEMP' " ...
to protect from injection :)

Christopher Wellons

It's not so much that I don't trust the environment, but rather that I want the script to not choke on a weird profile path if possible. You're right that, as I've written it, it would choke if the path contains a single quote. Inverting the quotes on that line should solve it without really complex double-quote escaping.

At some point I got into the habit of writing traps the way that I did since it was necessary for the variable's current value to be captured in the trap string, rather than expanded later when the trap was executed. In this case that's not necessary.

l0b0

Neat! Trapping `EXIT` should be enough, at least in Bash (tested with Ctrl-c and `kill -TERM`).

Christopher Wellons

Hmm, looks like you're right. I tested with dash, too. If a SIGINT or SIGTERM causes the shell to exit, then it will already trap on the resulting EXIT. By trapping on the signals, I'm handling them, preventing them from causing an exit, which is kind of the opposite of the effect I want. (Note: Hitting CTRL-C while the browser is up causes the browser to get the signal rather than the shell.)
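
Putting those observations together, a sketch of the revised cleanup line (single quotes defer expansion of $TEMP until the trap fires, the inner double quotes protect odd paths, and EXIT alone covers signal-induced exits):

trap 'rm -rf -- "$TEMP"' EXIT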

Anonymous Coward

If you really want to remove your temporary folder, you can also use `shred`. It will overwrite the contents of your files instead of just deleting them in the filesystem index.

Christopher Wellons

Unfortunately shred is useless on modern systems, for both software and hardware reasons. Your best bet is to only store data encrypted, and then to destroy that data, you destroy the key. In this context, that means putting the profile in a temporary, encrypted volume.

Software: Filesystems are more complex today, and there's no reliable way to overwrite any particular underlying storage through the filesystem interface. Journals delaying writes are a problem (mitigated with fsync), but really advanced filesystems such as ZFS and Btrfs actively avoid overwriting older data (especially with snapshotting).

Hardware: SSDs add another layer to the problem with wear leveling and such. Not even the operating system can guarantee a piece of data has been completely overwritten on physical storage. (Though this means the operating system probably can't access it anyway, which is worth something.)

abigail buttox

one remark: EXIT in trap is a bash feature. Change it to '0' to be portable.

Christopher Wellons

According to this documentation, EXIT is part of POSIX trap:

http://pubs.opengroup.org/o...

You are correct in that I really don't want any bash-isms in my script.

possiblywrong

Interesting post. I may misunderstand the sentence, but I think "intentionally unpredicate" is supposed to be "intentionally unpredictable"?

Christopher Wellons

Oops, you're right! Fixed now. Thanks!
https://github.com/skeeto/s...

Roland

Controlling this stuff isn't that difficult once you set it up:
Prefs=>Privacy=>Custom Settings, check everything except "Always...", 3rd party cookies never. Close. Now hit Ctrl-Shift-Delete, you should get a box. Set the timespan to forever, check as many boxes as you want/can. You have fine control.

You should hit Ctrl-Shift-Delete frequently during the day, especially before and after doing anything financial. Now that wasn't too hard, was it?

Christopher Wellons

Your instructions achieve a little bit of what I need, but miss a number of key features. There's more to my script than just clearing the basic browser state.

First, I don't necessarily want to destroy my main browser state just to test something. The whole point of this is to get a browser instance that's completely isolated from my primary browser.

Second, I mentioned in the article that, with my script, I can run multiple instances of private sessions in parallel, each isolated from each other. That's really useful for comparing between instances where I've intentionally diverged state. Destroying the state in my main browser doesn't help at all with this.

Third, my script gives me a fresh browser instance without extensions. Some extensions interfere with websites even when they're not supposed to. That's really bad for testing. For example, for a long time now if you visit Eric Raymond's blog in Firefox with uBlock Origin installed, it will have a horizontal scroll bar. This happens even when the extension is disabled. I haven't determined if this is a uBlock Origin bug or a Firefox bug (it doesn't happen in Chromium). Regardless, this reduces my confidence in Firefox to properly isolate itself when I need it to.

Further along this tangent, Ctrl-Shift-Delete doesn't clear extension state. I don't know what state might be leaking through extensions. This would be a very difficult problem to solve -- exactly what state should get wiped from each extension? -- and would rely on each extension's developer to get the details right.

lukaslt

Instead of creating a fresh session, would it be possible to use a “template”?
Start a fresh session, do the initial setup (close initial browser tabs, install required plugins, etc).
Then, when starting a new "private" session, copy the template to the given random folder, i.e. use it as a basic snapshot.
Or is there any risk of browsing this way?

Christopher Wellons

Using a template is a really good idea. I'll have to try it out. I'll want both since sometimes I won't want to use a template, such as when I need a truly pristine configuration. It's useful when reproducing an issue.

Jarbel Menucem

Firefox Containers solves your problem


From Vimperator to Tridactyl

NoonianAtall

Thanks for the detailed article, Chris. I was a Pentadactyl user for a long time. With the destruction of XUL, and gradual "Chromatization" of Firefox, there's less and less reason to use it over Chrome or Chromium. That reasoning has always fallen on deaf ears at Mozilla, of course. And after reading things like https://www.reddit.com/r/li... and https://blog.mozilla.org/da... , I'm not sure if Firefox is even better from a privacy perspective.

I'm glad that some people have the energy to work on Tridactyl, anyway. For me, the issue of it not working until after a page has loaded is a deal-breaker; that kind of inconsistent, dependent behavior would drive me nuts. I hope they're able to convince Mozilla to allow them to fix that, but I'm not holding my breath.

I almost feel like we're headed for another dark age of Web browsers, and it seems like the chances of coming out of it are less than last time because of the complexity issue. I don't know if a phoenix can rise from these ashes again. So I find myself using Elfeed, w3m, eww, org-web-tools, etc. more. I use uBlock and uMatrix, of course, but I feel less interested in customizing my browser like I did with Firefox/Pentadactyl (I even released the TabTint Firefox extension, now consigned to the dust bin). I feel like my effort would be wasted, whereas with Emacs, Org, etc, I feel like the improvements I make will last much longer.

Hey, NetSurf keeps chugging along, and I think it even supports JavaScript now. Maybe someday it will be viable, haha.

Christopher Wellons

Despite my optimism with Tridactyl, I too am worried we're still headed for a web browser dark age. Users are, in general, _far_ too tolerant of awful performance and interfaces. (That's much more the fault of app and web developers rather than browser developers.) Further, the complexity of the web has led to centralization, which is ripe for abuse.

Until recently, I thought the end of XUL would play a much bigger role in this than it did. The reason that it didn't is not good news, though: Firefox is quickly becoming irrelevant. Today, most web browsing happens on mobile devices, and Firefox's mobile market share is half of a percent. Even on the desktop it's under 10% and still plummeting.

A big reason behind Firefox's "Chromatization" — not just with dropping XUL extensions, but also, as a side effect, the overall gradual reduction of customizability, including the removal of useful about:config entries — is a desperate attempt to remain relevant. But you're right, what's the point of using Firefox if it's just like Chrome? For me it's still different enough, but for most users it isn't.

Aurélien Gâteau

Nice article! I need to give Tridactyl a try!
Regarding keyboardless browsing, I find Firefox context menu surprisingly efficient. I use it a lot, especially for going to the previous page.

Visitor

You may like to try Surfingkeys https://github.com/brookhon.... I'm an Emacs user and use Emacs default key bindings. After using Surfingkeys for a few months, I think I may switch to the EVIL side.

jemin

How do you set up search key now that the earlier version was deprecated?

searchsetkeyword

searchsetkeyword(keyword: string, url: string): void

Defined in src/excmds.ts:2819

Set a search engine keyword for use with *open or set searchengine

deprecated

use set searchurls.KEYWORD URL instead

jemin

Ah, actually it is the reverse. The setting you show is actually the current one.

Still, .tridactylrc does not seem to be picked up by Firefox. Setting it in Tridactyl's command prompt works, though.

Ramiro Rela

Did you ever get to use Muttator on Thunderbird? It's the same project as Vimperator. I lost it and don't have a replacement.

Christopher Wellons

Sorry, I just use Mutt itself for email, and I don't have any use for Thunderbird. I don't expect I'll ever try Muttator.


The Missing Computer Skills of High School Students

Антон Южанинов

It is definitely better to touch type, but I don't think it is a big issue. I learned to touch type about a year or two ago, after many years working as a sysadmin/developer. And I still type only marginally faster than before. I was able to type fast looking at the keyboard and using only a few fingers. If you type a lot you will type fast even without touch typing.

typer

I have something to add to your discussion of "students have poor typing skills".

Most people, at least, me, and the people I know, generally do not learn how to type from a "type training" computer class (unless word processing in an office is one's day job). Of course in the very beginning, when you just started to use the keyboard, you need some basic training and practice by using various software, but **real typing skill is learned by actually using the computers day-by-day** - on surfing the web, chatting, gaming, programming, or editing an actual document.

In this process, people can often learn their own way of typing, which "just works" for them, despite its inefficiency compared to touch-typing in the standard finger position. For example, I can type at 50 WPM minimum by just pecking two fingers on the keyboard, and even faster when I use more fingers. I barely need to look down at the keyboard, I only do it to readjust my hand position, but most of the time, I can do this by using the text I just mistyped on the screen as my feedback. Many people I know who are not touch-typers, have naturally acquired similar typing skills.

Just in the same way, when the cool kids in the 2000s were using SMS to text all day, they learned to type at an unbelievable speed on the 3x3 numpad phone keyboard, which I see as a form of torture.

In conclusion, IMHO, touch-typing in the standard finger position is ABSOLUTELY NOT REQUIRED to be a successful computer technician. As long as you can type at a reasonable speed so it doesn't become a significant bottleneck of your human-machine interface, it is okay. It is good to learn the standard way to improve your efficiency, but often it is optional.

On the other hand, I think the real underlying cause of the "poor typing skills" problem is exactly the same "poor computer knowledge" problem you are addressing; they are closely related - if one is struggling to type words and commands on the keyboard, e.g. to use a command line, it often indicates A LACK OF GENERAL EXPERIENCE IN USING COMPUTERS, and this is the real problem.

Many people from the younger generation simply don't have the experience of surfing the web, posting in online forums, writing one's own blog posts, or chatting all night long, or even gaming, ON A GENERAL-PURPOSE COMPUTER, which was exactly how the older generation learned to type, or learned the basis of computing, at least the application.

As general-purpose computers are becoming less and less common, this is going to be the new norm.

Dave Delaney

As a dad this hits home for me. I especially want my kids to learn the essentials. I need to do a better job at getting them participating. It would probably help to buy them machines they can poke around with. I wonder what percentage of homes now exclusively have mobile devices/tablets without PCs. If it's as high as I think, your 80s analogy is spot on.

Any tips on the best machines for them to learn on? Maybe I need to get them a TRS-80 or C64, like I learned on.

OllieJones

For the typing, IT, and academic skills: previous-generation laptops back from corporate lease. They're tough, and they're cheap. Ebay.

For computer skills in general: Previous generation minitowers. Kids can take them apart, add components, and so forth. If you can possibly make the kids use wired ethernet to connect, that will help them get a concrete conceptual model of the intertoobz. Then, when one of them has $20 from mowing a lawn or something, she can buy her own wifi dongle.

Or, as Chris wrote, a Raspberry Pi starter kit, a monitor, a mouse, and a GOOD keyboard. Skimp on anything, but not on the keyboard.

Dave Delaney

Great call! Thanks, Ollie.

Christopher Wellons

I'd just put the kid in front of a Linux system either on a spare machine or a Raspberry Pi (clone or not). Something where you could give them enough freedom to break it (unintentionally). If you want to go old school, the Z80 paired with the excellent book Programming the Z80 would be pretty exciting:

http://www.z80.info/zip/zak...

I would have loved to have that when I was a kid. I've never tried to teach someone using this book, though.

OllieJones

Thanks for the post. It's good to be reminded what's obvious and what isn't.

There was a time when US high schools offered classes in typing. When I took one (I'm not saying how long ago) it included some basic layout skills. But the meat of it, on Royal manual typewriters, was drills:

a;sldkfjgh

repeated enough times to get the finger positions into muscle memory. A semester of that made a huge difference to my work life and my ability to create. I'm hard on keyboards because I learned to pound. But they are cheap.

For a while Apple had the pole position in computers for young students. But, when they started pushing tablets, Google and the OEMs started pushing Chromebook mini-laptops. The school system where I volunteer has ditched Apple and gone with Chromebooks, BECAUSE THEY HAVE KEYBOARDS. It helps that they're robust, cheap and easy to manage.

But the schools are not doing a good job teaching kids to USE those keyboards.

The vocational school kids learn their trades by practicing muscle memory: culinary students cook, electrical students wire things up, and so forth.

Knowledge work is a vocation too. The core skill for knowledge work is TYPING. You're a programmer: you type for a living. You're an insurance worker: you type for a living. You're a librarian, you type for a living.

You go to college: you type, and type, and type.

Endless typing drills are boring. But the benefits over a lifetime are vast.

Schools! Do this!

Mike Zamansky

I don't know if it's fair to blame the schools. I think it's more the society we're in and the technology we have.

I wrote about this a couple of times:

https://cestlaz.github.io/p...

https://cestlaz.github.io/o...

Christopher Wellons

Fair enough. I suppose the root of the problem goes deeper than schools. Thanks for the links! I hadn't yet seen the "A new digital divide" article linked from your article.

Chris Gonnerman

I have worked with general users on computers for my entire career, basically since 1984, and one of the things I've found to be true is that people either "get" paths or they don't. I often ask users, if I gave them a flash drive and told them there was a file in a folder on that drive, and gave them the names of both, could they find it? Almost all of those who answer yes tell me they don't understand what's hard about it; practically none of those who answer no ever figure it out. Age and gender don't seem to have any effect, though I have not actually done statistics. It seems to be a kind of abstract thinking that not everyone has. Curiously, I find accountants (and accounting students) mostly get it. Nothing more abstract than a page full of numbers, I suppose. And general intelligence isn't an indicator either... my favorite (now retired) doctor never got it, and I have great confidence in his intelligence. But everything in his world is real and physical; there's nothing abstract about a heart or lungs or whatever.

possiblywrong

I know we've talked about this, but I think it's worth commenting on some additional good observations here: I tend to agree with the disconnect starting in the schools (and I typically have to *defend* the quality of public school education, at least in this area). I find it fascinating that there is so much current emphasis on "STEM," while at the same time a seeming lack of recognition of how fundamental the keyboard is to almost any S-, T-, E-, or M-related vocation.

David Kraft

WHAT?
How about Excel?
There isn't a human being in the workforce that shouldn't be able to add a column of numbers together.

Cutting and pasting grid data and manipulating it to your needs is essential.

Fundamental as reading and writing.

Hirox

I'm a main programmer in a major game publisher / studio. We had a wrong hire not long ago.

She's a multimedia major. She applied to the Planner and IT departments. The IT dept was full, so she got assigned to my team.
I can assure you, the team treated her like a princess, but... there's no way we can say she was useful to the team.
1. Don't know where to find installers.
2. Unfamiliar with Windows, need people help setting up her PC.
3. Deleted resources / repo by accident.
4. Definitely unfit to be a programmer.
5. etc, etc.
The school lab has PCs, but all the software was pre-installed, no setup needed, and at home she only uses an iPhone. Apparently, she's not alone in this.

Do you think IT would suit her better? Planner? Even a Planner needs to be familiar with version control, ticket systems, project management, MS Office (especially Excel), databases, and other business tools.
Law firm? Accountant?
How about canteen staff? Front desk? Arcade center staff?

Kids are innocent imo. I blame parents and school, the system.
Imagine this: parents think their kids are so smart because they know how to swipe / operate a "PC" without being taught... Yeah, that's the source of the problem. And professors at the universities failed at teaching those kids PC skills. Or, maybe, you reckon it is already too late (typing with muscle memory, for example)? When I was 13 or 14, I fumbled around and built my first PC. I broke it, then I diagnosed and fixed it. I knew what a "computer" was way before I entered university.
* Then programming became my vocation. C++ is my root. Havok took me a month to pick up because I had zero 3D programming knowledge at the time. I can start fumbling around with JS, HTML5, Python, Java, C#, Unity, Unreal, jQuery (I even wrote my own framework before jQuery's time), and other languages on day 1.

Hirox

BTW, not long ago, someone was arguing on the internet that business applications are better on iOS because it has an APP while Windows only has the WEB version. And because of that, his company is adopting iPads (I bet it was a lie).

Ticket System for example.
Say, you are on the ListView. You are about to open a ticket. What if, you don't want to navigate away from the ListView?

Most "APP", shows 1 view at a time. So when you open ticket#123, ListView is gone. What if you want to go to ticket#432? Hit the back button at the top left corner (instead of alt+ left). Wait for the ListView to reload then navigate to ticket#432.

On Windows (browser), you have tabs and windows.
Colleague came "hey, can you double check ticket#432 with me?"
"Sure" you then alt+d, ctrl+c, ctrl+n, alt+d, ctrl+v, end, ctrl+backSpace, 432, enter.
What if you want to compare 2 tickets side by side? win+right, alt+tab, win+left.
What if you want to jump between tabs? ctrl(+shift)+tab or, ctrl+[numKeys].

How do you correct typo?
left, menu, enter?

Some tools lack pro features/shortcuts, and you cannot hack them or inject scripts to enhance your workflow on phone OSes, nor can you create automations to speed up your workflow.

How efficient can you work on a phone OS?

oso2k

Chris,

Something I've started using with my daughter is "Mario Teaches Typing for DOS" as hosted on the Internet Archive. Simply google for "Internet Archive Mario Teaches Typing".

Christopher Wellons

That's a good idea. I've got an old DOS copy of Mavis Beacon for similar reasons. Works great. I've looked into the open source options since I generally prefer it, but they're just not good enough to recommend to anyone.

oso2k

Same here. :/ Educational software is an area where FLOSS is far behind closed source.

4thGenBendite77

skeeto but


A JIT Compiler Skirmish with SELinux

ttsiodras

Excellent story - I learned a lot!

Thanks for sharing.


Why Aren't There C Conferences?

Jon Kalb

Chris,

How would you feel about making this into a play list for the CppCon YouTube channel?

Let's make that happen.

Jon

Christopher Wellons

Good idea, Jon. I just put these together:
https://www.youtube.com/pla... https://www.youtube.com/pla...
I almost didn't do it since the YouTube UI is so ridiculously awful, and I was terrified at the idea of manually adding 42 videos through that interface. Fortunately I found this nice little hack and it only took a few minutes:

https://webapps.stackexchan...

(I did get your email about this. I've just been neglecting my email lately.)

Jon Kalb

Do reply by email and we'll discuss hosting a version of your blog post on https://cppcon.org and putting the playlist on our channel.


A Survey of $RANDOM

John Doe

You can find the code of ksh86a, with the RANDOM variable already present, here:
https://github.com/weiss/or...

The version number is here: https://github.com/weiss/or... : ksh-06/03/86a

Also, the RELEASE file for this version (https://github.com/weiss/or...) references the list of changes introduced since the previous version (ksh-02/21/85). It *does not mention RANDOM*, but it does mention, for example, the addition of the special variable SECONDS, the change of behavior for the variables PWD and _ (underscore), and the bugfix for the variable PPID. So, I guess it is safe to assume that the RANDOM variable was already present in ksh-02/21/85.


A JavaScript Typed Array Gotcha

Klaus

Gotta love MATLAB XD

(See link at end of article)

Now I wonder if the behaviour is consistent in Octave.

WebReflection

Alternatively: `let r = (++array[0], array[0]);` 🎉

Christopher Wellons

Clever!


The Day I Fell in Love with Fuzzing

César Rodríguez

Did you consider using a dictionary for your input format? I'm sure you can improve the coverage of your tests by using one. More in section 9 of http://lcamtuf.coredump.cx/....

Another point is that updating the code doesn't render useless all AFL-generated tests. You can always use them as seeds in a new fuzzing session on the updated source code. The coverage you already achieved with them in the previous session will save you time for the next one.

AFL is great if you want to fuzz a C program, but for managed languages (e.g, Java) there is practically nothing available. We are trying to put an end to that: https://www.diffblue.com/bl...

Christopher Wellons

Thanks for the tips. I'm aware of afl's support for dictionaries, but, since it's just INI, the input has no keywords. Some strings may be special to the game, but my parser only knows about INI. All the special parts of the syntax (brackets, commas, quotes, etc.) are single bytes, which, unlike long keywords, are not difficult for afl to discover on its own. It even very quickly figured out floating point syntax — though without instrumentation on strtod() there wasn't much more for it to explore.

That's also a good point about seeding the next corpus from the current test set.

Philippe Bourgau

Thanks for the article. My team thought of similar tools when doing exploratory testing, and I'm really interested to see that there is some theory behind this technique.

We were doing exploratory testing, through the UI, which has some pros and cons compared to fuzzing. We would use gamification to spice things up a bit and make developers want to break their program! (https://philippe.bourgau.ne...)

I guess fuzzing would be a handy tool when doing exploratory testing.

Qqwy

What is your opinion about Property-based testing? When would you use fuzzing vs propchecking?


The CPython Bytecode Compiler is Dumb

Kurt Jung

> ...in this case for my wife’s photo blog

Stunning bird photographs, really exceptional. And nice work on that responsive static album generator. Now time to continue reading about CPython byte code...

Christopher Wellons

Thanks, Kurt! I showed her your comment.

D ZJ

Optimization passes obviously take time, so Julia does the compilation once and stores the compiled function. I guess some implementation of Python could try a similar route, where the first time a function runs it is compiled with optimization passes, but that changes the whole language. Julia, on the other hand, can suffer from long compilation times if the programs are complex.

Maybe give Julia a try and see if you like it.

Christopher Wellons

I did look into Julia about five years ago:

https://nullprogram.com/blo...

I thought the FFI API was super-slick and it inspired me to experiment with an Emacs Lisp FFI. My complaint about modules was too harsh. I don't really care much about inferior-process interactive development anymore, so that part doesn't really matter to me. I still think its string indexing is weird and ill-suited for its intended target audience.

I don't personally have a need for Julia, but I do have a bunch of Matlab-addicted co-workers. My exploration of Julia was more about finding an alternative that I could sell to them so that they could kick their expensive and unhealthy habit. I would happily use Julia instead of Matlab when collaborating with them. After all these years still no luck with either Julia or NumPy + Matplotlib.

dave

Obviously there's not much value in optimizing specific cases like 'return [1024][0]' because that would rarely if ever occur in real code. But the problem is that nearly anything more general than that likely couldn't be optimized, because dynamic typing, operator overloading, runtime eval support, etc. all combine to make it all but impossible to know ahead of time what the actual behavior will be.

So yes, you could optimize the specific case of a builtin array object being created explicitly and initialized with a single explicit int member, followed by an array access with an explicit numerical index. But with even small differences from that exact set of conditions you can't really optimize anymore: if you use a class constructor instead of the array syntactic sugar, or a variable that may not point to an int instead of an explicit int for the initial value, or a variable that may not point to an int instead of an explicit int for the array index, then it may no longer be safe to take a shortcut.

Christopher Wellons

Yeah, you'll never see "[1024][0]" in real code. I wrote it that way to avoid introducing a variable, which, from the previous section, I already knew would inhibit modification. Consider a function that references a dict as if it were static data local to that function. The data never changes, so it doesn't really need to be reconstructed every time the function is called.

table = {'a': 1, 'b': 2, 'c': 2}
return table[name]

This isn't uncommon. The bytecode compiler does the "dumb" thing and rebuilds the dict each time. (This is exactly the sort of thing escape analysis is for.) If I want to avoid this penalty, I have to arrange for it myself, like you were saying (global variable, closure, class member, etc.).

possiblywrong

Very interesting post, particularly the comparison with other language compilers. Python strives to be a language that "you can keep in your head," not having nooks and crannies and unexpected corner cases either in its language specification or in its implementation. Although I think it does a decent job of this (or at least it used to, but just like C++, the spec is getting more complex), your puzzle involving the tuple modification (and the constant optimization discussed here) was an interesting unexpected gotcha for me.

I'm reminded of another gotcha presented by a student, who while learning to program had gotten in the habit of always testing boolean conditions by explicit equality comparison with the literal True... with unexpected results that made me stop and think. For example:

x = -1
if x < 0 == True: print('Negative')

Christopher Wellons

Your example was really puzzling. I was confused for a while, but I think I figured it out thanks to something you pointed out a couple years ago.
I looked up Python operator precedence to see if there was anything funny going on. I did notice that Python gives the relational and equality operators equal precedence, which is unusual (but has a good reason). Most languages, including C, give relational operators higher precedence in order to avoid this sort of confusion.

However, in this case, having equal precedence wouldn't hurt since it's left-to-right and the relational operator came first. So that's not it. Next I looked at the bytecode and walked through it, which is when it hit me: Python does the conventional mathematics notation thing with "a < b < c". This is just a case of it in action.

Something else I learned from the bytecode is that "a < b < c" is short-circuiting. You can count on it stopping evaluation as soon as it knows the answer.
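
For contrast, here is the C behavior mentioned above, where relational operators bind tighter than ==, so the very same condition takes the opposite branch. This is only a small standalone illustration, not code from the article:

```
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    int x = -1;
    /* In C this parses as (x < 0) == true, i.e. 1 == 1, so the message
     * prints. Python instead chains it as (x < 0) and (0 == True),
     * which is false, so nothing prints there. */
    if (x < 0 == true) {
        puts("Negative");
    }
    return 0;
}
```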

shevek

One more interesting example of the same gotcha is:
False == False in [False]

kallikanzarid

At least when you're dealing with NumPy, you can use Numba. It JIT-compiles functions that you mark with a special decorator, allowing you to create arrays and loop over them without paying for the interpreter overhead. It's useful for writing kernels on Kaggle or just prototyping any numerical code where you have a bottleneck that you don't want to rewrite in C.

https://numba.pydata.org

Christopher Wellons

Didn't know about Numba, thanks!

shevek

@skeeto:disqus
Very interesting post, thank you :)


Python Decorators: Syntactic Artificial Sweetener

Clément Pit-Claudel

> but I still wonder why it was defined that way.

The PEP explains it, actually: 'The decorator statement is limited in what it can accept -- arbitrary expressions will not work. Guido preferred this because of a gut feeling [17].' The link is to https://mail.python.org/pip...

Christopher Wellons

Ah, thanks, I didn't notice that. Looks like I'm 15 years behind on this.

Henry Chen

Illuminating article! Not suggesting this is a good idea, but a possible way to circumvent the restricted decorator grammar could be a function

def do_nothing(x):
  return x

which wraps some expression:

@do_nothing(lambda f: f)
def f():
  pass

Christopher Wellons

Oh, clever idea! It seems so obvious now.


An Async / Await Library for Emacs Lisp

Nance

You might be interested in trio.rtfd.org, which I am optimistic will become the future of async programming in python. For a more theoretical perspective on the concept of "structured concurrency": https://vorpus.org/blog/not...

Christopher Wellons

Very interesting article, Nance, thanks. In playing around with my little library, I was noticing this pattern myself. If I created a promise but never awaited on it, errors were silently swallowed up in the promise and lost. It's confusing and risky, and it's an indication that I probably shouldn't do that. Having an explicit nursery context awaiting on everything solves this problem.

The diagrams in the article are really useful, too. It's obvious that a "goto" is half of a function call: It's a primitive on which to build functions as a proper abstraction. In the same way a "go" is just a primitive used in constructing a nursery, and you need the second part, the join, in order to have a complete abstraction.

JDS

This looks very interesting. Currently I'm struggling with implementing a completion framework for a mode where a trip to an external process is required to calculate completions. Sadly, completion-at-point-functions (CAPF) provides no callback interface. What would be fantastic is to 'await' the external process' work in the CAPF function, then continue on, returning a set of completions to caller. The caller would itself never know an asynchronous excursion occurred. Possible with aio?

Christopher Wellons

In order for aio to do its job, the event you're waiting on must relinquish the thread — it must truly be asynchronous. Otherwise it's simply not possible. If Emacs threads were more useful, you could work around this by putting the work on a thread, but they're not quite there yet.

JD

It is an asynchronous process, but the completion framework itself is fully synchronous. So I suppose there is no way to pause it without blocking. I guess you need aio functions "all the way up" the stack to make this possible. I.e. inserting an aio-defun somewhere down the call stack cannot simulate a "paused" function.


Some Galois Linear Feedback Shift Generators

(no comments)

Endlessh: an SSH Tarpit

Charlie

OpenBSD spamd (for SMTP) actually has a 2-second delay between sending each character. Perhaps their code might be useful?

In the first years that I ran it, I had a few threatening messages from spammers who thought that I had personally maligned them with it.

Christopher Wellons

I'd heard about that OpenBSD spamd feature before but had forgotten all about it. Glad to be reminded, thanks! I'm surprised spammers took the time to threaten you over it.

Ishwor Gurung

I see where you are coming from on developing `endlessh`, but it raises the question: why?

Even on a 1Gbit link that gets DoSed at close to line rate, I can safely deploy internet facing servers that *ban* IPv4/IPv6 noise for a prolonged period using fail2ban (which uses iptables and log parsing to drop the packets). It has served me well thus far.

On 10Gbit+ links, that logic needs to be moved closer to the NICs (eBPF) and/or ASICs to bypass kernel processing of packets, if you so desire.

Is there something I am missing?

Christopher Wellons

On the server where I'm currently running it, port 22 is constantly bombarded with SSH brute force attempts. I hope my tarpit slows down these attackers by tying up some of their resources for a time, at practically no cost to myself. Presumably they're committing to N attempts in parallel (threads, processes, etc.) spread across different targets. N is likely based less on network bandwidth and more on some other resource (available memory, CPU cores, etc.). My tarpit will trap some of those attempts and effectively reduce N for a time. Overall I'm making SSH brute force attempts a little more expensive.
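
For the curious, the tactic itself fits in a few lines of C. This is only a toy, blocking, one-client-at-a-time sketch of the idea, not the actual endlessh source (which multiplexes many clients at once); the port number is arbitrary:

```
/* Toy tarpit sketch: RFC 4253 lets an SSH server send arbitrary banner
 * lines before its version string, so just drip random "banner" junk
 * forever. One victim at a time, blocking I/O, no error handling. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    signal(SIGPIPE, SIG_IGN);  /* a hung-up client must not kill us */

    int server = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(2222);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(server, (struct sockaddr *)&addr, sizeof(addr));
    listen(server, 16);

    for (;;) {
        int client = accept(server, 0, 0);
        for (;;) {
            char line[32];
            int len = snprintf(line, sizeof(line), "%08lx\r\n",
                               (unsigned long)rand());
            if (write(client, line, len) < 0) {
                break;          /* the client finally gave up */
            }
            sleep(10);          /* the whole point: tie the client up */
        }
        close(client);
    }
}
```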

Stanislas

I didn't know about tarpits! Thank you for sharing.

FRex

I wonder if a compression bomb or other kind of malicious reply could be used to waste some CPU and RAM.

Christopher Wellons

For SSH you'd need to do this later in the protocol, assuming the server can force compression to be enabled (can it?). But I know for sure you can pull it off quite simply with HTTP. That's exactly what Neal Krawetz did here to defend his hidden Tor service against malicious bots:
https://www.hackerfactor.co...

FRex

It was just a wild idea since we are already into "frustrate attacker/waste his resources" territory. I wasn't sure if there even was compression in SSH and quick googling just got me to server side stuff, like 'yes' (enable) vs. 'delayed' (enable only after auth to not expose server to attacks via compression algo).

mmorgan27

I was running your program and discovered, to no one's surprise, that most of the probes are coming from China. I was considering putting some rules in my firewall (for fear of attracting additional attention due to the tarpit), but the probes come from too many subnets, despite almost all being from China. Then I remembered a post on Reddit the other day about how these jokers would send messages about Tiananmen Square in in-game chat and the Chinese player would be booted. Probably just a joke, but still funny.
So I modified the program to send back messages that are likely on the blocked keyword list for the Great Firewall of China. Perhaps the problem will be taken care of by the Chinese government putting my server in their firewall rules. If nothing else, I think it's pretty funny that these guys are now downloading political dissident messages.

SRG

This is brilliant!

I've just installed your Dockerfile (C version) image, and it's working fine.

Of course, I fear that in the end it's going to be taken into account by attackers (who will add extra timeouts to their scan procedures and auto-avoidance of servers running this kind of tarpit), but it's still a nice first layer of defense for now.

Jim P

I have also moved my sshd to a different port. For some time, the logs were very quiet. Last month, however, something must have discovered the proper port and I've been fielding failed logins about once a minute, which led me here. I've noticed, with verbose settings, that the attacking clients will hang up after the second line gets written out, so I've had to extend the timeout to 1000 seconds in order to keep clients trapped.


Fibers: the Most Elegant Windows API

Jon Forrest

Extremely minor typo:

"It’s not quite and apples-to-apples comparison" -> "It’s not quite an apples-to-apples comparison"

Christopher Wellons

Oops! Fixed now, thanks! I gave the whole article another pass and fixed some other issues, too.

possiblywrong

This was really interesting, I wasn't familiar with this API at all. How does the "equivalent to the process main thread returning from main()" work? That is, if a fiber exiting (with void return) can shut things down like this, then what integer gets returned from int main()?

I wonder if you could get around the WaitForMultipleObjects() count limitation by using WaitForSingleObject() instead, with some extra mutex and condition variable wrapping? In other words, effectively implement WaitForMultipleObjects() "in terms of" WaitForSingleObject()?

Christopher Wellons

Based on experiments like my "coup" example, the process will automatically exit with status 0. However, that's not the whole story, and your question prompted me to dig a bit further. Normally the C runtime also does cleanup when the main function returns, such as flushing output buffers and running atexit() routines. What actually happens depends on the toolchain and the host.

When compiled with Mingw-w64 and run on Windows, it seems the C runtime exits cleanly when any fiber returns on the main thread. However, when run under Wine it either crashes (older Wine), or it flushes but fails to run atexit() routines (newer Wine).

When compiled with MSVC, the C runtime doesn't exit cleanly when a fiber returns on the main thread, and neither flushing nor atexit() occurs. I can't really say I'm surprised by this.

There isn't any clearly documented behavior, but Mingw-w64 is the most conscientious. There's an uncertain interaction between the host and the C runtime here. The host implements CreateFiber() and the CRT supplies the C cleanup routine, but there's no documented way for them to coordinate. It would be interesting to explore this further.

Christopher Wellons

I replied then realized I only addressed the first paragraph! Here's my reply for your second paragraph.

You're on the right track. We'd need some way to group multiple handles under a single, new handle, creating a kind of tree of handles where no level has more than 64. This can be done with a thread pool, where each thread waits on multiple handles, and the non-leaves wait on other threads in the pool. In fact, this is what RegisterWaitForSingleObject() is for.

There's an efficiency benefit to this: Only one path down the wait-tree to the signaled handle is woken up, so only waits on handles along that path need to be restarted. If I'm thinking about this properly, for about a million handles the wait-tree would be log64(N) = 4 layers deep (not bad), and you'd need around N/64 = 16,000 threads (not so great).
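
A rough sketch of the grouping idea, flattened to two levels and using plain worker threads rather than RegisterWaitForSingleObject() (all names here are invented, and cleanup and error handling are omitted):

```
#include <stdlib.h>
#include <windows.h>

/* Sketch: wait for *any* of more than 64 handles by splitting them into
 * groups of at most 64. One worker thread per group blocks in
 * WaitForMultipleObjects() on its slice, then signals a per-group event;
 * the parent waits on the (at most 64) group events. Workers for the
 * losing groups keep waiting, so their state is heap-allocated and
 * intentionally leaked here. */
struct group {
    const HANDLE *handles;
    DWORD count;            /* <= MAXIMUM_WAIT_OBJECTS (64) */
    HANDLE done;            /* signaled when any handle in the group fires */
};

static DWORD WINAPI wait_group(LPVOID arg)
{
    struct group *g = arg;
    WaitForMultipleObjects(g->count, g->handles, FALSE, INFINITE);
    SetEvent(g->done);
    return 0;
}

/* Blocks until any one of nhandles handles is signaled (nhandles <= 64*64). */
void wait_any(const HANDLE *handles, DWORD nhandles)
{
    HANDLE done[MAXIMUM_WAIT_OBJECTS];
    DWORD ngroups = 0;

    for (DWORD i = 0; i < nhandles; i += MAXIMUM_WAIT_OBJECTS) {
        struct group *g = malloc(sizeof(*g));
        g->handles = handles + i;
        g->count = nhandles - i;
        if (g->count > MAXIMUM_WAIT_OBJECTS) {
            g->count = MAXIMUM_WAIT_OBJECTS;
        }
        g->done = CreateEvent(NULL, TRUE, FALSE, NULL);
        done[ngroups++] = g->done;
        CloseHandle(CreateThread(NULL, 0, wait_group, g, 0, NULL));
    }

    WaitForMultipleObjects(ngroups, done, FALSE, INFINITE);
}
```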

Zafer Balkan

It's the first time I have ever seen the Fiber API. I was porting a library that uses libco (a coroutine library) to Windows and trying to find an equivalent. After seeing your post, the alternatives are a cross-platform library like libfiber or Windows-native fibers. Thank you for your enlightening demo as well. By the way, the library I mentioned, libfiber, might be helpful since it works on Windows too.


Looking for Entropy in All the Wrong Places

FRex

Noticed small typo 1/3 way through in sentence "The more pressing issue is that time(3) as a resolution of one second." the "as" seems like it should have been a "has".

It surprised me that all this work of looking for entropy in unusual places (using tmpnam like that was very neat) went to seeding rand() (which at the start was said to be low quality, of unknown algorithm, and even slow) instead of bringing along some random number generator written in portable C and seeding that.

Christopher Wellons

Fixed, thanks!

Anonymous

What about getpid and getppid? There's also the raw value of a FILE*, although if you can't open /dev/urandom that doesn't help. And as a last-ditch effort, there is of course always uninitialized memory/registers. Although if you're that starved for entropy, maybe it's better to give up than to attempt to spit out something that might actually be horribly insecure.

On newer-model x86 processors, there's also RDSEED, and this is coming in ARM v8.5-A too.

Christopher Wellons

The purpose of the exercise is to gather entropy using only strictly conforming C. Since getpid() is part of POSIX, not C, that's off the table, as is RDRAND since it's only accessible via compiler extensions.
I didn't use uninitialized values for two reasons. First, it is, at best, a secondary source, residual from a source that is already being sampled directly (ASLR, etc.), so it has no practical value. Second, reading uninitialized values is a bit dubious in a strictly conforming program anyway. Except for char, the value could be a trap representation.

Anonymous

Well, RDRAND is accessible through assembly too.

If you're okay with using the JIT technique you used for some other stuff, that could work. But it's probably not a very good idea for production code.

malloc() is pretty slow, so you could put that in the clock() loop to extract some entropy from the time it takes the OS to hand you a block, and also spend less time busy waiting.
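
As a concrete illustration of the kind of mixing being discussed here (this is not the article's code, and it is not strictly conforming either, since it casts a pointer through size_t), one could fold a few of these sources into a single seed:

```
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Illustrative only: fold wall-clock time, CPU time, a malloc() address,
 * and a tmpnam() result into one seed. The mixer is an ad-hoc avalanche
 * step, and none of this is cryptographic. */
static unsigned long long mix(unsigned long long h, unsigned long long x)
{
    h ^= x;
    h *= 0x9e3779b97f4a7c15ULL;
    h ^= h >> 32;
    return h;
}

unsigned long long gather_seed(void)
{
    unsigned long long h = 0;
    char name[L_tmpnam];

    h = mix(h, (unsigned long long)time(NULL));
    h = mix(h, (unsigned long long)clock());

    /* malloc() placement (and the time it takes) varies from run to run */
    void *p = malloc(1);
    h = mix(h, (unsigned long long)(size_t)p);
    free(p);

    /* tmpnam() invents a "unique" name from hidden library state */
    if (tmpnam(name)) {
        for (const char *s = name; *s; s++) {
            h = mix(h, (unsigned long long)(unsigned char)*s);
        }
    }
    return h;
}
```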


UTF-8 String Indexing Strategies

Klaus

I'd say it really depends on the domain. A main argument of the "utf8everywhere" website is that indexing strings is rarely relevant to performance in day-to-day programming, if done at all, so the cost of having to fix forgotten handling of international characters when it comes up from user input outweighs the performance benefits of not abstracting it away. Especially since characters may be represented as sequences of code points, and separating them may result in something barely more useful than byte-wise indexing.

That said, for my use cases (data analysis with Python) implicit Unicode handling is usually an annoyance, as it requires me to write b"" instead of just "" everywhere, and originally many features were string-only. Thankfully, as of 3.7 essentially the same string and regexp features are available for both strings and byte arrays.

Abel Stern

As you mention in passing, Python >=3.3 takes a different approach for str objects: like Emacs, it takes the highest code point in a string to determine the encoding it will use. Unlike Emacs, it then always picks a fixed-width encoding (latin-1, UCS-2, or UCS-4). That is, "fish" would be encoded in latin-1, but "fiŝo" in UCS-2. This is, of course, not the most storage-efficient way to handle long strings containing a few non-ASCII characters. If the programmer wants to pick an encoding him/herself, the bytes type presents a more basic alternative.

Some references for those interested in the Python specifics: How Python does Unicode, How Python saves memory when storing strings.

okapia

Indexing by Unicode codepoint really isn't that useful. When you factor in various Unicode features like combining characters and zero-width joiners you might as well consider UCS-4 to be a variable width encoding too. With that in mind, UTF-8 everywhere and byte indexing makes for a pragmatic approach. You can generally avoid the worst inefficiencies of O(n) indexing by keeping track of relative indexes and sometimes rethinking aspects of your overall algorithm. It also helps if the API actually provides a way to advance an index back and forward over graphemes.

Christopher Wellons

Agreed. I was only thinking about the situation in Emacs Lisp recently because I was playing around with string hash functions, which is one of those few legitimate cases for iterating over code points. I've hardly ever iterated over code points manually.

Tracker1

Wouldn't UTF8 + NFKC be sufficient?


Go Slices are Fat Pointers

FRex

In "say, that it’s 16-bit aligned" was it meant to say byte? And alignment freeing low bits is also trick behind JVM's CompressedOops, another nifty ptr trick.

Christopher Wellons

You're right, thanks! Fixed.
https://github.com/skeeto/s...
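
A minimal sketch of the low-bit tagging trick FRex mentions (illustrative names, not from the article; round-tripping through uintptr_t is implementation-defined in ISO C but fine on the usual platforms):

```
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* A pointer to an object aligned to at least 8 bytes has its low three
 * bits free, so a small tag can ride along in the same word. */
#define TAG_MASK ((uintptr_t)7)

static uintptr_t tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0 && tag <= TAG_MASK);
    return (uintptr_t)p | tag;
}

static void *untag_ptr(uintptr_t t)   { return (void *)(t & ~TAG_MASK); }
static unsigned get_tag(uintptr_t t)  { return (unsigned)(t & TAG_MASK); }

int main(void)
{
    static _Alignas(8) int value = 42;
    uintptr_t t = tag_ptr(&value, 3);
    printf("tag=%u value=%d\n", get_tag(t), *(int *)untag_ptr(t));
}
```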

Leandro Moreira

Isn't the correct macro to be used in the definition of the macro ARRAYPTR the FATPTR instead of ADDROF?

```
#define ARRAYPTR(array) \
ADDROF(array, COUNTOF(array))
```

Christopher Wellons

You're right, thanks! Fixed now. I renamed the macro during editing and missed that instance. That's what I get for not having a compiler check my work.

Valentin Deleplace

When you'd want to slice the variable foo, you would usually be in control of the declaration of foo, thus declaring an array of size one ([1]int) would do the job: https://play.golang.org/p/z...

I have reservations about the safety of unsafe.Pointer, as its name implies. The spec says: "A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr's value if the object moves, nor will that uintptr keep the object from being reclaimed."

Christopher Wellons

Yes, converting unsafe.Pointer to uintptr is risky. It's not a proper reference, so the variable could be garbage collected, or the variable could be moved by the garbage collector (though gc currently doesn't do this). However, in my example I never convert to uintptr. I go from pointer, to unsafe.Pointer, to pointer. Each of these has proper pointer semantics and would interact correctly with the garbage collector in both situations.

As far as I can tell, my example doesn't violate any aliasing rules (the Go spec doesn't even discuss aliasing), and the underlying type is the same. That's why I'm pretty confident my example will always be safe.

Valentin Deleplace

Indeed!
It's not the case (3) that I mentioned, it looks more like the case (1) Conversion of a *T1 to Pointer to *T2

Patrick Schlüter

For your C example it is also possible to use array pointers. It's quite unusual because of the uncommon syntax, but it works quite well for fixed-size arrays. Your foo function becomes in that case:

void foo(int (*)[4]); or
void foo(int (*arr)[4]); with a parameter name

In the function, for example, sizeof arr == sizeof(int *) but sizeof *arr == 4 * sizeof(int).

At the call site a simple foo(&array) will do.

One can get used to the syntax and it works without problem. There are some weird consequences though. If you want a function returning an array pointer it uses this syntax, for example:

int (*foo(int a, int b))[4];
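
Filling out the idea into a complete program (hypothetical names, just to show the call site and the sizeof behavior):

```
#include <stdio.h>

/* Accepts only pointers to arrays of exactly four ints, so the length
 * travels with the type. */
static void foo(int (*arr)[4])
{
    printf("sizeof arr  = %zu\n", sizeof arr);   /* size of a pointer */
    printf("sizeof *arr = %zu\n", sizeof *arr);  /* 4 * sizeof(int)   */
    for (int i = 0; i < 4; i++) {
        printf("%d ", (*arr)[i]);
    }
    putchar('\n');
}

int main(void)
{
    int array[4] = {1, 2, 3, 4};
    foo(&array);    /* note &array: its type is int (*)[4] */
    return 0;
}
```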


Predictable, Passphrase-Derived PGP Keys

Yu0

So far my approach to the whole topic is: if it is important data and stored offline, don't use a key file. If it is in Dropbox (e.g. a KeePass database), use a key and transfer it by hand.

I'd really be interested in finding a solution for handling keys that would be accessible even to average people :/

Though generally, I find that the real issues are related to trust rather than to encryption. I trust KeePass, but can I trust its third-party mobile ports? Especially since they can be updated from a store: if the creator gets hacked or goes rogue, my passwords may suddenly be compromised by a compromised app update.

I wonder whether such scenarios might affect your backup?

Though your article points out a method for how I might be able to include a key in a backup/sync :) I always thought that to be impossible.

Nathaniel Harari

Hey there, I know this isn't the appropriate article to post this comment, but as I was using Elfeed in Emacs to read this, and the idea came to me whilst doing so, I had to write it in a comment before forgetting - I really hope that you don't mind.

In the search view, I often just go down my list of entries by hitting "r" for Mark as Read, but I'm wondering if the entry with the cursor on it could be highlighted with a background colour and a different foreground colour as well? That would be super. So I'd have the marked colour above, the unread colour below, and the highlighted colour where the cursor/selected one currently is. It would just make it easier for it to stand out when I'm reading the news at 6:30am while I'm having my coffee but before I've finished it. :)

Just a thought. Still loving Elfeed. I use org mode to organize my feeds and wow is it nice and tidy. :D

Christopher Wellons

The search buffer uses hl-line-mode to highlight the current entry, so perhaps you could customize that mode, at least for the search buffer, in order to achieve what you want. (If I'm understanding you properly.)


The Long Key ID Collider

possiblywrong

Very interesting! Do I understand correctly the impact of the choice of N (the length of the zero-suffix indicating when to stop a hash chain)? As you increase/decrease N, you roughly halve/double the *memory* requirement (i.e., the expected number of chains that you need to store), but the *execution time* should be roughly unchanged, though, right? That is, you're still evaluating about the same number of keys, with the exception of the possible "edge effect" of the extra time spent on the *last* chain that you evaluate that uncovers the collision (where by "edge effect" I mean that, for large N, you may have found the collision early in the chain, but won't know it until you reach its distinguishing point)?

Christopher Wellons

Yup, that's exactly right. I largely chose my particular N so that I could get timely progress updates (printing a line as each hash chain completes). For a job I expect to run for several hours, I could certainly have chosen a larger N without significantly affecting the total run time. With the N I chose, the entire lookup table will always fit inside L3, if not L2, so I really have no need to increase N and move the trade-off towards less memory usage.
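
For readers unfamiliar with distinguished points, here is a sketch of the chain bookkeeping being discussed. The step function below is a placeholder mixer standing in for the article's key-derivation step, and the numbers are arbitrary:

```
#include <stdint.h>
#include <stdio.h>

/* Walk a hash chain x -> step(x) until the value ends in N zero bits (a
 * "distinguished point"), then keep only the chain's endpoints. Each
 * extra zero bit roughly halves how many chains must be stored, while
 * the total number of evaluations stays about the same. */
#define N 20

static uint64_t step(uint64_t x)
{
    x ^= x >> 32;
    x *= 0xd6e8feb86659fd93ULL;   /* placeholder mixer, not key derivation */
    x ^= x >> 32;
    return x;
}

struct chain { uint64_t start, end, length; };

static struct chain walk(uint64_t start)
{
    struct chain c = {start, start, 0};
    while (c.end & (((uint64_t)1 << N) - 1)) {
        c.end = step(c.end);
        c.length++;
    }
    return c;
}

int main(void)
{
    struct chain c = walk(0x0123456789abcdefULL);
    printf("start=%016llx end=%016llx length=%llu\n",
           (unsigned long long)c.start,
           (unsigned long long)c.end,
           (unsigned long long)c.length);
}
```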


Keyringless GnuPG

Nathaniel Harari

Thanks for the great article yet again! I just switched to Fastmail from Gmail and I was going to implement GPG signing in my mu4e setup, but then I realized: literally none of my friends use encryption PGP/GPG in their emails. I mean zero. Zip. They don't even know what it is. The only people I've seen use it are coders in these kinds of circles, and I don't interact with them via email. So when it comes to emailing people, I could set it up but it would be a complete waste of time for me, which is a shame. :/ I really want to use it, but nobody I email with would ever actually make use of it back.

Christopher Wellons

Yeah, I gave up on the idea of GPG for email years ago. Nobody uses it, and it's a technological dead end anyway. The upside is it makes my email setup simpler, and MUAs that don't support it are viable options.


No, PHP Doesn't Have Closures

Clément Pit-Claudel

Is your beef that PHP captures bindings by value instead of by reference? I thought that the general agreement was that Python and Java actually got that one wrong (esp. wrt to loop variables). When the F# people had to tackle this problem, they forbid capturing mutable locals in closures precisely to avoid the resulting confusion (you have to explicitly use a ref cell and mutate the contents of that).

Besides that, I'm a bit puzzled: did you get to the section on capturing variables by reference? It's example #3 on the page that you linked to, and it shows how to fix your example (just write &$n instead of $n and you'll get $r == 2).

A nitpick: you're missing a sigil in front of the f that you return in bar()

Christopher Wellons

Loop variables can be a gotcha, and ones that aren't explicitly mutated should probably be distinct variables on each iteration (i.e. for ... in loops, or Go's range loops). But loop variables that are explicitly mutated, like C-style for loops (i++) should *not* be distinct variables on each iteration. It was a mistake for JavaScript to make this a special case just to avoid surprising newbies (instead choosing to surprise everyone else with special case semantics).

I mentioned references towards the end to get something "kinda, sorta like a closure" but *semantically* it's still something different. It's like binding a pointer to a hidden parameter rather than actually closing over a variable. This is close enough that it really doesn't matter, so I admit I'm being nitpicky. Still, the whole "use" thing is really silly, and I'm baffled they settled on that design.

Thanks for the note about the missing sigil. I just fixed it.

Clément Pit-Claudel

I think the loop case is up for debate. Rust works around it by making it impossible to use a captured, mutated variable, unless you go through a pointer, which I think is sensible.

I think what you describe (re references to pointers to hidden parameters) is exactly how I think of closures ^^ That is, in most implementations that I know, a closure is a record of pointers that is passed into the function at every call.

Clément Pit-Claudel

Also, what alternative do you have in mind for 'use'? If they just got rid of it there'd be no way to choose between capturing by value or by reference, so I imagine you'd default to capturing by reference, and you'd capture everything based on which variables appear in the function's body. But then you'd run the risk of capturing things that you didn't intend to close over and overwriting variables you didn't intend to modify (all of this wouldn't be a problem in a language where you have to declare your variables, of course…)

Christopher Wellons

Capture by value or by reference is irrelevant to an actual closure: it simply closes over its environment, whatever that means for the language. Since PHP doesn't have closures, it must rely on "use" for a kind of partial function evaluation instead.

The "risk" of capturing things you don't intend isn't a real problem with closures in practice, just as we don't have a "use" for each inner scopes because we're worried about clobbering values in the outer scope. Closures are little different than being an inner scope.

Clément Pit-Claudel

Wait, but clobbering in inner scopes *is* a problem in PHP :) Isn't it a problem in essentially any language that doesn't force you to declare variables before using them?

Christopher Wellons

Ha, you've got me there. :-) That's one of my biggest annoyances with Python.

kallikanzarid

This doesn't seem very different from how C++ does it. In PHP, like in Java, all variables except primitive types are pointers, and you capture the value of the pointer. If you capture "by reference", you capture by the double pointer. Not ideologically pure, but not incomprehensible either.

Christopher Wellons

Capture by reference or capture by value is mostly irrelevant. The question is: Does the "closure" actually *close over* its lexical environment such that it can continue accessing its variables. The answer for PHP is no, even when using reference captures.

I'm not convinced C++ lambdas technically evaluate to closures either, but the semantics there still make more sense than PHP. For example, in C++ you can't capture global variables, or any variable without automatic storage duration, since that makes no sense. It's also explicit about captures, but you can use [&] or [=] to avoid listing each captured variable. Unlike PHP, C++ has a good excuse for this: it has manual memory management, and closures require thought about allocation. Specifying reference vs. value captures is part of this. (It's also why closures are honestly a poor fit for C++ and probably shouldn't have been added.)

SomeGuy

I guess this is technically correct - PHP doesn't have "real closures". But, having worked with both PHP and JS, I actually prefer the way PHP does things in this case. Using Closures in JS can be quite confusing, especially when you aren't familiar with all the details of how scope is captured, how bind works etc. PHP on the other hand makes this very obvious and hard to mess up.

Markus Heiler

On reddit this link was given: https://www.php.net/manual/... I think it may be useful to extend the article here and look whether this meets your criteria for a closure as-is.

Olivier Laviale

I have good news for you! You can pass variables by reference by using the "&" sign, like everywhere else in the language. You can reference the closure itself for recursion with this method. Also, short closures are coming with the next release of the language, here's the RFC: https://wiki.php.net/rfc/ar...

Try again in a couple of months :)

郭云鹤(Guo Yunhe)

I think the comparison between PHP and JS only shows how PHP is doing better. You make fewer mistakes by whitelisting the variables you want to pass to a closure. Also, designing a programming language doesn't mean that you have to keep the same practices as others. Language is practical. A good article shares facts, not opinions.

sapphirepaw

I'm probably missing something here, but why isn't it a closure, exactly? Consider the following:


function makeSequence() {
    $i = 0;
    return function () use (&$i) {
        return ++$i;
    };
}

Call makeSequence() twice, and the result is two distinct callables that generate their own sequence of integers. Nothing one callable does affects the other, nor can code in the global scope affect either one.

How is this JavaScript version different? Is it a closure?


"use strict";
function makeSequence() {
let i = 0;
return function () {
return ++i;
};
}

If a "true closure" requires nested lexical environments, then sure, PHP doesn't have closures. But maybe that's a distinction without a difference. An anonymous function can still have access to a variable outside its own scope, but not in the global scope, which is pretty much what distinguishes "a closure" for me. A closure has variables that persist between invocations, but aren't in the invoker's scope.

Lasse Hillerøe Petersen

Your comment puzzles me almost as much as the blog itself. I would think that your example is an excellent proof that using references to pass the local variable to an inner function is a very decent way to implement closures. What do you mean by "nested lexical environments", and how is PHP lacking them? (I will happily admit that I've never used PHP much.) Also, why wouldn't an anonymous function be able to access a variable in the global scope?

sapphirepaw

You can define a named function inside another named function in PHP, but that inner function has no way to access the outer function's local variables. (Also, it's really a global definition that runs once the outer function is called, so in general, never do that.) This is part of PHP's legacy design, where there were only global and local scope, and later, "superglobals" (variables that are specially treated by the language, so that access is never local.)

These scopes are still clearly lexical, I think, because they're not dynamic. A function that uses a local $n can't be affected by setting a global $n before calling it. But they obviously don't "nest", because of what I said above.

Python, in contrast, has nested lexical scope because a function inside a function can read (and, using nonlocal, write) to variables in any outer function's scope.

As for the global scope, that was confusingly worded. Anonymous functions can choose to access global variables, like any other function; but, they also have the ability to access variables in the enclosing scope, even if that scope is non-global.

To tl;dr my original comment, it's that the OP never proved the thesis. I read the post at least twice, and never came away feeling like I understood the reasoning. I would happily accept "it's because function x() { $a = function () { $b = function () {…}; }; } has no way for $b to use variables in x() that $a did not use", but I didn't get that from the actual post.

Lasse Hillerøe Petersen

I didn't quite get it either, nor do I feel sure I get your reply, I'm afraid.

Further down, Christopher Wellons writes: "in C++ you can't capture global variables", and that seems absolutely meaningless to me, because as global variables by definition live in the outermost scope, they never go out of scope and are available to any function value without further ado. It is only variables in containing functions that need to have their scope (and by this I mean allocation, not lexical name visibility) outlast the return of their defining function, the so-called upward funarg problem.

The reason I am interested is that I am toying with the design of a language for which the compiler will generate C code, and I intend to implement closures precisely by hoisting inner functions to the global level and passing any imported variables as references in an extra argument struct, while also ensuring that the containing function allocates these (but only these) variables on the heap instead of on the stack.

That an inner function should declare which variables it uses from an outer scope (and how) does not IMO interfere with the "closureness"; it's mostly a matter of convenience/laziness, or even a sincere belief that this is a beneficial design for other reasons.

Meanwhile, I tried installing PHP and playing with it a bit, and so far I seem to have run into trouble defining nested functions inside nested functions and, for example, assigning them to globals. So the first experiments are not too promising regarding at least the consistency of PHP's closure implementation, but maybe I shall try a little harder. (Or even ask my niece, who has an MS in CS and wrote a thesis on PHP memory management - I guess she might know. :-) )
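
A hand-written C sketch of the compilation scheme described above (hoist the inner function, pass captured variables through an explicit environment struct, heap-allocate what must outlive the outer call); all names are invented for illustration:

```
#include <stdio.h>
#include <stdlib.h>

struct counter_env {
    int *n;                 /* the one captured variable, on the heap */
};

/* the "inner function", hoisted to file scope, taking its environment
 * as an explicit extra argument */
static int counter_next(struct counter_env *env)
{
    return ++*env->n;
}

/* the "outer function": allocates the captured variable and the env so
 * they survive its own return */
static struct counter_env *make_counter(void)
{
    struct counter_env *env = malloc(sizeof(*env));
    env->n = malloc(sizeof(*env->n));
    *env->n = 0;
    return env;
}

int main(void)
{
    struct counter_env *a = make_counter();
    struct counter_env *b = make_counter();
    printf("%d\n", counter_next(a));   /* 1 */
    printf("%d\n", counter_next(a));   /* 2 */
    printf("%d\n", counter_next(b));   /* 1: b has its own environment */
    free(a->n); free(a);
    free(b->n); free(b);
}
```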


Legitimate Use of Variable Length Arrays

(no comments)

Legitimate-ish Use of alloca()

Vladimír Kotal

The Solaris door mechanism spawns a thread (as a result of a kernel upcall, perhaps) inside a user process that never returns. To avoid leaks there, alloca() is legitimate.


Infectious Executable Stacks

ContraPants

That's really interesting. I've done ld -r -b binary a lot for small resources in the past. I had no idea about any of this. Thank you!

Unrelated to ld, this piqued my curiosity:

> Not only does one contaminated object file infect the binary, everything dynamically linked with it also gets an executable stack.

If your process does not have an executable stack, will libraries that require one fail if you try to use it through dlopen at runtime rather than linking at compile time? Can a process dynamically change from a non-executable stack to an executable one? I would think no because it would be a major security vulnerability, so I don't know how the OS would handle this.

ContraPants

My curiosity got the best of me, and I tried it on a Linux box with gcc version 9.2.0.

As expected, the .so file had an executable stack. The executable, even though it is linked with -ldl, did not. However, the code still worked.

I don't know if it's a recent change, but compiling even without supplying -Wall or -Wextra still warned me about the generated trampoline.

Thanks again for the article!

Christopher Wellons

Yes, the process' stacks will dynamically become executable when a shared object requiring executable stacks is loaded. This even includes the use of dlopen(), so it can happen late in program execution. On Linux you can observe this by examining /proc/PID/maps as the program runs. I didn't know about this myself until I wrote this article, so it was a surprise for me, too.

I've never seen a warning about the closure trampoline. What Linux distribution are you using? I don't see it with Debian's GCC, nor with my own build of GCC.
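
One way to observe the dlopen() behavior described above, sketched for Linux (the library name is made up, standing in for any shared object built from an object file that lacks a .note.GNU-stack section; link with -ldl):

```
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Print the [stack] line from /proc/self/maps so the permission change
 * (rw-p before, rwxp after) is visible. Linux-specific. */
static void print_stack_perms(const char *label)
{
    char line[256];
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) {
        return;
    }
    while (fgets(line, sizeof(line), maps)) {
        if (strstr(line, "[stack]")) {
            printf("%s: %s", label, line);
        }
    }
    fclose(maps);
}

int main(void)
{
    print_stack_perms("before");
    void *h = dlopen("./libexecstack.so", RTLD_NOW);  /* hypothetical name */
    print_stack_perms("after");
    if (h) {
        dlclose(h);
    }
    return 0;
}
```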

ContraPants


$ gcc --version
gcc (Gentoo 9.2.0-r2 p3) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -std=c99 -o test test.c
test.c: In function ‘intsort2’:
test.c:5:9: warning: trampoline generated for nested function ‘cmp’ [-Wtrampolines]
5 | int cmp(const void *a, const void *b)
| ^~~

Christopher Wellons

Ah, interesting. Looks like Gentoo uses a hardened GCC build that enables -Wtrampolines by default. Thanks!

Walter Misar

Looking around a bit I found that dynamic lib executable stack infection happens on Debian (deb10u2) with kmail via libQt5WebEngineCore (/usr/lib/x86_64-linux-gnu/libQt5WebEngineCore.so.5.11.3):

7ffc11287000-7ffc112a7000 rwxp 00000000 00:00 0 [stack]

Another 15 binaries installed on my box link this library too.

Christopher Wellons

Wow, nice catch! I just did a build of qtwebengine-opensource-src and the cause is a bunch of BoringSSL assembly files and a "SaveRegisters" assembly file. Each of these assembly sources lack a .note.GNU-stack section, so QtWebEngine and everything that links against it gets an executable stack. Crazy!

Were you going to report this as a bug? If not, I can do it.

Walter Misar

Would be nice if you can let the appropriate people know about it. You already seem to have dug deeper into it, and I won't find the time to do so soon.

Christopher Wellons
possiblywrong

Very interesting read. This seems... really bad, particularly the silent transition even when dlopen()ing as you discuss in the earlier comments. Am I missing some safety net, or is this really as dangerous/broken as it seems?

On a side note, re your mention of "Baking Data with Serialization." I just did something similar to this recently to get past a stupid email filter, turning a "binary" file (that I couldn't attach to an email) into a flat-text script file (that I could attach) that fwrote() itself from a Base64 encoding of its bytes. I thought I was clever... but I'm pretty sure the nugget of the idea came from reading your post years ago!

Christopher Wellons

Nope, you're not missing anything. I'm still honestly surprised how much this is tolerated, and I'm irritated that it caught me off guard despite being aware of the issue.

There's still some hope. I said that my manual trampoline demo works on "nearly" any unix-like system. This includes both Linux and FreeBSD, each of which will happily make executable stacks. The main exception is OpenBSD where they've explicitly disabled executable stacks, so it's impossible to screw this up. Actual GCC closures don't work at all on OpenBSD for this reason, and it's noisy about them at compile time. On NetBSD, both my manual trampoline and GCC closures *sometimes* work. Whether or not I get an executable stack is kind of random, which I believe is a bug rather than a deliberate security feature.


On-the-fly Linear Congruential Generator Using Emacs Calc

(no comments)

Chunking Optimizations: Let the Knife Do the Work

Pedro Henrique

> “Letting the knife do the work” means writing a correct program and lifting unnecessary constraints so that the compiler can use whatever chunk size is appropriate for the target.

But how do I learn that art?

sapphirepaw

I noticed the change to optimization flags. When I compile xor512a with -O3, gcc-9.2.1 produces assembly with semantics if (not overlapping) { use xmm code; } else { use byte-by-byte code; }… and then doesn't call it. It inlines the xmm code in main().

When I compile xor512d with -Os, then the function is nothing more than the byte-by-byte code, and main() calls it normally.

It seems that one must choose the proper knife, even before letting it do the work.
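
For reference, the kind of function being discussed looks something like this (an assumed reconstruction, not the article's exact source): a plain byte-at-a-time loop that the compiler is free to chunk however it likes at -O2/-O3, or keep as a byte loop at -Os:

```
#include <stddef.h>

/* Byte-at-a-time XOR of a 64-byte (512-bit) buffer. With restrict-
 * qualified pointers and a known trip count, the compiler may pick its
 * own chunk size (SSE/AVX, word-at-a-time, or none at all). The name
 * xor512 is assumed from the comment above, not copied from the post. */
void xor512(unsigned char *restrict dst, const unsigned char *restrict src)
{
    for (size_t i = 0; i < 64; i++) {
        dst[i] ^= src[i];
    }
}
```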


Efficient Alias of a Built-In Emacs Lisp Function

Klaus

Another efficiency thing that surprised me: Lexical closures seem to add quite some overhead to the bytecode.

I came across this when trying to use lazy evaluation. When you have something like

(lambda ()
  (let* (a (f1 (lambda () a)))
    (funcall f1)))


as a human, you recognize that the function f1 cannot be called outside that let block, and thus it could in principle reference the stack frame of the enclosing function.

The way it is currently implemented, however, the lexical closure causes a to be accessed with varref a... In my practical case, this adds quite a bit of overhead.

Christopher Wellons

Are you sure you compiled that lambda with lexical scope? When I tried it, the byte-compiler handled it really well. It even recognizes that the closure doesn't escape, and so it doesn't allocate a fresh closure object each time it's called, and instead uses a "static" one from the constants vector. I suspect this is because the compiler does lambda lifting and is careful with funcall.

Further, varref is strictly for dynamic variables, so either you compiled with dynamic scope, or you've previously told Emacs that "a" is a "special" variable (e.g. with defvar) and so should be bound dynamically.

However, if the closure escapes, then it has to create fresh closures since it's nowhere near sophisticated enough to do otherwise. I discussed this issue in my Emacs Lisp performance article: https://nullprogram.com/blo...

Klaus

"Have you tried rebooting Emacs?" :) The hind about varref solved my issue 😅

Seems like I had polluted my environment by unintentinally executing (setq a ...) without let-binding at some point. In the actual use-case (a macro), the root cause was incorrect use of make-symbol leading to something like

(let (#:--prefix-var--)
  ...
  (setq #:--prefix-var--)
  ...)


that looked fine when printed, but actually the two symbols were distinct. Weirdly enough, it DID work as intended when compiled.


Unintuitive JSON Parsing

Gregg Irwin

Nice article. On track with how things are not intuitive. Coincidentally, we're working on a syntax diagramming (live-coded railroad diagrams) dev tool, so I was able to use your `[01]` test case on a feature it has. My test session went like this:

- Load JSON grammar project
- Clear existing test input and put in `[01]`
- Click `parse` with the main `json-value` rule selected
- See that the parse failed
- Repeat with the `json-array` rule selected
- With `json-array` selected, click Find. No match.
- Select `number` rule, click Find. `0` is highlighted.
- Click Find again, `1` is highlighted.

Now I want to tinker on more bad inputs, to see how it can help find more unintuitive cases.


Purgeable Memory Allocations for Linux

(no comments)