Lush: my favorite small programming language

Nov 19, 2024

I meant to write about this when I started my blog in 2009. Eventually Lush kind of faded out of my consciousness, as it was a lot easier to get work doing stuff in R or Matlab or whatever. The guy who was maintaining the code moved on to other things. The guys who wrote most of the code were getting famous because of the German Traffic Sign results. I moved on to other things. I had a thought bubble the other day that I'd try to compile it and see what happened. The binutils guys have been busy for the last decade and changed all manner of things: I couldn't even find documentation for the old binutils the last version of Lush2 was linked against. Then, poking around the old SourceForge site, I noticed that Leon Bottou had done some recent check-ins fixing (more effectively than me) the same problems in the Lush1 branch. I stuck the subversion repo, with history, on GitHub so you can marvel at it. I may try to revive a few of the demos I remember as being cool.

I call it a small language; compared to contemporary Python or R it is quite small, and it had a small number of developers. The developers were basically Yann LeCun and Leon Bottou and some of their students (there are other names in the source, like Yoshua Bengio). This tool is where they developed what became Deep Learning (LeNet-5); the first version of Torch was in here (as I recall it was more oriented to HMMs at the time). Since it's a Lisp, it's easy to add macros and such to make it do your bidding and fit your needs. Unlike anything else I've ever used, Lush is a real ergonomic fit for the programmer. It has a self-documenting feature which is incredibly useful: sort of like what R does, it takes comments in code and makes them into documentation. Unlike R documentation, there is a way of viewing it in a nice GUI and linking it to other documentation. So you have a nice manual for the system and whatever you built in it, almost automagically. Remember "literate coding?" It was always a sort of aspiration: this is a real implementation of it, and it's so easy to use, you'd have to be actively malicious or in a pretty big hurry not to do it. Here's a screen I made for myself so I could remember how to use some code I built 15 years ago (it still works BTW). You can update it at the CLI, just like everything else in a Lisp.

Since it's a Lisp, you have access to macros, which allow you to do magic things that make Paul Graham happy. I am smooth brain: only wrote a couple of them. I've written considerably more C macros than Lisp macros and plan on keeping it that way. The Lush authors also don't use them very often; mostly in the compiler, which is how it should be. "A word to the wise: don't get carried away with macros," as Peter Norvig told us in PAIP. There is a nice object system, and a very useful set of GUI tooling. Not just the help gizmo; there's a full-fledged GUI toolkit (ogre). Imagine that; something to develop old-fashioned graphical user interfaces without importing two gigabytes of Electron and javascript baloney. The helptool uses this; it is not an HTML browser. The documentation format looks a bit like markdown with a few quirks; I never had to look at a manual to write the stuff. Essentially it looks like the standard two-sentence comments you put to remind yourself what a complicated function does. The GUI thing is written in the object system; I assume it's something like CLOS: whatever it is, there are no surprises and anyone who knows about namespaces and objects can use it. I found it particularly useful for its encapsulation of raw FFI pointers and other tooling which is best trapped in a namespace where it can't hurt anything.

Since it is oriented around developing 80s-90s era cutting edge machine learning, one of the core types is the array. The arrays are real APL style arrays: rank 0 to rank 4, which is probably one rank higher than most sane people use (most people use rank 2, aka matrices). It looks like it had up to rank-7 at one point: I have no idea what you'd do with that. APLs such as J often have rank-whatever, so someone somewhere has probably done something with such structures. Lush2 had an interesting APL like sublanguage for operating on the arrays, which looked pretty handy, but which I never quite got into (most of my work was in Lush1).
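For anyone who hasn't lived with APL-style arrays, here is a minimal sketch of the general idea: a rank-N array is just flat storage plus a shape, a set of strides and an offset, and fixing one index gives you a view of one rank lower without copying anything. It is written in Python only because it's what most readers can run without installing anything. The class `StridedArray` and its `select` method are names I made up for illustration; this assumes a plain row-major layout and is not a description of how Lush stores its arrays internally.

```python
class StridedArray:
    """Hypothetical rank-N view over flat storage (illustration only)."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data
        self.shape = tuple(shape)
        if strides is None:
            # Default to row-major strides: the last axis is contiguous.
            strides, step = [], 1
            for dim in reversed(self.shape):
                strides.append(step)
                step *= dim
            strides.reverse()
        self.strides = tuple(strides)
        self.offset = offset

    @property
    def rank(self):
        return len(self.shape)

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != self.rank:
            raise IndexError("need exactly one index per axis")
        # Flat position = offset + sum of index * stride over all axes.
        flat = self.offset + sum(i * s for i, s in zip(index, self.strides))
        return self.data[flat]

    def select(self, axis, i):
        """Fix one axis at position i; returns a view of rank one lower."""
        shape = self.shape[:axis] + self.shape[axis + 1:]
        strides = self.strides[:axis] + self.strides[axis + 1:]
        return StridedArray(self.data, shape, strides,
                            self.offset + i * self.strides[axis])


cube = StridedArray(list(range(24)), (2, 3, 4))   # rank 3
plane = cube.select(0, 1)                         # rank 2, a matrix
scalar = plane.select(0, 2).select(0, 3)          # rank 0, still a view
```

The rank-0 case falls out for free: an empty shape, an empty index tuple and an offset pointing at one element. The same trick works for rank 7 or rank whatever, which is presumably why the APL people don't bother with an upper limit.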

All this is cool, but I suppose other small programming languages promise things like this. The really cool thing about it is the layers. You get a high level interpreted Lisp. You also have a compilable subset of Lisp; mostly oriented around numerics things, just as one would expect in a domain specific language one might develop early convolutional net/deep learning algorithms in. Even better than this, if you want to call some C, including calling libraries, you can enclose your C in a Lisp macro and compile it right into the interpreter. Most of the interesting and useful code in the world still sits behind a C API. With a tool like this: suddenly you have a useful interpreter where you can vacuum in all the DLLs you want, and they'll be available at the command line.

Most interpreters have some FFI facility for doing this; none to my knowledge are this easy to use or this powerfully agglomerative. The memory management happens for free, more or less. In, say, R's REPL, you can do something called dyn.load on libraries with R-compatible types. If it's more complex than that you might have to write significant wrapper code, and this is a hack: it might just leak memory all over the place. You have to work pretty hard to encapsulate C libraries in a proper R package, compiling against the R sources. J, same story; you can use the 15!:0 foreign to load a DLL and wrap up J structures to send, with some tooling to deallocate or copy memory locations (very carefully). In Lush, you call the C functions directly, in C, on C's terms (or C++). You can write a couple of lines of C wrapper, or a couple of pages; whatever: it's all a part of the Lush source. If you look at examples of well-wrapped DLLs in R on CRAN, you'll see they're festooned with all manner of ugly R structure casts, mysterious R #defines and all kinds of badness and quasi-memory management that you'd have to read a 300 page manual to make sense of. Having done this a few times, I'm exaggerating a tiny bit, but it is tedious and fiddly and takes a fair amount of work; a couple of days if you've never done it before, versus a couple of minutes. In Lush you just stick a dollar sign in front of variables you allocated in Lush in your C function calls, and after it's been compiled into the interpreter (which happens if you "libload" the file), you call them and the variables appear where they're supposed to. No memory leaks. It usually doesn't take down the interpreter when something goes wrong, though of course if you send something weird to a raw pointer it will probably segfault and die. Here's an image grab of a simple method for instantiating a KD-tree using LibANN (a bleeding edge nearest neighbor library of circa 2009):

The first lines are the documentation; inside the defmethod we try to make a new kdtree; the stuff between #{ and #} is normal C++. You can see the $ in front of $out; this tells the Lush compiler to pull the result back into the interpreter. This method gets compiled and loaded and accessed like any other method in Lush. idx2 is a matrix type; the other stuff does what you think it does.
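For contrast, here is roughly what the conventional "load a shared library and describe it to the interpreter by hand" route looks like. This is a minimal sketch using Python's standard ctypes module as a stand-in for the general pattern (the examples in this post are R and J, which differ in the details). It assumes a POSIX system where the C library can be found or is already loaded into the process; the wrapper names `c_strlen` and `c_sort` are mine, and the only C functions called are the standard `strlen` and `qsort`.

```python
import ctypes
import ctypes.util

# Locate the C library; fall back to the symbols already loaded into this
# process if the lookup fails (assumption: POSIX-style dynamic loading).
_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(_path) if _path else ctypes.CDLL(None)

# The interpreter knows nothing about C signatures, so each one is
# declared by hand before the function is safe to call.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# qsort wants a C callback, so a function-pointer type is declared too.
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None


def c_strlen(text: bytes) -> int:
    """Length of a byte string, as computed by C's strlen."""
    return libc.strlen(text)


def c_sort(values):
    """Sort a list of ints by copying it into a C array and calling qsort."""
    arr = (ctypes.c_int * len(values))(*values)   # copy in

    def compare(a, b):
        # a and b are pointers to c_int; a[0] dereferences them.
        return (a[0] > b[0]) - (a[0] < b[0])

    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), CMPFUNC(compare))
    return list(arr)                              # copy out
```

Every type is restated on the interpreter side, and every array is copied in and copied back out. None of that is hard, but it is the sort of bookkeeping the dollar-sign convention makes disappear.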

Lush dates from 1987: I don't even remember what kind of computers people used back then. I assume something like a 68020 Sun Workstation or a VAX. Even when I was using it in 2009, a "multicore" system might have two cores, so it wasn't really designed with that sort of thing in mind either (though you could link to a BLAS which does this in most numerics cases, and it has tooling to use it on a cluster). Some of the intestines of the thing probably reflect this. I'm pretty sure Lush1 is not completely 64 bit clean: when I was using it in 2009 it was 32 bit binaries only, which was fine as nobody had 256 GB of RAM back then. Other stuff which will seem unfamiliar to contemporary people: it's for talking to local libraries. There is no provision for a package manager over the interbutts, or much other network stuff I noticed beyond sockets. No JSON (didn't exist; s-exprs are better anyway), no SQL interfaces (exotic pay-for technology at the time), and none of the stuff modern code sloppers are used to having. It was mostly a tool for developing more Lush code which links to locally installed libraries: this is what R&D on machine learning algorithms had to be back then. As a tool for building your own little universe of new numerics-oriented algorithms it is almost incomparably cozy and nice. You get the high level stuff to move bits around in style. You get the typedefed sublanguage to compile hot chunks to the metal, and you get C/C++ APIs or new functions written in C/C++ as a natural part of the system. Extremely cozy system to use. While it's not the Lisp machine enthusiasts like Stas are always telling us about, it's probably about as close as you're going to get to that experience using a contemporary operating system and hardware. Yes, you have to deal with the C API: I'm sorry about that, but it's just current year reality. Nobody is going to rewrite BLAS in Haskell or CMU-CL to make you happy. Purity is folly.

As a tool, if I had to fault it for anything, it's a few small things which I could probably fix. For example, in Kubuntu anyway, you can't copy/paste examples from the helptool. This is probably something that could be repaired if I dig down into whatever X library the ogre package calls to do this. It's no big deal; not a very wordy language anyway, and I should be reading the docs and typing code I'm about to run in emacs rather than copypasta. Another slightly annoying thing is a lack of built in pretty-print for results. Many languages have this problem: in Lush it's easy to write one and I have one around somewhere. Some of the packages aren't well documented and some don't work because of various forms of bitrot: this is to be expected in something this old. Other than that, no faults. Very cozy programming language. The coziest.

The C insides are fairly understandable, modulo the glowing crystal dldbfd.c gizmo at the center that does the binutils incantations that make the dynamic linking magic happen. Even that looks like it could be understood if you were familiar with binutils. In Lush1 there are a number of odd pieces that were planned to be sawed off, which you can sort of infer by their absence in Lush2, which had a redesigned VM. However, Lush1 compiles and runs the old code, and Lush2 doesn't.

While this programming language could (and really should) be revived, even in its present state it can be marveled at, both for its historical importance in developing machine learning algorithms and for its wonderful "programmer first" utility. I don't know what exigencies caused them to move the Torch neural net library to Lua; probably whiny wimps who were intimidated by parentheses. I can guess why it ended up in Python (the Visual Basic of current year). It's one of those things where, had things worked out a little differently, machine learning people would be typing lots of parentheses in vastly more futuristic Lush instead of drearily plodding along with spaghetti in Jupyter. It represents a very clear vision of how software development should work. No bureaucracies or committees were involved in its design: just people who needed a good tool to invent the future. I suspect the committees and social pressures involved in larger programming languages are why they're often so awful. Lush is all designed and built by makers, not bureaucrats and "product managers." It feels purposeful. It also feels incomplete, which is as it should be, as these guys were too talented to maintain programming languages. Like an unfinished da Vinci painting; you can see the grandeur of the artist's vision.

I've always been a fan of these guys; as I pointed out in my article on DjVu, there is much to admire beyond their good taste in algorithms and dogged determination to continue working on them at a time when only eccentrics were interested in neural nets. All the cool kids of the era were doing SVMs .... because .... researchers are mostly trend following rather than thinking. Hopefully I don't cheese them off too much by bringing it up, though as an American it is arguably a sovereign duty to piss off the French. For myself, I have a shitload of work to do in coming months. I sort of hope I can find an excuse to fiddle around with it some more, or maybe even use it in production in some small way. If I do, I'll write about it. I encourage others to give it a try and ponder how cool 2024 would have been if we used this tool instead of the trashfire Python slop you're all doomed to use in your day job.

Locklin mostly on science
