Created Thursday 12 November 2020
Gherkin is an interpreter I wrote in bash in 2013 for a Clojure-inspired dialect of Lisp. Gherkin was the most sophisticated Lisp implementation I had attempted up to that point. I announced Gherkin during a lightning talk at Clojure/conj 2013. Working on and sharing Gherkin brought me great joy, and inspired others in ways that continue to inspire me. Gherkin is one of the most gratifying projects I've ever worked on, and the experience continues to pay dividends.
So, you want to be a Lisp hacker...
Before starting on Gherkin, I had long nursed an interest in Lisp implementation wizardry. After learning Clojure around 2009, I began work at Relevance (now Cognitect) where I was presented with opportunities to make small contributions to Clojure itself. At Relevance, I had the great fortune of being a fly on the wall during discussions between experts including Rich Hickey, the creator of Clojure, about exciting and mysterious aspects of language design and implementation.
While my position at Relevance afforded me a front row seat to the business of language development, I was still a relatively junior programmer, and I'd never gotten my own language to a state of anything close to completion. I had been programming long enough to develop an intuition about how things like lexical scope should work, but my understanding of how such things were actually implemented was fuzzy at best.
In a quest to become a Real Lisp Hacker, I bounced around Google search results and a small friend group of kindred spirits for a couple of years and found myself reading stuff like the following:
- Closure conversion: How to compile lambda, a blog post by Matt Might
- Paradigms of Artificial Intelligence Programming (PAIP), a book by Peter Norvig. This book is now also freely available online.
- (How to Write a (Lisp) Interpreter (in Python)), a page by Peter Norvig
- Henry Baker's Archive of Research Papers
Of the stuff I read, PAIP probably propelled me the furthest along, but I struggled to develop a really solid comprehension of the basic implementation mechanics of things like closures. In retrospect, this is almost surely because I was experimenting in Clojure, but the examples were in Common Lisp, a language I didn't know well.
In the fall of 2013, in the month before Clojure/conj, I ran across awklisp by Darius Bacon. I became obsessed with it.
awklisp had a few properties unique among available implementations at the time. I believe this set of properties made it especially compelling to me as a learning aid:
- It was not written in C. This kept the code small and focused.
- It consisted of only 500 lines of code, all on one page. It was possible for me to understand the whole thing at one time.
- It included a mark-and-sweep garbage collector, and so illuminated the same memory-management problems that Lisp's inventors confronted. This is something most Lisp interpreters written in high-level languages do not tackle.
- It was written in a "lower level" language than Lisp and implemented a call stack. The emergence of Lisp is thus striking, and the mechanics of function calls are exposed.
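For readers who have not met mark-and-sweep before, here is a minimal sketch of the idea in Python. It is not awklisp's code or Gherkin's: the names (`car`, `cdr`, `free`, `cons`, `collect`) and the tuple-based pointer representation are my own assumptions, chosen only to show how a heap of cons cells addressed by integer index can be reclaimed.

```python
# Hypothetical toy heap, not taken from awklisp or Gherkin.
# Cons cells live in parallel lists; a "pointer" is ("cell", index).
car = []    # car[i] is the first field of cell i
cdr = []    # cdr[i] is the second field of cell i
free = []   # indices of cells available for reuse


def is_cell(x):
    """True if x is a pointer into the cons heap."""
    return isinstance(x, tuple) and len(x) == 2 and x[0] == "cell"


def cons(a, d):
    """Allocate a cell, reusing a freed index when one is available."""
    if free:
        i = free.pop()
        car[i], cdr[i] = a, d
    else:
        car.append(a)
        cdr.append(d)
        i = len(car) - 1
    return ("cell", i)


def mark(roots):
    """Return the set of cell indices reachable from the roots."""
    marked = set()
    stack = [r for r in roots if is_cell(r)]
    while stack:
        _, i = stack.pop()
        if i in marked:
            continue
        marked.add(i)
        for field in (car[i], cdr[i]):
            if is_cell(field):
                stack.append(field)
    return marked


def sweep(marked):
    """Put every unmarked cell that is not already free on the free list."""
    already_free = set(free)
    for i in range(len(car)):
        if i not in marked and i not in already_free:
            car[i] = cdr[i] = None
            free.append(i)


def collect(roots):
    """Run one full mark-and-sweep cycle."""
    sweep(mark(roots))


keep = cons(1, cons(2, None))   # a two-element list we still hold on to
cons(3, None)                   # allocated and immediately dropped
collect([keep])                 # only the dropped cell is reclaimed
reused = cons(4, None)          # lands in the reclaimed slot
```

The interesting part in a real interpreter is finding the roots: every live value on the interpreter's own call stack and in its environments has to be passed to the collector, or cells still in use get swept.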
After messing with awklisp, I had the idea to write something like it, but in a different language. Bash is what I decided on, and Gherkin was born. Most of Gherkin was written in the week preceding Clojure/conj 2013.
A pickle is plucked
I picked bash because I knew it would be a real challenge, and boy, was I not disappointed! The reader, the first piece I wrote, was especially challenging because Lisp syntax involves various characters that have special meaning in a shell context, like *. I got hung up many times by things like the differences between " and ', the consequences of various options to set, and bash eval.
Once I had the reader basically working I started to gain serious momentum. I felt I was over the bash syntax/craziness hump. I extended awklisp's memory model to account for more data types. awklisp only had conses, symbols, and numbers; Gherkin had these, and also arrays, strings, and closures. Objects were represented as bash strings that start with a special marker character, followed by a type tag, followed by an index into the heap for objects of that type, followed by a payload. Interestingly, because Gherkin had its own memory model and heap, and because Gherkin was larger than any C program I'd written, I experienced real pointer debugging for the first time - in bash.
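The tagged-string scheme described above can be sketched roughly as follows, in Python rather than bash for readability. The marker character, the `:` separator, the tag names, and the helpers `alloc`, `decode`, and `deref` are all assumptions made for illustration; Gherkin's actual encoding differed in its details.

```python
# Hypothetical encoding, for illustration only; not Gherkin's actual format.
MARKER = "\x1f"   # assumed marker character that ordinary text won't start with
SEP = ":"         # assumed field separator

# One heap per type tag; an object's index is its position in that heap.
heaps = {}


def alloc(tag, value, payload=""):
    """Store value on the heap for tag and return a tagged reference string."""
    heap = heaps.setdefault(tag, [])
    heap.append(value)
    return f"{MARKER}{tag}{SEP}{len(heap) - 1}{SEP}{payload}"


def decode(ref):
    """Split a reference into (tag, index, payload)."""
    if not ref.startswith(MARKER):
        raise ValueError("not an object reference")
    tag, index, payload = ref[len(MARKER):].split(SEP, 2)
    return tag, int(index), payload


def deref(ref):
    """Follow a reference to the object stored on its type's heap."""
    tag, index, _ = decode(ref)
    return heaps[tag][index]


greeting = alloc("string", "hello")
pair = alloc("cons", (greeting, ""))   # a cons whose car points at the string
first = deref(pair)[0]                 # still just a string; follow it again
text = deref(first)
```

Because a reference is only a string holding a tag and an index, nothing stops a stale or mistyped index from pointing at the wrong object, which is the kind of bug that makes this feel like pointer debugging.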
Implementing closures presented a special opportunity for programming skill growth. Before this work, calling conventions, memory models, and closure semantics were topics I could hand-wave about but did not understand deeply. After this work, I reached a new, palpable level of understanding. The moment I reached this new level of understanding is unforgettable. I think (I hope!) I've grown a lot in various ways over my years of programming, but never so much, and in so little time.
My resulting tighter grasp on closure implementation and memory management primed me for work with and on R. The R language is implemented in C and features first-class lexical environments. I worked extensively with R and R extensions in C and C++ during my time at RStudio. I'm positive I would not have been as successful with R were it not for my prior experience writing Gherkin.
By the time I arrived at Clojure/conj, Gherkin was working well enough that I signed up to give a lightning talk about it. I was at the conference with my coworkers from LonoCloud, all Lisp and Clojure enthusiasts and aficionados themselves. They got a kick out of it and provided encouragement. One, Joel Martin, was especially excited, and contributed ideas and code for a much cleaner reader.
From my perspective, the presentation (up on YouTube) was surreal. When I looked to the audience after moments I thought would elicit laughter, there was none. When I made what I thought was a serious observation, there was laughter! I feared I'd embarrassed myself. I was relieved to learn later from members of the audience that they thoroughly enjoyed themselves.
After the talk, Paula Gearon, Jeremy Heiler, and Devin Walters kindly contributed core functions. Craig Andera even made an Emacs mode, gherkin-mode.el. I'm very grateful to these and my other friends for their interest, encouragement, and involvement.
After a flurry of conference activity, progress slowed down. My stated ambition for the project, that it would replace bash, was never completely serious; ironically, I became so inured to bash that I lost what little motivation I did originally have to replace it. The evaluator and garbage collector also had serious flaws that would have required a lot of work to rectify. I think I was satisfied enough with myself for perceiving these flaws that I saw little additional value in addressing them. In 2015, after two years of quiet, I "archived" the project on GitHub.
mal or "make-a-lisp" by Joel Martin is a Clojure-inspired Lisp interpreter and much, much more. Joel started by writing an interpreter in GNU Make shortly after I showed him Gherkin at Clojure/conj. His choice of Make was audacious compared even to my choice of bash. Then he took a huge step further, and codified the interpreter development experience into a structured, gamified series of language-agnostic steps. Thanks to Joel, anyone who wants to make a Lisp for any language has resources to start from that exceed even awklisp in educational value.
I highly recommend Joel's talk on YouTube, Achievement Unlocked: A Better Path to Language Learning, if you want to learn more about his fantastic project.
Other related projects
I'm aware of these projects that were inspired by or otherwise related to Gherkin. If you know of others, please let me know at firstname.lastname@example.org and I will happily list them below.
- timl by Tim Pope is an impressive Clojure-like language that compiles to VimL. It is a much more sophisticated and comprehensive effort than Gherkin. Tim was moved to create timl after seeing my lightning talk.
- Fleck by Chris McCormick is "a Clojure-like LISP that runs wherever Bash is".
- Andy Chu reported on Twitter that he used Gherkin at one point as part of the tests for the parser of his oil shell.
Proper BDFL attire
Clinton Dreisbach made me an awesome Gherkin shirt in December of 2013, at the height of Gherkin-mania. Here I am modeling it. Thanks, Clinton!