← Articles
Published , 5 minute read
LOOP
is a famously divisive iteration DSL. Personally, I love it. Iteration is central to programming, I don’t have any scruples about imperative code, and I already know a million weird languages, so what’s one more. Plus, it’s a zero cost abstraction and an awesome example of what a Lisp macro is capable of.
But I digress. In this post, I want to talk specifically about the collect
keyword of LOOP
: how it works, and how we can make our own version of it that’s useful in situations LOOP
isn’t desirable or otherwise applicable.
Consider the following example that evaluates to the list (2 4 6 8 10)
:
It counts from 1 to 10, and when an even number is encountered, that value is appended to a list and that list is returned. The LOOP
syntax makes it all look pretty simple, but just beneath the syntax some fanciness is happening.
This fanciness is necessary because it’s not efficient to append to a linked list made of cons cells. You have to find the last cell in order to mutate it, and finding the last cell means traversing all the cells that precede it. The longer the list is, the longer it will take.
The way around this problem — the fanciness — is to maintain both the head and the tail of the list you’re appending to. When you want to append to the list, you CONS
up a new tail. When you’re done appending, you return the head.
The following example also returns the list (2 4 6 8 10)
but uses a simplified form of LOOP
and maintains the result list manually. It’s kind of hairy, but don’t worry, we’ll clean it up shortly:
(let ((head nil)
(tail nil)
(i 1))
(loop
(when (> i 10) (return head))
(when (evenp i)
(if (null tail)
(setq tail (cons i nil)
head tail)
(let ((new-tail (cons i nil)))
(setf (cdr tail) new-tail
tail new-tail))))
(incf i)))
Here’s a line-by-line breakdown of what’s going on:
HEAD
and TAIL
both start as NIL
.I
to 1I
is greater than 10, RETURN
from the nearest enclosing BLOCK
named NIL
, which in this case is the one established by LOOP
.I
is even…HEAD
and TAIL
are both set to it.TAIL
is set to point to it. Then, TAIL
is set to NEW-TAIL
.I
is incremented, and the LOOP
continues.It wouldn’t be fun to have to write that head/tail management code all the time, which is one reason that LOOP
and its collect
keyword are so useful. But collecting doesn’t have to be so wordy, even without LOOP
.
One way to make collecting more accessible is to wrap it in a function-based interface. Consider the following function, COLLECTOR
:
(defun collector ()
"Maintains the head of a list and returns a function that appends an
item to the list. When the returned function is called with no
arguments, returns the head."
(let ((tail nil)
(head nil))
(lambda (&optional (item nil argument?))
(cond
((not argument?) head)
((null tail) (setq tail (cons item nil)
head tail))
(t (let ((new-tail (cons item nil)))
(setf (cdr tail) new-tail
tail new-tail)))))))
COLLECTOR
encapsulates the collecting logic in the form of a function that returns a function. The returned function can be called two ways:
Here’s some example usage:
(setq collect (collector))
(funcall collect 1)
(funcall collect 2)
(funcall collect 3)
(funcall collect) ;=> (1 2 3)
Note that
FUNCALL
is necessary here because we have stored the function returned byCOLLECTOR
into a variable and Common Lisp is a Lisp-2.
Our code for collecting even numbers gets a lot clearer with COLLECTOR
:
(let ((collect (collector))
(i 1))
(loop
(cond
((> i 10) (return (funcall collect)))
((evenp i) (funcall collect i)))
(incf i)))
We can arguably simplify even further by using the other iteration constructs DO
or DOTIMES
:
;; with DO
(do ((collect (collector))
(i 1 (1+ i)))
((> i 10) (funcall collect))
(when (evenp i)
(funcall collect i)))
;; with DOTIMES
(let ((collect (collector)))
(dotimes (i 10 (funcall collect))
(let ((i (1+ i)))
(when (evenp i)
(funcall collect i)))))
COLLECTOR
. I could also have reversed the stack before returning it, but I wanted to avoid the performance penalty. My topological sort has a lot of room for improvement; in the future, I might use a proper queue to store the result.funcall
but that’s a huge topic. If you’re coming from another Lisp or otherwise up for a deep dive, check out Technical Issues of Separation in Function Cells and Value Cells.RPLACA
and RPLACD
could have been used to mutate conses.COLLECTOR
. I’ve found it worth learning both styles, if only to be able to be conscious of the associated tradeoffs.