Published , 5 minute read
LOOP is a famously divisive iteration DSL. Personally, I love it. Iteration is central to programming, I don’t have any scruples about imperative code, and I already know a million weird languages, so what’s one more. Plus, it’s a zero cost abstraction and an awesome example of what a Lisp macro is capable of.
But I digress. In this post, I want to talk specifically about the
collect keyword of
LOOP: how it works, and how we can make our own version of it that’s useful in situations
LOOP isn’t desirable or otherwise applicable.
Consider the following example that evaluates to the list
(2 4 6 8 10):
It counts from 1 to 10, and when an even number is encountered, that value is appended to a list and that list is returned. The
LOOP syntax makes it all look pretty simple, but just beneath the syntax some fanciness is happening.
This fanciness is necessary because it’s not efficient to append to a linked list made of cons cells. You have to find the last cell in order to mutate it, and finding the last cell means traversing all the cells that precede it. The longer the list is, the longer it will take.
The way around this problem — the fanciness — is to maintain both the head and the tail of the list you’re appending to. When you want to append to the list, you
CONS up a new tail. When you’re done appending, you return the head.
The following example also returns the list
(2 4 6 8 10) but uses a simplified form of
LOOP and maintains the result list manually. It’s kind of hairy, but don’t worry, we’ll clean it up shortly:
(let ((head nil) (tail nil) (i 1)) (loop (when (> i 10) (return head)) (when (evenp i) (if (null tail) (setq tail (cons i nil) head tail) (let ((new-tail (cons i nil))) (setf (cdr tail) new-tail tail new-tail)))) (incf i)))
Here’s a line-by-line breakdown of what’s going on:
TAILboth start as
Iis greater than 10,
RETURNfrom the nearest enclosing
NIL, which in this case is the one established by
TAILare both set to it.
TAILis set to point to it. Then,
TAILis set to
Iis incremented, and the
It wouldn’t be fun to have to write that head/tail management code all the time, which is one reason that
LOOP and its
collect keyword are so useful. But collecting doesn’t have to be so wordy, even without
One way to make collecting more accessible is to wrap it in a function-based interface. Consider the following function,
(defun collector () "Maintains the head of a list and returns a function that appends an item to the list. When the returned function is called with no arguments, returns the head." (let ((tail nil) (head nil)) (lambda (&optional (item nil argument?)) (cond ((not argument?) head) ((null tail) (setq tail (cons item nil) head tail)) (t (let ((new-tail (cons item nil))) (setf (cdr tail) new-tail tail new-tail)))))))
COLLECTOR encapsulates the collecting logic in the form of a function that returns a function. The returned function can be called two ways:
Here’s some example usage:
(setq collect (collector)) (funcall collect 1) (funcall collect 2) (funcall collect 3) (funcall collect) ;=> (1 2 3)
FUNCALLis necessary here because we have stored the function returned by
COLLECTORinto a variable and Common Lisp is a Lisp-2.
Our code for collecting even numbers gets a lot clearer with
(let ((collect (collector)) (i 1)) (loop (cond ((> i 10) (return (funcall collect))) ((evenp i) (funcall collect i))) (incf i)))
We can arguably simplify even further by using the other iteration constructs
;; with DO (do ((collect (collector)) (i 1 (1+ i))) ((> i 10) (funcall collect)) (when (evenp i) (funcall collect i))) ;; with DOTIMES (let ((collect (collector))) (dotimes (i 10 (funcall collect)) (let ((i (1+ i))) (when (evenp i) (funcall collect i)))))
COLLECTOR. I could also have reversed the stack before returning it, but I wanted to avoid the performance penalty. My topological sort has a lot of room for improvement; in the future, I might use a proper queue to store the result.
funcallbut that’s a huge topic. If you’re coming from another Lisp or otherwise up for a deep dive, check out Technical Issues of Separation in Function Cells and Value Cells.
RPLACDcould have been used to mutate conses.
COLLECTOR. I’ve found it worth learning both styles, if only to be able to be conscious of the associated tradeoffs.