author | Alan Dipert <alan@dipert.org> | 2020-03-03 04:43:16 UTC
committer | Alan Dipert <alan@dipert.org> | 2020-03-03 04:43:16 UTC
parent | 42f58d456806640cb8a545d12046f07a91502e0e
paper/jacl-els-2020.tex | +107 | -106
diff --git a/paper/jacl-els-2020.tex b/paper/jacl-els-2020.tex index 08a6842..1245598 100644 --- a/paper/jacl-els-2020.tex +++ b/paper/jacl-els-2020.tex @@ -71,8 +71,8 @@ creating a widening variety of special-purpose programming languages that compile to JavaScript \cite{Somasegar12,Czaplicki12,wiki:ReasonML}. Each new language promotes one or more paradigms, application architectures, or -development workflows, and claim some advantage relative to the status -quo. +development workflows, and claims some advantage relative to the +status quo. This paper presents one new such language, JavaScript-Assisted Common Lisp (JACL), an implementation of an extended subset of Common @@ -123,8 +123,8 @@ interpreter presented in Chapter 23 of \emph{Paradigms of Artificial Lisp files may be batch-compiled to FASLs. FASLs represent code as JavaScript code instead of as Lisp data. The browser is able to load -FASLs faster than SLip code because the browser's native JavaScript -parser is much faster than SLip's reader. Despite the ability to +FASLs faster than SLip code because the JavaScript parser in the +browser is much faster than the SLip reader. Despite the ability to produce FASLs, the interpreted nature of SLip precludes the system from producing competitively fast or small application deliverables. Consequently, SLip does not satisfy the JACL project @@ -138,7 +138,7 @@ JSCL is a genuine Lisp system. And unlike SLip, JSCL compiles directly to JavaScript instead of to an interpreted bytecode. It is self-hosting, includes the major control operators, and integrates tightly with JavaScript. JSCL includes a reader, compiler, and -printer, and evaluation is performed by JavaScript's \texttt{eval()} +printer, and evaluation is performed by the JavaScript \texttt{eval()} function. Between these, a Read Eval Print Loop (REPL) is possible, and the JSCL distribution includes an implementation of one. 
@@ -177,22 +177,21 @@ JSCL compilation is performed in two stages: \end{itemize} The first stage, the conversion from Lisp to JavaScript Abstract -Syntax Trees (AST), is where the implementation of Lisp's special +Syntax Trees (AST), is where the implementation of the Lisp special forms in terms of JavaScript language constructs and runtime support -is performed. This is done in a single pass in which macro expansion, -lexical analysis, and JavaScript AST generation all occur. The lexical -environment is maintained in a dynamically-scoped variable as the -compiler descends into Lisp code and produces JavaScript AST. +is performed. This conversion is done in a single pass in which macro +expansion, lexical analysis, and JavaScript AST generation all +occur. The lexical environment is maintained in a dynamically-scoped +variable as the compiler descends into Lisp code and produces +JavaScript AST. Code for \texttt{TAGBODY} is generated in the first stage, and the -generated code is much slower than comparable JavaScript code for -looping. Only the general, dynamic case of \texttt{TAGBODY} is -implemented. Every control transfer initiated by \texttt{GO} results -in a JavaScript exception being thrown, which is an expensive -operation. Since many Common Lisp operators have \emph{implicit - tagbodies}, and since most other iteration operators are expressed -in terms of \texttt{TAGBODY}, this performance problem pervades the -JSCL system. +generated code is much slower than comparable JavaScript code. Every +control transfer initiated by \texttt{GO} results in a JavaScript +exception being thrown, which is an expensive operation. Since many +Common Lisp operators have \emph{implicit tagbodies}, and since most +other iteration operators are expressed in terms of \texttt{TAGBODY}, +this performance problem pervades the JSCL system. 
More efficient ways of implementing \texttt{TAGBODY} are not hard to imagine, but the JSCL compiler does not amene itself to the @@ -212,7 +211,7 @@ number of commercial users \cite{CljsUsers}. ClojureScript targets JavaScript, and is a dialect of an earlier language, Clojure\cite{Clojure}, which targets Java Virtual Machine -(JVM) bytecode. ClojureScript's reader and macro systems were both +(JVM) bytecode. The ClojureScript reader and macro systems were both originally hosted in Clojure, in a manner similar to Parenscript. Since its introduction\cite{CljsRelease}, ClojureScript has heavily @@ -226,7 +225,7 @@ ability to produce high-performance deliverables is considered a crucial capability of JACL. Other than the fact that JACL is a Common Lisp and ClojureScript is -not, the biggest difference between the two is JACL's promotion of a +not, the biggest difference between the two is that JACL promotes a browser-based development environment with minimal host-side tooling. ClojureScript, in contrast, promotes\cite{CljsQuickStart} a development experience oriented around compilation performed on the @@ -243,19 +242,19 @@ work in pursuit of this balance. \subsection{Asynchronous reader} The basis for interactive development in Lisp is undeniably the REPL, -but as JSCL's ``pre-reader'' demonstrates, even the direct approach to -this simple mechanism is hampered by JavaScript's asynchronous model -of input\cite{EventLoop}. Traditionally, Lisp readers are implemented -in environments with a blocking function for obtaining input, like -\texttt{getc(1)} on Unix. The blocking nature of input consumption -allows the reader to consume nested input recursively, using the call -stack to accumulate structures. In JavaScript, input arrives -asynchronously, and only when the call stack is empty. To mitigate -this difficulty, JACL's reader facility is completely -asynchronous. Conceptually, it is the JSCL REPL ``pre-reader'' taken -to its inevitable conclusion. 
- -JACL's reader is implemented as a JavaScript class, +but as the JSCL ``pre-reader'' demonstrates, even the direct approach +to this simple mechanism is hampered by the asynchronous model of +input imposed by JavaScript.\cite{EventLoop}. Traditionally, Lisp +readers are implemented in environments with a blocking function for +obtaining input, like \texttt{getc(1)} on Unix. The blocking nature of +input consumption allows the reader to consume nested input +recursively, using the call stack to accumulate structures. In +JavaScript, input arrives asynchronously, and only when the call stack +is empty. To mitigate this difficulty, the JACL reader facility is +completely asynchronous. Conceptually, it is the JSCL REPL +``pre-reader'' taken to its inevitable conclusion. + +The JACL reader is implemented as a JavaScript class, \texttt{Reader}. \texttt{Reader} instances are parameterized by an input source. One such input source is the \texttt{BufferedStream} class. The input source asynchronously notifies the reader instance @@ -265,7 +264,7 @@ its subscribers of the availability of the datum. The JACL reader implementation makes extensive use of modern JavaScript features to support asynchronous programming including -Promises, iterators, async functions, async iterators, and the +promises, iterators, async functions, async iterators, and the \texttt{await} keyword. These features simplify the JACL implementation and aid its performance \cite{V8async}. It is hoped that JACL will eventually be written in itself, and that these @@ -293,41 +292,44 @@ printed to the JavaScript console. })(); \end{verbatim} -In the preceding example, \texttt{window.setTimeout()} is used to -enqueue several JavaScript functions for execution after 1000, 2000, -3000, and 4000 milliseconds. Each enqueued function writes a character -of input to the \texttt{BufferedStream} \texttt{bs} when invoked. 
+\noindent In the preceding example, \texttt{window.setTimeout()} is +used to enqueue several JavaScript functions for execution after 1000, +2000, 3000, and 4000 milliseconds. Each enqueued function writes a +character of input to the \texttt{BufferedStream} \texttt{bs} when +invoked. Before any enqueued function is invoked, execution proceeds to the \texttt{console.log} call, but is suspended by the \texttt{await} keyword. -The \texttt{await} keyword expects a Promise object on its right side, -and JavaScript execution remains suspended until the Promise has -``resolved'', or notified its subscribers that the pending computation -it represents has completed. \texttt{rdr.read()} is an \texttt{async} -function that returns such a Promise. +The \texttt{await} keyword expects a JavaScript \texttt{Promise} +object on its right side, and JavaScript execution remains suspended +until the \texttt{Promise} has ``resolved'', or notified its +subscribers that the pending computation it represents has +completed. \texttt{rdr.read()} is an \texttt{async} function that +returns such a \texttt{Promise}. Once \texttt{rdr} has completed a form --- in this case, the number 123, after about 4000 milliseconds have elapsed --- execution continues, and \texttt{123} is printed to the JavaScript console. -The ``read'' portion of JACL's REPL is satisfied by first establishing -\texttt{BufferedStream} and \texttt{Reader} objects. Then, in an -asynchronous loop, objects are consumed from the \texttt{Reader}, -analyzed, compiled, and evaluated. +The ``read'' portion of the JACL REPL is implemented by first +instantiating \texttt{BufferedStream} and \texttt{Reader} +objects. Then, in an asynchronous loop, objects are consumed from the +\texttt{Reader}, analyzed, compiled, and evaluated. Concurrently, characters may be sent to the \texttt{BufferedStream} -instantiated by the REPL by calling its \texttt{write()} or -\texttt{writeEach()} methods. 
Neither character input nor read object -consumption impede other JavaScript operations, so the JACL REPL is -suitable for embedding in applications. - -Because of the platform and implementation-dependent nature of JACL's -reader, JACL does not support Common Lisp input streams, nor its +instantiated by the REPL by calling the \texttt{write()} or +\texttt{writeEach()} methods of the \texttt{BufferedaStream} +object. Neither character input nor read object consumption impede +other JavaScript operations, so the JACL REPL is suitable for +embedding in applications. + +Because of the platform and implementation-dependent nature of the +JACL reader, JACL does not support Common Lisp input streams, nor its standard \texttt{READ} and \texttt{READ-FROM-STRING} functions. Standard interfaces for extending the reader, such as the -\texttt{SET-MACRO-CHARACTER} function, are not directly +\linebreak\texttt{SET-MACRO-CHARACTER} function, are not directly supported. However, the JACL reader does provide an implementation-specific way to define reader macros. @@ -350,7 +352,7 @@ the DevTools Protocol \cite{GDevTools}. DevTools Protocol clients may then connect to the server and interact with open tabs, such as by evaluating arbitrary JavaScript within the context of the tab. JACL leverages the DevTools Protocol to deliver a command-line REPL client -that may be run on the developer's host machine. The workflow is the +that may be run on development machines. The workflow is the following: \begin{enumerate} @@ -400,20 +402,22 @@ pass produces JavaScript code. AST nodes are represented by generic JavaScript objects with at least the following keys: \begin{itemize} - \item \texttt{op}: The node's name, as a JavaScript string. + \item \texttt{op}: The name of the node, as a JavaScript string. \item \texttt{env}: An object of class \texttt{Env} that represents - the node's lexical environment. - \item \texttt{parent}: The node's parent; this is \texttt{null} for the root. 
- \item \texttt{form}: The node's original source data, a Lisp datum. + the lexical environment of the node. + \item \texttt{parent}: The parent of the node; this is \texttt{null} + for the root. + \item \texttt{form}: The original source data of the node, a Lisp + datum. \end{itemize} -Nodes and \texttt{Env} objects are immutable by convention. Functions -are provided for modifying and merging these objects so as only to -produce new objects. This convention reduces the possibility of -optimization passes interfering with one another. It also eases -understanding the AST, since every AST node contains a copy of all -relevant context. As JavaScript objects, AST nodes are easily -introspected using the Web browser's object inspector. +\noindent Nodes and \texttt{Env} objects are immutable by +convention. Functions are provided for modifying and merging these +objects so as only to produce new objects. This convention reduces the +possibility of optimization passes interfering with one another. It +also eases understanding the AST, since every AST node contains a copy +of all relevant context. As JavaScript objects, AST nodes are easily +introspected using the object inspector of the Web browser. Currently, the \texttt{Env} object tracks evaluation context --- one of \emph{statement}, \emph{expression}, or \emph{return} --- lexical @@ -423,7 +427,7 @@ functions and macros. \subsubsection{Embedding JavaScript with \texttt{JACL:\%JS}} -Unlike JSCL or SLip, JACL's compiler supports a special operator for +Unlike JSCL or SLip, the JACL compiler supports a special operator for constructing fragments of JavaScript code, verbatim, from Lisp. The semantics of this operator, \texttt{JACL:\%JS}, are inspired by a similar feature of ClojureScript, \texttt{js*}. For example, the @@ -443,17 +447,18 @@ syntax. 
There must be as many placeholders as there are arguments to In addition to \texttt{JACL:\%JS}, the JACL compiler currently supports three more special operators for interacting with the host platform: \texttt{JACL:\%NEW}, \texttt{JACL:\%DOT} and -\texttt{JACL:\%CALL}. These perform JavaScript object instantiation, -field access, and function calls, respectively. Since JACL functions -compile into JavaScript functions, \texttt{JACL:\%CALL} is the basis -for JACL's \texttt{FUNCALL}, and for function calls generally. +\texttt{JACL:\%CALL}. These operators perform JavaScript object +instantiation, field access, and function calls, respectively. Since +JACL functions compile into JavaScript functions, \texttt{JACL:\%CALL} +is the basis for \texttt{FUNCALL} in JACL, and for function calls +generally. JACL also supplies a convenience macro, \texttt{JACL:\textbackslash.} or ``the dot macro'' for performing a series of field accesses and method calls\footnote{Strictly speaking, JavaScript ``method calls'' are normal function calls but with a particular value of \texttt{this}.} concisely. The dot macro takes direct inspiration -from Clojure's \texttt{..} macro. \texttt{JACL:\textbackslash.} +from the \texttt{..} macro of Clojure. \texttt{JACL:\textbackslash.} expands to zero or more nested \texttt{JACL:\%DOT} or \texttt{JACL:\%CALL} forms. Here is an example of a \texttt{JACL:\textbackslash.} form --- equivalent to the JavaScript @@ -465,10 +470,10 @@ expansion: (%DOT (%CALL 123 |toString|) |length|) \end{verbatim} -Note that JavaScript identifiers are case sensitive, and so +\noindent Note that JavaScript identifiers are case sensitive, and so case-preserving, pipe-delimited Lisp symbols must be used to refer to -JavaScript object field and method names. The JACL reader's -\emph{readtable case} cannot currently be modified. The dot macro also +JavaScript object field and method names. The \emph{readtable case} of +the JACL reader cannot currently be modified. 
The dot macro also recognizes Lisp or JavaScript strings as JavaScript identifiers. \subsubsection{\texttt{TAGBODY} compilation strategy} @@ -486,14 +491,12 @@ variable \texttt{X} 10 times: end)) \end{verbatim} -JSCL, the existing Lisp closest to JACL, would compile the preceding -code into approximately\footnote{Actual JSCL output is not used - because it includes type checks, generated variable names, and other - code that would obscure the relevant machinery.} the following +\noindent JSCL, the existing Lisp closest to JACL, would compile the +preceding code into approximately\footnote{Actual JSCL output is not + used because it includes type checks, generated variable names, and + other code that would obscure the relevant machinery.} the following JavaScript: -\newpage - \begin{verbatim} function Jump(id, label) { this.id = id; @@ -524,15 +527,13 @@ LOOP: while (true) { } \end{verbatim} -The mechanism is ingenious and general, as it results in correct -behavior of both \emph{local} (to a lexically-enclosing -\texttt{TAGBODY} tag) and \emph{non-local} (to a dynamically-enclosing -\texttt{TAGBODY} tag) jumps. \texttt{GO} tags became \texttt{switch} -labels, and jumps are performed by throwing an instance of -\texttt{Jump} containing the destination label. +\noindent The mechanism is ingenious. \texttt{GO} tags became +\texttt{switch} labels, and jumps became \texttt{throw} +statements. The thrown objects are instances of \texttt{Jump}. Each +instance of \texttt{Jump} contains a destination label. Unfortunately, in this scheme, every jump requires a JavaScript -exception be thrown, severely penalizing \texttt{TAGBODY} as +exception to be thrown, severely penalizing \texttt{TAGBODY} as previously discussed. Fortunately, a straightforward \emph{local jump optimization} can be applied that yields a tremendous performance benefit. Local jump optimization is a known @@ -550,7 +551,7 @@ respective, lexically-enclosing \texttt{TAGBODY}s. Then, \texttt{GO}s. 
JavaScript generated for local \texttt{GO}s does not throw an -exception, but instead leverages the labeled form of JavaScript's +exception, but instead leverages the labeled form of the JavaScript \texttt{continue}\cite{MozLabel} statement to transfer control appropriately. JavaScript generated for \texttt{TAGBODY}s that have been determined to consist only of local jumps omits the @@ -559,9 +560,9 @@ been determined to consist only of local jumps omits the The following code is similar\footnote{Once more, actual compiler output has been significantly modified and reformatted for brevity.} to that generated by the JACL compiler. Cursory benchmarks -\ref{appendix:benchmarks} show JACL's code runs several orders of -magnitude faster than JSCL's, and that JACL's code is almost as fast -as the JavaScript statement \texttt{while(X--)}: +\ref{appendix:benchmarks} show JACL code runs several orders of +magnitude faster than JSCL, and that JACL code is almost as fast as +the JavaScript statement \texttt{while(X--)}: \begin{verbatim} var X = 10; @@ -585,10 +586,10 @@ LOOP: while (true) { \section{Conclusion} -JACL, a new Common Lisp created to ease SPA development, was -introduced. JACL is designed as an efficient, practical tool, with the -needs of industrial SPA developers in mind. JACL integrates tightly -with the Web browser platform and interoperates easily with +We introduced JACL, a new Common Lisp created to ease SPA development. +JACL is designed as an efficient, practical tool, with the needs of +industrial SPA developers in mind. JACL integrates tightly with the +Web browser platform and interoperates easily with JavaScript. Compared to other browser-based Lisps, JACL places a higher emphasis on the value of the REPL, and introduces new techniques for integrating the REPL into the development workflow. @@ -596,15 +597,15 @@ techniques for integrating the REPL into the development workflow. 
\section{Future Work} In order to be practical for application development, JACL must -support the creation of standalone executables. In JACL's case, these -would be single JavaScript files that may be included in an HTML page -and are executed on page load. Fortunately, since JACL development is -image-based, JACL should support the traditional approach of -specifying a Lisp function entrypoint and dumping the Lisp image to -native (JavaScript) code. SBCL's -\texttt{SAVE-LISP-AND-DIE}\cite{SBCLManual} and LispWork's -\texttt{DELIVER}\cite{LispWorksDeliver} functions are two examples of -this in other implementations. +support the creation of standalone executables. In the case of JACL, +these would be single JavaScript files that may be included in an HTML +page and are executed on page load. Fortunately, since JACL +development is image-based, JACL should support the traditional +approach of specifying a Lisp function entrypoint and dumping the Lisp +image to native (JavaScript) code. The +\texttt{SAVE-LISP-AND-DIE}\cite{SBCLManual} function in SBCL and the +\texttt{DELIVER}\cite{LispWorksDeliver} function in LispWorks are two +examples of this functionality in other implementations. JACL should be able to perform rudimentary optimizations such as global function and variable tree shaking\cite{wiki:TreeShaking} in