Write a Lisp interpreter in Java



Some time ago I wanted to write a small interpreted scripting language, just for fun, not posing any serious problems. I then actively read the well-known magic book SICP (Structure and interpretation of computer programs), and I wanted to implement it described the abstraction — functions as first class objects, closures, macros, etc. in Parallel, I was interested in Haskell, and decided to combine the pleasant with the pleasant, and to write an interpreter for it. In my language had to be strict with the semantics of call by value and mutable variables. The resulting language of the Lisp family I have in my local environment associated with the name Liscript, the full implementation of which could fit in about 250 lines, including the parser, the core and console/file input and output. This is more than enough in order for the interest to solve any of the puzzles, which usually torment the students who have managed to learn Lisp in the curriculum.

After some time I wanted to make to the interpreter cross-platform GUI with the prospect of porting to Android, so I implemented the second option of the interpreter in Java, the appearance of which you can see in the picture above. Yes, it supports graphical output and is generally interoperability with Java, and this Tetris written in Liscript see the part in his code. Anyone interested in the details — ask a cat.

In one article it is difficult to fit all the features of the language and its implementation, but writing a series of articles beginning with a lengthy description of the implementation of the parser using the standard library to me modesty would not allow it. (By the way, in both implementations — Haskell and Java, I did not use any libraries for parsing, wrote it manually in Haskell it took as much as 10 lines of code). So some things I will write in the style of "I came, I saw, I conquered". In General, I have written parsers for text in the AST types. The only thing I want to mention a couple of things — I'm not a fan of brackets, but the ease of parsing, gomogennosti code (code as data) and the lack of infix binary operators with the necessity of taking into account precedence and associativity and parse through Poland or marshalling yards are more than outweighs all claims to such syntax. As a result of parsing I get a list that is passed to the function interpretation. In the Java implementation I did not use the library list types and developer your — simple single loop, thread safe, etc.:

spoiler Title
 /** the type of language Liscript - linked list */
public static class ConsList {
/** object value of the head of the current list */
public Object car;
/** list - the value of the tail of the current list */
public ConsList cdr;

/** The constructor with the values head and tail.
* @param h the object - the head of the list
* @param t a list - tail list
*/
ConsList(Object h, ConsList t) { car = h; cdr = t; }

/** checks whether the list is empty
* @return true/false
*/
public boolean isEmpty() { return this.car == null && this.cdr == null; }

/** returns the size of the list
* @return size
*/
public int size() {
int r = 0;
ConsList p = this;
while (!p.isEmpty()) {r += 1; p = p.cdr;}
return r;
}

/** @return string representation of the current list */
@Override
public String toString() { return showVal(this); }
}


Similarly, the implemented types are the classes for the function macro. Code example class-like features:

spoiler Title
 /** the type of language Liscript function */
public static class Func {
/** linked list of parameter names of the function */
public ConsList pars;
/** function body */
public Object body;
/** the environment in which the created function */
public Env clojure;

/** Constructor
* @param p a unidirectional list of the names of the function parameters
* @param b the body of the function

*/
Func(ConsList p, Object b, Env c) { pars = p; body = b; clojure = c; }

/** @return the string representation of the function */
@Override
public String toString() { return showVal(this); }
}


the

environment


Implemented in full accordance with the SICP is the hierarchical structure of dictionaries, where each dictionary is a HashMap<String, Object> the class contains methods to add, retrieve and change the value by string name. And here we can be creative: for example, what to do when trying to get the value of the missing key (variable name)? To privivat execution with an error? Or return the string representation of a key? The same if trying to add a variable, whose name is already contained in the dictionary of the current frame environment to give an error or override variable? Such trifles as the result determine the peculiarities of the language, and for example I like that I can identify them. Get autocamioane and deep linking, with all the pluses and minuses of this approach.

Also I wanted to implement variable arity special forms directly in the kernel language, and then not in the standard library. Many of them allow the transfer of a number of parameters and/or work not quite as their same-named counterparts in other dialects of Lisp is not complicated the implementation of the interpreter. Example of working with the environment (after = > is the answer of interpreter):

the
++ "a1 = a1, b1 = b1, c1 =" c1
=> a1 = a1, b1 = b1, c1 = c1
def 1 a1 b1 (+ a1 1) (++ "c" a1) (+ a1 b1)
=> OK
++ "a1 = a1, b1 = b1, c1 =" c1
=> a1 = 1, b1 = 2, c1 = 3
set! (++ "a" 1) 5 c1 10
=> OK
++ "a1 =" a1 ", b1 = " (get (++ "b" 1)) ", c1 = " c1
=> a1 = 5, b1 = 2, c1 = 10

the

Functions


Are first-class objects. When you create a function captures the current context. When invoked the arguments are evaluated sequentially. Implemented automatic tail-call optimization — TSO:

the
defn is-even (n) (cond (= n 0) true (is-odd (- n 1)) )
=> OK
defn is-odd (n) (cond (= n 0) false (is-even (- n 1)) )
=> OK
is-even 10000
=> true
is-even 10001
=> false

A special form tray allows you to print the call stack when the function is applied. So, for example, is the factorial of 3:

spoiler Title
defn f (i) (cond (< i 2) 1 (* i (f (- i 1))))
=> OK
tray f 3
=> 
1 ← (f 3)
2 ← f
2 → (lambda (i) (cond (< i 2) 1 (* i (f (- i 1)))))
2 ← (cond (< i 2) 1 (* i (f (- i 1))))
3 ← (< i 2)
4 ← i
4 → 3
3 → false
3 ← (* i (f (- i 1)))
4 ← i
4 → 3
4 ← (f (- i 1))
5 ← f
5 → (lambda (i) (cond (< i 2) 1 (* i (f (- i 1)))))
5 ← (- i 1)
6 ← i
6 → 3
5 → 2
5 ← (cond (< i 2) 1 (* i (f (- i 1))))
6 ← (< i 2)
7 ← i
7 → 2
6 → false
6 ← (* i (f (- i 1)))
7 ← i
7 → 2
7 ← (f (- i 1))
8 ← f
8 → (lambda (i) (cond (< i 2) 1 (* i (f (- i 1)))))
8 ← (- i 1)
9 ← i
9 → 2
8 → 1
8 ← (cond (< i 2) 1 (* i (f (- i 1))))
9 ← (< i 2)
10 ← i
10 → 1
9 → true
8 → 1
7 → 1
6 → 2
5 → 2
4 → 2
3 → 6
2 → 6
1 → 6
6


If the function call arguments passed is greater than the number of formal parameters, then the last formal parameter will be contacted with a list of the calculated remaining passed arguments. If it is less, then these formal parameters will be bound in the environment of the function evaluation and when evaluating her body will be searched in the environments of the upper level of the hierarchy. You could choose the option of currying to return the function partially applied, or to issue a mismatch error the number of arguments.

the

Macros


Liscript supports so-called runtime macros, which are first-class objects of the language. You can create them, assign names to return as a result of the functions and perform the process (code interpretation). Derived from the source code text expression immediately begins to be interpreted without a preliminary stage of macro expansion, and therefore macros are full types of language and disclosed and evaluated in the process of interpretation according to the rules of calculating macros is first substituted into the macro body uncalculated passed arguments and then the body of the macro is evaluated in the current environment (in contrast to the function body that is always calculated in a separate private environment, which already contains pre-computed values of the actual parameters)
the
 def m (macro (i r) (cond (<= i 0) 'r (m (- i 1) (cons i r))))
=> OK
m 5 nil
=> (cons (- (- (- (- 5 1) 1) 1) 1) (cons (- (- (- 5 1) 1) 1) (cons (- (- 5 1) 1) (cons (- 5 1) (cons 5 nil)))))
eval (m 5 nil)
=> (1 2 3 4 5)

the

Interoperability with Java


Implemented through the Java Reflection mechanism. Only 2 special forms: class — specifies class name and java class calls the method with the passed parameters. The search for the right method of among the same name is overloaded based on the number and types of the parameters passed. To speed once found and called in the current session of the interpreter class method is stored in a table called methods and for any method I call first, is searched in a table — memoization.

the
def m (java (class "java.util.HashMap") "new")
=> OK
java m "put" 1 "a"
=> OK
java m "put" "b" 2
=> OK
java m get 1
=> a
m
=> {1=a, 2=b}

Thus, we can access many resources of the language implementation, including a chart. This code opens a separate window with a painted red line on a white background:

the
(def image (java (class "java.awt.image.BufferedImage") "new" 100 100 1))
(def imageGraphics (java image "createGraphics"))
(java imageGraphics "setBackground" (java (class "java.awt.Color") new 255 255 255))
(java imageGraphics "clearRect" 0 0 100 100)
(java imageGraphics "setColor" (java (class "java.awt.Color") new 255 0 0))

(java imageGraphics "drawLine" 10 10 90 90)

(def icon (java (class "javax.swing.ImageIcon") "new"))
(java icon "setImage", image)
(def label (java (class "javax.swing.JLabel") "new"))
(java label "setIcon" icon)

(def window (java (class "javax.swing.JFrame") new))
(java window "setLayout" (java (class "java.awt.FlowLayout") "new"))
(java window "add" label)
(java, window, "setVisible", true)
(java window "pack")

Of course, you can select the typical blocks in separate functions or macros and assign them once in the standard library, which is loaded at startup of the interpreter. And because the interpreter implements a multi-threaded execution of tasks in separate tabs with shared mutable environment (Yes, I know that you have selected as the repository of dictionary, HashMap is not thread safe and it can create problems with the simultaneous change of the environment of parallel threads, and it was better to use a HashTable), so, this allows one tab to run the procedure that calls itself after a timeout timer, and in another window (thread) — the procedure for requesting user input and perform certain actions. And was implemented Tetris (with the feature of blocking user input — every command must be confirmed using Ctrl+Enter).

This project is available on Github at github.com/Ivana-/Liscript-GUI-Java-Swing. It contains the source files of the interpreter and his Swing GUI (I know it's the 21st century, but until JavaFX rewrite not in a hurry), and script files of the standard library description language interpreter and a sufficient number of demo examples. If you wish to criticize please note that this is my first project in Java and OOP language, so things can be implemented optimally. But I would be interested in opinions and feedback on this project.
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Fresh hay from the cow, or 3000 icons submitted!

Knowledge base. Part 2. Freebase: make requests to the Google Knowledge Graph

Group edit the resources (documents) using MIGXDB