Is a statically-typed full Lisp variant possible?

programming-languages lisp static-typing

29,866

Solution 1

Yes, it's very possible, although a standard HM-style type system is usually the wrong choice for most idiomatic Lisp/Scheme code. See Typed Racket for a recent language that is a "Full Lisp" (more like Scheme, actually) with static typing.

Solution 2

If all you wanted was a statically typed language that looked like Lisp, you could do that rather easily, by defining an abstract syntax tree that represents your language and then mapping that AST to S-expressions. However, I don't think I would call the result Lisp.

If you want something that actually has Lisp-y characteristics besides the syntax, it's possible to do this with a statically typed language. However, there are many characteristics to Lisp that are difficult to get much useful static typing out of. To illustrate, let's take a look at the list structure itself, called the cons, which forms the primary building block of Lisp.

Calling the cons a list, though (1 2 3) looks like one, is a bit of a misnomer. For example, it's not at all comparable to a statically typed list, like C++'s std::list or Haskell's list. Those are single-dimensional linked lists where all the cells are of the same type. Lisp happily allows (1 "abc" #\d 'foo). Plus, even if you extend your static-typed lists to cover lists-of-lists, the type of these objects requires that every element of the list is a sublist. How would you represent ((1 2) 3 4) in them?

Lisp conses form a binary tree, with leaves (atoms) and branches (conses). Further, the leaves of such a tree may contain any atomic (non-cons) Lisp type at all! The flexibility of this structure is what makes Lisp so good at handling symbolic computation, ASTs, and transforming Lisp code itself!

So how would you model such a structure in a statically typed language? Let's try it in Haskell, which has an extremely powerful and precise static type system:

type Symbol = String
data Atom = ASymbol Symbol | AInt Int | AString String | Nil
data Cons = CCons Cons Cons 
            | CAtom Atom

Your first problem is going to be the scope of the Atom type. Clearly, we haven't picked an Atom type of sufficient flexibility to cover all the types of objects we want to sling around in conses. Instead of trying to extend the Atom data structure as listed above (which you can clearly see is brittle), let's say we had a magical type class Atomic that distinguished all the types we wanted to make atomic. Then we might try:

class Atomic a where ?????
data Atomic a => Cons a = CCons Cons Cons 
                          | CAtom a

But this won't work because it requires all atoms in the tree to be of the same type. We want them to be able to differ from leaf to leaf. A better approach requires the use of Haskell's existential quantifiers:

class Atomic a where ?????
data Cons = CCons Cons Cons 
            | forall a. Atomic a => CAtom a

But now you come to the crux of the matter. What can you do with atoms in this kind of structure? What structure do they have in common that could be modeled with Atomic a? What level of type safety are you guaranteed with such a type? Note we haven't added any functions to our type class, and there's a good reason: the atoms share nothing in common in Lisp. Their supertype in Lisp is simply called t (i.e. top).

In order to use them, you'd have to come up with mechanisms to dynamically coerce the value of an atom to something you can actually use. And at that point, you've basically implemented a dynamically typed subsystem within your statically typed language! (One cannot help but note a possible corollary to Greenspun's Tenth Rule of Programming.)

Note that Haskell provides support for just such a dynamic subsystem with an Obj type, used in conjunction with a Dynamic type and a Typeable class to replace our Atomic class, that allow arbitrary values to be stored with their types, and explicit coercion back from those types. That's the kind of system you'd need to use to work with Lisp cons structures in their full generality.

What you can also do is go the other way, and embed a statically typed subsystem within an essentially dynamically typed language. This allows you the benefit of static type checking for the parts of your program that can take advantage of more stringent type requirements. This seems to be the approach taken in CMUCL's limited form of precise type checking, for example.

Finally, there's the possibility of having two separate subsystems, dynamically and statically typed, that use contract-style programming to help navigate the transition between the two. That way the language can accommodate usages of Lisp where static type checking would be more of a hindrance than a help, as well as uses where static type checking would be advantageous. This is the approach taken by Typed Racket, as you will see from the comments that follow.

Solution 3

My answer, without a high degree of confidence is probably. If you look at a language like SML, for example, and compare it with Lisp, the functional core of each is almost identical. As a result, it doesn't seem like you would have much trouble applying some kind of static typing to the core of Lisp (function application and primitive values).

Your question does say full though, and where I see some of the problem coming in is the code-as-data approach. Types exist at a more abstract level than expressions. Lisp doesn't have this distinction - everything is "flat" in structure. If we consider some expression E : T (where T is some representation of its type), and then we consider this expression as being plain ol' data, then what exactly is the type of T here? Well, it's a kind! A kind is a higher, order type, so let's just go ahead and say something about that in our code:

E : T :: K

You might see where I'm going with this. I'm sure by separating out the type information from the code it would be possible to avoid this kind of self-referentiality of types, however that would make the types not very "lisp" in their flavour. There are probably many ways around this, though it's not obvious to me which would be the best one.

EDIT: Oh, so with a bit of googling, I found Qi, which appears to be very similar to Lisp except that it's statically typed. Perhaps it's a good place to start to see where they made changes to get the static typing in there.

Solution 4

Dylan: Extending Dylan's type system for better type inference and error detection

View more solutions

29,866

Lambda the Penultimate

Updated on July 08, 2022

Comments

Lambda the Penultimate almost 2 years

Is a statically-typed full Lisp variant possible? Does it even make sense for something like this to exist? I believe one of a Lisp language's virtues is the simplicity of its definition. Would static typing compromise this core principle?
- mcandre almost 13 years
  
  I like Lisp's freeform macros, but I like the robustness of Haskell's type system. I'd love to see what a statically-typed Lisp looks like.
- Hamish Grubijan about 11 years
  
  Good question! I believe shenlanguage.org does that. I wish it became more mainstream.
- Anentropic over 5 years
  
  github.com/carp-lang/Carp
- aoeu256 almost 5 years
  
  How do you do symbolic computing with Haskell? (solve 'x '(= (+ x y) (* x y))). If you put it in a string there is no checking (unlike Lisp which can use macros to add checking). If you use algebraic data types or lists... It'll be very verbose: solve (Sym "x") (Eq (Plus (Sym "x") (Sym "y")) (Mult (Sym "x") (Sym "y")))
Eli Barzilay almost 14 years

This answer suffers from a fundamental problem: you're assuming that static type systems must be HM-style. The basic concept that cannot be expressed there, and is an important feature of Lisp code, is subtyping. If you'll take a look at typed racket, you'll see that it can easily express any kind of list -- including things like (Listof Integer) and (Listof Any). Obviously, you'd suspect the latter to be useless because you know nothing about the type, but in TR, you can later use (if (integer? x) ...) and the system will know that x is an Integer in the 1st branch.
Eli Barzilay almost 14 years

Oh, and it's a bad characterization of typed racket (which is different from unsound type systems that you'd find in some places). Typed Racket is a statically typed language, with no runtime overhead for typed code. Racket still allows writing some code in TR and some in the usual untyped language -- and for these cases contracts (dynamic checks) are used to guard typed code from the potentially misbehaving untyped code.
Owen S. almost 14 years

@Eli Barzilay: Thanks, and I've got a 3-part response: 1. I think the argument applies to other statically typed systems as well, I merely chose an HM type system for illustration. Like you note, you'd have to do (Listof Any) to support an arbitrary Lisp cons tree, right? My general point is the more you try to create a structure to do Lisp-y stuff, the more generic and unhelpful the type you'll be forced to feed the type system.
Owen S. almost 14 years

@Eli Barzilay: 2. integer? is possible to code if you're working with Typeable objects: case (cast a :: Int) of Just _ -> True; Nothing -> False. So yes, Haskell supports this limited kind of subtyping. Usually I'd use type classes in Haskell to do the sorts of things that I might use subtyping for in C++ or Lisp, and if I were to attempt a Lisp using an HM type system I'd want to keep that approach. I think it would be workable. But I haven't tried it. :-)
Owen S. almost 14 years

@Eli Barzilay: 3. You're right, I didn't realize you can get from Typed Racket out to Racket. I think of Typed Racket and untyped Racket as parts of a whole, which I'm sure was your intent. :-) I'll try to clarify.
Owen S. almost 14 years

@Eli Barzilay: I lied, there are four parts: 4. It's interesting to me how industry-accepted C++ coding style has been moving gradually away from subtyping toward generics. The weakness is that the language doesn't provide help for declaring the interface a generic function is going to use, something type classes could certainly help with. Plus, C++0x may be adding type inference. Not HM, I suppose, but creeping in that direction?
Eli Barzilay almost 14 years

Owen: (1) the main point is that you need subtypes to express the kind of code lispers write -- and you just can't have that with HM systems, so you're forced to custom types and constructors for each use, which makes the whole thing much more awkward to use. In typed racket using a system with subtypes was a corollary of an intentional design decision: that the result should be able to express the types of such code without changing the code or creating custom types.
Eli Barzilay almost 14 years

(2) Yes, dynamic types are getting to be popular in static languages as a kind of a workaround to get some of the benefits of dynamically typed languages, with the usual tradeoff of these values being wrapped in a way that makes the types identifiable. But here too typed racket is doing a very good job at making it convenient within the language -- the type checker uses occurrences of predicates to know more about the types. For example, see the typed example on the racket page and see how string? "reduces" a list of strings and numbers to a list of strings.
Eli Barzilay almost 14 years

(3) If you view them as part of a whole then obviously you'd see it as a way to type parts of the code. But in practice, it is very possible (and often done in practice) to use only the typed language so for all practical purpose you are working in a statically typed environment. (As for (4), I don't know how C++ moves, but if it's more features that are hard to use then I'm sure they'll get there...)
Zorf almost 14 years

The problem here is, what is the type of the list that makes up the entire source code of a typed racket program?
Eli Barzilay almost 14 years

That would usually be Sexpr.
ssice almost 8 years

But then, I can write coerce :: a->b in terms of eval. Where's the type safety?
Eli Barzilay almost 8 years

@ssice: when you're using an untyped function like eval you need to test the result to see what comes out, which is nothing new in Typed Racked (same deal as a function that takes a union type of String and Number). An implicit way to see that this can be done is the fact that you can write and use a dynamically typed language in an HM-statically-typed language.
Björn Lindqvist about 5 years

The link is dead. But in any case, Dylan is not statically typed.
Rainer Joswig about 5 years

@BjörnLindqvist: that link was to a thesis about adding gradual typing to Dylan.
Rainer Joswig about 5 years

@BjörnLindqvist: I linked to an overview paper.
Björn Lindqvist about 5 years

But gradual typing doesn't count as static typing. If it did, then Pypy would be statically typed Python since it also uses gradual typing.
Rainer Joswig about 5 years

@BjörnLindqvist: if we add static types via gradual typing and these are checked during compilation, then this is static typing. It's just not that the whole program is statically typed, but parts/regions. homes.sice.indiana.edu/jsiek/what-is-gradual-typing 'Gradual typing is a type system I developed with Walid Taha in 2006 that allows parts of a program to be dynamically typed and other parts to be statically typed.'
Diagon over 4 years

It seems the next iteration after Qi is Shen, developed by the same person.
Danielhu over 3 years

Why a standard HM-style is usually the wrong choice for lisp?
Eli Barzilay over 3 years

(if (...) 2 'three)