Is there a language specification for Clojure? Something that precisely defines the lexical syntax and grammar in EBNF or something similar?
The closest thing that I co
There is no language specification. If there are any plans for one in the future, I haven't heard of them.
The grammar linked by fogus, which was within the Eclipse plugin Counterclockwise, is no longer used by that project and has been removed from it.
Clojure.g4, an ANTLR grammar
For a more up-to-date ANTLR grammar for Clojure, see Clojure.g4 (permalink) from grammars-v4, a collection of “grammars written for ANTLR v4”. Clojure.g4 is small and easy to read, and it has successfully parsed Compojure and clojure.core in the past, but that does not guarantee that it can parse all Clojure code correctly.
LispReader.java
The most authoritative specification of Clojure’s syntax is Clojure’s source code itself. Clojure does not use an abstract grammar, only a custom parser, but you can understand the grammar after careful study of the parser’s implementation. Here is Clojure’s parser: LispReader.java (permalink).
LispReader.java uses a few classes from other files in the same directory, such as LineNumberingPushbackReader, but most of the code is in that file. In LispReader, the main function is read. read uses isWhitespace to ignore whitespace and commas. It also detects numbers and hands off the parsing to readNumber. For most other characters, such as ( and #, read hands off interpretation to the objects in the macros and dispatchMacros arrays. You can follow the code from there.
There is also a Clojure reimplementation of LispReader.java called clojure.tools.reader. Its source code might be easier to read than LispReader, since it is in Clojure, not Java. clojure.tools.reader differs from LispReader.java mainly in that it can read a few minor extra syntaxes proposed for Clojure and handles errors better.
Let's look at a syntax error or two:
user=> (defn)
Syntax error macroexpanding clojure.core/defn at (REPL:1:1).
() - failed: Insufficient input at: [:fn-name] spec: :clojure.core.specs.alpha/defn-args
And
user=> (fn [3])
Syntax error macroexpanding clojure.core/fn at (REPL:1:1).
(3) - failed: Extra input at: [:fn-tail :arity-1 :params] spec: :clojure.core.specs.alpha/param-list
3 - failed: vector? at: [:fn-tail :arity-n :params] spec: :clojure.core.specs.alpha/param-list
It is clear that the syntax of the core macros is now (version 1.10) checked using clojure.spec. If and when the Clojure in Clojure project advances, we can expect spec to extend its reach into the compiler proper.
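The spec named in the first error message can also be queried directly. This is a minimal sketch, assuming Clojure 1.10, where the spec is registered under the keyword shown in that message; `defn-args-spec`, `good-args` and `bad-args` are names made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s])
(require 'clojure.core.specs.alpha) ; normally already loaded; harmless to repeat

;; The spec keyword is copied from the error message for (defn).
(def defn-args-spec :clojure.core.specs.alpha/defn-args)

;; The argument forms of a defn call, without the defn symbol itself.
(def good-args '(square [x] (* x x)))
(def bad-args '()) ; what (defn) passes: no arguments at all

(println (s/valid? defn-args-spec good-args)) ; true
(println (s/valid? defn-args-spec bad-args))  ; false

;; Prints spec's own explanation of why the empty argument list fails.
(println (s/explain-str defn-args-spec bad-args))
```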
The point is that spec has full EBNF power, so the source code will then contain a full EBNF of the language. The notation is explained in Clojure - clojure.spec: Rationale and Overview:
Sequences
Specs for sequences/vectors use a set of standard regular expression operators, with the standard semantics of regular expressions:
cat - a concatenation of predicates/patterns
alt - a choice of one among a set of predicates/patterns
* - zero or more occurrences of a predicate/pattern
+ - one or more
? - one or none
& - takes a regex op and further constrains it with one or more predicates
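To see how these operators read as a grammar, here is a small sketch, assuming Clojure 1.9 or later. The `::arg`, `::call` and `::pairs` specs are hypothetical examples, not part of Clojure's own specs. In EBNF terms, `::call` is roughly: call = symbol, { number | string }, [ keyword ].

```clojure
(require '[clojure.spec.alpha :as s])

;; alt: an argument is either a number or a string.
(s/def ::arg (s/alt :num number? :str string?))

;; cat, * and ?: a symbol, then zero or more args, then an optional keyword.
(s/def ::call (s/cat :op symbol?
                     :args (s/* ::arg)
                     :flag (s/? keyword?)))

;; + and &: one or more numbers, further constrained to an even count.
(s/def ::pairs (s/& (s/+ number?) #(even? (count %))))

(println (s/valid? ::call '(foo 1 "two" :verbose))) ; true
(println (s/valid? ::call '(foo)))                  ; true
(println (s/valid? ::call '(1 2)))                  ; false, no leading symbol
(println (s/valid? ::call '(foo :a :b)))            ; false, extra input
(println (s/valid? ::pairs [1 2 3 4]))              ; true
(println (s/valid? ::pairs [1 2 3]))                ; false, odd count
```

Calling `s/conform` on a valid input returns the parse labelled with the keywords given to `cat` and `alt`, which is what makes these specs usable as a parser as well as a validator.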
When will this happen? My (utterly uninformed) impression is that the core team are drowning in alligators and have almost forgotten their original intention to drain this swamp.
A previous answer referring to spec in general terms for Clojure 1.9 was deleted. I think its use to define and check macro syntax is new with 1.10.
This is the closest thing to an official Clojure EBNF that you are likely to find.
https://github.com/laurentpetit/ccw/blob/3738a4fd768bcb0399630b7f6a6427a3066bdaa9/clojure-antlr-grammar/src/Clojure.g