I am having issues testing out the Scala Parser Combinator functionality for a simple Book DSL.
Firstly there is a book class:
case class Book (name:Stri
When you use ~> or <~, you are discarding the element from which the arrow comes. For example:
"book" ~> stringLit // discards "book"
"book" ~> stringLit ~> "has" // discards "book" and then stringLit
"book" ~> stringLit ~> "has" ~> "isbn" // discards everything except "isbn"
"book" ~> stringLit ~> "has" ~> "isbn" ~> stringLit // discards everything but the last stringLit
You could write it like this:
def bookSpec: Parser[Book] = ("book" ~> stringLit <~ "has" <~ "isbn") ~ stringLit ^^ {
case name ~ isbn => new Book(name,isbn)
}
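For reference, here is a minimal self-contained sketch of that parser wired into a StandardTokenParsers object. The names QuotedBook and QuotedBookParser are made up for illustration. It assumes the scala-parser-combinators module is on the classpath and that names and ISBNs are written as quoted string literals in the input, since that is what stringLit matches.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical stand-in for the Book class from the question.
case class QuotedBook(name: String, isbn: String)

object QuotedBookParser extends StandardTokenParsers {
  // Register the DSL keywords so the lexer emits them as Keyword tokens.
  lexical.reserved ++= List("book", "has", "isbn")

  // The parentheses matter: ~ binds tighter than <~, so without them
  // "isbn" ~ stringLit would be grouped first and then discarded by <~.
  def bookSpec: Parser[QuotedBook] =
    ("book" ~> stringLit <~ "has" <~ "isbn") ~ stringLit ^^ {
      case name ~ isbn => QuotedBook(name, isbn)
    }

  // phrase(...) requires the whole input to be consumed.
  def parse(s: String): ParseResult[QuotedBook] =
    phrase(bookSpec)(new lexical.Scanner(s))
}
```

With this sketch, QuotedBookParser.parse("book \"ABC\" has isbn \"DEF\"") should succeed with QuotedBook("ABC", "DEF"), because the lexer strips the quotes from string literals.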
You're on the right track. There are a few issues in your parser. I'll post the corrected code, then explain the changes.
import scala.util.parsing.combinator._
import scala.util.parsing.combinator.syntactical._
case class Book (name: String, isbn: String) {
  def niceName = name + " : " + isbn
}
object BookParser extends StandardTokenParsers {
  lexical.reserved += ("book","has","isbn")
  def bookSpec: Parser[Book] = "book" ~ ident ~ "has" ~ "isbn" ~ ident ^^ {
    case "book" ~ name ~ "has" ~ "isbn" ~ isbn => new Book(name, isbn) }
  def parse (s: String) = {
    val tokens = new lexical.Scanner(s)
    phrase(bookSpec)(tokens)
  }
  def test (exprString : String) = {
    parse (exprString) match {
      case Success(book, _) => println("Book: " + book.niceName)
      case Failure(msg, _) => println("Failure: " + msg)
      case Error(msg, _) => println("Error: " + msg)
    }
  }
  def main (args: Array[String]) = {
    test ("book ABC has isbn DEF")
  }
}
1. Parser return value
In order to return a book from a parser, you need to give the type inferencer some help. I changed the definition of the bookSpec function to be explicit: it returns a Parser[Book]. That is, it returns an object which is a parser for books.
2. stringLit
The stringLit function you used comes from the StdTokenParsers trait. stringLit is a function that returns Parser[String], but the pattern it matches includes the double-quotes that most languages use to delimit a string literal. If you are happy with double-quoting words in your DSL, then stringLit is what you want. In the interest of simplicity, I replaced stringLit with ident. ident looks for a Java-language identifier. This isn't really the right format for ISBNs, but it did pass your test case. :-)
To match ISBNs correctly, I think you'll need to use a regular expression instead of ident.
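One way to try the regular-expression route is to switch from StandardTokenParsers to RegexParsers, which turns string literals and Regex values into parsers implicitly. The sketch below is only an illustration: IsbnBook, RegexBookParser, bookName and isbnCode are invented names, and both patterns are assumptions (a single-word name, and an ISBN made of digits and hyphens with an optional trailing X) rather than a validated ISBN grammar.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical result type for this sketch.
case class IsbnBook(name: String, isbn: String)

object RegexBookParser extends RegexParsers {
  // Assumption: a book name is a single identifier-like word.
  def bookName: Parser[String] = """[A-Za-z_]\w*""".r

  // Assumption: digits and hyphens, ending in a digit or X.
  // This checks the shape only, not the ISBN check digit.
  def isbnCode: Parser[String] = """[0-9][0-9-]*[0-9Xx]""".r

  def bookSpec: Parser[IsbnBook] =
    ("book" ~> bookName <~ "has" <~ "isbn") ~ isbnCode ^^ {
      case n ~ i => IsbnBook(n, i)
    }

  // parseAll fails unless the entire input is consumed.
  def parse(s: String): ParseResult[IsbnBook] = parseAll(bookSpec, s)
}
```

RegexParsers skips whitespace between tokens by default, so an input such as "book ABC has isbn 978-0-13-468599-1" should parse without any extra lexer setup. Note that there are no reserved words here, so keyword handling is looser than with the token-based parser.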
3. Ignore-left sequence
Your matcher used a string of ~> combinators. ~> is a function that takes two Parser[_] objects and returns a Parser that recognizes both in sequence, then returns only the result of the right-hand side. By using a whole chain of them to lead up to your final stringLit, your parser would ignore everything except the final word in the sentence. That means it would throw away the book name, too.
Also, when you use ~> or <~, the ignored tokens should not appear in your pattern matching.
For simplicity, I changed these all to simple sequence functions and left the extra tokens in the pattern match.
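To make the difference concrete, here is a small sketch that puts the all-~> chain next to a version that keeps both values. DiscardDemo, lastOnly, both and run are illustrative names, not part of the original code, and the scala-parser-combinators module is assumed to be available.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object DiscardDemo extends StandardTokenParsers {
  lexical.reserved ++= List("book", "has", "isbn")

  // Every ~> drops its left side, so only the final ident survives.
  def lastOnly: Parser[String] =
    "book" ~> ident ~> "has" ~> "isbn" ~> ident

  // The keywords are dropped but both idents are kept, so the
  // pattern match names only the two values that remain.
  def both: Parser[(String, String)] =
    ("book" ~> ident <~ "has" <~ "isbn") ~ ident ^^ {
      case name ~ isbn => (name, isbn)
    }

  // Helper to run any of the parsers above over a full input string.
  def run[T](p: Parser[T], s: String): ParseResult[T] =
    phrase(p)(new lexical.Scanner(s))
}
```

On the input "book ABC has isbn DEF", lastOnly should yield just "DEF", while both should yield the pair ("ABC", "DEF").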
4. Matching results
The test method needs to match all the possible results from the parse() function. So, I added the Failure() and Error() cases. Also, even Success includes both your return value and the Reader object. We don't care about the reader, so I just used "_" to ignore it in the pattern match.
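If you would rather return a value than print, the same three-way match can be folded into a string or any other result type. This is a rough sketch with made-up names (ResultDemo, title, describe), using a deliberately tiny grammar so that the failure path is easy to trigger.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object ResultDemo extends StandardTokenParsers {
  lexical.reserved ++= List("book")

  // Tiny grammar for the demo: the keyword "book" followed by one identifier.
  def title: Parser[String] = "book" ~> ident

  // Cover all three ParseResult cases; the second field of each is the
  // remaining input (a Reader), which is ignored here.
  def describe(s: String): String =
    phrase(title)(new lexical.Scanner(s)) match {
      case Success(name, _) => "ok: " + name
      case Failure(msg, _)  => "failure: " + msg
      case Error(msg, _)    => "error: " + msg
    }
}
```

Calling ResultDemo.describe("book ABC") should take the Success branch, while an input that lacks the leading keyword, such as "ABC", should land in the Failure branch.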
Hope this helps!