parser-combinators | 易学教程

How to skip whitespace but use it as a token delimiter in a parser combinator

I am trying to build a small parser where the tokens (luckily) never contain whitespace. Whitespace characters (spaces, tabs and newlines) are essentially token delimiters (apart from cases where there are brackets etc.). I am extending the RegexParsers class. If I turn on skipWhitespace, the parser greedily joins tokens together when the next token matches the regular expression of the previous one. If I turn off skipWhitespace, on the other hand, it complains because the spaces are not part of the definition. I am trying to match the BNF as much as possible, and given that whitespace is

Scalac hanging on phase typer of RegexParser

I have a Scala program which, among other things, has a parser combinator. This is done by extending scala.util.parsing.combinator.RegexParsers . I had developed it using Scala 2.10 and all was working fine. Yesterday I upgraded my system to Scala 2.11.4, together with IntelliJ 14.02 (not that it matters). However, whenever I try to compile this program now, scalac hangs during this phase: scalac: phase typer on MyParser.scala I changed absolutely nothing in this code, and I can't understand why it is hanging or where I should start. IntelliJ had a warning about postfix operators for parser

Scala Parser Combinators: Parsing in a stream

I'm using the native parser combinator library in Scala, and I'd like to use it to parse a number of large files. I have my combinators set up, but the file that I'm trying to parse is too large to be read into memory all at once. I'd like to be able to stream from an input file through my parser and write it back to disk so that I don't need to store it all in memory at once. My current system looks something like this: val f = Source.fromFile("myfile") parser.parse(parser.document.+, f.reader).get.map{_.writeToFile} f.close This reads the whole file in as it parses, which I'd like to avoid.

Cannot compute minimal length of a parser - uu-parsinglib in Haskell

Consider this code snippet: pSegmentBegin p i = pIndentExact i *> ((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure [])) If I change this code in my parser to: pSegmentBegin p i = do pIndentExact i ((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure [])) I get an error: canot compute minmal length of a parser due to occurrence of a moadic bind, use addLength to override I thought the two parsers should behave the same way. Why does this error occur? EDIT The above example is very simple (to simplify the question) and as noted below it is not necessary to use do notation here, but the real case

Errors and failures in Scala Parser Combinators

I would like to implement a parser for some defined language using Scala Parser Combinators. However, the software that will compile the language does not implement all of the language's features, so I would like to fail if these features are used. I tried to put together a small example below: object TestFail extends JavaTokenParsers { def test: Parser[String] = "hello" ~ "world" ^^ { case _ => ??? } | "hello" ~ ident ^^ { case "hello" ~ id => s"hi, $id" } } I.e., the parser succeeds on "hello" + some identifier, but fails if the identifier is "world". I see that there exist failure() and err() parsers
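One way to reject a specific identifier is sketched below. It assumes the scala-parser-combinators module is on the classpath; the object name GreetingSketch, the helper run, and the error message are hypothetical and not taken from the question. The sketch parses the identifier first and then uses >> (into) to choose between err and success, so the rejected case is reported as a non-backtracking Error rather than a thrown exception.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

object GreetingSketch extends JavaTokenParsers {
  // Parse "hello <identifier>", then decide based on the identifier's value.
  def greeting: Parser[String] =
    "hello" ~> ident >> {
      // err produces an Error, which stops alternatives from being tried.
      case "world" => err("'world' is not supported")
      // Any other identifier succeeds without consuming more input.
      case id      => success(s"hi, $id")
    }

  // Hypothetical helper: require the whole input to match.
  def run(input: String): ParseResult[String] = parseAll(greeting, input)
}
```

Using failure in place of err would produce a Failure instead, which lets an enclosing | alternative still be attempted; which of the two fits depends on whether backtracking is wanted at that point in the grammar.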

Parsing an indentation based language using scala parser combinators

Is there a convenient way to use Scala's parser combinators to parse languages where indentation is significant? (e.g. Python) Let's assume we have a very simple language where this is a valid program block inside the block and we want to parse this into a List[String] with each line inside the block as one String . We first define a method that takes a minimum indentation level and returns a parser for a line with that indentation level. def line(minIndent:Int):Parser[String] = repN(minIndent + 1,"\\s".r) ~ ".*".r ^^ {case s ~ r => s.mkString + r} Then we define a block with a minimum

Turning a list/sequence of combinator parsers into a single one

I have a list of values from which I can construct a list of parsers that depend on these values by mapping (see example). What I then want to do is turn the list of parsers into a single parser by concatenation. One possibility is using foldLeft and ~ : parsers.foldLeft(success(Nil)){case (ps,p) => ps ~ p ^^ {case xs ~ x => x ::xs}} ^^ (_.reverse) Is this efficient? I don't know how combinator parsers work; will there be a call stack whose depth is the length of the list? Might I therefore run into stack overflow errors for very long concatenations? Better way: Is there a different way that is more readable? Example
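The fold described in the question can be written as a small reusable helper. This is a minimal sketch assuming the scala-parser-combinators module is available; the names SequenceSketch, sequence, keywords and all are hypothetical, and the keyword list is made-up sample input.

```scala
import scala.util.parsing.combinator.RegexParsers

object SequenceSketch extends RegexParsers {
  // Chain the parsers left to right. Each result is prepended to the
  // accumulator, and the list is reversed once at the end.
  def sequence[A](parsers: List[Parser[A]]): Parser[List[A]] =
    parsers.foldLeft(success(List.empty[A])) { (acc, p) =>
      acc ~ p ^^ { case xs ~ x => x :: xs }
    } ^^ (_.reverse)

  // Hypothetical sample input: one literal parser per keyword.
  val keywords: List[String] = List("alpha", "beta", "gamma")
  val all: Parser[List[String]] = sequence(keywords.map(k => literal(k)))
}
```

Starting the fold from success(List.empty[A]) rather than success(Nil) gives the accumulator the type Parser[List[A]] up front. The resulting parser is nested once per element, so the call depth while parsing is expected to grow with the length of the list.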

Scala parser combinators and newline-delimited text

I am writing a Scala parser combinator grammar that reads newline-delimited word lists, where lists are separated by one or more blank lines. Given a string with the words cat, mouse and horse on separate lines, then a blank line, then apple, orange and pear on separate lines, I would like to have it return List(List(cat, mouse, horse), List(apple, orange, pear)). I wrote this basic grammar which treats word lists as newline-delimited words. Note that I had to override the default definition of whitespace. import util.parsing.combinator.RegexParsers object WordList extends RegexParsers { private val eol = sys.props("line.separator") override val whiteSpace = """[
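A possible shape for such a grammar is sketched below. It is not the asker's code, which is cut off above: the object name WordListSketch, the word and end-of-line patterns, and the helper parseLists are assumptions made for illustration, and the scala-parser-combinators module is assumed to be on the classpath. The idea is to limit skipped whitespace to spaces and tabs so that line breaks remain visible to the grammar.

```scala
import scala.util.parsing.combinator.RegexParsers

object WordListSketch extends RegexParsers {
  // Skip only spaces and tabs, so that newlines stay significant.
  override protected val whiteSpace = """[ \t]+""".r

  // Assumed token shapes: a line break, and a word made of letters.
  private val eol: Parser[String] = """\r?\n""".r
  private val word: Parser[String] = """[a-zA-Z]+""".r

  // One list: words separated by a single line break.
  private val wordList: Parser[List[String]] = rep1sep(word, eol)

  // List separator: the line break ending a word plus at least one blank line.
  private val blankLines: Parser[Any] = eol ~ rep1(eol)

  // Whole input: lists separated by blank lines, with optional
  // leading and trailing line breaks.
  val lists: Parser[List[List[String]]] =
    rep(eol) ~> repsep(wordList, blankLines) <~ rep(eol)

  // Hypothetical helper: require the whole input to match.
  def parseLists(input: String): ParseResult[List[List[String]]] =
    parseAll(lists, input)
}
```

For example, WordListSketch.parseLists("cat\nmouse\nhorse\n\napple\norange\npear").get evaluates to List(List("cat", "mouse", "horse"), List("apple", "orange", "pear")).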

Arithmetic Expression Grammar and Parser

Question: Recently I was looking for a decent grammar for arithmetic expressions but found only trivial ones, ignoring pow(..., ...) for example. Then I tried writing one on my own, but sometimes it didn't work as expected. For example, I forgot to allow a unary - in front of expressions and fixed it. Perhaps someone can take a look at my current approach and improve it. Furthermore, I think others can benefit because it's a common task to parse arithmetic expressions. import scala.math._