Scala combinator parser not backtracking as I would have thought…

Posted by ◇◆丶佛笑我妖孽 on 2019-12-07 07:46:00

Question


I've been staring myself blind at this problem, and I suspect this is a really stupid question, but I have to swallow my pride.

I have this combinator parser that doesn't backtrack the way I thought it would. I've reduced it to a small example without stripping out the context entirely, because "foobar" examples feel harder to read. Here goes:

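// Imports assume ScalaTest 1.x, where Spec and ShouldMatchers live in these packages.
import org.junit.runner.RunWith
import org.scalatest.Spec
import org.scalatest.junit.JUnitRunner
import org.scalatest.matchers.ShouldMatchers
import scala.util.parsing.combinator.RegexParsers

@RunWith(classOf[JUnitRunner])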
class ParserBacktrackTest extends RegexParsers with Spec with ShouldMatchers {
  override def skipWhitespace = false

  lazy val optSpace = opt(whiteSpace)
  lazy val number = """\d+([\.]\d+)?""".r
  lazy val numWithOptSpace = number <~ optSpace

  private def litre = numWithOptSpace <~ ("litre" | "l")
  def volume = litre ^^ { case _ => "volume" }

  private def namedPieces = numWithOptSpace <~ ("pcs") ^^ { case _ => "explPcs" }
  private def implicitPieces = number ^^ { case _ => "implPcs" }
  protected def unitAmount = namedPieces | implicitPieces

  def nameOfIngredient = ".*".r

  def amount = volume | unitAmount
//  def amount = unitAmount
  protected def ingredient = (amount <~ whiteSpace) ~ nameOfIngredient

  describe("IngredientParser") {
    it("should parse volume") {
      shouldParse("1 litre lime")
    }
    it("should parse explicit pieces") {
      shouldParse("1 pcs lime")
    }
    it("should parse implicit pieces") {
      shouldParse("1 lime")
    }
  }

  def shouldParse(row: String) = {
    val result = parseAll(ingredient, row)
    result match {
      case Success(value, _) => println(value)
      case x => println(x)
    }
    result.successful should be(true)
  }
}

So what happens is that the third test fails:

(volume~lime)
(explPcs~lime)
[1.4] failure: string matching regex `\s+' expected but `i' found

1 lime
   ^

So it seems the litre parser consumed the l and then the parse failed because no whitespace followed. I would have thought it would backtrack at that point and try the next production rule. The implicitPieces parser can clearly parse this line, because if I take the volume parser out of amount (by switching to the commented-out definition), the third test succeeds:

(implPcs~litre lime)
(explPcs~lime)
(implPcs~lime)

Why isn't amount backtracking? What am I misunderstanding?


Answer 1:


It doesn't backtrack because, for "1 lime":

  • ingredient starts out with amount
  • amount starts with volume
  • volume starts with litre, and
  • litre successfully consumes "1 l" of "1 lime"

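So litre, volume and amount all succeeded! Since | only tries its right-hand alternative when the left-hand one fails, there is nothing left to backtrack into. This is why the whole thing then continues with the second part of ingredient, namely whiteSpace, which fails on the i.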
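
To see this in isolation, here is a minimal sketch (not from the original thread) that runs amount on its own with parse rather than parseAll, so the leftover input becomes visible. The object name AmountDemo and the flattened volume definition are illustrative, and the sketch assumes the scala-parser-combinators module is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Cut-down copy of the grammar from the question (names other than
// AmountDemo are taken from the question; AmountDemo itself is hypothetical).
object AmountDemo extends RegexParsers {
  override def skipWhitespace = false

  lazy val optSpace = opt(whiteSpace)
  lazy val number = """\d+([\.]\d+)?""".r
  lazy val numWithOptSpace = number <~ optSpace

  def volume = numWithOptSpace <~ ("litre" | "l") ^^ { case _ => "volume" }
  def namedPieces = numWithOptSpace <~ "pcs" ^^ { case _ => "explPcs" }
  def implicitPieces = number ^^ { case _ => "implPcs" }
  def amount = volume | namedPieces | implicitPieces

  def main(args: Array[String]): Unit = {
    val in = "1 lime"
    // parse (unlike parseAll) does not require the whole input to be consumed.
    parse(amount, in) match {
      case Success(value, rest) =>
        // prints: amount succeeded with 'volume', remaining input: 'ime'
        println(s"amount succeeded with '$value', remaining input: '${in.substring(rest.offset)}'")
      case other =>
        println(other)
    }
  }
}
```

The remaining input starts at the i, which matches the position [1.4] reported in the failure message from the question.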

HTH!




Answer 2:


I just want to post a minimal example illustrating my misunderstanding. I thought this would succeed:

  def foo = "foo" | "fo"
  def obar = "obar"

  def foobar = foo ~ obar

  describe("foobar-parser") {
    it("should parse it") {
      shouldParseWith(foobar, "foobar")
    }
  }

But backtracking over | doesn't work that way. Once the disjunctive parser has succeeded by consuming "foo", it never gives it back, even though the following "obar" then fails.

It has to be normalized so the disjunction is moved to the top level:

def workingFooBar = ("foo" ~ obar) | ("fo" ~ obar)
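
The same normalization can be applied to the grammar from the question. What follows is a hedged sketch, assuming scala-parser-combinators is available. The object name NormalizedIngredient is hypothetical, and repeating the whiteSpace and nameOfIngredient tail in both branches is just one way to lift the disjunction to the top level.

```scala
import scala.util.parsing.combinator.RegexParsers

// Grammar from the question, with the choice between volume and unitAmount
// moved from amount up into ingredient (NormalizedIngredient is a made-up name).
object NormalizedIngredient extends RegexParsers {
  override def skipWhitespace = false

  lazy val optSpace = opt(whiteSpace)
  lazy val number = """\d+([\.]\d+)?""".r
  lazy val numWithOptSpace = number <~ optSpace

  def volume = numWithOptSpace <~ ("litre" | "l") ^^ { case _ => "volume" }
  def namedPieces = numWithOptSpace <~ "pcs" ^^ { case _ => "explPcs" }
  def implicitPieces = number ^^ { case _ => "implPcs" }
  def unitAmount = namedPieces | implicitPieces

  def nameOfIngredient = ".*".r

  // If the volume branch fails anywhere, including at the whitespace after
  // the unit, the second branch re-reads the row from its first character.
  def ingredient =
    ((volume <~ whiteSpace) ~ nameOfIngredient) |
      ((unitAmount <~ whiteSpace) ~ nameOfIngredient)

  def main(args: Array[String]): Unit =
    Seq("1 litre lime", "1 pcs lime", "1 lime").foreach { row =>
      println(parseAll(ingredient, row))
    }
}
```

For "1 lime" the volume branch still consumes "1 l", but its failure at the missing whitespace now falls inside the left operand of |, so the unitAmount branch gets its turn.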


Source: https://stackoverflow.com/questions/9192060/scala-combinator-parser-not-backtracking-as-i-would-have-thought
