Scala parser combinators and newline-delimited text

寵の児 submitted on 2019-12-04 23:44:41

Try setting skipWhitespace to false instead of redefining what counts as whitespace. The empty list you're getting is caused by repsep not consuming the line break at the end of each list, so the next item never lines up with the input. Instead, parse the line break (or end of input) after each item:

// Since Scala 2.11 this class ships in the separate scala-parser-combinators module.
import scala.util.parsing.combinator.RegexParsers

object WordList extends RegexParsers {

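  // Typed as Parser[String] so the implicit Regex-to-Parser conversion happens once, here.
  private val eoi: Parser[String] = """\z""".r // end of input
  // Accept "\n" or "\r\n" so parsing does not depend on the platform's line.separator.
  private val eol: Parser[String] = """\r?\n""".r
  private val separator = eoi | eol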
  private val word = """\w+""".r

  override val skipWhitespace = false

  val list: Parser[List[String]] = rep(word <~ separator)

  val lists: Parser[List[List[String]]] = repsep(list, rep1(eol))

  def main(args: Array[String]): Unit = {
    val s =
      """cat
        |mouse
        |horse
        |
        |apple
        |orange
        |pear""".stripMargin

    println(parseAll(lists, s))
  }

}

Then again, parser combinators are a bit overkill here. You could get practically the same thing (but with Arrays instead of Lists) with something much simpler:

s.split("\n{2,}").map(_.split("\n"))
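
If you want Lists back and some tolerance for Windows line endings, the one-liner can be wrapped up roughly like this. This is only a sketch: the object name SplitWordLists and the helper parseLists are made up for illustration, and it assumes blank lines are truly empty (no stray spaces).

```scala
object SplitWordLists {

  // Hypothetical helper, not part of the original answer:
  // blocks separated by blank lines become inner lists, one item per line.
  def parseLists(s: String): List[List[String]] =
    s.trim
      .split("(\r?\n){2,}")          // two or more consecutive line breaks separate the lists
      .toList
      .filter(_.nonEmpty)            // an empty input yields no lists at all
      .map(_.split("\r?\n").toList)  // every remaining line is one item

  def main(args: Array[String]): Unit = {
    val s = "cat\nmouse\nhorse\n\napple\norange\npear"
    println(parseLists(s)) // List(List(cat, mouse, horse), List(apple, orange, pear))
  }

}
```

Unlike the parser version, this does no validation: any non-empty line is accepted as an item, whether or not it matches \w+.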