Question
I want to use Scala's collect method with a regular expression. Ideally I'd like to collect only those terms that match the regular expression. So far I've implemented the following, which works fine:
val regex = "^([^:]+):([^:]+):([^:]+):([+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?)$".r
<other_code>.collect { case x: String if regex.pattern.matcher(x).matches =>
  x match {
    case regex(feature, hash, value, weight) => (feature.split("\\^"), weight.toDouble)
  }
}
This seems to have an extra step though. I'm first checking if the regex matches in the case statement for collect and then I'm checking if it matches again to extract the groups of the match. Is there a way that I can do this with only checking the regex match once?
Answer 1:
You don't need the first match:
<other_code>.collect {
case regex(feature, hash, value, weight) => (feature.split("\\^"), weight.toDouble)
}
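As a self-contained sketch of the same idea: the list `lines` and its contents are invented here to stand in for `<other_code>`, and the exponent part of the pattern is written as a non-capturing group so that the regex has exactly four capture groups. That matters because the extractor only matches when the number of variables equals the number of groups.

```scala
// Four capturing groups: feature, hash, value, weight.
// The exponent uses a non-capturing group (?:...) so it does not add a fifth.
val regex = "^([^:]+):([^:]+):([^:]+):([+]?[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?)$".r

// Hypothetical input standing in for <other_code>.
val lines: List[String] = List("a^b:1f3c:7:0.25", "not a record", "c:9e:1:1e-3")

// The extractor pattern is only defined where the regex matches the whole
// string, so collect skips every other element without a separate check.
val parsed: List[(List[String], Double)] = lines.collect {
  case regex(feature, hash, value, weight) =>
    // toList is used only so the result compares by value; the question keeps the Array.
    (feature.split("\\^").toList, weight.toDouble)
}
```

Here the regex runs once per element: the same match that decides whether the element is kept also binds the groups.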
Answer 2:
Checking to see if the regex matches is unnecessary, because the pattern matching will do that for you. Let me illustrate with a slightly simpler example.
val regex = "(\\d+),([A-Za-z]+)".r
val input = List("1,a", "23,zZ", "1", "1ab", "")
scala> input collect { case regex(a, b) => (a, b) }
res2: List[(String, String)] = List((1,a), (23,zZ))
A bare x match { ... }, on the other hand, could throw a MatchError: unlike collect, it does not skip inputs that none of its cases cover.
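The difference can be sketched as follows. The names `parse`, `attempt`, and the other values are made up for illustration, and the letter class is written as `[A-Za-z]`.

```scala
import scala.util.Try

val regex = "(\\d+),([A-Za-z]+)".r

// A case block typed as a PartialFunction is only defined for inputs
// that one of its cases covers. collect relies on exactly this check.
val parse: PartialFunction[String, (String, String)] = {
  case regex(a, b) => (a, b)
}

val definedForGood = parse.isDefinedAt("1,a") // true
val definedForBad = parse.isDefinedAt("1ab")  // false

// A plain match expression has no such check: an uncovered input throws.
val attempt = Try("1ab" match { case regex(a, b) => (a, b) })
val threwMatchError = attempt.failed.toOption.exists(_.isInstanceOf[MatchError])
```

If a single value rather than a collection needs this treatment, `parse.lift` turns the partial function into one that returns an `Option` instead of throwing.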
Source: https://stackoverflow.com/questions/28051084/scala-regex-and-partial-functions