Most efficient way to create a Scala Map from a file of strings?

Now, I am trying to create a Map[String, String] from the CSV file, where the word is the key and the pronunciation is the value.

1 Answer

星月不相逢

2021-01-16 14:01

I believe your main problem is that you read the whole file into a String and then process that String again afterwards. That means you not only allocate about twice the memory you need, but also go over the contents of the file twice.

The first improvement you can make to your code is to do everything in a single pass.

    import scala.io.Source

    def mapFile(filename: String): Map[String, String] =
      (for {
        line <- Source.fromFile(filename).getLines()
        if (line.nonEmpty && !line.startsWith(";;;"))
        Array(word, pronunciation) = line.split(" ")
      } yield word -> pronunciation).toMap

The above code is equivalent (and will be desugared to something very similar) to this:

    import scala.io.Source

    def mapFile(filename: String): Map[String, String] =
      Source
        .fromFile(filename)
        .getLines()
        .filter(line => line.nonEmpty && !line.startsWith(";;;"))
        .map(line => line.split(" "))
        .map { case Array(word, pronunciation) => word -> pronunciation }
        .toMap
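Two things to keep in mind with the code above: the Source is never closed, and a line that does not split into exactly two parts will not match the Array(word, pronunciation) pattern. Here is a minimal sketch of a more defensive variant, assuming Scala 2.13+ (for scala.util.Using) and assuming the word is everything before the first space. The names parseLine and mapFileSafe are made up for this example, and the split(" ", 2) rule is my guess at the file format, not something stated in the question.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper: turns one line into a (word, pronunciation) pair.
// Returns None for empty lines, ";;;" comment lines, and lines without a space.
def parseLine(line: String): Option[(String, String)] =
  if (line.isEmpty || line.startsWith(";;;")) None
  else
    line.split(" ", 2) match {
      // Assumption: the word ends at the first space; the rest is the pronunciation.
      case Array(word, pronunciation) => Some(word -> pronunciation.trim)
      case _                          => None
    }

// Using.resource closes the Source after the Map has been built,
// even if an exception is thrown while reading.
def mapFileSafe(filename: String): Map[String, String] =
  Using.resource(Source.fromFile(filename)) { source =>
    source.getLines().flatMap(parseLine).toMap
  }
```

This is still a single pass over the file. Lines that cannot be parsed are silently dropped here; if you would rather fail fast, throw in the `case _` branch instead.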

Additionally, if the input file is too big, you may want to look at FS2, Akka Streams, or another streaming library to process the file in chunks.
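To give a rough idea of what "in chunks" means without pulling in a library, here is a sketch that uses only the standard library's lazy Iterator and grouped. It is a stand-in for illustration, not FS2 or Akka Streams code, and it assumes Scala 2.13+. The name foreachChunk, the chunkSize parameter, and the handle callback are hypothetical.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper: passes (word, pronunciation) pairs to `handle`
// in batches of at most `chunkSize`, so only one batch is built at a time.
def foreachChunk(filename: String, chunkSize: Int)(
    handle: Seq[(String, String)] => Unit
): Unit =
  Using.resource(Source.fromFile(filename)) { source =>
    source
      .getLines()
      .filter(line => line.nonEmpty && !line.startsWith(";;;"))
      .map(_.split(" ", 2)) // assumption: the word ends at the first space
      .collect { case Array(word, pronunciation) => word -> pronunciation.trim }
      .grouped(chunkSize)   // lazily batches the underlying iterator
      .foreach(handle)
  }
```

You would call it like `foreachChunk("dict.txt", 10000)(batch => store(batch))`, where `store` is whatever you do with each batch. A real streaming library adds backpressure, error handling, and concurrency on top of this basic idea.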
