Is there a good \"scala-esque\" (I guess I mean functional) way of recursively listing files in a directory? What about matching a particular pattern?
For example re
The simplest Scala-only solution (if you don't mind requiring the Scala compiler library):
val path = scala.reflect.io.Path(dir)
scala.tools.nsc.io.Path.onlyFiles(path.walk).foreach(println)
Otherwise, @Renaud's solution is short and sweet (if you don't mind pulling in Apache Commons FileUtils):
import scala.collection.JavaConversions._ // enables foreach
import org.apache.commons.io.FileUtils
FileUtils.listFiles(dir, null, true).foreach(println)
Where dir
is a java.io.File:
new File("path/to/dir")
Scala code typically uses Java classes for dealing with I/O, including reading directories. So you have to do something like:
import java.io.File
def recursiveListFiles(f: File): Array[File] = {
val these = f.listFiles
these ++ these.filter(_.isDirectory).flatMap(recursiveListFiles)
}
You could collect all the files and then filter using a regex:
myBigFileArray.filter(f => """.*\.html$""".r.findFirstIn(f.getName).isDefined)
Or you could incorporate the regex into the recursive search:
import scala.util.matching.Regex
def recursiveListFiles(f: File, r: Regex): Array[File] = {
val these = f.listFiles
val good = these.filter(f => r.findFirstIn(f.getName).isDefined)
good ++ these.filter(_.isDirectory).flatMap(recursiveListFiles(_,r))
}
I like yura's stream solution, but it (and the others) recurses into hidden directories. We can also simplify by making use of the fact that listFiles
returns null for a non-directory.
def tree(root: File, skipHidden: Boolean = false): Stream[File] =
if (!root.exists || (skipHidden && root.isHidden)) Stream.empty
else root #:: (
root.listFiles match {
case null => Stream.empty
case files => files.toStream.flatMap(tree(_, skipHidden))
})
Now we can list files
tree(new File(".")).filter(f => f.isFile && f.getName.endsWith(".html")).foreach(println)
or realise the whole stream for later processing
tree(new File("dir"), true).toArray
Take a look at scala.tools.nsc.io
There are some very useful utilities there including deep listing functionality on the Directory class.
If I remember correctly this was highlighted (possibly contributed) by retronym and were seen as a stopgap before io gets a fresh and more complete implementation in the standard library.
Scala has library 'scala.reflect.io' which considered experimental but does the work
import scala.reflect.io.Path
Path(path) walkFilter { p =>
p.isDirectory || """a*.foo""".r.findFirstIn(p.name).isDefined
}
Apache Commons Io's FileUtils fits on one line, and is quite readable:
import scala.collection.JavaConversions._ // important for 'foreach'
import org.apache.commons.io.FileUtils
FileUtils.listFiles(new File("c:\temp"), Array("foo"), true).foreach{ f =>
}