Question
I have a stream of directories from the `readdirp` module.
I want to:
- search each directory for a file matching a regex (e.g. `README.*`)
- read the first line of that file that does not start with a `#`
- print each directory together with this first non-heading line of its README.
I am trying to do this using streams and highland.js.
I am stuck trying to process a stream of all the files inside each directory.
    h = require 'highland'
    dirStream = readdirp root: root, depth: 0, entryType: 'directories'
    dirStream = h(dirStream)
      .filter (entry) -> entry.stat.isDirectory()
      .map (entry) ->
        # Search all files in the directory for README.
        fileStream = readdirp root: entry.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store'
        fileStream = h(fileStream).filter (entry) -> /README\..*/.test entry.name
        fileStream.each (file) ->
          readmeStream = fs.createReadStream file
          _(readmeStream)
            .split()
            .takeUntil (line) -> not line.startsWith '#' and line isnt ''
            .last(1)
            .toArray (comment) ->
              # TODO: How do I access `comment` asynchronously to include in the return value of the map?
        return {name: entry.name, comment: comment}
Answer 1:
It's best to think of Highland streams as immutable: operations like `filter` and `map` return new streams that depend on the old stream, rather than modifying the old stream.
Also, Highland methods are lazy: you should only call `each` or `toArray` when you absolutely need the data right now.
The standard way of asynchronously mapping a stream is `flatMap`. It's like `map`, but the function you give it should return a stream. The stream you get from `flatMap` is the concatenation of all the returned streams. Because the new stream depends on all the old streams in order, it can be used to sequence asynchronous processes.
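To make the "concatenation in order" point concrete, here is a minimal sketch in plain JavaScript that uses async generators as a stand-in for streams. It is not Highland's implementation: `flatMap`, `delayedPair`, `collect` and `demo` are hypothetical names made up for this illustration. The inner stream for `1` is deliberately slower than the one for `2`, yet its values still come out first.

```javascript
// Hypothetical helper, NOT Highland's API: flatMap over (async) iterables.
async function* flatMap(source, fn) {
  for await (const item of source) {
    // Drain each inner stream fully before pulling the next outer item,
    // so the output is the inner streams concatenated in source order.
    yield* fn(item);
  }
}

// Hypothetical inner "stream": waits, then yields two values for its input.
// Smaller n waits longer, to show that ordering does not depend on timing.
async function* delayedPair(n) {
  await new Promise((resolve) => setTimeout(resolve, 10 * (3 - n)));
  yield `${n}a`;
  yield `${n}b`;
}

// Gather every value of an async iterable into an array.
async function collect(stream) {
  const out = [];
  for await (const value of stream) out.push(value);
  return out;
}

// Resolves to the values of both inner streams, in outer-stream order.
async function demo() {
  return collect(flatMap([1, 2], delayedPair));
}
```

Calling `demo()` returns a promise for the collected values, so the asynchronous work for each outer item is sequenced rather than interleaved.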
I'd modify your example to the following (clarified some variable names):
    fs = require 'fs'
    readdirp = require 'readdirp'
    h = require 'highland'
    readmeStream = h(readdirp root: root, depth: 0, entryType: 'directories')
      .filter (dir) -> dir.stat.isDirectory()
      .flatMap (dir) ->
        # Search all files in the directory for README.
        h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')
          .filter (file) -> /README\..*/.test file.name
          .flatMap (file) ->
            # Open the file by its full path; `file.name` is only the base name.
            h(fs.createReadStream file.fullPath)
              .split()
              .takeUntil (line) -> not line.startsWith '#' and line isnt ''
              .last(1)
              .map (comment) -> {name: file.name, comment}
Let's walk through the types in this code. First, note that `flatMap` has type (in Haskellish notation) `Stream a → (a → Stream b) → Stream b`, i.e. it takes a stream containing some things of type `a`, and a function expecting things of type `a` and returning streams containing `b`s, and returns a stream containing `b`s. It's standard for collection types (such as stream and array) to implement `flatMap` as concatenating the returned collections.
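JavaScript arrays show the same shape, which may help before reading the stream types. This is only an analogy with made-up sample data (`dirs` and its fields are hypothetical); it uses the built-in `Array.prototype.map` and `Array.prototype.flatMap`, not Highland.

```javascript
// Hypothetical sample data: each directory lists its file names.
const dirs = [
  { name: 'alpha', files: ['README.md', 'index.js'] },
  { name: 'beta', files: [] },
  { name: 'gamma', files: ['README.txt'] },
];

// map takes (a -> b): one result per directory, so the arrays stay nested.
const nested = dirs.map((dir) => dir.files);

// flatMap takes (a -> Array b): the returned arrays are concatenated in order.
const flat = dirs.flatMap((dir) => dir.files);

// Nesting mirrors the answer's pipeline: filter inside, then map to an object.
const readmes = dirs.flatMap((dir) =>
  dir.files
    .filter((file) => /README\..*/.test(file))
    .map((file) => ({ dir: dir.name, file }))
);
```

Note that a directory with no matching file (`beta` here) simply contributes nothing to the flattened result.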
    h(readdirp root: root, depth: 0, entryType: 'directories')
Let's say this has type `Stream Directory`. The `filter` doesn't change the type, so the `flatMap` will be `Stream Directory → (Directory → Stream b) → Stream b`. We'll see what the function returns:
    h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')
Call this a `Stream File`, so the second `flatMap` is `Stream File → (File → Stream b) → Stream b`.
    h(fs.createReadStream file.fullPath)
This is a `Stream String`. `split`, `takeUntil` and `last` don't change that, so what does the `map` do? `map` is very similar to `flatMap`: its type is `Stream a → (a → b) → Stream b`. In this case `a` is `String` and `b` is an object type `{name : String, comment : String}`. Then `map` returns a stream of that object, which is what the overall `flatMap` function returns. Step up, and `b` in the second `flatMap` is the object, so the first `flatMap`'s function also returns a stream of the object, so the entire stream is a `Stream {name : String, comment : String}`.
Note that because of Highland's laziness, this doesn't actually start any streaming or processing. You need to use `each` or `toArray` to cause a `thunk` and start the pipeline. In `each`, the callback will be called with your object. Depending on what you want to do with the comments, it might be best to `flatMap` some more (if you're writing them to a file for example).
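If the laziness is hard to picture, plain JavaScript generators behave in a loosely similar way. The sketch below is only an analogy, not Highland code: `numbers`, `mapLazy`, `log` and the other names are hypothetical. Building the pipeline does no work; the source only starts producing once something consumes it.

```javascript
// Records when the source actually produces a value.
const log = [];

// A lazy source: its body does not run until a consumer pulls values.
function* numbers() {
  for (const n of [1, 2, 3]) {
    log.push(`produce ${n}`);
    yield n;
  }
}

// A lazy transform, similar in spirit to a stream `map`.
function* mapLazy(source, fn) {
  for (const value of source) yield fn(value);
}

// Building the pipeline creates generator objects but runs none of their code.
const pipeline = mapLazy(numbers(), (n) => n * 10);
const producedBeforeConsume = log.length;

// Consuming the pipeline (the role `each` or `toArray` play in the answer)
// is what finally pulls values through every stage.
const results = Array.from(pipeline);
const producedAfterConsume = log.length;
```

Comparing `producedBeforeConsume` with `producedAfterConsume` shows that the source only ran once the pipeline was consumed.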
Well, I didn't mean to write an essay. Hope this helps.
Source: https://stackoverflow.com/questions/27721268/nested-stream-operations-in-highland-js