Read a file one line at a time in node.js?

深忆病人 2020-11-22 04:33

I am trying to read a large file one line at a time. I found a question on Quora that dealt with the subject, but I'm missing some connections to make the whole thing fit together.

29 Answers
  •  长发绾君心
    2020-11-22 04:58

    This is my favorite way of going through a file: a simple, native solution for a progressive file read (as in not a "slurp" or all-in-memory read) with modern async/await. It's a solution I find "natural" when processing large text files, without having to resort to the readline module or any non-core dependency.

    // needs an async context: wrap in an async function, or use an ES module with top-level await
    const fs = require('fs');

    let buf = '';
    for await (const chunk of fs.createReadStream('myfile')) {
        const lines = buf.concat(chunk).split(/\r?\n/);  // chunk is coerced to a utf8 string by concat
        buf = lines.pop();                               // keep the trailing partial line for the next chunk
        for (const line of lines) {
            console.log(line);
        }
    }
    if (buf.length) console.log(buf);  // last line, if the file does not end with a newline
    

    You can adjust the encoding in fs.createReadStream, or call chunk.toString() yourself. This approach also lets you fine-tune the line splitting to your taste, e.g. use .split(/\n+/) to skip empty lines, and you can control the chunk size with the { highWaterMark } option.
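
    For instance, a sketch combining those tweaks (the 64 KiB highWaterMark and the file name are just example values):

    const fs = require('fs');

    (async () => {
        let buf = '';
        // decode chunks as utf8 strings and read the file in 64 KiB chunks
        const stream = fs.createReadStream('myfile', { encoding: 'utf8', highWaterMark: 64 * 1024 });
        for await (const chunk of stream) {
            const lines = buf.concat(chunk).split(/\n+/);  // runs of newlines count as one separator, so empty lines are skipped
            buf = lines.pop();
            for (const line of lines) {
                console.log(line);
            }
        }
        if (buf.length) console.log(buf);
    })();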

    Don't forget to factor the per-line logic into a function like processLine(line), so you don't have to repeat the processing code for the leftover buf at the end. Unfortunately, the ReadStream instance does not expose an end-of-file flag in this setup, so there is no way, as far as I know, to detect from within the loop that we're on the last iteration without more verbose tricks such as comparing the file size (from fs.stat()) with the stream's .bytesRead. Hence the final buf handling after the loop, unless you're absolutely sure your file ends with a newline \n, in which case the for await loop alone should suffice.
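
    A minimal sketch of that refactor (processLine is just the hypothetical helper name suggested above; the file name and utf8 encoding are example choices):

    const fs = require('fs');

    // hypothetical helper: all per-line logic lives in one place
    function processLine(line) {
        console.log(line);
    }

    (async () => {
        let buf = '';
        for await (const chunk of fs.createReadStream('myfile', { encoding: 'utf8' })) {
            const lines = buf.concat(chunk).split(/\r?\n/);
            buf = lines.pop();
            for (const line of lines) processLine(line);
        }
        if (buf.length) processLine(buf);  // the leftover last line goes through the same code path
    })();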

    ★ If you prefer the evented asynchronous version, this would be it:

    const fs = require('fs');

    let buf = '';
    fs.createReadStream('myfile')
        .on('data', chunk => {
            const lines = buf.concat(chunk).split(/\r?\n/);
            buf = lines.pop();                // keep the trailing partial line for the next chunk
            for (const line of lines) {
                console.log(line);
            }
        })
        .on('end', () => buf.length && console.log(buf));  // flush the last line
    

    ★ Now if you don't mind importing the stream core module, this is the equivalent piped-stream version, which allows for chaining transforms such as gzip decompression:

    const fs = require('fs');
    const { Writable } = require('stream');

    let buf = '';
    fs.createReadStream('myfile').pipe(
        new Writable({
            write: (chunk, enc, next) => {
                const lines = buf.concat(chunk).split(/\r?\n/);
                buf = lines.pop();            // keep the trailing partial line for the next chunk
                for (const line of lines) {
                    console.log(line);
                }
                next();                       // signal we're ready for the next chunk
            }
        })
    ).on('finish', () => buf.length && console.log(buf));  // flush the last line
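
    For example, a sketch of the gzip case (assuming a gzip-compressed file named myfile.gz; zlib is another core module):

    const fs = require('fs');
    const zlib = require('zlib');
    const { Writable } = require('stream');

    let buf = '';
    fs.createReadStream('myfile.gz')
        .pipe(zlib.createGunzip())            // decompress on the fly
        .pipe(new Writable({
            write: (chunk, enc, next) => {
                const lines = buf.concat(chunk).split(/\r?\n/);
                buf = lines.pop();
                for (const line of lines) {
                    console.log(line);
                }
                next();
            }
        }))
        .on('finish', () => buf.length && console.log(buf));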
    
