Question
I am copying a file with Node on an SSD under VMware, but the performance is very low. The benchmark I ran to measure the actual speed is as follows:
$ hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 12004 MB in 1.99 seconds = 6025.64 MB/sec
Timing buffered disk reads: 1370 MB in 3.00 seconds = 456.29 MB/sec
However, the following Node code that copies the file is very slow; even subsequent runs do not make it faster:
var fs = require("fs");
fs.createReadStream("bigfile").pipe(fs.createWriteStream("tempbigfile"));
And the runs are as follows:
$ seq 1 10000000 > bigfile
$ ll bigfile -h
-rw-rw-r-- 1 mustafa mustafa 848M Jun 3 03:30 bigfile
$ time node test.js
real 0m4.973s
user 0m2.621s
sys 0m7.236s
$ time node test.js
real 0m5.370s
user 0m2.496s
sys 0m7.190s
What is the issue here and how can I speed it up? I believe I could write it faster in C just by adjusting the buffer size. What confuses me is that when I wrote a simple, almost pv-equivalent program that pipes stdin to stdout, as below, it is very fast.
process.stdin.pipe(process.stdout);
And the runs are as follows:
$ dd if=/dev/zero bs=8M count=128 | pv | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.78077 s, 186 MB/s
1GB 0:00:05 [ 177MB/s] [ <=> ]
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.78131 s, 186 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.57005 s, 193 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.5704 s, 193 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.61734 s, 233 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.62766 s, 232 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.22107 s, 254 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.23231 s, 254 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.70124 s, 188 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.70144 s, 188 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.51055 s, 238 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.52087 s, 238 MB/s
Answer 1:
I don't know the answer to your question, but perhaps this helps in your investigation of the problem.
In the Node.js documentation about stream buffering, it says:

Both Writable and Readable streams will store data in an internal buffer that can be retrieved using writable.writableBuffer or readable.readableBuffer, respectively. The amount of data potentially buffered depends on the highWaterMark option passed into the stream's constructor. For normal streams, the highWaterMark option specifies a total number of bytes. For streams operating in object mode, the highWaterMark specifies a total number of objects. ... A key goal of the stream API, particularly the stream.pipe() method, is to limit the buffering of data to acceptable levels such that sources and destinations of differing speeds will not overwhelm the available memory.

Source: http://www.nodejs.org/api/stream.html#stream_buffering
So, you can play with the buffer sizes to improve speed:
var fs = require('fs');
var path = require('path');
var from = path.normalize(process.argv[2]);
var to = path.normalize(process.argv[3]);
var readOpts = {highWaterMark: Math.pow(2,16)}; // 65536
var writeOpts = {highWaterMark: Math.pow(2,16)}; // 65536
var source = fs.createReadStream(from, readOpts);
var destiny = fs.createWriteStream(to, writeOpts);
source.pipe(destiny);
https://nodejs.org/api/stream.html#stream_writable_writablehighwatermark
https://nodejs.org/api/stream.html#stream_readable_readablehighwatermark
https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options
Source: https://stackoverflow.com/questions/24005496/nodejs-copying-file-over-a-stream-is-very-slow