Hi.
I'm using Node.js with child_process to spawn bash processes. I'm trying to understand whether my workload is I/O bound, CPU bound, or both.
If your jobs are CPU hungry, then the optimal number of jobs to run is typically the number of cores (or double that if the CPUs have hyperthreading). So if you have a 4 core machine you will typically see the optimal speed by running 4 jobs in parallel.
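Since you are already in Node.js, you can read the core count there to get a starting point. This is only a minimal sketch: `defaultJobCount` is a hypothetical helper name, and note that `os.cpus()` reports logical CPUs, so hyperthreads are already included and you should not double the number again.

```javascript
const os = require("os");

// Hypothetical helper: pick a starting job count from the logical CPU count.
function defaultJobCount() {
  // os.cpus() lists logical CPUs, so hyperthreads are already counted.
  // Guard against environments where the list comes back empty.
  return Math.max(1, os.cpus().length);
}

console.log(`Starting point: ${defaultJobCount()} parallel jobs`);
```

Treat the result as a first guess to measure against, not as the answer.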
However, modern CPUs are heavily dependent on caches. This makes it hard to predict the optimal number of jobs to run in parallel. Throw in the latency from disks and it will make it even harder.
I have even seen jobs on systems in which the cores shared the CPU cache, and where it was faster to run a single job at a time - simply because it could then use the full CPU cache.
Due to that experience my advice has always been: Measure.
So if you have 10k jobs to run, try running 100 random jobs at several different levels of parallelism to see which level is fastest for you. It is important to choose the jobs at random, so you also get to measure the disk I/O. If the files differ greatly in size, run the test a few times.
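# List every file under pdfdir once, so each test samples from the same pool.
find pdfdir -type f > files
# Convert 100 randomly chosen files using $1 parallel jobs.
mytest() {
  shuf files | head -n 100 |
    parallel -j "$1" pdftotext -layout -enc UTF-8 {} - > out
}
export -f mytest
# Test with 1..10 parallel jobs. Sort by JobRuntime.
seq 10 | parallel -j1 --joblog - mytest | sort -nk 4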
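If you would rather run the same kind of measurement from your Node.js code, the idea can be sketched roughly as below. This is not your code and not how GNU Parallel works internally; `runWithLimit` and `timeRun` are hypothetical helpers that start at most `limit` children at a time and report the wall-clock time for the whole batch.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: run each [cmd, args] job with at most `limit` at once.
// Resolves with the exit codes in job order (-1 if a job failed to start).
function runWithLimit(jobs, limit) {
  return new Promise((resolve) => {
    const exitCodes = new Array(jobs.length).fill(null);
    if (jobs.length === 0) {
      resolve(exitCodes);
      return;
    }
    const slots = Math.max(1, limit); // never allow zero slots
    let next = 0;
    let running = 0;
    let finished = 0;

    function startMore() {
      while (running < slots && next < jobs.length) {
        const index = next++;
        const [cmd, args] = jobs[index];
        running++;
        let settled = false;
        const done = (code) => {
          if (settled) return; // "error" and "close" may both fire
          settled = true;
          exitCodes[index] = code;
          running--;
          finished++;
          if (finished === jobs.length) resolve(exitCodes);
          else startMore();
        };
        const child = spawn(cmd, args, { stdio: "ignore" });
        child.on("error", () => done(-1));
        child.on("close", (code) => done(code));
      }
    }
    startMore();
  });
}

// Hypothetical helper: wall-clock seconds for one concurrency level.
async function timeRun(jobs, limit) {
  const start = process.hrtime.bigint();
  await runWithLimit(jobs, limit);
  return Number(process.hrtime.bigint() - start) / 1e9;
}

// Example use (assumes pdftotext is installed and `sample` holds file paths):
//   const jobs = sample.map((f) => ["pdftotext", ["-layout", "-enc", "UTF-8", f, "-"]]);
//   for (const n of [1, 2, 4, 8]) console.log(n, await timeRun(jobs, n));
```

As with the shell version, pick the sample at random and repeat the run a few times before trusting the numbers.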
Do not worry about your CPUs running at 100%. That just means you are getting a return on all the money you spent at the computer store.
Your RAM is only a problem if the disk cache gets low (in your screenshot, 754M is not low; below 100M is low), because that may cause your computer to start swapping, which can slow it to a crawl.
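If you want to watch the cache from Node.js rather than from a screenshot, one possible approach on Linux is to read `/proc/meminfo`. This is only a sketch under that assumption: `parseMeminfo` and `cacheIsLow` are hypothetical helpers, the file does not exist on other platforms, and the 100M threshold is simply the rule of thumb above.

```javascript
const fs = require("fs");

// Parse /proc/meminfo-style text ("Key:   12345 kB") into { Key: kB }.
function parseMeminfo(text) {
  const result = {};
  for (const line of text.split("\n")) {
    const match = /^(\S+):\s+(\d+)/.exec(line);
    if (match) result[match[1]] = Number(match[2]);
  }
  return result;
}

// Hypothetical check using the 100M rule of thumb; Cached is in kB.
function cacheIsLow(meminfo, thresholdMiB = 100) {
  return meminfo.Cached / 1024 < thresholdMiB;
}

// Linux only: skip quietly where /proc/meminfo is missing.
if (fs.existsSync("/proc/meminfo")) {
  const info = parseMeminfo(fs.readFileSync("/proc/meminfo", "utf8"));
  console.log(`Cached: ${Math.round(info.Cached / 1024)} MiB, low: ${cacheIsLow(info)}`);
}
```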
Your Node.js code is I/O bound. It is doing almost none of the CPU work. You can see in your code that you are only creating external tasks and moving around the output from those tasks. You are not using long-running loops or heavy math calculations. You are seeing high CPU numbers for the Node.js process because the pdftotext processes are its child processes, and therefore you are seeing their CPU values aggregated with its own.
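You can check this yourself with `process.cpuUsage()`, which reports the CPU time of the current Node.js process only, not of its children. The sketch below is illustrative rather than taken from your code: `measureParentCpu` is a hypothetical helper, and the child is just another `node` process spinning in a loop as a stand-in for pdftotext.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: run one child and report how much CPU time the
// parent Node.js process itself used while waiting for it.
function measureParentCpu(cmd, args) {
  return new Promise((resolve, reject) => {
    const startCpu = process.cpuUsage();
    const startWall = process.hrtime.bigint();
    const child = spawn(cmd, args, { stdio: "ignore" });
    child.on("error", reject);
    child.on("close", () => {
      const cpu = process.cpuUsage(startCpu); // delta, in microseconds
      const wallMs = Number(process.hrtime.bigint() - startWall) / 1e6;
      resolve({ parentCpuMs: (cpu.user + cpu.system) / 1000, wallMs });
    });
  });
}

// Stand-in for a CPU-hungry child: spin for about 300 ms.
const burn = "const end = Date.now() + 300; while (Date.now() < end) {}";
measureParentCpu(process.execPath, ["-e", burn]).then(({ parentCpuMs, wallMs }) => {
  console.log(`parent CPU: ${parentCpuMs.toFixed(1)} ms during ${wallMs.toFixed(1)} ms wall-clock`);
});
```

If the parent's CPU time is a small fraction of the wall-clock time while the child is busy, the heavy lifting is happening in the child, not in your Node.js code.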