R system() process always uses same CPU, not multi-threaded/multi-core

清酒与你 2021-01-12 02:14

In R 3.0.2 on Linux 3.12.0, I am using the system() function to execute a number of tasks. The desired effect is for each of these tasks to run as they would i

2 Answers
  • 2021-01-12 02:35

    I tested running:

    system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)
    system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)
    system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)
    system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)
    

    on Linux 2.6.32 with R 3.0.2 and on Linux 3.8.0 with R 2.15.2. In both cases it takes up 4 CPU cores (as you would expect).

    -- Edit --

    I installed Linux 3.12 on a VirtualBox machine, and here R 3.0.2 also does what I expect: it takes up 4 CPUs. It even slowly wanders between the CPUs, so each process does not stick to the same CPU but changes every second or so.
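
    To watch this yourself, something like the following sketch lists the core each bzip2 process last ran on. It assumes a Linux `ps` from procps (the `psr` column and the `-C` selector); `cpu_of_processes` is just a name I made up for illustration.

    ```r
    # Hypothetical helper: report which CPU core (PSR column) each process
    # with the given command name last ran on. Assumes Linux procps `ps`.
    cpu_of_processes <- function(command = "bzip2") {
      # Without `ps` on the PATH there is nothing to report.
      if (!nzchar(Sys.which("ps"))) return(character(0))
      out <- suppressWarnings(
        system2("ps", c("-o", "pid,psr,pcpu,comm", "-C", command),
                stdout = TRUE, stderr = FALSE)
      )
      # Drop the "status" attribute that system2() attaches when ps exits non-zero.
      as.character(out)
    }
    ```

    Calling `cpu_of_processes("bzip2")` once a second while the four jobs run should show the PSR values changing if the scheduler is free to move the processes around.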

    This leads me to believe your system has some local modifications that force R to use only one CPU.

    From your description I would guess the local modifications are in R and not system wide (since your Python has no problems spawning more processes).

    The modifications could be specific to your user account, so create a new user and try with that. If it works for the new user, we need to figure out what your user ID has installed.

    If it does not work for the new user, it could be globally installed R libraries that cause the problem. Install an older R version and try that out. If the older version works, your R 3.0.2 installation is probably broken. Remove it and re-install it.
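
    Before creating a new user, it may be quicker to look at what R loads at startup. This is only a sketch: it checks the usual startup file locations described in `?Startup` (per-user and site-wide) and prints the library paths. Starting R with `R --vanilla` skips these files, which is another way to test the same idea.

    ```r
    # Candidate startup files; environment variables override the defaults.
    startup_files <- c(
      user_profile = Sys.getenv("R_PROFILE_USER", unset = "~/.Rprofile"),
      user_environ = Sys.getenv("R_ENVIRON_USER", unset = "~/.Renviron"),
      site_profile = Sys.getenv("R_PROFILE",
                                unset = file.path(R.home("etc"), "Rprofile.site")),
      site_environ = Sys.getenv("R_ENVIRON",
                                unset = file.path(R.home("etc"), "Renviron.site"))
    )

    # Which of them actually exist on this machine?
    found <- file.exists(path.expand(startup_files))
    print(data.frame(file = startup_files, exists = found))

    # Per-user libraries come first here, site-wide ones after.
    print(.libPaths())

    # Namespaces already loaded in this session (anything unexpected is a lead).
    print(loadedNamespaces())
    ```

    If a file shows up under the user entries, that points to a per-user modification; if only the site entries exist, the global installation is the more likely culprit.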

  • 2021-01-12 02:38

    Following on @agstudy's comment, you should get parallel to work first. On my system, this uses multiple cores:

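    library(parallel)
    # Each call starts the shell pipeline in the background (wait=FALSE) and returns immediately.
    f <- function(x) system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE, ignore.stderr=TRUE, wait=FALSE)
    # Fork 4 workers, one per task.
    mclapply(1:4, f, mc.cores=4)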
    

    I would have written this in a comment myself, but it is too long. I know you have said that you have tried the parallel package, but I wanted to confirm that you are using it correctly. If it doesn't work, can you confirm that a non-system call uses mclapply correctly, like this one?

    a<-mclapply(rep(1e8,4),rnorm,mc.cores=4)
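
    If that call is too heavy on memory, a cheaper sanity check (just a sketch, assuming a Unix-like system where mclapply can fork) is to ask each worker for its process ID. Four different PIDs, none equal to the parent's, means forking itself works, although it does not prove the workers landed on different cores.

    ```r
    library(parallel)

    # Each of the 4 tasks runs in its own forked child and reports its PID.
    worker_pids <- unlist(mclapply(1:4, function(i) Sys.getpid(), mc.cores = 4))

    print(Sys.getpid())   # PID of the parent R session
    print(worker_pids)    # PIDs of the forked workers
    ```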
    

    Reading your comments, I suspect that your pthreads Linux package is out of date and broken. On my system, I am using libpthread-2.15.so (not 2.13). If you're on Ubuntu, you can grab the latest with apt-get install libpthread-stubs0.

    Also, note that you should be using parallel, not multicore. If you look at the docs for parallel, you'll note that they have incorporated the work on multicore.


    Reading your next set of comments, I must insist that it is parallel and not multicore that has been included in R since 2.14. You can read about this on the CRAN Task View.

    Getting parallel to work is crucial. I previously told you that you could compile it directly from source, but this is not correct. I guess the only way to recompile it would be to compile R from source.

    Can you also verify that your CPU affinity is set correctly? Also can you check if R can detect the number of cores? Just run:

    library(parallel)
    mcaffinity()
    # Should be c(1,2,3,4) for you.
    detectCores()
    # Should be 4 for you.
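
    If mcaffinity() reports fewer cores than detectCores(), you could try widening the mask from inside R. The following is only a sketch and assumes Linux, where mcaffinity() is supported; processes started with system() inherit the affinity mask of the R session, so a restricted mask would explain everything landing on one CPU.

    ```r
    library(parallel)

    n_cores <- detectCores()  # can be NA if the count cannot be determined
    current <- mcaffinity()   # NULL on platforms without affinity support

    if (!is.null(current) && !is.na(n_cores) && length(current) < n_cores) {
      message("R is restricted to core(s) ", paste(current, collapse = ","),
              " out of ", n_cores, "; trying to widen the mask.")
      # Request every core; the OS may still refuse or narrow this (e.g. cgroups).
      tryCatch(
        mcaffinity(seq_len(n_cores)),
        error = function(e) message("Could not change affinity: ", conditionMessage(e))
      )
    }

    # Show the mask that is in effect now.
    print(mcaffinity())
    ```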
    