I ran into a problem when trying to use the parallel package in R on my Mac.
Here is how the parallel package works normally.
After weeks of trying I finally solved this problem. I am putting my answer here.
The problem appears to be caused by some unknown firewall issue within macOS. My solution was to reinstall the entire operating system ... I know this sounds stupid and troublesome, but the problem was solved after that.
The motivation for doing so is that I happened to notice I did not have access to some of the folders under my home directory (I tried to use sudo to modify some files but was not granted access). This is my personal laptop, so there should not be any such issue. I then realized that this Mac had been synced from my old Mac. The syncing process might have caused some firewall issues.
Several potential reasons include not enough memory, installation errors, etc. They do not seem to be the problem here, as I restarted sessions and reinstalled R, but the problem remained.
Correct, those types of problems should not be involved here. The calls you've shown use basic built-in functionality of R (mostly from the 'parallel' package) and there's very little memory usage involved.
I guess the problem is about the permission when R tried to connect to cores. [...]
Both parallel::makeCluster(2) and future::makeClusterPSOCK(2) launch workers (using parallel:::.slaveRSOCK()) that are independent R sessions running in the background. The master session and these workers communicate via sockets. So, yes, it could be that you have firewall issues preventing R from opening those ports. (I don't know enough about macOS to troubleshoot that.)
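If you want to check the socket part in isolation, here is a minimal sketch that listens on a local port and connects to it from the same R session. The helper name can_use_local_port() is my own (hypothetical), the port number is just an arbitrary example, and serverSocket()/socketAccept() require R 4.0.0 or later. A TRUE result only shows that this one session can open and reach the port; it does not prove that a separate worker process can connect back.

```r
# Hypothetical helper (not part of any package): returns TRUE if this R
# session can listen on `port` and connect to it via localhost.
# Assumes R >= 4.0.0, where serverSocket() and socketAccept() exist.
can_use_local_port <- function(port, timeout = 5) {
  srv <- NULL
  client <- NULL
  accepted <- NULL
  tryCatch({
    # Listen on the port, the way a master session would
    srv <- serverSocket(port)
    # Connect to it, the way a worker would
    client <- socketConnection("localhost", port = port, open = "r+b",
                               blocking = TRUE, timeout = timeout)
    # Accept the pending connection on the listening side
    accepted <- socketAccept(srv, open = "r+b", blocking = TRUE,
                             timeout = timeout)
    TRUE
  }, error = function(e) {
    message("Port ", port, " failed: ", conditionMessage(e))
    FALSE
  }, finally = {
    # Close whatever was opened, in reverse order
    for (con in list(accepted, client, srv)) {
      if (!is.null(con)) close(con)
    }
  })
}

ok <- can_use_local_port(11306)
```

If this returns FALSE, the error message printed by the handler may hint at whether the port is blocked or simply already in use.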
By setting outfile = NULL, you will also get information on what happens on the workers' end. Here is what it should look like when it works:
> cl <- future::makeClusterPSOCK(1, outfile = NULL, verbose = TRUE)
Workers: [n = 1] ‘localhost’
Base port: 11306
Creating node 1 of 1 ...
- setting up node
Starting worker #1 on ‘localhost’: '/usr/lib/R/bin/Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11306 OUT= TIMEOUT=2592000 XDR=TRUE
Waiting for worker #1 on ‘localhost’ to connect back
starting worker pid=7608 on localhost:11306 at 14:46:57.827
Connection with worker #1 on ‘localhost’ established
- assigning connection UUID
- collecting session information
Creating node 1 of 1 ... done
PS. You only need one worker to troubleshoot this.
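A one-worker check could be wrapped up roughly like this. It is only a sketch: try_one_worker() is a made-up name, and I'm using parallel::makeCluster() with outfile = "" (no redirection of worker output) so it runs without the future package installed; that is an assumption on my part, not the exact call shown above.

```r
# Hypothetical helper: try to start a single PSOCK worker, ask it for its
# process ID, and report whether that succeeded.
try_one_worker <- function() {
  cl <- NULL
  tryCatch({
    # outfile = "" leaves the worker's output unredirected so it is visible
    cl <- parallel::makeCluster(1, outfile = "")
    # Round trip: evaluate a trivial expression on the worker
    res <- parallel::clusterEvalQ(cl, Sys.getpid())
    message("Worker responded with pid ", res[[1]])
    TRUE
  }, error = function(e) {
    message("Could not set up worker: ", conditionMessage(e))
    FALSE
  }, finally = {
    # Always shut the worker down if it was started
    if (!is.null(cl)) parallel::stopCluster(cl)
  })
}

worked <- try_one_worker()
```

If the worker never connects back, makeCluster() should eventually give up with an error rather than hang forever, and the helper then returns FALSE.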