I have installed the SparkR package from the Spark distribution into the R library. I can call the following command and it seems to work properly: library(SparkR)
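For completeness, a minimal session looks something like this (sparkR.init() assumes Spark 1.x; on Spark 2.x+ it is replaced by sparkR.session()):

library(SparkR)                      # load the package installed from the Spark distribution
sc <- sparkR.init(master = "local")  # start a local Spark context; on Spark 2.x+ use sparkR.session() instead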
After installing Hadoop followed by Spark:
spark_path <- strsplit(system("brew info apache-spark", intern = TRUE)[4], ' ')[[1]][1] # Get your Spark path from the brew info output
.libPaths(c(file.path(spark_path, "libexec", "R", "lib"), .libPaths())) # Add the bundled SparkR library to R's search path
library(SparkR)
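Parsing the brew info output this way is a bit fragile, so it can be worth a sanity check that the computed path actually contains the SparkR package before loading it (the libexec/R/lib layout below is the assumed Homebrew layout):

file.exists(file.path(spark_path, "libexec", "R", "lib", "SparkR"))  # should print TRUE if the path was parsed correctly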
I had the same issue, and my spark-submit.cmd file was also not executing from the command line. The following steps worked for me:
Go to your environment variables and, in the system variables, select the variable named PATH. Along with the other values, add c:/Windows/System32/, separated by a semicolon. This made my spark-submit.cmd run from the command line and eventually from RStudio.
I have realized that we get the above issue only if some of the required path values are missing. Ensure all your path values (R, Rtools) are specified in the environment variables. For instance, my Rtools path was c:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin
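To confirm the change is visible inside RStudio, restart RStudio first (it reads the environment at startup) and run something like the following; the Spark location is a placeholder, so substitute your own install path:

Sys.getenv("PATH")                             # should now include c:/Windows/System32/ and your R/Rtools entries
file.exists("C:/spark/bin/spark-submit.cmd")   # "C:/spark" is a placeholder; use your actual Spark location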
I hope this helps.
I got the exact same error message. My case is a little different, in that I had run SparkR in RStudio successfully before, but after a few days it stopped working.
By looking at the conversation between Shivaram Venkataraman and Prakash Ponshankaarchinnusamy, I realized this may have something to do with run permissions.
https://issues.apache.org/jira/browse/SPARK-8603
So what I did, which eventually worked, was to unzip the Spark tar.gz to my C:/ drive again (previously it was kept on the D:/ drive), and SparkR WORKS!
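For reference, a minimal sketch of pointing an R session at the relocated install; the folder name C:/spark-1.4.0-bin-hadoop2.6 is an assumption, so substitute the directory you actually extracted:

Sys.setenv(SPARK_HOME = "C:/spark-1.4.0-bin-hadoop2.6")                      # assumed extract location on C:/
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))   # add the bundled SparkR package to the library path
library(SparkR)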
[Screenshot of working RStudio]
Try giving execute permissions to C:/sparkpath/bin/spark-submit.cmd. That worked for me.
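A sketch of checking and granting that permission from within R; the path mirrors the placeholder above, and granting to Everyone is only an illustration (scope it to your own user as needed):

file.access("C:/sparkpath/bin/spark-submit.cmd", mode = 1)                     # returns 0 if execute permission is present, -1 otherwise
system('icacls "C:\\sparkpath\\bin\\spark-submit.cmd" /grant Everyone:RX')     # grant read & execute via the standard Windows icacls tool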