Question
I am having trouble installing the GPU build of mxnet for R on the Amazon Deep Learning Linux AMI. The environment variables are such a mess that it’s a nightmare for anyone who isn’t an expert sysadmin to figure out.
Step 1: install the ridiculous amount of missing/broken programs and R packages
sudo yum install R
sudo yum install libxml2-devel
sudo yum install cairo-devel
sudo yum install giflib-devel
sudo yum install libXt-devel
sudo R
install.packages("devtools")
library(devtools)
install_github("igraph/rigraph")
install.packages('DiagrammeR')
install.packages('roxygen2')
install.packages('rgexf')
install.packages('influenceR')
install.packages('Cairo')
install.packages("imager")
Step 2: edit the config.mk file
cd ~/src/mxnet
cp make/config.mk .
echo "USE_BLAS=openblas" >>config.mk
echo "ADD_CFLAGS += -I/usr/include/openblas" >>config.mk
echo "ADD_LDFLAGS += -lopencv_core -lopencv_imgproc -lopencv_imgcodecs" >>config.mk
echo "USE_CUDA=1" >>config.mk
echo "USE_CUDA_PATH=/usr/local/cuda" >>config.mk
echo "USE_CUDNN=1" >>config.mk
*Note: even though USE_CUDA_PATH is set, the build STILL cannot find libcudart.so, so the library path has to be passed in the make command (shown later).
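Appending with echo >> relies on a later assignment in a makefile overriding an earlier one, but every rerun adds another duplicate line. A minimal sketch of a tidier alternative, assuming plain KEY = VALUE lines (it does not handle += lines such as ADD_CFLAGS, nor values containing | or &); the helper name set_mk_var is hypothetical and not part of mxnet:

```shell
# Set KEY = VALUE in a make config file: rewrite an existing plain
# assignment for KEY if one exists, otherwise append a new line.
set_mk_var() {
    file=$1; key=$2; value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Rewrite through a temp file (portable; avoids sed -i differences).
        tmp=$(mktemp) || return 1
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp" &&
            cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (values taken from the question):
#   set_mk_var config.mk USE_CUDA 1
#   set_mk_var config.mk USE_CUDA_PATH /usr/local/cuda
#   set_mk_var config.mk USE_CUDNN 1
```

Running it twice with the same arguments leaves a single line per key, which makes it easier to see which value the build actually picks up.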
Step 3: create a new linker config file so the make command can find libcudart.so
Create /etc/ld.so.conf.d/cuda.conf containing the line:
/usr/local/cuda-8.0/lib64
Then run:
sudo ldconfig
- Note: this was posted by NVIDIA but does absolutely nothing to help make rpkg
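One way to confirm whether the new cuda.conf entry took effect is to look for the library in the linker cache that ldconfig -p prints. A minimal sketch, assuming the usual "name (flags) => path" layout of that output; the helper name in_linker_cache is hypothetical:

```shell
# Succeed if the named library appears in `ldconfig -p`-style output
# read from stdin. The name is compared exactly, not as a pattern.
in_linker_cache() {
    awk -v lib="$1" '$1 == lib { found = 1 } END { exit !found }'
}

# Usage (ldconfig often lives in /sbin, which may not be on PATH):
#   /sbin/ldconfig -p | in_linker_cache libcudart.so.8.0 && echo "cached"
```

If the library is not listed after sudo ldconfig, the path in cuda.conf is probably not the directory that actually holds libcudart.so.8.0.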
Step 4: set up R directories
Rscript -e "install.packages('devtools', repo = 'https://cran.rstudio.com')"
cd R-package
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cran.rstudio.com')); install_deps(dependencies = TRUE)"
cd ..
Step 5: make
cd ~/src/mxnet
sudo make -j8
Result:
make CXX=g++ DEPS_PATH=/home/ec2-user/src/mxnet/deps -C /home/ec2-user/src/mxnet/ps-lite ps
cd /home/ec2-user/src/mxnet/dmlc-core; make libdmlc.a USE_SSE=1 config=/home/ec2-user/src/mxnet/config.mk; cd /home/ec2-user/src/mxnet
make[1]: Entering directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: `libdmlc.a' is up to date.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: Entering directory `/home/ec2-user/src/mxnet/ps-lite'
make[1]: Nothing to be done for `ps'.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/ps-lite'
ar crv lib/libmxnet.a
*Note: even after changing the config.mk file, the make command always reports that there is nothing to update.
Step 6: attempt to make rpkg
cd ~/src/mxnet
sudo make rpkg
Error:
Error: package or namespace load failed for ‘mxnet’:
 .onLoad failed in loadNamespace() for 'mxnet', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/lib64/R/library/mxnet/libs/libmxnet.so':
  libcudart.so.8.0: cannot open shared object file: No such file or directory
Error: loading failed
Execution halted
ERROR: loading failed
So it’s looking in a location that doesn’t exist: /usr/lib64/R/library/mxnet/libs/
The file actually lives at /home/ec2-user/src/mxnet/R-package/inst/libs/libmxnet.so or /home/ec2-user/src/mxnet/lib/libmxnet.so
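The dyn.load message appears to name libcudart.so.8.0, a library that libmxnet.so depends on, rather than libmxnet.so itself. ldd lists every dependency of a shared object and marks the ones the loader cannot resolve. A minimal sketch for pulling out just the unresolved names, assuming the standard "name => not found" lines that ldd prints; the helper name missing_libs is hypothetical:

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it also works on saved output.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Usage (path taken from the question; adjust as needed):
#   ldd /home/ec2-user/src/mxnet/lib/libmxnet.so | missing_libs
```

Each name it prints needs its directory to be visible to the loader, either through the ldconfig cache or through LD_LIBRARY_PATH.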
What I’ve tried so far:
sudo LD_LIBRARY_PATH=/usr/local/cuda/lib64 make rpkg
This fixes the missing libcudart.so.8.0 issue, but it is simply replaced with: libmklml_intel.so: cannot open shared object file: No such file or directory, as well as the original ‘cannot find libmxnet.so’.
Also tried:
1. Actually creating the directories (/usr/lib64/R/library/mxnet/libs/) and then copying libmxnet.so there.
Result: same error
2. Adding /home/ec2-user/src/mxnet/R-package/inst/libs/ to the make command:
sudo LD_LIBRARY_PATH=/home/ec2-user/src/mxnet/R-package/inst/libs make rpkg
Result: same error
3. A ridiculous number of environment variables and symlinks, all of which failed:
export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/
export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/libmxnet.so
sudo ldconfig /usr/local/cuda/lib64
sudo ln -s /usr/lib64/R/library/mxnet/libs /usr/lib
sudo ln -s /usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
sudo ln -s /usr/local/lib/libmklml_intel.so /usr/lib
sudo ln -s /usr/local/lib/libiomp5.so /usr/lib
sudo ln -s /usr/local /usr/lib
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0
export LD_LIBRARY_PATH=/usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/targets/x86_64-linux/lib/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0
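LD_LIBRARY_PATH is a colon-separated list of directories, so an entry that points at a file such as .../libcudart.so.8.0 is not searched the way a directory is, and each plain export replaces whatever was set before it. A minimal sketch of building the value one directory at a time, assuming the CUDA and /usr/local/lib locations mentioned in the question; the helper name prepend_path is hypothetical:

```shell
# Prepend a directory to a colon-separated search path and print the
# result. A directory that is already listed is not added again.
prepend_path() {
    dir=${1%/}        # drop one trailing slash so comparisons match
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *)          printf '%s\n' "${dir}${current:+:$current}" ;;
    esac
}

# Usage (directories, not files):
#   LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda/lib64 "$LD_LIBRARY_PATH")
#   LD_LIBRARY_PATH=$(prepend_path /usr/local/lib "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```

Keeping both directories in one value matters here because libcudart.so.8.0 and libmklml_intel.so apparently live in different places.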
In all, ONE of the attempts above worked, because I briefly got the mxnet R package working before it fell apart again. I’ve dropped 50+ hours into this installation, which, frankly, is ridiculous. It’s tougher to install the software than it is to program an actual net....
I don’t have 5+ years of Linux sysadmin knowledge, so if you’d like, please be a bit more helpful than ‘fix environment variables.’ I can tell that’s obviously what’s wrong, yet I have no idea what ‘fix environment variables’ entails.
To top it off, even after a successful install of the R package, it STILL won’t work until RStudio Server’s config file is set to:
rsession-ld-library-path=/opt/local/lib:/usr/local/cuda/lib64
Answer 1:
Did you try the following when running any sudo commands?
sudo -E make -j8
The -E flag asks sudo to preserve your environment variables when running as superuser. You shouldn't have to add a new config file for make to find the libraries; preserving the environment variables with the command above should be enough.
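Whether sudo -E really carries LD_LIBRARY_PATH through can depend on the sudoers policy, and on many systems the dynamic linker drops LD_* variables for setuid programs such as sudo, so it may be worth checking rather than assuming. A minimal sketch of such a check; the helper name survives_through is hypothetical, and the "runner" is whatever command prefix you want to test:

```shell
# Succeed if the exported variable named by $1 has the same value
# inside a child shell started through the given runner command.
survives_through() {
    var_name=$1; shift
    expected=$(eval "printf '%s' \"\${$var_name-}\"")
    # The runner (e.g. `sudo -E`) is prefixed to a plain /bin/sh call.
    actual=$("$@" /bin/sh -c "printf '%s' \"\${$var_name-}\"")
    [ "$expected" = "$actual" ]
}

# Usage:
#   export LD_LIBRARY_PATH=/usr/local/cuda/lib64
#   survives_through LD_LIBRARY_PATH sudo -E && echo "kept" || echo "dropped"
```

If the variable is dropped, setting it on the sudo command line, as in sudo LD_LIBRARY_PATH=/usr/local/cuda/lib64 make rpkg from the question, is an alternative to try.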
Source: https://stackoverflow.com/questions/46942883/issues-installing-mxnet-gpu-r-package-for-amazon-deep-learning-ami