Converting Octave to Use CuBLAS

Posted by 强颜欢笑 on 2020-01-12 06:18:27

Question


I'd like to convert Octave to use CuBLAS for matrix multiplication. This video seems to indicate this is as simple as typing 28 characters:

Using CUDA Library to Accelerate Applications

In practice it's a bit more complex than this. Does anyone know what additional work must be done to make the modifications made in this video compile?

UPDATE

Here's the method I'm trying

in dMatrix.cc add

#include <cublas.h>

in dMatrix.cc change all occurrences of (preserving case)

dgemm

to

cublas_dgemm

in my build terminal set

export CC=nvcc
export CFLAGS="-lcublas -lcudart"
export CPPFLAGS="-I/usr/local/cuda/include"
export LDFLAGS="-L/usr/local/cuda/lib64"

the error I receive is:

libtool: link: g++ -I/usr/include/freetype2 -Wall -W -Wshadow -Wold-style-cast 
-Wformat -Wpointer-arith -Wwrite-strings -Wcast-align -Wcast-qual -g -O2
-o .libs/octave octave-main.o  -L/usr/local/cuda/lib64 
../libgui/.libs/liboctgui.so ../libinterp/.libs/liboctinterp.so 
../liboctave/.libs/liboctave.so -lutil -lm -lpthread -Wl,-rpath
-Wl,/usr/local/lib/octave/3.7.5

../liboctave/.libs/liboctave.so: undefined reference to `cublas_dgemm_'
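
The trailing underscore in `cublas_dgemm_` suggests the call is going through Fortran-style name decoration, so the linker is looking for a Fortran-callable wrapper rather than a C function declared in cublas.h. A minimal sketch for pulling the missing symbol names out of a link log and checking whether a given library defines them; the helper names `undefined_refs` and `defines_symbol` and the file `build.log` are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read linker output on stdin and print each symbol
# named in an "undefined reference to `sym'" message, once.
undefined_refs() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}

# Hypothetical helper: succeed if library $1 defines symbol $2 in its
# text section (nm marks those with "T"). Tries the regular and the
# dynamic symbol table; requires nm from binutils.
defines_symbol() {
    { nm "$1"; nm -D "$1"; } 2>/dev/null | grep -q " T $2\$"
}

# Example use (paths are placeholders):
#   make 2>&1 | tee build.log
#   undefined_refs < build.log
#   defines_symbol path/to/some/library.a cublas_dgemm_ && echo "defined"
```

If no library on the link line defines the symbol, the wrapper has to be built and linked in separately.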

Answer 1:


EDIT2: The method described in this video requires the Fortran "thunking library" bindings for cuBLAS. These steps worked for me:

  1. Download octave 3.6.3 from here:

    wget ftp://ftp.gnu.org/gnu/octave/octave-3.6.3.tar.gz
    
  2. extract all files from the archive:

    tar -xzvf octave-3.6.3.tar.gz
    
  3. change into the octave directory just created:

    cd octave-3.6.3
    
  4. make a directory for your "thunking cublas library"

    mkdir mycublas
    
  5. change into that directory

    cd mycublas
    
  6. build the "thunking cublas library"

    g++ -c -fPIC -I/usr/local/cuda/include -I/usr/local/cuda/src -DCUBLAS_GFORTRAN -o fortran_thunking.o /usr/local/cuda/src/fortran_thunking.c
    ar rvs libmycublas.a fortran_thunking.o
    
  7. switch back to the main build directory

    cd ..
    
  8. run octave's configure with additional options:

    ./configure --disable-docs LDFLAGS="-L/usr/local/cuda/lib64 -lcublas -lcudart -L/home/user2/octave/octave-3.6.3/mycublas -lmycublas"
    

    Note: in the command above, change the second -L path to match the location of the mycublas directory you created in step 4.

  9. Now edit octave-3.6.3/liboctave/dMatrix.cc according to the instructions given in the video. It should be sufficient to replace every instance of dgemm with cublas_dgemm and every instance of DGEMM with CUBLAS_DGEMM. In the octave 3.6.3 version I used, there were 3 such instances of each (lower case and upper case).

  10. Now you can build octave:

    make
    

    (make sure you are in the octave-3.6.3 directory)
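
Steps 8 and 9 above can be scripted. The following is a minimal sketch, not part of the original recipe: the function names `mycublas_ldflags` and `rename_dgemm` are hypothetical, GNU sed is assumed for the `\<` and `\>` word-boundary operators, and the CUDA library path is the one used in step 8. Matching whole words keeps the edit away from other identifiers such as dgemv, and means a second run changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the LDFLAGS value for step 8, using the
# absolute path of the mycublas directory given as $1.
mycublas_ldflags() {
    dir=$(cd "$1" && pwd) || return 1
    printf '%s\n' "-L/usr/local/cuda/lib64 -lcublas -lcudart -L$dir -lmycublas"
}

# Hypothetical helper: rewrite whole-word dgemm/DGEMM in file $1.
# The untouched original is kept once as $1.orig.
rename_dgemm() {
    [ -f "$1.orig" ] || cp "$1" "$1.orig" || return 1
    sed -e 's/\<dgemm\>/cublas_dgemm/g' \
        -e 's/\<DGEMM\>/CUBLAS_DGEMM/g' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Example use, from the octave-3.6.3 directory:
#   ./configure --disable-docs LDFLAGS="$(mycublas_ldflags mycublas)"
#   rename_dgemm liboctave/dMatrix.cc
#   diff liboctave/dMatrix.cc.orig liboctave/dMatrix.cc
```

Reviewing the diff before running make is a cheap way to confirm that only the intended calls changed.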

At this point, for me, Octave built successfully. I did not pursue make install although I assume that would work. I simply ran octave using the ./run-octave script in the octave-3.6.3 directory.

The above steps assume a proper and standard CUDA 5.0 install. I will try to respond to CUDA-specific questions or issues, but there are any number of problems that may arise with a general Octave install on your platform. I'm not an octave expert and I won't be able to respond to those. I used CentOS 6.2 for this test.

This method, as indicated, involves modifying Octave's C++ source files.

Another method was covered in some detail in the S3527 session at the GTC 2013 GPU Tech Conference. This session was actually a hands-on laboratory exercise. Unfortunately the materials on that are not conveniently available. However the method there did not involve any modification of GNU Octave source, but instead uses the LD_PRELOAD capability of Linux to intercept the BLAS library calls and re-direct (the appropriate ones) to the cublas library.

A newer, better method (using the NVBLAS intercept library) is discussed in this blog article




Answer 2:


I was able to produce a compiled executable using the information supplied. It's a horrible hack, but it works.

The process looks like this:

First produce an object file for fortran_thunking.c

sudo /usr/local/cuda-5.0/bin/nvcc -O3 -c -DCUBLAS_GFORTRAN fortran_thunking.c

Then move that object file to the src subdirectory in octave

cp /usr/local/cuda-5.0/src/fortran_thunking.o ./octave/src

Run make. The build will fail on the last step, the final link. Change to the src directory.

cd src

Then execute the failing final line with the addition of ./fortran_thunking.o -lcudart -lcublas just after octave-main.o. This produces the following command

g++ -I/usr/include/freetype2 -Wall -W -Wshadow -Wold-style-cast -Wformat
 -Wpointer-arith -Wwrite-strings -Wcast-align -Wcast-qual
 -I/usr/local/cuda/include -o .libs/octave octave-main.o 
./fortran_thunking.o -lcudart -lcublas  -L/usr/local/cuda/lib64 
../libgui/.libs/liboctgui.so ../libinterp/.libs/liboctinterp.so 
../liboctave/.libs/liboctave.so -lutil -lm -lpthread -Wl,-rpath 
-Wl,/usr/local/lib/octave/3.7.5

An octave binary will be created in the src/.libs directory. This is your octave executable.




Answer 3:


In recent versions of CUDA you don't have to recompile anything, at least on Debian in my experience. First, create a config file for NVBLAS (a cuBLAS wrapper). It will not work at all without one.

tee nvblas.conf <<EOF
NVBLAS_CPU_BLAS_LIB $(dpkg -L libopenblas-base | grep libblas)
NVBLAS_GPU_LIST ALL
EOF

Then run Octave as usual, preloading the NVBLAS library:

LD_PRELOAD=libnvblas.so octave

NVBLAS will do what it can on a GPU while relaying everything else to OpenBLAS.
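
Two details of the recipe above are worth guarding against. The `dpkg -L ... | grep libblas` substitution may expand to more than one line, depending on how the package is laid out, and NVBLAS looks for nvblas.conf in the current directory unless the `NVBLAS_CONFIG_FILE` environment variable points elsewhere. A small sketch that addresses both; the function name `write_nvblas_conf` and the paths in the example are assumptions, so substitute whatever CPU BLAS library your system provides:

```shell
#!/bin/sh
# Hypothetical helper: write an NVBLAS config naming CPU BLAS library $1
# (only its first line is used) into file $2 (default: ./nvblas.conf).
write_nvblas_conf() {
    cpu_blas=$(printf '%s\n' "$1" | head -n 1)
    conf=${2:-nvblas.conf}
    [ -n "$cpu_blas" ] || { echo "no CPU BLAS library given" >&2; return 1; }
    {
        printf 'NVBLAS_CPU_BLAS_LIB %s\n' "$cpu_blas"
        printf 'NVBLAS_GPU_LIST ALL\n'
    } > "$conf"
}

# Example use on Debian, following the answer above:
#   write_nvblas_conf "$(dpkg -L libopenblas-base | grep libblas)" "$HOME/nvblas.conf"
#   NVBLAS_CONFIG_FILE="$HOME/nvblas.conf" LD_PRELOAD=libnvblas.so octave
```

Pointing `NVBLAS_CONFIG_FILE` at an absolute path lets you start Octave from any directory.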

Further reading:

  • Benchmark for Octave.

  • Relevant slides for NVBLAS presentation.

  • Manual for nvblas.conf

Note that you may not see much benefit from the GPU, depending on your CPU and GPU: OpenBLAS is quite fast on current multi-core processors. It can be fast enough that the time spent copying data to the GPU, computing there, and copying the results back comes close to the time needed to do the job on the CPU. Check for yourself. GPUs are usually more energy efficient, though.
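
One way to check for yourself is to time the same matrix product with and without the preload. This is only a sketch: the function name `bench_dgemm`, the `OCTAVE_BIN` override, and the matrix size of 4000 are arbitrary choices, and a single run gives a rough indication rather than a proper benchmark.

```shell
#!/bin/sh
# Octave binary to run; override it to test a ./run-octave build instead.
OCTAVE_BIN=${OCTAVE_BIN:-octave}

# Hypothetical helper: time one n-by-n double-precision matrix product.
# $1 = matrix size (default 4000), $2 = optional library to LD_PRELOAD.
bench_dgemm() {
    n=${1:-4000}
    code="A = rand($n); B = rand($n); tic; C = A*B; printf(\"%g s\\n\", toc);"
    if [ -n "$2" ]; then
        LD_PRELOAD="$2" "$OCTAVE_BIN" --quiet --eval "$code"
    else
        "$OCTAVE_BIN" --quiet --eval "$code"
    fi
}

# Example use:
#   bench_dgemm 4000                 # CPU BLAS only
#   bench_dgemm 4000 libnvblas.so    # NVBLAS preloaded, nvblas.conf in place
```

Trying a few sizes is worthwhile, since the transfer overhead matters most for small matrices.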



Source: https://stackoverflow.com/questions/17493270/converting-octave-to-use-cublas
