intel-mkl

Why does numpy.sin return a different result if the argument size is greater than 8192?

99封情书 submitted on 2019-11-30 08:31:29
I discovered that numpy.sin behaves differently when the argument size is <= 8192 and when it is > 8192. The difference is in both performance and values returned. Can someone explain this effect? For example, let's calculate sin(pi/4):

    x = np.pi*0.25
    for n in range(8191, 8195):
        xx = np.repeat(x, n)
        %timeit np.sin(xx)
        print(n, np.sin(xx)[0])

    64.7 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    8191 0.7071067811865476
    64.6 µs ± 166 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    8192 0.7071067811865476
    20.1 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 100000
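The size dependence described in this question can be probed without IPython. The following is a minimal sketch, assuming only that NumPy is installed; the helper name `sin_by_size` is hypothetical, and whether any difference shows up at all depends on how the local NumPy was built (for example, whether it is linked against MKL).

```python
import numpy as np

def sin_by_size(x, sizes):
    """Return {n: first element of np.sin applied to n copies of x}."""
    results = {}
    for n in sizes:
        xx = np.repeat(x, n)               # array holding n identical values
        results[n] = float(np.sin(xx)[0])  # only the first element is compared
    return results

values = sin_by_size(np.pi * 0.25, range(8191, 8195))
for n, v in values.items():
    # repr() shows every significant digit, so a last-bit difference is visible
    print(n, repr(v))
```

Comparing the printed values across the 8192 boundary shows whether the build in use switches code paths by array size.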

Install Scipy with MKL through PIP

纵饮孤独 submitted on 2019-11-29 16:26:55
Question: I am using pip to install SciPy with MKL to improve performance. My OS is 64-bit Ubuntu. Using the solution from this question, I created a file .numpy-site.cfg:

    [mkl]
    library_dirs=/opt/intel/composer_xe_2013_sp1/mkl/lib/intel64/
    include_dirs=/opt/intel/mkl/include/
    mkl_libs=mkl_intel_lp64,mkl_intel_thread,mkl_core,mkl_rt
    lapack_libs=

With this file I installed NumPy with MKL successfully. However, with the same file, installing SciPy fails with the error ImportError: libmkl_rt.so

Confused with pdpotrf arguments

南笙酒味 submitted on 2019-11-29 16:01:53
I want to do a Cholesky factorization in a distributed environment. For that purpose, I use pdpotrf(). However, I am struggling to understand the parameters the function needs, and there is no C example showing how to use it (an example would be really great to have). Assume I have an NxN matrix I want to factorize. What values should the parameters have? uplo, a and info are well defined in my mind. How about the rest? n should be equal to N, I would say. However, desca, ia and ja are the ones that confuse me. Moreover, desca is global and local, something that I can't understand
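On the "global and local" nature of desca: under the usual ScaLAPACK convention, the descriptor is a 9-integer array whose global entries (matrix size, block size, source process) are the same on every process, while the last entry, the local leading dimension, depends on the calling process. The sketch below illustrates that layout in Python rather than C. The helper names `numroc` and `make_desc` mirror the ScaLAPACK routines NUMROC and DESCINIT but are re-implemented here for illustration only, and the 9x9 matrix, block size 2 and 2x2 process grid are assumed values.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Rows (or columns) of a block-cyclically distributed dimension of
    global size n, block size nb, that land on process iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from source process
    nblocks = n // nb                              # number of full blocks
    count = (nblocks // nprocs) * nb               # blocks every process receives
    extra = nblocks % nprocs                       # leftover full blocks
    if mydist < extra:
        count += nb                                # one extra full block
    elif mydist == extra:
        count += n % nb                            # the final partial block
    return count

def make_desc(ctxt, m, n, mb, nb, myrow, nprow, rsrc=0, csrc=0):
    """Build the 9-entry descriptor:
    [DTYPE, CTXT, M, N, MB, NB, RSRC, CSRC, LLD]."""
    lld = max(1, numroc(m, mb, myrow, rsrc, nprow))  # local leading dimension
    return [1, ctxt, m, n, mb, nb, rsrc, csrc, lld]

N, NB, NPROW = 9, 2, 2          # illustrative sizes, not from the question
for myrow in range(NPROW):
    # ctxt=0 is a placeholder; a real BLACS context comes from the grid setup
    print(myrow, make_desc(0, N, N, NB, NB, myrow, NPROW))

# To factorize the whole matrix (not a submatrix), ia = ja = 1.
ia, ja = 1, 1
```

Only the last entry differs between process rows in the printed descriptors, which is the sense in which the descriptor mixes global and local information.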

Why does numpy.sin return a different result if the argument size is greater than 8192?

筅森魡賤 submitted on 2019-11-29 11:59:26
Question: I discovered that numpy.sin behaves differently when the argument size is <= 8192 and when it is > 8192. The difference is in both performance and values returned. Can someone explain this effect? For example, let's calculate sin(pi/4):

    x = np.pi*0.25
    for n in range(8191, 8195):
        xx = np.repeat(x, n)
        %timeit np.sin(xx)
        print(n, np.sin(xx)[0])

    64.7 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    8191 0.7071067811865476
    64.6 µs ± 166 ns per loop (mean ± std. dev. of 7 runs,

Difference between Numpy and Numpy-MKL?

别说谁变了你拦得住时间么 submitted on 2019-11-29 10:00:17
I wanted to test some signal processing and statistics using SciPy, so I had to use scipy.signal and scipy.stats, but I always got an error: ImportError: DLL load failed: The specified module could not be found. I was using Numpy 1.7.1, scipy 0.12 and Python 2.7.3. I searched the internet and asked about it on other forums too. The problem was solved when I switched my Numpy distribution to the Numpy-MKL distribution. I want to know the difference between the two libraries.

Numpy and scipy rely on lower level fortran libraries such as BLAS and lapack to perform many of their
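Since the difference between the two distributions comes down to which BLAS/LAPACK libraries NumPy was built against, the build in use can be inspected directly. This is a minimal sketch, assuming only that NumPy is installed; the helper name `numpy_build_info` is hypothetical, and the exact text printed by `np.show_config()` varies between NumPy versions.

```python
import contextlib
import io

import numpy as np

def numpy_build_info():
    """Capture the text that np.show_config() prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()

info = numpy_build_info()
print(info)
# A crude check: an MKL-linked build normally mentions "mkl" somewhere.
print("mentions MKL:", "mkl" in info.lower())
```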

Calling multithreaded MKL from an OpenMP parallel region

佐手、 submitted on 2019-11-29 08:21:57
I have code with the following structure:

    #pragma omp parallel
    {
        #pragma omp for nowait
        {
            // first for loop
        }
        #pragma omp for nowait
        {
            // second for loop
        }
        #pragma omp barrier
        <-- #pragma omp single/critical/atomic --> not sure
        dgemm_(....)
        #pragma omp for
        {
            // yet another for loop
        }
    }

For dgemm_, I link with multithreaded MKL. I want MKL to use all 8 available threads. What is the best way to do so?

This is a case of nested parallelism. It is supported by MKL, but it only works if your executable is built using the Intel C/C++ compiler. The reason for that restriction is that MKL uses Intel's OpenMP runtime and

Will installing BLAS/ATLAS/MKL/OpenBLAS speed up R packages written in C/C++?

二次信任 submitted on 2019-11-29 08:13:10
I found that using one of BLAS/ATLAS/MKL/OpenBLAS improves speed in R. However, will it also improve R packages that are written in C or C++? For example, the R package glmnet is implemented in Fortran and the R package rpart is implemented in C++. Will just installing BLAS etc. improve the execution time, or do we have to rebuild the package (compile new C code) against BLAS etc.?

It is frequently stated, including in a comment here, that "you have to recompile R" to use a different BLAS or LAPACK library. That is wrong. You do not have to recompile R provided it is

Using mkl_set_num_threads with numpy

≯℡__Kan透↙ submitted on 2019-11-29 03:38:48
I'm trying to set the number of threads for numpy calculations with mkl_set_num_threads like this:

    import numpy
    import ctypes
    mkl_rt = ctypes.CDLL('libmkl_rt.so')
    mkl_rt.mkl_set_num_threads(4)

but I keep getting a segmentation fault:

    Program received signal SIGSEGV, Segmentation fault.
    0x00002aaab34d7561 in mkl_set_num_threads__ () from /../libmkl_intel_lp64.so

Getting the number of threads is no problem:

    print mkl_rt.mkl_get_max_threads()

How can I get my code working? Or is there another way to set the number of threads at runtime?

Ophion led me the right way. Despite the documentation, one
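The backtrace shows the crash inside mkl_set_num_threads__, which looks like a Fortran-style entry point, and such entry points conventionally take their arguments by reference. The sketch below is written under that assumption and passes the thread count as a pointer to a C int. It is not confirmed by the truncated answer above. The helper name `set_mkl_threads` is hypothetical, and the function returns None when no MKL runtime can be located, so it is safe to run on machines without MKL.

```python
import ctypes
import ctypes.util

def set_mkl_threads(n, libname=None):
    """Try to set the MKL thread count to n.

    Returns the value reported by mkl_get_max_threads() afterwards,
    or None if the MKL runtime library cannot be found or loaded.
    """
    name = libname or ctypes.util.find_library("mkl_rt")
    if name is None:
        return None                      # MKL runtime not found on this system
    try:
        mkl_rt = ctypes.CDLL(name)
    except OSError:
        return None                      # found but could not be loaded
    # Assumption: this symbol expects a pointer to int, not an int by value.
    mkl_rt.mkl_set_num_threads(ctypes.byref(ctypes.c_int(n)))
    return mkl_rt.mkl_get_max_threads()

result = set_mkl_threads(4)
print("MKL max threads:", result)
```

If the library lives outside the default search path, its full path can be passed as `libname`, for example the 'libmkl_rt.so' name used in the question.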

MATLAB twice as fast as Numpy

 ̄綄美尐妖づ submitted on 2019-11-29 00:46:54
Question: I am an engineering grad student currently making the transition from MATLAB to Python for the purposes of numerical simulation. I was under the impression that for basic array manipulation, Numpy would be as fast as MATLAB. However, for two different programs I wrote, MATLAB appears to be a little under twice as fast as Numpy. The test code I am using for Numpy (Python 3.3) is:

    import numpy as np
    import time
    a = np.random.rand(5000,5000,3)
    tic = time.time()
    a[:,:,0] = a[:,:,1]
    a[:,:,2] =
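One variable worth isolating in a comparison like this is memory layout: NumPy arrays are row-major (C order) by default, whereas MATLAB stores arrays column-major, so a slice such as a[:,:,0] is strided in one layout and contiguous in the other. The sketch below times the same slice copy in both layouts. It is an illustration under assumed settings, not a reconstruction of the truncated test code; the helper name `time_slice_copy`, the smaller 500x500x3 shape and the repeat count are all hypothetical choices.

```python
import time

import numpy as np

def time_slice_copy(order, shape=(500, 500, 3), repeats=5):
    """Best-of-repeats time for a[:, :, 0] = a[:, :, 1] in the given layout.

    order is "C" (row-major, NumPy default) or "F" (column-major).
    """
    a = np.asarray(np.random.rand(*shape), order=order)
    best = float("inf")
    for _ in range(repeats):
        tic = time.perf_counter()      # monotonic, high-resolution timer
        a[:, :, 0] = a[:, :, 1]
        best = min(best, time.perf_counter() - tic)
    return best, a

for order in ("C", "F"):
    elapsed, arr = time_slice_copy(order)
    print(order, "order:", elapsed, "seconds")
```

Taking the best of several repeats reduces the influence of one-off effects such as page faults on first touch.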

Pyinstaller numpy “Intel MKL FATAL ERROR: Cannot load mkl_intel_thread.dll”

∥☆過路亽.° submitted on 2019-11-28 20:51:23
I'm new to Python apps. I'm trying to build my Python GUI app with pyinstaller. My app depends on the following packages: PyQt4, numpy, pyqtgraph, h5py. I'm working with WinPython-32bit-3.4.4.1. I build the app with this command:

    pyinstaller --hidden-import=h5py.defs --hidden-import=h5py.utils --hidden-import=h5py.h5ac --hidden-import=h5py._proxy VOGE.py

I launch my app with the exe file in the dist directory created by pyinstaller, and it seems to work fine until the program calls numpy and crashes with this error: Intel MKL FATAL ERROR: Cannot load mkl_intel_thread.dll

The mkl_intel_thread.dll is