numba

nearest neighbour search 4D space python fast - vectorization

六眼飞鱼酱① submitted on 2020-08-26 07:10:06
Question: For each observation in X (there are 20) I want to get the k (3) nearest neighbors. How can I make this fast enough to support up to 3 to 4 million rows? Is it possible to speed up the loop iterating over the elements, perhaps via numpy, numba, or some kind of vectorization? A naive loop in Python:

```python
import numpy as np
from sklearn.neighbors import KDTree

n_points = 20
d_dimensions = 4
k_neighbours = 3

rng = np.random.RandomState(0)
X = rng.random_sample((n_points, d_dimensions))
print(X)

tree = KDTree(X,
```

(excerpt truncated)
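For reference, a sketch of the batch query the excerpt is building toward: sklearn's `KDTree.query` accepts the whole array at once, so no explicit Python loop over rows is needed. Names follow the excerpt; `leaf_size` is an assumed tuning knob, not from the original.

```python
import numpy as np
from sklearn.neighbors import KDTree

n_points = 20
d_dimensions = 4
k_neighbours = 3

rng = np.random.RandomState(0)
X = rng.random_sample((n_points, d_dimensions))

tree = KDTree(X, leaf_size=40)
# One vectorized call for all rows. When querying the training points
# themselves, each point's nearest neighbour is itself (distance 0), so
# ask for k+1 and drop column 0 if self-matches are unwanted.
dist, ind = tree.query(X, k=k_neighbours)
print(ind.shape)  # one row of neighbour indices per observation
```

For millions of rows, the query call remains a single vectorized operation; the tree build and query both scale roughly as O(n log n) in low dimensions like 4D.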

Numba jit and deferred types

℡╲_俬逩灬. submitted on 2020-08-25 03:53:22
Question: I'm passing this signature to numba for my function:

```python
@numba.jit(numba.types.UniTuple(numba.float64[:, :], 2)(
    numba.float64[:, :], numba.float64[:, :], numba.float64[:, :],
    earth_model_type))
```

where `earth_model_type` is defined as

```python
earth_model_type = numba.deferred_type()
earth_model_type.define(em.EarthModel.class_type.instance_type)
```

It compiles fine, but when I try to call the function I get:

```
*** TypeError: No matching definition for argument type(s) array(float64, 2d, F), array(float64, 2d, C),
```

(excerpt truncated)

speed up function that takes a function as argument with numba

允我心安 submitted on 2020-08-21 19:48:28
Question: I am trying to use numba to speed up a function that takes another function as argument. A minimal example would be the following:

```python
import numba as nb

def f(x):
    return x * x

@nb.jit(nopython=True)
def call_func(func, x):
    return func(x)

if __name__ == '__main__':
    print(call_func(f, 5))
```

This, however, doesn't work, as apparently numba doesn't know what to do with that function argument. The traceback is quite long:

```
Traceback (most recent call last):
  File "numba_function.py", line 15, in <module>
```

(excerpt truncated)

[Tech Blog] GPU Programming: Implementing MNIST-CNN from Scratch

吃可爱长大的小学妹 submitted on 2020-08-18 03:36:22
Many people's first contact with a "GPU" is probably through gaming: a high-performance GPU delivers an exceptional gaming experience. What really made GPUs widely known, though, is the boom in machine learning and deep learning (some people also use them for Bitcoin mining): the sheer volume of data and computation demands faster processing, so GPU programming has become increasingly common.

Deep-learning practitioners often jokingly call themselves "alchemists", because the daily routine is: build a network, tune parameters, tune parameters, tune parameters... As a beginner in deep learning this is even more the case. Although I keep reproducing one network after another, there is always a lingering confusion. I think that confusion comes from the "black box" nature of deep learning: most of our time goes into parameter tuning, while the actual computation inside the network stays hazy. This article therefore tries to write a simple CNN from scratch with GPU programming, to understand the network's operation from the inside, in the hope of dispelling some of that confusion, and to use this simple application as an introduction to GPU programming.

The article has two parts:

1. An introduction to GPU programming.
2. Building MNIST-CNN from scratch with GPU programming.

1 Introduction to GPU Programming

1.1 Introduction

Figure 1 compares the physical structure of a CPU and a GPU (image from the web). The colored blocks in the figure represent processing cores: a CPU has few cores (only 8 in the figure), but each one is very powerful, while a GPU has a great many cores, each of limited capability. The number of cores determines how many threads are available to a computation, so when it comes to computational problems... (excerpt truncated)