Question
How can I extend Theano's downsample.max_pool_2d_same_size so that it pools not only within a feature map but also across feature maps, in an efficient manner?
Let's say I have 3 feature maps, each of size 10x10, i.e. a 4D tensor of shape (1,3,10,10). First, max pool each (10,10) feature map with a (2,2) window and no overlap. The result is 3 sparse feature maps, still (10,10) but with most values equal to zero: within each (2,2) window at most one value is greater than zero. This is what downsample.max_pool_2d_same_size does.
Next, I want to compare the maximum of each (2,2) window with the maxima of the windows at the same position in all other feature maps, and keep only the maximum across all feature maps. The result is again 3 feature maps of size (10,10), with nearly all values being zero.
Is there a fast way to do this? I wouldn't mind using other max-pooling functions, but I need the exact locations of the maxima for pooling/unpooling purposes (but that's another topic).
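For reference, the operation described above can be written down in plain NumPy. This is only a sketch to pin down the expected output, not an efficient Theano implementation. The function name same_size_max_pool and the across_maps flag are made up for illustration, and the code assumes the height and width are divisible by the window size. If several values tie for the maximum, every tied position is kept.

```python
import numpy as np


def same_size_max_pool(x, pool=(2, 2), across_maps=False):
    """Zero out everything except the maximum of each non-overlapping window.

    x has shape (batch, channels, height, width). With across_maps=True the
    maximum is also taken over the channel axis, so only one value per
    spatial window survives across all feature maps.
    Returns (sparse output, boolean mask of the kept positions).
    """
    b, c, h, w = x.shape
    ph, pw = pool
    # Split each spatial axis into (window index, offset inside the window).
    blocks = x.reshape(b, c, h // ph, ph, w // pw, pw)
    axes = (1, 3, 5) if across_maps else (3, 5)
    window_max = blocks.max(axis=axes, keepdims=True)
    mask = (blocks == window_max).reshape(b, c, h, w)
    return x * mask, mask


rng = np.random.RandomState(0)
x = rng.rand(1, 3, 10, 10)
within, within_mask = same_size_max_pool(x)
across, across_mask = same_size_max_pool(x, across_maps=True)
print(within_mask.sum(), across_mask.sum())
```

With 3 feature maps of size 10x10 there are 25 windows per map, so the first mask keeps one value per window and map, and the second keeps one value per window over all maps.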
Answer 1:
I solved it using Lasagne with cuDNN. Here are some minimal examples of how to get the indices of a max pooling operation (2D and 3D). See https://groups.google.com/forum/#!topic/lasagne-users/BhtKsRmFei4
from __future__ import print_function  # so print() behaves the same on Python 2 and 3

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.type import TensorType
from theano.configparser import config
import lasagne


def tensor5(name=None, dtype=None):
    # Symbolic 5D tensor; none of the five dimensions is broadcastable.
    if dtype is None:
        dtype = config.floatX
    tensor_type = TensorType(dtype, (False, False, False, False, False))
    return tensor_type(name)


def max_pooling_2d():
    input_var = T.tensor4('input')
    input_layer = lasagne.layers.InputLayer(shape=(None, 2, 4, 4), input_var=input_var)
    max_pool_layer = lasagne.layers.MaxPool2DLayer(input_layer, pool_size=(2, 2))
    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    # Backpropagating ones through the pooling op yields a mask over the input
    # that is nonzero at the position of each window's maximum.
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)
    data = np.random.randint(low=0, high=9, size=32).reshape((1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)


def max_pooling_3d():
    # The dnn submodule is not loaded by "import lasagne" and needs cuDNN.
    import lasagne.layers.dnn
    input_var = tensor5('input')
    # 5 input dimensions: (batchsize, channels, 3 spatial dimensions)
    input_layer = lasagne.layers.InputLayer(shape=(1, 1, 2, 4, 4), input_var=input_var)
    max_pool_layer = lasagne.layers.dnn.MaxPool3DDNNLayer(input_layer, pool_size=(2, 2, 2))
    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    # Same gradient trick as in the 2D case.
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)
    data = np.random.randint(low=0, high=9, size=32).reshape((1, 1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)


if __name__ == "__main__":
    max_pooling_2d()
    max_pooling_3d()
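A mask like the one computed by the examples above can then be used for unpooling. The following is a minimal NumPy sketch, assuming non-overlapping windows and a mask that is 1 at each window's maximum and 0 elsewhere. The helper unpool_2d is hypothetical and not part of Theano or Lasagne; the mask here is built by hand in NumPy rather than taken from the gradient.

```python
import numpy as np


def unpool_2d(pooled, mask, pool=(2, 2)):
    """Place each pooled value back at the position marked in mask.

    pooled: (batch, channels, h // ph, w // pw), the max-pool output
    mask:   (batch, channels, h, w), 1 at the maxima and 0 elsewhere
    """
    ph, pw = pool
    # Repeat every pooled value over its window, then keep only the marked cells.
    upsampled = pooled.repeat(ph, axis=2).repeat(pw, axis=3)
    return upsampled * mask


# Small example: one 4x4 feature map with distinct values 0..15.
x = np.arange(16, dtype=float).reshape(1, 1, 4, 4)
blocks = x.reshape(1, 1, 2, 2, 2, 2)  # (batch, channel, row window, row offset, col window, col offset)
pooled = blocks.max(axis=(3, 5))
mask = (blocks == pooled[:, :, :, None, :, None]).astype(float).reshape(x.shape)
restored = unpool_2d(pooled, mask)
print(restored[0, 0])
```

The restored map has the same shape as the input and is zero everywhere except at the four window maxima.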
Source: https://stackoverflow.com/questions/32989064/theano-max-pool-3d