I need to resize some 3D data, the way the tf.image.resize_images
method does for 2D data.
I was thinking I could try and run tf.image.resize_images
The tensor that tf.image.resize_images works on is already 4D: one dimension is allocated to 'batch_size' and the other three to width, height, and depth. If you are looking to process a 3D image and have batches of them in this configuration
[batch_size, width, height, depth, 1]
then use the squeeze function to remove the unnecessary final dimension, like so:
tf.squeeze(yourData, [4])
This will output a tensor of shape
[batch_size, width, height, depth]
which is a shape TensorFlow handles gracefully.
addition
If you have the dimensions handy and want to use TensorFlow's reshape instead, you could do it like so:
reshapedData = tf.reshape(yourData, [batch_size, width, height, depth])
Personally, I'd use squeeze to declare to the next programmer that your code only intends to get rid of dimensions of size 1, whereas reshape could mean so much more and would leave the next developer trying to figure out why you are reshaping.
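To make that difference concrete, here is a small sketch using NumPy as a stand-in for TensorFlow (for this case np.squeeze and np.reshape follow the same shape rules as tf.squeeze and tf.reshape). The sizes and the name your_data are made up for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, width, height, depth = 2, 4, 5, 3
your_data = np.zeros([batch_size, width, height, depth, 1])

squeezed = np.squeeze(your_data, axis=4)
reshaped = np.reshape(your_data, [batch_size, width, height, depth])
print(squeezed.shape, reshaped.shape)

# squeeze refuses to drop an axis whose size is not 1, which documents intent;
# reshape accepts any target shape with the same number of elements.
multi_channel = np.zeros([batch_size, width, height, depth, 3])
try:
    np.squeeze(multi_channel, axis=4)
    squeeze_failed = False
except ValueError:
    squeeze_failed = True
print(squeeze_failed)
```

Both calls give the same 4D shape here; the only difference is that squeeze fails loudly when the last dimension turns out not to be 1.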
update to include the changing 4th dimension
You would like to sometimes use the dimension
[batch_size, width, height, depth, 1]
and sometimes use
[batch_size, width, height, depth, n]
No problem. It is the same solution, but now you can't use squeeze and instead are just left with reshape like so:
reshapedData = tf.reshape(yourData, [batch_size, width, height, depth*n])
How could this work? Let's imagine that depth is the number of image frames and n is the color depth (possibly 3 for RGB). The reshape will stack the color frames one after the other. Your TensorFlow model no doubt has a convolution layer immediately after the input. That layer will process the stack of color frames as easily as the monochrome frames (albeit with more computation and more parameters).
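A quick way to see where each color frame lands after the reshape is to try it on a small array. This is a NumPy sketch with made-up sizes, assuming the default row-major layout that tf.reshape also uses.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, width, height, depth, n = 1, 2, 2, 4, 3
your_data = np.arange(batch_size * width * height * depth * n).reshape(
    [batch_size, width, height, depth, n])

merged = your_data.reshape([batch_size, width, height, depth * n])

# Frame d, color c ends up at channel index d * n + c of the merged axis.
d, c = 2, 1
same = np.array_equal(merged[..., d * n + c], your_data[..., d, c])
print(merged.shape, same)
```

So the colors of one frame sit next to each other, followed by the colors of the next frame, and nothing is lost or mixed by the reshape itself.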
and addition of scaling
Okay, here is how to scale the image: use tf.image.resize_images after reshaping, like so:
reshapedData = tf.image.resize_images( tf.reshape(yourData, [batch_size, width, height, depth*n]) , new_size )
where new_size is a 1-D tensor with two elements, [ new_height, new_width ], or in your case [ width * scale , height * scale ]
new_size = tf.constant( [ width * scale , height * scale ] )
and then back to the original
If, after all this resizing, you want the image to be back in the shape [batch_size, width, height, depth, n] (with the scaled width and height),
then simply use this code on the resized tensor:
tf.reshape(reshapedData, [batch_size, width*scale, height*scale, depth, n])
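The whole round trip can be checked on a small array. The sketch below uses NumPy, and resize_nearest is a hypothetical stand-in for tf.image.resize_images that only does integer nearest-neighbor upscaling; the sizes are made up.

```python
import numpy as np

def resize_nearest(images, scale):
    """Hypothetical stand-in for tf.image.resize_images: integer nearest-neighbor
    upscaling of axes 1 and 2 of a [batch, width, height, channels] array."""
    return np.repeat(np.repeat(images, scale, axis=1), scale, axis=2)

batch_size, width, height, depth, n, scale = 2, 3, 4, 5, 3, 2
your_data = np.random.rand(batch_size, width, height, depth, n)

merged = your_data.reshape([batch_size, width, height, depth * n])
resized = resize_nearest(merged, scale)
restored = resized.reshape([batch_size, width * scale, height * scale, depth, n])

# Each (frame, color) slice was resized independently of the others.
expected = resize_nearest(your_data[..., 1, 2][..., np.newaxis], scale)[..., 0]
print(restored.shape, np.array_equal(restored[..., 1, 2], expected))
```

Note that only width and height are scaled this way; depth keeps its original size.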
last addition to address scaling the depth as well
Here is my solution:
we'll want to reshape this matrix and expand it, similar to how a 3D matrix is expanded in NumPy, like this:
import numpy as np
a = np.array([np.arange(1, 28), np.arange(1, 28)])  # two flattened 3x3x3 volumes holding 1..27
print(a.reshape([2, 3, 3, 3]))
# copy every value 8 times, then interleave the copies so each voxel becomes a 2x2x2 block
expanded = a.reshape([54, 1]).dot(np.ones([1, 8])).reshape([2, 3, 3, 3, 2, 2, 2]).transpose([0, 1, 6, 2, 5, 3, 4]).reshape([2, 6, 6, 6])
print(expanded)
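As a sanity check, the reshape/transpose trick above should give the same result as repeating every value twice along each of the three spatial axes. A minimal NumPy sketch, using a hypothetical test array named volumes:

```python
import numpy as np

volumes = np.arange(1, 55).reshape([2, 3, 3, 3])

# The reshape/transpose trick: copy every value 8 times, then interleave the
# copies so each voxel becomes a 2x2x2 block.
tricked = (volumes.reshape([54, 1]).dot(np.ones([1, 8]))
           .reshape([2, 3, 3, 3, 2, 2, 2])
           .transpose([0, 1, 6, 2, 5, 3, 4])
           .reshape([2, 6, 6, 6]))

# The same upscaling spelled out with np.repeat along each spatial axis.
repeated = volumes
for axis in (1, 2, 3):
    repeated = np.repeat(repeated, 2, axis=axis)

print(tricked.shape, np.array_equal(tricked, repeated))
```

In other words, this is nearest-neighbor upscaling by a factor of 2 in all three dimensions.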
Here is the tensorflow code
isolate = tf.transpose(yourData, [0, 4, 1, 2, 3])  # [batch_size, n, width, height, depth]
flatten_it_all = tf.reshape(isolate, [batch_size * n * width * height * depth, 1])  # flatten it
expanded_it = flatten_it_all * tf.ones([1, 8])  # 8 copies of every value
prepare_for_transpose = tf.reshape(expanded_it, [batch_size * n, width, height, depth, 2, 2, 2])
transpose_to_align_neighbors = tf.transpose(prepare_for_transpose, [0, 1, 6, 2, 5, 3, 4])
expand_it_all = tf.reshape(transpose_to_align_neighbors, [batch_size, n, width * 2, height * 2, depth * 2])
#### - removing this section because the requirements changed
# do a conv layer here to 'blend' neighbor values like:
# averager = tf.ones([2,2,2,1,1]) * 1. / 8.
# tf.nn.conv3d( expand_it_all , averager , strides=[1,1,1,1,1] , padding="SAME")
# for n = 1. for n = 3, I'll leave it to you.
# then finally reorder and you are done
reorder_dimensions = tf.transpose(expand_it_all,[0,2,3,4,1]) # [batch_size,width*2,height*2,depth*2,n]
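The same sequence of steps can be tried in NumPy to confirm the shapes and the result. This is a sketch with made-up sizes, where each NumPy call mirrors the corresponding tf call above and the comparison against np.repeat is only a check, not part of the method.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, width, height, depth, n = 2, 3, 4, 5, 3
your_data = np.random.rand(batch_size, width, height, depth, n)

isolate = your_data.transpose([0, 4, 1, 2, 3])   # [batch, n, width, height, depth]
flat = isolate.reshape([-1, 1])                  # one value per row
expanded = flat * np.ones([1, 8])                # 8 copies of every value
blocks = expanded.reshape([batch_size * n, width, height, depth, 2, 2, 2])
aligned = blocks.transpose([0, 1, 6, 2, 5, 3, 4])
doubled = aligned.reshape([batch_size, n, width * 2, height * 2, depth * 2])
result = doubled.transpose([0, 2, 3, 4, 1])      # channels last again

# Reference: repeat every value twice along width, height and depth.
expected = your_data
for axis in (1, 2, 3):
    expected = np.repeat(expected, 2, axis=axis)

print(result.shape, np.array_equal(result, expected))
```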
My approach would be to resize the image along two axes. In the code below, I resample along depth and then along width.
import tensorflow as tf

def resize_by_axis(image, dim_1, dim_2, ax, is_grayscale):
    # Resize every 2D slice taken along axis `ax` to [dim_1, dim_2].
    resized_list = []
    if is_grayscale:
        # Each slice is 2D, so add a channel axis before calling resize_images.
        unstack_img_depth_list = [tf.expand_dims(x, 2) for x in tf.unstack(image, axis=ax)]
        for i in unstack_img_depth_list:
            resized_list.append(tf.image.resize_images(i, [dim_1, dim_2], method=0))
        # Restack the slices and drop the channel axis again.
        stack_img = tf.squeeze(tf.stack(resized_list, axis=ax))
        print(stack_img.get_shape())
    else:
        unstack_img_depth_list = tf.unstack(image, axis=ax)
        for i in unstack_img_depth_list:
            resized_list.append(tf.image.resize_images(i, [dim_1, dim_2], method=0))
        stack_img = tf.stack(resized_list, axis=ax)
    return stack_img
resized_along_depth = resize_by_axis(x,50,60,2, True)
resized_along_width = resize_by_axis(resized_along_depth,50,70,1,True)
Here x is the 3D tensor, either grayscale or RGB, and resized_along_width is the final resized tensor. In this example the 3D image is resized to dimensions (50, 60, 70).