How does tf.space_to_depth() work in TensorFlow?
Solution 1
This tf.space_to_depth divides your input into blocks and concatenates them.
In your example the input is 38x38x64 (and I guess the block_size is 2). So the function divides each 38x38 plane into 2x2 blocks and concatenates the values of each block along the depth dimension, which gives your 19x19x256 output.
You just need to divide each of your channels (input) into block_size*block_size patches (each patch has a size of width/block_size x height/block_size) and concatenate all of these patches along depth. Should be pretty straightforward with NumPy.
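The division-and-concatenation described above can be sketched in NumPy with a reshape and a transpose (space_to_depth_np is an illustrative name, not a library function; it assumes NHWC input like TensorFlow):

```python
import numpy as np

def space_to_depth_np(x, block_size):
    # x: (batch, height, width, channels), NHWC like TensorFlow
    b, h, w, c = x.shape
    # split height and width into (block index, position inside block)
    x = x.reshape(b, h // block_size, block_size, w // block_size, block_size, c)
    # bring the two in-block axes next to the channel axis
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # fold the block contents into the depth dimension
    return x.reshape(b, h // block_size, w // block_size, c * block_size * block_size)

x = np.arange(16).reshape(1, 4, 4, 1)
y = space_to_depth_np(x, 2)
print(y.shape)       # (1, 2, 2, 4)
print(y[0, 0, 0])    # the top-left 2x2 block: [0 1 4 5]
```

Each output pixel simply holds the flattened contents of one block, in row-major order, which matches what tf.space_to_depth does for NHWC input.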
Hope it helps.
Solution 2
Conclusion: tf.space_to_depth() only outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.
If you modify your code a little bit (each s.run() call re-evaluates the random tensor, so in your original snippet the two runs sample different values), like this
norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
with tf.Session() as s:
    norm = s.run(norm)
trans = tf.space_to_depth(norm, 2)
with tf.Session() as s:
    trans = s.run(trans)
Then you will have the following results:
Norm
(1, 2, 2, 1)
-0.130227
2.04587
-0.077691
-0.112031
Trans
(1, 1, 1, 4)
-0.130227
2.04587
-0.077691
-0.112031
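To convince yourself that the values really are only moved, not altered, the rearrangement above can be reproduced in plain NumPy (an illustrative sketch using the values printed above):

```python
import numpy as np

# The 1x2x2x1 input from above with block_size=2: the single 2x2 spatial
# block is flattened into 4 channels; every value is preserved.
norm = np.array([-0.130227, 2.04587, -0.077691, -0.112031]).reshape(1, 2, 2, 1)
trans = norm.reshape(1, 1, 1, 4)  # H and W move into the depth dimension
print(trans.shape)    # (1, 1, 1, 4)
print(trans.ravel())  # same four values, same order
```

Note that a plain reshape suffices only because height and width each equal block_size here; in general a transpose between two reshapes is also needed.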
Hope this can help you.
Solution 3
Using the split and stack functions along with permute in PyTorch gives us the same result as space_to_depth in TensorFlow. Here is the code in PyTorch. Assume that the input is in BHWC format.
Based on block_size and the input shape, we can calculate the output shape. First, it splits the input on the "width" dimension (dimension #2) into chunks of block_size. The result of this operation is a list of d_width pieces; it's just like cutting a cake (by block_size) into d_width pieces. Then for each piece, you reshape it so it has the correct output height and output depth (channels). Finally, we stack those pieces together and permute the result back into BHWC order.
Hope it helps.
import torch

def space_to_depth(input, block_size):
    block_size_sq = block_size * block_size
    (batch_size, s_height, s_width, s_depth) = input.size()
    d_depth = s_depth * block_size_sq
    d_width = int(s_width / block_size)
    d_height = int(s_height / block_size)
    t_1 = input.split(block_size, 2)
    stack = [t_t.contiguous().view(batch_size, d_height, d_depth) for t_t in t_1]
    output = torch.stack(stack, 1)
    output = output.permute(0, 2, 1, 3)
    return output
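As a quick sanity check of this approach (the function below restates the corrected definition so the snippet is self-contained), an ordered 4x4 input shows the first output pixel collecting the top-left 2x2 block, just like tf.space_to_depth:

```python
import torch

def space_to_depth(input, block_size):
    # input: (batch, height, width, channels), BHWC
    block_size_sq = block_size * block_size
    batch_size, s_height, s_width, s_depth = input.size()
    d_depth = s_depth * block_size_sq
    d_height = s_height // block_size
    t_1 = input.split(block_size, 2)            # d_width pieces of width block_size
    stack = [t_t.contiguous().view(batch_size, d_height, d_depth) for t_t in t_1]
    output = torch.stack(stack, 1)              # (B, d_width, d_height, d_depth)
    return output.permute(0, 2, 1, 3)           # back to (B, d_height, d_width, d_depth)

x = torch.arange(16.).reshape(1, 4, 4, 1)  # BHWC
y = space_to_depth(x, 2)
print(y.shape)      # torch.Size([1, 2, 2, 4])
print(y[0, 0, 0])   # the top-left 2x2 block: 0, 1, 4, 5
```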
Solution 4
A good reference for PyTorch is the implementation of the PixelShuffle module here. It shows the implementation of something equivalent to TensorFlow's depth_to_space. Based on that, we can implement pixel_shuffle with a scaling factor less than 1, which would be like space_to_depth. E.g., downscale_factor=0.5 is like space_to_depth with block_size=2.
def pixel_shuffle_down(input, downscale_factor):
    batch_size, channels, in_height, in_width = input.size()
    # cast to int so view() receives integer sizes
    out_channels = int(channels / (downscale_factor ** 2))
    block_size = int(1 / downscale_factor)
    out_height = int(in_height * downscale_factor)
    out_width = int(in_width * downscale_factor)
    input_view = input.contiguous().view(
        batch_size, channels, out_height, block_size, out_width, block_size)
    shuffle_out = input_view.permute(0, 1, 3, 5, 2, 4).contiguous()
    return shuffle_out.view(batch_size, out_channels, out_height, out_width)
Note: I haven't verified this implementation yet and I'm not sure if it's exactly the inverse of pixel_shuffle, but this is the basic idea. I've also opened an issue on the PyTorch GitHub about this here. In NumPy the equivalent code would use reshape and transpose instead of view and permute, respectively.
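That NumPy equivalent can be sketched as follows (pixel_shuffle_down_np is an illustrative name; it assumes channels-first NCHW layout, as PyTorch's PixelShuffle does, and a downscale_factor of 1/block_size):

```python
import numpy as np

def pixel_shuffle_down_np(x, downscale_factor):
    # x: (batch, channels, height, width), NCHW as in PyTorch
    b, c, h, w = x.shape
    bs = int(round(1 / downscale_factor))
    # split H and W into (block index, position inside block)
    x = x.reshape(b, c, h // bs, bs, w // bs, bs)
    # mirrors the permute(0, 1, 3, 5, 2, 4) above
    x = x.transpose(0, 1, 3, 5, 2, 4)
    return x.reshape(b, c * bs * bs, h // bs, w // bs)

x = np.arange(16).reshape(1, 1, 4, 4)
y = pixel_shuffle_down_np(x, 0.5)
print(y.shape)        # (1, 4, 2, 2)
print(y[0, :, 0, 0])  # top-left 2x2 block: [0 1 4 5]
```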
Moahammad mehdi
Updated on June 09, 2022
Comments
-
Moahammad mehdi almost 2 years
I am a PyTorch user. I have a pretrained model in TensorFlow and I would like to transfer it to PyTorch. In one part of the model architecture, I mean in the TensorFlow-defined model, there is a function tf.space_to_depth which transforms an input of size (None, 38, 38, 64) to (None, 19, 19, 256). (https://www.tensorflow.org/api_docs/python/tf/space_to_depth) is the doc of this function. But I could not understand what this function actually does. Could you please provide some NumPy code to illustrate it for me?
Actually I would like to make an exact similar layer in pytorch.
Some TensorFlow code reveals another secret. Here is the code:
import numpy as np
import tensorflow as tf

norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
trans = tf.space_to_depth(norm, 2)
with tf.Session() as s:
    norm = s.run(norm)
    trans = s.run(trans)
print("Norm")
print(norm.shape)
for index, value in np.ndenumerate(norm):
    print(value)
print("Trans")
print(trans.shape)
for index, value in np.ndenumerate(trans):
    print(value)
And here is the output:
Norm
(1, 2, 2, 1)
0.695261
0.455764
1.04699
-0.237587
Trans
(1, 1, 1, 4)
1.01139
0.898777
0.210135
2.36742
As you can see above, in addition to the data reshaping, the tensor values have changed!
-
Moahammad mehdi almost 7 years: Thanks for your answer. But I think some data shuffling has been done! Did you see it in the docs? I think it is more than division!
-
Moahammad mehdi almost 7 years: Thanks for your response!
-
A. Piro almost 7 years: I didn't see any shuffling in the TensorFlow doc (tensorflow.org/api_docs/python/tf/space_to_depth) .. but maybe there is :)
-
Moahammad mehdi almost 7 years: Great conclusion. Maybe adding some information about data shuffling in this operation would be nice!
-
Moahammad mehdi almost 7 years: Hi again, thanks for your response. Could you please provide some NumPy code to do this division? I am a real novice in TensorFlow and NumPy. I would like to get the exact same result as tf.space_to_depth, please!
-
Abhay about 4 years: Can you explain why someone would need such an operation?