How does tf.space_to_depth() work in TensorFlow?

Solution 1

tf.space_to_depth divides your input into blocks and concatenates them.

In your example the input is 38x38x64 (and I guess block_size is 2). So the function divides your input into 4 (block_size x block_size) patches and concatenates them along the depth dimension, which gives your 19x19x256 output.

You just need to divide each of your channels (input) into block_size*block_size patches (each patch has a size of width/block_size x height/block_size) and concatenate all of these patches along the depth dimension. This should be pretty straightforward with numpy, as sketched below.
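
For example, here is a minimal numpy sketch of this rearrangement (my own helper, assuming an NHWC input layout like TensorFlow's):

import numpy as np

def space_to_depth_np(x, block_size):
    # x has shape (batch, height, width, channels), NHWC like TensorFlow
    b, h, w, c = x.shape
    # split height and width each into (blocks, block_size) pairs of axes
    x = x.reshape(b, h // block_size, block_size, w // block_size, block_size, c)
    # move the two within-block axes next to the channel axis
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # fold block_size * block_size * c values into the new depth dimension
    return x.reshape(b, h // block_size, w // block_size, block_size * block_size * c)

x = np.arange(4).reshape(1, 2, 2, 1)
print(space_to_depth_np(x, 2))  # [[[[0 1 2 3]]]] with shape (1, 1, 1, 4)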

Hope it helps.

Solution 2

Conclusion: tf.space_to_depth() only outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.

If you modify your code a little bit, like this (note that every Session.run call re-evaluates tf.random_normal, so in your original code norm and trans were computed from two different random draws):

norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)

# evaluate the random op once and keep the resulting array fixed
with tf.Session() as s:
    norm = s.run(norm)

# space_to_depth now operates on exactly those values
trans = tf.space_to_depth(norm, 2)

with tf.Session() as s:
    trans = s.run(trans)

Then you will have the following results:

Norm
(1, 2, 2, 1)
-0.130227
2.04587
-0.077691
-0.112031
Trans
(1, 1, 1, 4)
-0.130227
2.04587
-0.077691
-0.112031

Hope this can help you.

Solution 3

Using the split and stack functions along with permute in PyTorch gives us the same result as space_to_depth in TensorFlow. Here is the code in PyTorch. Assume that the input is in BHWC format.

Based on block_size and the input shape, we can calculate the output shape. First, it splits the input on the "width" dimension (dimension #2) into chunks of block_size. The result of this operation is a list of length d_width; it's just like cutting a cake (by block_size) into d_width pieces. Then, for each piece, you reshape it so it has the correct output height and output depth (channels). Finally, we stack those pieces together and perform a permutation.

Hope it helps.

import torch

def space_to_depth(input, block_size):
    block_size_sq = block_size * block_size
    (batch_size, s_height, s_width, s_depth) = input.size()
    d_depth = s_depth * block_size_sq
    d_width = int(s_width / block_size)
    d_height = int(s_height / block_size)
    # split along the width dimension into d_width pieces of width block_size
    t_1 = input.split(block_size, 2)
    # reshape each piece to the output height and the output depth
    stack = [t_t.contiguous().view(batch_size, d_height, d_depth) for t_t in t_1]
    # stack the pieces along a new dimension, then reorder back to BHWC
    output = torch.stack(stack, 1)
    output = output.permute(0, 2, 1, 3)
    return output
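
A quick sanity check (my own example, using a 4x4 single-channel BHWC input):

x = torch.arange(16.).view(1, 4, 4, 1)  # BHWC
y = space_to_depth(x, 2)
print(y.shape)      # torch.Size([1, 2, 2, 4])
print(y[0, 0, 0])   # tensor([0., 1., 4., 5.]) -- the top-left 2x2 block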

Solution 4

A good reference for PyTorch is the implementation of the PixelShuffle module here. This shows the implementation of something equivalent to TensorFlow's depth_to_space. Based on that, we can implement pixel_shuffle with a scaling factor less than 1, which would be like space_to_depth. E.g., downscale_factor=0.5 is like space_to_depth with block_size=2.

import torch

def pixel_shuffle_down(input, downscale_factor):
    # input is NCHW; downscale_factor < 1, e.g. 0.5 behaves like block_size 2
    batch_size, channels, in_height, in_width = input.size()
    out_channels = int(channels / (downscale_factor ** 2))
    block_size = int(1 / downscale_factor)

    out_height = int(in_height * downscale_factor)
    out_width = int(in_width * downscale_factor)

    # expose each block_size x block_size block as two extra axes
    input_view = input.contiguous().view(
        batch_size, channels, out_height, block_size, out_width, block_size)

    # move the two block axes next to the channel axis
    shuffle_out = input_view.permute(0, 1, 3, 5, 2, 4).contiguous()
    return shuffle_out.view(batch_size, out_channels, out_height, out_width)

Note: I haven't verified this implementation yet and I'm not sure if it's exactly the inverse of pixel_shuffle, but this is the basic idea. I've also opened an issue on the PyTorch Github about this here. In NumPy the equivalent code would use reshape and transpose instead of view and permute respectively.
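
As a quick shape check (my own example; the int casts above are what let the view calls receive integer sizes):

x = torch.randn(1, 1, 4, 4)       # NCHW
y = pixel_shuffle_down(x, 0.5)    # like space_to_depth with block_size=2
print(y.shape)                    # torch.Size([1, 4, 2, 2])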

Comments

  • Moahammad mehdi
    Moahammad mehdi almost 2 years

    I am a PyTorch user. I have got a pretrained model in tensorflow and I would like to transfer it into PyTorch. In one part of the model architecture, I mean in the tensorflow-defined model, there is a function tf.space_to_depth which transforms an input of size (None, 38, 38, 64) to (None, 19, 19, 256). (https://www.tensorflow.org/api_docs/python/tf/space_to_depth) is the doc of this function. But I could not understand what this function actually does. Could you please provide some numpy code to illustrate it for me?

    Actually I would like to make an exact similar layer in pytorch.

    Some code in tensorflow reveals another secret. Here is the code:

    import numpy as np
    import tensorflow as tf
    
    norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
    trans = tf.space_to_depth(norm,2)
    
    with tf.Session() as s:
        norm = s.run(norm)
        trans = s.run(trans)
    
    
    
    print("Norm")
    print(norm.shape)
    for index,value in np.ndenumerate(norm):
        print(value)
    
    print("Trans")
    print(trans.shape)
    for index,value in np.ndenumerate(trans):
        print(value)
    

    And here is the output:

    Norm
    (1, 2, 2, 1)
    0.695261
    0.455764
    1.04699
    -0.237587
    Trans
    (1, 1, 1, 4)
    1.01139
    0.898777
    0.210135
    2.36742
    

    As you can see above, in addition to the data reshaping, the tensor values have changed!

  • Moahammad mehdi
    Moahammad mehdi almost 7 years
    Thanks for your answer. But I think some data shuffling has been done! Did you see it in the docs? I think it is more than division!
  • Moahammad mehdi
    Moahammad mehdi almost 7 years
    Thanks for your response!
  • A. Piro
    A. Piro almost 7 years
    I didn't see any shuffling in the tensorflow doc (tensorflow.org/api_docs/python/tf/space_to_depth)... but maybe there is :)
  • Moahammad mehdi
    Moahammad mehdi almost 7 years
    Great conclusion. Maybe adding some information about the data shuffling in this operation would be nice!
  • Moahammad mehdi
    Moahammad mehdi almost 7 years
    Hi again, thanks for your response. Could you please provide some numpy code to do this division? I am really a novice in tensorflow and numpy. I would like to have the exact result as tf.space_to_depth, please!
  • Abhay
    Abhay about 4 years
    Can you explain why someone would need such an operation?