TensorFlow 2.0: How to get trainable variables from tf.keras.layers layers, like Conv2D or Dense


Solution 1

Ok, so I think I found the problem.

The trainable variables did not exist until the layer object had been used. Keras layers create their weights lazily, when the layer is first called on an input (or when build() is called with an input shape). After running my forward pass I could retrieve attributes of the tf.keras.layers.Layer object such as trainable_variables and weights.

Before the forward pass, however, the same attributes returned an empty list. To make this clearer:

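with tf.GradientTape() as tape:
    print(dense_layers[0].trainable_variables)  # empty list: the layer is not built yet
    self.forward_pass(X)                        # first call builds the layers
    self.compute_loss()
    print(dense_layers[0].trainable_variables)  # now contains the kernel and bias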

In the code above, trainable_variables is an empty list before self.forward_pass executes. Right after it, the list contains the layer's kernel and bias as tf.Variable objects (their values are displayed as NumPy arrays when printed).
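The behaviour described above can be reproduced with a single layer. The following is a minimal sketch, assuming TensorFlow 2.x; the layer sizes and the names layer, other, vars_before and vars_after are made up for illustration. It also shows build(), which creates the variables from an input shape without running a forward pass.

```python
import tensorflow as tf

# Hypothetical layer sizes, chosen only for illustration.
layer = tf.keras.layers.Dense(4)

# The layer has not seen an input shape yet, so no variables exist.
vars_before = list(layer.trainable_variables)

# Accessing the kernel directly fails for the same reason.
try:
    layer.kernel
    kernel_available = True
except AttributeError:
    kernel_available = False

# The first call builds the layer: kernel and bias are created here.
layer(tf.zeros((1, 8)))
vars_after = list(layer.trainable_variables)

# Alternative: build explicitly from an input shape, no data needed.
other = tf.keras.layers.Dense(4)
other.build((None, 8))

print(len(vars_before), len(vars_after), len(other.trainable_variables))
```

Calling build() is useful when the variables have to be collected before the first training step, for example to hand them to an optimizer up front.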

Solution 2

Let me start with a simple model as an example, to make this easier to explain.

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(1, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(1, (3, 3), activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(3, activation='softmax'))

When using a gradient tape, you pass model.trainable_weights, which returns the kernels and biases of the entire model, to tape.gradient, and then use the optimizer to apply the resulting gradients.

If you print model.trainable_weights, you get the following output. I replaced the actual values with [...] for readability.
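As a rough sketch of that step, assuming TensorFlow 2.x: the smaller model, the random batch, the loss function and the SGD optimizer below are illustrative choices, not part of the original answer. The input shape is declared with tf.keras.Input rather than the input_shape argument, which is an equivalent way to let the model build its variables up front.

```python
import tensorflow as tf

# A reduced stand-in for the model above (illustrative only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(1, (3, 3), activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Assumed loss and optimizer; any Keras loss/optimizer pair would do.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# Made-up batch: 8 random images with integer class labels.
x = tf.random.normal((8, 32, 32, 3))
y = tf.constant([0, 1, 2, 0, 1, 2, 0, 1])

# Record the forward pass so gradients can be computed afterwards.
with tf.GradientTape() as tape:
    predictions = model(x, training=True)
    loss = loss_fn(y, predictions)

# One gradient per trainable variable, in the same order as the list.
gradients = tape.gradient(loss, model.trainable_weights)
optimizer.apply_gradients(zip(gradients, model.trainable_weights))
```

Because the input shape is known when the model is constructed, model.trainable_weights is already populated before the first forward pass.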

[<tf.Variable 'conv2d/kernel:0' shape=(3, 3, 3, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d_1/kernel:0' shape=(3, 3, 1, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d_1/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense/kernel:0' shape=(169, 10) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense/bias:0' shape=(10,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_1/kernel:0' shape=(10, 10) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_2/kernel:0' shape=(10, 3) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_2/bias:0' shape=(3,) dtype=float32, numpy=array([...], dtype=float32)>]

As you can see, the kernel and bias of every layer are returned in a single list, in layer order. This is the list you pass to the gradient tape. If you want to train only a specific layer, you can slice the list to get just the weights you want.

model.trainable_weights[0:2] # Get the first conv layer weights at index 0 and bias at index 1.

This outputs only the first conv layer's kernel and bias.

[<tf.Variable 'conv2d/kernel:0' shape=(3, 3, 3, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>]
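
A sliced list can be passed to the tape and the optimizer in the same way as the full list. The sketch below assumes TensorFlow 2.x; the reduced model, random data, loss and optimizer are illustrative assumptions. It updates only the first conv layer and leaves the Dense layer untouched.

```python
import numpy as np
import tensorflow as tf

# A reduced stand-in model (illustrative only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(1, (3, 3), activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# Kernel and bias of the first conv layer only.
first_conv_vars = model.trainable_weights[0:2]
# Selecting by layer gives the same two variables without relying on
# list positions: model.layers[0].trainable_weights

# Copy of the Dense kernel, to check later that it was not updated.
dense_kernel_before = model.layers[-1].get_weights()[0]

# Made-up batch: 8 random images with integer class labels.
x = tf.random.normal((8, 32, 32, 3))
y = tf.constant([0, 1, 2, 0, 1, 2, 0, 1])

with tf.GradientTape() as tape:
    loss = loss_fn(y, model(x, training=True))

# Gradients are computed and applied for the selected variables only.
gradients = tape.gradient(loss, first_conv_vars)
optimizer.apply_gradients(zip(gradients, first_conv_vars))

dense_kernel_after = model.layers[-1].get_weights()[0]
```

Slicing by index depends on the order of the layers in the model, so selecting through the layer object is usually the less fragile option.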
Author by

MattSt

Updated on June 14, 2022

Comments

  • MattSt, almost 2 years

    I have been trying to get the trainable variables from my layers and can't figure out a way to make it work. So here is what I have tried:

    I have tried accessing the kernel and bias attributes of the Dense or Conv2D object directly, but to no avail. The error I get is "Dense object has no attribute 'kernel'".

    trainable_variables.append(conv_layer.kernel)
    trainable_variables.append(conv_layer.bias)
    

    Similarly, I have tried using the attribute "trainable_variables" in the following way:

    trainable_variables.extend(conv_layer.trainable_variables)
    

    From what I know, this is supposed to return a list of two variables, the weight and the bias. However, what I get is an empty list.

    Any idea how to get the variables from layers in TensorFlow 2.0? I want to be able to later feed those variables to an optimizer, in a way similar to the following:

    gradients = tape.gradient(loss, trainable_variables)
    optimizer.apply_gradients(zip(gradients, trainable_variables))
    

    Edit: Here is part of my current code to serve as an example and help answer the question (I hope it is readable):

    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Conv2D, Conv2DTranspose, Reshape, Flatten
    
    ... 
    
    class Network:
        def __init__(self, params):
            weights_initializer = tf.initializers.GlorotUniform(seed=params["seed"])
            bias_initializer = tf.initializers.Constant(0.0)
    
            self.trainable_variables = []
    
            self.conv_layers = []
            self.conv_activations = []
            self.create_conv_layers(params, weights_initializer, bias_initializer)
    
            self.flatten_layer = Flatten()
    
    
            self.dense_layers = []
            self.dense_activations = []
            self.create_dense_layers(params, weights_initializer, bias_initializer)
    
            self.output_layer = Dense(1, kernel_initializer=weights_initializer, bias_initializer=bias_initializer)
            self.trainable_variables.append(self.output_layer.kernel)
            self.trainable_variables.append(self.output_layer.bias)
    
        def create_conv_layers(self, params, weight_init, bias_init):
            nconv = len(params['stride'])
            for i in range(nconv):
                conv_layer = Conv2D(filters=params["nfilter"][i],
                                    kernel_size=params["shape"][i], kernel_initializer=weight_init,
                                    kernel_regularizer=spectral_norm,
                                    use_bias=True, bias_initializer=bias_init,
                                    strides=params["stride"][i],
                                    padding="same", )
                self.conv_layers.append(conv_layer)
                self.trainable_variables.append(conv_layer.kernel)
                self.trainable_variables.append(conv_layer.bias)
                self.conv_activations.append(params["activation"])
    

    As you can see, I am trying to gather all my trainable variables into a list attribute called trainable_variables. However, as I mentioned, this code does not work because I get an error when accessing the kernel and bias attributes of those layer objects.
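
One possible way to restructure a class like this is to stop collecting variables in __init__ and read them from the layers on demand, once they have been built. This is only a sketch, assuming TensorFlow 2.x: the constructor arguments n_filters and n_units are hypothetical stand-ins for the params dictionary, and the initializers and the regularizer from the code above are left out for brevity.

```python
import tensorflow as tf


class Network:
    def __init__(self, n_filters, n_units):
        # n_filters and n_units are hypothetical simplifications of `params`.
        self.conv_layers = [
            tf.keras.layers.Conv2D(f, 3, padding="same", activation="relu")
            for f in n_filters
        ]
        self.flatten_layer = tf.keras.layers.Flatten()
        self.dense_layers = [
            tf.keras.layers.Dense(u, activation="relu") for u in n_units
        ]
        self.output_layer = tf.keras.layers.Dense(1)

    @property
    def trainable_variables(self):
        # Read from the layers on every access, so variables created by
        # the first forward pass are picked up automatically.
        layers = [*self.conv_layers, *self.dense_layers, self.output_layer]
        return [v for layer in layers for v in layer.trainable_variables]

    def forward_pass(self, x):
        for conv in self.conv_layers:
            x = conv(x)
        x = self.flatten_layer(x)
        for dense in self.dense_layers:
            x = dense(x)
        return self.output_layer(x)


net = Network(n_filters=[4, 8], n_units=[16])
n_before = len(net.trainable_variables)  # layers are not built yet

# Made-up input batch and a placeholder loss, for illustration only.
x = tf.random.normal((2, 16, 16, 3))
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(net.forward_pass(x)))

# After the forward pass every layer has a kernel and a bias.
variables = net.trainable_variables
gradients = tape.gradient(loss, variables)
```

The resulting gradients and variables can then be passed to optimizer.apply_gradients(zip(gradients, variables)), as in the snippet in the question.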