@tf.function ValueError: Creating variables on a non-first call to a function decorated with tf.function, unable to understand behaviour

With tf.function you're converting the body of the decorated function: TensorFlow traces your eager code and compiles it into its graph representation.
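
You can see this trace-then-run split directly by putting a plain Python side effect inside a decorated function; a minimal sketch, assuming TensorFlow 2.x defaults:

import tensorflow as tf

@tf.function
def add(a, b):
    print("tracing")  # plain Python: executes only while the function is being traced
    return a + b

add(tf.constant(1), tf.constant(2))  # prints "tracing" during the first trace
add(tf.constant(3), tf.constant(4))  # same input signature: the cached graph runs, no print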

Variables, however, are special objects. In TensorFlow 1.x (graph mode) you defined a variable only once and then used/updated it for the lifetime of the graph.

In TensorFlow 2.0, under pure eager execution, you can declare and re-use the same variable more than once: a tf.Variable in eager mode is just a plain Python object that is destroyed as soon as the function ends and the variable goes out of scope.
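
For instance, the following runs fine under pure eager execution (no decorator), because each call creates a fresh variable and the previous one is garbage-collected once it goes out of scope:

def leaky():
    b = tf.Variable(12.)  # just a Python object in eager mode
    return b + 1.

leaky()  # OK
leaky()  # also OK: the variable from the first call is already gone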

In order to make TensorFlow correctly convert a function that creates state (i.e., that uses Variables), you have to break the function scope and declare the variables outside of the function.

In short, if you have a function that works correctly in eager mode, like:

def f():
    a = tf.constant([[10., 10.], [11., 1.]])
    x = tf.constant([[1., 0.], [0., 1.]])
    b = tf.Variable(12.)  # a new variable on every call: fine in eager mode
    y = tf.matmul(a, x) + b
    return y

You have to change its structure to something like:

b = None

@tf.function
def f():
    a = tf.constant([[10., 10.], [11., 1.]])
    x = tf.constant([[1., 0.], [0., 1.]])
    global b
    if b is None:
        # The variable is created only once, during the first trace
        b = tf.Variable(12.)
    y = tf.matmul(a, x) + b
    print("PRINT: ", y)        # Python print: runs only while tracing
    tf.print("TF-PRINT: ", y)  # graph op: runs on every call
    return y

f()

in order to make it work correctly with the tf.function decorator.
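
Calling f() twice also shows the tracing/execution split: "PRINT: " appears only once, while the graph is being traced, whereas "TF-PRINT: " appears on every call.

If you'd rather avoid the global, the same create-once guard can live on an object instead; here is a sketch of that variant using tf.Module (the class name and layout here are illustrative):

class F(tf.Module):
    def __init__(self):
        super().__init__()
        self.b = None  # placeholder: the variable is created on the first trace

    @tf.function
    def __call__(self):
        a = tf.constant([[10., 10.], [11., 1.]])
        x = tf.constant([[1., 0.], [0., 1.]])
        if self.b is None:
            self.b = tf.Variable(12.)
        return tf.matmul(a, x) + self.b

f = F()
f()  # first call traces and creates the variable
f()  # later calls reuse it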

I covered this scenario (and others) in a series of blog posts: the first part analyzes this behavior in the section Handling states breaking the function scope (although I suggest reading it from the beginning, and reading parts 2 and 3 as well).

Comments

  • drongo, almost 2 years

    I would like to know why this function:

    @tf.function
    def train(self, TargetNet, epsilon):
        if len(self.experience['s']) < self.min_experiences:
            return 0
        ids = np.random.randint(low=0, high=len(self.replay_buffer['s']), size=self.batch_size)
        states = np.asarray([self.experience['s'][i] for i in ids])
        actions = np.asarray([self.experience['a'][i] for i in ids])
        rewards = np.asarray([self.experience['r'][i] for i in ids])
        next_states = np.asarray([self.experience['s1'][i] for i in ids])
        dones = np.asarray([self.experience['done'][i] for i in ids])
        q_next_actions = self.get_action(next_states, epsilon)
        q_value_next = TargetNet.predict(next_states)
        q_value_next = tf.gather_nd(q_value_next, tf.stack((tf.range(self.batch_size), q_next_actions), axis=1))
        targets = tf.where(dones, rewards, rewards + self.gamma * q_value_next)
    
        with tf.GradientTape() as tape:
            estimates = tf.math.reduce_sum(self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
            loss = tf.math.reduce_sum(tf.square(estimates - targets))
        variables = self.model.trainable_variables
        gradients = tape.gradient(loss, variables)
        self.optimizer.apply_gradients(zip(gradients, variables))
    

gives ValueError: Creating variables on a non-first call to a function decorated with tf.function, whereas this code, which is very similar:

    @tf.function
    def train(self, TargetNet):
        if len(self.experience['s']) < self.min_experiences:
            return 0
        ids = np.random.randint(low=0, high=len(self.experience['s']), size=self.batch_size)
        states = np.asarray([self.experience['s'][i] for i in ids])
        actions = np.asarray([self.experience['a'][i] for i in ids])
        rewards = np.asarray([self.experience['r'][i] for i in ids])
        states_next = np.asarray([self.experience['s2'][i] for i in ids])
        dones = np.asarray([self.experience['done'][i] for i in ids])
        value_next = np.max(TargetNet.predict(states_next), axis=1)
        actual_values = np.where(dones, rewards, rewards+self.gamma*value_next)
    
        with tf.GradientTape() as tape:
            selected_action_values = tf.math.reduce_sum(
                self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
            loss = tf.math.reduce_sum(tf.square(actual_values - selected_action_values))
        variables = self.model.trainable_variables
        gradients = tape.gradient(loss, variables)
        self.optimizer.apply_gradients(zip(gradients, variables))
    

does not throw an error. Please help me understand why.

EDIT: I removed the parameter epsilon from the function and it works. Is it because the @tf.function decorator is valid only for single-argument functions?