InvalidArgumentError : ConcatOp : Dimensions of inputs should match
The error is coming from LSTMCell.call
call method. There we are trying to tf.concat([inputs, h], 1)
meaning that we want to concatenate the next input with the current hidden state before matmul
'ing with the kernel variables matrix. The error is saying that you can't do it because the batch
(0
th) dimensions don't match up - your input is shaped [20,25]
and your hidden state is shaped [30,100]
.
For some reason on your 32nd iteration, or whenever you see the error, the input is not batched to 30
, but only to 20
. This usually happens at the end of your training data when the total number of training examples does not evenly divide your batch size. This hypothesis is also consistent with "When i used smaller batch , it seems the code can run longer" statement.
DiIli
Updated on June 25, 2022Comments
-
DiIli almost 2 years
Tensorflow 1.7 when using dynamic_rnn.It runs fine at first , but at the 32th(it changes when i run the code) step , the error appears. When i used smaller batch , it seems the code can run longer , however the error still poped up .Just cannt figure out what's wrong.
from mapping import * def my_input_fn(features, targets, batch_size=20, shuffle=True, num_epochs=None, sequece_lenth=None): ds = tf.data.Dataset.from_tensor_slices( (features, targets, sequece_lenth)) # warning: 2GB limit ds = ds.batch(batch_size).repeat(num_epochs) if shuffle: ds = ds.shuffle(10000) features, labels, sequence = ds.make_one_shot_iterator().get_next() return features, labels, sequence def lstm_cell(lstm_size=50): return tf.contrib.rnn.BasicLSTMCell(lstm_size) class RnnModel: def __init__(self, batch_size, hidden_units, time_steps, num_features ): self.batch_size = batch_size self.hidden_units = hidden_units stacked_lstm = tf.contrib.rnn.MultiRNNCell( [lstm_cell(i) for i in self.hidden_units]) self.initial_state = stacked_lstm.zero_state(batch_size, tf.float32) self.model = stacked_lstm self.state = self.initial_state self.time_steps = time_steps self.num_features = num_features def loss_mean_squre(self, outputs, targets): pos = tf.add(outputs, tf.ones(self.batch_size)) eve = tf.div(pos, 2) error = tf.subtract(eve, targets) return tf.reduce_mean(tf.square(error)) def train(self, num_steps, learningRate, input_fn, inputs, targets, sequenceLenth): periods = 10 step_per_periods = int(num_steps / periods) input, target, sequence = input_fn(inputs, targets, self.batch_size, shuffle=True, sequece_lenth=sequenceLenth) initial_state = self.model.zero_state(self.batch_size, tf.float32) outputs, state = tf.nn.dynamic_rnn(self.model, input, initial_state=initial_state) loss = self.loss_mean_squre(tf.reshape(outputs, [self.time_steps, self.batch_size])[-1], target) optimizer = tf.train.AdamOptimizer(learning_rate=learningRate) grads_and_vars = optimizer.compute_gradients(loss, self.model.variables) optimizer.apply_gradients(grads_and_vars) init_op = tf.global_variables_initializer() with tf.Session() as sess: for i in range(num_steps): sess.run(init_op) state2, current_loss= sess.run([state, loss]) if i % step_per_periods == 0: print("period " + str(int(i / step_per_periods)) + ":" + str(current_loss)) return self.model, self.state def processFeature(df): df = df.drop('class', 1) features = [] for i in range(len(df["vecs"])): features.append(df["vecs"][i]) aa = pd.Series(features).tolist() # tramsform into list featuresList = [] for i in features: p1 = [] for k in i: p1.append(list(k)) featuresList.append(p1) return featuresList def processTargets(df): selected_features = df[ "class"] processed_features = selected_features.copy() return tf.convert_to_tensor(processed_features.astype(float).tolist()) if __name__ == '__main__': dividNumber = 30 """ some code here to modify my data to input it looks like this: inputs before use input function : [fullLenth, charactorLenth, embeddinglenth] """ model = RnnModel(15, [100, 80, 80, 1], time_steps=dividNumber, num_features=25) model.train(5000, 0.0001, my_input_fn, training_examples, training_targets, sequenceLenth=trainSequenceL)
And error is under here
Traceback (most recent call last): File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1330, in _do_call return fn(*args) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1315, in _run_fn options, feed_dict, fetch_list, target_list, run_metadata) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1423, in _call_tf_sessionrun status, run_metadata) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__ c_api.TF_GetCode(self.status.status)) tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [20,25] vs. shape[1] = [30,100] [[Node: rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnn/while/TensorArrayReadV3, rnn/while/Switch_4:1, rnn/while/rnn/multi_rnn_cell/cell_3/basic_lstm_cell/Const)]] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "D:/programming/mlwords/dnn_gragh.py", line 198, in <module> model.train(5000, 0.0001, my_input_fn, training_examples, training_targets, sequenceLenth=trainSequenceL) File "D:/programming/mlwords/dnn_gragh.py", line 124, in train state2, current_loss, nowAccuracy = sess.run([state, loss, accuracy]) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 908, in run run_metadata_ptr) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1143, in _run feed_dict_tensor, options, run_metadata) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1324, in _do_run run_metadata) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\client\session.py", line 1343, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [20,25] vs. shape[1] = [30,100] [[Node: rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnn/while/TensorArrayReadV3, rnn/while/Switch_4:1, rnn/while/rnn/multi_rnn_cell/cell_3/basic_lstm_cell/Const)]] Caused by op 'rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/concat', defined at: File "D:/programming/mlwords/dnn_gragh.py", line 198, in <module> model.train(5000, 0.0001, my_input_fn, training_examples, training_targets, sequenceLenth=trainSequenceL) File "D:/programming/mlwords/dnn_gragh.py", line 95, in train outputs, state = tf.nn.dynamic_rnn(self.model, input, initial_state=initial_state)#,sequence_length=sequence File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn.py", line 627, in dynamic_rnn dtype=dtype) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn.py", line 824, in _dynamic_rnn_loop swap_memory=swap_memory) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3205, in while_loop result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2943, in BuildLoop pred, body, original_loop_vars, loop_vars, shape_invariants) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2880, in _BuildLoop body_result = body(*packed_vars_for_body) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3181, in <lambda> body = lambda i, lv: (i + 1, orig_body(*lv)) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn.py", line 795, in _time_step (output, new_state) = call_cell() File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn.py", line 781, in <lambda> call_cell = lambda: cell(input_t, state) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 232, in __call__ return super(RNNCell, self).__call__(inputs, state) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\layers\base.py", line 714, in __call__ outputs = self.call(inputs, *args, **kwargs) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 1283, in call cur_inp, new_state = cell(cur_inp, cur_state) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 339, in __call__ *args, **kwargs) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\layers\base.py", line 714, in __call__ outputs = self.call(inputs, *args, **kwargs) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 620, in call array_ops.concat([inputs, h], 1), self._kernel) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1181, in concat return gen_array_ops.concat_v2(values=values, axis=axis, name=name) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1101, in concat_v2 "ConcatV2", values=values, axis=axis, name=name) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\framework\ops.py", line 3309, in create_op op_def=op_def) File "D:\Anaconda3\envs\tensorflow-cpu\lib\site-packages\tensorflow\python\framework\ops.py", line 1669, in __init__ self._traceback = self._graph._extract_stack() # pylint: disable=protected-access InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [20,25] vs. shape[1] = [30,100] [[Node: rnn/while/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnn/while/TensorArrayReadV3, rnn/while/Switch_4:1, rnn/while/rnn/multi_rnn_cell/cell_3/basic_lstm_cell/Const)]]
this is my code used to check my input
def checkData(inputs, targets, sequencelence): batch_size = 20 features, target, sequece = my_input_fn(inputs, targets, batch_size=batch_size, shuffle=True, num_epochs=None, sequece_lenth=sequencelence) with tf.Session() as sess: for i in range(1000): features1, target1, sequece1 = sess.run([features, target, sequece]) assert len(features1) == batch_size for sentence in features1 : assert len(sentence) == 30 for word in sentence: assert len(word) == 25 assert len(target1) == batch_size assert len(sequece1) == batch_size print(target1) print("OK")
-
DiIli about 6 yearsAt first , it can run without error popped up if the batch size is under a threshold like 50 .But after i add a function to calculate the accuracy, it cannot run even with small batch size. now i delete the code again.The code up there is the original type.
-
-
DiIli about 6 yearsThanks to reply. But when I check the input , it is exactly [batch_size, time_steps, embedding_lenth].The check code is attached up there.
-
iga almost 6 yearsI don't see how you call this function, but here are some possible explanations. You call it once where there is still a full batch of data, so everything matches. It seems like in the code you use batch of 15, but in this test function it is 20. It might be that total number of examples is divisible by 20 but not by 15. Here is an issue discussing this and some ways of handling it - github.com/tensorflow/tensorflow/issues/13161
-
Admin over 2 yearsYour answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.