How can I multiply a vector and a matrix in TensorFlow without reshaping?

Solution 1

tf.matmul is defined only for tensors of rank 2 or higher. I'm not sure why, to be honest, since NumPy's dot also accepts a vector and a matrix.

import numpy as np
a = np.array([1, 2, 1])
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

print(np.dot(a, w))
# [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m
print(np.sum(np.expand_dims(a, -1) * w , axis=0))
# equivalent result [2.6, 3]

import tensorflow as tf

a = tf.constant(a, dtype=tf.float64)
w = tf.constant(w)

with tf.Session() as sess:
  # they all produce the same result as numpy above
  print(tf.matmul(tf.expand_dims(a,0), w).eval())
  print((tf.reduce_sum(tf.multiply(tf.expand_dims(a,-1), w), axis=0)).eval())
  print((tf.reduce_sum(tf.multiply(a, tf.transpose(w)), axis=1)).eval())

  # Note tf.multiply is equivalent to "*"
  print((tf.reduce_sum(tf.expand_dims(a,-1) * w, axis=0)).eval())
  print((tf.reduce_sum(a * tf.transpose(w), axis=1)).eval())
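
To see why the expand_dims variant matches the plain product, it can help to look at the intermediate shapes. What follows is only a minimal sketch in NumPy, used here as a stand-in on the assumption that TensorFlow applies the same broadcasting rules to these operations; the names column, product and result are illustrative and do not come from the answer.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])                       # shape (3,)
w = np.array([[0.5, 0.6], [0.7, 0.8], [0.7, 0.8]])  # shape (3, 2)

# Turn the vector into a column so it broadcasts across the columns of w.
column = np.expand_dims(a, -1)   # shape (3, 1)
product = column * w             # shape (3, 2): row i of w scaled by a[i]
result = product.sum(axis=0)     # shape (2,): sum over the shared axis n

print(column.shape, product.shape, result.shape)
print(np.allclose(result, np.dot(a, w)))
```

Without the extra axis, shapes (3,) and (3, 2) are aligned on their trailing dimensions (3 against 2), which is why a plain a * w is rejected.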

Solution 2

tf.einsum lets you do exactly what you need in a concise and intuitive form:

with tf.Session() as sess:
    print(tf.einsum('n,nm->m', a, w).eval())
    # [ 2.6  3. ] 

You even get to spell out the shape comment, n x (n, m) -> m, in the call itself. In my opinion it is more readable and intuitive.
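
For readers new to the subscript notation, the string 'n,nm->m' can be read as "sum over n, keep m". The sketch below spells that out with explicit loops and compares it with np.einsum. NumPy is used as a stand-in on the assumption that tf.einsum interprets this subscript string the same way, and vec_mat_loops is a hypothetical helper written only for this illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])
w = np.array([[0.5, 0.6], [0.7, 0.8], [0.7, 0.8]])


def vec_mat_loops(vec, mat):
    """Spell out 'n,nm->m' by hand: sum over n, keep m."""
    n_rows, n_cols = mat.shape
    out = np.zeros(n_cols)
    for m in range(n_cols):
        for n in range(n_rows):
            out[m] += vec[n] * mat[n, m]
    return out


looped = vec_mat_loops(a, w)
via_einsum = np.einsum('n,nm->m', a, w)
print(looped, via_einsum)
```

An index that appears in the inputs but not after the arrow (here n) is the one that gets summed away.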

My favorite use case is when you want to multiply a batch of matrices with a weight vector:

n_in = 10
n_step = 6
input = tf.placeholder(dtype=tf.float32, shape=(None, n_step, n_in))
weights = tf.Variable(tf.truncated_normal((n_in, 1), stddev=1.0/np.sqrt(n_in)))
Y_predict = tf.einsum('ijk,kl->ijl', input, weights)
print(Y_predict.get_shape())
# (?, 6, 1)

So you can easily apply the weights across all batches with no transformations or duplication, which you cannot do by expanding dimensions as in the other answer. You also avoid the tf.matmul requirement that the batch and other outer dimensions match:

The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication arguments, and any further outer dimensions match.
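
As a rough illustration of what the einsum call saves you from writing, the sketch below compares 'ijk,kl->ijl' with a flatten, multiply, restore route. It uses NumPy as a stand-in and assumes tf.einsum contracts the same way. The batch size of 4 and the random inputs are arbitrary choices for the demo. Note that np.matmul broadcasts by itself, so the reshape route here is just one possible alternative, not a reproduction of tf.matmul's behaviour.

```python
import numpy as np

n_in, n_step, batch = 10, 6, 4   # batch size 4 is an arbitrary demo value
rng = np.random.default_rng(0)
inputs = rng.normal(size=(batch, n_step, n_in))
weights = rng.normal(size=(n_in, 1))

# einsum contracts k directly and leaves the batch (i) and step (j) axes alone.
y_einsum = np.einsum('ijk,kl->ijl', inputs, weights)

# Reshape route: flatten batch and step, do one 2-D product, restore the shape.
flat = inputs.reshape(-1, n_in)                         # (batch * n_step, n_in)
y_reshape = (flat @ weights).reshape(batch, n_step, 1)

print(y_einsum.shape)
print(np.allclose(y_einsum, y_reshape))
```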

Solution 3

You can use tf.tensordot and set axes=1. For the simple operation of a vector times a matrix, this is a bit cleaner than tf.einsum:

tf.tensordot(a, w, 1)
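
A short sketch of what axes=1 means, again in NumPy and assuming tf.tensordot treats the axes argument the same way: an integer N contracts the last N axes of the first argument with the first N axes of the second. The names short_form and explicit_form are illustrative only.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])
w = np.array([[0.5, 0.6], [0.7, 0.8], [0.7, 0.8]])

# axes=1 contracts the last axis of a with the first axis of w.
short_form = np.tensordot(a, w, axes=1)
# The same contraction with the axes named explicitly.
explicit_form = np.tensordot(a, w, axes=([0], [0]))

print(short_form, explicit_form)
```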

Author: Mr_and_Mrs_D

Updated on July 05, 2022

Comments

  • Mr_and_Mrs_D
    Mr_and_Mrs_D almost 2 years

    This:

    import numpy as np
    a = np.array([1, 2, 1])
    w = np.array([[.5, .6], [.7, .8], [.7, .8]])
    
    print(np.dot(a, w))
    # [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m
    
    import tensorflow as tf
    
    a = tf.constant(a, dtype=tf.float64)
    w = tf.constant(w)
    
    with tf.Session() as sess:
        print(tf.matmul(a, w).eval())
    

    results in:

    C:\_\Python35\python.exe C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py
    [ 2.6  3. ]
    # bunch of errors in windows...
    Traceback (most recent call last):
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 671, in _call_cpp_shape_fn_impl
        input_tensors_as_shapes, status)
      File "C:\_\Python35\lib\contextlib.py", line 66, in __exit__
        next(self.gen)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py", line 14, in <module>
        print(tf.matmul(a, w).eval())
      File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1765, in matmul
        a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1454, in _mat_mul
        transpose_b=transpose_b, name=name)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
        op_def=op_def)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2329, in create_op
        set_shapes_for_outputs(ret)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1667, in call_with_requiring
        return call_cpp_shape_fn(op, require_shape_fn=True)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
        debug_python_shape_fn, require_shape_fn)
      File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
        raise ValueError(err.message)
    ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].
    
    Process finished with exit code 1
    

    (not sure why the same exception is raised inside its handling)

    The solution suggested in "Tensorflow exception with matmul" is to reshape the vector into a matrix, but this leads to needlessly complicated code. Is there still no other way to multiply a vector with a matrix?

    Incidentally, using expand_dims (as suggested in the link above) with default arguments raises a ValueError. That is not mentioned in the docs and defeats the purpose of having a default argument.

  • Mr_and_Mrs_D
    Mr_and_Mrs_D about 7 years
    Oh thanks - well, it's not matrix multiplication then ;) Are those two equivalent? Could you explain a bit what reduce_sum does? Sorry, too much fighting with tf today, I'm dizzy.
  • Steven
    Steven about 7 years
    So the "*" multiplication operation supports regular NumPy broadcasting semantics (it might be missing some fancy indexing stuff). In the above, it multiplies the vector a across each vector in w. Then reduce_sum collapses a dimension by summing along it, so we go from a * w -> reduce_sum(product) -> ans; ([n] * [nxm]) -> [nxm] -> [m]. The axis argument determines which axis to sum over; in this case we want 0 to get our final result of dimension m.
  • Mr_and_Mrs_D
    Mr_and_Mrs_D about 7 years
    Nope, I'm sorry -> with the code in the question, print(tf.reduce_sum(a * w, axis=0).eval()) results in ValueError: Dimensions must be equal, but are 3 and 2 for 'mul' (op: 'Mul') with input shapes: [3], [3,2].
  • Steven
    Steven about 7 years
    Sorry about the mixup in broadcasting. I've fixed the code and provided both examples that produce the same results in numpy and tf.
  • Mr_and_Mrs_D
    Mr_and_Mrs_D about 7 years
    Thanks - I reported it here: github.com/tensorflow/tensorflow/issues/9055
  • Mr_and_Mrs_D
    Mr_and_Mrs_D over 6 years
    Thanks didn't know about einsum