Get dot-product of dataframe with vector, and return dataframe, in Pandas

Solution 1

Here is an example of how to multiply a DataFrame by a vector element-wise, scaling each row by the matching vector entry:

In [60]: df = pd.DataFrame({'A': [1., 1., 1., 2., 2., 2.], 'B': np.arange(1., 7.)})

In [61]: vector = np.array([2,2,2,3,3,3])

In [62]: df.mul(vector, axis=0)
Out[62]: 
   A   B
0  2   2
1  2   4
2  2   6
3  6  12
4  6  15
5  6  18
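
For readers who want to run this outside IPython, here is a minimal self-contained sketch of the same session. The names `scaled` and `expected` are introduced here for illustration, and the NumPy comparison is only a sanity check of what `axis=0` does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1., 1., 1., 2., 2., 2.], "B": np.arange(1., 7.)})
vector = np.array([2, 2, 2, 3, 3, 3])

# axis=0 lines the vector up with the rows: row i is scaled by vector[i].
scaled = df.mul(vector, axis=0)

# Same numbers via plain NumPy broadcasting of a (6, 1) column against (6, 2).
expected = df.to_numpy() * vector[:, np.newaxis]
assert np.array_equal(scaled.to_numpy(), expected)

print(scaled)
```

The result is still a DataFrame with the original index and column labels.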

Solution 2

mul performs an element-wise multiplication (with broadcasting), while dot computes an inner product. Let me expand on the accepted answer:

In [13]: df = pd.DataFrame({'A': [1., 1., 1., 2., 2., 2.], 'B': np.arange(1., 7.)})

In [14]: v1 = np.array([2,2,2,3,3,3])

In [15]: v2 = np.array([2,3])

In [16]: df.shape
Out[16]: (6, 2)

In [17]: v1.shape
Out[17]: (6,)

In [18]: v2.shape
Out[18]: (2,)

In [24]: df.mul(v2)
Out[24]: 
   A   B
0  2   3
1  2   6
2  2   9
3  4  12
4  4  15
5  4  18

In [26]: df.dot(v2)
Out[26]: 
0     5
1     8
2    11
3    16
4    19
5    22
dtype: float64

So:

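df.mul takes a matrix of shape (6, 2) and a vector of shape (2,), broadcasts the vector across the rows, and returns a matrix of shape (6, 2). (With axis=0, as in the accepted answer, it takes a vector of shape (6,) such as v1 instead.)

While:

df.dot takes a matrix of shape (6, 2) and a vector of shape (2,) and returns a vector of shape (6,).

These are not the same operation: mul is an element-wise product and dot is an inner product.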
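
The relationship between the two can be checked directly, and since the question asks for a DataFrame back, the sketch below also shows two ways to get one from dot. The column name "dot" and the variable names are arbitrary choices made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1., 1., 1., 2., 2., 2.], "B": np.arange(1., 7.)})
v2 = np.array([2, 3])

elementwise = df.mul(v2)  # shape (6, 2): column A times 2, column B times 3
inner = df.dot(v2)        # Series of length 6: 2*A + 3*B for each row

# The dot product is the element-wise product summed across the columns.
assert np.allclose(inner, elementwise.sum(axis=1))

# To get a DataFrame back, either wrap the Series ...
as_frame = inner.to_frame(name="dot")
# ... or pass a one-column DataFrame whose index matches df's columns.
weights = pd.DataFrame({"dot": v2}, index=df.columns)
pd.testing.assert_frame_equal(as_frame, df.dot(weights))
```

Passing a DataFrame to dot aligns its index with df's column labels, so the labels have to match.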

Solution 3

It's quite hard to say with any degree of certainty why dot() is missing from the documentation.

Often, a method exists and is undocumented because it's considered internal by the vendor, and may be subject to change.

It could, of course, be a simple oversight by the folks who put together the documentation.

Regarding your second question, I don't really know, but it might be better to post it as a new Stack Overflow question. Just scanning the API, could you do something with the DataFrame's .applymap(function) method?
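
As a rough sketch of the second question (an element-wise product of every row with a vector, returning a DataFrame), mul with axis=1 already does this without any per-element function. The row-wise apply version is included only for comparison, and `weights` is a hypothetical vector with one entry per column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1., 1., 1., 2., 2., 2.], "B": np.arange(1., 7.)})
weights = np.array([2, 3])  # hypothetical vector, one entry per column

# Vectorised: broadcast the vector across the columns of every row.
fast = df.mul(weights, axis=1)

# Row-by-row equivalent with apply; typically much slower on large frames.
slow = df.apply(lambda row: row * weights, axis=1)

pd.testing.assert_frame_equal(fast, slow)
```

Both versions return a DataFrame with the same shape and labels as df.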

Amelio Vazquez-Reina
Author by

Updated on June 23, 2022

Comments

  • Amelio Vazquez-Reina
    Amelio Vazquez-Reina almost 2 years

    I am unable to find the entry for the method dot() in the official documentation. However, the method is there and I can use it. Why is this?

    On this topic, is there a way to compute an element-wise multiplication of every row in a data frame with another vector (and obtain a dataframe back)? That is, similar to dot(), but computing the element-wise product rather than the dot product.

  • Amelio Vazquez-Reina
    Amelio Vazquez-Reina about 11 years
    Thanks! Do you know why dot() is not part of the official doc?
  • unutbu
    unutbu about 11 years
    There are two dot methods in Pandas. Series.dot is inherited from ndarray.dot, since Series is a subclass of NumPy's ndarray. You can find the documentation for that here. As for DataFrame.dot, my guess is that they simply haven't gotten around to documenting it. (Its behavior is pretty understandable by peeking at its definition in pandas/core/frame.py, however.)
  • unutbu
    unutbu about 8 years
    Update: pandas.Series is no longer a subclass of ndarray, but it still has a dot method.
  • smci
    smci over 7 years
    Thanks, I found this the most helpful answer here. Better than the pandas doc also.
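
As a quick check of the comments above about the two dot methods, the following sketch assumes a reasonably recent pandas, where Series is no longer an ndarray subclass and the @ operator is supported. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1., 2., 3.])
df = pd.DataFrame({"A": [1., 1., 1., 2., 2., 2.], "B": np.arange(1., 7.)})
v = np.array([2, 3])

# Series is not an ndarray subclass, yet it still provides dot.
print(isinstance(s, np.ndarray))
print(s.dot(s))  # 1*1 + 2*2 + 3*3

# DataFrame.dot is also available through the @ operator.
result = df @ v
assert result.equals(df.dot(v))
```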