Difference between np.dot and np.multiply with np.sum in binary cross-entropy loss calculation

python numpy neural-network sum difference

Solution 1

What you're doing is calculating the binary cross-entropy loss, which measures how far the model's predictions (here: A2) are from the true labels (here: Y).
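For reference, the quantity computed by the snippets in this thread can be written as below, where m is the number of examples, y^{(i)} is the i-th entry of Y and a^{(i)} is the i-th entry of A2. The index notation is an assumption for illustration; the thread itself only gives the NumPy code.

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right]
```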

Here is a reproducible example for your case. It shows why you get a scalar in the second case, which uses np.sum:

In [88]: Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])

In [89]: A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])

In [90]: logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)

# `np.dot` returns 2D array since its arguments are 2D arrays
In [91]: logprobs
Out[91]: array([[-0.78914626]])

# m is the number of examples (the number of columns of Y), here 8
In [92]: m = Y.shape[1]; cost = (-1/m) * logprobs

In [93]: cost
Out[93]: array([[ 0.09864328]])

In [94]: logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))

# np.sum returns scalar since it sums everything in the 2D array
In [95]: logprobs
Out[95]: -0.78914625761870361

Note that np.dot sums only over the inner dimensions, which must match: here (1x8) and (8x1). The 8s disappear in the dot product (matrix multiplication), leaving a (1x1) result. That is a single value, but it is returned as a 2D array of shape (1,1).
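The shape bookkeeping described above can be checked directly. This is a minimal sketch that reuses Y and A2 from the session; the names log_A2, dot_result, elementwise and summed are illustrative only.

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])
A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])

log_A2 = np.log(A2)                     # shape (1, 8)

dot_result = np.dot(Y, log_A2.T)        # (1, 8) x (8, 1) -> (1, 1)
elementwise = np.multiply(Y, log_A2)    # (1, 8) * (1, 8) -> (1, 8)
summed = np.sum(elementwise)            # all axes reduced -> 0 dimensions

print(dot_result.shape)    # (1, 1)
print(elementwise.shape)   # (1, 8)
print(np.ndim(summed))     # 0

# Same value, different container
print(np.isclose(dot_result[0, 0], summed))   # True
```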

Also, and most importantly, note that here np.dot is exactly the same as np.matmul, since the inputs are 2D arrays (i.e. matrices):

In [107]: logprobs = np.matmul(Y, (np.log(A2)).T) + np.matmul((1.0-Y),(np.log(1 - A2)).T)

In [108]: logprobs
Out[108]: array([[-0.78914626]])

In [109]: logprobs.shape
Out[109]: (1, 1)
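As a quick check of that equivalence for 2D inputs, the sketch below compares np.dot, np.matmul and the @ operator on the same data. The variable names are illustrative only.

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])
A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])
log_A2_T = np.log(A2).T               # shape (8, 1)

via_dot = np.dot(Y, log_A2_T)
via_matmul = np.matmul(Y, log_A2_T)
via_operator = Y @ log_A2_T           # the @ operator performs matmul

print(via_dot.shape, via_matmul.shape, via_operator.shape)   # (1, 1) (1, 1) (1, 1)
print(np.allclose(via_dot, via_matmul))                      # True
print(np.allclose(via_dot, via_operator))                    # True
```

The equivalence holds for 2D inputs only: np.dot and np.matmul treat arrays with more than two dimensions differently, and np.matmul does not accept scalar operands.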

Return result as a scalar value

np.dot or np.matmul returns an array whose shape is determined by the input arrays. Even with the out= argument it's not possible to get a scalar back if the inputs are 2D arrays. However, if the result has shape (1,1) (or, more generally, is a single value wrapped in an nD array), we can convert it to a Python scalar with the array's item() method. Older code used np.asscalar() for this; it was deprecated in NumPy 1.16 and removed in NumPy 1.23.

In [123]: logprobs.item()
Out[123]: -0.7891462576187036

In [124]: type(logprobs.item())
Out[124]: float

ndarray of size 1 to scalar value

In [127]: np.array([[[23.2]]]).item()
Out[127]: 23.2

In [128]: np.array([[[[23.2]]]]).item()
Out[128]: 23.2
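A small sketch of how the extraction behaves in practice: ndarray.item() returns a built-in Python float, plain indexing returns a NumPy scalar, and item() refuses arrays holding more than one element. The array below is a stand-in for a (1,1) result of np.dot, and the variable names are illustrative.

```python
import numpy as np

logprobs = np.array([[-0.78914626]])   # stand-in for a (1, 1) result of np.dot

as_python_float = logprobs.item()      # built-in Python float
as_numpy_scalar = logprobs[0, 0]       # NumPy scalar

print(type(as_python_float))           # <class 'float'>
print(type(as_numpy_scalar))           # <class 'numpy.float64'>

# item() only works when the array holds exactly one element
try:
    np.array([1.0, 2.0]).item()
except ValueError as exc:
    print("ValueError:", exc)
```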

Solution 2

For 2D arrays, np.dot computes the matrix product of the two matrices:

|A B| . |E F| = |A*E+B*G A*F+B*H|
|C D|   |G H|   |C*E+D*G C*F+D*H|

np.multiply, by contrast, multiplies two matrices element-wise:

|A B| ⊙ |E F| = |A*E B*F|
|C D|   |G H|   |C*G D*H|

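In general, summing the two results with np.sum gives different values, as the example below shows. The values agree in the question only because Y and A2 are row vectors: multiplying a (1,N) array by an (N,1) array sums the element-wise products, which is exactly what np.sum(np.multiply(...)) computes.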

>>> np.dot([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 5,  8],
       [11, 18]])
>>> np.multiply([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 1,  4],
       [ 6, 12]])

>>> np.sum(np.dot([[1,2], [3,4]], [[1,2], [2,3]]))
42
>>> np.sum(np.multiply([[1,2], [3,4]], [[1,2], [2,3]]))
23
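To see when the two approaches do agree, the sketch below checks the row-vector case from the question on random data, and then the general identity that the sum of an element-wise product equals the trace of A times the transpose of B. The random inputs and variable names are assumptions for illustration.

```python
import numpy as np

# Row-vector case, as in the question: (1, N) dot (N, 1) vs. sum of products
rng = np.random.default_rng(0)
y = rng.random((1, 5))
a = rng.random((1, 5))

via_dot = np.dot(y, a.T)                 # shape (1, 1)
via_sum = np.sum(np.multiply(y, a))      # 0-dimensional scalar
print(np.isclose(via_dot[0, 0], via_sum))   # True

# General matrices: sum(A * B) equals trace(A @ B.T), not sum(A @ B)
A = np.array([[1, 2], [3, 4]])
B = np.array([[1, 2], [2, 3]])
print(np.sum(np.multiply(A, B)))    # 23
print(np.trace(np.dot(A, B.T)))     # 23
print(np.sum(np.dot(A, B)))         # 42
```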

Solution 3

If Y and A2 are (1,N) arrays, then np.dot(Y, A2.T) will produce a (1,1) result. It is doing a matrix multiplication of a (1,N) with an (N,1). The N dimension is summed over, leaving the (1,1).

With np.multiply the result is (1,N). np.sum then adds up all the values, and the result is a scalar.

If Y and A2 were (N,) shaped (same number of elements, but 1d), then np.dot(Y, A2) (no .T) would also produce a scalar. From the np.dot documentation:

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors

Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned.
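The 1-D case can be tried with the same numbers used earlier in the thread, flattened to shape (8,). This is a minimal sketch; the names Y_1d and A2_1d are illustrative.

```python
import numpy as np

Y_1d = np.array([1, 0, 1, 1, 0, 1, 0, 0])                          # shape (8,)
A2_1d = np.array([0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02])    # shape (8,)

# Inner product of 1-D arrays: no transpose needed, result is a scalar
logprobs = np.dot(Y_1d, np.log(A2_1d)) + np.dot(1 - Y_1d, np.log(1 - A2_1d))

print(np.ndim(logprobs))   # 0
print(type(logprobs))      # <class 'numpy.float64'>
```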

squeeze removes all size-1 dimensions, but still returns an array. In NumPy an array can have any number of dimensions from 0 up to a fixed maximum (32 in NumPy 1.x, 64 since NumPy 2.0), so a 0d array is possible. Compare the shapes of np.array(3), np.array([3]) and np.array([[3]]).
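The comparison suggested above, together with the effect of squeeze on a (1,1) array, might look like this. The cost value is copied from the question's output and stands in for the real computation.

```python
import numpy as np

for arr in (np.array(3), np.array([3]), np.array([[3]])):
    print(arr.shape, arr.ndim)
# () 0
# (1,) 1
# (1, 1) 2

cost = np.array([[0.693058761039]])   # (1, 1), like the np.dot version of cost
squeezed = np.squeeze(cost)

# squeeze drops the size-1 axes but the result is still an ndarray (0d)
print(type(squeezed), squeezed.shape)   # <class 'numpy.ndarray'> ()

# item() turns the 0d array into a built-in Python float
print(type(squeezed.item()))            # <class 'float'>
```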

Author by

Asad Shakeel

Updated on February 22, 2021

Comments

Asad Shakeel about 3 years

I tried the following code but couldn't see the difference between np.dot and np.multiply with np.sum.

Here is the np.dot code:

logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)
print(logprobs.shape)
print(logprobs)
cost = (-1/m) * logprobs
print(cost.shape)
print(type(cost))
print(cost)

Its output is

(1, 1)
[[-2.07917628]]
(1, 1)
<class 'numpy.ndarray'>
[[ 0.693058761039 ]]

Here is the code for np.multiply with np.sum

logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))
print(logprobs.shape)         
print(logprobs)
cost = - logprobs / m
print(cost.shape)
print(type(cost))
print(cost)

Its output is

()
-2.07917628312
()
<class 'numpy.float64'>
0.693058761039

I don't understand why the type and shape differ when the resulting value is the same in both cases.

Even after squeezing, the cost value from the former code looks the same as the latter, but its type is still numpy.ndarray:

cost = np.squeeze(cost)
print(type(cost))
print(cost)

The output is

<class 'numpy.ndarray'>
0.6930587610394646
