How to get rid of "RuntimeWarning: invalid value encountered in greater"

python-3.x numpy

12,466

Solution 1

You get that warning whenever an array containing at least one NaN is compared. One solution is to use masking so that only the non-NaN elements are compared. To keep it generic across all types of comparisons, the function below accepts any comparison-based NumPy ufunc -

import numpy as np

def compare_nan_array(func, a, thresh):
    out = ~np.isnan(a)                # mask of non-NaN positions
    out[out] = func(a[out], thresh)   # compare only the non-NaN values
    return out

The idea is:

- Get the mask of non-NaNs.
- Use that mask to select the non-NaN values from the input array, then perform the required comparison (greater than, greater than or equal to, etc.). This gives a second mask holding the comparison results for the selected positions.
- Write those results back into the non-NaN positions of the first mask. The result is the final output.
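The three steps above can be traced on a small array. This is only an illustrative sketch: the array values and the intermediate names (valid, compared) are made up for the example and are not part of the original answer.

```python
import numpy as np

a = np.array([np.nan, 4.0, 7.0, np.nan, 1.0])

# Step 1: mask of non-NaN positions.
valid = ~np.isnan(a)                 # [False, True, True, False, True]

# Step 2: compare only the non-NaN values (here 4, 7 and 1).
compared = np.greater(a[valid], 5)   # [False, True, False]

# Step 3: write the comparison results back into the non-NaN positions.
out = valid.copy()
out[valid] = compared

print(out)  # [False False  True False False]
```

NaN positions end up False because they start as False in the mask and are never overwritten.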

Sample run -

In [41]: np.random.seed(0)

In [42]: a = np.random.randint(0,9,(4,5)).astype(float)

In [43]: a.ravel()[np.random.choice(a.size, 16, replace=0)] = np.nan

In [44]: a
Out[44]: 
array([[ nan,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,   4.,   7.],
       [ nan,  nan,  nan,   1.,  nan],
       [ nan,   7.,  nan,  nan,  nan]])

In [45]: a > 5  # Shows warning with the usual comparison
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[45]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

# With suggested masking based method
In [46]: compare_nan_array(np.greater, a, 5)
Out[46]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

Let's test the generic behavior by checking for less than 5 -

In [47]: a < 5
__main__:1: RuntimeWarning: invalid value encountered in less
Out[47]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)

In [48]: compare_nan_array(np.less, a, 5)
Out[48]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)
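A possible variation, not taken from the answer above, is to pass the non-NaN mask through the ufunc's own where= argument together with a pre-filled out= array, so the comparison is evaluated only where the mask is True. The helper name compare_nan_where is hypothetical. Turning warnings into errors in the sketch is just a way to check that nothing is emitted.

```python
import warnings
import numpy as np

def compare_nan_where(func, a, thresh):
    """Apply a comparison ufunc only at non-NaN positions (hypothetical helper)."""
    mask = ~np.isnan(a)
    out = np.zeros(a.shape, dtype=bool)   # NaN positions stay False
    func(a, thresh, out=out, where=mask)  # evaluated only where mask is True
    return out

a = np.array([[np.nan, 4.0], [7.0, np.nan]])

with warnings.catch_warnings():
    warnings.simplefilter("error")        # any warning would raise here
    result = compare_nan_where(np.greater, a, 5)

print(result.tolist())  # [[False, False], [True, False]]
```

The out= array has to be initialized explicitly, because positions excluded by where= are left untouched by the ufunc.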

Solution 2

There is a better way - you don't want to suppress the warning forever, because it could help you find other mistakes later on.

Following the suggestions found in this question: RuntimeWarning: invalid value encountered in divide

The Right Way:

If the result is the one you want, you can just write:

with np.errstate(invalid='ignore'):
    result = (array > 0.5)

# ... use result; warnings outside the with block are still reported.
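To see that np.errstate only changes the setting inside the with block, one can inspect np.geterr() before, inside and after it. This is a small sketch with made-up sample values. It assumes nothing about which error mode is active beforehand, only that the previous mode is restored on exit.

```python
import numpy as np

array = np.array([np.nan, 1.0, 0.2])

before = np.geterr()["invalid"]          # whatever mode is currently active
with np.errstate(invalid="ignore"):
    inside = np.geterr()["invalid"]      # "ignore", but only within this block
    result = array > 0.5
after = np.geterr()["invalid"]           # restored to the previous mode

print(inside, before == after)  # ignore True
print(result)                   # [False  True False]
```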

A different Wrong Way:

Otherwise, you could meet your restrictions by copying the array:

to_compare = array.copy()
to_compare[np.isnan(to_compare)] = 0.5  # you don't need -np.inf, anything <= 0.5 is OK
result = (to_compare > 0.5)

Because the comparison runs on the copy, you don't need to "recover" the NaNs in the original array.
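The same copy-and-fill idea can be written in one step with np.where, which builds a new array and leaves the original untouched. This is an alternative phrasing of the approach above, not something the answer itself proposes, and the sample values are made up.

```python
import numpy as np

array = np.array([np.nan, 1.0, 0.2, np.nan])

# np.where returns a new array, so `array` itself is never modified.
filled = np.where(np.isnan(array), 0.5, array)
result = filled > 0.5

print(result)                      # [False  True False False]
print(int(np.isnan(array).sum()))  # 2
```

As with the explicit copy, this still allocates a temporary array of the same size as the input.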


tjiagoM


Updated on June 04, 2022

Comments

tjiagoM almost 2 years
This question is very similar to a lot of questions related to the warning RuntimeWarning: invalid value encountered in greater/less/etc

However, I couldn't find a solution for my particular problem, and I think there should be one.

So, I have a numpy.ndarray similar to this one:
```
array([[ nan,   1.,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ..., 
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]])
```
I want to calculate array > 0.5, which gives exactly the result I want, but it also emits a warning because of the comparison with nan:
```
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[68]: 
array([[False,  True, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ..., 
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]], dtype=bool)
```
I basically want to calculate array > 0.5, but without the warning showing up.

My restrictions:
- I do NOT want to just suppress the warning by wrapping the comparison in a with np.errstate(invalid='ignore'): block.
- I need to keep the original array intact, so I cannot change it.

I have come up with a simple solution:
- Replace the nan values in the original matrix (array[np.isnan(array)] = -np.inf), then restore them after the comparison (array[array == -np.inf] = np.nan).

But all these calculations seem like a waste of time when (I think) there should be a direct way to do this at once. I have been exploring the numpy.ma module and the numpy.where function, but I couldn't find the "direct" solution I want.

Any thoughts on this?
user2357112 over 6 years

  "In a future release of numpy this result could change" - that is extremely unlikely to happen, and if it did, you'd have to rethink all your NaN handling anyway.
tjiagoM over 6 years

Damn! The solution is actually SO simple. I have been following all your edits eheh Thank you so much, that's exactly what I wanted!
Lukas over 5 years

But the readability is reduced. What about setting this to be the default way?
Tomasz Gandor over 4 years

This is not what I call "making the warning go away". It's carefully avoiding operations which raise this warning. And it has overhead both in the code and in the runtime. However, short of hacking NumPy's source code, you can actually do np.seterr(invalid='ignore').
Divakar over 4 years

@TomaszGandor OP has already tried that and doesn't want to do so. It's under My restrictions: section in the question.
Alfred Wallace over 3 years

The solution can also be applied to np.sign() when you are trying to get the sign from an array with nan values. The warning would be RuntimeWarning: invalid value encountered in sign. Instead of df['b'] = np.sign(df['a']), you can use df['b'] = evaluate_nan_array(np.sign, df['a']), where

def evaluate_nan_array(func, a):
    a = np.asarray(a, dtype=float)
    mask = ~np.isnan(a)
    out = np.full(a.shape, np.nan)   # float output keeps -1/0/+1; NaN positions stay NaN
    out[mask] = func(a[mask])
    return out