How to invert numpy.where (np.where) function


Solution 1

Something like this maybe?

mask = np.zeros(X.shape, dtype='bool')
mask[ix] = True

But if the condition is something simple like X > 0, you're probably better off computing mask = X > 0 directly, unless the mask is very sparse or you no longer have a reference to X.
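A minimal round-trip sketch of the approach above, assuming ix came from np.where on an array of the same shape. The helper name indices_to_mask is hypothetical and not part of NumPy; the seeded random data is only there to make the sketch repeatable.

```python
import numpy as np

def indices_to_mask(ix, shape):
    """Build a boolean mask that is True at the positions listed in ix.

    indices_to_mask is a hypothetical helper, not a NumPy function.
    ix is the tuple of index arrays returned by np.where(condition).
    """
    mask = np.zeros(shape, dtype=bool)
    mask[ix] = True
    return mask

rng = np.random.default_rng(0)  # seeded only for repeatability
X = rng.random((3, 3))
ix = np.where(X > 0.5)

mask = indices_to_mask(ix, X.shape)

# The rebuilt mask should match the original boolean expression.
roundtrip_ok = np.array_equal(mask, X > 0.5)
```

If X is still available, recomputing X > 0.5 gives the same mask without the intermediate index tuple.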

Solution 2

The bottom of the np.where docstring suggests using np.in1d for this. (Newer NumPy releases recommend np.isin instead, which takes the same two arguments here.)

>>> x = np.array([1, 3, 4, 1, 2, 7, 6])
>>> indices = np.where(x % 3 == 1)[0]
>>> indices
array([0, 2, 3, 5])
>>> np.in1d(np.arange(len(x)), indices)
array([ True, False,  True,  True, False,  True, False], dtype=bool)

(While this is a nice one-liner, it is a lot slower than @Bi Rico's solution.)
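The example above is 1-D. For a tuple of row and column indices, as np.where returns for a matrix, one possible extension is to flatten the indices with np.ravel_multi_index first. This is a sketch under that assumption, not something the docstring prescribes, and it uses np.isin rather than np.in1d. The values in X are made up for illustration.

```python
import numpy as np

X = np.array([[0.51, 0.42, 0.38],
              [0.32, 0.30, 0.83],
              [0.74, 0.52, 0.23]])
ix = np.where(X > 0.5)

# Convert the (rows, cols) tuple into positions in the flattened array.
flat_ix = np.ravel_multi_index(ix, X.shape)

# Test every flat position for membership, then restore the 2-D shape.
mask = np.isin(np.arange(X.size), flat_ix).reshape(X.shape)
```

This stays a single expression once flat_ix is inlined, but it does more work than assigning True into a preallocated array of zeros.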

Solution 3

For example:

mask = X > 0
imask = np.logical_not(mask)

Edit: Sorry for being so concise before. Shouldn't be answering things on the phone :P

As I noted in the example, it's better to just invert the boolean mask. That is much more efficient and easier than working backwards from the result of np.where.
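A small sketch of inverting the mask directly, assuming the boolean mask (or X itself) is still at hand. The array values are made up for illustration.

```python
import numpy as np

X = np.array([[1.0, -2.0],
              [0.0, 3.0]])
mask = X > 0
imask = np.logical_not(mask)

# For boolean arrays, the ~ operator gives the same result as np.logical_not.
same = np.array_equal(imask, ~mask)

# Indices of the cells that do NOT satisfy X > 0.
inverse_ix = np.where(imask)
```

Here inverse_ix plays the role of the "inverted" np.where result, obtained without ever converting indices back into a mask.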

Author by

Setjmp


Updated on June 11, 2022

Comments

  • Setjmp
    Setjmp over 1 year

    I frequently use the numpy.where function to gather a tuple of indices of a matrix having some property. For example

    >>> import numpy as np
    >>> X = np.random.rand(3,3)
    >>> X
    array([[ 0.51035326,  0.41536004,  0.37821622],
           [ 0.32285063,  0.29847402,  0.82969935],
           [ 0.74340225,  0.51553363,  0.22528989]])
    >>> ix = np.where(X > 0.5)
    >>> ix
    (array([0, 1, 2, 2]), array([0, 2, 0, 1]))
    

    ix is now a tuple of ndarray objects that contain the row and column indices, whereas the sub-expression X>0.5 contains a single boolean matrix indicating which cells had the >0.5 property. Each representation has its own advantages.

    What is the best way to take the ix object and convert it back to the boolean form later, when it is needed? For example

    >>> G = np.zeros(X.shape, dtype=bool)
    >>> G[ix] = True
    

    Is there a one-liner that accomplishes the same thing?