get prime numbers from numpy array

20,340

Solution 1

The way that I would do it is with gmpy or a 3rd party library who has developed a good primality test algorithm. The Miller-Rabin primality test is generally a very safe (and fast!) bet. If you just want the slow way, you can do:

import numpy as np
import math

def is_prime(n):
    if n % 2 == 0 and n > 2: 
        return False
    return all(n % i for i in range(3, int(math.sqrt(n)) + 1, 2))

a = np.arange(1, 10**3)
foo = np.vectorize(is_prime)
pbools = foo(a)
primes = np.extract(pbools, a)
primes  # => Output below
array([  1,   2,   3,   5,   7,  11,  13,  17,  19,  23,  29,  31,  37,
        41,  43,  47,  53,  59,  61,  67,  71,  73,  79,  83,  89,  97,
       101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163,
       167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233,
       239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,
       313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389,
       397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463,
       467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563,
       569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641,
       643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727,
       733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821,
       823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907,
       911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997])

If you want to filter OUT the primes, just call np.invert on the pbools variables. The same would go for any predicate. You can also pass a lambda into vectorize. For example, say we wanted only prime numbers that are also 1 away from being divisible by 5 (for whatever reason).

import numpy as np
import math

def is_prime(n):
    if n % 2 == 0 and n > 2: 
        return False
    return all(n % i for i in range(3, int(math.sqrt(n)) + 1, 2))

a = np.arange(1, 10**3)
foo = np.vectorize(lambda x: (not (x + 1) % 5 or not (x - 1) % 5) and is_prime(x))
primes = a[foo(a)]  # => Shorthand.... Output below
array([  1,  11,  19,  29,  31,  41,  59,  61,  71,  79,  89, 101, 109,
   131, 139, 149, 151, 179, 181, 191, 199, 211, 229, 239, 241, 251,
   269, 271, 281, 311, 331, 349, 359, 379, 389, 401, 409, 419, 421,
   431, 439, 449, 461, 479, 491, 499, 509, 521, 541, 569, 571, 599,
   601, 619, 631, 641, 659, 661, 691, 701, 709, 719, 739, 751, 761,
   769, 809, 811, 821, 829, 839, 859, 881, 911, 919, 929, 941, 971, 991])

Solution 2

If you care of speed and efficiency I would recommend you to use one of the fastest prime sieves and numpy.intersect1d() function:

import numpy as np

def primesfrom2to(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = np.ones(n//3 + (n%6==2), dtype=np.bool)
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[      ((k*k)//3)      ::2*k] = False
            sieve[(k*k+4*k-2*k*(i&1))//3::2*k] = False
    return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]

# generate 100.000 random integers from 1 to 1.000.000.000
a1 = np.random.randint(1, 10**9, 100000)
# generate all primes that are equal or less than a1.max() 
primes = primesfrom2to(a1.max())

# print result    
print(np.intersect1d(primes, a1))

Solution 3

Having a setup like this:

import numpy as np
import math
nums = np.array([17, 18, 19, 20, 21, 22, 23])

So now we create an array that contains all possible integer candidates:

divisors = np.arange(2,int(math.sqrt(np.max(nums)))+1) # Numbers from 2 to sqrt(max(nums))
print(divisors)
# [2 3 4]

Now apply the modulo operation on the array but with a different dimension so we check each number with each divisor:

print(nums[:,None] % divisors[None,:]) # Modulo operation on each element (0 means divisible)
[[1 2 1]
 [0 0 2]
 [1 1 3]
 [0 2 0]
 [1 0 1]
 [0 1 2]
 [1 2 3]]

Now how do we get the primes ... we check if there is no result in the line that is zero:

print(np.min(nums[:,None] % divisors[None,:], axis=1)) # Minimum of the modulo for that element
# [1 0 1 0 0 0 1]

and then create a mask to index them:

print(nums[np.min(nums[:,None] % divisors[None,:], axis=1) > 0]) # So index them
# [17 19 23]

So all you need in the end is:

nums = np.array([17, 18, 19, 20, 21, 22, 23])
divisors = np.arange(2,int(math.sqrt(np.max(nums)))+1)
nums[np.min(nums[:,None] % divisors[None,:], axis=1) > 0]

all the other stuff is just for illustrating what each step is doing.

This is not trivial since it uses broadcasting of 1D arrays into a 2D array but the method should be clear. Let me know if you have any questions.


If you want to optimize this, there is another possibility: The divisors are currently every number between 2 and the sqrt(max(array)) but you don't need to test for all these numbers. If you had a function that returned all primes in that range that would be enough. For example using primesfrom2to of @MaxU's answer a faster possibility is:

nums = np.array([17, 18, 19, 20, 21, 22, 23])
# All prime numbers in the range from 2 to sqrt(max(nums))
divisors = primesfrom2to(int(math.sqrt(np.max(nums)))+1)
nums[np.min(nums[:,None] % divisors[None,:], axis=1) > 0]

but it uses the same mechanism as before but is a bit faster. :-)

Solution 4

It appears that you question is not about primes at all, but about how to apply a function to numpy array. I used simple is_odd example. Maybe np.vectorize is what you are looking for:

In [34]: nums = np.array([17, 18, 19, 20, 21, 22, 23])

In [35]: def is_odd(n):
    if n % 2 == 1:
        return True
    return False
   ....: 

In [36]: is_odd_v = np.vectorize(is_odd)

In [37]: nums[is_odd_v(nums)]
Out[37]: array([17, 19, 21, 23]

If I recall correctly np.vectorize is used primarily for convenience, and does not have great performance.

Share:
20,340
Admin
Author by

Admin

Updated on January 04, 2022

Comments

  • Admin
    Admin over 2 years

    I have a numpy array like,

    nums = np.array([17, 18, 19, 20, 21, 22, 23])
    

    How do I filter out the prime numbers from this array in a pythonic manner? I know to do a simple filtering like,

    nums[nums > 20]   #array([21, 22, 23])
    

    Is there a way to pass a lambda function for filtering?

    Expected output: array([17, 19, 23])

  • Admin
    Admin about 8 years
    But this will not always return prime? It'll return 39 but it's not a prime though
  • Akavall
    Akavall about 8 years
    @kmario23 This will not return primes, this will return odd numbers. But if you function wrote is_prime function yourself. It would return you a filtered np.array of primes. Of course my solution is naive and not optimized.
  • JeD
    JeD about 8 years
    Why the downvote? I know it is ugly and unoptimized, but it works and satisfies the question asked.