Fastest way to find if int array contains a number

21,305

Solution 1

No, there is no faster way unless the array of integers is already sorted, which I doubt given it's an array of colours.

To scan through an unsorted array takes linear time "O(n)". That's what you do, and you exit the method as soon as a match is found which is good too.

Solution 2

Without switching to some other data structure, no, there is no better way to find whether the array contains that value. You have to look at all the array elements to see if it's there, since if you don't check some particular location you might miss the one copy of that pixel color.

That said, there are alternative ways that you could solve this problem. Here are a few thoughts on how to speed this up:

  • If every value is guaranteed to be either white or black, you could store two extra boolean values alongside the array representing whether there are white or black pixels. That way, once you've run the scan once, you could just read the booleans back. You could also store a count of the number of white and black pixels along with the array, and then whenever you write a pixel update the count by decrementing the number of pixels of the original color and incrementing the number of pixels of the new color. This would then give you the ability to check if a pixel of a given color exists in O(1) by just seeing if the correct counter is nonzero.

  • Alternatively, if you happen to know something about the image (perhaps where the white and black pixels ought to be), you could consider doing the iteration in a different order. For example, if the pixels you're looking for tend to be clustered in the center of the image, rewriting the loop to check there first might be a good idea since if there are any pixels of that type you'll find them more rapidly. This still has the same worst-case behavior, but for "realistic" images might be much faster.

  • If you have multiple threads available and the array is really huge (millions of elements), you could consider having multiple threads each search a part of the array for the value. This would only be feasible if you had a reason to suspect that most of the image was not white.

  • Since in most realistic images you might assume that the image is a mixture of colors and you're just looking for something of one color, then you might want to consider storing the image as a sparse array, where you store a list of the pixels that happen to be of one color (say, white) and then assume everything else is black. If you expect most images to be a solid color with a few outliers, this might be a very good representation. Additionally, it would give you constant-time lookup of whether any black or white pixels exist - just check if the list of set pixels is empty or consists of the entire image.

  • If the order doesn't matter, you could also store the elements in some container like a hash table, which could give you O(1) lookup of whether or not the element is there. You could also sort the array and then just check the endpoints.

  • As a microoptimization, you could consider always appending to the real image two values - one white pixel and one black pixel - so that you could always iterate until you find the value. This eliminates one of the comparisons from the loop (the check to see if you're in-bounds) and is recommended by some authors for very large arrays.

  • If you assume that most images are a nice mixture of white and black and are okay with getting the wrong answer a small fraction of the time, you could consider probing a few random locations and checking if any of them are the right color. If so, then clearly a pixel of the correct color exists and you're done. Otherwise, run the full linear scan. For images that are a nice blend of colors, this could save you an enormous amount of time, since you could probe some small number of locations (say, O(log n) of them) and end up avoiding a huge linear scan in many cases. This is exponentially faster than before.

  • If every value is either white or black, you could also consider storing the image in a bitvector. This would compress the size of the array by a factor of the machine word size (probably between 32-128x compression) You could then iterate across the compressed array and see if any value is not identically equal to 0 to see if any of the pixels are white. This also saves a huge amount of space, and I'd actually suggest doing this since it makes a lot of other operations easy as well.

Hope this helps!

Solution 3

It doesn't matter at the bytecode level, but at the native-code level,

if (pixels[i] != 0)

is likely to be a bit faster, given that you're sure only these two values can appear.

Solution 4

If your array is really big, it might be worth it to divide and conquer. That is, assign segments of the data to multiple threads (probably t threads where t is the number of available processor cores). With a sufficiently large data set, the parallelism may amortize the thread startup cost.

Solution 5

Here is the simple optimization that helps on large arrays: put the requested value at the end of the array and thus eliminate array bounds check. (templatetypedef has already mentioned this optimization.) This solution saves 25% of loop running time and it is good for large arrays:

tmp = a[n - 1]
a[n - 1] = 0xFFFFFFFF

pos = 0
while a[pos] != 0xFFFFFFFF
    pos = pos + 1

a[n - 1] = tmp

if a[pos] = 0xFFFFFFFF then
    return pos
return -1

There is the C# implementation with running time analysis on this address.

Share:
21,305

Related videos on Youtube

Kyle Emmerich
Author by

Kyle Emmerich

Updated on June 21, 2020

Comments

  • Kyle Emmerich
    Kyle Emmerich almost 4 years

    This is an odd question. I have an integer array in Java, where each int represents a color. They will either be 0xFFFFFFFF or 0x0. What would be the FASTEST way to find if this array contains ANY values equal to 0xFFFFFFFF?

    This is my current code:

    int length = w * h;
    for (int i = 0; i < length; i++) {
        if (pixels[i] == 0xFFFFFFFF) {
            return true;
        }
    }
    

    I have no clue if there is a faster way to do this or not. I imagine you vets could have a trick or two though.

    EDIT: Seeing as it is just a dumb array of pixels from Bitmap.getPixels(), there's no way it would be sorted or transformed to another storage structure. Thanks for the input, everyone, it seems like looping through is the best way in this case.

  • Kyle Emmerich
    Kyle Emmerich over 12 years
    That's what I was thinking, never hurts to find some new knowledge if possible, though. Thanks!
  • Connor Doyle
    Connor Doyle over 12 years
    While this approach would be reasonable in C, I don't think this code will compile in Java. javac doesn't allow an implicit conversion from int to boolean. In either case, the result of the expression in the test clause of the if statement is compared to the static constant zero. The only difference is that in C (which lacks a boolean type) this comparison does not need to be made explicit.
  • sbridges
    sbridges over 12 years
    this is actually a lot slower, since it auto boxes each pix value. the for each also does not eliminate the bounds check.