How do I search for a String in an array of Strings using binarySearch or another method?

26,278

Solution 1

From Wikipedia:

"In computer science, a binary search is an algorithm for locating the position of an element in a sorted list by checking the middle, eliminating half of the list from consideration, and then performing the search on the remaining half.[1][2] If the middle element is equal to the sought value, then the position has been found; otherwise, the upper half or lower half is chosen for search based on whether the element is greater than or less than the middle element."

So the prerequisite for binary search is that the data is sorted. It has to be sorted because it cuts the array in half and looks at the middle element. If the middle element is what it is looking for it is done. If the middle element is larger it takes the lower half of the array. If the middle element is smaller it the upper half of the array. Then the process is repeated (look in the middle etc...) until the element is found (or not).

If the data isn't sorted the algorithm cannot work.

So you would do something like:

final String[] data;
final int      index;

data = new String[] { /* init the elements here or however you want to do it */ };
Collections.sort(data);
index = Arrays.binarySearch(data, value);

or, if you do not want to sort it do a linear search:

int index = -1; // not found

for(int i = 0; i < data.length; i++)
{
    if(data[i].equals(value))
    {
        index = i;
        break; // stop looking
    }
}

And for completeness here are some variations with the full method:

// strict one - disallow nulls for everything
public <T> static int linearSearch(final T[] data, final T value)
{
    int index;

    if(data == null)
    {
        throw new IllegalArgumentException("data cannot be null");
    }

    if(value == null)
    {
        throw new IllegalArgumentException("value cannot be null");
    }

    index = -1;

    for(int i = 0; i < data.length; i++)
    {
        if(data[i] == null)
        {
            throw new IllegalArgumentException("data[" + i + "] cannot be null");
        }

        if(data[i].equals(value))
        {
            index = i;
            break; // stop looking
        }
    }    

    return (index);
}

// allow null for everything

public static <T> int linearSearch(final T[] data, final T value)
{
    int index;

    index = -1;

    if(data != null)
    {
        for(int i = 0; i < data.length; i++)
        {
            if(value == null)
            {
                if(data[i] == null)
                {
                    index = i;
                    break;
                }
            } 
            else
            {            
                if(value.equals(data[i]))
                {
                    index = i;
                    break; // stop looking
                }
            }
        }    
    }

    return (index);
}

You can fill in the other variations, like not allowing a null data array, or not allowing null in the value, or not allowing null in the array. :-)

Based on the comments this is also the same as the permissive one, and since you are not writing most of the code it would be better than the version above. If you want it to be paranoid and not allow null for anything you are stuck with the paranoid version above (and this version is basically as fast as the other version since the overhead of the method call (asList) probably goes away at runtime).

public static <T> int linearSearch(final T[] data, final T value)
{
    final int index;

    if(data == null)
    {
        index = -1;
    }
    else
    {
        final List<T> list;

        list  = Arrays.asList(data);
        index = list.indexOf(value);
    }

    return (index);
}

Solution 2

java.util.Arrays.sort(myArray);

That's how binarySearch is designed to work - it assumes sorting so that it can find faster.

If you just want to find something in a list in O(n) time, don't use BinarySearch, use indexOf. All other implementations of this algorithm posted on this page are wrong because they fail when the array contains nulls, or when the item is not present.

public static int indexOf(final Object[] array, final Object objectToFind, int startIndex) {
    if (array == null) {
        return -1;
    }
    if (startIndex < 0) {
        startIndex = 0;
    }
    if (objectToFind == null) {
        for (int i = startIndex; i < array.length; i++) {
            if (array[i] == null) {
                return i;
            }
        }
    } else {
        for (int i = startIndex; i < array.length; i++) {
            if (objectToFind.equals(array[i])) {
                return i;
            }
        }
    }
    return -1;
}
Share:
26,278
qodeninja
Author by

qodeninja

I write qode mostly for myself... out of curiosity for solving problems, understanding how things work or making (sometimes unnecessarily) complex systems to only simplify them later (once I discover alternative strategies). For whatever reason, I like torturing myself with Regular Expressions, SED, Bash and JavaScript (Node), but have found a growing (painful) love with Python. Having said that, I enjoy scripting languages a lot more than compiled languages, and I've coded in almost all of the major modern ones except Ruby. I'm a secret Turing Machine/Computer Grammars/Regular Expressions nerd, and have written my own mini compilers and toy languages. I'm constantly writing command dispatchers that I later write scripting languages for; it's an addiction. There's plenty room for me to grow and learn still; and I appreciate the wisdom of grey beards and lady wizards even if I don't always follow their sage advice. FOSS is hella cool; cool projects are cool. Find me online if you have ideas. I'm a really bad programmer but I'll write a line or two for the betterization of the peoples. Edit: I recently discoved that VI is really just SED with wings. Still not using VI. Nano or bust.

Updated on August 27, 2020

Comments

  • qodeninja
    qodeninja over 3 years

    Using binarySearch never returns the right index

    int j = Arrays.binarySearch(keys,key);
    

    where keys is type String[] and key is type String

    I read something about needing to sort the Array, but how do I even do that if that is the case?

    Given all this I really just need to know:

    How do you search for a String in an array of Strings (less than 1000) then?