Java- how to parse for words in a string for a specific word

Solution 1

To just find the substring, you can use contains or indexOf or any other variant:

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html

if( s.contains( word ) ) {
   // ...
}

if( s.indexOf( word2 ) >=0 ) {
   // ...
}

If you care about word boundaries, then StringTokenizer is probably a good approach.

https://docs.oracle.com/javase/1.5.0/docs/api/java/util/StringTokenizer.html

You can then perform a case-insensitive check (equalsIgnoreCase) on each word.
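As a rough sketch of that idea, the helper below tokenizes the input and compares each token with equalsIgnoreCase. The class name TokenWordSearch, the method containsWord, and the delimiter set are assumptions made for illustration; adjust the delimiters to whatever counts as a word separator in your input.

```java
import java.util.StringTokenizer;

final class TokenWordSearch {
    // Assumed delimiter set: whitespace plus common punctuation.
    private static final String DELIMITERS = " \t\n\r,.:;?!";

    /** Returns true if any token of the sentence equals the word, ignoring case. */
    static boolean containsWord(String sentence, String word) {
        StringTokenizer tokens = new StringTokenizer(sentence, DELIMITERS);
        while (tokens.hasMoreTokens()) {
            if (tokens.nextToken().equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Hi, how are you?", "hi")); // true
        System.out.println(containsWord("hi", "hi"));               // true
        System.out.println(containsWord("hire-purchase", "hi"));    // false
    }
}
```

Because the hyphen is not in the assumed delimiter set, "hire-purchase" stays a single token and does not match "hi". An input consisting of just "hi" yields one token, so no trailing space is needed.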

Solution 2

Looks like a job for Regular Expressions. Contains would give a false positive on, say, "hire-purchase".

if (Pattern.compile("\\bhi\\b").matcher(stringToMatch).find()) { //...
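A slightly fuller sketch of the word-boundary idea follows. It assumes Matcher.find() is what is wanted here, since Pattern.matches (and String.matches) succeed only when the entire string matches the pattern, which is not the case for a sentence. The class name RegexWordSearch and the helper containsWord are made up for this example.

```java
import java.util.regex.Pattern;

final class RegexWordSearch {
    /** Returns true if word occurs in text as a whole word, ignoring case. */
    static boolean containsWord(String text, String word) {
        // Pattern.quote keeps any regex metacharacters in word literal.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find(); // find() searches anywhere in text
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Hi, how are you?", "hi")); // true
        System.out.println(containsWord("hire-purchase", "hi"));    // false
        // matches() must consume the whole input, so this is false:
        System.out.println(Pattern.matches("\\bhi\\b", "hi, how are you?")); // false
    }
}
```

Note that \b treats a hyphen as a boundary, so "hi-fi" would count as containing the word "hi" with this pattern.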

Solution 3

I'd go for the java.util.StringTokenizer: https://docs.oracle.com/javase/1.5.0/docs/api/java/util/StringTokenizer.html

StringTokenizer st = new StringTokenizer(
    "Hi, how are you?",
    ",.:?! \t\n\r"       // whitespace and punctuation as delimiters
);
while (st.hasMoreTokens()) {
    if (st.nextToken().equals("Hi")) {
        // matches "Hi"
    }
}

Alternatively, take a look at java.util.regex and use regular expressions.
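One possible shape for the regex alternative is to split on runs of non-word characters and compare the pieces. This is only a sketch: the class name SplitWordSearch, the helper containsWord, and the choice of \W+ as the separator pattern are assumptions, not something the answer specifies.

```java
import java.util.regex.Pattern;

final class SplitWordSearch {
    // Assumed separator: one or more characters that are not letters, digits, or '_'.
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    /** Returns true if any piece of the sentence equals the word, ignoring case. */
    static boolean containsWord(String sentence, String word) {
        for (String token : NON_WORD.split(sentence)) {
            if (token.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Hi, how are you?", "how")); // true
        System.out.println(containsWord("hire-purchase", "hi"));     // false
    }
}
```

Unlike StringTokenizer, split builds the whole array of tokens before the loop starts, which matters little for a single line of user input.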

Author by

Jackson Curtis

Updated on June 08, 2022

Comments

  • Jackson Curtis
    Jackson Curtis almost 2 years

    How would I parse for the word "hi" in the sentence "hi, how are you?", or for the word "how" in "how are you?"?

    example of what I want in code:

    String word = "hi";
    String word2 = "how";
    Scanner scan = new Scanner(System.in).useDelimiter("\n");
    String s = scan.nextLine();
    if(s.equals(word)) {
        System.out.println("Hey");
    }
    if(s.equals(word2)) {
        System.out.println("Hey");
    }
    
  • Anon.
    Anon. about 14 years
    A hit-and-run downvote with no explanation? Are you really trying to improve SO, or just throwing away your own rep to try and hurt others'?
  • fmunshi
    fmunshi about 14 years
    The javadoc for StringTokenizer contains the sentence: "StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."
  • Roland Bouman
    Roland Bouman about 14 years
    Simon Nickerson: thanks for pointing that out, I didn't realize. Pity they favour split since that seems to do all the work up front
  • Jackson Curtis
    Jackson Curtis about 14 years
    Wow! That's exactly what I was looking for! btw: in my actual version I had converted it to lowercase already, I just simplified it for the question! Thanks again!
  • Jackson Curtis
    Jackson Curtis about 14 years
    Hey, sorry, I didn't see that there were other answers down here :p I tried it, but it doesn't seem to work at all... anything that I might have done wrong? btw: it gives me an error when I use "match", so I use "matches"
  • Jackson Curtis
    Jackson Curtis about 14 years
    what would happen if the user just typed in "hi"? there is no " " anymore after it.
  • Roland Bouman
    Roland Bouman about 14 years
    @Custard: have you tried? For me, the string tokenizer correctly passes "hi" on nextToken()
  • Jackson Curtis
    Jackson Curtis about 14 years
    I haven't (sorry), but I am interested! I'll get to it tomorrow!
  • Amir Raminfar
    Amir Raminfar over 10 years
    +1 Except you need to double-escape \\b for it to work correctly. Updating answer.