Calculating frequency of each word in a sentence in Java

Solution 1

Use a map with the word as the key and the count as the value, something like this:

    Map<String, Integer> map = new HashMap<>();
    for (String w : words) {
        Integer n = map.get(w);
        n = (n == null) ? 1 : ++n;
        map.put(w, n);
    }
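As a hedged alternative to the get/put loop above, the same counting can be sketched with Map.merge, which has been part of java.util.Map since Java 8. The class name MergeCount, the method name count, and the sample sentence are made up for illustration and are not part of the original answer.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCount {
    // Counts how often each word occurs. merge() stores 1 for a new key
    // and otherwise adds 1 to the count that is already there.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> map = new HashMap<>();
        for (String w : words) {
            map.merge(w, 1, Integer::sum);
        }
        return map;
    }

    public static void main(String[] args) {
        // Sample input chosen for this sketch only.
        String[] words = "the quick brown fox jumps over the lazy dog".split(" ");
        System.out.println(count(words).get("the")); // prints 2
    }
}
```

This avoids the explicit null check, at the cost of requiring Java 8 or later.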

If you are not allowed to use java.util, you can sort arr with any sorting algorithm and then do this:

    String[] words = new String[arr.length];
    int[] counts = new int[arr.length];
    words[0] = arr[0];
    counts[0] = 1;
    for (int i = 1, j = 0; i < arr.length; i++) {
        if (words[j].equals(arr[i])) {
            counts[j]++;
        } else {
            j++;
            words[j] = arr[i];
            counts[j] = 1;
        }
    }
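The snippet above assumes that arr already holds the words in sorted order. A minimal self-contained sketch of the whole approach, using only java.lang, might look like the following. The insertion sort, the class name SortedCount, and the helper names are assumptions added for illustration; any other sorting algorithm would do.

```java
class SortedCount {
    // Simple insertion sort on a String array (no java.util needed).
    static void sort(String[] arr) {
        for (int i = 1; i < arr.length; i++) {
            String key = arr[i];
            int j = i - 1;
            while (j >= 0 && arr[j].compareTo(key) > 0) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = key;
        }
    }

    // Sorts arr in place, fills words[] and counts[] in parallel,
    // and returns the number of distinct words found.
    static int count(String[] arr, String[] words, int[] counts) {
        if (arr.length == 0) {
            return 0;
        }
        sort(arr);
        int j = 0;
        words[0] = arr[0];
        counts[0] = 1;
        for (int i = 1; i < arr.length; i++) {
            if (words[j].equals(arr[i])) {
                counts[j]++;          // same word as the previous one
            } else {
                j++;                  // a new word starts here
                words[j] = arr[i];
                counts[j] = 1;
            }
        }
        return j + 1;
    }

    public static void main(String[] args) {
        // arr is defined here; the sample sentence is only an example.
        String[] arr = "the quick brown fox jumps fox fox over the lazy dog brown".split(" ");
        String[] words = new String[arr.length];
        int[] counts = new int[arr.length];
        int distinct = count(arr, words, counts);
        for (int k = 0; k < distinct; k++) {
            System.out.println(words[k] + " = " + counts[k]);
        }
    }
}
```

Only the first `distinct` entries of words and counts are meaningful; the rest of both arrays stays unused.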

An interesting solution using ConcurrentHashMap and compute, available since Java 8:

    ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
    m.compute("x", (k, v) -> v == null ? 1 : v + 1);
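The two lines above update the count for a single key, "x". Applied to every word of a sentence, the idea could be sketched as follows. The class name ComputeCount, the method name count, and the sample sentence are assumptions made for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ComputeCount {
    static ConcurrentMap<String, Integer> count(String[] words) {
        ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
        for (String w : words) {
            // v is null the first time a word is seen, otherwise the current count.
            m.compute(w, (k, v) -> v == null ? 1 : v + 1);
        }
        return m;
    }

    public static void main(String[] args) {
        String[] words = "the quick brown fox jumps fox fox over the lazy dog brown".split(" ");
        System.out.println(count(words).get("fox")); // prints 3
    }
}
```

On ConcurrentHashMap, compute is performed atomically per key, so the same update would also be safe if several threads were counting into one shared map.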

Solution 2

In Java 8, you can write this in two simple statements. In addition, a parallel stream lets you take advantage of parallel computing.

Here's a compact way to do this:

Stream<String> stream = Stream.of(text.toLowerCase().split("\\W+")).parallel();

Map<String, Long> wordFreq = stream
     .collect(Collectors.groupingBy(String::toString,Collectors.counting()));
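For completeness, here is a hedged, self-contained version of the stream approach above with its imports. The class name StreamCount and the sample text are made up for illustration. The added filter is an assumption on my part: split("\\W+") can produce an empty first token when the text starts with punctuation or whitespace, and the filter drops it.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamCount {
    static Map<String, Long> count(String text) {
        return Stream.of(text.toLowerCase().split("\\W+"))
                .parallel()
                .filter(w -> !w.isEmpty()) // skip the empty token split() may yield
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> wordFreq = count("The quick brown fox. The lazy dog!");
        System.out.println(wordFreq.get("the")); // prints 2
    }
}
```

Function.identity() plays the same role as String::toString in the original: each word is its own grouping key.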

Solution 3

Try this

public class Main
{

    public static void main(String[] args)
    {       
        String text = "the quick brown fox jumps fox fox over the lazy dog brown";
        String[] keys = text.split(" ");
        String[] uniqueKeys;
        int count = 0;
        System.out.println(text);
        uniqueKeys = getUniqueKeys(keys);

        for(String key: uniqueKeys)
        {
            if(null == key)
            {
                break;
            }           
            for(String s : keys)
            {
                if(key.equals(s))
                {
                    count++;
                }               
            }
            System.out.println("Count of ["+key+"] is : "+count);
            count=0;
        }
    }

    private static String[] getUniqueKeys(String[] keys)
    {
        String[] uniqueKeys = new String[keys.length];

        uniqueKeys[0] = keys[0];
        int uniqueKeyIndex = 1;
        boolean keyAlreadyExists = false;

        for(int i=1; i<keys.length ; i++)
        {
            for(int j=0; j<uniqueKeyIndex; j++)
            {
                if(keys[i].equals(uniqueKeys[j]))
                {
                    keyAlreadyExists = true;
                }
            }           

            if(!keyAlreadyExists)
            {
                uniqueKeys[uniqueKeyIndex] = keys[i];
                uniqueKeyIndex++;               
            }
            keyAlreadyExists = false;
        }       
        return uniqueKeys;
    }
}

Output:

the quick brown fox jumps fox fox over the lazy dog brown
Count of [the] is : 2
Count of [quick] is : 1
Count of [brown] is : 2
Count of [fox] is : 3
Count of [jumps] is : 1
Count of [over] is : 1
Count of [lazy] is : 1
Count of [dog] is : 1

Solution 4

import java.util.*;

public class WordCounter {

    public static void main(String[] args) {

        String s = "this is a this is this a this yes this is a this what it may be i do not care about this";
        String a[] = s.split(" ");
        Map<String, Integer> words = new HashMap<>();
        for (String str : a) {
            if (words.containsKey(str)) {
                words.put(str, 1 + words.get(str));
            } else {
                words.put(str, 1);
            }
        }
        System.out.println(words);
    }
}

Output: {a=3, be=1, may=1, yes=1, this=7, about=1, i=1, is=3, it=1, do=1, not=1, what=1, care=1}

Solution 5

From Java 10 you can use the following:

import java.util.Arrays;
import java.util.stream.Collectors;

public class StringFrequencyMap {
    public static void main(String... args){
        String[] wordArray = {"One", "One", "Two","Three", "Two", "two"};
        var freq = Arrays.stream(wordArray)
                         .collect(Collectors.groupingBy(x -> x, Collectors.counting()));
        System.out.println(freq);
    }
}

Output:

{One=2, two=1, Two=2, Three=1}

Author: Sigma

Updated on July 09, 2022

Comments

  • Sigma
    Sigma almost 2 years

    I am writing a very basic Java program that calculates the frequency of each word in a sentence. So far I have managed to do this much:

    import java.io.*;
    
    class Linked {
    
        public static void main(String args[]) throws IOException {
    
            BufferedReader br = new BufferedReader(
                new InputStreamReader(System.in));
            System.out.println("Enter the sentence");
            String st = br.readLine();
            st = st + " ";
            int a = lengthx(st);
            String arr[] = new String[a];
            int p = 0;
            int c = 0;
    
            for (int j = 0; j < st.length(); j++) {
                if (st.charAt(j) == ' ') {
                    arr[p++] = st.substring(c,j);
                    c = j + 1;
                }
            }
        }
    
        static int lengthx(String a) {
            int p = 0;
            for (int j = 0; j < a.length(); j++) {
                if (a.charAt(j) == ' ') {
                    p++;
                }
            }
            return p;
        }
    }
    

    I have extracted each word and stored it in an array. The problem now is how to count the number of times each word is repeated, and how to display the result so that repeated words are not shown multiple times. Can you help me with this?

  • Sigma
    Sigma about 10 years
    Is there any other way to accomplish this? For my examination I am not allowed to use java.util.
  • Evgeniy Dorofeev
    Evgeniy Dorofeev about 10 years
    OK, see my no-map version.
  • Sigma
    Sigma about 10 years
    Does this count letters?
  • Zeeshan
    Zeeshan about 10 years
    97 is the ASCII code for a, and so on up to 122, which is the ASCII code for z.
  • Sigma
    Sigma about 10 years
    I don't want to do this; I want to count words, not letters.
  • grepit
    grepit almost 10 years
    @EvgeniyDorofeev, thanks, the logic looks really good, but where is "arr" defined?
  • Petter Friberg
    Petter Friberg about 8 years
    Welcome to Stack Overflow! While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.
  • user25976
    user25976 about 8 years
    @EvgeniyDorofeev This works really well for me, but I'm confused about the syntax here: n = (n == null) ? 1 : ++n; I'm new to Java. Can you explain how it works or tell me where I can find out?
  • Kumar Ayush
    Kumar Ayush about 7 years
    I just need the output in group formatting now. Any idea how to do it? Thanks in advance.
  • Ebony Maw
    Ebony Maw about 7 years
    @user25976 That is called the ternary (conditional) operator. It is like an if-then-else statement and is written as follows: (boolean condition) ? (then) : (else). In this case, we check whether n equals null. If it does, we set n to 1. If it does not, we pre-increment n by one.
  • Ebony Maw
    Ebony Maw about 7 years
    @EvgeniyDorofeev Is it possible to utilize java.util.Collections.frequency() method to achieve what Sigma is asking?
  • K. Symbol
    K. Symbol about 4 years
    Though valid, the new "var" looks strange in Java :-)
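
The ternary expression from Solution 1, which the comments above explain, can be compared side by side with its if/else equivalent. This is only an illustrative sketch: the class name TernaryDemo and both method names are invented here, and n + 1 is used in place of ++n because only the resulting value matters.

```java
class TernaryDemo {
    // if/else form: returns 1 for a word not seen before, otherwise the old count plus one.
    static Integer nextCountIfElse(Integer n) {
        if (n == null) {
            return 1;
        } else {
            return n + 1;
        }
    }

    // Ternary form of the same logic: (condition) ? (value if true) : (value if false).
    static Integer nextCountTernary(Integer n) {
        return (n == null) ? 1 : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(nextCountTernary(null)); // prints 1
        System.out.println(nextCountTernary(4));    // prints 5
    }
}
```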