Java8: Create HashMap with character count of a String
Solution 1
The simplest way to count the occurrences of each character in a string, with full Unicode support (Java 11+) [1]:
String word = "AAABBB";
Map<String, Long> charCount = word.codePoints().mapToObj(Character::toString)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(charCount);
[1] A Java 8 version with full Unicode support is at the end of this answer.
Output
{A=3, B=3}
UPDATE: For Java 8+ (does not support characters from the supplementary planes, e.g. emoji):
Map<String, Long> charCount = IntStream.range(0, word.length())
        .mapToObj(i -> word.substring(i, i + 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
UPDATE 2: Also for Java 8+.
I was mistaken in thinking that codePoints() wasn't added until Java 9. It was added in Java 8 to the CharSequence interface, so it doesn't show in the javadoc for String in Java 8, and it shows as added in Java 9 in later versions of the javadoc.
However, the Character.toString(int codePoint) method wasn't added until Java 11, so in Java 8 we can use chars() together with the Character.toString(char c) method:
Map<String, Long> charCount = word.chars().mapToObj(c -> Character.toString((char) c))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
Or, for full Unicode support including the supplementary planes, we can use codePoints() and the String(int[] codePoints, int offset, int count) constructor in Java 8:
Map<String, Long> charCount = word.codePoints()
        .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
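To make the difference between the chars() and codePoints() variants concrete, here is a minimal Java 8 compatible sketch. The class name CodePointCountDemo and its method names are hypothetical and only wrap the two snippets above; the test string is an assumption chosen to contain a supplementary-plane character.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class CodePointCountDemo {

    // Counts UTF-16 code units, so a supplementary character
    // is split into its two surrogates.
    static Map<String, Long> countByChar(String word) {
        return word.chars()
                .mapToObj(c -> Character.toString((char) c))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Counts Unicode code points, so a supplementary character stays one key.
    static Map<String, Long> countByCodePoint(String word) {
        return word.codePoints()
                .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // 'A', U+1F600, 'A', U+1F600, with the emoji written as surrogate escapes
        String word = "A\uD83D\uDE00A\uD83D\uDE00";
        System.out.println(countByChar(word).size());      // 3 keys: "A" and two lone surrogates
        System.out.println(countByCodePoint(word).size()); // 2 keys: "A" and the emoji
    }
}
```

With chars(), each half of the surrogate pair becomes its own key, which is rarely what is wanted; codePoints() keeps the emoji as a single key.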
Solution 2
String str = "Hello Manash";
Map<Character, Long> hm = str.chars()
        .mapToObj(c -> (char) c)
        .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
System.out.println(hm);
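The two-argument groupingBy makes no guarantee about the concrete Map type it returns. If a HashMap is specifically required, as the question title suggests, the three-argument overload accepts a map factory. The following is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class; names are illustrative only.
class ExplicitMapTypeCount {

    // Forces the result to be a HashMap by passing a map factory.
    static HashMap<Character, Long> countIntoHashMap(String str) {
        return str.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(
                        Function.identity(), HashMap::new, Collectors.counting()));
    }

    // Same idea with a LinkedHashMap, which keeps keys in first-appearance order.
    static LinkedHashMap<Character, Long> countInEncounterOrder(String str) {
        return str.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countIntoHashMap("Hello Manash"));
        System.out.println(countInEncounterOrder("Hello Manash"));
    }
}
```

Swapping the factory for TreeMap::new would give keys in sorted order instead.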
Solution 3
Try the below approaches:
Approach 1:
String str = "abcaadcbcb";
Map<Character, Integer> charCount = str.chars()
        .boxed()
        .collect(Collectors.toMap(
                k -> (char) k.intValue(),
                v -> 1, // 1 occurrence
                Integer::sum));
System.out.println("Char Counts:\n" + charCount);
Approach 2:
String str = "abcaadcbcb";
Map<Character, Integer> charCount = new HashMap<>();
for (char c : str.toCharArray()) {
    charCount.merge(c,     // key = char
            1,             // value to merge
            Integer::sum); // counting
}
System.out.println("Char Counts:\n" + charCount);
Output:
Char Counts:
{a=3, b=3, c=3, d=1}
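Both approaches above produce Integer counts, whereas Collectors.counting() produces Long. As a possible alternative, not taken from the answer itself, groupingBy can be combined with Collectors.summingInt to keep Integer values. The class and method names below are hypothetical.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class; names are illustrative only.
class IntegerCharCount {

    // Groups by character and adds 1 per occurrence, giving Integer counts.
    static Map<Character, Integer> count(String str) {
        return str.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(c -> c, Collectors.summingInt(c -> 1)));
    }

    public static void main(String[] args) {
        System.out.println("Char Counts:\n" + count("abcaadcbcb"));
    }
}
```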
Solution 4
String str = "abcaadcbcb";
Map<String, Long> charCount = Arrays.stream(str.split(""))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
OTUser
Updated on July 19, 2022
Comments
-
OTUser almost 2 years
I'm wondering whether there is a simpler way to compute the character count of a given string than the code below.
String word = "AAABBB";
Map<String, Integer> charCount = new HashMap();
for(String charr: word.split("")){
    Integer added = charCount.putIfAbsent(charr, 1);
    if(added != null)
        charCount.computeIfPresent(charr,(k,v) -> v+1);
}
System.out.println(charCount);
-
nice_dev about 5 years
For ANSI characters, you can just use an array of size 256 to compute the counts.
-
Andreas about 5 years
@vivek_23 Which ANSI character set would that be? Or did you mean ASCII and 128?
-
Holger almost 4 years
@vivek_23 That is the Windows code page 1252, not ANSI. The Unicode standard matches the ISO Latin-1 character set for the first 256 code points. Referring to the Windows code page 1252 is an unnecessary complication, as that code page does not match in the 128-159 range.
-
nice_dev almost 4 years
@Holger Ahh! Thanks for the correction. Deleted my previous comment to avoid confusion.
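The array idea from this exchange can be sketched as follows, assuming the input only contains the first 256 code points (the Latin-1 range Holger mentions). The class name, the method name, and the choice to reject other characters with an exception are all assumptions made for illustration.

```java
// Hypothetical helper class; names are illustrative only.
class Latin1CharCount {

    // Returns an array where index i holds the number of occurrences of (char) i.
    // Assumes every character is below 256; anything else is rejected.
    static int[] count(String s) {
        int[] counts = new int[256];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 256) {
                throw new IllegalArgumentException("Character outside Latin-1 at index " + i);
            }
            counts[c]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("AAABBB");
        // Print only the characters that actually occur.
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                System.out.println((char) i + "=" + counts[i]);
            }
        }
    }
}
```

This avoids boxing and hashing entirely, at the cost of not handling arbitrary Unicode input.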
-
-
OTUser about 5 years
I'm sorry, is there a simple way for Java 8?
-
Andreas about 5 years
chars() requires Java 9, and a better solution using codePoints() instead of chars() was already posted 13 minutes earlier.
-
mm6 about 5 years
-
Andreas about 5 years
That would be CharSequence.chars(), not String.chars(), but I accept your correction. The Javadoc for Java 11 shows the method as added to String in Java 9, which is what led me astray.
-
Holger almost 4 years
charCount.put(charr,charCount.getOrDefault(charr,0)+1); can be simplified to charCount.merge(charr, 1, Integer::sum); By the way, you should use new HashMap<>() …
-
Holger almost 4 years
Speaking of “full Unicode support” and Emojis, it’s worth pointing out that even using code points does not necessarily provide the intended semantics. E.g.
"ā̧👩🇮🇩"
has 10 chars, 7 codepoints, but only three characters; the first one demonstrates that this is not only an Emoji issue. The only solution I currently know of is to process grapheme clusters, e.g. with Java 9+: Pattern.compile("\\X").matcher(example).results().collect(Collectors.groupingBy(MatchResult::group, Collectors.counting())).
-
user2901351 over 2 years
Might you format your code snippet as code, to allow for greater readability? Thanks.
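For reference, the grapheme-cluster suggestion from Holger's comment can be wrapped into a runnable sketch like the one below. It requires Java 9+ because of Matcher.results(). The class name, the method name, and the sample string (two occurrences of "e" followed by a combining acute accent, then "a") are assumptions made for illustration, and how \X treats complex emoji sequences may vary between Java versions.

```java
import java.util.Map;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class; names are illustrative only.
class GraphemeCount {

    // \X matches one Unicode extended grapheme cluster.
    private static final Pattern GRAPHEME = Pattern.compile("\\X");

    // Counts user-perceived characters rather than chars or code points.
    static Map<String, Long> count(String text) {
        return GRAPHEME.matcher(text).results()
                .collect(Collectors.groupingBy(MatchResult::group, Collectors.counting()));
    }

    public static void main(String[] args) {
        // "e" + U+0301 (combining acute accent) forms one grapheme cluster.
        String example = "e\u0301e\u0301a";
        System.out.println(count(example).size()); // 2 distinct clusters
    }
}
```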