Using char[] array in HashMap in java
Solution 1
I have a huge GBs of data to be read and if I use String, I am running out of the heap size.
You will have to think of a different way then.
You could use consider using a
TreeMap<char[], V>
with a customComparator
... but this is a stop-gap measure.You could implement a custom hash table-based
Map
class that usedchar[]
as the key, but did the array hashing and equality tests without using the key object'sequals(Object)
andhashcode()
methods. This also is a stop-gap measure.You could get a bigger machine ... another stop-gap measure.
You could redesign the algorithm so that it doesn't have to put all of the data into a big in-memory hash table in the address space of one Java program.
What you are currently doing doesn't scale. Even if you manage to implement a map using char[]
keys instead of String
keys, you probably only reduce the space used to hold the keys by half. The best this is going to give you is the ability to handle roughly twice the number of keys you can currently handle. Then you hit the wall again.
In the long term, the last option is the best if you want to keep scaling up.
Incidentally @Sean Patrick Floyd's suggestion of interning the key strings will probably only make matters worse. First, you won't ever get Strings that are equal but not == as keys in the same map. So interning doesn't save anything. Second, interning is performed using a private hash table, and the JVM needs to allocate space to represent this table.
The only scenario where interning is potentially worthwhile is if the strings that you are using to to do map lookups are likely to survive multiple GC cycles. Only then could interning save space.
Finally, there's one scenario which may cause your key strings to use far more memory than you realize. Consider this:
BufferedReader br = ...
Map<String, Value> map = new HashMap<String, Value>();
String line;
while ((line = br.readLine()) != null) {
...
String key = line.substring(...);
map.put(key, ...);
}
The problem is that the substring
method shares the same backing char[]
as the original String. If key
is long-lived (which it probably will be), this means that the original large backing array will also be long lived, even though we're only ever going to refer to a slice of that array it via the key
object.
The solution is to write this:
String key = new String(line.substring(...));
which forces the copying of the characters into a new (smaller) character array.
UPDATE - Changes to the implementation of java.lang.String
in Java 7 addressed this problem. The substring
methods now make a copy of the relevant slice of the backing array.
Solution 2
Printing is the least of your poblems, really. The real problem is that char[] does not have meaningful implementation of the equals()
and hashCode()
methods that HashMap
relies on. Therefore the map will not recognize different arrays with the same content as the same keys, which in all probablity defeats your purpose of using a hash map in the first place.
So you need to wrap your arrays in something that does have equals()
and hashCode()
. May I suggest wrapping them in java.lang.String
?
Solution 3
That's because arrays don't have a toString() method. What you are seeing is the Output of Object.toString():
return getClass().getName() + "@" + Integer.toHexString(hashCode());
In your case:
[C = internal name of char[].class
@ = "@"
31ff23 = your array's identity hash code as hey String
=1 = the value
BTW, they don't have a hashCode() implementation either, so they make bad keys for HashMaps (see this similar question about using byte arrays as HashMap keys).
Solution 4
char[]
's .toString()
method (which is used when printing map contents) is not implemented to display the contents of the array - it uses the default toString()
implementation which includes the class name ([C
in this case) followed by the hashcode, which is typically the (hex) memory address in the JVM
You shouldn't use arrays as keys in HashMap
, because they don't implement hashCode()
and equals()
based on the elements. Prefer String
.
![vamsi360](https://i.stack.imgur.com/v43Ha.jpg?s=256&g=1)
Comments
-
vamsi360 about 2 years
I am using a HashMap with a char array as the key. But when I put the key, value into the HashMap and print it the value printed as the key is some strange value. If I use a string instead, it is working fine. Please let me know how to make this work for char[] values.
while((ch = br.read()) != -1) { s = new char[10]; s[i++] = (char)ch; while( (i < 10) && ((ch = br.read()) != -1)) { s[i++] = (char)ch; } //System.out.println(s); hmap.put(s, 1); //System.out.println(hmap); i = 0; }
Key: BZOUA1578L Value: 1
The contents in the hashmap are {[C@31ff23=1} instead of {BZOUA1578L, 1}
-
jmj almost 13 years
char[]
is an Object which has.toString()
but not implemented to show the content as OP is expecting -
vamsi360 almost 13 yearsI have a huge GBs of data to be read and if I use String, I am running out of the heap size.
-
vamsi360 almost 13 yearsBut if I have a string of length 10 and if I use a string, it is not that efficient.
-
Sean Patrick Floyd almost 13 years@Subhash ok, but why do you need to use them as HashMap keys in the first place? Perhaps you could say what you are doing, so we can provide an alternative strategy.
-
vamsi360 almost 13 yearsI am having a fixed length string of 10 chars and I have lots of records of them. If I use a string, I am unable to process all of those as I am running out of heap space. I used a String Buffer to manipulate the data on each iteration but it prints the same values as the char array.
-
Bozho almost 13 yearswhy do you think it is not efficient? I can assure you it is. The string takes up almost as much memory as the array. If you need more, increase it with -Xmx
-
Sean Patrick Floyd almost 13 years@vamsi360 try interning your Strings, that way the non-interned Strings can be garbage collected:
map.put(string.intern(),1)
-
vamsi360 almost 13 yearsException in thread "main" java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOfRange(Unknown Source) at java.lang.String.<init>(Unknown Source) at java.lang.StringBuilder.toString(Unknown Source)
-
Stephen C almost 13 yearsInterning won't help; see my answer.