How to store 100 million integers in a Java array?
Solution 1
Use an associative array. The key is an integer, and the value is the count (the number of times the integer has been added to the list).
This should get you some decent space savings if the distribution is relatively random, much more so if it's not.
Solution 2
If you need to store 1 billion completely random integers then I am afraid that you really do need to corresponding space, i.e. about 4GB of memory for 32-bit int
numbers. You can try increasing the JVM heap space but you need to have a 64-bit OS and at least as much physical memory - and there is only so far that you can go.
On the other hand, you might be able to store those number more efficiently if you can make use of specific constraints within your application.
E.g. if you only need to know if a specific int
is contained in a set, you could get away with a bit set - i.e. a single bit for each value in the int
range. That is about 4 billion bits, i.e. 512 MB - a far more reasonable space requirement. For example, a handful of BitSet objects could cover the whole 32-bit integer range without you having to write any bit-handling code...
Solution 3
May be using memory mapped files will help? They are not allocated from the heap. Here is an article how to create a matrix. An array should be easier.
Using a memory mapped file for a huge matrix - Peter Lawrey
user2035996
Updated on June 06, 2022Comments
-
user2035996 about 2 years
I am working on a small task where I am required to store around 1 billion integers in an Array. However, I am running into a heap space problem. Could you please help me with this?
Machine Details : Core 2 Duo Processor with 4 GB RAM. I have even tried -Xmx 3072m . Is there any work around for this? The same thing works in C++ , so there should definitely be a way to store this many numbers in memory.
Below is the code and the exception I am getting :
public class test { private static int C[] = new int[10000*10000]; public static void main(String[] args) { System.out.println(java.lang.Runtime.getRuntime().maxMemory()); } }
Exception : Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at test.(test.java:3)
-
Audrius Meškauskas over 11 yearsRAM is not the only memory we have. There is also swap.
-
thkala over 11 years@AudriusMeškauskas: yes, but as far as I know, the JVM will absolutely refuse to allocate more heap that what is available as physical RAM...
-
Audrius Meškauskas over 11 yearsGood idea. Really, it should be many repetitive values in the array of this size.
-
thkala over 11 years@Dave: a Hashtable, at least as implemented in the Java libraries is one of the worst methods w.r.t. space efficiency. Your approach would only make sense when you have a only few integers with high appearance frequency...
-
Audrius Meškauskas over 11 yearsI am not convinced. Where have you seen this documented? Difference between RAM and swap kind of should be transparent for the program. But probably will be slow.
-
thkala over 11 years@Dave: that's the point: most associative arrays have a significant overhead per-entry. Unless a lot of the integers are repeated, a primitive
int[]
may still be better... -
Dave over 11 years@thkala: it's only better until you run out of space :)
-
thkala over 11 years@AudriusMeškauskas: Sorry, this indeed does seem to be the case in modern JVMs. My JVM will happily start with
-Xmx80g
which is actually more than both RAM+swap on my system... -
thkala over 11 yearsOf course, depending on what one is doing with all those
int
s, using swap space might be little better than doing the computations by hand :-) -
phuclv about 10 yearsthe address space of a 32-bit process is only 2GB (3GB with large address aware enabled)