One of the fascinating things about our data structures is that even though we’re so familiar with them, it’s still hard for us to tell exactly how many items something as basic as a HashMap can hold in a 1GB of memory. We may learn this at school, from senior developers, or the hard way when our servers collapse due to poor choice of data structures.


So I decided to do just that. I took ~20 of the most popular Java, Scala, Guava and Trove collections and tested to see how many random ints each of them can hold in a JVM with 1GB of memory (via -Xmx). For each data structure, we appended ints until we received an OutOfMemoryError, which ended the test. We ran each test 5 times on JDK 7 to ensure consistency. For such a fundamental test I found some of  the results to be quite surprising.

* I assume we’re all on the same page in that this isn’t a competition. Different collections have different semantics. I also didn’t include times, as the focus isn’t on micro-benchmarking performance, but on giving us a sense of how much the collections we use everyday can actually hold.

The Results

  • Scala’s collections have greater capacity than Java’s. Scala collections seem to be slightly more efficient than their Java counterparts. While some collections such as TreeSets performed roughly the same, others such as Scala’s HashMaps were able to hold close to 15% more items. HashSets were able to hold 8% more items than their Java counterparts. I’d love to hear from the community why they think that is. Scala’s ArrayBuffer has a slight advantage over its ArrayList counterpart.
  • The exception to that is Scala’s linked list which was able to hold 18% less data than Java’s LinkedList. A constraint here was that in order to append to the list it needed to receive another linked list instead of the direct value. Even so, that in itself should not affect the list’s capacity for holding within the 1GB JVM assuming the temporary list is GC’d.

how much data_3 (1)

* We’re counting the number of ints held, which means 2 ints per map insertion.

  • If you’re not using Trove you’re missing out. We’ve been using Trove’s collections on OverOps backend since day one. Well, sadly not from day one, as that would have saved us quite a lot of time optimizing server code. With TIntArrayList able to hold 250% of its Scala and Java boxed comparable, the numbers kinda speak for themselves.
  • A similar ratio is observed for maps. It was quite surprising to see that Trove maps outperform Java and Scala lists by over 50% capacity. Remember that you also have Trove collections mapping between an int to an Object and vice versa, so you don’t need a fully primitive map to enjoy the capacity benefits.
  • Even so, we still saw that Trove’s TIntLinkedList can hold less data than a Java ArrayList or a Scala ArrayBuffer who box their primitives. This really makes you look at a linked list which is used heavily in your code and reconsider – do I absolutely need it?
  • Since there’s really no overhead in using Trove collections vs. standard library ones, I wouldn’t categorize using them as an “optimize on-demand” scenario. That’s because memory consumption bugs usually manifest under scale, where they’re the hardest to find and you have to start production debugging (of course Trove won’t save you from an inherently inefficient algorithm). It can sometime be the difference between analyzing a core dump and watching a Giants game.
  • Guava – big maps, small sets. Guava’s multi sets are more costly than their Java and Scala counterparts. Compared to Scala’s set they were able to hold 19% less data. The flip side to this of course is that they can do things that aren’t possible using standard set semantics. Just make sure you need those if multisets are going to play a big part in your memory structure.
  • With multimaps we see quite the reverse. Guava’s MultiHashMap was able to hold more than 20% more values than Java’s hashmap and 10% more than Scala’s. It was pretty counterintuitive to see that while Guava’s multisets underperform their Java and Scala equivalents in terms of capacity, MultiMaps actually outperform both Java and Scala.

There are a ton more collections types out there (queues, stacks, etc..), but I wanted to start with the basics and add more per popular demand. So if you want to see another collection on the list or want to shed some more light on some of the differences in capacity – the comments section is below and you know what to do. The code is available here.


OverOps detects all your exceptions and errors and tells you why they happen. Even across multiple threads and machines. Installs in 1min. Less than 2% overhead – Try it free

Tal is the CTO of OverOps. Tal has been designing scalable, real-time Java and C++ applications for the past 15 years. He still enjoys analyzing a good bug though, and instrumenting code. In his free time Tal plays Jazz drums.
  • Luc Bourlier

    Nice post. Could you precise if you are using the mutable or immutable versions of the Scala collections? It looks like they are the mutable ones, as LinkedList is used.

    I’d love to see the results for the immutable ones, especially List and Vector.

    • Tal Weiss

      Thanks for the feedback.

      We’ve edited the post to clarify that we appended ints to the data structures. This approach can’t be used when measuring immutable data structures.

      A different test can be made by generating predefined int lists in different sizes and then initializing immutable structures with them to get to the memory limit. If you’re interested in running the test, I’d be happy to link and share your results.

  • Petr Janeček

    Great post, thanks. I would also love to see how Goldman-Sachs Collections would do. They are a complete replacement of the JDK collections and are designed with memory consuption in mind.

    Also, could you please share your testing code? You say you tested using simple appending. With that, a lot of the differences could be explained by the (re)sizing strategy – the JDK collection tend to grow 2 times in their capacity when max capacity is reached. Could you try to pre-size them to try to achieve maximum possible capacity?

    • Tal Weiss

      Thanks for the tip Petr, we will run the benchmark with GS collections and update the post with our findings.

      The testing code was added to the post and is available on GitHub.

      Re capacity, our aim was to test the data structures as most users use them. That’s the reason we didn’t optimize them.

      • Petr Janeček

        This, however, means, that your results will probably vary depending on the memory cap used. If you used a slightly higher memory cap, some of the collections might be able to resize their internal array and score much more. Could you try with range from 1 to 2 GB with intervals, say, 100 MB? I would like to see if it makes a difference sometimes.

        • Tal Weiss

          That’s a valid point. I agree that the results could vary, and we might do a followup post using ranges of intervals. In the meantime, feel free to fork the code on GitHub and test it out.

  • Karel Čížex

    This test seems a bit fishy. I don’t want to believe that you can fit only 10M ints in 1GB heap in case of Trove array list.

    Trove has initial capacity of 10 and on every resize shifts this number to the left to determine new capacity. And 10 << 20 is 10485760. That means in your test you hit OOM on 21st resize. In this moment you have only 10M and 20M element arrays in memory and it gives only 120MB of data.

    I can run test up to 83886080 elements (335MB) before it OOMs.

    It is most likely caused by multi generational garbage collector. It splits heap to regions (eden/survivor/old) and no region is big enough for this huge array, hence small reported capacity. This also explin why "fragmented" datastructures that don't need continuous arrays have bigger reported capacity – they just fill all cracks in these regions.

    So perhaps this don't test memory needs of datastructures but how badly multigenerationl GC interacts with huge continuous arrays.

    • Tal Weiss

      Hi Karl,
      Your analysis is correct, and as I stated in the comments, the point of the post was not to benchmark, tweak and/or optimize a Java process, but rather use the most basic memory management argument for Java, Xmx, which we believe most users use. Your deep analysis is great and should be taken into account when trying to get more out of your data structures.

    • vitavonni

      First of all, I believe that other array-backed datastructures (in particular: ArrayList) will fail similarly. Second, it won’t change a lot. It’s pretty clear that for primitive types, you should be using Trove primitive collections.

  • Parallel Java Collections

    I’ve developed a Java parallel collections library (PJC) that improves the performance of the Java Collections. At this moment, thery are only implemented ArrayList and HashMap, and HashSet is being implemented.
    You can download the source code and the benchmarks to probe Java, PJC, Trove and GoldmanSachs Collections here:
    The documentation (at this moment, only in Spanish, but the graphics are easily understood) is available here: . There, you can see the results of the execution of the benchmarks.
    The project has also been uploaded to GitHub:
    If someone has any question, doubt or suggestion, can contact with me by email: