HashSet vs List vs Dictionary

Yan Cui


Out of curiosity, after reading some articles on how the HashSet<T> class (introduced in .NET 3.5) performs better than the List<T> class for set operations, I set about running some experiments of my own to get a feel for just how much faster a HashSet is, and under what circumstances.

Also, whilst HashSet<T> has often been compared with List<T>, I have found nothing on how HashSet<T> fares against Dictionary<TKey, TValue> in performance terms, so I’ll include that in the comparison too!

Using a HashSet, a List and a Dictionary of integers and of a simple reference type, I ran the following tests:

Test 1: add 1,000,000 value type objects without checking for duplicates

Test 2: add 1,000,000 reference type objects without checking for duplicates

Test 3: run the Contains() method against half the objects in a collection of 10,000 value type objects

Test 4: run the Contains() method against half the objects in a collection of 10,000 reference type objects

Test 5: remove half the objects in a collection of 10,000 value type objects

Test 6: remove half the objects in a collection of 10,000 reference type objects

The objective is to find out:

how the three constructs perform for each of these basic operations
how the performance differs for value and reference types
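
As a rough illustration of how tests like these can be timed, here is a minimal sketch for the integer cases. It is not the original test code: the class and method names (CollectionTimings, TimeMs, FillList and so on) are made up for this example, and a single Stopwatch run per operation is only a crude measurement.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical harness, not the original test code.
public static class CollectionTimings
{
    // Runs the action once and returns the elapsed time in milliseconds.
    public static long TimeMs(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    // Test 1 style: add `count` integers with no explicit duplicate check.
    public static List<int> FillList(int count)
    {
        var list = new List<int>();
        for (var i = 0; i < count; i++) list.Add(i);
        return list;
    }

    public static HashSet<int> FillHashSet(int count)
    {
        var set = new HashSet<int>();
        for (var i = 0; i < count; i++) set.Add(i);
        return set;
    }

    public static Dictionary<int, int> FillDictionary(int count)
    {
        var dict = new Dictionary<int, int>();
        for (var i = 0; i < count; i++) dict.Add(i, i);
        return dict;
    }

    // Test 3 style: call Contains for every second value, i.e. half the items.
    public static int CountHits(ICollection<int> items, int count)
    {
        var hits = 0;
        for (var i = 0; i < count; i += 2)
        {
            if (items.Contains(i)) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        const int addCount = 1000000;
        Console.WriteLine("List add:         {0} ms", TimeMs(() => FillList(addCount)));
        Console.WriteLine("HashSet add:      {0} ms", TimeMs(() => FillHashSet(addCount)));
        Console.WriteLine("Dictionary add:   {0} ms", TimeMs(() => FillDictionary(addCount)));

        const int lookupCount = 10000;
        var list = FillList(lookupCount);
        var set = FillHashSet(lookupCount);
        Console.WriteLine("List contains:    {0} ms", TimeMs(() => CountHits(list, lookupCount)));
        Console.WriteLine("HashSet contains: {0} ms", TimeMs(() => CountHits(set, lookupCount)));
    }
}
```

The timings will vary from machine to machine, so treat any single run as indicative only; repeating each test several times and averaging gives a steadier picture.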

Test Results

Test 1 and Test 2

The List type is the clear winner here, which is no real surprise given that both HashSet and Dictionary ensure that there are no duplicates. What’s surprising, though, is how much more overhead you incur when dealing with reference types!

Test 3 and Test 4

They say hash lookups are fast and it’s no lie! Interestingly though, looking for a matching reference type in the values of a Dictionary proved to be much slower than doing the same thing in a List.
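
To make the distinction between key and value lookups concrete, here is a small sketch. It assumes the value search is done with Dictionary.ContainsValue, which performs a linear search over the entries, whereas ContainsKey uses the hash of the key. The Item class and the LookupSketch name are hypothetical and only serve this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simple reference type; it does not override Equals,
// so two instances are equal only if they are the same object.
public sealed class Item
{
    public int Id { get; private set; }
    public Item(int id) { Id = id; }
}

public static class LookupSketch
{
    // Builds a dictionary that maps each ID to an Item with that ID.
    public static Dictionary<int, Item> Build(int count)
    {
        var dict = new Dictionary<int, Item>();
        for (var i = 0; i < count; i++) dict.Add(i, new Item(i));
        return dict;
    }

    public static void Main()
    {
        var dict = Build(10000);
        var target = dict[9999];

        // Hash lookup on the key: does not depend on walking the entries.
        Console.WriteLine(dict.ContainsKey(9999));

        // Linear search over the values: cost grows with dict.Count.
        Console.WriteLine(dict.ContainsValue(target));
    }
}
```

If you find yourself searching a Dictionary by value often, it may be worth keying the Dictionary on that value instead, or keeping a HashSet of the values alongside it.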

Test 5 and Test 6

The power of hash lookup strikes again!

Source Code

You can download the source code for the tests here.

Parting Thoughts…

The results I posted here suggest that the HashSet and Dictionary types generally perform better than List, whose faster speed at adding new items is greatly offset by deficits in other common operations. However, it’s important to remember that, based on your use case, the type of collection you should use normally picks itself: use a List if you just need to keep track of items; use a Dictionary if you require hash lookup against some value (an ID for your objects perhaps?); use a HashSet if you need to perform set operations (e.g. set comparison, determining subset/superset relationships) frequently; and so on.

In practice, the difference in your application’s overall performance resulting from using a different collection type is trivial and should not dictate which collection type you use UNLESS proven otherwise via profiling!

Also, you should be mindful of other differences between the three types, in terms of both behaviour and functionality, for instance:

HashSet.Add will skip a new item if it’s deemed equal to one of the existing items and return false.
Dictionary.Add will throw an exception if the new key being added is deemed equal to one of the existing keys. However, if you use the Dictionary’s indexer instead, it will replace the existing value if the new key is deemed equal to an existing key.
List.Add will simply add the same item twice.
HashSet provides some very useful methods such as IsSubsetOf and Overlaps. Both can be achieved on the other collection types using LINQ, but HashSet provides an optimized, ready-made solution.
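
The differences listed above can be seen in a few lines of code. This is a minimal sketch; the class and method names (DuplicateHandling and its helpers) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class illustrating how each type treats duplicates.
public static class DuplicateHandling
{
    // HashSet.Add returns false when an equal item is already present.
    public static bool SecondHashSetAdd()
    {
        var set = new HashSet<int>();
        set.Add(1);
        return set.Add(1);
    }

    // Dictionary.Add throws ArgumentException for an existing key.
    public static bool DictionaryAddThrows()
    {
        var dict = new Dictionary<int, string> { { 1, "first" } };
        try
        {
            dict.Add(1, "second");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // The indexer overwrites the value stored under an existing key.
    public static string IndexerResult()
    {
        var dict = new Dictionary<int, string> { { 1, "first" } };
        dict[1] = "second";
        return dict[1];
    }

    // List.Add keeps both copies.
    public static int ListCountAfterTwoAdds()
    {
        var list = new List<int>();
        list.Add(1);
        list.Add(1);
        return list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(SecondHashSetAdd());      // False
        Console.WriteLine(DictionaryAddThrows());   // True
        Console.WriteLine(IndexerResult());         // second
        Console.WriteLine(ListCountAfterTwoAdds()); // 2

        // Set operations that HashSet offers out of the box.
        var small = new HashSet<int> { 1, 2 };
        var big = new HashSet<int> { 1, 2, 3 };
        Console.WriteLine(small.IsSubsetOf(big));                // True
        Console.WriteLine(small.Overlaps(new[] { 2, 5 }));       // True
    }
}
```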


17 thoughts on “HashSet vs List vs Dictionary”

Grahame Scott-Douglas
April 10, 2012 at 7:50 pm

Thanks for doing this. I was wondering how HashSet stacks up against List. Now you’ve answered it.
Fabio
July 8, 2012 at 2:31 pm

Cool! Thanks for the insight!
Ruperto Leon
October 11, 2012 at 1:26 am

Excellent, Thanks!
Sagar
May 24, 2013 at 11:48 am

Excellent article Sir, Thank you
Andrew
August 27, 2013 at 10:12 am

Nice and clear.
Bhavik
December 2, 2014 at 4:54 am

good clarification with example. like to share your article in my blog http://www.dotnetspan.com
Yan Cui
December 2, 2014 at 11:07 am

Sure, feel free to share :)
John
May 2, 2015 at 5:48 am

All theory here, but the reason why Add is less expensive on lists might be because lists use an array under the hood while dictionaries and hashsets use hashtables. Arrays can be easily resized, while hashtables have to recalculate the hash or something. Have you tried the test with pre-set capacities?
Yan Cui
May 2, 2015 at 9:09 am

Both List and Dictionary have to copy data around when resizing and Dictionary has the additional overhead of recalculating the hash.

Check out this other benchmark I did a while back: https://theburningmonk.com/2011/12/performance-test-sorteddictionary-vs-dictionary-vs-map-vs-array/
In there you can see that adding 1m items to an empty Dictionary is twice as slow as adding them to a Dictionary with pre-set capacity.

Here’s all the benchmarks I’ve done so far:
https://theburningmonk.com/benchmarks/
John
May 2, 2015 at 7:47 pm

Wow, you’re very thorough! Thanks for saving all of us a bunch of time and teaching me.
Yan Cui
May 2, 2015 at 8:23 pm

Thank you! :-)
Sam
December 4, 2017 at 11:41 pm

”
However, it’s important to remember that based on your use case the type of collection you should use normally picks itself – use a list if you just need a List to keep track of items; use a Dictionary if you require hash lookup against some value (an ID for your objects perhaps?); use a hash set if you need to perform set operations (e.g. set comparison, determine subset/superset relationship) frequently, and so on.

In practice, the difference in your application’s overall performance resulting from using a different collection type is trivial and should not dictate which collection type you use UNLESS proven otherwise via profiling!
”

Bingo.
