Out of curiosity, after reading some articles on how the HashSet<T> class (introduced in .NET 3.5) outperforms the List<T> class for set operations, I ran some experiments of my own to get a feel for just how much faster a HashSet is, and under what circumstances.
Also, whilst there’s been much comparison between HashSet<T> and List<T>, I have found nothing on how HashSet<T> fares against Dictionary<TKey, TValue> in performance terms, so I’ll factor that into consideration too!
Using a HashSet, a List and a Dictionary, each holding either integers or instances of a simple reference type, I ran the following tests:
Test 1: add 1000000 value type objects without checking for duplicates
Test 2: add 1000000 reference type objects without checking for duplicates
Test 3: run Contains() method against half the objects in a list of 10000 value type objects
Test 4: run Contains() method against half the objects in a list of 10000 reference type objects
Test 5: remove half the objects in a list of 10000 value types
Test 6: remove half the objects in a list of 10000 reference types
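To give a rough idea of what a test like Test 1 involves, here is a minimal sketch of a timing harness built on Stopwatch. It is not the harness from the downloadable source; the class and method names (AddBenchmarkSketch, Time, FillList and so on) are hypothetical, and using the integer as both key and value for the Dictionary is an assumption made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper, not the original test code.
public static class AddBenchmarkSketch
{
    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Adds `count` distinct integers to a List<int>.
    public static List<int> FillList(int count)
    {
        var list = new List<int>();
        for (var i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list;
    }

    // Adds `count` distinct integers to a HashSet<int>.
    public static HashSet<int> FillHashSet(int count)
    {
        var set = new HashSet<int>();
        for (var i = 0; i < count; i++)
        {
            set.Add(i);
        }
        return set;
    }

    // A Dictionary needs a key for every value; as an assumption,
    // the integer is used as both key and value here.
    public static Dictionary<int, int> FillDictionary(int count)
    {
        var dictionary = new Dictionary<int, int>();
        for (var i = 0; i < count; i++)
        {
            dictionary.Add(i, i);
        }
        return dictionary;
    }
}
```

A single measurement would then look something like `AddBenchmarkSketch.Time(() => AddBenchmarkSketch.FillList(1000000))`. One run is noisy, so in practice you would want to repeat each measurement several times and discard the first run, which includes JIT compilation.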
The objective is to find out:
- how the three constructs perform for each of these basic operations
- how the performance differs for value and reference types
Test 1 and Test 2
The List type is the clear winner here, which is no real surprise given that both HashSet and Dictionary have to ensure that there are no duplicates. What is surprising, though, is how much more overhead you incur when dealing with reference types!
Test 3 and Test 4
They say hash lookups are fast and it’s no lie! Interestingly though, looking for a matching reference type in the values of a Dictionary proved to be much slower than doing the same thing in a List.
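The different lookup paths can be sketched as follows. This is an illustration rather than the original test code: the ContainsSketch class and its method names are hypothetical, and it assumes the Dictionary test searched the values via ContainsValue. List<T>.Contains and Dictionary.ContainsValue both walk the collection item by item, whereas HashSet<T>.Contains and Dictionary.ContainsKey go through the hash table.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not the original test code.
public static class ContainsSketch
{
    // Works for both List<T> (linear scan per call) and
    // HashSet<T> (hash lookup per call).
    public static int CountHits<T>(ICollection<T> items, IEnumerable<T> probes)
    {
        var hits = 0;
        foreach (var probe in probes)
        {
            if (items.Contains(probe))
            {
                hits++;
            }
        }
        return hits;
    }

    // Hash lookup against the keys of the Dictionary.
    public static int CountKeyHits(Dictionary<int, int> items, IEnumerable<int> probes)
    {
        var hits = 0;
        foreach (var probe in probes)
        {
            if (items.ContainsKey(probe))
            {
                hits++;
            }
        }
        return hits;
    }

    // ContainsValue cannot use the hash table, so every call
    // is a linear search over the stored values.
    public static int CountValueHits(Dictionary<int, int> items, IEnumerable<int> probes)
    {
        var hits = 0;
        foreach (var probe in probes)
        {
            if (items.ContainsValue(probe))
            {
                hits++;
            }
        }
        return hits;
    }
}
```

If a Dictionary test goes through ContainsValue, it would not be expected to beat a List, since neither lookup benefits from hashing.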
Test 5 and Test 6
The power of hash lookup strikes again!
You can download the source code for the tests here.
The results I posted here suggest that the HashSet and Dictionary types generally perform better than List, whose faster speed at adding new items is greatly offset by deficits in other common operations. However, it’s important to remember that, depending on your use case, the type of collection you should use normally picks itself: use a List if you just need to keep track of items; use a Dictionary if you require hash lookup against some value (an ID for your objects perhaps?); use a HashSet if you need to perform set operations (e.g. set comparison, determining subset/superset relationships) frequently; and so on.
In practice, the difference in your application’s overall performance resulting from using a different collection type is trivial and should not dictate which collection type you use UNLESS proven otherwise via profiling!
Also, you should be mindful of other differences between the three types, in terms of both behaviour and functionality, for instance:
- HashSet.Add will skip a new item if it’s deemed equal to one of the existing items and return false.
- Dictionary.Add will throw an exception if the new key being added is deemed equal to one of the existing keys. However, if you use the Dictionary’s indexer instead, it will replace the existing value if the new key is deemed equal to an existing key.
- List.Add will simply add the same item twice.
- HashSet provides some very useful methods such as IsSubsetOf and Overlaps. Both can be achieved on the other collection types using LINQ, but HashSet provides an optimized, ready-made solution.
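The behavioural differences listed above can be checked with a small sketch like the one below. The DuplicateHandlingSketch class and its method names are hypothetical and exist only for illustration; the collection calls themselves (Add, the indexer, IsSubsetOf, Overlaps) are the standard System.Collections.Generic APIs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class DuplicateHandlingSketch
{
    // HashSet<T>.Add returns false for a duplicate and leaves the set unchanged.
    public static bool SecondAddToHashSetSucceeds()
    {
        var set = new HashSet<int>();
        set.Add(42);
        return set.Add(42);
    }

    // Dictionary.Add throws ArgumentException when the key already exists.
    public static bool DictionaryAddThrowsOnDuplicateKey()
    {
        var dictionary = new Dictionary<int, string> { { 1, "first" } };
        try
        {
            dictionary.Add(1, "second");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // The indexer silently replaces the value stored under an existing key.
    public static string ValueAfterIndexerAssignment()
    {
        var dictionary = new Dictionary<int, string> { { 1, "first" } };
        dictionary[1] = "second";
        return dictionary[1];
    }

    // List<T>.Add happily stores the same item twice.
    public static int ListCountAfterAddingTwice()
    {
        var list = new List<int>();
        list.Add(42);
        list.Add(42);
        return list.Count;
    }

    // Built-in set operations on HashSet<T>.
    public static bool SmallIsSubsetOfLarge()
    {
        var small = new HashSet<int> { 1, 2 };
        var large = new HashSet<int> { 1, 2, 3 };
        return small.IsSubsetOf(large);
    }

    public static bool SetsOverlap()
    {
        var left = new HashSet<int> { 1, 2 };
        return left.Overlaps(new[] { 2, 3 });
    }
}
```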