I stumbled upon this interesting question on StackOverflow today. Jon Harrop’s answer mentions a significant overhead in adding to and iterating over a SortedDictionary and a Map compared to using simple arrays.
Thinking about it, this makes sense: the SortedDictionary class keeps its constituent key-value pairs sorted by key, which naturally incurs some performance overhead on every insert.
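F#’s Map construct, on the other hand, is immutable, and adding an item to a Map returns the resulting Map – a new instance of Map which includes all the items from the original Map instance plus the newly added item. Map is implemented as a balanced binary tree, so the new instance can share most of its nodes with the original rather than copying everything, but each add still has to compare keys, allocate new nodes along the path to the inserted item and rebalance the tree. As you can imagine, that adds up to a noticeable performance hit when you’re building a large map.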
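A minimal sketch of that behaviour, using made-up values: `Map.add` hands back a new Map and leaves the original untouched.

```fsharp
// Start with a two-item map.
let original = Map.ofList [ (1, "one"); (2, "two") ]

// Map.add does not modify `original`; it returns a new Map.
let extended = original |> Map.add 3 "three"

printfn "original has %d items" original.Count   // original has 2 items
printfn "extended has %d items" extended.Count   // extended has 3 items
```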
A related cost shows up when using List.append (or, equivalently, the @ operator) on lists, which does copy the data in the first list. More on that in another post.
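For illustration, here is a small sketch showing that the two forms of appending are equivalent. The list contents are arbitrary.

```fsharp
let first = [ 1; 2; 3 ]
let second = [ 4; 5 ]

// Both forms build a new list; the cells of `first` are copied,
// while `second` is reused as the tail of the result.
let viaAppend = List.append first second
let viaOperator = first @ second

printfn "%A" viaAppend                    // [1; 2; 3; 4; 5]
printfn "%b" (viaAppend = viaOperator)    // true
```

Prepending a single item with `::`, by contrast, reuses the existing list as the tail and copies nothing.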
Anyhow, the question piqued my interest and I had to test it out and get some quantitative numbers for myself, and I was also interested in seeing how the standard Dictionary class does compared to the rest. :-)
The test code is very simple; feel free to take a look here and let me know if the tests are unfair in any way. In short, the test was to add 1,000,000 items and then iterate over them with each type of construct, recording the time each step took.
The results are below. The times are recorded in seconds and averaged over 5 runs.
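As a rough sketch of what such a test might look like for the Dictionary case (this is not the linked test code, and the `time` helper is a hypothetical name of my own), a Stopwatch can be wrapped around each step:

```fsharp
open System.Collections.Generic
open System.Diagnostics

// Hypothetical helper: runs an action and returns its result
// together with the elapsed time in seconds.
let time (action : unit -> 'a) =
    let stopwatch = Stopwatch.StartNew()
    let result = action ()
    stopwatch.Stop()
    result, stopwatch.Elapsed.TotalSeconds

let itemCount = 1000000

// Step 1: time adding the items.
let dict, addSeconds =
    time (fun () ->
        let d = Dictionary<int, int>()
        for i in 1 .. itemCount do
            d.Add(i, i)
        d)

// Step 2: time iterating over every key-value pair.
let total, iterateSeconds =
    time (fun () ->
        let mutable sum = 0L
        for KeyValue (_, value) in dict do
            sum <- sum + int64 value
        sum)

printfn "Dictionary: add %.3fs, iterate %.3fs" addSeconds iterateSeconds
```

The same two steps can be repeated for the other constructs, averaging over several runs to smooth out noise.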
Aside from the fact that the Map construct did particularly poorly in these tests, it was interesting to see that initializing a Dictionary instance with sufficient capacity to begin with allowed it to perform twice as fast!
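The only difference between the two Dictionary cases is the constructor call. A minimal sketch, assuming int keys and values:

```fsharp
open System.Collections.Generic

let itemCount = 1000000

// Default constructor: the internal array grows as items are added.
let growing = Dictionary<int, int>()

// Capacity supplied up front: the internal array is sized once.
let presized = Dictionary<int, int>(itemCount)

for i in 1 .. itemCount do
    growing.Add(i, i)
    presized.Add(i, i)

printfn "%d %d" growing.Count presized.Count   // 1000000 1000000
```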
To understand where that performance boost came from, you need to understand that a Dictionary uses an internal array of entry objects (see below) to keep track of what’s in the dictionary:
When that internal array fills up, it is replaced with a bigger array. The size of the new array is, roughly speaking, the smallest prime number that’s >= twice the current capacity, although the implementation only picks from a cached array of 72 prime numbers: 3, 7, 11, 17, 23, 29, 37, 47, … 7199369.
So when I initialized a Dictionary without specifying its capacity (hence capacity = 0) and proceeded to add 1 million items, it had to resize its internal array 18 times, with each resize costing more than the last because there is more data to move.
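To get a feel for how that growth rule plays out, here is a rough simulation. It is only a sketch under stated assumptions: it starts from a capacity of 3 and always picks the true smallest prime >= twice the capacity, whereas the real implementation chooses from its cached table, so the exact count may differ slightly. The function names are my own.

```fsharp
// Naive primality test; fine for the sizes involved here.
let isPrime n =
    n >= 2 && Seq.forall (fun d -> n % d <> 0) (seq { 2 .. int (sqrt (float n)) })

// Smallest prime greater than or equal to n.
let rec nextPrime n = if isPrime n then n else nextPrime (n + 1)

// Counts the resizes needed to grow from an assumed starting
// capacity of 3 until the array can hold `target` items.
let countResizes target =
    let rec loop capacity resizes =
        if capacity >= target then resizes
        else loop (nextPrime (capacity * 2)) (resizes + 1)
    loop 3 0

printfn "resizes needed for 1,000,000 items: %d" (countResizes 1000000)
```

Because the capacity roughly doubles each time, the number of resizes grows only logarithmically with the item count, but the later resizes are the expensive ones.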
Again, these results should not be read as more than they are. They don’t mean that you should never use Map because it’s slower than the other structures for additions and iterations, or that you should start replacing your dictionaries with arrays…
Instead, use the right tool for the right job.
If you’ve got a set of static data (such as configuration data that’s loaded when your application starts up) that you need to look up by key frequently, a Map is as good a choice as any. Its immutability ensures that the static data cannot be modified by mistake, and it has little impact on performance as you never need to mutate the Map once it is initialized.
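As a small sketch of that use case (the configuration keys and values here are made up for illustration):

```fsharp
// Hypothetical configuration data, built once at start-up.
let config =
    Map.ofList
        [ "timeoutSeconds", "30"
          "endpoint", "http://localhost:8080" ]

// Map.tryFind returns an option, so a missing key is handled explicitly.
let lookup key =
    match Map.tryFind key config with
    | Some value -> value
    | None -> "<not set>"

printfn "%s" (lookup "timeoutSeconds")   // 30
printfn "%s" (lookup "missing")          // <not set>
```

Since nothing can add to or remove from `config` after it is built, every part of the application sees the same data.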
Hi, I’m Yan. I’m an AWS Serverless Hero and I help companies go faster for less by adopting serverless technologies successfully.