You can become a serverless blackbelt. Enrol in my course Learn you some Lambda best practice for great good! and learn best practices for performance, cost, security, resilience, observability and scalability. By the end of this course, you should be able to make informed decisions on which AWS service to use with Lambda and how to build highly scalable, resilient and cost-efficient serverless applications.
Note: don’t forget to check out the Benchmarks page to see the latest round up of binary and JSON serializers.
A little while ago I put together a quick performance test comparing the BCL’s BinaryFormatter with that of Marc Gravell‘s protobuf-net library (.Net implementation of Google’s protocol buffer format). You can read more about my results here.
There’s another fast binary serialization library which I had heard about before, MessagePack, which claims to be 4 times faster than protocol buffers which I find a little hard to believe, so naturally, I have to see for myself!
The conditions of the test is very similar to those outlined in my first post, except MessagePack is now included in the test.
I defined two types to be serialized/deserialized, both contain the same amount of data, one exposes them as public properties whilst the other exposes them as public fields:
The reason for this arrangement is to try and get the best performance out of MessagePack (see Note 2 below).
100,000 identical instances of the above types are serialized and deserialized over 5 runs with the best and worst results excluded, results of the remaining 3 runs are then averaged to give the final results.
A couple of observations from these results:
1. BinaryFormatter performs better with fields – faster serialization and smaller payload!
2. Protobuf-net performs equally well with fields and properties – because it’s driven by the [DataMember] attributes.
3. MessagePack performs significantly better with fields – over 10% faster serialization and deserialization!
4. MessagePack is NOT 4x faster than Protocol Buffer! – at least not with the two implementations (MsgPack and Profotbuf-net) I have tested here, speed-wise there’s not much separating the two and depending on the data being serialized you might find a slightly different result.
5. MessagePack generates bigger payload than Protocol Buffer – again, depending on the data being serialized you might arrive at a slightly different result.
If you’re interested in running the tests yourself, you can find the source code here. It uses my SimpleSpeedTester framework to orchestrate test runs but you should be able to get the gist of it fairly easily.
Although MessagePack couldn’t back up its claim to be 4x faster than Protocol Buffer in my test here, it’s still great to see that it offers comparable speed and payload, which makes it a viable alternative to Protocol Buffer and in my opinion having a number of different options is always a plus!
On a different note, although you’re able to get a good 10% improvement in performance when you use fields instead of properties to expose publically available data in your type, it’s important to remember that the idiomatic way in .Net is to use public properties.
As much as I think performance is of paramount importance, maintainability of your code is a close second, especially in an environment where many developers will be working on the same piece of code. By following the ‘recognized’ way of doing things you can more easily communicate your intentions to other developers working on the same code.
Imagine you’ve intentionally used public fields in your class because you know it will be serialized by the MessagePack serializer but then a new developer joins the team and, not knowing all the intricate details of your code, decides to convert them all to public properties instead..
So if you’re willing to step out of the idiomatic .Net way to get that performance boost, be sure to comment your code well in all the places where you’re diverging from the norm!
The SimpleObject class in this post is slightly different from the one I used in my previous post, the ‘Scores‘ property is now an int array instead of List<int>, doing so drastically improved performance for ALL the serializers involved.
This is because the implementation of List<T> uses an array internally and the array is resized when capacity is reached. So even though you only have x number of items in a List<T> the internal array will always have more than x number of spaces. Which is why SimpleObject in my first post (where ‘Scores‘ is of type List<int>) is serialized into 708 bytes by the BinaryFormatter but the SimpleObject class defined in this post can be serialized into 376 bytes.
Similarly, initializing an instance of Dictionary<TKey, TValue> with the intended capacity allows you to add items to it much more efficiently, for more details, see here.
One caveat I came across in the MessagePack test was that, if you instantiate CompiledPacker without any parameter, the default behaviour is to serialize public fields only and that meant none of the public properties I’ve defined on the SimpleObject type will get serialized. As a result, to serialize/deserialize instances of SimpleObject I needed to use an instance of CompiledPacker that serializes private fields too, which according to the MessagePack’s GettingStarted guide, is slower.
So in order to get the best performance (speed) out of MessagePack, I defined a second type (see SimpleObjectWithFields type above) to see how well it does with a type that defines public fields instead of properties. In the interest of fairness, I repeated the same test for BinaryFormatter and protobuf-net too, just to see if you get better performance out of them too.
I specialise in rapidly transitioning teams to serverless and building production-ready services on AWS.
Are you struggling with serverless or need guidance on best practices? Do you want someone to review your architecture and help you avoid costly mistakes down the line? Whatever the case, I’m here to help.
Check out my new podcast Real-World Serverless where I talk with engineers who are building amazing things with serverless technologies and discuss the real-world use cases and challenges they face. If you’re interested in what people are actually doing with serverless and what it’s really like to be working with serverless day-to-day, then this is the podcast for you.
Check out my new course, Learn you some Lambda best practice for great good! In this course, you will learn best practices for working with AWS Lambda in terms of performance, cost, security, scalability, resilience and observability. We will also cover latest features from re:Invent 2019 such as Provisioned Concurrency and Lambda Destinations. Enrol now and start learning!
Check out my video course, Complete Guide to AWS Step Functions. In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. There is something for everyone from beginners to more advanced users looking for design patterns and best practices. Enrol now and start learning!
Are you working with Serverless and looking for expert training to level-up your skills? Or are you looking for a solid foundation to start from? Look no further, register for my Production-Ready Serverless workshop to learn how to build production-grade Serverless applications!
Here is a complete list of all my posts on serverless and AWS Lambda. In the meantime, here are a few of my most popular blog posts.
- Lambda optimization tip – enable HTTP keep-alive
- You are wrong about serverless and vendor lock-in
- You are thinking about serverless costs all wrong
- Just how expensive is the full AWS SDK?
- Many faced threats to Serverless security
- We can do better than percentile latencies
- Yubl’s road to Serverless
- AWS Lambda – should you have few monolithic functions or many single-purposed functions?
- AWS Lambda – compare coldstart time with different languages, memory and code sizes
- Guys, we’re doing pagination wrong
- Top 10 Serverless framework best practices