In the last post, we looked at some inefficiencies with reference types in .Net and perhaps oversold value types a little In any case, now that we’ve made the initial sale and you’re back for more, let’s talk about some pitfalls wrt the use of value types you should be aware of. Specifically let’s focus on cases where the CLR will cause implicit boxing on your value types.
We all know that when we cast a value type to object, we cause boxing. For instance, if we need to shove an int into an object or an ArrayList.
This is not great, but at least we’re doing this consciously and have had the chance to make a decision about it. However, there are a number of situations where the CLR will emit a box IL instruction for us implicitly without us realizing. These are far worse.
When you invoke a virtual method
Value types inherit from the System.ValueType, which itself inherits from System.Object. Amongst other things, System.ValueType provides an override for Equals that gives value types the default compare-by-value behaviour.
However, a value types is stored in memory without the Method Table Pointer (see the last post) so in order to dispatch a virtual method call it’ll first have to be boxed into a reference type first.
There is an exception to this though, as the CLR is able to call Equals directly if the value type overrides the Equals method (which is why it’s a best practice to do so).
Aside from benefiting from CLR’s short-circuit behaviour above, another good reason to override Equals in your custom value type is because the default implementation uses reflection to fetch its fields to compare.
Further more, it then uses UnsafeGetValue to get the value of the field on both the current object and the object being compared to, both of which causes further boxing as FieldInfo.UnsafeGetValue takes an object as argument.
But wait, there’s more…
Even if you override Equals(object other), it’ll still cause boxing to the input argument. To get around this you’ll need to overload Equals to take in another Point2D instance instead, i.e. Equals(Point2D other), which is of course another recommended best practice from the previous post.
Given these three versions of Point2D:
- V1 = plain old struct
- V2 = overrides Equals(object other)
- V3 = overloads Equals(Point2DV3 other)
We can see how they fared when we iterate through 10M instances of each and calling Equals each time.
Couple of things to note from the above:
- V1 causes twice as much boxing as V2 (makes sense given the short-circuiting behaviour)
- V1 also takes nearly four times as long to execute compared to V2
- V3 causes no boxing, and runs in nearly half the time as V2!
sidebar : I chose to run the test in F# because it’s easy to get quick insight (real and CPU time + GC counts) when you enable the #time directive. However, I had to define the types in C# as by default F# compiles struct types with all the optimizations we have talked about – which means:
a. it’s a good reason to use F#; but
b. defeats the purpose of running theses tests!
Unfortunately, there is still one more edge case…
When you call List<T>.Contains
This eventually causes a new EqualityComparer<T> to be created.
In the default case (where Point2D doesn’t implement the IEquatable<T> interface), the ObjectEqualityComparer<T> will be returned. And it is in here, that the overridden/inherited Equals(object other) method will be used and causes boxing to occur for every comparison!
If, on the other hand, Point2D implements the IEquatable<T> interface then the outcome will be very different. This allows some clever logic to kick in and use the overloaded Equals(Point2D other) instead.
So now, let’s introduce V4 of Point2D that implements the IEquatable<T> interface and see how it compares to
- V2 = overridden Equals(Object other) ; and
- V3 = overloaded Equals(Point2DV3 other)
when used in a List<T>.
For V2, List<T>.Contains performs the same as our hand coded version, but the improvements we made with V3 is lost due to the reasons outlined above.
V4 rectified this by allowing the optimization in List<T> to kick in.
Which brings us to the next implicit boxing…
When you invoke an interface method
Like virtual methods, in order to dispatch an interface method you also need the Method Table Pointer, which means boxing is required.
Fortunately, the CLR is able to short-circuit this by calling the method directly if the compile-time type is resolved to the actual value type (e.g. Point2D) rather than the interface type.
For this test, we’ll try:
- invoke Equals(Point2DV4 other) via the IEquatable<Point2DV4> interface; vs
- invoke Equals(Point2DV4 other) directly on Point2DV4
As you can see, invoking the interface method Equals(Point2DV4 other) does indeed incur boxing once for each instance of Point2D.
When Dictionary<T> invokes GetHashCode
In both cases, it’s invoked through the IEqualityComparer<T> type we talked earlier, and in both cases the comparer is also initialized through EqualityComparer<T>.Default and the CreateComparer method.
For this test let’s find out the effects of:
- implementing the IEquatable<T> interface in terms of boxing
- overriding GetHashCode (assuming the default implementation requires the use of reflection)
but first, let’s create V5 of our Point2D struct, this time with a overridden GetHashCode implementation (albeit a bad one, which is OK for this since we only want to see the performance implication of having one).
In this test, we have:
- V3 = no GetHashCode override, not implement IEquatable<T>
- V4 = no GetHashCode override, implements IEquatable<T>
- V5 = has GetHashCode override, implements IEquatable<T>
I used a much smaller sample size here (10K instead of 10M) because the amount of time it took to add even 10K items to a dictionary was sufficient to illustrate the difference here. Even for this small sample size, the difference is very noticeable, couple of things to note:
- since V3 doesn’t implement IEquatable<T> it incurs boxing, and a lot of it because Equals is also called through the IEqualityComparer<T>
- V4 eliminates the need for boxing but is still pretty slow due to the default GetHashCode implementation
- V5 addresses both of these problems!
sidebar : with GetHashCode there are also considerations around what makes a good hash code, e.g. low chance of collision, evenly distributed, etc. but that’s another discussion for another day. Most times I just let tools like Resharper work out the implementation based on the fields my value type has.
As we have seen, there are a number of places where implicit boxing can happen and how much of a difference these might have on your performance. So, to reiterate what was already said in the previous post, here are some best practices for using value types:
- make them immutable
- override Equals (the one that takes an object as argument);
- overload Equals to take another instance of the same value type (e.g. Equals(Point2D other));
- overload operators == and !=;
- override GetHashCode
Enjoy what you’re reading? Subscribe to my newsletter and get more content on AWS and serverless technologies delivered straight to your inbox.
I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workload at scale in AWS for nearly 10 years and I have been an architect or principal engineer with a variety of industries ranging from banking, e-commerce, sports streaming to mobile gaming. I currently work as an independent consultant focused on AWS and serverless.
In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. Including basic concepts, HTTP and event triggers, activities, design patterns and best practices.
Here is a complete list of all my posts on serverless and AWS Lambda. In the meantime, here are a few of my most popular blog posts.
- Lambda optimization tip – enable HTTP keep-alive
- You are thinking about serverless costs all wrong
- Many faced threats to Serverless security
- We can do better than percentile latencies
- I’m afraid you’re thinking about AWS Lambda cold starts all wrong
- Yubl’s road to Serverless
- AWS Lambda – should you have few monolithic functions or many single-purposed functions?
- AWS Lambda – compare coldstart time with different languages, memory and code sizes
- Guys, we’re doing pagination wrong