MS Bond benchmark updated

DISCLAIMER: as always, you should benchmark against your own payload and use case. The benchmark numbers I have produced here are unlikely to be representative of your use cases, and neither are anybody else’s benchmark numbers.

You can use the simple test harness I created and see this example code to benchmark against your particular payload.

 

I recently added MS Bond to my benchmark and found some interesting numbers, which prompted a question on their repo.

Adam Sapek explained that the slow serialization speed I was seeing was down to the default buffer size being 64KB, which is not suitable for the payload I was testing with.

Adjusting the buffer size to 256 bytes produced some pretty amazing results:

image

image

Fastest serialization & deserialization, and smallest payload.

Wow.

 

Have a look at the performance tuning guide. There are quite a few tweaks you can make to improve performance further, but which ones help will depend on your payload.

MS Bond and Chiron benchmarked


 

I updated my binary and JSON serializer benchmarks earlier this week, and got some feedback on serializers that I had missed, namely Chiron and Microsoft’s Bond. Here, we’ll have a look at how the two fared in the benchmark.

 

MS Bond

Microsoft announced Bond, their answer to Google’s Protocol Buffers, this time last year (Jan 2015). I’ve finally got around to actually testing it out (after an ex-Gamesys colleague commented on the last update, thanks Rob!).

First, you define your contract with a .bond file (see tutorial here), for example…

bond_benchmark_03

Now you run the Bond compiler tool, gbc, against this file to generate a C# class that looks like this…

bond_benchmark_04

To serialize and deserialize data, you also need to add the Bond C# NuGet package to your project and follow the examples in the aforementioned tutorial.

Here’s how Bond fared against other binary serializers on my list.

NOTE: there’s an updated benchmark test that uses a different initial buffer size which makes a huge difference in performance for Bond. Please read the linked post for more info.

bond_benchmark_01

bond_benchmark_02

The results make for interesting reading…

  • Bond produced the smallest payload, and is the fastest at deserializing the payload by some distance.
  • It is also the slowest at serializing the payload!

 

Chiron

I read about Chiron in Marcus Griep’s F# advent post but then forgot about it (totally my bad… too many hours on Bloodborne over xmas, such an awesome game).

Anyways, Chiron has an F#-friendly API, but because it uses statically resolved type parameters you can’t use it from C#.

In order to serialize/deserialize a type, the type needs to define the static methods ToJson and FromJson. The inlined serialize and deserialize functions can then constrain your type to have those static members and invoke them in the corresponding function. I used the same technique in MBrace.AWS and honestly, I’m not happy with the amount of work this pushes onto the user, especially when they end up having to write uninteresting plumbing code…

On the API front, I’m not thrilled with the custom operators either, even though there are only 3 of them so I’m probably just over-reacting. In general I find custom operators get in the way of discovery.

Reading through the post, the following paragraph suggests that a lot of intermediate JsonResult<'a> and Json objects are created during the serialization process. Whilst this might be an idiomatic functional approach, it’s also likely to hurt performance.

The *> operator that we used in ToJson discards the JsonResult<'a> (which is only used when writing), but continues to build upon the Json object from the previous operation. By chaining these operations together, we build up the members of a Json.Object.

Unsurprisingly, immutability proved really costly under the benchmark.

chiron_benchmark_01

chiron_benchmark_02

 

So that’s it folks, another 2 serializers added to our stable. If there are any other serializers that you think I should include here, please give me a shout and I’ll do my best to accommodate.

Binary and Json benchmarks updated

It’s been a while since I last updated my binary and JSON serializer benchmarks, so here I round up the latest versions of the serializers on here.

 


 

Binary

Only FsPickler was updated for this benchmark so there are no significant changes in numbers here (with the exception of the BinaryWriter!).

image

image

 

JSON

Quite a few of the JSON serializers have been updated:

  • FsPickler
  • Jil
  • MongoDB Driver
  • NetJson
  • Newtonsoft.Json (aka Json.Net)
  • ServiceStack.Text

Jil seems to have made the biggest gains since the last time.

image

image

*protobuf-net is in this list purely as a benchmark to show how the test JSON serializers compare to one of the fastest binary serializers (both in terms of speed and payload size)

Fasterflect vs HyperDescriptor vs FastMember vs Reflection

The other day I had a small task: inspect the return values of methods and, if the following property exists, set it to an empty array.

        public long[] Achievements { get; set; }

This needed to happen once on every web request, and I decided to implement it as a PostSharp attribute. WHY I needed to do this is another interesting discussion, but broadly speaking it boils down to assumptions baked into a base class’s control flow that no longer hold true, while the base class itself is inflexible to change.

A major refactoring is in order, but given we have a deadline to meet, let’s take on some technical debt and make sure it’s in the backlog so we know to come back to it.

 

So I went about finding a more efficient way of doing this than hand-writing some reflection code.

There’s Jon Skeet’s famous reflection post, but making it work across all types and incorporating IL emit is more complex than this task warrants.

sidebar : if you’re still interested in seeing how Jon’s technique can be applied, check out FSharpx.Reflection to see how it’s used to make reflection over F# types fast.

Then there’s Marc Gravell’s HyperDescriptor, although the NuGet package itself was authored by someone else. Plus Marc’s follow-up to HyperDescriptor, FastMember.

I also came across a package called Fasterflect, which I wasn’t aware of before. From its project page, the API looks pretty clean.

 

Test 1 – 1M instances of MyTypeA

Suppose I have a type with the Achievements property, which I want to set to an empty array whenever I see it returned by a method.

image

Doing this with reflection is pretty straightforward:

image
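A minimal sketch of what the reflection approach might look like. It assumes a hypothetical MyTypeA that exposes only the Achievements property, and the helper name AchievementsResetter is made up for illustration:

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the type under test.
public class MyTypeA
{
    public long[] Achievements { get; set; }
}

public static class AchievementsResetter
{
    // Returns true if the object had a writable long[] Achievements property that was reset.
    public static bool ResetWithReflection(object returnValue)
    {
        if (returnValue == null) return false;

        // Look the property up by name on the runtime type of the return value.
        PropertyInfo prop = returnValue.GetType().GetProperty(
            "Achievements", BindingFlags.Public | BindingFlags.Instance);

        // Types without the property (or with a different property type) are left alone.
        if (prop == null || prop.PropertyType != typeof(long[]) || !prop.CanWrite)
        {
            return false;
        }

        prop.SetValue(returnValue, new long[0], null);
        return true;
    }
}
```

The lookup is repeated on every call here, which is the cost the libraries below try to avoid.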

With Fasterflect, this becomes:

image

I do like this API, it’s very intuitive.

And here’s how it looks with HyperDescriptor and FastMember:

image

Now let’s run this across 1M instances of MyTypeA and see how they do. Both Fasterflect and FastMember did really well, although HyperDescriptor was 3x slower than basic reflection!

image

 

Test 2 – 1M instances of MyTypeA and MyTypeB

OK, but since this code has to work across many types, we have to expect both

  • types that have this property, and
  • types that don’t

To simulate that, let’s introduce another type into the test.

image

Instead of working with an array of MyTypeA, now we have a mixed array of both MyTypeA and MyTypeB for our test.

image

And the results over 1M objects make for interesting reading:

image

Some observations from this set of results:

  • we need to invoke the setter on only half the objects, so reflection is faster than before (the time almost halved), which makes sense;
  • both FastMember and HyperDescriptor are faster for the same reason;
  • having less work to do had a much smaller impact on FastMember, which suggests some caching around the call site (and indeed there is);
  • WTF is going on with Fasterflect!

 

Conclusions

The moral of this story is: always verify claims against your particular use case.

Another perfect example of this is Kyle Kingsbury’s work with Jepsen, where he uses a generative testing approach to verify whether NoSQL databases actually provide the consistency model that they claim to offer. His findings are very interesting to read, and in many cases pretty worrying.

Oh, and stick with FastMember or reflection 

 


Beware of implicit boxing of value types

In the last post, we looked at some inefficiencies with reference types in .Net and perhaps oversold value types a little. In any case, now that we’ve made the initial sale and you’re back for more, let’s talk about some pitfalls with the use of value types that you should be aware of. Specifically, let’s focus on cases where the CLR will cause implicit boxing of your value types.

 

We all know that when we cast a value type to object, we cause boxing. For instance, if we need to shove an int into an object[] or an ArrayList.
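As a small illustration of the explicit cases just mentioned (the class and method names are made up for this sketch):

```csharp
using System.Collections;

public static class ExplicitBoxingDemo
{
    public static object[] IntoObjectArray(int value)
    {
        // Every element of object[] is a reference, so the int is copied into a heap-allocated box.
        return new object[] { value };
    }

    public static ArrayList IntoArrayList(int value)
    {
        var list = new ArrayList();
        list.Add(value); // ArrayList.Add takes an object, so the int is boxed here
        return list;
    }

    public static int Unbox(object boxed)
    {
        // The cast unboxes: it copies the value back out of the box.
        return (int)boxed;
    }
}
```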

This is not great, but at least we’re doing this consciously and have had the chance to make a decision about it. However, there are a number of situations where the CLR will emit a box IL instruction for us implicitly without us realizing. These are far worse.

 

When you invoke a virtual method

Value types inherit from System.ValueType, which itself inherits from System.Object. Amongst other things, System.ValueType provides an override for Equals that gives value types the default compare-by-value behaviour.

However, a value type is stored in memory without the Method Table Pointer (see the last post), so in order to dispatch a virtual method call it’ll first have to be boxed into a reference type.

There is an exception to this though, as the CLR is able to call Equals directly if the value type overrides the Equals method (which is why it’s a best practice to do so).

Aside from benefiting from CLR’s short-circuit behaviour above, another good reason to override Equals in your custom value type is because the default implementation uses reflection to fetch its fields to compare.

Furthermore, it then uses UnsafeGetValue to get the value of the field on both the current object and the object being compared to, both of which cause further boxing as FieldInfo.UnsafeGetValue takes an object as argument.

But wait, there’s more…

Even if you override Equals(object other), it’ll still cause boxing of the input argument. To get around this you’ll need to overload Equals to take in another Point2D instance instead, i.e. Equals(Point2D other), which is of course another recommended best practice from the previous post.

Given these three versions of Point2D:

  • V1 = plain old struct
  • V2 = overrides Equals(object other)
  • V3 = overloads Equals(Point2DV3 other)
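A rough sketch of how the three versions might be declared. The X and Y fields are assumptions (the original definitions are not shown here), and V3 is assumed to keep V2’s override alongside the new overload:

```csharp
#pragma warning disable 0659 // Equals is overridden without GetHashCode on purpose in this sketch

// V1: plain old struct, so Equals falls back to System.ValueType.Equals.
public struct Point2DV1
{
    public readonly int X;
    public readonly int Y;
    public Point2DV1(int x, int y) { X = x; Y = y; }
}

// V2: overrides Equals(object). The argument is still boxed by the caller.
public struct Point2DV2
{
    public readonly int X;
    public readonly int Y;
    public Point2DV2(int x, int y) { X = x; Y = y; }

    public override bool Equals(object other)
    {
        if (!(other is Point2DV2)) return false;
        var p = (Point2DV2)other; // unbox the argument
        return X == p.X && Y == p.Y;
    }
}

// V3: adds a strongly typed overload, so p1.Equals(p2) binds to it and nothing is boxed.
public struct Point2DV3
{
    public readonly int X;
    public readonly int Y;
    public Point2DV3(int x, int y) { X = x; Y = y; }

    public override bool Equals(object other)
    {
        return other is Point2DV3 && Equals((Point2DV3)other);
    }

    public bool Equals(Point2DV3 other)
    {
        return X == other.X && Y == other.Y;
    }
}
```

With these in place, `new Point2DV3(1, 2).Equals(new Point2DV3(1, 2))` resolves to the typed overload at compile time, whereas the same call on V1 or V2 passes the argument as object.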

We can see how they fared when we iterate through 10M instances of each and call Equals each time.

A couple of things to note from the above:

  • V1 causes twice as much boxing as V2 (makes sense given the short-circuiting behaviour)
  • V1 also takes nearly four times as long to execute compared to V2
  • V3 causes no boxing, and runs in nearly half the time of V2!

sidebar : I chose to run the test in F# because it’s easy to get quick insight (real and CPU time + GC counts) when you enable the #time directive. However, I had to define the types in C# as by default F# compiles struct types with all the optimizations we have talked about – which means:

a. it’s a good reason to use F#; but

b. defeats the purpose of running these tests!

 

Unfortunately, there is still one more edge case…

 

When you call List<T>.Contains

When you call List<T>.Contains, an instance of EqualityComparer<T> will be used to compare the argument against every element of the list.

This eventually causes a new EqualityComparer<T> to be created.

In the default case (where Point2D doesn’t implement the IEquatable<T> interface), the ObjectEqualityComparer<T> will be returned. And it is in here that the overridden/inherited Equals(object other) method will be used, causing boxing to occur for every comparison!

If, on the other hand, Point2D implements the IEquatable<T> interface then the outcome will be very different. This allows some clever logic to kick in and use the overloaded Equals(Point2D other) instead.
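One way to see which Equals the default comparer picks is to count calls. The sketch below is an illustration only: the static counters are instrumentation added for this example, and the X and Y fields are assumed.

```csharp
using System;
using System.Collections.Generic;

#pragma warning disable 0659 // Equals is overridden without GetHashCode on purpose in this sketch

// No IEquatable<T>: the default comparer can only go through Equals(object).
public struct Point2DV3
{
    public static int ObjectEqualsCalls;
    public static int TypedEqualsCalls;
    public readonly int X;
    public readonly int Y;
    public Point2DV3(int x, int y) { X = x; Y = y; }

    public override bool Equals(object other)
    {
        ObjectEqualsCalls++;
        if (!(other is Point2DV3)) return false;
        var p = (Point2DV3)other;
        return X == p.X && Y == p.Y;
    }

    public bool Equals(Point2DV3 other)
    {
        TypedEqualsCalls++;
        return X == other.X && Y == other.Y;
    }
}

// Implements IEquatable<T>: the default comparer can call the typed Equals.
public struct Point2DV4 : IEquatable<Point2DV4>
{
    public static int ObjectEqualsCalls;
    public static int TypedEqualsCalls;
    public readonly int X;
    public readonly int Y;
    public Point2DV4(int x, int y) { X = x; Y = y; }

    public override bool Equals(object other)
    {
        ObjectEqualsCalls++;
        if (!(other is Point2DV4)) return false;
        var p = (Point2DV4)other;
        return X == p.X && Y == p.Y;
    }

    public bool Equals(Point2DV4 other)
    {
        TypedEqualsCalls++;
        return X == other.X && Y == other.Y;
    }
}

public static class ContainsDemo
{
    public static void Run()
    {
        // The searched item is absent, so every element gets compared.
        var v3 = new List<Point2DV3> { new Point2DV3(1, 1), new Point2DV3(2, 2) };
        v3.Contains(new Point2DV3(9, 9));

        var v4 = new List<Point2DV4> { new Point2DV4(1, 1), new Point2DV4(2, 2) };
        v4.Contains(new Point2DV4(9, 9));

        // Expectation: V3 only hits Equals(object), V4 only hits Equals(Point2DV4).
        Console.WriteLine("V3 object={0} typed={1}", Point2DV3.ObjectEqualsCalls, Point2DV3.TypedEqualsCalls);
        Console.WriteLine("V4 object={0} typed={1}", Point2DV4.ObjectEqualsCalls, Point2DV4.TypedEqualsCalls);
    }
}
```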

 

So now, let’s introduce V4 of Point2D that implements the IEquatable<T> interface and see how it compares to

  • V2 = overridden Equals(Object other) ; and
  • V3 = overloaded Equals(Point2DV3 other)

when used in a List<T>.

For V2, List<T>.Contains performs the same as our hand-coded version, but the improvements we made with V3 are lost due to the reasons outlined above.

V4 rectified this by allowing the optimization in List<T> to kick in.

 

Which brings us to the next implicit boxing…

 

When you invoke an interface method

Like virtual methods, in order to dispatch an interface method you also need the Method Table Pointer, which means boxing is required.

Fortunately, the CLR is able to short-circuit this by calling the method directly if the compile-time type is resolved to the actual value type (e.g. Point2D) rather than the interface type.
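The two call shapes can be sketched as follows. This is an illustration with assumed X and Y fields and made-up helper names; whether the box is actually allocated at run time can also depend on JIT optimizations, so treat the comments as describing the IL the C# compiler emits.

```csharp
using System;

public struct Point2DV4 : IEquatable<Point2DV4>
{
    public readonly int X;
    public readonly int Y;
    public Point2DV4(int x, int y) { X = x; Y = y; }

    public bool Equals(Point2DV4 other)
    {
        return X == other.X && Y == other.Y;
    }
}

public static class InterfaceCallDemo
{
    // Compile-time type is the interface: converting the struct to
    // IEquatable<Point2DV4> emits a box instruction before the call.
    public static bool ViaInterface(Point2DV4 a, Point2DV4 b)
    {
        IEquatable<Point2DV4> boxed = a;
        return boxed.Equals(b);
    }

    // Compile-time type is the struct itself: the call binds directly
    // to Point2DV4.Equals and no box is needed.
    public static bool Direct(Point2DV4 a, Point2DV4 b)
    {
        return a.Equals(b);
    }
}
```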

 

For this test, we’ll try:

  • invoke Equals(Point2DV4 other) via the IEquatable<Point2DV4> interface; vs
  • invoke Equals(Point2DV4 other) directly on Point2DV4

 

As you can see, invoking the interface method Equals(Point2DV4 other) does indeed incur boxing once for each instance of Point2D.

 

When Dictionary<TKey, TValue> invokes GetHashCode

GetHashCode is used by hash-based collection types, the most common being Dictionary<TKey, TValue> and HashSet<T>.

In both cases, it’s invoked through the IEqualityComparer<T> type we talked about earlier, and in both cases the comparer is also initialized through EqualityComparer<T>.Default and the CreateComparer method.

GetHashCode is invoked in many places within Dictionary<TKey, TValue> – on Add, ContainsKey, Remove, etc.

For this test let’s find out the effects of:

  • implementing the IEquatable<T> interface in terms of boxing
  • overriding GetHashCode (assuming the default implementation requires the use of reflection)

But first, let’s create V5 of our Point2D struct, this time with an overridden GetHashCode implementation (albeit a bad one, which is OK here since we only want to see the performance implication of having one).

In this test, we have:

  • V3 = no GetHashCode override, does not implement IEquatable<T>
  • V4 = no GetHashCode override, implements IEquatable<T>
  • V5 = has GetHashCode override, implements IEquatable<T>

I used a much smaller sample size here (10K instead of 10M) because the time it took to add even 10K items to a dictionary was sufficient to illustrate the difference. Even for this small sample size, the difference is very noticeable. A couple of things to note:

  • since V3 doesn’t implement IEquatable<T> it incurs boxing, and a lot of it because Equals is also called through the IEqualityComparer<T>
  • V4 eliminates the need for boxing but is still pretty slow due to the default GetHashCode implementation
  • V5 addresses both of these problems!
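A minimal sketch of what a V5-style struct could look like when used as a dictionary key. The fields and the hash formula are assumptions for illustration, not the implementation used in the test above:

```csharp
using System;
using System.Collections.Generic;

public struct Point2DV5 : IEquatable<Point2DV5>
{
    public readonly int X;
    public readonly int Y;
    public Point2DV5(int x, int y) { X = x; Y = y; }

    // Typed Equals: picked up by EqualityComparer<T>.Default, so keys are not boxed.
    public bool Equals(Point2DV5 other)
    {
        return X == other.X && Y == other.Y;
    }

    public override bool Equals(object other)
    {
        return other is Point2DV5 && Equals((Point2DV5)other);
    }

    // Hand-written hash, so the default System.ValueType implementation is never used.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

public static class DictionaryDemo
{
    public static Dictionary<Point2DV5, int> Build(int count)
    {
        var dict = new Dictionary<Point2DV5, int>();
        for (var i = 0; i < count; i++)
        {
            // Add hashes the key, and compares keys with Equals when hash codes match.
            dict.Add(new Point2DV5(i, i + 1), i);
        }
        return dict;
    }
}
```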

sidebar : with GetHashCode there are also considerations around what makes a good hash code, e.g. low chance of collision, evenly distributed, etc. but that’s another discussion for another day. Most times I just let tools like Resharper work out the implementation based on the fields my value type has.

 

Conclusions

As we have seen, there are a number of places where implicit boxing can happen, and we have seen how much of a difference it can make to your performance. So, to reiterate what was already said in the previous post, here are some best practices for using value types:

  • make them immutable;
  • override Equals (the one that takes an object as argument);
  • overload Equals to take another instance of the same value type (e.g. Equals(Point2D other));
  • overload operators == and !=;
  • override GetHashCode

 
