Binary and JSON serializer benchmarks updated

First of all, I’d like to offer my sincere apologies to those who asked me to update my benchmark numbers following the release of Json.NET 5.0.6. It took me a long time to clear some of my backlog and I have only just got around to it, so sorry for the wait!

The good news is that, based on my tests against a simple POCO object (one that’s representative of the payload I often work with), Json.NET has delivered on its promise of big improvements in deserialization performance and is now neck and neck with ServiceStack.Text in terms of deserialization speed.

 

DISCLAIMER: as always, you should benchmark against your own payload and use case. The benchmark numbers I have produced here are unlikely to be representative of your use cases, and neither are anybody else’s benchmark numbers.

You can use the simple test harness I created and see this example code for my JSON serializer tests to benchmark against your particular payload.

 

JSON

Here is the result against the latest versions of JSON serializers at the time of writing:

image

image

Versions of serializers tested:

 

As mentioned previously, the latest version of Json.NET has made big improvements to its deserialization speed and is now on par with ServiceStack.Text.

 

Binary

image

image

Versions of serializers tested:

 

As you can see, there is little to choose between the usual suspects of protobuf-net, MessagePack and MessageShark, which are well clear of the rest of the pack. I have also included two additional binary serializers:

  • FluorineFx, an open source library for working with Flash/Flex remoting, which comes with a serializer for AMF-encoded data
  • Filbert, a BERT (Binary ERlang Term) serializer and BERT-RPC client I wrote in F#. As you can see, purely from a performance point of view, it needs a lot of optimization on its deserialization speed (specifically, it needs a buffer pool rather than allocating a new array each time), which I have been hoping to find time to do for a while. In general, is interoperability with Erlang something .Net developers are interested in exploring? I would love to hear your thoughts on the matter, and about any alternative approaches (other than via something like BERT) that you think are worth investing in.

 

Links

My Benchmarks

Simple Speed Tester

Code for the JSON benchmarks

Code for the binary benchmarks

Simple Speed Tester – moved to GitHub!

Since I’m liking git more and more by the day, with tools such as SmartGit and GitFlow making the task of managing even a complex branching model relatively easy, I’ve decided to move my Simple Speed Tester project over to GitHub!

Simple Speed Tester is a very simple framework I wrote to help me run benchmarks, and it is used to power my JSON and binary serializer benchmarks. It takes care of some of the orchestration that you tend to do when running benchmarks, e.g.:

  • repeating a test multiple times
  • timing the individual runs
  • ignoring the min and max runs and using the rest to calculate a meaningful average
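
The orchestration steps above can be sketched in a few lines. The snippet below is a minimal Python illustration of the idea, not Simple Speed Tester's actual API: the names `timed_runs`, `trimmed_average` and `benchmark` are made up for this example, and the workload in the demo is only a stand-in for a real serialization loop.

```python
import time
from statistics import mean


def timed_runs(action, runs=5):
    """Run `action` `runs` times and return each run's elapsed seconds."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        elapsed.append(time.perf_counter() - start)
    return elapsed


def trimmed_average(samples):
    """Drop one min and one max sample, then average what is left."""
    if len(samples) < 3:
        raise ValueError("need at least 3 samples to drop the min and max")
    trimmed = sorted(samples)[1:-1]
    return mean(trimmed)


def benchmark(action, runs=5):
    """Time `action` over several runs and report the trimmed average."""
    return trimmed_average(timed_runs(action, runs))


if __name__ == "__main__":
    # Hypothetical workload standing in for a serialize/deserialize loop.
    average = benchmark(lambda: sum(range(100_000)))
    print(f"average over the middle runs: {average:.6f}s")
```

Dropping the extremes makes the average less sensitive to one-off hiccups such as a cold start, which is also why the sketch insists on at least three runs.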

It’s intended to be really easy to use (see examples here) and has one and only one use case: helping you speed test a specific piece of code!

If you’re like me and like to run your own benchmarks, then check it out. You can also install it via NuGet.

Performance Test – Binary serializers Part III

Note: don’t forget to check out the Benchmarks page to see the latest round up of binary and JSON serializers.

 

Since my last round of benchmarks on binary serializers, there’s a new player in town – MessageShark, which at the time of this writing does not support serialization of fields, but offers comparable speed and payload to protobuf-net.

Using the same test objects, here’s how MessageShark compares to the other binary serializers:

Untitled_1

Untitled_2

Untitled

It’s early days for MessageShark but the signs are good: comparable serialization speed to protobuf-net and a noticeably faster deserialization speed. Definitely one to keep an eye on!

Performance Test – Binary serializers Part II

Note: don’t forget to check out the Benchmarks page to see the latest round up of binary and JSON serializers.

A little while ago I put together a quick performance test comparing the BCL’s BinaryFormatter with Marc Gravell’s protobuf-net library (a .Net implementation of Google’s Protocol Buffers format). You can read more about my results here.

There’s another fast binary serialization library which I had heard about before, MessagePack, which claims to be 4 times faster than Protocol Buffers. I find that a little hard to believe, so naturally I have to see for myself!

Tests

The conditions of the test are very similar to those outlined in my first post, except MessagePack is now included in the test.

I defined two types to be serialized/deserialized, both contain the same amount of data, one exposes them as public properties whilst the other exposes them as public fields:

image image

The reason for this arrangement is to try and get the best performance out of MessagePack (see Note 2 below).

100,000 identical instances of the above types are serialized and deserialized over 5 runs. The best and worst results are excluded, and the results of the remaining 3 runs are then averaged to give the final results.

Results

image

image

image

A couple of observations from these results:

1. BinaryFormatter performs better with fields – faster serialization and smaller payload!

2. Protobuf-net performs equally well with fields and properties – because it’s driven by the [DataMember] attributes.

3. MessagePack performs significantly better with fields – over 10% faster serialization and deserialization!

4. MessagePack is NOT 4x faster than Protocol Buffers! – at least not with the two implementations (MsgPack and protobuf-net) I have tested here. Speed-wise there’s not much separating the two, and depending on the data being serialized you might find a slightly different result.

5. MessagePack generates a bigger payload than Protocol Buffers – again, depending on the data being serialized you might arrive at a slightly different result.

Source code

If you’re interested in running the tests yourself, you can find the source code here. It uses my SimpleSpeedTester framework to orchestrate test runs but you should be able to get the gist of it fairly easily.

Closing thoughts…

Although MessagePack couldn’t back up its claim to be 4x faster than Protocol Buffers in my test here, it’s still great to see that it offers comparable speed and payload, which makes it a viable alternative to Protocol Buffers. In my opinion, having a number of different options is always a plus!

In addition to serialization, MessagePack also offers an RPC framework and is widely available, with client implementations in quite a number of different languages and platforms, e.g. C#, JavaScript, Node.js, Erlang, Python and Ruby, so it’s definitely worth considering when you start a new project.

On a different note, although you’re able to get a good 10% improvement in performance when you use fields instead of properties to expose publicly available data in your type, it’s important to remember that the idiomatic way in .Net is to use public properties.

As much as I think performance is of paramount importance, maintainability of your code is a close second, especially in an environment where many developers will be working on the same piece of code. By following the ‘recognized’ way of doing things you can more easily communicate your intentions to other developers working on the same code.

Imagine you’ve intentionally used public fields in your class because you know it will be serialized by the MessagePack serializer, but then a new developer joins the team and, not knowing all the intricate details of your code, decides to convert them all to public properties instead…

So if you’re willing to step out of the idiomatic .Net way to get that performance boost, be sure to comment your code well in all the places where you’re diverging from the norm!

Note 1:

The SimpleObject class in this post is slightly different from the one I used in my previous post: the ‘Scores’ property is now an int array instead of List<int>. Doing so drastically improved performance for ALL the serializers involved.

This is because the implementation of List<T> uses an array internally, and that array is resized when its capacity is reached. So even though you only have x items in a List<T>, the internal array will usually have more than x slots. This is why SimpleObject in my first post (where ‘Scores’ is of type List<int>) is serialized into 708 bytes by the BinaryFormatter but the SimpleObject class defined in this post can be serialized into 376 bytes.
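
To get a feel for how much slack a List<T> can carry, here is a small Python simulation of the growth policy. It assumes the list starts empty, jumps to a capacity of 4 on the first add and doubles whenever it is full; `capacity_after_adds` is a hypothetical helper written for this illustration, and the exact policy may differ between runtime versions.

```python
def capacity_after_adds(count, default_capacity=4):
    """Simulate the backing-array size after `count` successive adds.

    Assumption: start empty, grow to `default_capacity` on the first add,
    then double whenever the backing array is full.
    """
    capacity = 0
    for size in range(1, count + 1):
        if size > capacity:
            capacity = default_capacity if capacity == 0 else capacity * 2
    return capacity


if __name__ == "__main__":
    for count in (1, 4, 5, 10, 100):
        capacity = capacity_after_adds(count)
        unused = capacity - count
        print(f"{count:>3} items -> capacity {capacity:>3}, unused slots {unused}")
```

Under these assumptions the unused slots can approach the number of items actually stored, whereas an array sized exactly for its contents carries none.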

Similarly, initializing an instance of Dictionary<TKey, TValue> with the intended capacity allows you to add items to it much more efficiently. For more details, see here.

Note 2:

One caveat I came across in the MessagePack test was that, if you instantiate CompiledPacker without any parameter, the default behaviour is to serialize public fields only and that meant none of the public properties I’ve defined on the SimpleObject type will get serialized. As a result, to serialize/deserialize instances of SimpleObject I needed to use an instance of CompiledPacker that serializes private fields too, which according to the MessagePack’s GettingStarted guide, is slower.

So in order to get the best performance (speed) out of MessagePack, I defined a second type (see SimpleObjectWithFields type above) to see how well it does with a type that defines public fields instead of properties. In the interest of fairness, I repeated the same test for BinaryFormatter and protobuf-net too, just to see if you get better performance out of them too.

Performance Test – BinaryFormatter vs Protobuf-Net

Note: don’t forget to check out the Benchmarks page to see the latest round up of binary and JSON serializers.

When working with the BinaryFormatter class frequently, one of the things you notice is that it is really damn inefficient… both in terms of speed as well as the payload (the size of the serialized byte array).

Google’s Protocol Buffers format is designed to be super fast and to minimize the payload by requiring a serializable object to define ahead of time the order in which its properties/fields should be serialized/deserialized. Doing so removes the need for all sorts of metadata that traditionally needs to be encoded along with the actual data.

Marc Gravell (of StackOverflow fame!) has a .Net implementation called ‘protobuf-net’, which is said to totally kick ass! As with most performance-related topics, it’s hugely intriguing to me, so I decided to put it to the test myself :-)
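
The effect of agreeing on the layout ahead of time can be illustrated with a rough Python sketch. This is not the Protocol Buffers wire format (which, among other things, writes a small numeric tag in front of each value); it only contrasts a payload that carries its field names with one that relies on a pre-agreed order. The record, its field names and the `LAYOUT` definition are all made up for the example.

```python
import json
import struct

# Hypothetical record; the field names and values are made up for illustration.
record = {"id": 100000, "score": 42, "active": True}

# Self-describing: the field names travel with every single message.
self_describing = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Schema-driven: both sides agree ahead of time on the order and types
# (little-endian int32, int32, bool), so only the values are written.
LAYOUT = struct.Struct("<ii?")
positional = LAYOUT.pack(record["id"], record["score"], record["active"])

# Reading the values back relies on the same agreed layout.
decoded_id, decoded_score, decoded_active = LAYOUT.unpack(positional)

if __name__ == "__main__":
    print(len(self_describing), "bytes with field names")
    print(len(positional), "bytes with an agreed field order")
```

The trade-off is that the compact form is unreadable without the shared definition, so sender and receiver have to keep it in sync.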

Assumptions/Conditions of tests

  1. code is compiled in release mode, with the optimization option turned on
  2. 5 runs of the same test are performed, with the top and bottom results excluded; the remaining three results are then averaged
  3. 100,000 identical instances of type SimpleObject (see below) are serialized and deserialized
  4. serialization/deserialization of the objects happens sequentially in a loop (no concurrency)

image

Results

Unsurprisingly, the protobuf-net serializer wins hands-down in all three categories. As you can see from the table below, it is a staggering 12x faster than BinaryFormatter when it comes to serialization, with a payload size less than a fifteenth of its counterpart’s.

One curious observation about the payload size: when I used a BinaryWriter to simply write every property into the output stream without any metadata, what I got back should have been the minimum payload size without compression, and yet the protobuf-net serializer still managed to beat that!

image

image

image

BinaryFormatter with ISerializable

I also tested the BinaryFormatter with a class that implements the ISerializable interface (see below) because others had suggested in the past that you are likely to get a noticeable performance boost if you implement the ISerializable interface yourself. The belief is that it will perform much better as it removes the reliance on reflection which can be detrimental to the performance of your code when used excessively.

However, based on the tests I have done, this does not seem to be the case. The slightly better serialization speed is far from conclusive and is offset by a slightly slower deserialization speed.

image

Source code

If you’re interested in running the tests yourself, you can find the source code here. It uses my SimpleSpeedTester framework to orchestrate the test runs but you should be able to get the gist of it fairly easily.

UPDATE 2011/08/24:

As I mentioned in the post, protobuf-net managed to produce a smaller payload than what is required to hold all the property values of the test object without any metadata.

I posted this question on SO and, as Marc said in his answer, the smaller payload is achieved through the use of varint and zigzag encoding. Read more about them here.
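
To show why these encodings can beat a fixed-width write, here is a minimal Python sketch of both techniques as they are commonly described for Protocol Buffers. It is an illustration only, not protobuf-net's code; the function names are made up, the zigzag step assumes 32-bit signed values, and which fields protobuf-net applies each encoding to is not covered here.

```python
import struct


def zigzag_encode(value):
    """Map a signed 32-bit int to unsigned so small magnitudes stay small."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def zigzag_decode(value):
    """Reverse zigzag_encode."""
    return (value >> 1) ^ -(value & 1)


def varint_encode(value):
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("varint_encode expects a non-negative integer")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def varint_decode(data):
    """Decode a single varint from the start of `data`."""
    result = 0
    for position, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * position)
        if not byte & 0x80:
            break
    return result


if __name__ == "__main__":
    for number in (1, 42, 300, -1):
        fixed = struct.pack("<i", number)  # always 4 bytes for an int32
        compact = varint_encode(zigzag_encode(number))
        print(number, "->", len(fixed), "bytes fixed,", len(compact), "bytes as zigzag varint")
```

A fixed-width int32 write costs four bytes regardless of the value, while small values fit in a single varint byte, so a test object full of small numbers can end up smaller than its "raw" size.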