Filbert – a BERT serializer for .Net

I spent the last couple of nights putting together a small BERT serializer for .Net called Filbert.

 

What’s BERT?

BERT (Binary ERlang Term) is a binary format based on Erlang’s binary serialization format (as used by erlang:term_to_binary/1) but supports a few complex types such as boolean, dictionary and time, in addition to the primitive types.

BERT and BERT-RPC were specified by GitHub’s cofounder Tom Preston-Werner and have been in production use at GitHub as part of their infrastructure, allowing them to integrate Ruby and Erlang through the Ernie BERT-RPC server.

The encoding format for BERT is the same as Erlang’s external term format, which you can read about in great detail here. It is not as highly optimized as something like protocol buffers, but it is very easy to understand and implement, and it does not require a separate contract definition file (the .proto file for protocol buffers) in order to be able to serialize a type.
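To give a feel for how simple the format is, here is a minimal sketch that hand-encodes a small integer and a BERT boolean. The tag values (131, 97, 100, 104) come from Erlang's external term format, and the {bert, true} tuple is how BERT represents a boolean; the function names are my own for illustration and are not Filbert's API.

```fsharp
// Tags from the Erlang external term format.
let version = 131uy          // leading "magic" byte of every encoded term
let smallIntegerExt = 97uy   // unsigned 8-bit integer
let atomExt = 100uy          // atom: 2-byte big-endian length + characters
let smallTupleExt = 104uy    // tuple with up to 255 elements

// Hypothetical helpers for illustration only (not Filbert's API).
let encodeSmallInt (n : byte) : byte[] = [| smallIntegerExt; n |]

let encodeAtom (name : string) : byte[] =
    let chars = System.Text.Encoding.ASCII.GetBytes name
    Array.concat [ [| atomExt; byte (chars.Length >>> 8); byte (chars.Length &&& 0xFF) |]; chars ]

let encodeSmallTuple (elements : byte[] list) : byte[] =
    Array.concat ([| smallTupleExt; byte elements.Length |] :: elements)

// BERT represents booleans as the tuples {bert, true} and {bert, false}.
let encodeBool (b : bool) : byte[] =
    encodeSmallTuple [ encodeAtom "bert"; encodeAtom (if b then "true" else "false") ]

// Prefix an encoded term with the version byte to get the full payload.
let toBert (term : byte[]) : byte[] = Array.append [| version |] term

let fortyTwo = toBert (encodeSmallInt 42uy)   // [| 131uy; 97uy; 42uy |]
let yes = toBert (encodeBool true)            // 17 bytes, starting 131, 104, 2, 100, 0, 4, ...
```

Note there is no schema anywhere in sight: every value carries its own type tag, which is why no contract file is needed.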

 

How can I try it?

I have yet to put together a NuGet package for Filbert but will do so in the near future. In the meantime, why not fork the project on GitHub and try it out for yourself?

I have included simple F# and C# example projects as part of the solution, and if anything is unclear then have a read of the tutorial page to help you get started.

 

I have ensured reasonable test coverage for both the encoder and the decoder, but there are no doubt many edge cases which I haven’t considered, and I would really appreciate any feedback you have on how best to improve the solution :-)

F# – defining explicit operator in F#

Update 2012/08/23: Thanks to Jizugu for the suggestion in the comments, I’ve updated the post to show his approach to calling the explicit operator in a clean and elegant way.

 

In C#, you can define an explicit operator for your type using the explicit keyword:

image

In F#, you can define an explicit operator like the below, and use a custom operator to invoke it in an elegant way rather than having to call the static Person.op_Explicit method directly:
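A minimal sketch of the idea, assuming a hypothetical Person type that simply wraps a name. The !> operator is one common way to write such a helper using statically resolved type parameters; the choice of symbol is arbitrary.

```fsharp
// Hypothetical Person type; only the op_Explicit member matters here.
type Person(name : string) =
    member this.Name = name
    // F# has no 'explicit' keyword; a static op_Explicit member plays that role
    // and is seen by C# as an explicit conversion from string to Person.
    static member op_Explicit (source : string) : Person = Person(source)

// Custom inline prefix operator: the compiler looks for a matching
// op_Explicit on either the source type (^a) or the target type (^b).
let inline (!>) (x : ^a) : ^b =
    ((^a or ^b) : (static member op_Explicit : ^a -> ^b) x)

// The target type is inferred from the annotation.
let person : Person = !> "Alice"
printfn "%s" person.Name   // prints "Alice"
```

Because the operator is inline, the conversion is resolved at compile time, so it works for any pair of types where an op_Explicit exists and fails to compile otherwise.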

F# – specifying a discriminated union clause generic unit of measure

You can specify a function which can take in a numeric value with a generic unit of measure easily enough:

image

Similarly, you can also specify a discriminated union whose cases carry a numeric value with a generic unit of measure, like this:
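Here is a small sketch covering both the function and the union. The units (m, s), the Reading type and the helper functions are names I made up for the example.

```fsharp
[<Measure>] type m
[<Measure>] type s

// A function that accepts a float with any unit of measure 'u.
let twice (x : float<'u>) : float<'u> = x * 2.0

// A discriminated union whose cases carry values with a generic unit.
type Reading<[<Measure>] 'u> =
    | Exact of float<'u>
    | Range of float<'u> * float<'u>

// The unit flows through pattern matching unchanged.
let midpoint (r : Reading<'u>) : float<'u> =
    match r with
    | Exact v -> v
    | Range (lo, hi) -> (lo + hi) / 2.0

let distance = midpoint (Range (10.0<m>, 20.0<m>))   // 15.0<m>
let duration = twice 1.5<s>                          // 3.0<s>
```

The key part is the [<Measure>] attribute on the type parameter, which tells the compiler that 'u is a unit of measure rather than an ordinary type.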

F# – defining a type extension for generic array

Peculiarly, I couldn’t find any documented way to create a type extension for a generic array, 'a []. It turns out you need to use double backtick marks ( `` ) around the square brackets in order to do that:
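A minimal sketch of what that looks like. The GetOrDefault member is a made-up example; any member can go in its place.

```fsharp
// The double backticks let the array type name [] be used as an identifier,
// so the extension applies to arrays of any element type 'a.
type 'a ``[]`` with
    // Hypothetical extension member: returns the element at index n,
    // or the default value of 'a when n is out of range.
    member this.GetOrDefault (n : int) : 'a =
        if n >= 0 && n < this.Length then this.[n]
        else Unchecked.defaultof<'a>

let numbers = [| 1; 2; 3 |]
let second = numbers.GetOrDefault 1    // 2
let missing = numbers.GetOrDefault 10  // 0, the default for int
```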

Performance Test – Json Serializers Part III

Note: Don’t forget to check out the Benchmarks page to see the latest round-up of binary and JSON serializers.

Following on from my previous test, I have now included JsonFx as well as the Json.Net BSON serializer in the mix to see how they match up.

The results (in milliseconds) as well as the average payload size (for each of the 100K objects serialized) are as follows.

image

Graphically this is how they look:

image

I have included protobuf-net in this test to provide a more meaningful comparison for the Json.Net BSON serializer, since it generates a binary payload and as such has a different use case to the other JSON serializers.

In general, I consider JSON to be appropriate when the serialized data needs to be human-readable; a binary payload, on the other hand, is more appropriate for communication between applications/services.

Observations

You can see from the results above that the Json.Net BSON serializer actually generates a bigger payload than its JSON counterpart. This is because the simple POCO being serialized contains an array of 10 integers in the range of 1 to 100. When the integer ‘1’ is serialized as JSON, it takes just 1 byte to represent as a single character (and at most 3 bytes for ‘100’), but a 32-bit integer always takes 4 bytes to represent in binary!
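As a rough back-of-the-envelope sketch of that arithmetic (it only counts the bytes for the number itself, and ignores JSON's separators as well as the type byte and key that BSON stores alongside each array element):

```fsharp
// JSON writes an integer as its decimal digits, one byte per character.
let jsonIntBytes (n : int) : int = (string n).Length

// BSON stores a 32-bit integer as a fixed 4 bytes regardless of its value.
let bsonInt32Bytes = 4

for n in [ 1; 42; 100; 9999; 12345 ] do
    printfn "%d: JSON %d byte(s), BSON int32 %d bytes" n (jsonIntBytes n) bsonInt32Bytes
// 1: JSON 1 byte(s), BSON int32 4 bytes
// 42: JSON 2 byte(s), BSON int32 4 bytes
// 100: JSON 3 byte(s), BSON int32 4 bytes
// 9999: JSON 4 byte(s), BSON int32 4 bytes
// 12345: JSON 5 byte(s), BSON int32 4 bytes
```

For the values used in this test (1 to 100) the JSON representation never needs more than 3 bytes per number, so the fixed-width encoding loses every time.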

In comparison, the protocol buffers format uses varint encoding so that smaller numbers take fewer bytes to represent, and it is not self-describing (the property names are not part of the payload) so it’s able to generate a much, much smaller payload compared to JSON and BSON.

Lastly, whilst the Json.Net BSON serializer offers a slightly faster deserialization time compared to the Json.Net JSON serializer, it does, however, have a much slower serialization speed.
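To illustrate why small numbers are so cheap, here is a minimal sketch of base-128 varint encoding for unsigned values as described in the Protocol Buffers encoding documentation. It is my own illustration and not protobuf-net's actual code.

```fsharp
// Base-128 varint: the value is split into 7-bit groups, least significant
// group first; every byte except the last has its high bit set to signal
// that more bytes follow.
let encodeVarint (value : uint64) : byte[] =
    let rec loop (v : uint64) (acc : byte list) =
        if v < 128UL then
            List.rev (byte v :: acc)
        else
            loop (v >>> 7) (byte ((v &&& 0x7FUL) ||| 0x80UL) :: acc)
    loop value [] |> List.toArray

let one = encodeVarint 1UL            // [| 0x01uy |]
let hundred = encodeVarint 100UL      // [| 0x64uy |]
let threeHundred = encodeVarint 300UL // [| 0xACuy; 0x02uy |]
```

Every integer in the 1 to 100 range used in this test fits in a single byte, compared to the fixed 4 bytes of a BSON int32.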

Disclaimers

Benchmarks do not tell the whole story, and the numbers will naturally vary depending on a number of factors such as the type of data being tested on. In the real world, you will also need to take into account how you’re likely to interact with the data, e.g. if you know you’ll be deserializing data a lot more often than serializing it then deserialization speed will of course become more important than serialization speed!

In the case of BSON and integers, whilst it’s less efficient (than JSON) when serializing small numbers, it’s more efficient when the numbers have more than 4 digits.