Using Protocol Buffers with API Gateway and AWS Lambda

AWS announced binary support for API Gateway in late 2016, which opened up the door for you to use more efficient binary formats such as Google’s Protocol Buffers and Apache Thrift.

Why?

Compared to JSON – which is the bread and butter for APIs built with API Gateway and Lambda – these binary formats can produce significantly smaller payloads.

At scale, they can make a big difference to your bandwidth cost.

In restricted environments such as low-end devices or in countries with poor mobile connections, sending smaller payloads can also improve your user experience by improving the end-to-end network latency, and possibly processing time on the device too.

Comparison of serializer performance between Protocol Buffers and JSON in .NET

How

Follow these 3 simple steps (assuming you’re using the Serverless framework):

  1. install the awesome serverless-apigw-binary plugin
  2. add application/x-protobuf to the binary media types (see the configuration sketch below)
  3. add a function that returns the Protocol Buffers payload as a base64-encoded response

The serverless-apigw-binary plugin has made it really easy to add binary support to API Gateway
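If you’re using the plugin, the relevant bits of serverless.yml look something like this (a sketch – check the plugin’s README for the authoritative configuration):

  plugins:
    - serverless-apigw-binary

  custom:
    apigwBinary:
      types:
        - 'application/x-protobuf'   # the binary media type to register with API Gateway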

To encode and decode Protocol Buffers payloads in Nodejs, you can use the protobufjs package from NPM.

It lets you work with your existing .proto files, or you can use JSON descriptors. Give the docs a read to see how you can get started.
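As a rough sketch (the players.proto file and the PlayerScores message below are illustrative names, not the demo project’s actual schema):

  const protobuf = require('protobufjs');

  // load the schema, then encode & decode a payload
  protobuf.load('players.proto')
    .then(root => {
      const PlayerScores = root.lookupType('PlayerScores');

      const payload = {
        players: [ { id: 'abc123', name: 'Calvin Ortiz', scores: [57, 12, 100] } ]
      };

      // verify returns an error message if the payload doesn't match the schema
      const error = PlayerScores.verify(payload);
      if (error) { throw Error(error); }

      const buffer = PlayerScores.encode(PlayerScores.create(payload)).finish();
      const decoded = PlayerScores.decode(buffer);

      console.log(buffer.length, decoded.players[0].name);
    });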

In the demo project (link at the bottom of the post) you’ll find a Lambda function that always returns a response in Protocol Buffers.
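A minimal sketch of what such a handler looks like (again, players.proto and PlayerScores are illustrative):

  const protobuf = require('protobufjs');

  // load the schema once, outside the handler, so it's reused across invocations
  const schema = protobuf.load('players.proto');

  module.exports.handler = (event, context, callback) => {
    schema
      .then(root => {
        const PlayerScores = root.lookupType('PlayerScores');
        const payload = { players: [ /* ... randomly generated players ... */ ] };
        const buffer = PlayerScores.encode(PlayerScores.create(payload)).finish();

        callback(null, {
          statusCode: 200,
          headers: { 'Content-Type': 'application/x-protobuf' },
          body: Buffer.from(buffer).toString('base64'), // base64 encoded Protocol Buffers payload
          isBase64Encoded: true
        });
      })
      .catch(callback);
  };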

A couple of things to note from this function:

  • we set the Content-Type header to application/x-protobuf
  • the body is the base64-encoded representation of the Protocol Buffers payload
  • isBase64Encoded is set to true

You need to do all three of these things for API Gateway to return the response as binary data – consider them the magic incantation.

On top of that, the caller also has to set the Accept header to application/x-protobuf.

In the same project, there’s also a JSON endpoint that returns the same payload for comparison.

The response from this JSON endpoint looks like this:

{"players":[{"id":"eb66db14992e06b36282d607cf0134ce4fe45f50","name":"Calvin Ortiz","scores":[57,12,100,56,47,78,20,37,32,48]},{"id":"7b9b38e535453d120e706ff57fef41f6fee991cb","name":"Marcus Cummings","scores":[40,57,24,15,45,54,25,67,59,23]},{"id":"db34a2a5f4d16e77a6d3d6154a8b8bb6760b3b99","name":"Harry James","scores":[61,85,14,70,8,80,14,22,76,87]},{"id":"e21018c4f43eef10771e0fa71bc54156b00a64dd","name":"Gregory Bishop","scores":[51,31,27,47,72,75,61,28,100,41]},{"id":"b3ee29ee49b640ce15be1737d0dca60e48108ee1","name":"Ann Evans","scores":[69,17,48,99,85,8,75,55,78,46]},{"id":"9c1e6d4d46bb0c0d2c92bab11e5dbd5f4ab0c619","name":"Juan Perez","scores":[71,34,60,84,21,98,60,8,91,92]},{"id":"d8de89222633c61393931457c1e72558eba48639","name":"Loretta Harvey","scores":[15,40,73,92,42,65,58,30,26,84]},{"id":"141dad672ec559431f808964391d128d2c3274bf","name":"Ian Powell","scores":[17,21,14,84,64,14,22,22,34,92]},{"id":"8a97e85e2e5385c45fc31f24bfe781c26f78c0b7","name":"Steve Gibson","scores":[33,97,6,1,20,1,78,3,77,19]},{"id":"6b3ca6924e17cd5fd9d91b36d49b36a5d542c9ea","name":"Harold Ferguson","scores":[31,32,4,10,37,85,46,86,39,17]}]}

As you can see, it’s just a bunch of randomly generated names, GUIDs and integers. The same response in Protocol Buffers is nearly 40% smaller.

Problem with the protobufjs package

Before we move on, there is one important detail about using the protobufjs package in a Lambda function – you need to npm install the package on a Linux system.

This is because it has a dependency that is distributed as native binaries, so if you installed the package on OSX then the binaries that are packaged and deployed to Lambda will not run on the Lambda execution environment.

I had similar problems with other Google libraries in the past. I find the best way to deal with this is to take a leaf out of aws-serverless-go-shim’s approach and deploy your code inside a Docker container.

This way, you would locally install a compatible version of the native binaries for your OS so you can continue to run and debug your function with sls invoke local (see this post for details).

But, during deployment, a script would run npm install --force in a Docker container running a compatible Linux distribution. This would then install a version of the native binaries that can be executed in the Lambda execution environment. The script would then use sls deploy to deploy the function.

The deployment script can be something simple like this:
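Something along these lines (a sketch – it assumes the Serverless framework is installed as a dev dependency of the project):

  #!/bin/sh
  # runs inside a Linux container, so the native dependencies are built
  # for the Lambda execution environment
  npm install --force
  ./node_modules/.bin/sls deploy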

In the demo project, I also have a docker-compose.yml file:
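It looks something like this (the image, paths and service name here are illustrative – the important part is the mounted $HOME/.aws volume):

  version: '2'
  services:
    deploy:
      image: node:6.10                # match the Lambda nodejs runtime
      working_dir: /app
      volumes:
        - .:/app                      # the service's code
        - $HOME/.aws:/root/.aws       # AWS credentials for the Serverless framework
      command: sh ./deploy.sh         # the deployment script sketched above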

The Serverless framework requires my AWS credentials, which is why I’ve attached the $HOME/.aws directory to the container for the AWS SDK to find at runtime.

To deploy, run docker-compose up.

Use HTTP content negotiation

Whilst binary formats are more efficient when it comes to payload size, they do have one major problem: they’re really hard to debug.

Imagine the scenario – you have observed a bug, but you’re not sure if the problem is in the client app or the server. But hey, let’s just observe the HTTP conversation with an HTTP proxy such as Charles or Fiddler.

This workflow works great for JSON but breaks down when it comes to binary formats such as Protocol Buffers as the payloads are not human readable.

As we have discussed in this post, the human readability of JSON comes at the cost of heavier bandwidth usage. For most network communications – be it service-to-service or service-to-client – it’s not worth paying that cost unless a human is actively “reading” the payloads. But when a human does need to read them, that readability is very valuable.

Fortunately, HTTP’s content negotiation mechanism means we can have the best of both worlds.

In the demo project, there is a contentNegotiated function which returns either JSON or Protocol Buffers payloads based on the Accept header.
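A sketch of what such a handler might look like (the schema and message names are illustrative):

  const protobuf = require('protobufjs');
  const schema = protobuf.load('players.proto');

  module.exports.handler = (event, context, callback) => {
    const headers = event.headers || {};
    const accept = headers['Accept'] || headers['accept'] || 'application/json';
    const payload = { players: [ /* ... */ ] }; // however the response payload is produced

    if (accept === 'application/x-protobuf') {
      schema
        .then(root => {
          const PlayerScores = root.lookupType('PlayerScores');
          const buffer = PlayerScores.encode(PlayerScores.create(payload)).finish();
          callback(null, {
            statusCode: 200,
            headers: { 'Content-Type': 'application/x-protobuf' },
            body: Buffer.from(buffer).toString('base64'),
            isBase64Encoded: true
          });
        })
        .catch(callback);
    } else {
      callback(null, {
        statusCode: 200,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload)
      });
    }
  };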

By default, you should use Protocol Buffers for all your network communications to minimise bandwidth use.

But, you should build in a mechanism for toggling the communication to JSON when you need to observe the communications. This might mean:

  • for debug builds of your mobile app, allow super users (devs, QA, etc.) to turn on a debug mode, which switches the networking layer to send the Accept header as application/json (sketched below)
  • for services, include a configuration option to turn on debug mode (see this post on configuring functions with SSM parameters and cache client for hot-swapping) to make service-to-service calls use JSON too, so you can capture and analyze the requests and responses more easily
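On the client side, the switch can be as small as varying a single header – for example (the URL and the debugMode flag are placeholders):

  // sketch: the networking layer picks the Accept header based on a debug flag
  function fetchPlayers(debugMode) {
    return fetch('https://api.example.com/players', {
      headers: {
        Accept: debugMode ? 'application/json' : 'application/x-protobuf'
      }
    });
  }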

As usual, you can try out the demo code yourself – the repo is available here.

AWS Lambda – comparing platform performances

As Lambda adds nodejs 6.10 to its supported platforms, I wondered if there are any performance differences between the platforms. Thankfully, the templates in the Serverless framework make it a relative breeze to test it out with a simple HelloWorld function.

 

The Test

See the test code here.

I created a simple Lambda function for each platform that will respond to an API Gateway event and return “hello”. This is the nodejs version.
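Stripped down, it’s essentially just this (a sketch, not the exact code from the test repo):

  // respond to the API Gateway event with a "hello" message
  module.exports.handler = (event, context, callback) => {
    callback(null, {
      statusCode: 200,
      body: 'hello'
    });
  };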

I decided to use API Gateway as the trigger as it allows me to invoke the function and apply a constant load using standard load testing tools for HTTP. I chose Artillery because you can get going with minimal fuss and I had used it before.

For each platform, I ran a test with 10 virtual users sending 1 request per second (i.e. a total of 10 req/s) for an hour.

artillery quick --duration 3600 --rate 10 -n 1 http://my.lambda.backed.api/

Since we’re interested in the performance characteristics of the different Lambda platforms, we’ll only be looking at the function Duration metric, and we’ll ignore the initial cold start times.

 

Observation 1 – C# is slower?

Unsurprisingly the invocation duration is fairly consistent across the functions, although C# is sticking out like a sore thumb.

Take this 10-minute window for instance – where there were no spikes that looked like cold starts – the C# platform is consistently higher than the rest.

 

Observation 2 – Java has very consistent performance

If you look at the max duration for the same 10-minute window – for whatever reason, I didn’t get any percentile metrics from CloudWatch for the entire duration of the test so I had to settle for max instead – the Java platform was both lower and had less variance, by some distance.

If you compare the average and max duration for the Java platform over a longer time window, you’ll also see that there’s very little difference between the two (if you ignore the spike at 01:38, which might be down to a GC pause as opposed to a cold start), which suggests the performance of the Java platform is very consistent.

 

Observation 3 – static languages have more consistent performance?

Following on from the previous observation, it seems that both C# and Java show less variance when it comes to max duration, so perhaps it’s because both are compiled languages?

 

Observation 4 – Java packages are big…

One of the benefits of using nodejs and Python to write Lambda functions is that they produce much smaller packages, which we know translates to lower cold start time. Now, the fault might lie with the Serverless template for aws-java-maven, but my HelloWorld Java example produces a whopping 2MB package, which is orders of magnitude bigger than the nodejs and Python functions. I expected it to be bigger than nodejs, but perhaps closer to the size of the C# package.

 

Conclusions

Take these results with a pinch of salt. Things are evolving at an incredible pace and whatever performance discrepancies we’re seeing today can change quickly as AWS improves all the platforms behind the scenes.

Even as I observe that the C# platform appears to be slower in this test, we’re talking about a sub-millisecond difference for a HelloWorld example, hardly representative of a real-world application. The DotNetCore platform itself (which C# Lambda functions run on) is also evolving quickly, and any future performance improvements in that underlying platform will be transferred to you at no cost, so don’t let this post dissuade you from writing Lambda functions in C#.

MS Bond benchmark updated

DISCLAIMER: as always, you should benchmark against your payload and use case; the benchmark numbers I have produced here are unlikely to be representative of your use cases, and neither are anybody else’s benchmark numbers.

You can use the simple test harness I created and see the example code to benchmark against your particular payload.

 

I recently added MS Bond to my benchmark and found some interesting numbers, which prompted a question on their repo.

Adam Sapek explained that the slow serialization speed I was seeing was down to the default buffer size being 64KB, which is not suitable for the payload I was testing with.

Adjusting the buffer size to 256 bytes resulted in some pretty amazing results:

[benchmark charts: serialization/deserialization times and payload sizes]

Fastest serialization & deserialization, and smallest payload.

Wow.

 

Have a look at the performance tuning guide – there are quite a few tweaks you can make to improve performance further, but it’ll depend on your payload.

MS Bond and Chiron benchmarked

DISCLAIMER: as always, you should benchmark against your payload and use case; the benchmark numbers I have produced here are unlikely to be representative of your use cases, and neither are anybody else’s benchmark numbers.

You can use the simple test harness I created and see the example code to benchmark against your particular payload.

 

I updated my binary and JSON serializers benchmark earlier this week, and got some feedback on new serializers that I had missed, namely Chiron and Microsoft’s Bond. Here, we’ll have a look at how the two fared in the benchmark.

 

MS Bond

Microsoft announced their answer to Google’s Protocol Buffers with Bond this time last year (Jan 2015). I’ve finally got around to actually testing it out (after an ex-Gamesys colleague commented on the last update – thanks Rob!).

First, you define your contract with a .bond file (see tutorial here), for example…

[image: an example .bond contract]

Now you run the Bond compiler tool, gbc, against this file to generate a C# class that looks like this…

[image: the C# class generated by gbc]

To serialize and deserialize data, you also need to add the Bond C# nuget package to your project and follow the examples in the aforementioned tutorial.

Here’s how Bond fared against other binary serializers on my list.

NOTE: there’s an updated benchmark test that uses a different initial buffer size which makes a huge difference in performance for Bond. Please read the linked post for more info.

[benchmark charts: how Bond compares to the other binary serializers on speed and payload size]

The results make for interesting reading…

  • Bond produced the smallest payload, and is the fastest at deserializing the payload by some distance.
  • It is also the slowest at serializing the payload!

 

Chiron

I read about Chiron in Marcus Griep‘s F# advent post but then forgot about it (totally my bad… too many hours on Bloodborne over xmas, such an awesome game).

Anyways, Chiron has an F#-friendly API, but because it uses statically resolved type parameters you can’t use it from C#.

In order to serialize/deserialize a type, the type needs to define the static methods ToJson and FromJson. The inlined serialize and deserialize functions can then constrain your type to have those static members and invoke them in the corresponding function. I used the same technique in MBrace.AWS and honestly, I’m not happy with the amount of work this pushes onto the user, especially when they end up having to write uninteresting plumbing code…

On the API front, I’m not thrilled with the custom operators either, even though there are only 3 of them so I’m probably just over-reacting. In general I find custom operators get in the way of discovery.

Reading through the post, this paragraph suggests a lot of intermediate JsonResult<'a> and Json objects are created during the serialization process. Whilst this might be an idiomatic functional approach, it’s also likely to hurt our performance…

The *> operator that we used in ToJson discards the JsonResult<'a> (which is only used when writing), but continues to build upon the Json object from the previous operation. By chaining these operations together, we build up the members of a Json.Object.

Unsurprisingly, this immutability proved really costly in the benchmark.

[benchmark charts: how Chiron compares to the other JSON serializers on speed and payload size]

 

So that’s it folks, another 2 serializers added to our stable. If there are any other serializers that you think I should include here, please give me a shout and I’ll do my best to accommodate.

Binary and Json benchmarks updated

It’s been a while since I last updated my binary and JSON serializer benchmarks, so here I round up the latest versions of the serializers on here.

 

DISCLAIMER: as always, you should benchmark against your payload and use case; the benchmark numbers I have produced here are unlikely to be representative of your use cases, and neither are anybody else’s benchmark numbers.

You can use the simple test harness I created and see the example code to benchmark against your particular payload.

 

Binary

Only FsPickler was updated for this benchmark so there are no significant changes in numbers here (with the exception of the BinaryWriter!).

[benchmark charts: binary serializers – speed and payload size]

 

JSON

Quite a few of the JSON serializers have been updated:

  • FsPickler
  • Jil
  • MongoDB Driver
  • NetJson
  • Newtonsoft.Json (aka Json.Net)
  • ServiceStack.Text

Jil seems to have made the biggest gains since the last time.

[benchmark charts: JSON serializers – speed and payload size]

*protobuf-net is in this list purely as a benchmark, to show how the tested JSON serializers compare to one of the fastest binary serializers (both in terms of speed and payload size).