Update 01/10/2019: a few of you have mentioned layers in the comments or on social media. Scroll to the bottom to see my thoughts on layers as a solution to the problems I mentioned in this post.
A version of the AWS SDK is always bundled into the Lambda runtime for your language. So the conventional wisdom says you don’t need to include it in your deployment artefact.
I’m here to tell you otherwise.
But first, let’s examine the argument for not packing the AWS SDK.
Deployment artefacts are smaller
This means deployment is faster. Although the bulk of the deployment time is with CloudFormation, not uploading the artefact. So the actual time saving depends on your internet speed.
It also means you are less likely to hit the 75GB code storage limit. Which is fair, but I feel this risk can be mitigated with the lambda-janitor.
Lastly, you can actually see and edit your code in the Cloud9 editor! Alright, this is pretty useful when you just wanna quickly check something out. Although when you’re building a production service, you wouldn’t want to rely on the AWS console, ever. You certainly wouldn’t be making code changes directly in the console!
So what’s wrong with relying on the built-in version of the AWS SDK?
The version of the built-in AWS SDK is often much older than the latest. Which means it doesn’t have the latest security patches, bug fixes as well as support for new features. Known vulnerabilities can exist in both the AWS SDK itself as well as its transient dependencies.
This can also manifest itself in subtle differences in the behaviour or performance of your application. Or as bugs that you can’t replicate when you invoke the function locally. Both are problems that are difficult to track down.
If you rely on tests that execute the Lambda code locally then the fact the runtime has a different version of AWS SDK would invalidate your tests. Personally, I think you should always have end-to-end tests as well. Which runs against the deployed system and checks everything works as expected when running inside AWS.
No immutable infrastructure
AWS can update the version of the built-in AWS SDK without notice. This can introduce subtle differences to your application’s behaviour without any action from you.
A while back, AWS updated the version of boto in the Python runtime. Unfortunately, the version they upgraded to had a bug. It caused many people’s functions to suddenly break. This affected Bustle amongst others and many hours were wasted on debugging the issue. As you can imagine, it wasn’t the easiest problem to track down!
In summary, my arguments against using the built-in AWS SDK are:
- The built-in AWS SDK is often out-dated and missing security patches and bug fixes.
- It invalidates integration tests since the runtime uses a different version of the AWS SDK to what was tested.
- AWS can update the built-in AWS SDK without notice
Ultimately, I think the tradeoff is between convenience and safety. And I for one, am happy to sacrifice small conveniences for immutable infrastructure.
What about layers?
Quite a few of you have mentioned Lambda layers as a solution. Yes, you can ship the AWS SDK via layers instead. It would address point 1 and 3 above, but it introduces other challenges. I discussed those challenges in this post, so please give that a read.
Granted, some of those challenges have been tackled – e.g. the Serverless framework now supports layers when you invoke functions locally, and you can introduce semantic versioning to layers by wrapping them in SAR apps. However, it still remains a tooling problem if you want to run unit or integration tests before deploying your functions to AWS. In both cases, you’d still need a copy of the AWS SDK locally. For a dependency that is relatively small (a few MBs) and should be updated frequently, such as the AWS SDK, I don’t feel layers is worth the hassle it brings.
p.s. AWS also recommends including your own copy of the AWS SDK in the official documentation.
I specialise in rapidly transitioning teams to serverless and building production-ready services on AWS.
Are you struggling with serverless or need guidance on best practices? Do you want someone to review your architecture and help you avoid costly mistakes down the line? Whatever the case, I’m here to help.
Check out my new course, Complete Guide to AWS Step Functions. In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. Including basic concepts, HTTP and event triggers, activities, callbacks, nested workflows, design patterns and best practices.
Come learn about operational BEST PRACTICES for AWS Lambda: CI/CD, testing & debugging functions locally, logging, monitoring, distributed tracing, canary deployments, config management, authentication & authorization, VPC, security, error handling, and more.
You can also get 40% off the face price with the code ytcui.
Here is a complete list of all my posts on serverless and AWS Lambda. In the meantime, here are a few of my most popular blog posts.
- Lambda optimization tip – enable HTTP keep-alive
- You are thinking about serverless costs all wrong
- Many faced threats to Serverless security
- We can do better than percentile latencies
- I’m afraid you’re thinking about AWS Lambda cold starts all wrong
- Yubl’s road to Serverless
- AWS Lambda – should you have few monolithic functions or many single-purposed functions?
- AWS Lambda – compare coldstart time with different languages, memory and code sizes
- Guys, we’re doing pagination wrong