AWS Lambda – janitor-lambda function to clean up old deployment packages

You can become a serverless blackbelt. Enrol to my 4-week online workshop Production-Ready Serverless and gain hands-on experience building something from scratch using serverless technologies. At the end of the workshop, you should have a broader view of the challenges you will face as your serverless architecture matures and expands. You should also have a firm grasp on when serverless is a good fit for your system as well as common pitfalls you need to avoid. Sign up now and get 15% discount with the code yanprs15!

When working with AWS Lambda, one of the things to keep in mind is that there’s a per region limit of 75GB total size for all deployment packages. Whilst that sounds a lot at first glance, our small team of server engineers managed to rack up nearly 20GB of deployment packages in just over 3 months!

Whilst we have been mindful of deployment package size (because it affects cold start time) and heavily using Serverless‘s built-in mechanism to exclude npm packages that are not used by each of the functions, the simple fact that deployment is simple and fast means we’re doing A LOT OF DEPLOYMENTS.

Individually, most of our functions are sub-2MB, but many functions are deployed so often that in some cases there are more than 300 deployed versions! This is down to how the Serverless framework deploy functions – by publishing a new version each time. On its own, it’s not a problem, but unless you clean up the old deployment packages you’ll eventually run into the 75GB limit.

 

Some readers might have heard of Netflix’s Janitor Monkey, which cleans up unused resources in your environment – instance, ASG,  EBS volumes, EBS snapshots, etc.

Taking a leaf out of Netflix’s book, we wrote a Lambda function which finds and deletes old versions of your functions that are not referenced by an alias – remember, Serverless uses aliases to implement the concept of stages in Lambda, so not being referenced by an alias essentially equates to an orphaned version.

janitor-lambda

At the time of writing, we have just over 100 Lambda functions in our development environment and around 50 running in production. After deploying the janitor-lambda function, we have cut the code storage size down to 1.1GB, which include only the current version of deployments for all our stages (we have 4 non-prod stages in this account).

lambda-console

sidebar: if you’d like to hear more about our experience with Lambda thus far and what we have been doing, then check out the slides from my talk on the matter, I’d be happy to write them up in more details too when I have more free time.

 

Janitor-Lambda

Without further ado, here’s the bulk of our janitor function:

Since AWS Lambda throttles you on the no. of APIs calls per minute, we had to store the list of functions in the functions variable so that it carries over multiple invocations.

When we hit the (almost) inevitable throttle exception, the current invocation will end, and any functions that haven’t been completely cleaned will be cleaned the next time the function is invoked.

Another thing to keep in mind is that, when using a CloudWatch Event as the source of your function, Amazon will retry your function up to 2 more times on failure. In this case, if the function is retried straight away, it’ll just get throttled again. Hence why in the handler we log and swallow any exceptions:

I hope you have found this post useful, let me know in the comments if you have any Lambda/Serverless related questions.

Liked this article? Support me on Patreon and get direct help from me via a private Slack channel or 1-2-1 mentoring.
Subscribe to my newsletter


Hi, I’m Yan. I’m an AWS Serverless Hero and I help companies go faster for less by adopting serverless technologies successfully.

Are you struggling with serverless or need guidance on best practices? Do you want someone to review your architecture and help you avoid costly mistakes down the line? Whatever the case, I’m here to help.

Hire me.


Skill up your serverless game with this hands-on workshop.

My 4-week Production-Ready Serverless online workshop is back!

This course takes you through building a production-ready serverless web application from testing, deployment, security, all the way through to observability. The motivation for this course is to give you hands-on experience building something with serverless technologies while giving you a broader view of the challenges you will face as the architecture matures and expands.

We will start at the basics and give you a firm introduction to Lambda and all the relevant concepts and service features (including the latest announcements in 2020). And then gradually ramping up and cover a wide array of topics such as API security, testing strategies, CI/CD, secret management, and operational best practices for monitoring and troubleshooting.

If you enrol now you can also get 15% OFF with the promo code “yanprs15”.

Enrol now and SAVE 15%.


Check out my new podcast Real-World Serverless where I talk with engineers who are building amazing things with serverless technologies and discuss the real-world use cases and challenges they face. If you’re interested in what people are actually doing with serverless and what it’s really like to be working with serverless day-to-day, then this is the podcast for you.


Check out my new course, Learn you some Lambda best practice for great good! In this course, you will learn best practices for working with AWS Lambda in terms of performance, cost, security, scalability, resilience and observability. We will also cover latest features from re:Invent 2019 such as Provisioned Concurrency and Lambda Destinations. Enrol now and start learning!


Check out my video course, Complete Guide to AWS Step Functions. In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. There is something for everyone from beginners to more advanced users looking for design patterns and best practices. Enrol now and start learning!