You can become a serverless blackbelt. Enrol to my 4-week online workshop Production-Ready Serverless and let me take you through building a production-ready serverless application from testing, deployment, security right through to observability. This workshop gives you hands-on experience building something with serverless technologies while giving you a broader view of the challenges you will face as the architecture matures and expands. By the end of the workshop, you should have a clear idea when serverless is a good fit for your system as well as common pitfalls you need to avoid.
Imagine the situation. It’s 3 AM, and PagerDuty alerts you that the number of 5XX errors has spiked. Something’s wrong.
Strange, there are no error logs…
Fortunately, we have enabled X-Ray tracing and I’m able to quickly find one of the traces for these 5XX errors.
Sadly, the traces don’t offer any explanation for the error.
After fumbling around aimlessly for a while for clues, a moment of genius struck. I just so happens to notice the function’s timeout value is set to 1 second. The
Duration from the trace was 1.3 seconds.
Aha, the Lambda function probably timed out! Now that I have a suspected cause, I’m able to get the confirmation by looking for
"Task timed out" messages in the logs. And sure enough, I’m able to find them.
And now I can start the real investigation into what happened and find out what caused the function to time out.
This would have been much easier if only I had an error log message that informs me about these time outs!
If all this sounded awful, then you’re in luck. Today, we’re gonna see how we can log an error message when a Lambda function times out.
Pre-emptively log timeout messages
Hold on to your hats, because the solution is mind-bogglingly simple! All you have to do is to log the error message just before the invocation times out ;-)
context object for a Node.js function has a very useful method called
getRemainingTimeInMillis. It returns the number of milliseconds left in the current invocation. So, we can schedule a callback to be actioned JUST before the function times out and preemptively logs the timeout event.
Sure, you might get the odd false positives in the logs. But that is still a lot better than not having any error logs when the function does time out!
This is a cross-cutting concern that should be encapsulated into a middleware. If you’re using Node.js then you should consider using the middy middleware engine. Here, I can write a very simple middleware to log timeout errors:
Couple of things to note about this code:
ln21: we schedule a callback to be executed 10ms before the function times out.
ln38: if the function completes before the scheduled callback runs then we will cancel the callback using bluebird’s Promise.cancel.
ln18: this is important. If you don’t set
callbackWaitsForEmptyEventLoopto false then the invocation won’t complete until the callback has executed. That would make every invocation last longer!
ln24: capture the invocation event when the function times out. This is useful for replaying the invocation locally to debug the problem.
ln2: we are using the logger from dazn-lambda-powertools so it’ll automatically populate any captured correlation IDs in the logs. For more details, check the README in the repo.
If you want to see this middleware in action then take a look at this demo project. Once deployed, I can curl the HTTP endpoint and see an error message for timeouts.
So that’s it! The approach is really simple.
To make it really easy for you to apply this to your projects, we have added this middleware to
dazn-lambda-powertools so you will be able to use it via NPM.
I specialise in rapidly transitioning teams to serverless and building production-ready services on AWS.
Are you struggling with serverless or need guidance on best practices? Do you want someone to review your architecture and help you avoid costly mistakes down the line? Whatever the case, I’m here to help.
Check out my new podcast Real-World Serverless where I talk with engineers who are building amazing things with serverless technologies and discuss the real-world use cases and challenges they face. If you’re interested in what people are actually doing with serverless and what it’s really like to be working with serverless day-to-day, then this is the podcast for you.
Check out my new course, Learn you some Lambda best practice for great good! In this course, you will learn best practices for working with AWS Lambda in terms of performance, cost, security, scalability, resilience and observability. We will also cover latest features from re:Invent 2019 such as Provisioned Concurrency and Lambda Destinations. Enrol now and start learning!
Check out my video course, Complete Guide to AWS Step Functions. In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. There is something for everyone from beginners to more advanced users looking for design patterns and best practices. Enrol now and start learning!
Are you working with Serverless and looking for expert training to level-up your skills? Or are you looking for a solid foundation to start from? Look no further, register for my Production-Ready Serverless workshop to learn how to build production-grade Serverless applications!
Here is a complete list of all my posts on serverless and AWS Lambda. In the meantime, here are a few of my most popular blog posts.
- Lambda optimization tip – enable HTTP keep-alive
- You are wrong about serverless and vendor lock-in
- You are thinking about serverless costs all wrong
- Just how expensive is the full AWS SDK?
- Many faced threats to Serverless security
- We can do better than percentile latencies
- Yubl’s road to Serverless
- AWS Lambda – should you have few monolithic functions or many single-purposed functions?
- AWS Lambda – compare coldstart time with different languages, memory and code sizes
- Guys, we’re doing pagination wrong
- Top 10 Serverless framework best practices