You need to use structured logging with AWS Lambda

This is a lesson I wish I had learnt when I first started using AWS Lambda in anger. It would have made my life simpler right from the start.

But we did get there before long, and it allowed us to track correlation IDs and include them in our log messages (which are then pushed to an ELK stack), along with other useful information such as:

  • age of the function execution
  • whether invocation was a cold start
  • log level
  • etc.
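
As a rough illustration of where some of this information can come from, here is a minimal Node.js sketch for the cold start flag. It relies on the fact that module-level code runs once per execution environment, so a module-level flag is only true for the first invocation. The helper name invocationContext and the field names are made up for this example; awsRequestId is the request ID that Lambda passes in on the context object.

```javascript
// Module scope is initialised once per execution environment, so this flag
// is only true for the first invocation served by a fresh container.
let coldStart = true;

// Hypothetical helper: builds the fields to attach to every log message
// written during this invocation.
function invocationContext(awsRequestId) {
  const fields = { awsRequestId, coldStart };
  coldStart = false; // later invocations in this container are warm
  return fields;
}

// Hypothetical handler showing where the helper would be called.
async function handler(event, context) {
  const fields = invocationContext(context.awsRequestId);
  console.log(JSON.stringify({ ...fields, level: 'INFO', message: 'invocation started' }));
  return fields;
}
```

The first invocation in a container logs coldStart as true and every later one logs it as false, which makes it easy to separate cold start latency from everything else once the logs are in your aggregator.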

Looking back, I fell into the common trap of forgetting the practices that had served us well in a different paradigm of developing software, at least until I figured out how to adapt these old practices for the new serverless paradigm.

In this case, we know structured logs are important, why we need them, and how to do them well. But like many others, we started with console.log because it was simple and it worked (to a limited degree).

But this approach has a really low ceiling:

  • you can’t add contextual information to the log message, at least not in a way that’s consistent and easy for an automated process to extract (you may change the message you log, but the contextual information, e.g. user-id, request-id, etc., should always be there)
  • as a result, it’s also hard to filter the log messages by specific attributes, e.g. “show me all the log messages related to this request ID”
  • it’s hard to control what level to log at (i.e. debug, info, warning, …) based on configuration, e.g. log at debug level for non-production environments, but at info/warning level for production

So if you’re just starting your serverless journey, learn from my mistakes and write your logs as structured JSON from the start. You should also keep using whatever log client you were using before (log4j, nlog, loggly, log4net, whatever it is) and configure it to format log messages as JSON and attach as much contextual information as you can.
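
If you’re on Node.js and don’t have a log client to bring along, a structured logger doesn’t need much. Below is a minimal sketch rather than a production client: createLogger, the LOG_LEVEL environment variable and the field names are all assumptions made for this example.

```javascript
// Numeric weights so that levels can be compared against a threshold.
const LEVELS = { DEBUG: 20, INFO: 30, WARN: 40, ERROR: 50 };

// Hypothetical factory: `context` holds the fields that must appear on every
// message (request ID, user ID, ...). `write` is injectable so that the
// output can be captured in tests.
function createLogger({
  level = process.env.LOG_LEVEL || 'DEBUG', // assumed env var name
  context = {},
  write = console.log,
} = {}) {
  // Unknown level names fall back to DEBUG so nothing is silently dropped.
  const threshold = LEVELS[String(level).toUpperCase()] ?? LEVELS.DEBUG;

  const emit = (levelName, message, params = {}) => {
    if (LEVELS[levelName] < threshold) return; // below the configured level
    // Context is spread after params so a message cannot overwrite it.
    write(JSON.stringify({ ...params, ...context, level: levelName, message }));
  };

  return {
    debug: (message, params) => emit('DEBUG', message, params),
    info: (message, params) => emit('INFO', message, params),
    warn: (message, params) => emit('WARN', message, params),
    error: (message, params) => emit('ERROR', message, params),
  };
}

// Usage: the debug message is dropped, the info message is written as one
// line of JSON that carries requestId and userId alongside orderId.
const logger = createLogger({
  level: 'INFO',
  context: { requestId: 'req-123', userId: 'user-42' },
});
logger.debug('cache miss', { key: 'abc' });
logger.info('order placed', { orderId: 'order-1' });
```

Because every message is a single JSON object with the same contextual fields, “show me all the log messages related to this request ID” becomes a simple filter on requestId, and the log level is a configuration change rather than a code change.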

As I mentioned in the post on how to capture and forward correlation IDs, it’s also a good idea to enable debug logging on the entire call chain for a small % of requests in production. It helps you catch pervasive bugs in your logic (that are easy to catch, but ONLY if you have the right info from the logs) that would otherwise require you to redeploy all the functions on the entire call chain to turn on debug logging…
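
One way to sketch the sampling decision in Node.js: the function at the edge of the call chain rolls the dice, and every function after it honours the decision it was handed, so the whole chain logs at the same level. The header name debug-log-enabled, the 1% default and both helper names are assumptions for illustration, and how the flag is actually forwarded depends on how your functions call each other.

```javascript
// Hypothetical header used to carry the decision along the call chain.
const DEBUG_HEADER = 'debug-log-enabled';

// Decide whether this invocation should log at debug level.
// `random` is injectable so that the behaviour can be tested deterministically.
function isDebugEnabled(headers, sampleRate = 0.01, random = Math.random) {
  const incoming = headers || {};
  // An upstream function already decided: follow it so the chain agrees.
  if (DEBUG_HEADER in incoming) return incoming[DEBUG_HEADER] === 'true';
  // Otherwise we are at the edge: sample a small fraction of requests.
  return random() < sampleRate;
}

// Attach the decision to whatever is sent to the next function in the chain.
function withDebugFlag(headers, debugEnabled) {
  return { ...(headers || {}), [DEBUG_HEADER]: String(debugEnabled) };
}
```

The result of isDebugEnabled can then be used to pick the log level for the invocation, and withDebugFlag makes sure downstream functions make the same choice instead of sampling again.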

So there you have it: a simple and effective thing to do to massively upgrade your serverless architecture. Check out my mini-series on logging for AWS Lambda, which covers log aggregation, tracking correlation IDs, and some tips and tricks such as why and how to send custom metrics asynchronously.