You should use SSM Parameter Store over Lambda env variables

AWS Lambda announced native support for environment variables at the end of 2016. Even before that, the Serverless framework had supported environment variables, and I was using them happily as my team at the time and I migrated our monolithic Node.js backend to serverless.

However, as our architecture expanded we found several drawbacks with managing configurations with environment variables.

Hard to share configs across projects

The biggest problem for us was the inability to share configurations across projects, since environment variables are function-specific at runtime.

The Serverless framework has the notion of services, which is just a way of grouping related functions together. You can specify service-wide environment variables as well as function-specific ones.

A sample serverless.yml that specifies both service-wide as well as function-specific environment variables.
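As a rough illustration of that split, here is a minimal sketch of such a file. The service name, function name, handler path and variable names are all made up for the example, and the runtime is an assumption.

```yaml
service: relationship-api        # hypothetical service name

provider:
  name: aws
  runtime: nodejs6.10            # assumed; use whichever runtime you deploy with
  environment:                   # service-wide: every function below gets these
    STAGE: dev
    LOG_LEVEL: debug

functions:
  get-followers:                 # hypothetical function
    handler: functions/get-followers.handler
    environment:                 # function-specific: only this function gets it
      MAX_PAGE_SIZE: '50'
```

Either way, the values end up attached to individual functions at deployment time, which is why they can't be shared across services.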

However, we often found that configurations need to be shared across multiple services. When these configurations changed, we had to update and redeploy all the functions that depend on them. Just tracking these dependencies was becoming a challenge in itself, as they were spread across many GitHub repos maintained by different members of the team.

For example, as we were migrating from a monolithic system piece by piece whilst delivering new features, we weren’t able to move away from the monolithic MongoDB database in one go. It meant that lots of functions shared MongoDB connection strings. When one of these connection strings changed (and it did, several times), pain and suffering followed.

Another configurable value we often share is the root URL of intermediate services. Being a social network, many of our user-initiated operations depend on relationship data, so many of our microservices depend on the Relationship API. Instead of hardcoding the URL to the Relationship API in every service (one of the deadly microservice anti-patterns), it should be stored in a central configuration service.

Hard to implement fine-grained access control

When you need to configure sensitive data such as credentials, API keys or DB connection strings, the rules of thumb are:

  1. data should be encrypted at rest (includes not checking them into source control in plain text)
  2. data should be encrypted in-transit
  3. apply the principle of least privilege to function’s and personnel’s access to data

If you’re operating in a heavily regulated environment then point 3 might be not just good practice but a regulatory requirement. I know of many fintech companies and financial juggernauts where access to production credentials is tightly controlled and available only to a handful of people in the company.

Whilst efforts such as the serverless-secrets-plugin deliver on point 1, they couple one’s ability to deploy Lambda functions with one’s access to sensitive data, i.e. whoever deploys the function must have access to the sensitive data too. This might be OK for many startups, where everyone has access to everything, but ideally your process for managing access to data can evolve with the company’s needs as it grows up.

SSM Parameter Store

My team outgrew environment variables, and I started looking at other popular solutions in this space, such as etcd and Consul. But I really didn’t fancy these solutions because:

  • they’re costly to run: you need to run several EC2 instances in a multi-AZ setting for HA
  • you have to manage these servers
  • they each have a learning curve with regards to both configuring the service as well as the CLI tools
  • we needed a fraction of the features they offer

This was 5 months before Amazon announced SSM Parameter Store at re:Invent 2016, so at the time we built our own Configuration API with API Gateway and Lambda.

Nowadays, you should just use the SSM Parameter Store because:

  • it’s a fully managed service
  • sharing configurations is easy, as it’s a centralised service
  • it integrates with KMS out-of-the-box
  • it offers fine-grained control via IAM
  • it records a history of changes
  • you can use it via the console, the AWS CLI, or its HTTPS API

In short, it ticks all our boxes.

You have fine-grained control over what parameters a function is allowed to access.
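With the Serverless framework, one way to express this is an IAM statement scoped to a single parameter ARN. The snippet below is only a sketch: the region, account ID and parameter name are placeholders, and if the parameter is a SecureString the function's role would also need kms:Decrypt on the key that encrypts it.

```yaml
provider:
  name: aws
  iamRoleStatements:
    - Effect: Allow
      Action:
        - ssm:GetParameters
      # hypothetical region, account ID and parameter name
      Resource: arn:aws:ssm:us-east-1:123456789012:parameter/relationship-api-url
```

A function deployed with this role can read that one parameter and nothing else in the store.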

There are a couple of service limits to be aware of:

  • max 10,000 parameters per account
  • max length of parameter value is 4096 characters
  • max 100 past values for a parameter

Client library

Having a centralised place to store parameters is just one side of the coin. You should still invest effort into making a robust client library that is easy to use, and supports:

  • caching & cache expiration
  • hot-swapping configurations when the source config value has changed

Here is one such client library that I put together for a demo:

To use it, you can create config objects with the loadConfigs function. These objects will expose properties that return the config values as Promises (hence the yield, which is the magic power we get with co).

You can have different config values with different cache expiration too.
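To make the caching idea concrete, here is a minimal sketch of what such a client could look like. It is not the demo library itself: the loadConfigs signature shown here, the fetchFromSsm helper and the stale-on-error behaviour are my assumptions for illustration, and it uses plain Promises rather than co. The fetcher is injected so the cache logic can be exercised without AWS.

```javascript
'use strict';

// Hypothetical sketch: returns an object whose properties are Promises for the
// named parameters. `fetchValues(keys)` must return a Promise of { key: value }.
function loadConfigs(keys, expiryMs, fetchValues) {
  let cache = null;      // last successfully fetched values
  let expiresAt = 0;     // epoch millis after which the cache counts as stale
  let inflight = null;   // shared by concurrent callers so we fetch only once

  function refresh() {
    if (!inflight) {
      inflight = fetchValues(keys)
        .then(values => {
          cache = values;                       // hot-swap to the new values
          expiresAt = Date.now() + expiryMs;
          return values;
        })
        .catch(err => {
          if (cache) return cache;              // serve stale values on failure
          throw err;                            // nothing cached yet: surface it
        })
        .then(
          values => { inflight = null; return values; },
          err => { inflight = null; throw err; }
        );
    }
    return inflight;
  }

  function getValues() {
    const fresh = cache !== null && Date.now() < expiresAt;
    return fresh ? Promise.resolve(cache) : refresh();
  }

  const config = {};
  keys.forEach(key => {
    Object.defineProperty(config, key, {
      enumerable: true,
      get: () => getValues().then(values => values[key])
    });
  });
  return config;
}

// One possible fetcher, assuming the aws-sdk (v2) package is available.
// GetParameters accepts at most 10 names per call, so batch longer key lists.
function fetchFromSsm(keys) {
  const AWS = require('aws-sdk');  // required lazily so the sketch loads without it
  const ssm = new AWS.SSM();
  return ssm.getParameters({ Names: keys, WithDecryption: true }).promise()
    .then(res => {
      const values = {};
      res.Parameters.forEach(p => { values[p.Name] = p.Value; });
      return values;
    });
}
```

With this sketch, `loadConfigs(['relationship-api-url'], 3 * 60 * 1000, fetchFromSsm)` would give you an object whose `relationship-api-url` property resolves to the cached value and refetches once three minutes have passed. Creating a second object with a different expiry covers values that change more or less often.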

If you want to play around with using SSM Parameter Store from Lambda (or to see this cache client in action), then check out this repo and deploy it to your AWS environment. I haven’t included any HTTP events, so you’d have to invoke the functions from the console.

Update 15/09/2017: Serverless framework release 1.22.0 introduced support for SSM parameters out of the box.

With this latest version of the Serverless framework, you can specify the value of environment variables to come from SSM parameter store directly.
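In the 1.x releases this looks roughly like the following. The parameter names are invented for the example, and the `~true` suffix is what asks the framework to decrypt a SecureString value.

```yaml
provider:
  name: aws
  environment:
    # plain String parameter, resolved when you deploy
    RELATIONSHIP_API_URL: ${ssm:/relationship-api/root-url}
    # SecureString parameter, decrypted by the framework at deployment time
    MONGODB_CONNECTION_STRING: ${ssm:/mongodb/connection-string~true}
```

Note that both values are resolved once, during deployment, and then stored on the function like any other environment variable.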

Compared to many of the existing approaches, it has some benefits:

  • avoid checking in sensitive data in plain text in source control
  • avoid duplicating the same config values in multiple services

However, it still falls short on many fronts (based on my own requirements):

  • since it’s fetching the SSM parameter values at deployment time, it still couples your ability to deploy your function with access to sensitive configuration data
  • the configuration values are stored in plain text as Lambda environment variables, which means you don’t need the KMS permissions to access them, and you can see them in the Lambda console in plain sight
  • further to the above, if the function is compromised by an attacker (who would then have access to process.env) then they’ll be able to easily find the decrypted values during the initial probe (go to the 13:05 mark in this video, where I gave a demo of how easily this can be done)
  • because the values are baked in at deployment time, it doesn’t allow you to easily propagate config value changes. To make a config value change, you will need to a) identify all dependent functions; and b) redeploy all these functions

Of course, your requirements might be very different from mine, and I certainly think it’s an improvement over many of the approaches I have seen. But personally, I still think you should:

  1. fetch SSM parameter values at runtime
  2. cache these values, and hot-swap when source values change