Beware of dilution of DynamoDB throughput due to excessive scaling

TL;DR: The number of partitions in a DynamoDB table goes up in response to increased load or storage size, but it never comes back down, ever.

DynamoDB is pretty great, but I have now seen this particular problem at 3 different companies (Gamesys, JUST EAT, and now Space Ape Games), so I think it’s a behaviour that more folks should be aware of.

Credit to AWS: they have regularly talked about the formula for working out the number of partitions at DynamoDB Deep Dive sessions.

However, they often forget to mention that DynamoDB will not decrease the number of partitions when you reduce your throughput units. It’s a crucial detail that is badly under-represented in the lengthy Best Practice guide.

Consider the following scenario:

  • you dial up the throughput for a table because there’s a sudden spike in traffic or you need the extra throughput to run an expensive scan
  • the extra throughput causes DynamoDB to increase the number of partitions
  • you dial the throughput back down to previous levels, but now you notice that some requests are throttled even though you have not exceeded the provisioned throughput on the table

This happens because there are fewer read and write throughput units per partition than before, due to the increased number of partitions. It translates to a higher likelihood of exceeding read/write throughput on a per-partition basis (even if you’re still under the throughput limits on the table overall).
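
To make the dilution concrete, here is a minimal sketch in Python. It assumes the partition formula as AWS has presented it publicly (roughly one partition per 3,000 read units, per 1,000 write units, or per 10 GB of data, whichever gives the larger count), and it assumes throughput is split evenly across partitions. The function names and the example numbers are made up for illustration, and the limits DynamoDB actually applies internally may differ.

```python
import math

def estimate_partitions(rcu, wcu, size_gb):
    """Estimate the partition count for a given provisioning level.

    Assumed per-partition limits: 3000 RCU, 1000 WCU, 10 GB.
    """
    by_throughput = math.ceil(rcu / 3000 + wcu / 1000)
    by_size = math.ceil(size_gb / 10)
    return max(by_throughput, by_size, 1)

def simulate(steps, size_gb):
    """Apply a series of (rcu, wcu) settings; partitions never decrease."""
    partitions = 0
    history = []
    for rcu, wcu in steps:
        # The partition count only ever ratchets upwards.
        partitions = max(partitions, estimate_partitions(rcu, wcu, size_gb))
        history.append((rcu, wcu, partitions, wcu / partitions))
    return history

# Hypothetical table: normal load, a temporary 10x write boost, then back.
steps = [(500, 500), (500, 5000), (500, 500)]
history = simulate(steps, size_gb=5)
for rcu, wcu, partitions, wcu_per_partition in history:
    print(f"{rcu} RCU / {wcu} WCU -> {partitions} partitions, "
          f"{wcu_per_partition:.1f} WCU per partition")
```

Under these assumptions the table starts with 1 partition holding all 500 write units, grows to 6 partitions during the boost, and ends up with the same 500 write units spread over 6 partitions, or roughly 83 per partition.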

When this dilution of throughput happens you can:

  1. migrate to a new table
  2. specify higher table-level throughput to boost the throughput units per partition to previous levels

Given the difficulty of table migrations, most folks would opt for option 2, which is how JUST EAT ended up with a table with 3000+ write throughput units despite consuming closer to 200 write units/s.
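
The arithmetic behind option 2 can be sketched as follows, again assuming throughput is divided evenly across partitions. The helper names and the partition count in the first example are hypothetical; only the 3000 and 200 figures come from the JUST EAT case above.

```python
def required_table_wcu(target_wcu_per_partition, partitions):
    """Table-level WCU needed so each partition gets the target share.

    Assumes provisioned throughput is divided evenly across partitions.
    """
    return target_wcu_per_partition * partitions

def overprovision_factor(provisioned, consumed):
    """How many times more throughput is paid for than is actually used."""
    return provisioned / consumed

# Hypothetical table: 100 WCU per partition was enough before the spike,
# and the spike left the table with 12 partitions.
needed = required_table_wcu(target_wcu_per_partition=100, partitions=12)
print(needed)  # 1200

# JUST EAT figures from above: 3000 provisioned vs. roughly 200 consumed.
factor = overprovision_factor(3000, 200)
print(factor)  # 15.0
```

In other words, the table is paying for at least 15 times the write throughput it actually consumes, just to keep each partition at its old level.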

In conclusion, you should think very carefully before scaling up a DynamoDB table drastically in response to temporary needs, as it can have long-lasting cost implications.