CraftConf 15 – Takeaways from “Jepsen IV: Hope Springs Eternal”

This talk by Kyle Kingsbury (aka @aphyr on Twitter) was my favourite at CraftConf, and gave us an update on the state of consistency in MongoDB, Elasticsearch and Aerospike.


Kyle opened the talk by pointing out how often we build applications on top of databases, queues, streams, etc., and that these systems we depend on are really quite flammable (hence the tyre analogy).

“As anybody who’s ever used any database knows, everything is on fire all the time! But our goal is to pretend, and ensure that everything still works… we need to isolate the system from failures.”

- Kyle Kingsbury



Which led nicely into the types of failures the rest of the talk focused on – split brain, broken foreign keys, etc. The purpose of his Jepsen project is to analyse how a system behaves in the face of these failures.


A system has boundaries, and these boundaries should be protected by a set of invariants – e.g. if you put something into a queue then you should be able to read it out afterwards.
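The queue example can be phrased as a checkable invariant. A minimal sketch in Python (the function name is mine, not Jepsen's; an in-process queue trivially satisfies it, whereas a distributed queue under partition may not):

```python
import queue

def check_queue_invariant(q, items):
    """Invariant: everything enqueued can be dequeued afterwards,
    in the same order for a FIFO queue."""
    for item in items:
        q.put(item)
    return [q.get() for _ in items] == list(items)

print(check_queue_invariant(queue.Queue(), [1, 2, 3]))  # True
```

Jepsen's tests are essentially invariants like this one, checked against a real database while the network is misbehaving.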

The rest of the talk splits into two halves.

The 1st half built up a model for talking about consistency:


and the 2nd half looked at a number of specific databases – Elasticsearch, MongoDB and Aerospike – to see how they stacked up against the consistency guarantees they claim to offer.


Rather than trying to explain them here and doing a bad job of it, I suggest you read Kyle’s post on the different consistency models shown in his diagram.

It’s a 15–20 minute read, after which you might also be interested in giving these two posts a read:


Instead I’ll just list a few key points I noted during the session:

  • the CAP theorem tells us that a linearizable system cannot be totally available
  • for the consistency models in red, you can’t have total availability (the A in CAP) during a partition
  • for total availability, look to the other area of the diagram
  • weaker consistency models are more available in the face of failure
  • weaker consistency models are also less intuitive
  • weaker consistency models are faster because they require less coordination
  • weak is not the same as unsafe – safety depends on what you’re trying to do, e.g. eventual consistency is fine for counters, but claiming unique usernames requires linearizability
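The counter vs. username distinction can be made concrete. A grow-only counter (a standard CRDT, not something specific to the talk) has a merge operation that is commutative and convergent, which is why eventual consistency suffices for it; a hedged sketch:

```python
# G-counter: each replica tracks its own increments; merging takes the
# per-replica max, so merges commute and replicas converge.
def merge(a, b):
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

def value(counter):
    return sum(counter.values())

replica_a = {"a": 3}           # replica a saw 3 increments
replica_b = {"b": 2}           # replica b saw 2 increments
print(value(merge(replica_a, replica_b)))  # 5 -- merge order doesn't matter

# Claiming a unique username has no such merge: if two partitioned replicas
# both grant "alice", no deterministic merge can undo one of them. The
# check-then-claim step needs a single agreed order, i.e. linearizability.
```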

Kyle’s Jepsen client uses a black-box testing approach (i.e. it only looks at results from a client’s perspective) whilst inducing network partitions, to see how the database behaves during a partition.

The clients generate random operations and apply them to the system. Since the clients run on the same JVM, you can use linearizable data structures to record a history of results as received by the clients, and use that history to detect consistency violations.
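A heavily simplified sketch of that approach (the names and the check are mine, not Jepsen's): concurrent clients apply random reads and writes to a shared register, append what they observed to a lock-protected history, and a checker afterwards looks for one obvious violation – a read returning a value nobody ever wrote.

```python
import random
import threading

history, lock = [], threading.Lock()
register = {"value": 0}

def client(n_ops):
    for _ in range(n_ops):
        if random.random() < 0.5:
            v = random.randint(0, 9)
            register["value"] = v          # apply a random write
            with lock:
                history.append(("write", v))
        else:
            v = register["value"]          # apply a read
            with lock:
                history.append(("read", v))

threads = [threading.Thread(target=client, args=(100,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

# Checker: every read must return the initial value or some written value.
written = {0} | {v for op, v in history if op == "write"}
violations = [v for op, v in history if op == "read" and v not in written]
print(violations)  # [] -- an in-process dict never invents values
```

A real linearizability checker (like Jepsen's Knossos) is far more sophisticated – it must find a legal sequential ordering of overlapping operations – but the record-then-check shape is the same.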

This is similar to the generative testing approach used by QuickCheck. Scott Wlaschin has two excellent posts to help you get started with FsCheck, an F# port of QuickCheck.


“MongoDB is not a bug, it’s a database”

- Kyle Kingsbury

And thus began a very entertaining second half of the talk, as Kyle shared results from his tests against MongoDB, Elasticsearch and Aerospike.



None of the databases were able to meet the consistency level they claim to offer, but at least Elasticsearch is honest about it and doesn’t promise you the moon.


Again, seeing as Kyle has recently written about these results in detail, I won’t repeat them here. The talk doesn’t go into quite as much depth as the posts, so if you have time I recommend reading them:


Whilst it was fun watching Kyle shoot holes through these database vendors’ consistency claims – and some of the fun-poking was really quite funny (and well deserved on the vendors’ part) – if there’s one thing you should take away from Kyle’s talk, and his work on Jepsen in general, it’s this: don’t drink the kool-aid.

Database vendors have a history of over-selling and, at times, outright false marketing. As developers, we have the means to verify their claims, so the next time you hear a claim that sounds too good to be true, verify it – don’t drink the kool-aid.