Why we didn’t choose QLDB for a healthcare app

I have been working with a US client to build a first-of-its-kind application that lets users manage their consent to share their data among healthcare providers in their state. The project falls under HIPAA, so we need to meet all the HIPAA compliance requirements, including keeping a full history of changes to user data. To meet that audit requirement we used DynamoDB Streams and Lambda to get the change history out of the main transactional table.
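
To give a rough idea of what that looks like, here is a minimal sketch of a Lambda handler that turns DynamoDB Streams records into audit entries. It is not our production code: the function name to_audit_entries and the shape of the audit entry are made up for illustration, and it assumes the stream is configured with the NEW_AND_OLD_IMAGES view type.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def to_audit_entries(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a DynamoDB Streams event into one audit entry per change.

    The entry shape is hypothetical; adapt it to your own audit model.
    """
    entries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        entries.append({
            "eventId": record["eventID"],
            "action": record["eventName"],  # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],
            # Images are only present with the NEW_AND_OLD_IMAGES view type
            # (assumed here); OldImage is absent on INSERT, NewImage on REMOVE.
            "before": change.get("OldImage"),
            "after": change.get("NewImage"),
            # The stream reports the change time as epoch seconds.
            "changedAt": datetime.fromtimestamp(
                change["ApproximateCreationDateTime"], tz=timezone.utc
            ).isoformat(),
        })
    return entries


def handler(event, context):
    entries = to_audit_entries(event)
    # Persisting the entries (e.g. to a separate history table) is omitted.
    return {"processed": len(entries)}
```

Keeping the mapping in a pure function makes it easy to unit test without touching AWS.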

We also considered switching from DynamoDB to QLDB, but ultimately we decided that it’s not ready for our use case yet. Big thanks to Matt Lewis for sharing their experience of working with QLDB at DVLA.

For anyone who’s considering QLDB for their project, here are the factors that led to our decision.

Update 07/10/2020: The issue with not being able to create indices on non-empty tables has been addressed. See the announcement here.

The Good

  • You get audit history out of the box with the History function. e.g. Select * From History(tableName, startDate?, endDate?) WHERE metadata.id = ...
  • It supports more flexible queries than DynamoDB: you’re not limited by the hash + range key, and there is no need to manage lots of composite keys and secondary indices to support querying on different fields.
  • It has built-in support for transactions.
  • It supports indices for improving read performance for attribute lookups.
  • It’s completely schemaless (yup, don’t even need to specify a primary key).
  • It supports nested fields.
  • It’s pay-as-you-go, no need to pay for server uptime, yay!
  • It uses IAM for authentication and authorization, so you don’t need to deal with VPCs.
  • Its query performance (average ~30ms) is good, though not quite on par with DynamoDB.
  • It has built-in support for optimistic concurrency control. For example, if the data a transaction has read is changed by another write before the transaction commits, then the ledger rejects the transaction and the app has to retry.
  • There is no “hard” drop table. Dropped tables are simply marked as “inactive” and can be restored with UNDROP TABLE.
  • You can stream data changes out to Kinesis in order to populate other read models, e.g. Elasticsearch to support search features.
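
The optimistic concurrency point is easier to see in code. The following is a toy in-memory model, not the QLDB API: ToyLedger, ToyTxn, OccConflict and run_with_retry are all names invented for this sketch. It only illustrates the idea that a transaction is rejected at commit time if something it read has changed, and that the caller then retries the whole unit of work.

```python
from typing import Any, Callable, Dict, Optional


class OccConflict(Exception):
    """Raised when data read by a transaction changed before it committed."""


class ToyLedger:
    """In-memory stand-in for a ledger (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}
        self.versions: Dict[str, int] = {}


class ToyTxn:
    def __init__(self, ledger: ToyLedger) -> None:
        self.ledger = ledger
        self.seen: Dict[str, int] = {}     # key -> version observed at read time
        self.writes: Dict[str, dict] = {}  # buffered until commit

    def read(self, key: str) -> Optional[dict]:
        self.seen[key] = self.ledger.versions.get(key, 0)
        return self.ledger.docs.get(key)

    def write(self, key: str, doc: dict) -> None:
        self.writes[key] = doc

    def commit(self) -> None:
        # Reject the commit if anything we read has since been changed.
        for key, version in self.seen.items():
            if self.ledger.versions.get(key, 0) != version:
                raise OccConflict(key)
        for key, doc in self.writes.items():
            self.ledger.docs[key] = doc
            self.ledger.versions[key] = self.ledger.versions.get(key, 0) + 1


def run_with_retry(ledger: ToyLedger, work: Callable[[ToyTxn], Any],
                   max_attempts: int = 3) -> Any:
    """Run `work` in a fresh transaction, retrying on OCC conflicts."""
    for attempt in range(1, max_attempts + 1):
        txn = ToyTxn(ledger)
        result = work(txn)
        try:
            txn.commit()
            return result
        except OccConflict:
            if attempt == max_attempts:
                raise
```

Because the unit of work may run more than once, it should not have side effects outside the transaction.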

The Bad

  • Lots of PartiQL operations are not supported yet, most importantly:
    • No support for ORDER BY, so we can’t implement some of the required sorting in the app.
    • No support for LIMIT, so we can’t implement pagination.
  • History queries require the document id, which is generated by QLDB and you only find out AFTER you run INSERT INTO. This makes querying histories a little awkward… A workaround for this (thanks to Matt Lewis for this tip!) is to capture it after the insert and then run UPDATE to add the id to the document. Fortunately, this can be done atomically in a transaction.
  • No support for conditional checks. Instead, you have to do a read to check if the document already exists, then decide whether to add the document. Fortunately, the read and then write can be done inside a transaction. The optimistic concurrency control ensures there are no race conditions here, and ensures that you don’t end up adding the same document twice.
  • You can only have 5 indices per table.
  • Indices can only be created when the table is empty.
  • Unknown performance at scale without indices.
  • Long cold start time (the client library needs to initialize its socket pool). During testing, this took around 300ms on average.
  • No CloudFormation support for tables and indices.
  • There’s no support for KMS CMKs for server-side encryption, only QLDB-managed KMS keys.
  • IAM permissions are not granular: there is just qldb:SendCommand.
  • While you can export ION documents to S3, there is no way to restore them right now.
  • There is no direct integration with AppSync. But this 2-part blog post shows you how to build a reusable Lambda resolver that can make integrating AppSync with QLDB relatively straightforward.
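
The two transaction-based workarounds above (read before write instead of a conditional check, and capturing the document id after the insert) can be combined. Here is a minimal sketch, assuming a driver whose transaction object exposes execute_statement(statement, *parameters) and whose INSERT result contains a documentId field, which is my understanding of how the Python driver behaves. The helper name insert_if_absent, the Executor protocol and the docId field are hypothetical, and the PartiQL statements should be checked against the QLDB docs before use.

```python
from typing import Any, Iterable, Optional, Protocol


class Executor(Protocol):
    """Minimal shape of a transaction object (an assumption, see above)."""

    def execute_statement(self, statement: str, *parameters: Any) -> Iterable[Any]:
        ...


def insert_if_absent(txn: Executor, table: str, key_field: str,
                     doc: dict) -> Optional[str]:
    """Insert `doc` unless a document with the same key already exists.

    Returns the new document id, or None if nothing was inserted. All three
    statements are meant to run inside one transaction, so optimistic
    concurrency control guards against a concurrent duplicate insert.
    """
    # Table and field names cannot be passed as parameters, so they are
    # interpolated; only ever pass trusted values for them.
    existing = list(txn.execute_statement(
        f"SELECT * FROM {table} WHERE {key_field} = ?", doc[key_field]))
    if existing:
        return None

    inserted = list(txn.execute_statement(f"INSERT INTO {table} ?", doc))
    doc_id = inserted[0]["documentId"]

    # Copy the generated id into the document so that later history queries
    # can find it without a separate lookup.
    txn.execute_statement(
        f"UPDATE {table} AS t BY id SET t.docId = ? WHERE id = ?",
        doc_id, doc_id)
    return doc_id
```

With a real driver you would pass this function to whatever runs a unit of work in a transaction, so that the whole read, insert and update sequence is retried together on a conflict.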

Although we decided not to go with QLDB for now, it is a very exciting piece of technology that we would love to use in the future. For now, some of these limitations are showstoppers for us. And while it makes some things easier (e.g. implementing audit history) it makes other things more difficult (e.g. implementing conditional checks, backup and restore, and integrating with our AppSync API). All in all, we don’t feel there would be any net gain in terms of development effort and velocity.