How to secure multi-tenant applications with AppSync and Cognito

Yan Cui

I help clients go faster for less using serverless technologies.

One of the most common questions I get is “How do I build a multi-tenant application with AppSync and Cognito?”

If you google this topic, you will no doubt come across many different opinions. It’s a topic that we’ll soon explore in the AppSync Masterclass, but I want to take this opportunity to explain my thoughts on it.

You see, a common requirement in these multi-tenant applications is to support roles within each tenant. These are usually well-defined roles in your application, and a user would fall into one of these roles within their tenant.

So you not only have to isolate data access by tenant but also restrict access to certain operations by role. Users at Tenant 1 can only access Tenant 1’s data. Furthermore, a ReadOnly user at Tenant 1 cannot modify any of its data, and only SuperUser and Admin users are allowed to manage Tenant 1’s users.

My preferred way of accomplishing all this is to:

  1. Model the roles as Cognito groups.
  2. Model the tenants as Cognito attributes.
  3. Never accept tenantId as an argument in the GraphQL schema.

Let me explain.

Model tenants as Cognito attributes

To scope every request to a particular tenant, you need to get the tenant ID from somewhere.

Assuming that you’re using AppSync with Cognito, a good place to capture the tenant ID is a Cognito custom attribute. This way, the tenant ID is available in the $context.identity.claims object, to both VTL templates and Lambda resolvers.
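Here is a rough sketch of what reading the tenant ID could look like in a direct Lambda resolver. It assumes the custom attribute is called tenantId (so the claim is named custom:tenantId) and that the client sends a token that carries it, typically the ID token. The type and function names are mine, purely for illustration.

```typescript
// Minimal shape of the identity AppSync passes to a direct Lambda resolver
// when the request is authorized with a Cognito user pool.
interface CognitoIdentity {
  sub: string;
  username: string;
  groups: string[] | null;
  claims: Record<string, unknown>;
}

interface ResolverEvent<TArgs> {
  arguments: TArgs;
  identity: CognitoIdentity;
}

// Assumption: the custom attribute is named "tenantId", so it appears in
// the token claims as "custom:tenantId".
const TENANT_CLAIM = "custom:tenantId";

// Returns the caller's tenant ID, or throws if the claim is missing.
function getTenantId(event: ResolverEvent<unknown>): string {
  const tenantId = event.identity.claims[TENANT_CLAIM];
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("Missing tenant ID claim");
  }
  return tenantId;
}
```

Failing loudly when the claim is missing is a deliberate choice here: a request without a tenant ID should never fall through to a data access call.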

Since the tenant ID comes from Cognito, you can trust that it hasn’t been tampered with. However, you still need to ensure the correct value is set in the first place. For instance, a malicious user from tenant A must not be able to register a new user with tenant B’s tenant ID.

To protect yourself against this attack vector, you can set AllowAdminCreateUserOnly to true.

UserPool:
  Type: AWS::Cognito::UserPool
  Properties:
    AdminCreateUserConfig:
      AllowAdminCreateUserOnly: true
    ...

This way, when a new tenant is created, your backend (maybe a Lambda function?) would also create an admin user for this tenant and set the user’s tenant ID accordingly. From then on, this admin user can register other users by talking to your AppSync API instead of calling Cognito directly.

For example, you might have an addUser mutation like this:

type Mutation {
  addUser(name: String!, email: AWSEmail!, role: Role!): User
  ...
}

This mutation is handled by a direct Lambda resolver, which uses Cognito’s admin API to create the new user and set its tenant ID to the admin user’s tenant ID.
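To make that concrete, here is a minimal sketch of how the resolver might build the request for Cognito’s AdminCreateUser API. It assumes the custom attribute is named tenantId and that the email doubles as the username. Both are my assumptions, as are the type and function names.

```typescript
// Arguments of the addUser mutation shown above.
interface AddUserArgs {
  name: string;
  email: string;
  role: string;
}

interface AddUserEvent {
  arguments: AddUserArgs;
  identity: { claims: Record<string, unknown> };
}

// Subset of the AdminCreateUser request that this sketch fills in.
interface AdminCreateUserInput {
  UserPoolId: string;
  Username: string;
  UserAttributes: { Name: string; Value: string }[];
}

function buildCreateUserInput(
  event: AddUserEvent,
  userPoolId: string
): AdminCreateUserInput {
  // The tenant ID comes from the caller's token, never from the arguments.
  const tenantId = event.identity.claims["custom:tenantId"];
  if (typeof tenantId !== "string" || tenantId === "") {
    throw new Error("Caller has no tenant ID");
  }

  const { name, email } = event.arguments;
  return {
    UserPoolId: userPoolId,
    Username: email, // assumption: users sign in with their email
    UserAttributes: [
      { Name: "name", Value: name },
      { Name: "email", Value: email },
      { Name: "custom:tenantId", Value: tenantId },
    ],
  };
}
```

In a real resolver, you would pass this input to AdminCreateUserCommand from @aws-sdk/client-cognito-identity-provider, then use AdminAddUserToGroupCommand to add the new user to the group that matches the requested role.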

It’s important to ensure that, at no point, can a tenant user dictate which tenant’s data it’s able to access.

Which is why you should never take tenant ID as a request argument. The tenant ID used in all your data access operations (e.g. DynamoDB reads and writes) needs to come from Cognito.
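As an illustration, here is a sketch of a tenant-scoped DynamoDB query in the DynamoDB document client’s parameter format. It assumes a table whose partition key is tenantId, a claim named custom:tenantId, and a hypothetical list operation that only accepts a limit argument. None of these come from a real schema.

```typescript
interface ListItemsEvent {
  // Note there is no tenantId among the arguments.
  arguments: { limit?: number };
  identity: { claims: Record<string, unknown> };
}

function buildListItemsQuery(event: ListItemsEvent, tableName: string) {
  const tenantId = event.identity.claims["custom:tenantId"];
  if (typeof tenantId !== "string" || tenantId === "") {
    throw new Error("Caller has no tenant ID");
  }

  return {
    TableName: tableName,
    // Assumption: "tenantId" is the table's partition key.
    KeyConditionExpression: "#tenantId = :tenantId",
    ExpressionAttributeNames: { "#tenantId": "tenantId" },
    ExpressionAttributeValues: { ":tenantId": tenantId },
    // Arguments may shape the request, but never pick the tenant.
    Limit: event.arguments.limit ?? 25,
  };
}
```

Because the key condition is built from the claim alone, there is nothing a caller can put in the request to reach another tenant’s partition.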

“Hey, Yan, how do I make sure that only Admin and SuperUser roles can create new users?”

That’s where roles and Cognito groups come in.

Model roles as Cognito groups

AppSync has an awesome integration with Cognito groups, which lets you specify which users are allowed to perform which GraphQL operations.

If only Admin and SuperUser users can manage a tenant’s users, then you can restrict access to the addUser mutation using the @aws_auth directive.

type Mutation {
  addUser(name: String!, email: AWSEmail!, role: Role!): User
  @aws_auth(cognito_groups: ["Admin", "SuperUser"])
  ...
}

This makes Cognito groups a natural way to model roles within your application and use them to restrict access to certain operations.

“But how do I prevent privilege escalation? Like, if a SuperUser decides to give himself admin permissions by creating a new admin user…”

Great question! Sometimes using the @aws_auth directive alone just isn’t enough. You sometimes need to do additional validation in the request VTL template or in the Lambda resolver’s body.

In this particular case, since the operation involves calling Cognito’s admin API, you’re probably using a direct Lambda resolver. In that case, you can see which groups the caller belongs to in event.identity.groups and return an error if a SuperUser attempts to create an Admin user.
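A check like this could be sketched as follows. The rule that a SuperUser cannot create an Admin comes from the example above. The rest of the table (for instance, whether a SuperUser may create other SuperUsers) is my assumption, so adjust it to your own rules.

```typescript
type Role = "Admin" | "SuperUser" | "ReadOnly";

// Assumption: which roles each Cognito group is allowed to hand out.
// Groups that are not listed (e.g. ReadOnly) cannot assign any role.
const ASSIGNABLE_ROLES = new Map<string, Role[]>([
  ["Admin", ["Admin", "SuperUser", "ReadOnly"]],
  ["SuperUser", ["SuperUser", "ReadOnly"]],
]);

// True if at least one of the caller's groups may assign the requested role.
function canAssignRole(callerGroups: string[] | null, requested: Role): boolean {
  return (callerGroups ?? []).some(
    (group) => ASSIGNABLE_ROLES.get(group)?.includes(requested) ?? false
  );
}

// Call this at the top of the addUser resolver, before touching Cognito.
function assertCanAssignRole(callerGroups: string[] | null, requested: Role): void {
  if (!canAssignRole(callerGroups, requested)) {
    throw new Error(`Not allowed to create a user with role ${requested}`);
  }
}
```

In the resolver, you would call assertCanAssignRole(event.identity.groups, event.arguments.role) first, so a rejected request never reaches Cognito’s admin API.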

Tools like Lumigo make it easy to see the invocation event for your Lambda functions without having to litter your code with trace logging. Your functions are auto-instrumented, so you can quickly debug any issues that come up and develop your application faster.

Wrap up

As I mentioned at the start, my preferred way of building multi-tenant applications with AppSync and Cognito is to:

  1. Model the roles as Cognito groups.
  2. Model the tenants as Cognito attributes.
  3. Never accept tenantId as an argument in the GraphQL schema.

This approach is simple and has worked for me time and time again.

But sometimes, you have more complex use cases that this approach cannot accommodate. For example, I recently implemented a custom IAM system to cater for an app where users can have different roles at multiple organizations within a hierarchy. The access a user has depends on both the roles it holds at those organizations and the roles it inherits from the hierarchy.

From the above example, this user will have the following permissions:

  • ReadOnly at Parent Org, Child Org 2 and Grandchild Org
  • Admin at Child Org 1

If you’re interested in reading about how we built this system (and why!) then come back next week for my next update.

And if you want to learn more about AppSync and GraphQL, then check out my video course – the AppSync Masterclass – and save 30% while we’re still in early access!


 
