Step Functions: apply try-catch to a block of states

In my last post we talked about how to implement semaphores with Step Functions. Another common requirement is to handle errors from a whole block of states, the way we’re used to doing with a try-catch block.

try {
  step1()
  step2()
  step3()
} catch (States.Timeout) {
  ...
} catch (States.ALL) {
  ...
}

With Step Functions, you can use Retry and Catch clauses to handle errors from Task states. There are a number of predefined system errors, and you can also handle custom errors that are thrown by your Lambda functions.
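To make the custom error case concrete, here is a minimal sketch, assuming a Python Lambda function, where an unhandled exception is surfaced to Step Functions under the exception's class name. The names `OrderNotFoundError`, `handler` and `NotifyOrderNotFound` are hypothetical and only there for illustration.

```python
class OrderNotFoundError(Exception):
    """Hypothetical custom error raised by the Lambda function."""


def handler(event, context):
    # Hypothetical Lambda handler behind a Task state.
    if "orderId" not in event:
        # An unhandled exception fails the Task state; the class name
        # is what a Catch or Retry clause can match on.
        raise OrderNotFoundError("no orderId in input")
    return {"orderId": event["orderId"], "status": "found"}


# Catch clause for the Task state, expressed as a Python structure.
# The specific error comes first; States.ALL must be alone and last.
CATCH = [
    {"ErrorEquals": ["OrderNotFoundError"], "Next": "NotifyOrderNotFound"},
    {"ErrorEquals": ["States.ALL"], "Next": "NotifyError"},
]
```

Catchers are evaluated in order, so the more specific error name goes before the States.ALL fallback.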

To get the same behaviour as the try-catch block above, you can add the same Catch clause to each of the Task states.

"Catch": [
  {
    "ErrorEquals": [ "States.ALL" ],
    "Next": "NotifyError"
  }
]

However, this approach requires you to add the same boilerplate to every Task state. As your error handling strategy, or the state machine itself, becomes more complex, this turns into a maintenance headache.

Fortunately, both Retry and Catch can be used on Parallel states too!

Even if you’re not looking to perform tasks in parallel, you can still use it to simplify your error handling.

In this case, if I wrap Step1, Step2 and Step3 into a single branch inside a Parallel state, then I can catch unhandled errors from any of the steps with one Catch clause.

{
  "StartAt": "Try",
  "States": {
    "Try": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Step1",
          "States": {
            "Step1": {
              "Type": "Task",
              "Resource": "...",
              "Next": "Step2"
            },
            "Step2": {
              "Type": "Task",
              "Resource": "...",
              "Next": "Step3"
            },
            "Step3": {
              "Type": "Task",
              "Resource": "...",
              "End": true
            }
          }
        }
      ],
      "Catch": [
        {
          "ErrorEquals": [ "States.ALL" ],
          "Next": "NotifyError"
        }
      ],
      "Next": "NotifySuccess"
    },
    ...
  }
}
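If you generate your state machine definitions from code, the wrapping can be factored into a small helper. The following is only a sketch of that idea in Python; `wrap_in_try` is a hypothetical helper, not part of any AWS SDK, and the state names are the ones from the example above.

```python
import json


def wrap_in_try(start_at, states, catch_next, success_next, name="Try"):
    """Wrap a chain of states in a single-branch Parallel state with one Catch."""
    return {
        name: {
            "Type": "Parallel",
            # One branch only: the steps still run sequentially inside it.
            "Branches": [{"StartAt": start_at, "States": states}],
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": catch_next}],
            "Next": success_next,
        }
    }


# The three steps from the example; "..." stands in for the real resource ARNs.
steps = {
    "Step1": {"Type": "Task", "Resource": "...", "Next": "Step2"},
    "Step2": {"Type": "Task", "Resource": "...", "Next": "Step3"},
    "Step3": {"Type": "Task", "Resource": "...", "End": True},
}

try_state = wrap_in_try("Step1", steps, "NotifyError", "NotifySuccess")
print(json.dumps(try_state, indent=2))
```

The printed JSON has the same shape as the Try state above, so the error handling lives in one place no matter how many steps go into the block.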

One caveat with this approach is that a Parallel state wraps the output from its branches into an array, with one element per branch. So if a subsequent state, such as the NotifySuccess state in the example above, wants to use the output from Step3, then it’ll have to take that into account.

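Alternatively, you can add a Pass state between the Parallel state and NotifySuccess to unwrap the array (the Parallel state’s Next then needs to point at UnwrapOutput instead of NotifySuccess), like this: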

"UnwrapOutput": {
  "Type": "Pass",
  "InputPath": "$[0]",
  "Next": "NotifySuccess"
}
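To see what the UnwrapOutput state does to the data, here is a small Python simulation of the two shapes involved. It only mimics the behaviour described above and is not how Step Functions is implemented; the Step3 output is a made-up example.

```python
def parallel_output(branch_outputs):
    # A Parallel state emits an array with one element per branch,
    # in the same order as the branches are declared.
    return list(branch_outputs)


def unwrap_first(parallel_result):
    # Mimics "InputPath": "$[0]": select the first (and only) element.
    return parallel_result[0]


# Hypothetical output of Step3, the last state in the single branch.
step3_output = {"orderId": "42", "status": "done"}

wrapped = parallel_output([step3_output])  # what the Try state hands on
unwrapped = unwrap_first(wrapped)          # what NotifySuccess receives
```

With a single branch the array always has exactly one element, which is why selecting index 0 is enough.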

This technique is useful when you want to apply the same error handling to a block of states without resorting to boilerplate.

You can add a Retry clause to the Parallel state to retry the entire block (i.e. starting again from Step1, even if it was Step3 that errored). You can also add Retry and Catch clauses to individual states inside the block to mix things up.
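As a rough sketch of what retrying the whole block looks like, the Python below adds a Retry clause to a Parallel state and works out the waits it implies. `with_retry` and `retry_delays` are hypothetical helpers, and the error name and numbers are illustrative values, not recommendations.

```python
def with_retry(parallel_state, error_names, interval_seconds, max_attempts, backoff_rate):
    """Return a copy of a Parallel state with a Retry clause added."""
    retrier = {
        "ErrorEquals": list(error_names),
        "IntervalSeconds": interval_seconds,
        "MaxAttempts": max_attempts,
        "BackoffRate": backoff_rate,
    }
    return {**parallel_state, "Retry": [retrier]}


def retry_delays(retrier):
    """Seconds waited before each retry: IntervalSeconds * BackoffRate ** n."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** n
        for n in range(retrier["MaxAttempts"])
    ]


try_state = {
    "Type": "Parallel",
    "Branches": [],  # the branch with Step1..Step3 is omitted for brevity
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyError"}],
    "Next": "NotifySuccess",
}

# Retry the whole block on timeouts; the Catch only applies once retries run out.
retried = with_retry(try_state, ["States.Timeout"], 2, 3, 2.0)
delays = retry_delays(retried["Retry"][0])
```

Because the Retry sits on the Parallel state, each attempt re-runs the branch from its StartAt state, so the steps inside should be safe to run more than once.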

So that’s it, a nice and short post to share with you a simple technique that I have found useful with Step Functions.

I have been spending a fair bit of time with Step Functions and enjoying the service. Let me know in the comments if you have use cases that you find difficult to implement with Step Functions. I would love to hear what others are doing with it.

Hi, I’m Yan. I’m an AWS Serverless Hero and the author of Production-Ready Serverless.
