Introduction to AWS SimpleWorkflow Extensions Part 2 – Beyond Hello World

The series so far:

1.   Hello World example

3.   Parallelizing activities

 

In this post we’re going to go beyond the previous Hello World example and show you how to use the SWF extensions library to model workflows with multiple steps and allow data to flow naturally from one step to the next.

When using the extension library, the input to a workflow execution is passed on to the first activity of the workflow as its input by default (as per the previous Hello World example), the result of that activity is passed on to the next activity as input, and so on. The result of the last activity is then used as the result of the whole workflow:

image

When you model an activity using the Activity type you need to pass in a function with the signature string –> string. This function is called with the input when the generated activity worker receives a task, and its return value is used as the result of the activity.

What you might not realize is that the Activity type is actually a specialized form of the generic Activity<TInput, TOutput> type, which allows you to supply functions with arbitrary input and output types and uses the ServiceStack.Text JSON serializer to marshal data to and from strings. I decided to use the ServiceStack.Text serializer because it was the fastest JSON serializer in my own benchmarks.

Example : Sum Web Page Lengths

Suppose you want to count and sum the size of the HTML pages given a number of URLs and return the sum as the result:

image

To implement this workflow you need to attach two activities to the workflow, the first requiring a function that turns a string array into an int array and the second aggregates the int array into a single integer value, something along the lines of:

The main things to take away from this example are that:

  1. you can attach multiple activities to a workflow by chaining them up with the ++> operator
  2. handler functions for activities do not have to have string –> string signature
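To make the second point concrete, here is a minimal sketch of the data flow, written in Python purely for illustration (it is not the library's F# API). It assumes each typed handler is wrapped so that the workflow only ever sees strings, with JSON used for marshalling in between. The names `typed_activity` and `run_workflow` are hypothetical, and `page_lengths` uses the length of the URL string as a stand-in for downloading the page.

```python
import json
from typing import Any, Callable, List

def typed_activity(handler: Callable[[Any], Any]) -> Callable[[str], str]:
    """Wrap a typed handler so that it looks like string -> string."""
    def run(raw_input: str) -> str:
        # Deserialize the incoming JSON, call the handler, serialize the result.
        return json.dumps(handler(json.loads(raw_input)))
    return run

def run_workflow(activities: List[Callable[[str], str]], workflow_input: str) -> str:
    """Feed the workflow input to the first activity, then chain the results."""
    data = workflow_input
    for activity in activities:
        data = activity(data)
    return data

def page_lengths(urls: List[str]) -> List[int]:
    # Stand-in for fetching each page: uses the URL length, not real HTML.
    return [len(url) for url in urls]

def total(lengths: List[int]) -> int:
    return sum(lengths)

workflow = [typed_activity(page_lengths), typed_activity(total)]
result = run_workflow(workflow, '["http://a.example", "http://bb.example"]')
print(result)
```

The first handler turns a string array into an int array and the second reduces it to a single integer, yet every value crossing an activity boundary is still a plain string.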

 

Let’s take a closer look at the two activities:

image

image

image

As you can see, given the input JSON string to the workflow:

[ "http://www.google.com", "http://www.yahoo.com", "http://www.bing.com" ]

the activities did what they were supposed to: the first translated the input into an int array of page lengths, and the second summed them to give a total for the length of the landing pages for Google, Yahoo and Bing. It’s little surprise that Yahoo’s landing page is nearly an order of magnitude bigger than the rest!

 

Parting Thoughts

In this post I demonstrated how to model a workflow with multiple steps which accept and return arbitrary types as input and output. In the next post I’ll demonstrate how to schedule multiple activities to be performed in parallel as a single step of a workflow.

Introduction to AWS SimpleWorkflow Extensions Part 1 – Hello World example

Series so far:

2. Beyond Hello World

3. Parallelizing activities

 

In my previous post I mentioned some of the shortcomings of Amazon SimpleWorkflow (SWF) which drove me to create an extension library on top of the standard .Net SDK to make it easier to model workflows and business processes using SWF.

In this series of blog posts I’ll give you more examples of how to use the library to model workflows to be executed against the SWF service, so that you can take advantage of the reliable state management and task dispatch it offers without the plumbing and boilerplate code you would have to deal with when using the SDK directly.

Before we start looking at examples, let’s have a quick recap of the SWF terminology:

  • A workflow is a sequence of steps that are loosely strung together by the decisions the decider makes each time the state of the workflow changes. E.g. step 1 complete then schedule step 2 to commence.
  • A workflow execution is an instance of a particular workflow currently being executed. Many executions of the same workflow (identified by name and version) can be in flight at the same time. A workflow execution can be started with a string as input and it can return a string as output.
  • A decision task is a task that is scheduled each time a workflow’s state changes.
  • A decider is a component in your application which is responsible for polling SWF for decision tasks and responding with decisions. The sequence of steps that need to be performed by the workflow is ultimately determined by the decider.
  • An activity task is a task that is scheduled by a decider. It takes a string as input (along with several other pieces of data which it can be scheduled with) and returns a string as its result.
  • An activity worker is a component in your application which is responsible for polling SWF for activity tasks and responding with completion or failure signals, as well as providing regular heartbeat signals. If the decider is responsible for scheduling work to be done, then the activity worker is responsible for doing the actual work.
  • A child workflow is a workflow that is scheduled by the decider as a step in a workflow, similar to an activity.
  • The decider is able to schedule both child workflows and activities for a single step in a workflow. Whilst child workflows can be rerun as an independent unit of work, activities cannot be rerun independently outside of the context of a workflow.
  • Both workflows and activities need to be registered with the SWF service before they can be used.

These are the most common concepts/components you’ll see in SWF, but there are also less commonly used (in my opinion at least) features such as:

  • Starting a timer to cause a timer event to be fired after some time.
  • Signalling an external workflow execution to cause an event to be recorded in its execution history and a decision task to be scheduled. This is a useful way to allow inter-workflow communication, e.g. one workflow suspends itself until another workflow sends it a signal, at which point it resumes its execution.
  • Recording a marker as a means to provide additional information in the execution history of a workflow.

 

Example : Hello World

Consider a workflow where there is only one activity, which simply prints the input to the screen and echoes it back out.

If we start a workflow execution with the input “Hello World!” then we expect to see the input being printed to the console and the workflow execution then complete with the result “Hello World!”.

image

In essence, you can think of an activity as nothing more than a function which accepts a string as its argument and returns a string, i.e. a function with the signature string –> string in F#.

With the standard .Net SDK you will need to write a decider for each workflow in order to provide the orchestration you need for that workflow. The decider tends to quickly become difficult to understand and maintain when the decision logic becomes more complicated, e.g. when multiple activities and child workflows are scheduled in parallel, and you need to retry/fail activities/workflows, etc.

In my view, the decider is largely plumbing that developers should do without, so with the extensions library you should not need to write any custom decider code but instead, simply declare what activities and/or child workflows should be scheduled at each stage of a workflow and let the library do all the heavy lifting for you!

As far as workflow modelling is concerned, the only thing you need to do is use the custom ++> operator (inspired by Dave Thomas’s pipelets project) to attach additional steps to your workflow. So the above workflow can be modelled as:

and that’s it! There is no need to register the workflow and activity or to write a bespoke decider and activity worker yourself. The library does all of that for you; all you need to do is model the workflow you want.

Notice that you haven’t had to provide any reference to SWF at all thus far. In fact, you only need to provide an instance of AmazonSimpleWorkflowClient (from the AWS SDK) when you start the workflow:

image

This way, it’s possible to run the workflow across multiple accounts simultaneously (dev, staging, prod, etc.) by calling the Start method with each of the client instances (one for each account), which fits well with the mobile worker model SWF is designed with – SWF holds the state but you can run your workers from anywhere in and out of the AWS ecosystem.

Once you’ve started the workflow, the library will automatically register the domain, workflow and activity for you if they are not present already. You can verify this by looking in the SWF Management Console:

image

Notice that whilst we didn’t specify the “echo” activity with a version number, it’s registered with “echo.0”? I’ll go into more details on the versioning scheme in a later post, but for now let’s just be glad that we didn’t have to register these by hand!

Next, you can start a workflow execution directly from the management console by ticking the checkbox against the workflow you want to start and clicking the “Start New Execution” button:

image

Let’s follow through with the dialogue box and set the input as Hello World! as below:

image image

Once you start the workflow execution you will see Hello World! being printed in the console:

image

This is a sign that our echo function (which is invoked by the generated activity worker) has been called.

Back in the SWF Management Console, if you look under Workflow Executions, you should see the execution is closed after having completed successfully:

image

Clicking on the workflow execution ID allows you to see the sequence of events which were recorded for this execution:

image

This is a very granular view of what happened during the workflow execution, giving you plenty of useful information if you ever need to investigate why a workflow execution failed, for instance.

If you switch to the Activities tab, you’ll get a more condensed view with just the activities that were scheduled, along with their inputs, results, etc.

image

For now, ignore the JSON string in the Control field and the format of the Activity ID. Both are automatically generated by the library based on a set of conventions and will be covered in a later post.

So that’s it! I hope you can see that this extension library gives you a powerful way to express and model a workflow and focus your development efforts on the things that count (designing the process and writing the code that does the actual work) rather than wasting precious developer time on getting your code to work with SWF!

 

Parting Thoughts

For Java developers, there is an existing high-level framework (provided by Amazon itself) for working with SWF called the Flow Framework, which adopts a more object-oriented approach and in my opinion requires far more plumbing and most importantly does not

In case you’re wondering, this is how a solution to a similar Hello World example looks using the flow framework (taken straight from the flow framework developer guide) for your comparison:

Making Amazon SimpleWorkflow simpler to work with

Amazon SimpleWorkflow (abbreviated to SWF from here on) is a workflow service provided by Amazon which allows you to model business processes as workflows using a task based programming model. The service provides reliable task dispatch and state management so that you can focus on developing ‘workers’ to perform the tasks that are required to move the execution of a workflow along.

Introduction to SWF

For more information about SWF, have a look at the following introductory webinar.

There are two types of tasks:

  • Activity task – tells an ‘activity worker’ to perform a specific function, e.g. check inventory or charge a credit card.
  • Decision task – tells a ‘decider’ that the state of a workflow execution has changed so that it can determine what the next course of action should be, e.g. continue to the next activity in the workflow, or complete the workflow if all activities have been successfully completed.

Both the activity worker and the decider need to poll the SWF service for tasks and, after receiving a task, respond with a result or a set of decisions respectively. Each task is associated with one or more timeout values, and if no response is received before the timeout expires then the task will time out and can be rescheduled (in the case of a decision task, it is rescheduled automatically by the system).

Since tasks can be polled from just about anywhere (from an EC2 instance, or your home computer/laptop) and the tasks received can be part of any number of currently executing workflows, both the activity worker and decider should be completely stateless and can be distributed across any number of locations both inside and outside of the AWS ecosystem.

The history (as a sequence of events each keyed to a unique ID) of each workflow execution is available to view in the AWS Management Console so that you have plenty of information to aid you when investigating why workflows failed, for instance.

imageimage

 

Consider the following example given by the SWF developer guide:

Sample Workflow Overview

Each of the steps can be represented as an activity task. Along the way the decider receives decision tasks and, by inspecting the history of events thus far, schedules the next activity in the workflow, e.g.

image
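As a minimal sketch of that idea, written in Python against an in-memory history rather than the real SWF API, a decider can be reduced to a function from the event history to the next decision. The step names, event tuples and decision names below are all hypothetical and only illustrate the shape of the logic.

```python
from typing import List, Tuple

# Hypothetical activity names, in the order the workflow should run them.
STEPS = ["verify_order", "charge_credit_card", "ship_order", "record_completion"]

Event = Tuple[str, str]      # (event type, activity name)
Decision = Tuple[str, str]   # (decision type, activity name or "")

def decide(history: List[Event]) -> Decision:
    """Return the next decision given the events recorded so far."""
    # Any failed activity fails the whole workflow in this sketch.
    if any(kind == "ActivityFailed" for kind, _ in history):
        return ("FailWorkflow", "")
    completed = [name for kind, name in history if kind == "ActivityCompleted"]
    # All steps done: close the workflow execution.
    if len(completed) == len(STEPS):
        return ("CompleteWorkflow", "")
    # Otherwise schedule the first step that has not completed yet.
    return ("ScheduleActivity", STEPS[len(completed)])

first = decide([])
third = decide([("ActivityCompleted", "verify_order"),
                ("ActivityCompleted", "charge_credit_card")])
done = decide([("ActivityCompleted", step) for step in STEPS])
failed = decide([("ActivityFailed", "verify_order")])
print(first, third, done, failed)
```

Note that the order of the steps only exists inside `decide`; nothing outside the decider describes the workflow as a whole.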

The actual SWF API is rather different, so the pseudo code above tends to translate to something slightly more involved, which brings us to the topic of…

Shortcomings of SWF

Workflows are modelled implicitly

In my opinion the biggest shortcoming of SWF is that the workflow itself (an ordered sequence of activities) is implied by the decider logic, and at no point as you work with the service does it feel like you’re actually modelling a workflow. This might not be an issue in simple cases, but as you string together more and more activities (and potentially child workflows), pass data along from one activity to the next and deal with failure cases, the decider logic is going to become much more complex and difficult to maintain.

Need for boilerplate

The .Net AWSSDK provides a straight mapping to the set of actions available on the SWF service and provides very little added value to developers because as it stands every workflow requires boilerplate code to:

  • poll for decision task (multiple times if you need to go back further than the max 100 events per request)
  • inspect history of events after receiving a decision task
  • schedule next activity or complete workflow based on last events
  • poll for activity task
  • record heartbeats periodically when processing an activity task
  • respond completed message on successful completion of the activity task
  • capture exceptions during the processing of a task and respond failed message
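The activity worker half of that list can be sketched as follows. This is an illustrative Python loop against a hypothetical in-memory stand-in (`FakeSwf`), not the AWS SDK; the real service exposes the corresponding actions as PollForActivityTask, RespondActivityTaskCompleted and RespondActivityTaskFailed. Heartbeats are left out to keep the sketch short.

```python
from typing import Callable, Dict, List, Optional, Tuple

class FakeSwf:
    """In-memory stand-in for the service (hypothetical, not the AWS SDK)."""

    def __init__(self, tasks: List[Dict[str, str]]) -> None:
        self.tasks = list(tasks)
        self.responses: List[Tuple[str, str, str]] = []

    def poll_for_activity_task(self) -> Optional[Dict[str, str]]:
        # Hand out the next queued task, or None when the queue is empty.
        return self.tasks.pop(0) if self.tasks else None

    def respond_completed(self, token: str, result: str) -> None:
        self.responses.append(("completed", token, result))

    def respond_failed(self, token: str, reason: str) -> None:
        self.responses.append(("failed", token, reason))

def run_worker(swf: FakeSwf, handler: Callable[[str], str]) -> None:
    """Drain the queue: poll, run the handler, report success or failure."""
    while True:
        task = swf.poll_for_activity_task()
        if task is None:
            break
        try:
            swf.respond_completed(task["token"], handler(task["input"]))
        except Exception as error:
            # Any exception in the handler is reported as a failed task.
            swf.respond_failed(task["token"], str(error))

def shout(text: str) -> str:
    if not text:
        raise ValueError("empty input")
    return text.upper()

swf = FakeSwf([{"token": "t1", "input": "hello"}, {"token": "t2", "input": ""}])
run_worker(swf, shout)
print(swf.responses)
```

Only `shout` is specific to the activity; everything in `run_worker` would be repeated for every activity worker you write.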

Many of these steps are common across all deciders and activity workers and it’s left to you to implement this missing layer of abstraction. Java developers have access to the heavy-weight Flow Framework, which allows you to declaratively (using annotations) specify activities and workflows, giving you more of a sense of modelling the workflow and its constituent activities. However, as you can see from the canonical ‘hello world’ example, a lot of code is required to carry out even a simple workflow, not to mention the various framework concepts one would have to learn.

A light-weight, intuitive abstraction layer is badly needed.

All activity and workflow types must be registered

Every workflow and every activity needs to be explicitly registered with SWF before they can be executed, and like workflow executions, registered workflow and activity types can be viewed directly in the AWS Management Console:

image

This registration can be done programmatically (as is the case with the Flow Framework) or via the AWS Management Console. The programmatic approach is clearly preferred but again, as far as .Net developers are concerned, it’s an automation step which you’d have to implement yourself, along with devising a versioning scheme for both workflows and activities. As a developer who just wants to model and implement a workflow with SWF, the registration represents another step in the development process which you would rather do without.

Another thing to keep in mind is that, when you have more than one activity with the same name that belong to different workflows and require different tasks to be performed, you need a way to distinguish between them so that the corresponding activity workers do not pick up the wrong task.

SimpleWorkflow.Extensions

Driven by the pain of developing against SWF because of its numerous shortcomings (pain-driven development…) I started working on an extension library to the .Net AWSSDK to give .Net developers an intuitive API to model workflows and handle all the necessary boilerplate tasks (such as exception handling, etc.) so that you can truly focus on modelling workflows and not worry about all the other plumbing required for working with SWF.

Intuitive modelling API

The simple ‘hello world’ example given by the Flow Framework can be modelled with fewer than 10 lines of code that are far easier to understand:

Here the ++> operator attaches an activity or child workflow to an existing empty workflow and returns a new instance of Workflow rather than modifying the existing workflow (in the spirit of functional programming and immutability).

An activity in SWF terms can, in essence, be thought of as a function which takes an input (string), performs some task and returns a result (string). Hence the Activity class you see above accepts a function with the signature string –> string, though there is a generic variant Activity<TInput, TOutput> which takes a function with the signature TInput –> TOutput and uses the ServiceStack.Text JSON serializer (the fastest JSON serializer for .Net) to marshal data to and from strings.

Exchanging data between activities

The input to the workflow execution is passed to the first activity as input, and the result provided by the first activity is then passed to the second activity as input and so on. This exchange of data also extends to child workflows, for example:

Starting a workflow execution with the input ‘theburningmonk’ prints the following outputs to the console:

MacDonald: hello theburningmonk!

MacDonald: good bye, theburningmonk!

Old MacDonald had a farm

EE-I-EE-I-O

To visualize the sequence of events and how data is exchanged from one activity to the next:

starts main workflow “with_child_workflow” with input “theburningmonk”

-> “theburningmonk” is passed as input to the activity “greet”

-> calls curried function greet “MacDonald” with “theburningmonk”

-> greet function prints “MacDonald: hello theburningmonk!” to console

-> greet function returns “theburningmonk”

-> “theburningmonk” is passed as input to activity “bye”

-> calls curried function bye “MacDonald” with “theburningmonk”

-> bye function prints “MacDonald: good bye, theburningmonk!” to console

-> bye function returns “MacDonald”

-> “MacDonald” is used as input to start the child workflow “sing_along”

-> “MacDonald” is passed as input to the activity “sing”

-> calls function sing with “MacDonald”

-> sing function prints “Old MacDonald had a farm” to console

-> sing function returns “EE-I-EE-I-O”

-> the child workflow “sing_along” completes with result “EE-I-EE-I-O”

-> “EE-I-EE-I-O” is passed as input to the activity “echo”

-> calls function echo with “EE-I-EE-I-O”

-> echo function prints “EE-I-EE-I-O” to console

-> echo function returns “EE-I-EE-I-O”

-> main workflow “with_child_workflow” completes with result “EE-I-EE-I-O”
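The same trace can be reproduced with a small Python sketch, shown here only to illustrate the data flow and not the library's F# API. It assumes a workflow is an ordered list of string-to-string steps and that a child workflow is just another step that runs its own list. `run_steps` is a hypothetical helper, and `functools.partial` plays the role of the curried functions.

```python
from functools import partial
from typing import Callable, List

Step = Callable[[str], str]

def run_steps(steps: List[Step], workflow_input: str) -> str:
    """Pass the input to the first step and each result on to the next."""
    data = workflow_input
    for step in steps:
        data = step(data)
    return data

def greet(me: str, you: str) -> str:
    print(f"{me}: hello {you}!")
    return you          # passes the visitor's name along

def bye(me: str, you: str) -> str:
    print(f"{me}: good bye, {you}!")
    return me           # passes the greeter's name along

def sing(name: str) -> str:
    print(f"Old {name} had a farm")
    return "EE-I-EE-I-O"

def echo(text: str) -> str:
    print(text)
    return text

# The child workflow is itself a step: it takes a string and returns a string.
sing_along: Step = partial(run_steps, [sing])

with_child_workflow: List[Step] = [
    partial(greet, "MacDonald"),
    partial(bye, "MacDonald"),
    sing_along,
    echo,
]

result = run_steps(with_child_workflow, "theburningmonk")
```

Running it prints the four console lines listed earlier, and `result` holds the value returned by the final `echo` step.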

Error and Retry mechanism

You can optionally specify the max number of attempts (e.g. max 3 attempts = original attempt + 2 retries) that should be made for each activity or child workflow before letting it fail/timeout and fail the workflow.
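The counting convention (maximum attempts = original attempt + retries) can be sketched in Python as below. This is only an illustration of the arithmetic: `run_with_attempts` is a hypothetical helper, and the library's actual retry behaviour is driven by SWF events rather than a local loop.

```python
from typing import Callable, Optional

def run_with_attempts(handler: Callable[[str], str],
                      task_input: str,
                      max_attempts: int) -> str:
    """Call the handler up to max_attempts times (1 original + retries)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return handler(task_input)
        except Exception as error:
            last_error = error      # remember the failure and try again
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error

calls = {"count": 0}

def flaky(text: str) -> str:
    # Fails on the first two calls, succeeds on the third.
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("transient failure")
    return text.upper()

result = run_with_attempts(flaky, "hello", max_attempts=3)
print(result, calls["count"])
```

With `max_attempts=3` the handler gets its original attempt plus two retries, which is exactly enough for `flaky` to succeed; with `max_attempts=2` the same handler would exhaust its attempts and the error would propagate.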

Automatic workflow and activity registrations

The domain, workflow and activity types are all registered automatically (if they haven’t been registered already) when you start a workflow. You might notice that you don’t need to specify a version for each of the activities. This is because there is a convention-based versioning scheme in place (see below).

Versioning scheme

Devising a versioning scheme for your activities is at best an arbitrary decision, yet it is one that SWF requires, which adds friction to the development process without adding much value for developers.

The versioning scheme I’m using is such that if an activity ‘echo’ is part of a workflow ‘with_child_workflow’ and is the 4th activity in the workflow, then the version for this particular instance of the ‘echo’ activity is with_child_workflow.3 (the index is zero-based).

This scheme allows you to:

  • decouple the name of an activity from the delegate function
  • reuse the same activity name in different workflows, and allow them to perform different tasks if need be
  • reuse the same activity name for different activities in the same workflow, and allow them to perform different tasks if need be
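A short Python sketch of the convention, for illustration only: `activity_versions` is a hypothetical helper, and it assumes that every step in the workflow (including a child workflow) takes up one zero-based position.

```python
from typing import List, Tuple

def activity_versions(workflow_name: str,
                      step_names: List[str]) -> List[Tuple[str, str]]:
    """Pair each step name with '<workflow name>.<zero-based position>'."""
    return [(name, f"{workflow_name}.{index}")
            for index, name in enumerate(step_names)]

versions = activity_versions(
    "with_child_workflow", ["greet", "bye", "sing_along", "echo"])

# The same activity name used twice in one workflow still gets distinct versions.
repeated = activity_versions("two_echoes", ["echo", "echo"])
print(versions[3], repeated)
```

Because the version is derived from the workflow name and the position, two activities can share a name and still be registered, and polled for, as different activity types.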

Asynchronous execution

Nearly all of the communication with SWF (polling, responding with results, etc.) is done asynchronously using non-blocking IO (using F# async workflows).

 

Currently, the extension library can only be used from F#. I’m still undecided on the API for C# (because you won’t be able to use the ++> custom operator) and would welcome any suggestions you might have!

As you can see from the Issues list, there are still a couple of things I want to add support for, but you should be seeing a NuGet package being made available in the near future. If you want to try it out in the meantime, feel free to grab the source and run the various examples I have added in the ExampleFs project.

Enjoy!

Working with S3 folders using the .Net AWS SDK

If you’ve been using the S3 client in the AWS SDK for .Net you might have noticed that there are no methods that let you interact with the folders in a bucket. As it turns out, S3 does not support folders in the conventional sense. Everything is still a key-value pair, but tools such as CloudBerry or indeed the Amazon web console simply use ‘/’ characters in the key to indicate a folder structure.

This might seem odd at first, but when you think about it, there is no folder structure on your hard drive either. It’s a logical structure the OS provides to make it easier for us mere mortals to work with.

Back to the topic at hand, what this means is that:

  • if you add an object with key myfolder/ to S3, it’ll be seen as a folder
  • if you add an object with key myfolder/myfile.txt to S3, it’ll be seen as a file myfile.txt inside a myfolder folder; tools will show the folder based on the key prefix even if no separate myfolder/ object exists
  • when you make a ListObjects call, both myfolder/ (if that object exists) and myfolder/myfile.txt will be included in the result

Creating folders

To create a folder, you just need to add an object whose key ends with ‘/’, like this:

// requires: using System.IO; using Amazon.S3.Model;
public void CreateFolder(string bucket, string folder)
{
    // a "folder" is just an empty object whose key ends with '/'
    var key = folder.TrimEnd('/') + "/";
    var request = new PutObjectRequest().WithBucketName(bucket).WithKey(key);
    request.InputStream = new MemoryStream();   // zero-byte content
    _client.PutObject(request);
}

Here is a thread on the Amazon forum which covers this technique.

Listing contents of a folder

With the ListObjects method on the S3 client you can provide a prefix requirement, and to get the list of objects in a particular folder simply add the path of the folder (e.g. topfolder/middlefolder/) in the request:

var request = new ListObjectsRequest().WithBucketName(bucket).WithPrefix(folder);
var response = _client.ListObjects(request);

If you are only interested in the objects (including folders) that are in the top level of your folder/bucket then you’d need to do some filtering on the S3 objects returned in the response, something along the lines of:

// get the objects at the TOP LEVEL, i.e. not inside any folders
var objects = response.S3Objects.Where(o => !o.Key.Contains(@"/"));

// get the folders at the TOP LEVEL only
var folders = response.S3Objects.Except(objects)
                      .Where(o => o.Key.Last() == '/' &&
                                  o.Key.IndexOf(@"/") == o.Key.LastIndexOf(@"/"));
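The filtering rules are easier to check on plain strings. The sketch below restates them in Python on a hypothetical list of keys, assuming the listing was made at the bucket level (no prefix); `top_level` is an illustrative helper, not part of any SDK.

```python
from typing import List, Tuple

def top_level(keys: List[str]) -> Tuple[List[str], List[str]]:
    """Split keys into top-level files and top-level folders."""
    # A top-level file has no '/' anywhere in its key.
    files = [key for key in keys if "/" not in key]
    # A top-level folder ends with '/' and contains no other '/'.
    folders = [key for key in keys if key.endswith("/") and key.count("/") == 1]
    return files, folders

keys = ["readme.txt", "myfolder/", "myfolder/myfile.txt", "myfolder/sub/", "other/"]
files, folders = top_level(keys)
print(files, folders)
```

Keys such as myfolder/myfile.txt and myfolder/sub/ fall into neither list because they sit below the top level.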