Introduction to AWS SimpleWorkflow Extensions Part 2 – Beyond Hello World

The series so far:

1.   Hello World example

2.   Beyond Hello World

3.   Parallelizing activities

 

In this post we’re going to go beyond the previous Hello World example and show you how to use the SWF extensions library to model workflows with multiple steps and allow data to flow naturally from one step to the next.

When using the extension library, the input to a workflow execution is passed on to the first activity of the workflow as its input by default (as in the previous Hello World example), the result of that activity is passed on to the next activity as its input, and so on. The result of the last activity is then used as the result of the whole workflow:

image
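To make the data flow concrete, here is a minimal sketch that models it with plain functions. It does not use the extensions library at all; `Step`, `runWorkflow` and the three sample steps are hypothetical names introduced purely for illustration.

```fsharp
// Hypothetical stand-in for an activity handler: string in, string out.
type Step = string -> string

// Feed the workflow input to the first step, pass each result on to the
// next step, and return the last result as the result of the workflow.
let runWorkflow (steps : Step list) (input : string) : string =
    steps |> List.fold (fun acc step -> step acc) input

let trim  : Step = fun (s : string) -> s.Trim()
let upper : Step = fun (s : string) -> s.ToUpperInvariant()
let greet : Step = fun (s : string) -> sprintf "Hello, %s!" s

let result = runWorkflow [ trim; upper; greet ] "  world  "
printfn "%s" result   // Hello, WORLD!
```

In the real library the steps run as SWF activity tasks rather than in-process function calls, but the shape of the data flow is the same.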

When you model an activity using the Activity type you need to pass in a function with the signature string -> string. The generated activity worker calls this function with the input when it receives a task, and uses the function’s return value as the result of the activity.

What you might not realize is that the Activity type is actually a specialized form of the generic Activity<TInput, TOutput> type, which allows you to supply functions with arbitrary input and output types and simply uses the ServiceStack.Text JSON serializer to marshal data to and from strings. I chose the ServiceStack.Text serializer because it was the fastest JSON serializer in my benchmarks.
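The idea of lifting a typed function to string -> string can be sketched as follows. This is not the library’s implementation: `toStringHandler` is a hypothetical helper, and System.Text.Json is swapped in for ServiceStack.Text only to keep the sketch free of external dependencies.

```fsharp
open System.Text.Json

// Hypothetical helper (not part of the library): wrap a typed handler so
// that it accepts a JSON string and returns a JSON string.
let toStringHandler (handler : 'TInput -> 'TOutput) : string -> string =
    fun (json : string) ->
        let input  = JsonSerializer.Deserialize<'TInput>(json)   // string -> 'TInput
        let output = handler input
        JsonSerializer.Serialize<'TOutput>(output)               // 'TOutput -> string

// A typed int[] -> int handler exposed as string -> string.
let sum : string -> string = toStringHandler (fun (xs : int[]) -> Array.sum xs)

printfn "%s" (sum "[1, 2, 3]")   // 6
```

Because every step ends up as string -> string, activities with different input and output types can still be chained as long as each step’s output type matches the next step’s input type.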

Example : Sum Web Page Lengths

Suppose you are given a number of URLs and want to sum the lengths of their HTML pages, returning the total as the result:

image

To implement this workflow you need to attach two activities to the workflow: the first requires a function that turns a string array into an int array, and the second requires a function that aggregates the int array into a single integer value.
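As a rough sketch, the two handler functions might look like the following. The names `client`, `getLengths` and `sumLengths` are assumptions, as is the use of HttpClient to download the pages; wiring the handlers into a workflow with Activity<TInput, TOutput> is left out because the exact constructor arguments are not shown in this post.

```fsharp
open System.Net.Http

// Assumption: one shared HttpClient is used to download every page.
let client = new HttpClient()

// Handler for the first activity: string[] -> int[].
// Downloads each URL and returns the length of its HTML.
let getLengths (urls : string[]) : int[] =
    urls |> Array.map (fun url -> client.GetStringAsync(url).Result.Length)

// Handler for the second activity: int[] -> int.
// Aggregates the individual lengths into a single total.
let sumLengths (lengths : int[]) : int =
    Array.sum lengths
```

Blocking on `.Result` keeps the sketch short; it is a simplification rather than a recommendation.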

The main things to take away from this example are:

  1. you can attach multiple activities to a workflow by chaining them up with the ++> operator
  2. handler functions for activities do not have to have the string -> string signature

 

Let’s take a closer look at the two activities:

image

image

image

As you can see, given the input JSON string to the workflow:

[ "http://www.google.com", "http://www.yahoo.com", "http://www.bing.com" ]

the activities did what they were supposed to: the first translated the input into an int array, and the second summed that array to give the total length of the landing pages for Google, Yahoo and Bing. It is little surprise that Yahoo’s landing page is nearly an order of magnitude bigger than the rest!

 

Parting Thoughts

In this post I demonstrated how to model a workflow with multiple steps that accept and return arbitrary types as input and output. In the next post I’ll demonstrate how to schedule multiple activities to be performed in parallel as a single step of a workflow.