Introduction to AWS SimpleWorkflow Extensions Part 2 – Beyond Hello World

The series so far:

1.   Hello World example

2.   Beyond Hello World

3.   Parallelizing activities

 

In this post we’ll go beyond the previous Hello World example and show how to use the SWF extensions library to model workflows with multiple steps, with data flowing naturally from one step to the next.

When using the extension library, the input to a workflow execution is passed on to the first activity of the workflow as its input by default (as in the previous Hello World example). The result of that activity is passed on to the next activity as its input, and so on. The result of the last activity is then used as the result of the whole workflow:

image
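
To make this data flow concrete, here is a minimal sketch in Python rather than the library's own code. The names `run_workflow`, `shout` and `exclaim` are hypothetical and exist only for this illustration; the sketch simply threads a string through a list of string-to-string handlers in order.

```python
from functools import reduce
from typing import Callable, Sequence

# Hypothetical stand-in for an activity handler: string in, string out.
Handler = Callable[[str], str]


def run_workflow(handlers: Sequence[Handler], workflow_input: str) -> str:
    """Feed the workflow input to the first handler, then pass each result on."""
    return reduce(lambda data, handler: handler(data), handlers, workflow_input)


# Two toy handlers, purely for illustration.
def shout(text: str) -> str:
    return text.upper()


def exclaim(text: str) -> str:
    return text + "!"


result = run_workflow([shout, exclaim], "hello world")
print(result)  # HELLO WORLD!
```

The result of the last handler is what the caller gets back, mirroring how the result of the last activity becomes the result of the workflow.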

When you model an activity using the Activity type, you need to pass in a function with the signature string -> string. When the generated activity worker receives a task, it calls this function with the task’s input and uses the return value as the result of the activity.

What you might not realize is that the Activity type is actually a specialized form of the generic Activity<TInput, TOutput> type. The generic type lets you supply functions with arbitrary input and output types, and simply uses the ServiceStack.Text JSON serializer to marshal data to and from string. I chose the ServiceStack.Text serializer because it was the fastest JSON serializer in my benchmarks.
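
The idea of wrapping a typed handler behind a string interface can be sketched as follows. This is an assumption-laden illustration in Python, not the library's implementation: the standard `json` module stands in for ServiceStack.Text, and `as_string_handler` and `double_all` are hypothetical names.

```python
import json
from typing import Any, Callable


def as_string_handler(handler: Callable[[Any], Any]) -> Callable[[str], str]:
    """Wrap a typed handler so it accepts and returns JSON strings."""

    def wrapped(raw_input: str) -> str:
        value = json.loads(raw_input)  # string -> typed input
        result = handler(value)        # run the typed handler
        return json.dumps(result)      # typed output -> string

    return wrapped


# A handler over a list of ints rather than a string (illustrative only).
double_all = as_string_handler(lambda numbers: [n * 2 for n in numbers])

print(double_all("[1, 2, 3]"))  # [2, 4, 6]
```

Seen this way, a plain string-to-string activity is just the special case where no conversion is needed on either side.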

Example: Sum Web Page Lengths

Suppose you want to measure the size of the HTML page at each of a number of URLs and return the sum of those sizes as the result:

image

To implement this workflow you need to attach two activities to it: the first takes a function that turns a string array into an int array, and the second takes a function that aggregates the int array into a single integer value, something along the lines of:
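
The two handler functions might look roughly like the following Python sketch. It is a stand-in written for illustration, not the listing from the library: `download`, `get_lengths` and `sum_lengths` are hypothetical names, the page is assumed to decode as UTF-8, and the fetcher is injectable so the example can run without network access.

```python
from typing import Callable, List
from urllib.request import urlopen


def download(url: str) -> str:
    """Fetch a page and return its body as text (UTF-8 is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def get_lengths(urls: List[str], fetch: Callable[[str], str] = download) -> List[int]:
    """First activity: turn a list of URLs into a list of page lengths."""
    return [len(fetch(url)) for url in urls]


def sum_lengths(lengths: List[int]) -> int:
    """Second activity: aggregate the lengths into a single total."""
    return sum(lengths)


# Offline demonstration with a fake fetcher, so no network access is needed.
fake_pages = {"http://a.example": "<html></html>", "http://b.example": "<p>hi</p>"}
lengths = get_lengths(list(fake_pages), fetch=fake_pages.__getitem__)
print(lengths, sum_lengths(lengths))  # [13, 9] 22
```

Passing `fetch` explicitly keeps the first handler easy to test; calling `get_lengths(urls)` with no second argument would use the real `download` function.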

The main things to take away from this example are that:

  1. you can attach multiple activities to a workflow by chaining them with the ++> operator
  2. handler functions for activities do not need to have the string -> string signature
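
Both points can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: Python cannot define a `++>` operator, so `>>` stands in for it, and the `Workflow` class below is a hypothetical toy rather than the library's type. For brevity it converts to and from JSON only at the edges of the workflow and passes typed values between steps.

```python
import json
from typing import Any, Callable, Tuple


class Workflow:
    """Toy workflow: an immutable, ordered collection of typed handlers."""

    def __init__(self, name: str, handlers: Tuple[Callable[[Any], Any], ...] = ()):
        self.name = name
        self.handlers = handlers

    def __rshift__(self, handler: Callable[[Any], Any]) -> "Workflow":
        # `>>` stands in for `++>`: return a new workflow with one more step.
        return Workflow(self.name, self.handlers + (handler,))

    def run(self, raw_input: str) -> str:
        # JSON at the edges, typed values between steps (a simplification).
        value = json.loads(raw_input)
        for handler in self.handlers:
            value = handler(value)
        return json.dumps(value)


workflow = (
    Workflow("sum_lengths")
    >> (lambda words: [len(w) for w in words])  # list of str -> list of int
    >> sum                                      # list of int -> int
)

print(workflow.run('["one", "three", "seven"]'))  # 13
```

Because each `>>` returns a new object, a partially built workflow can be reused as the starting point for several variants.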

 

Let’s take a closer look at the two activities:

image

image

image

As you can see, given the following JSON string as input to the workflow:

[ "http://www.google.com", "http://www.yahoo.com", "http://www.bing.com" ]

the activities did what they were supposed to: they first translated the input into an int array and then summed it to give the total length of the landing pages for Google, Yahoo and Bing. It is little surprise that Yahoo’s landing page is nearly an order of magnitude bigger than the rest!

 

Parting Thoughts

In this post I demonstrated how to model a workflow with multiple steps that accept and return arbitrary types as input and output. In the next post I’ll demonstrate how to schedule multiple activities to run in parallel as a single step of a workflow.