Amazon SimpleWorkflow (abbreviated to SWF from here on) is a workflow service provided by Amazon which allows you to model business processes as workflows using a task-based programming model. The service provides reliable task dispatch and state management so that you can focus on developing ‘workers’ to perform the tasks that are required to move the execution of a workflow along.

Intro­duc­tion to SWF

For more infor­ma­tion about SWF, have a look at the fol­low­ing intro­duc­tory webinar.

There are two types of tasks:

  • Activ­ity task – tells an ‘activ­ity worker’ to per­form a spe­cific func­tion, e.g. check inven­tory or charge a credit card.
  • Decision task – tells a ‘decider’ that the state of a workflow execution has changed so that it can determine what the next course of action should be, e.g. continue to the next activity in the workflow, or complete the workflow if all activities have been successfully completed.

Both the activity worker and the decider need to poll the SWF service for tasks and, after receiving a task, respond with a result or a set of decisions respectively. Each task is associated with one or more timeout values; if no response is received before a timeout expires then the task times out and can be rescheduled (in the case of a decision task, it is rescheduled automatically by the system).

Since tasks can be polled from just about any­where (from an EC2 instance, or your home computer/laptop) and the tasks received can be part of any num­ber of cur­rently exe­cut­ing work­flows, both the activ­ity worker and decider should be com­pletely state­less and can be dis­trib­uted across any num­ber of loca­tions both inside and out­side of the AWS ecosystem.

The his­tory (as a sequence of events each keyed to a unique ID) of each work­flow exe­cu­tion is avail­able to view in the AWS Man­age­ment Con­sole so that you have plenty of infor­ma­tion to aid you when inves­ti­gat­ing why work­flows failed, for instance.

Con­sider the fol­low­ing exam­ple given by the SWF devel­oper guide:

[Image: Sample Workflow Overview]

Each of the steps can be represented as an activity task. Along the way the decider receives decision tasks and, by inspecting the history of events so far, schedules the next activity in the workflow, e.g.

[Image: pseudo code for the decider logic]
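As a rough illustration of that idea, here is a minimal Python sketch of a decider that looks at the history so far and returns the next decision. It is not the SWF API: the event dictionaries are a flattened, hypothetical stand-in for real history events, the activity names are made up, and only the event and decision type names (`ActivityTaskCompleted`, `ScheduleActivityTask` and so on) are borrowed from SWF. It also assumes the decider is only asked to decide right after the workflow starts or after an activity completes or fails.

```python
# Hypothetical activity names, run in this order; the steps in the figure may differ.
ACTIVITIES = ["check_inventory", "charge_credit_card", "ship_order"]


def decide(history, activities=ACTIVITIES):
    """Return the next decision for a workflow, given its (simplified) history."""
    last = history[-1]
    if last["type"] == "ActivityTaskFailed":
        return {"decision": "FailWorkflowExecution", "reason": last.get("reason", "")}

    completed = sum(1 for event in history if event["type"] == "ActivityTaskCompleted")
    if completed == len(activities):
        return {"decision": "CompleteWorkflowExecution", "result": last.get("result", "")}

    # The first activity receives the workflow input; later activities receive
    # the previous activity's result.
    next_input = last.get("result", last.get("input", ""))
    return {
        "decision": "ScheduleActivityTask",
        "activity": activities[completed],
        "input": next_input,
    }
```

Even in this simplified form the ordering of the activities only exists as an index into a list inside the decider, which is the point made in the next section.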

The actual SWF API is rather different so the pseudo code above tends to translate to something slightly more involved, which brings us to the topic of...

Shortcomings of SWF

Work­flows are mod­elled implicitly

In my opinion the biggest shortcoming with SWF is that the workflow itself (an ordered sequence of activities) is implied by the decider logic, and at no point as you work with the service does it feel like you’re actually modelling a workflow. This might not be an issue in simple cases, but as you string together more and more activities (and potentially child workflows), pass data along from one activity to the next and deal with failure cases, the decider logic becomes much more complex and difficult to maintain.

Need for boilerplate

The .Net AWSSDK provides a straight mapping to the set of actions available on the SWF service but adds very little on top of it. As it stands, every workflow requires boilerplate code to:

  • poll for deci­sion task (mul­ti­ple times if you need to go back fur­ther than the max 100 events per request)
  • inspect his­tory of events after receiv­ing a deci­sion task
  • sched­ule next activ­ity or com­plete work­flow based on last events
  • poll for activ­ity task
  • record heart­beats peri­od­i­cally when pro­cess­ing an activ­ity task
  • respond com­pleted mes­sage on suc­cess­ful com­ple­tion of the activ­ity task
  • cap­ture excep­tions dur­ing the pro­cess­ing of a task and respond failed message
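The activity-worker half of that list can be sketched in a few lines. This is a minimal Python sketch rather than .Net code: `client` is any object exposing the four SWF actions involved (PollForActivityTask, RecordActivityTaskHeartbeat, RespondActivityTaskCompleted, RespondActivityTaskFailed). The method and parameter names follow boto3's SWF client, but nothing here talks to AWS. The `handler(input, heartbeat)` signature and the `max_polls` limit are assumptions made for the example.

```python
def run_activity_worker(client, domain, task_list, handler, max_polls):
    """Poll for activity tasks and report the outcome of each one to SWF."""
    for _ in range(max_polls):
        task = client.poll_for_activity_task(domain=domain, taskList={"name": task_list})
        token = task.get("taskToken")
        if not token:
            continue  # the long poll returned without a task; poll again

        def heartbeat():
            # Tell SWF the task is still being worked on.
            client.record_activity_task_heartbeat(taskToken=token)

        try:
            result = handler(task.get("input", ""), heartbeat)
        except Exception as error:
            # Any exception raised by the handler is reported as a failed task.
            client.respond_activity_task_failed(
                taskToken=token, reason=type(error).__name__, details=str(error))
        else:
            client.respond_activity_task_completed(taskToken=token, result=result)
```

None of this is specific to any one workflow, which is why it feels like code the SDK could reasonably provide.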

Many of these steps are common across all deciders and activity workers, and it’s left to you to implement this missing layer of abstraction. Java developers have access to the heavyweight Flow Framework, which allows you to declaratively (using annotations) specify activities and workflows, giving you more of a sense of modelling the workflow and its constituent activities. However, as you can see from the canonical ‘hello world’ example, a lot of code is required to carry out even a simple workflow, not to mention the various framework concepts one would have to learn.

A light-weight, intu­itive abstrac­tion layer is badly needed.

All activ­ity and work­flow types must be registered

Every work­flow and every activ­ity needs to be explic­itly reg­is­tered with SWF before they can be exe­cuted, and like work­flow exe­cu­tions, reg­is­tered work­flow and activ­ity types can be viewed directly in the AWS Man­age­ment Con­sole:

[Image: registered workflow and activity types in the AWS Management Console]

This registration can be done programmatically (as is the case with the Flow Framework) or via the AWS Management Console. The programmatic approach is clearly preferred but again, as far as .Net developers are concerned, it’s an automation step which you’d have to implement yourself, and you’d also have to devise a versioning scheme for both workflows and activities. As a developer who just wants to model and implement a workflow with SWF, the registration represents another step in the development process which you would rather do without.

Another thing to keep in mind is that, where you have more than one activity with the same name but in different workflows and requiring different tasks to be performed, you need a way to distinguish between the different activities so that the corresponding activity workers do not pick up the incorrect task.

SimpleWorkflow.Extensions

Dri­ven by the pain of devel­op­ing against SWF because of its numer­ous short­com­ings (pain-driven devel­op­ment…) I started work­ing on an exten­sion library to the .Net AWSSDK to give .Net devel­op­ers an intu­itive API to model work­flows and han­dle all the nec­es­sary boil­er­plate tasks (such as excep­tion han­dling, etc.) so that you can truly focus on mod­el­ling work­flows and not worry about all the other plumb­ing required for work­ing with SWF.

Intu­itive mod­el­ling API

The simple ‘hello world’ example given by the Flow Framework can be modelled with fewer than 10 lines of code that are far easier to understand:

Here the ++> oper­a­tor attaches an activ­ity or child work­flow to an exist­ing empty work­flow and returns a new instance of Work­flow rather than mod­i­fy­ing the exist­ing work­flow (in the spirit of func­tional pro­gram­ming and immutability).
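To show what "returns a new instance rather than modifying the existing workflow" means in practice, here is a small Python sketch of the same idea. It is not the library's F# API: Python has no `++>` operator, so `>>` stands in for it, and the `Workflow` and `Activity` classes below are hypothetical, cut-down stand-ins.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class Activity:
    name: str
    func: Callable[[str], str]


@dataclass(frozen=True)
class Workflow:
    name: str
    steps: Tuple = ()

    def __rshift__(self, step):
        # Build a new Workflow with the extra step; `self` is left untouched.
        return replace(self, steps=self.steps + (step,))


empty = Workflow("hello_world")
hello_world = empty >> Activity("echo", lambda text: text)
```

After the last line `empty` still has no steps, while `hello_world` is a separate instance with one. Because both classes are frozen, a workflow definition can be shared and extended without any risk of one caller changing it under another.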

An activity in SWF terms can, in essence, be thought of as a function which takes an input (string), performs some task and returns a result (string). Hence the Activity class you see above accepts a function of the signature string –> string, though there is a generic variant Activity<TInput, TOutput> which takes a function of signature TInput –> TOutput and uses the ServiceStack.Text JSON serializer (the fastest JSON serializer for .Net) to marshal data to and from string.
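The generic variant boils down to wrapping a typed function with a deserialize step on the way in and a serialize step on the way out. Here is a minimal Python sketch of that wrapping, with the standard `json` module standing in for ServiceStack.Text. The `TypedActivity` name is hypothetical and is not the library's class.

```python
import json
from typing import Callable, Generic, TypeVar

TInput = TypeVar("TInput")
TOutput = TypeVar("TOutput")


class TypedActivity(Generic[TInput, TOutput]):
    """Wraps a TInput -> TOutput function so it can be called as string -> string."""

    def __init__(self, name: str, func: Callable[[TInput], TOutput]):
        self.name = name
        self._func = func

    def __call__(self, raw_input: str) -> str:
        # string -> TInput -> TOutput -> string
        return json.dumps(self._func(json.loads(raw_input)))


total = TypedActivity("total", lambda numbers: sum(numbers))
```

Calling `total("[1, 2, 3]")` parses the JSON array, sums it and returns the string `"6"`, so SWF only ever sees strings while the function itself works with typed values.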

Exchang­ing data between activities

The input to the work­flow exe­cu­tion is passed to the first activ­ity as input, and the result pro­vided by the first activ­ity is then passed to the sec­ond activ­ity as input and so on. This exchange of data also extends to child work­flows, for example:

Start­ing a work­flow exe­cu­tion with the input ‘the­burn­ing­monk’ prints the fol­low­ing out­puts to the console:

Mac­Don­ald: hello theburningmonk!

Mac­Don­ald: good bye, theburningmonk!

Old Mac­Don­ald had a farm

EE-I-EE-I-O

To visualize the sequence of events and how data is exchanged from one activity to the next:

starts main work­flow “with_child_workflow” with input “theburningmonk”

-> “the­burn­ing­monk” is passed as input to the activ­ity “greet”

-> calls cur­ried func­tion greet “Mac­Don­ald” with “theburningmonk”

-> greet func­tion prints “Mac­Don­ald: hello the­burn­ing­monk!” to console

-> greet func­tion returns “theburningmonk”

-> “the­burn­ing­monk” is passed as input to activ­ity “bye”

-> calls cur­ried func­tion bye “Mac­Don­ald” with “the­burn­ing­monk”

-> bye func­tion prints “Mac­Don­ald: good bye, the­burn­ing­monk!” to console

-> bye func­tion returns “MacDonald”

-> “Mac­Don­ald” is used as input to start the child work­flow “sing_along”

-> “Mac­Don­ald” is passed as input to the activ­ity “sing”

-> calls func­tion sing with “MacDonald”

-> sing func­tion prints “Old Mac­Don­ald had a farm” to console

-> sing function returns “EE-I-EE-I-O”

-> the child workflow “sing_along” completes with result “EE-I-EE-I-O”

-> “EE-I-EE-I-O” is passed as input to the activity “echo”

-> calls function echo with “EE-I-EE-I-O”

-> echo function prints “EE-I-EE-I-O” to console

-> echo function returns “EE-I-EE-I-O”

-> main workflow “with_child_workflow” completes with result “EE-I-EE-I-O”
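The trace above can be reproduced locally with a few lines of Python. This is only a sketch of the data flow, not of the library or of SWF itself: the function bodies are reconstructed from the trace, a nested list is used as a stand-in for a child workflow, and `run` is a hypothetical helper that threads each result into the next step.

```python
def run(steps, workflow_input):
    """Feed the input through each step in order; a nested list is a child workflow."""
    data = workflow_input
    for step in steps:
        data = run(step, data) if isinstance(step, list) else step(data)
    return data


def greet(me):
    def activity(you):
        print(f"{me}: hello {you}!")
        return you
    return activity


def bye(me):
    def activity(you):
        print(f"{me}: good bye, {you}!")
        return me
    return activity


def sing(name):
    print(f"Old {name} had a farm")
    return "EE-I-EE-I-O"


def echo(text):
    print(text)
    return text


sing_along = [sing]
with_child_workflow = [greet("MacDonald"), bye("MacDonald"), sing_along, echo]
result = run(with_child_workflow, "theburningmonk")
```

Running it prints the same four lines as the console output shown earlier and leaves `result` set to `"EE-I-EE-I-O"`, the result of the main workflow.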

Error and Retry mechanism

You can option­ally spec­ify the max num­ber of attempts (e.g. max 3 attempts = orig­i­nal attempt + 2 retries) that should be made for each activ­ity or child work­flow before let­ting it fail/timeout and fail the workflow.
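The "max attempts = original attempt + retries" rule is easy to get off by one, so here it is as a short Python sketch. The `run_with_attempts` helper is hypothetical and runs everything in-process, whereas with SWF a retry means rescheduling the activity or child workflow.

```python
def run_with_attempts(func, arg, max_attempts=3):
    """Call func(arg) up to max_attempts times and re-raise the last failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func(arg)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the failure fail the workflow
```

With `max_attempts=3` a function that fails twice and then succeeds is called three times in total, and one that always fails is called three times before the error propagates.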

Auto­matic work­flow and activ­ity registrations

The domain, workflow and activity types are all registered automatically (if they haven’t been registered already) when you start a workflow. You might notice that you don’t need to specify a version for each of the activities; this is because there is a convention-based versioning scheme in place (see below).
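"Register if not registered already" usually comes down to attempting the registration and treating an "already exists" response as success. A minimal Python sketch, under some assumptions: `client` is any object with `register_workflow_type` and `register_activity_type` methods (named after the corresponding boto3 SWF client methods), `TypeAlreadyExists` is a hypothetical stand-in for the fault SWF returns for a duplicate type, and domain registration, which would be handled the same way, is left out for brevity.

```python
class TypeAlreadyExists(Exception):
    """Hypothetical stand-in for the 'type already exists' fault returned by SWF."""


def ensure_registered(client, domain, workflow, activities):
    """Register the workflow and activity types, skipping any that already exist.

    `workflow` and each item of `activities` are (name, version) pairs.
    Returns the pairs that were newly registered by this call.
    """
    registrations = [(client.register_workflow_type, workflow)]
    registrations += [(client.register_activity_type, activity) for activity in activities]

    newly_registered = []
    for register, (name, version) in registrations:
        try:
            register(domain=domain, name=name, version=version)
            newly_registered.append((name, version))
        except TypeAlreadyExists:
            pass  # registered by an earlier run; nothing to do
    return newly_registered
```

Calling it a second time with the same arguments registers nothing and returns an empty list, which is what makes it safe to run every time a workflow is started.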

Ver­sion­ing scheme

Devising a versioning scheme for your activities is at best an arbitrary decision, yet it is one that SWF requires, which adds friction to the development process without adding much value for developers.

The versioning scheme I’m using is such that if an activity ‘echo’ is part of a workflow ‘with_child_workflow’ and is the 4th activity in the workflow, then the version for this particular instance of the ‘echo’ activity is with_child_workflow.3 (the workflow name followed by the activity’s zero-based position).

This scheme allows you to:

  • decouple the name of an activity from the delegate function
  • reuse the same activ­ity name in dif­fer­ent work­flows, and allow them to per­form dif­fer­ent tasks if need be
  • reuse the same activ­ity name for dif­fer­ent activ­i­ties in the same work­flow, and allow them to per­form dif­fer­ent tasks if need be
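The convention can be written down as a one-line function. This is a Python sketch of the scheme as described above, not the library's code. It assumes that a child workflow occupies a position just like an activity does, which is what puts ‘echo’ at index 3 in the earlier example. The `activity_versions` name is made up.

```python
def activity_versions(workflow_name, step_names):
    """Pair each step name with a version of the form '<workflow>.<zero-based index>'."""
    return [(name, f"{workflow_name}.{index}") for index, name in enumerate(step_names)]


versions = activity_versions(
    "with_child_workflow", ["greet", "bye", "sing_along", "echo"])
repeated = activity_versions("twice", ["echo", "echo"])
```

Here `versions[3]` is `("echo", "with_child_workflow.3")`, and `repeated` gives the two ‘echo’ steps the distinct versions `twice.0` and `twice.1`, so the same name can be reused within one workflow or across workflows without the registered types clashing.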

Asyn­chro­nous execution

Nearly all of the communication with SWF (polling, responding with result, etc.) is done asynchronously using non-blocking IO (using F# async workflows).
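For readers less familiar with F# async workflows, the practical effect is that a decider and any number of activity workers can all be waiting on long polls at once without each tying up a thread. Python's `asyncio` makes a rough analogy. The sketch below is not how the library is implemented: `asyncio.sleep` stands in for a non-blocking long poll, and the names and delays are made up.

```python
import asyncio


async def poll(name, delay, finished):
    # asyncio.sleep stands in for a non-blocking long poll against SWF.
    await asyncio.sleep(delay)
    finished.append(name)


async def main():
    finished = []
    # Both pollers wait concurrently on a single thread; neither blocks the other.
    await asyncio.gather(
        poll("decider", 0.05, finished),
        poll("worker", 0.01, finished),
    )
    return finished


order = asyncio.run(main())
```

The worker's shorter wait finishes first even though the decider's poll was started first, so `order` ends up as `["worker", "decider"]`.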

 

Currently, the extension library can only be used from F#. I’m still undecided on the API for C# (because you won’t be able to use the ++> custom operator) and would welcome any suggestions you might have!

As you can see from the Issues list, there are still a couple of things I want to add support for, but you should see a NuGet package being made available in the near future. If you want to try it out in the meantime, feel free to grab the source and run the various examples I have added in the ExampleFs project.

Enjoy!

2 Responses to “Making Amazon SimpleWorkflow simpler to work with”

  1. […] my pre­vi­ous post I men­tioned some of the short­com­ings with Ama­zon Sim­ple­Work­flow (SWF) which drove me […]

  2. […] Yan Cui blogged “Mak­ing Ama­zon Sim­ple­Work­flow sim­pler to work with“. […]
