My .Net Rocks talk is available

Hi, just a quick note to say that my talk with .Net Rocks is now available on their web site. In this talk I shared some insights into how F# is used in our stack to help us build the backend for our social games, specifically in the areas of:



Introduction to AWS SimpleWorkflow Extensions Part 3 – Parallelizing activities

The series so far:

  1. Hello World example
  2. Beyond Hello World


Within a workflow, not all activities have to be performed sequentially. In fact, to increase throughput and/or reduce the overall time required to finish a workflow, you might want to perform several activities in parallel provided that they don’t have any inter-dependencies and can be performed independently.

In this post we’re going to see how we can use the SWF extensions library to parallelize activities by scheduling several activity tasks at a single step and then aggregating their results into a single input to the next activity in the workflow.


The parallelized activities receive their input from either:

  1. the workflow execution’s input if this is the first step of the workflow, or
  2. the result of the preceding activity/child workflow in the workflow

As of now, the library requires you to specify a ‘reducer’, which is responsible for aggregating the results of the parallel activities into a single string that is returned as the result of the step in the workflow. There are some caveats to this reducer function (of signature Dictionary<int, string> -> string right now), as you will see in the example. I’ll look to address these oddities and clean up the API in future versions of the library, so please bear with me for now.

The aggregate result of these parallel activities can then be passed along as the input to the subsequent activity as per the example in my previous post.
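
To make the shape of a parallel step concrete, here is a minimal local sketch in plain F#. It is not the library's implementation: the `activities` array, `runParallel` and `reducer` are hypothetical stand-ins that only mimic the idea of running several independent functions against the same input, keying each result by the zero-based index of its activity, and reducing the lot into one string.

```fsharp
open System.Collections.Generic

// Hypothetical stand-ins for activities: each maps the step's input to a string result.
let activities : (string -> string) [] =
    [| (fun (s : string) -> string s.Length)
       (fun (s : string) -> s.ToUpperInvariant())
       (fun (s : string) -> string (s.Split([| ' ' |]).Length)) |]

// Run every activity against the same input in parallel and key each result
// by the zero-based index of the activity in the input array.
let runParallel (input : string) (activities : (string -> string) []) =
    let results =
        activities
        |> Array.map (fun f -> async { return f input })
        |> Async.Parallel
        |> Async.RunSynchronously
    let dict = Dictionary<int, string>()
    results |> Array.iteri (fun i r -> dict.[i] <- r)
    dict

// A reducer with the Dictionary<int, string> -> string shape described above.
let reducer (results : Dictionary<int, string>) =
    sprintf "length=%s upper=%s words=%s" results.[0] results.[1] results.[2]

let stepResult = runParallel "hello swf world" activities |> reducer
```

The string in `stepResult` is what the next step would receive as its input in this simplified model.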

Example : Count HTML element types

Suppose that, given a URL, you want to count the number of different HTML elements (e.g. <div>, <span>, …) the returned HTML contains. The counting of each element type is independent and can be carried out in parallel. For nicety, we can add an echo activity before and after the count activities so that we can print the input URL and the results to the screen. So you will end up with a workflow that perhaps looks like this:


The implementation of this workflow is as follows:


#r "bin/Release/AWSSDK.dll"
#r "bin/Release/SWF.Extensions.Core.dll"

open Amazon.SimpleWorkflow
open Amazon.SimpleWorkflow.Extensions

open System.Collections.Generic
open System.Net

let echo str = printfn "%s" str; str

// a function to count the number of occurrences of a pattern inside the HTML returned
// by the specified URL address
let countMatches (pattern : string) (address : string) =
    let webClient = new WebClient()
    let html = webClient.DownloadString address

    // slide a window of the pattern's length over the HTML and count the exact matches
    seq { 0..html.Length - pattern.Length }
    |> Seq.map (fun i -> html.Substring(i, pattern.Length))
    |> Seq.filter ((=) pattern)
    |> Seq.length

let echoActivity = Activity(
                        "echo", "echo input", echo,
                        taskHeartbeatTimeout       = 60,
                        taskScheduleToStartTimeout = 10,
                        taskStartToCloseTimeout    = 10,
                        taskScheduleToCloseTimeout = 20)

let countDivs = Activity<string, int>(
                        "count_divs", "count the number of <div> elements",
                        countMatches "<div",
                        taskHeartbeatTimeout       = 60,
                        taskScheduleToStartTimeout = 10,
                        taskStartToCloseTimeout    = 10,
                        taskScheduleToCloseTimeout = 20)

let countScripts = Activity<string, int>(
                        "count_scripts", "count the number of <script> elements",
                        countMatches "<script",
                        taskHeartbeatTimeout       = 60,
                        taskScheduleToStartTimeout = 10,
                        taskStartToCloseTimeout    = 10,
                        taskScheduleToCloseTimeout = 20)

let countSpans = Activity<string, int>(
                        "count_spans", "count the number of <span> elements",
                        countMatches "<span",
                        taskHeartbeatTimeout       = 60,
                        taskScheduleToStartTimeout = 10,
                        taskStartToCloseTimeout    = 10,
                        taskScheduleToCloseTimeout = 20)

// the activities to schedule in parallel as a single step of the workflow
let countActivities = [| countDivs      :> ISchedulable
                         countScripts   :> ISchedulable
                         countSpans     :> ISchedulable |]

// results are keyed by the zero-based index of the activity in the array above
let countReducer (results : Dictionary<int, string>) =
    sprintf "Divs : %d\nScripts : %d\nSpans : %d\n" (int results.[0]) (int results.[1]) (int results.[2])

let countElementsWorkflow =
    Workflow(domain = "", name = "count_html_elements",
             description = "this workflow counts",
             version = "1")
    ++> echoActivity
    ++> (countActivities, countReducer)
    ++> echoActivity

let awsKey      = "PUT-YOUR-AWS-KEY-HERE"
let awsSecret   = "PUT-YOUR-AWS-SECRET-HERE"
let client = new AmazonSimpleWorkflowClient(awsKey, awsSecret)

countElementsWorkflow.Start(client)
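
If you want to try the counting logic without SWF or a network connection, the sliding-window match can be pulled out into a pure function. This is only a sketch: `countOccurrences` and the sample HTML string are hypothetical and are not part of the library or the listing above.

```fsharp
// Count how many times `pattern` occurs in `html` by sliding a window of the
// pattern's length across the text. An empty range is produced when the pattern
// is longer than the text, so the result is 0 in that case.
let countOccurrences (pattern : string) (html : string) =
    seq { 0 .. html.Length - pattern.Length }
    |> Seq.map (fun i -> html.Substring(i, pattern.Length))
    |> Seq.filter ((=) pattern)
    |> Seq.length

// Hypothetical sample page, used only for illustration.
let sampleHtml = "<div><span>a</span><div class='x'></div></div>"

let divCount    = countOccurrences "<div" sampleHtml
let spanCount   = countOccurrences "<span" sampleHtml
let scriptCount = countOccurrences "<script" sampleHtml
```

Closing tags such as `</div>` are not counted because the pattern starts with `<d`, not `</`.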

Thanks to Tomas Petricek’s FSharp.Formatting project I’m now able to provide code snippets with intellisense! Tomas, you rock!


Running the above example and starting a workflow execution with the input

outputs the following to the console:


If you take a look at the history of events below, the decision task following the completion of stage 0 (the first echo activity) was completed with the state (tracked in the Execution Context property):


Without going into too much detail on the inner workings of the generated decider, this JSON serialized state tells us that the workflow has moved into stage no. 1, where there are a total of 3 actions, each represented by an activity task to count a particular type of HTML element.


Switching to the Activities tab, you can see that 3 activities were completed at stage index 1 of the count_html_elements workflow, judging by the Activity ID and Version of the 3 activities:



Looking at the example code, a couple of questions jump out straight away:

Q. What is the ISchedulable interface?

The ISchedulable interface represents anything that can be scheduled as a part of a workflow, i.e. an activity or a child workflow. Both IActivity and IWorkflow inherit from it, though in all the examples so far we’ve only worked directly against the concrete implementation types for these two interfaces.


Q. Why does the reducer take a Dictionary<int, string>?

A. As far as the reducer is concerned, it probably doesn’t need to. The main reason I’ve used a dictionary here is that when intermediate results are available (e.g. 2 out of 3 parallel activities have completed) I wanted to be able to show the current set of results in the Execution Context for the workflow execution (see screenshot above). Because not all the results are back at that point, I needed to be able to show each result against its originating activity, hence a dictionary where the key is the zero-based index of the activity in the input array and the value is the string representation of the result.
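
As a small illustration of why index keys help with intermediate results, the sketch below works out which activities are still outstanding from a partially filled dictionary. The helper `pendingIndices` and the sample values are hypothetical; the library does not necessarily expose anything like this.

```fsharp
open System.Collections.Generic

// Hypothetical helper: list the activities (by zero-based index) that have not reported yet.
let pendingIndices (total : int) (results : Dictionary<int, string>) =
    [ 0 .. total - 1 ] |> List.filter (fun i -> not (results.ContainsKey i))

// Two of three parallel activities have completed; index 1 is still running.
let partialResults = Dictionary<int, string>()
partialResults.[0] <- "12"
partialResults.[2] <- "7"

let pending = pendingIndices 3 partialResults
```

With a plain list of results there would be no way to tell that it is the second activity, rather than the third, that has yet to finish.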


Q. Why then, are the Dictionary’s value strings when the scheduled activity can be generic and return arbitrary types?

Because not all the activities have to return the same type.

Under the hood the generic Activity<TInput, TOutput> marshals data to and from JSON strings using ServiceStack.Text JSON serializer, and the Activity type is just a special case where both TInput and TOutput are strings.

By the time the results are recorded and retrieved via SWF, they’re already in string format. Although it’s possible to inspect the originating activity’s generic type parameters to work out the returned type, catering for different return types would require the dictionary to be a Dictionary<int, object> instead, which is not any better.


Q. So what if I want to return anything other than a string from the reducer function?

For now, you can use the ServiceStack.Text JSON serializer (for better compatibility) to serialize the return value to a string yourself. I’ll add support for the library to do this automatically in the version 1.1.0 release. It slipped my mind at the time, sorry…
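
A rough sketch of what such a reducer could look like is shown below. To keep it free of dependencies the JSON text is assembled by hand with `sprintf` rather than with ServiceStack.Text, so treat it as an illustration of the shape only; `countReducerJson` and its field names are assumptions.

```fsharp
open System.Collections.Generic

// The values arrive as strings, so each one is parsed back to the type its activity
// returned (int here) before the structured result is written out as a JSON string.
// The JSON is built by hand purely to keep this sketch dependency-free.
let countReducerJson (results : Dictionary<int, string>) : string =
    let divs, scripts, spans = int results.[0], int results.[1], int results.[2]
    sprintf "{\"divs\":%d,\"scripts\":%d,\"spans\":%d,\"total\":%d}"
        divs scripts spans (divs + scripts + spans)

// Hypothetical results, as they might look once all three activities have completed.
let sampleResults = Dictionary<int, string>()
sampleResults.[0] <- "3"
sampleResults.[1] <- "2"
sampleResults.[2] <- "5"

let reducedJson = countReducerJson sampleResults
```

The following activity would then need to deserialize this string itself, or declare an input type that matches it.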



In the next post, I’ll show you how you can add child workflows into the mix. As I’ve mentioned above, workflows also implement the ISchedulable interface and can be scheduled into the workflow in the same way as activities.

To find out the latest announcements and updates on the Amazon.SimpleWorkflow.Extensions project, please follow the official twitter account @swf_extensions, and as always, your feedback and comments on the project will be much appreciated!

Introduction to AWS SimpleWorkflow Extensions Part 2 – Beyond Hello World

The series so far:

1.   Hello World example

3.   Parallelizing activities


In this post we’re going to go beyond the previous Hello World example and show you how to use the SWF extensions library to model workflows with multiple steps and allow data to flow naturally from one step to the next.

When using the extension library, the input to a workflow execution is passed onto the first activity of a workflow as input by default (as per the previous Hello World example), and the result of that activity is passed onto the next activity as input and so on. The result of the last activity is then used as the result of the whole workflow :
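
The data flow described above is essentially a left fold over the steps. Here is a minimal local model of it, with no SWF involved; `runStages` and the three sample stages are hypothetical.

```fsharp
// Feed the workflow input to the first stage, each result to the next stage,
// and return the last stage's result as the result of the whole workflow.
let runStages (stages : (string -> string) list) (input : string) : string =
    stages |> List.fold (fun current stage -> stage current) input

// Hypothetical stages standing in for activities.
let sampleStages =
    [ (fun (s : string) -> s.Trim())
      (fun (s : string) -> s.ToUpperInvariant())
      (fun (s : string) -> s + "!") ]

let workflowResult = runStages sampleStages "  hello world "
```

In the real service each hop goes through SWF as a recorded event, which is what makes the execution durable; the fold only captures the order in which data moves.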


When you model an activity using the Activity type you need to pass in a function with the signature string -> string. This function is called against the input when the generated activity worker receives a task, and its return value is used as the result of the activity.

What you might not realize is that the Activity type is actually a specialized form of the generic Activity<TInput, TOutput> type which allows you to supply functions with arbitrary input and output types and simply uses the ServiceStack.Text JSON serializer to marshal data to and from string. I had decided to use the ServiceStack.Text serializer because it’s the fastest JSON serializer around based on my benchmarks.
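
One way to picture the relationship between the two types is as a typed function sandwiched between a decoder and an encoder. The sketch below uses plain `decode` and `encode` function parameters instead of a JSON serializer, so it shows the idea only, not how the library does it; `asStringHandler` is a hypothetical name.

```fsharp
// Turn a typed handler into a string -> string handler by decoding the input
// before the call and encoding the output after it.
let asStringHandler (decode : string -> 'TInput)
                    (encode : 'TOutput -> string)
                    (handler : 'TInput -> 'TOutput) : string -> string =
    decode >> handler >> encode

// An int -> int handler exposed as string -> string, using the built-in
// int and string conversion functions as the decoder and encoder.
let doubleIt : string -> string = asStringHandler int string (fun x -> x * 2)

let doubled = doubleIt "21"
```

A string-to-string activity is then simply the case where both `decode` and `encode` are the identity function.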

Example : Sum Web Page Lengths

Suppose you want to count and sum the size of the HTML pages given a number of URLs and return the sum as the result:


To implement this workflow you need to attach two activities to the workflow, the first requiring a function that turns a string array into an int array and the second aggregates the int array into a single integer value, something along the lines of:

The main things to take away from this example are that:

  1. you can attach multiple activities to a workflow by chaining them up with the ++> operator
  2. handler functions for activities do not have to have the string -> string signature
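
As a rough sketch of what the two handler functions for this example might look like, consider the following. The names `pageLengths` and `sumLengths` are assumptions, and the page download is injected as a `fetch` function (hypothetical) so that the sketch can run without network access.

```fsharp
// First activity's handler: string[] -> int[].
// `fetch` stands in for whatever downloads the page, e.g. WebClient.DownloadString.
let pageLengths (fetch : string -> string) (urls : string []) : int [] =
    urls |> Array.map (fun url -> (fetch url).Length)

// Second activity's handler: int[] -> int.
let sumLengths (lengths : int []) : int =
    Array.sum lengths

// A fake fetch used only for illustration: every "page" is the URL repeated twice.
let fakeFetch (url : string) = url + url

let lengths = pageLengths fakeFetch [| "a"; "bbb" |]
let total   = sumLengths lengths
```

Partially applying `pageLengths` to a real download function gives a `string [] -> int []` handler of the kind the first activity requires.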


Let’s take a closer look at the two activities:




As you can see, given the input JSON string to the workflow:

[ “”, “”, “” ]

the activities did what they were supposed to and first translated the input into an int array before summing them to give a total for the length of the landing pages for Google, Yahoo and Bing, and little surprise that Yahoo’s landing page is nearly an order of magnitude bigger than the rest!


Parting Thoughts

In this post I demonstrated how to model a workflow with multiple steps which accept and return arbitrary types as input and output. In the next post I’ll demonstrate how to schedule multiple activities to be performed in parallel as a single step of a workflow.

Introduction to AWS SimpleWorkflow Extensions Part 1 – Hello World example

Series so far:

2. Beyond Hello World

3. Parallelizing activities


In my previous post I mentioned some of the shortcomings with Amazon SimpleWorkflow (SWF) which drove me to create an extension library on top of the standard .Net SDK to make it easier to model workflows and business processes using SWF.

In this series of blog posts I’ll give you more examples of how to use the library to model workflows to be executed against the SWF service to take advantage of the reliable state management and task dispatch it offers, but none of the plumbing and boilerplate code you would have to deal with using the SDK.

Before we start looking at examples, let’s have a quick recap of the SWF terminology:

  • A workflow is a sequence of steps that are loosely strung together by the decisions the decider makes each time the state of the workflow changes. E.g. step 1 complete then schedule step 2 to commence.
  • A workflow execution is an instance of a particular workflow currently being executed; many executions of the same workflow (identified by name and version) can be in flight at the same time. A workflow execution can be started with a string as input and it can return a string as output.
  • A decision task is a task that is scheduled each time a workflow’s state changes.
  • A decider is a component in your application which is responsible for polling SWF for decision tasks and responding with decisions. The sequence of steps that need to be performed by the workflow is ultimately determined by the decider.
  • An activity task is a task that is scheduled by a decider; it takes a string as input (along with several other pieces of data which it can be scheduled with) and returns a string as result.
  • An activity worker is a component in your application which is responsible for polling SWF for activity tasks and responding with completion or failure signals, as well as providing regular heartbeat signals. If the decider is responsible for scheduling work to be done, then the activity worker is responsible for doing the actual work.
  • A child workflow is a workflow that is scheduled by the decider as a step in a workflow, similar to an activity.
  • The decider is able to schedule both child workflows and activities for a single step in a workflow. Whilst child workflows can be rerun as an independent unit of work, activities cannot be rerun independently outside of the context of a workflow.
  • Both workflows and activities need to be registered with the SWF service before they can be used.
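
To tie a few of these terms together, here is a heavily simplified model of a decider as a pure function from the execution history to the next decision. The `Event` and `Decision` types and the `decide` function are hypothetical; real SWF events and decisions carry far more detail.

```fsharp
// Hypothetical, simplified event and decision types.
type Event =
    | WorkflowStarted of input : string
    | ActivityCompleted of name : string * result : string

type Decision =
    | ScheduleActivity of name : string * input : string
    | CompleteWorkflow of result : string

// Decide the next step from the history alone: the decider keeps no state of its own.
let decide (steps : string list) (history : Event list) : Decision =
    let completed =
        history |> List.choose (function ActivityCompleted (_, r) -> Some r | _ -> None)
    let input =
        history |> List.pick (function WorkflowStarted i -> Some i | _ -> None)
    // The next step receives the previous result, or the workflow input if nothing has run yet.
    let lastResult = defaultArg (List.tryLast completed) input
    match List.tryItem completed.Length steps with
    | Some next -> ScheduleActivity (next, lastResult)
    | None      -> CompleteWorkflow lastResult
```

Each time the history grows, a new decision task is scheduled and a function like `decide` would be run again against the longer history.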

These are the most common concepts/components you’ll see in SWF, but there are also less commonly used (in my opinion at least) features such as:

  • Starting a timer to cause a timer event to be fired after some time.
  • Signalling an external workflow execution to cause an event to be recorded in its execution history and a decision task to be scheduled. This is a useful way to allow inter-workflow communication, e.g. one workflow suspends itself, until another workflow sends it a signal and then it can resume with its execution.
  • Recording a marker as means to provide additional information in the execution history of a workflow.


Example : Hello World

Consider a workflow where there is only one activity, which simply prints the input to the screen and echoes it back out.

If we start a workflow execution with the input “Hello World!” then we expect to see the input being printed to the console and then the workflow execution completed with the result “Hello World!”.


In essence, you can think of an activity as nothing more than a function which accepts a string as its argument and returns a string, i.e. a fun with the signature string -> string in F#.

With the standard .Net SDK you will need to write a decider for each workflow in order to provide the orchestration you need for that workflow. The decider logic tends to quickly become difficult to understand and maintain when the decision logic becomes more complicated, e.g. when multiple activities and child workflows are scheduled in parallel, and you need to retry/fail activities/workflows, etc.

In my view, the decider is largely plumbing that developers should do without, so with the extensions library you should not need to write any custom decider code but instead, simply declare what activities and/or child workflows should be scheduled at each stage of a workflow and let the library do all the heavy lifting for you!

As far as workflow modelling is concerned, the only thing you need to do is use the custom ++> operator (inspired by Dave Thomas’s pipelets project) to attach additional steps to your workflow. So the above workflow can be modelled as:

and that’s it! No need to register the workflow and activity and write bespoke decider & activity worker yourself, the library does all of that for you, all you needed to do was to model the workflow you want.

Notice you haven’t had to provide any reference to SWF at all thus far, in fact, you only need to provide an instance of AmazonSimpleWorkflowClient (from the AWS SDK) when you start the workflow:


This way, it’s possible to run the workflow across multiple accounts simultaneously (dev, staging, prod, etc.) by calling the Start method with each of the client instances (one for each account), which fits well with the mobile worker model SWF is designed with – SWF holds the state but you can run your workers from anywhere in and out of the AWS ecosystem.

Once you’ve started the workflow, the library will automatically register the domain, workflow and activity for you if they are not present already. You can verify this by looking in the SWF Management Console:


Notice that whilst we didn’t specify the “echo” activity with a version number, it’s registered with “echo.0”? I’ll go into more details on the versioning scheme in a later post, but for now let’s just be glad that we didn’t have to register these by hand!

Next, you can start a workflow directly from the management console, by ticking against the workflow you want to start and clicking the “Start New Execution” button:


Let’s follow through with the dialogue box and set the input as Hello World! as below:


Once you start the workflow execution you will see Hello World! being printed in the console:


This is a sign that our echo function (which is invoked by the generated activity worker) had been called.

Back in the SWF Management Console, if you look under Workflow Executions, you should see the execution is closed after having completed successfully:


Clicking on the workflow execution ID allows you to see the sequence of events which had been recorded for this execution:


This is a very granular view of what happened during the workflow execution, giving you plenty of useful information if you ever need to investigate why a workflow execution failed, for instance.

If you switch to the Activities tab, you’ll get a more condensed view with just the activities that were scheduled, along with their inputs, results, etc.


For now, ignore the JSON string in the Control field and the format of the Activity ID. These are both automatically generated by the library based on a set of conventions and will be covered by a later post.

So that’s it! I hope you can see that this extension library gives you a powerful way to express and model a workflow and focus your development efforts on the things that count (designing the process and writing the code that does the actual work) rather than wasting precious developer time on getting your code to work with SWF!


Parting Thoughts

For Java developers, there is an existing high-level framework (provided by Amazon itself) for working with SWF called the Flow Framework, which adopts a more object-oriented approach and in my opinion requires far more plumbing and most importantly does not

In case you’re wondering, this is how a solution to a similar Hello World example looks using the flow framework (taken straight from the flow framework developer guide) for your comparison:

Making Amazon SimpleWorkflow simpler to work with

Amazon SimpleWorkflow (abbreviated to SWF from here on) is a workflow service provided by Amazon which allows you to model business processes as workflows using a task based programming model. The service provides reliable task dispatch and state management so that you can focus on developing ‘workers’ to perform the tasks that are required to move the execution of a workflow along.

Introduction to SWF

For more information about SWF, have a look at the following introductory webinar.

There are two types of tasks:

  • Activity task – tells an ‘activity worker’ to perform a specific function, e.g. check inventory or charge a credit card.
  • Decision task – tells a ‘decider’ that the state of a workflow execution has changed so that it can determine what the next course of action should be, e.g. continue to the next activity in the workflow, or complete the workflow if all activities have been successfully completed

Both the activity worker and the decider need to poll the SWF service for tasks and respond with some result or decisions respectively after receiving a task. Each task is associated with one or more timeout values, and if no response is received before the timeout expires then the task will time out and can be rescheduled (in the case of a decision task, it is rescheduled automatically by the system).

Since tasks can be polled from just about anywhere (from an EC2 instance, or your home computer/laptop) and the tasks received can be part of any number of currently executing workflows, both the activity worker and decider should be completely stateless and can be distributed across any number of locations both inside and outside of the AWS ecosystem.

The history (as a sequence of events each keyed to a unique ID) of each workflow execution is available to view in the AWS Management Console so that you have plenty of information to aid you when investigating why workflows failed, for instance.



Consider the following example given by the SWF developer guide:

Sample Workflow Overview

Each of the steps can be represented as an activity task and along the way the decider will receive decision tasks and by inspecting the history of events thus far the decider can schedule the next activity in the workflow, e.g.


The actual SWF API is rather different so the pseudo code above tends to translate to something slightly more involved, which brings us to the topic of…

Shortcomings of SWF

Workflows are modelled implicitly

In my opinion the biggest shortcoming with SWF is that the workflow itself (an ordered sequence of activities) is implied by the decider logic, and at no point as you work with the service does it feel like you’re actually modelling a workflow. This might not be an issue in simple cases, but as you string together more and more activities (and potentially child workflows), pass data along from one activity to the next and deal with failure cases, the decider logic is going to become much more complex and difficult to maintain.

Need for boilerplate

The .Net AWSSDK provides a straight mapping to the set of actions available on the SWF service and provides very little added value to developers because as it stands every workflow requires boilerplate code to:

  • poll for decision task (multiple times if you need to go back further than the max 100 events per request)
  • inspect history of events after receiving a decision task
  • schedule next activity or complete workflow based on last events
  • poll for activity task
  • record heartbeats periodically when processing an activity task
  • respond completed message on successful completion of the activity task
  • capture exceptions during the processing of a task and respond failed message
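As a rough sketch of the activity worker half of that list, the loop below polls for a task, runs a handler, and responds with either a completed or a failed message. `InMemorySwf` and its method names are hypothetical stand-ins invented for this example, not the AWSSDK API, and heartbeats are left out to keep it short.

```python
class InMemorySwf:
    """Hypothetical stand-in for an SWF client; not the real SDK API."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.responses = []

    def poll_for_activity_task(self):
        # Return the next queued task, or None when there is nothing to do.
        return self.tasks.pop(0) if self.tasks else None

    def respond_completed(self, token, result):
        self.responses.append((token, "completed", result))

    def respond_failed(self, token, reason):
        self.responses.append((token, "failed", reason))


def run_activity_worker(client, handlers):
    """Poll until no tasks remain, dispatching each task to its handler."""
    while True:
        task = client.poll_for_activity_task()
        if task is None:
            break
        try:
            result = handlers[task["activity"]](task["input"])
        except Exception as exc:
            # Any exception (including an unknown activity) becomes a failed response.
            client.respond_failed(task["token"], repr(exc))
        else:
            client.respond_completed(task["token"], result)
```

Every worker ends up with some variation of this loop, which is why it is a good candidate for a shared abstraction.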

Many of these steps are common across all deciders and activity workers, and it’s left to you to implement this missing layer of abstraction. Java developers have access to the heavy-weight Flow Framework, which allows you to specify activities and workflows declaratively (using annotations), giving you more of a sense of modelling the workflow and its constituent activities. However, as you can see from the canonical ‘hello world’ example, a lot of code is required to carry out even a simple workflow, not to mention the various framework concepts one would have to learn.

A light-weight, intuitive abstraction layer is badly needed.

All activity and workflow types must be registered

Every workflow and every activity needs to be explicitly registered with SWF before they can be executed, and like workflow executions, registered workflow and activity types can be viewed directly in the AWS Management Console:


This registration can be done programmatically (as is the case with the Flow Framework) or via the AWS Management Console. The programmatic approach is clearly preferred, but again, as far as .Net developers are concerned, it’s an automation step which you have to implement yourself, along with devising a versioning scheme for both workflows and activities. For a developer who just wants to model and implement a workflow with SWF, registration is another step in the development process which you would rather do without.

Another thing to keep in mind is that if you have more than one activity with the same name, but they belong to different workflows and require different tasks to be performed, you need a way to distinguish between them so that the corresponding activity workers do not pick up the wrong task.


Driven by the pain of developing against SWF because of its numerous shortcomings (pain-driven development…) I started working on an extension library to the .Net AWSSDK to give .Net developers an intuitive API to model workflows and handle all the necessary boilerplate tasks (such as exception handling, etc.) so that you can truly focus on modelling workflows and not worry about all the other plumbing required for working with SWF.

Intuitive modelling API

The simple ‘hello world’ example given by the Flow Framework can be modelled with fewer than 10 lines of code that are far easier to understand:

Here the ++> operator attaches an activity or child workflow to an existing empty workflow and returns a new instance of Workflow rather than modifying the existing workflow (in the spirit of functional programming and immutability).

An activity in SWF terms can, in essence, be thought of as a function which takes an input (string), performs some task and returns a result (string). Hence the Activity class you see above accepts a function of the signature string –> string, though there is a generic variant Activity<TInput, TOutput> which takes a function of signature TInput –> TOutput and uses the ServiceStack.Text JSON serializer (the fastest JSON serializer for .Net) to marshal data to and from string.
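The idea behind the generic variant can be sketched in a few lines of Python: wrap a function over structured values so that it looks like a string –> string activity from the outside. `typed_activity` is a hypothetical helper for illustration and uses the standard `json` module rather than ServiceStack.Text.

```python
import json

def typed_activity(func):
    """Wrap a function over JSON-serializable values as a str -> str activity."""
    def run(raw_input: str) -> str:
        # Deserialize the incoming string, call the function, serialize the result.
        return json.dumps(func(json.loads(raw_input)))
    return run

# A function over a list of numbers, exposed as a string -> string activity.
total = typed_activity(lambda numbers: sum(numbers))
```

Calling `total("[1, 2, 3]")` hands the underlying function a list and returns its result serialized back to a string.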

Exchanging data between activities

The input to the workflow execution is passed to the first activity as input, and the result provided by the first activity is then passed to the second activity as input and so on. This exchange of data also extends to child workflows, for example:

Starting a workflow execution with the input ‘theburningmonk’ prints the following outputs to the console:

MacDonald: hello theburningmonk!

MacDonald: good bye, theburningmonk!

Old MacDonald had a farm


To visualize the sequence of events and how data is exchanged from one activity to the next:

starts main workflow “with_child_workflow” with input “theburningmonk”

-> “theburningmonk” is passed as input to the activity “greet”

-> calls curried function greet “MacDonald” with “theburningmonk”

-> greet function prints “MacDonald: hello theburningmonk!” to console

-> greet function returns “theburningmonk”

-> “theburningmonk” is passed as input to activity “bye”

-> calls curried function bye “MacDonald” with “theburningmonk”

-> bye function prints “MacDonald: good bye, theburningmonk!” to console

-> bye function returns “MacDonald”

-> “MacDonald” is used as input to start the child workflow “sing_along”

-> “MacDonald” is passed as input to the activity “sing”

-> calls function sing with “MacDonald”

-> sing function prints “Old MacDonald had a farm” to console

-> sing function returns “EE-I-EE-I-O”

-> the child workflow “sing_along” completes with result “EE-I-EE-I-O”

-> “EE-I-EE-I-O” is passed as input to the activity “echo”

-> calls function echo with “EE-I-EE-I-O”

-> echo function prints “EE-I-EE-I-O” to console

-> echo function returns “EE-I-EE-I-O”

-> main workflow “with_child_workflow” completes with result “EE-I-EE-I-O”
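The same flow of data can be mimicked with plain functions. This Python sketch is only an analogy for the trace above, not the library’s API: `run_workflow` is a hypothetical helper that feeds each step’s result into the next step, and `functools.partial` plays the part of currying.

```python
from functools import partial

def greet(me, you):
    print(f"{me}: hello {you}!")
    return you

def bye(me, you):
    print(f"{me}: good bye, {you}!")
    return me

def sing(name):
    print(f"Old {name} had a farm")
    return "EE-I-EE-I-O"

def echo(text):
    print(text)
    return text

def run_workflow(steps, workflow_input):
    """Feed the input through each step; each result becomes the next input."""
    value = workflow_input
    for step in steps:
        value = step(value)
    return value

# The child workflow is itself just a step that takes an input and returns a result.
sing_along = partial(run_workflow, [sing])

with_child_workflow = [
    partial(greet, "MacDonald"),
    partial(bye, "MacDonald"),
    sing_along,
    echo,
]

result = run_workflow(with_child_workflow, "theburningmonk")
```

Note how the bye step returns “MacDonald” rather than its own input, which is why the child workflow starts with “MacDonald” and not “theburningmonk”.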

Error and Retry mechanism

You can optionally specify the max number of attempts (e.g. max 3 attempts = original attempt + 2 retries) that should be made for each activity or child workflow before letting it fail/timeout and fail the workflow.
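The attempt counting can be sketched as follows. This is a simplified, in-process illustration in Python; `run_with_attempts` is a hypothetical helper, whereas in SWF the retry is driven by the decider rescheduling the task.

```python
def run_with_attempts(activity, activity_input, max_attempts=3):
    """Call `activity` up to `max_attempts` times, re-raising the last error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return activity(activity_input)
        except Exception:
            # Out of attempts: let the failure propagate and fail the workflow.
            if attempt == max_attempts:
                raise
```

With `max_attempts=3` an activity that fails twice and then succeeds still produces a result, while one that fails three times propagates its last exception.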

Automatic workflow and activity registrations

The domain, workflow and activity types are all registered automatically (if they haven’t been registered already) when you start a workflow. You might notice that you don’t need to specify a version for each of the activities; this is because there is a convention-based versioning scheme in place (see below).

Versioning scheme

Devising a versioning scheme for your activities is at best an arbitrary decision, and one that is required by SWF, which adds friction to the development process without adding much value for developers.

The versioning scheme I’m using is such that if an activity ‘echo’ is part of a workflow ‘with_child_workflow’ and is the 4th step in the workflow, then the version for this particular instance of the ‘echo’ activity is with_child_workflow.3 (the position is counted from zero).

This scheme allows you to:

  • decouple the name of an activity from the delegate function
  • reuse the same activity name in different workflows, and allow them to perform different tasks if need be
  • reuse the same activity name for different activities in the same workflow, and allow them to perform different tasks if need be
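The convention can be sketched in Python as below. `step_versions` is a hypothetical helper, and the zero-based position is inferred from the single ‘echo’ example above rather than taken from the library’s source.

```python
def step_versions(workflow_name, step_names):
    """Pair each step name with a version of the form '<workflow>.<zero-based position>'."""
    return [
        (name, f"{workflow_name}.{index}")
        for index, name in enumerate(step_names)
    ]

versions = step_versions(
    "with_child_workflow", ["greet", "bye", "sing_along", "echo"]
)
```

Because the version encodes both the workflow name and the position, two steps that share a name still end up with distinct name and version pairs, whether they live in different workflows or at different positions in the same one.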

Asynchronous execution

Nearly all of the communication with SWF (polling, responding with results, etc.) is done asynchronously with non-blocking IO (using F# async workflows).


Currently, the extension library can only be used from F#. I’m still undecided on the API for C# (because you won’t be able to use the ++> custom operator) and would welcome any suggestions you might have!

As you can see from the Issues list, there are still a couple of things I want to add support for, but you should see a NuGet package made available in the near future. If you want to try it out in the meantime, feel free to grab the source and run the various examples I have added in the ExampleFs project.