In this post we’re going to go beyond the previous Hello World example and show you how to use the SWF extensions library to model workflows with multiple steps and allow data to flow naturally from one step to the next.
When you use the extension library, the input to a workflow execution is passed to the first activity of the workflow as its input by default (as in the previous Hello World example). The result of that activity is passed on to the next activity as its input, and so on. The result of the last activity is then used as the result of the whole workflow.
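The data flow described above can be sketched without the library at all. The snippet below is a minimal illustration, not the extension library's implementation: `trim`, `shout`, `greet` and `runWorkflow` are hypothetical names, and each step is just a plain string -> string function.

```fsharp
// Hypothetical stand-ins for activity handlers; each maps string -> string.
let trim (input : string) = input.Trim()
let shout (input : string) = input.ToUpperInvariant()
let greet (input : string) = sprintf "Hello, %s!" input

// Thread the workflow input through the steps in order: the result of one
// step becomes the input of the next, and the last result is the final result.
let runWorkflow (steps : (string -> string) list) (input : string) : string =
    steps |> List.fold (fun current step -> step current) input

let result = runWorkflow [ trim; shout; greet ] "  world "
printfn "%s" result // prints "Hello, WORLD!"
```

A fold is used here only because it makes the "output of one step is the input of the next" rule explicit; the real library schedules each step as an SWF activity task rather than calling the functions directly.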
When you model an activity using the Activity type, you need to pass in a function with the signature string -> string. This function is called with the input when the generated activity worker receives a task, and its return value is used as the result of the activity.
What you might not realize is that the Activity type is actually a specialized form of the generic Activity<TInput, TOutput> type. The generic type lets you supply functions with arbitrary input and output types, and simply uses the ServiceStack.Text JSON serializer to marshal data to and from strings. I chose the ServiceStack.Text serializer because it was the fastest JSON serializer in my own benchmarks.
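To make the idea concrete, here is a rough sketch of how a typed handler can be lifted into a string -> string function by serializing at the boundaries. It is an assumption-laden illustration rather than the library's code: `toStringHandler` and `sumHandler` are hypothetical names, and System.Text.Json stands in for ServiceStack.Text so that the snippet runs without extra packages.

```fsharp
open System.Text.Json

// Hypothetical helper: wrap a typed handler so that it accepts and returns
// JSON strings, which is the shape a plain string -> string activity expects.
let toStringHandler<'TInput, 'TOutput> (handler : 'TInput -> 'TOutput) : string -> string =
    fun (input : string) ->
        // Unmarshal the incoming JSON into the handler's input type.
        let typedInput = JsonSerializer.Deserialize<'TInput>(input)
        let typedOutput = handler typedInput
        // Marshal the typed result back to a JSON string.
        JsonSerializer.Serialize<'TOutput>(typedOutput)

// An int[] -> int handler exposed as string -> string.
let sumHandler : string -> string =
    toStringHandler (fun (numbers : int[]) -> Array.sum numbers)

printfn "%s" (sumHandler "[1, 2, 3]") // prints "6"
```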
Example: Sum Web Page Lengths
Suppose you are given a number of URLs and want to sum the sizes of their HTML pages, returning the total as the result of the workflow.
To implement this workflow you need to attach two activities to it: the first takes a function that turns a string array into an int array, and the second takes a function that aggregates the int array into a single integer value.
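The shape of those two handlers can be sketched as follows. This is only an approximation of the example, under several stated assumptions: it does not use the extension library, `asStep`, `fakePages`, `getLengths`, `sumLengths` and `run` are hypothetical names, the page contents are stubbed so the snippet runs offline, and System.Text.Json stands in for the library's serializer.

```fsharp
open System.Text.Json

// Hypothetical helper (not the library's API): lift a typed function into
// string -> string by going through JSON on the way in and out.
let asStep<'TIn, 'TOut> (f : 'TIn -> 'TOut) : string -> string =
    fun (json : string) ->
        let input = JsonSerializer.Deserialize<'TIn>(json)
        JsonSerializer.Serialize<'TOut>(f input)

// Stubbed pages so the sketch needs no network access; a real handler would
// download each URL and measure the HTML that comes back.
let fakePages =
    dict [ "http://a.example", "<html>a</html>"
           "http://b.example", "<html>bb</html>" ]

// First step: string[] -> int[] (the length of each page).
let getLengths (urls : string[]) : int[] =
    urls |> Array.map (fun url -> fakePages.[url].Length)

// Second step: int[] -> int (the total length).
let sumLengths (lengths : int[]) : int = Array.sum lengths

// Run the steps in order, feeding each result to the next step.
let run (input : string) : string =
    [ asStep getLengths; asStep sumLengths ]
    |> List.fold (fun current step -> step current) input

let total = run """["http://a.example","http://b.example"]"""
printfn "%s" total // prints "29" (14 + 15 characters)
```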
The main things to take away from this example are:
- you can attach multiple activities to a workflow by chaining them up with the ++> operator
- handler functions for activities do not have to have a string -> string signature
Let’s take a closer look at what the two activities do. Given the input JSON string to the workflow:
[ "http://www.google.com", "http://www.yahoo.com", "http://www.bing.com" ]
the activities did what they were supposed to: the first translated the input into an int array, and the second summed that array to give the total length of the landing pages for Google, Yahoo and Bing. It is little surprise that Yahoo’s landing page is nearly an order of magnitude bigger than the rest!
In this post I demonstrated how to model a workflow with multiple steps which accept and return arbitrary types as input and output. In the next post I’ll demonstrate how to schedule multiple activities to be performed in parallel as a single step of a workflow.