If you enjoy reading these exercises then please buy Crista’s book to support her work.
Style 28 – Actors
- The larger problem is decomposed into ‘things’ that make sense for the problem domain.
- Each ‘thing’ has a queue meant for other things to place messages in it.
- Each ‘thing’ is a capsule of data that exposes only its ability to receive messages via the queue.
- Each ‘thing’ has its own thread of execution independent of the others.
I have been looking forward to doing this style ever since the Letterbox style months ago. So without any delay, let’s get started!
First, we’ll create an actor to load and store the words that we need to process. Since MailboxProcessor is generic, we can define what messages it can receive with a DataStorageMessage union type that allows two kinds of messages:
- LoadWords — which contains the path to the local file to load words from
- NextWord — once loaded, this message fetches the next available word
In the snippet below, you can see that we have a recursive receive loop which asynchronously waits for messages to arrive in its mailbox and handles them accordingly. One thing to note here is that, if a LoadWords message is received before all the loaded words are processed, the current behaviour is to overwrite the remaining words list. This can give you unexpected results, so the current design leaves the responsibility of ensuring this doesn’t happen with higher level abstractions (see the controller actor below).
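For readers without the original snippet to hand, here is a minimal sketch of what such an actor might look like. The tokenising regex, the lower-casing and the `string option` reply type are my assumptions for illustration, not necessarily what the original code does.

```fsharp
open System.IO
open System.Text.RegularExpressions

type DataStorageMessage =
    | LoadWords of path : string
    | NextWord  of AsyncReplyChannel<string option>

let dataStorageManager =
    MailboxProcessor<DataStorageMessage>.Start(fun inbox ->
        // The remaining words are the only state, threaded through the loop.
        let rec loop (words : string list) = async {
            let! msg = inbox.Receive()
            match msg with
            | LoadWords path ->
                // Assumption: split on non-word characters and lower-case everything.
                let text = File.ReadAllText(path).ToLower()
                let loaded =
                    Regex.Split(text, @"[\W_]+")
                    |> Array.filter (fun w -> w.Length > 0)
                    |> List.ofArray
                // Replaces whatever was left over from a previous load.
                return! loop loaded
            | NextWord reply ->
                match words with
                | [] ->
                    reply.Reply None
                    return! loop []
                | hd :: tl ->
                    reply.Reply (Some hd)
                    return! loop tl
        }
        loop [])
```

A caller would post `LoadWords somePath` and then call `dataStorageManager.PostAndReply(NextWord)` repeatedly until it gets back `None`.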
Next, we’ll add another actor to manage the stop words. Similar to the dataStorageManager, we’ll first define the messages that can be sent to our actor:
- LoadStopWords — which contains the path to the local file to load stop words from
- IsStopWord — which passes in a word and expects a boolean as reply
As we did in the dataStorageManager, whenever the stopWordsManager actor receives a LoadStopWords message it’ll replace the current list of stop words. All future words passed in via the IsStopWord message will be checked against the updated list of stop words.
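A rough sketch of this actor, under a couple of assumptions of mine: the stop words file is comma-separated, and the words are kept in a `Set` rather than a list so that lookups are cheap.

```fsharp
open System.IO

type StopWordsMessage =
    | LoadStopWords of path : string
    | IsStopWord    of word : string * reply : AsyncReplyChannel<bool>

let stopWordsManager =
    MailboxProcessor<StopWordsMessage>.Start(fun inbox ->
        let rec loop (stopWords : Set<string>) = async {
            let! msg = inbox.Receive()
            match msg with
            | LoadStopWords path ->
                // Assumption: a comma-separated file of stop words.
                let loaded =
                    File.ReadAllText(path).Split(',')
                    |> Array.map (fun w -> w.Trim().ToLower())
                    |> Array.filter (fun w -> w <> "")
                    |> Set.ofArray
                // The new set replaces the old one outright.
                return! loop loaded
            | IsStopWord (word, reply) ->
                reply.Reply (stopWords.Contains word)
                return! loop stopWords
        }
        loop Set.empty)
```

Since `IsStopWord` carries both a word and a reply channel, the caller has to build the message in a lambda, e.g. `stopWordsManager.PostAndReply(fun reply -> IsStopWord ("the", reply))`.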
Next, we’ll add an actor to track the word frequencies. This actor can accept three messages:
- Add — adds another word to the current word frequencies
- TopN — fetches the current top N words with their corresponding frequencies
- Reset — resets the count
Again, the code snippet below should be pretty straightforward, although there is actually a frailty here due to the use of Seq.take. One thing that often annoys me about Seq.take is that, unlike Enumerable.Take, it throws when there is an insufficient number of elements in the sequence! In this case, if the calling agent asks for TopN too early or with a large N, it’s possible to kill our actor (which is also something that we’re not handling here). It’s fine given the context of this exercise, but these are things that you need to consider when writing production-ready code.
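One way to sidestep that frailty is to swap Seq.take for Seq.truncate, which returns at most N elements and doesn't throw when there are fewer. The sketch below does exactly that, so it deliberately differs from the code described above; the `Map` state and the array reply type are also my own choices for illustration.

```fsharp
type WordFrequencyMessage =
    | Add   of string
    | TopN  of int * AsyncReplyChannel<(string * int) []>
    | Reset

let wordFrequencyManager =
    MailboxProcessor<WordFrequencyMessage>.Start(fun inbox ->
        let rec loop (freqs : Map<string, int>) = async {
            let! msg = inbox.Receive()
            match msg with
            | Add word ->
                let count = defaultArg (Map.tryFind word freqs) 0
                return! loop (Map.add word (count + 1) freqs)
            | TopN (n, reply) ->
                let top =
                    freqs
                    |> Map.toSeq
                    |> Seq.sortByDescending snd
                    // Seq.truncate yields at most n items; Seq.take would
                    // throw here if fewer than n words had been counted.
                    |> Seq.truncate n
                    |> Seq.toArray
                reply.Reply top
                return! loop freqs
            | Reset ->
                return! loop Map.empty
        }
        loop Map.empty)
```

With this version, asking for `TopN (25, reply)` before 25 distinct words have been added simply returns everything counted so far.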
Lastly, we’ll add an actor to orchestrate the control flow of our program. It accepts a single Run message, to which I’ve added an AsyncReplyChannel<unit> so that the caller has a deterministic way to know when the program has completed.
When the controller receives a Run message, it’ll initialize the other actors (which happens concurrently due to the asynchronous nature of messaging) and then process all the words from Pride and Prejudice by recursively fetching words from the dataStorageManager until there are no more. One thing to note is that, because a MailboxProcessor processes messages one at a time, even if the controller receives multiple Run messages at the same time it’ll still process them one at a time, and we don’t even have to use locks!
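To make the shape of that loop concrete, here is a cut-down sketch. It is not the full controller: `makeStorage` is a hypothetical stand-in for the dataStorageManager that is seeded from an in-memory list, and `onWord` is a hypothetical callback standing in for the stop word check and the frequency update.

```fsharp
type StorageMessage    = NextWord of AsyncReplyChannel<string option>
type ControllerMessage = Run of AsyncReplyChannel<unit>

// Hypothetical stand-in for the storage actor, seeded in memory.
let makeStorage (words : string list) =
    MailboxProcessor<StorageMessage>.Start(fun inbox ->
        let rec loop remaining = async {
            let! (NextWord reply) = inbox.Receive()
            match remaining with
            | [] ->
                reply.Reply None
                return! loop []
            | hd :: tl ->
                reply.Reply (Some hd)
                return! loop tl
        }
        loop words)

let makeController (storage : MailboxProcessor<StorageMessage>)
                   (onWord  : string -> unit) =
    MailboxProcessor<ControllerMessage>.Start(fun inbox ->
        // Keep asking the storage actor for words until it replies None.
        let rec processWords () = async {
            let! next = storage.PostAndAsyncReply(NextWord)
            match next with
            | Some word ->
                onWord word
                return! processWords ()
            | None ->
                return ()
        }
        // One Run is handled to completion before the next is received,
        // so overlapping Run requests never interleave.
        let rec loop () = async {
            let! (Run reply) = inbox.Receive()
            do! processWords ()
            reply.Reply ()
            return! loop ()
        }
        loop ())
```

The reply on the Run channel is only sent after `processWords` has finished, which is what gives the caller its completion signal.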
To run our program, we’ll kick things off by sending a Run message to the controller. I opted to run this synchronously, but you could just as easily ignore the reply and run the program asynchronously with Async.Ignore and Async.Start.
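The two ways of kicking things off look roughly like this. The `controller` here is a trivial stand-in of mine that just replies straight away, so that the snippet runs on its own.

```fsharp
type ControllerMessage = Run of AsyncReplyChannel<unit>

// Trivial stand-in controller: counts runs and replies immediately.
let controller =
    MailboxProcessor<ControllerMessage>.Start(fun inbox ->
        let rec loop runs = async {
            let! (Run reply) = inbox.Receive()
            printfn "run #%d finished" (runs + 1)
            reply.Reply ()
            return! loop (runs + 1)
        }
        loop 0)

// Synchronous: blocks the calling thread until the controller replies.
controller.PostAndReply(Run)

// Asynchronous: request a reply, discard it, and start without waiting.
controller.PostAndAsyncReply(Run)
|> Async.Ignore
|> Async.Start
```

In a console app the fire-and-forget version needs something else to keep the process alive, otherwise it may exit before the run completes.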
I’m a massive fan of Erlang and the Actor Model. They’re highly related, since Erlang implements the Actor Model, but they shouldn’t be mixed up: code hot swapping and supervision trees, for example, are features of Erlang and Erlang OTP, and are not prescribed as part of the Actor Model (which is a theoretical model for describing computation).
If you haven’t already, please go ahead and watch this Channel9 recording of a conversation between Carl Hewitt and Erik Meijer on the Actor Model: what it is, and what it isn’t. I also did a write-up to summarise the key points.
You can find the source code for this exercise here.