Exercises in Programming Style – Dataspaces

NOTE: read the rest of the series, or check out the source code.

If you enjoy reading these exercises then please buy Crista’s book to support her work.


Following on from the last post, we will look at the Dataspaces style today.

 

Style 29 – Dataspaces

Constraints

  • Existence of one or more units that execute concurrently.
  • Existence of one or more data spaces where concurrent units store and retrieve data.
  • No direct data exchanges between the concurrent units, other than via the data spaces.

 

To get started, we’ll define our dataspaces: one to store the words we need to process, and one to store the partial frequencies produced by each concurrent unit that processes the words (we’ll see what this means soon).
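
As a minimal sketch of this step, the two dataspaces could be modelled as thread-safe queues. The names wordSpace and freqSpace come from the post; the choice of ConcurrentQueue is an assumption, and any thread-safe collection would satisfy the constraints equally well.

```fsharp
open System.Collections.Concurrent
open System.Collections.Generic

// Assumption: each dataspace is modelled as a thread-safe queue.
// wordSpace holds the words that are waiting to be processed.
let wordSpace = ConcurrentQueue<string>()

// freqSpace holds one partial word-frequency dictionary per concurrent unit.
let freqSpace = ConcurrentQueue<Dictionary<string, int>>()
```

Because both queues are thread-safe, the concurrent units never need to talk to each other directly, which is what the third constraint asks for.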


Next, we’ll define the processWords function that will be executed concurrently.

Each concurrent unit will poll the wordSpace dataspace for words to process and build a word-frequency dictionary for the words it has processed. Once the available words are exhausted, each concurrent unit saves its locally aggregated word frequencies into the freqSpace dataspace.
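
The polling loop described above might look roughly like the sketch below. It assumes the dataspaces are ConcurrentQueue instances and uses a tiny hypothetical stopWords set as a stand-in for a real stop-word list. It also assumes all words are loaded before any unit starts, so an empty queue means there is nothing left to do.

```fsharp
open System.Collections.Concurrent
open System.Collections.Generic

let wordSpace = ConcurrentQueue<string>()
let freqSpace = ConcurrentQueue<Dictionary<string, int>>()

// Hypothetical stop-word set; a real one would be loaded from a file.
let stopWords = set [ "the"; "a"; "of" ]

let processWords () =
    // Frequencies counted by this concurrent unit only.
    let wordFreqs = Dictionary<string, int>()

    // Keep polling wordSpace until it reports that no words are left.
    let rec loop () =
        match wordSpace.TryDequeue() with
        | true, word ->
            if not (stopWords.Contains word) then
                match wordFreqs.TryGetValue word with
                | true, count -> wordFreqs.[word] <- count + 1
                | _ -> wordFreqs.[word] <- 1
            loop ()
        | _ -> ()
    loop ()

    // Publish the locally aggregated counts to the freqSpace dataspace.
    freqSpace.Enqueue wordFreqs
```

Note that each unit only ever touches its own dictionary, so no locking is needed while counting.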


Next, we’ll read the text from Pride & Prejudice and add the words into our wordSpace dataspace for processing.
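
A possible sketch of the loading step is shown below. The loadWords helper, the regular expression (lower-case words of two or more letters) and the file name in the comment are all assumptions for illustration. A short inline sentence stands in for the full text of the book so that the snippet runs on its own.

```fsharp
open System.Collections.Concurrent
open System.Text.RegularExpressions

let wordSpace = ConcurrentQueue<string>()

// Hypothetical helper: extracts lower-case words of two or more letters
// from the text and adds each of them to the wordSpace dataspace.
let loadWords (text: string) =
    Regex.Matches(text.ToLower(), "[a-z]{2,}")
    |> Seq.cast<Match>
    |> Seq.iter (fun m -> wordSpace.Enqueue m.Value)

// Assumption: the real program would read the book from disk instead, e.g.
// loadWords (System.IO.File.ReadAllText "pride-and-prejudice.txt")
loadWords "It is a truth universally acknowledged"
// wordSpace now holds 5 words ("a" is dropped because it is a single letter)
```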


In Crista’s solution, she kicked off 5 concurrent threads to process the words and waited for all of them to finish before merging the partial results in the freqSpace dataspace. I’m not sure if this fork-join approach is a necessary part of this style, but it seems a reasonable choice here.

To follow the same approach, we can use F#’s Async.Parallel method.

Here, I chose to use Async.RunSynchronously to synchronously wait for the parallel tasks to finish (this is the same approach Crista took in her solution). Alternatively, you can make the wait happen asynchronously by capturing the result of Async.Parallel instead (see Version 2 below).
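
To illustrate the fork-join step, here is a small sketch that combines Async.Parallel with Async.RunSynchronously. To keep it short, processWords is a simplified stand-in that only counts how many words each unit pulled from wordSpace, so freqSpace holds plain integers here rather than dictionaries. Both simplifications are assumptions made for the example.

```fsharp
open System.Collections.Concurrent

let wordSpace = ConcurrentQueue<string>()
let freqSpace = ConcurrentQueue<int>()

// Simplified stand-in for processWords: each unit counts the words it
// pulled from wordSpace, then publishes that count to freqSpace.
let processWords () =
    let rec loop count =
        match wordSpace.TryDequeue() with
        | true, _ -> loop (count + 1)
        | _ -> count
    freqSpace.Enqueue (loop 0)

[ "it"; "is"; "truth"; "universally"; "acknowledged" ]
|> List.iter (fun word -> wordSpace.Enqueue word)

// Fork five concurrent units, then block until all of them have finished.
[ for _ in 1 .. 5 -> async { processWords () } ]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore
```

How the five words are split between the units varies from run to run, but once RunSynchronously returns, every unit has published exactly one partial result.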

The next step is pretty straightforward. Iterate through the partial results in the freqSpace dataspace and aggregate them into a single word frequencies dictionary, then return the word frequencies as a sorted array.
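
The merge could be sketched as follows. The mergeFreqs name and the two sample partial results are made up for illustration, and freqSpace is again assumed to be a ConcurrentQueue of dictionaries.

```fsharp
open System.Collections.Concurrent
open System.Collections.Generic

let freqSpace = ConcurrentQueue<Dictionary<string, int>>()

// Two made-up partial results, as if published by two concurrent units.
freqSpace.Enqueue (Dictionary<string, int>(dict [ "elizabeth", 2; "darcy", 1 ]))
freqSpace.Enqueue (Dictionary<string, int>(dict [ "elizabeth", 3; "bennet", 4 ]))

// Hypothetical helper: folds every partial result into one dictionary,
// then returns the totals sorted by descending count.
let mergeFreqs () =
    let wordFreqs = Dictionary<string, int>()
    for partial in freqSpace do
        for KeyValue(word, count) in partial do
            match wordFreqs.TryGetValue word with
            | true, total -> wordFreqs.[word] <- total + count
            | _ -> wordFreqs.[word] <- count
    wordFreqs
    |> Seq.map (fun (KeyValue(word, count)) -> word, count)
    |> Seq.sortByDescending snd
    |> Seq.toArray

let merged = mergeFreqs ()
// merged = [| ("elizabeth", 5); ("bennet", 4); ("darcy", 1) |]
```

Enumerating a ConcurrentQueue works on a snapshot and does not remove anything, which is fine here because all the units have already finished.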


Finally, take the top 25 results from the sorted array and display them on screen.
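
A small sketch of this last step, assuming the merged result is an array of (word, count) pairs that is already sorted by descending count. The top helper and the sample values are hypothetical. Array.truncate is used rather than Array.take so that a text with fewer than 25 distinct words does not throw.

```fsharp
// Made-up, already sorted sample of merged word frequencies.
let wordFreqs = [| "elizabeth", 5; "bennet", 4; "darcy", 1 |]

// Hypothetical helper: returns at most the first n entries.
let top n (sorted: (string * int)[]) = Array.truncate n sorted

wordFreqs
|> top 25
|> Array.iter (fun (word, count) -> printfn "%s - %d" word count)
```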


 

Version 2 — Async all the way

If you didn’t like the synchronous waiting in the fork-join approach above, here’s a modified version of the solution that is async all the way.

So first, we’ll capture the parallel processing of words (discarding the results) as an Async<unit>. Notice that at this point we haven’t done any work yet; we have merely captured the asynchronous computation that we will perform. This is one of the key differences between async in C# and F#: a C# Task is typically already running by the time you hold a reference to it, whereas an F# Async is only a description of work until it is started.
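
The sketch below tries to make the "no work yet" point concrete. processAllWords is the name used in the post; the completed queue and the trivial processWords stand-in, which only records the ID of the unit that ran, are assumptions for the example.

```fsharp
open System.Collections.Concurrent

// Records which units have actually run (illustration only).
let completed = ConcurrentQueue<int>()

// Trivial stand-in for processWords: just notes that the unit ran.
let processWords (workerId: int) = completed.Enqueue workerId

// Describes five units running in parallel, with their results discarded.
let processAllWords : Async<unit> =
    [ for workerId in 1 .. 5 -> async { processWords workerId } ]
    |> Async.Parallel
    |> Async.Ignore

// Nothing has been started yet, so no unit has run.
printfn "units run so far: %d" completed.Count   // prints "units run so far: 0"
```

Only when processAllWords is later started, for example with do! inside another async block, will the five units actually execute.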


Inside another async { } block, we can trigger the parallel processing, asynchronously wait for its completion (i.e. do! processAllWords) and then merge the partial results in the freqSpace dataspace as before.
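
A sketch of how this composition might look is given below. To keep it self-contained, processWords only counts the words it pulls and the merge simply sums the partial counts. The totalWords name and these simplifications are assumptions, not the original solution.

```fsharp
open System.Collections.Concurrent

let wordSpace = ConcurrentQueue<string>()
let freqSpace = ConcurrentQueue<int>()

// Simplified stand-in for processWords: publishes how many words it pulled.
let processWords () =
    let rec loop count =
        match wordSpace.TryDequeue() with
        | true, _ -> loop (count + 1)
        | _ -> count
    freqSpace.Enqueue (loop 0)

let processAllWords : Async<unit> =
    [ for _ in 1 .. 5 -> async { processWords () } ]
    |> Async.Parallel
    |> Async.Ignore

// Waits asynchronously for the parallel processing to finish,
// then merges the partial results (here: by summing them).
let totalWords : Async<int> =
    async {
        do! processAllWords
        return Seq.sum freqSpace
    }
```

As with processAllWords, defining totalWords does not run anything; it only extends the description of the work with a merge step.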


Finally, we’ll kick off the entire train of asynchronous computations that we have composed together with Async.Start.
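
As a rough illustration of Async.Start, the sketch below starts a small composed computation on the thread pool without blocking the caller. The pipeline body and the finished event are assumptions. The event is only there so that a script or console program does not exit before the background work completes.

```fsharp
open System.Threading

// Signalled by the pipeline when it is done (illustration only).
let finished = new ManualResetEventSlim(false)

// Stand-in for the composed computation: run five units in parallel,
// combine their results, display the outcome, then signal completion.
let pipeline : Async<unit> =
    async {
        let! counts = [ for n in 1 .. 5 -> async { return n } ] |> Async.Parallel
        printfn "total: %d" (Array.sum counts)   // prints "total: 15"
        finished.Set()
    }

// Fire and forget: returns immediately while the work runs in the background.
Async.Start pipeline

// Assumption: only needed here to keep a script or console app alive.
finished.Wait()
```

In a long-running application the final wait would not be needed, since the process stays alive while the computation completes.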


And voilà, now everything runs asynchronously end-to-end!

 

You can find the source code for this exercise here (v1) and here (v2, async all the way).