Takeaways from Hewitt, Meijer and Szyperski’s talk on the Actor model

This is a list of my takeaways from the excellent talk between Erik Meijer (of LINQ and Rx fame), Carl Hewitt (creator of the Actor model) and Clemens Szyperski on the Actor model.

 

Disclaimer: this conversation revolves around the conceptual model of an actor, as opposed to specific implementations of the Actor model.

 

What is an actor?

An actor is the fun­da­men­tal unit of com­pu­ta­tion which embod­ies the 3 things – pro­cess­ing, stor­age and com­mu­ni­ca­tions – that are essen­tial to com­pu­ta­tion.

One actor is no actor: actors come in systems, and they have to have addresses so that one actor can send messages to another actor.

 

Beyond the high-lev­el abstrac­tion, an actor has a num­ber of prop­er­ties:

  • Every­thing is an actor
  • An actor has a mail­box

 

Since a mail­box is also an actor, it too will have a mail­box, and so the recur­sion begins! This recur­sion ends with axioms.

 

Axioms

When an actor receives a mes­sage it can:

  • Cre­ate new actors
  • Send messages to actors whose addresses it already has
  • Designate how to handle the next message it receives (e.g. by changing its state)

and that’s it!

“Conceptually, messages are processed one at a time, but the implementation can allow for concurrent processing of messages.” – Carl Hewitt

 

Designating how to handle the next message is not the same as a continuation, which is the lambda expression that you execute after the current one and is a concept for single-threaded processing.
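The three axioms can be sketched in a few lines of Python. This is only a toy under stated assumptions, not an implementation from the talk: the names `System`, `create`, `send`, `run` and `counter` are all made up for illustration, addresses are plain integers, and messages are delivered from a FIFO queue purely to keep the sketch short (the model itself promises no ordering).

```python
from collections import deque


class System:
    """A toy single-threaded actor system (hypothetical, for illustration)."""

    def __init__(self):
        self.behaviours = {}     # address -> current message handler
        self.pending = deque()   # (address, message) pairs awaiting delivery

    def create(self, behaviour):
        # Axiom 1: create a new actor and get back its address.
        address = len(self.behaviours)
        self.behaviours[address] = behaviour
        return address

    def send(self, address, message):
        # Axiom 2: send a message to an address we already hold.
        self.pending.append((address, message))

    def run(self):
        # Deliver messages one at a time until none are left.
        while self.pending:
            address, message = self.pending.popleft()
            next_behaviour = self.behaviours[address](self, message)
            # Axiom 3: the handler designates how the next message is handled.
            if next_behaviour is not None:
                self.behaviours[address] = next_behaviour


def counter(count, log):
    """Behaviour of a counter actor; `log` only exists so we can observe it."""
    def handle(system, message):
        if message == "increment":
            return counter(count + 1, log)   # a new behaviour carries the new state
        if message == "report":
            log.append(count)
        return None                          # keep the current behaviour
    return handle


log = []
system = System()
address = system.create(counter(0, log))
for message in ["increment", "increment", "report"]:
    system.send(address, message)
system.run()
print(log)   # [2]
```

Note that the counter never mutates anything: its "state" is simply whichever behaviour it designated for the next message. Handlers also receive `system`, so they could create actors or send messages themselves.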

 

Whilst con­cep­tu­al­ly mes­sages are processed one at a time, the imple­men­ta­tion can allow for con­cur­rent pro­cess­ing of mes­sages. For instance, a fac­to­r­i­al actor which has no state and will process each mes­sage the same way can process an arbi­trary num­ber of mes­sages at the same time.

 

An actor can also send mes­sages to itself (i.e. recur­sion), and to avoid dead­locks we have the notion of a future.

The idea of a future is that you can create an actor for any result whilst it’s still being computed. For instance, you can create a future for the factorial of 100 million, which will take a long time to compute, but you can have the future straight away and pass it around.
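As a rough analogy, Python’s standard `concurrent.futures` module offers a `Future` object that behaves in a similar way. It is not an actor, so treat this as an assumption-laden illustration rather than what Hewitt describes; `describe` is a hypothetical helper, and the factorial of 20 stands in for the expensive computation.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def describe(future):
    # The receiver only blocks at the point where it needs the actual value.
    return f"result is {future.result()}"


with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns the future immediately, before the work has finished.
    future = pool.submit(math.factorial, 20)
    summary = describe(future)   # the future is passed around like any value

print(summary)   # result is 2432902008176640000
```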

 

Address­es

The address of an actor is not the same as its iden­ti­ty because:

  • One address can serve many actors, if you’re replicating behind the scenes
  • One actor can have many address­es that for­ward to one anoth­er (via proxy actors)

hence there’s a many-to-many rela­tion­ship between actors and address­es.
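Both directions of that relationship can be sketched as follows. Every class name here (`Worker`, `DirectAddress`, `ReplicatedAddress`, `ForwardingAddress`) is hypothetical, and round-robin is just one assumed way to spread messages over replicas.

```python
import itertools


class Worker:
    """A hypothetical actor that records the messages it receives."""

    def __init__(self):
        self.inbox = []

    def receive(self, message):
        self.inbox.append(message)


class DirectAddress:
    """An address that delivers straight to one actor."""

    def __init__(self, actor):
        self._actor = actor

    def send(self, message):
        self._actor.receive(message)


class ReplicatedAddress:
    """One address, many actors: messages are spread round-robin."""

    def __init__(self, replicas):
        self._next = itertools.cycle(replicas)

    def send(self, message):
        next(self._next).receive(message)


class ForwardingAddress:
    """A proxy address that forwards to another address."""

    def __init__(self, target):
        self._target = target

    def send(self, message):
        self._target.send(message)


# One address, two actors behind it.
a, b = Worker(), Worker()
pool = ReplicatedAddress([a, b])
for n in range(4):
    pool.send(n)
print(a.inbox, b.inbox)   # [0, 2] [1, 3]

# Two addresses, one actor behind them.
c = Worker()
direct = DirectAddress(c)
proxy = ForwardingAddress(direct)
direct.send("x")
proxy.send("y")
print(c.inbox)            # ['x', 'y']
```

A sender holding `pool`, `direct` or `proxy` sees exactly the same `send` interface in each case, and cannot tell how many actors sit behind it.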

 

With actors, all you have are addresses, which don’t tell you whether you have one or many actors behind them. The same notion of addresses also applies to the web, e.g. whilst searching on google.com it’s not the same actor processing your requests every time.

 

Addresses are similar to capabilities, but ‘address’ is a much clearer name for a capability because it tells you exactly what you are allowed to do with it: send messages to it, which is its only capability.

“If you can maintain the integrity of addresses, you get capabilities for free” – Carl Hewitt

 

Mes­sages

Messages are like ‘packets’ on the internet, and they obey the same rule as packets for efficiency reasons: messages can be received in any order, because it’s more expensive for the system to enforce an ordering constraint.

 

Messages are also delivered on a best-effort basis. When crossing machines, this means they are persisted on some storage and can be resent if an acknowledgement of receipt is not received. But if the source machine is terminated before the resend happens, then the message is lost.

 

Messages sent between actors are delivered at most once, and may take a long time to arrive depending on the distance and network latency between the actors (e.g. a message in a bottle).
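These delivery properties (no ordering, possible loss, no duplicates) can be simulated with a small sketch. The `deliver` function and its `loss_rate` parameter are invented for illustration and do not model any real network; the random generator is seeded only so that a run can be repeated.

```python
import random


def deliver(messages, loss_rate, rng):
    """Simulate best-effort delivery: messages may be dropped or reordered,
    but are never duplicated (at-most-once)."""
    survivors = [m for m in messages if rng.random() >= loss_rate]
    rng.shuffle(survivors)
    return survivors


rng = random.Random(42)   # seeded for repeatability only
sent = list(range(10))
received = deliver(sent, loss_rate=0.2, rng=rng)
print(received)
```

A receiver written against this model cannot rely on getting every message, or on getting them in the order they were sent.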

 

Chan­nels

“There are no channels” – Carl Hewitt

Instead, the actors talk direct­ly to one anoth­er.

 

The problem with a channel is that if you’re trying to send a message to two recipients, only one of them will receive the message, unless you take on the overhead of a two-phase commit.

 

As an imple­men­ta­tion detail, you can imple­ment a chan­nel (which will be anoth­er actor in the sys­tem) if you want, but it’s not part of the con­cep­tu­al mod­el.
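A minimal sketch of a channel built as just another actor might look like this. The `Channel` class and its `("put", value)` / `("take", taker)` message shapes are assumptions made for this example, and a plain callable stands in for the taker’s address.

```python
from collections import deque


class Channel:
    """A channel modelled as an ordinary actor that buffers values for takers."""

    def __init__(self):
        self.values = deque()
        self.takers = deque()

    def receive(self, message):
        kind, payload = message
        if kind == "put":
            self.values.append(payload)
        elif kind == "take":
            # payload is a callable standing in for the taker's address
            self.takers.append(payload)
        # Hand over values while both a value and a waiting taker exist.
        while self.values and self.takers:
            self.takers.popleft()(self.values.popleft())


got_a, got_b = [], []
channel = Channel()
channel.receive(("take", got_a.append))
channel.receive(("take", got_b.append))
channel.receive(("put", "hello"))
print(got_a, got_b)   # ['hello'] []
```

With two takers waiting and a single value put, only the first taker receives it, which is the limitation of channels described above.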

 

Non­de­ter­min­ism vs Inde­ter­min­ism

A quick recap on Turing machines: a Turing machine is a theoretical machine that defines computability. It can be thought of as a simple computer that reads and writes symbols one at a time on an infinitely long tape by following a set of rules. It determines what to do next according to an internal state and the symbol it currently sees on the tape.

In a deterministic Turing machine, the current state and symbol specify only one action to be performed. For example, “if you are in state 2 and you see an ‘A’, write a ‘B’ and move left”.

In a nondeterministic Turing machine (NTM), the current state and symbol may specify more than one action that could be performed. For example, “if you are in state 2 and you see an ‘A’, either write a ‘B’ and move left, or write a ‘C’, move right and switch to state 5”.
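The difference shows up directly in the shape of the transition table. In the sketch below, each entry maps a (state, symbol) pair to a set of possible actions. The tuple layout (symbol to write, head move, next state), the assumption that “move left” stays in state 2, and the specific alternatives in the nondeterministic table are all illustrative choices.

```python
# Each table maps (state, symbol) -> set of (write, move, next_state) actions.
deterministic = {
    (2, "A"): {("B", "L", 2)},
}
nondeterministic = {
    (2, "A"): {("B", "L", 2), ("C", "R", 5)},
}


def is_deterministic(table):
    """A table is deterministic if no entry offers a choice of actions."""
    return all(len(actions) <= 1 for actions in table.values())


print(is_deterministic(deterministic))      # True
print(is_deterministic(nondeterministic))   # False
```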

 

In an NTM, the state of the computation is fixed, and it can be proved that a state machine model of computation has bounded nondeterminism (i.e. it halts after a bounded number of steps, and hence has a bounded number of possible configurations).

With the Actor mod­el, you have a con­fig­u­ra­tion-based mod­el of com­pu­ta­tion (based on mes­sages that are received, which are dynam­ic as opposed to fixed), which is more pow­er­ful because it incor­po­rates com­mu­ni­ca­tion. This con­fig­u­ra­tion-based mod­el gives you inde­ter­min­ism, which is what hap­pens when things work them­selves out.

 

Contrary to popular belief, the Turing machine is not the only thing that defines computability. Interactions with an open environment certainly change what computation means, and are the difference between nondeterminism and indeterminism.

 

Syn­chro­niza­tion

Synchronization is built into the Actor model because an actor receives messages one at a time.

In a checking account example where many parties can deposit into or withdraw from the account, suppose the current balance is £2, and one person tries to withdraw £7 whilst another tries to deposit £8. The outcome is indeterminate: it depends on the order in which the messages are received by the actor.
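The two possible outcomes can be worked through in a short sketch. It assumes, as the notes do not say explicitly, that a withdrawal which would overdraw the account is rejected; `run_account` is a hypothetical helper.

```python
from itertools import permutations


def run_account(balance, messages):
    """Process messages one at a time; reject withdrawals that would overdraw
    (the no-overdraft rule is an assumption of this sketch)."""
    rejected = []
    for kind, amount in messages:
        if kind == "deposit":
            balance += amount
        elif kind == "withdraw":
            if amount <= balance:
                balance -= amount
            else:
                rejected.append((kind, amount))
    return balance, rejected


messages = [("withdraw", 7), ("deposit", 8)]
outcomes = {order: run_account(2, order) for order in permutations(messages)}
for order, (balance, rejected) in outcomes.items():
    print(order, "->", balance, rejected)
```

If the withdrawal arrives first it is rejected and the balance ends at £10; if the deposit arrives first both succeed and the balance ends at £3. Which order actually occurs is not decided by the account’s own logic.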

This is where the arbiters come in.

 

Arbiter

“The arbiter decides, and there’s nothing before the arbiter decides” – Carl Hewitt

Giv­en an arbiter, you can have mul­ti­ple inputs (e.g. I0 and I1) into the arbiter at the same time, but only one of the pos­si­ble out­comes (e.g. O0 or O1) will come out on the oth­er end.

The arbiter is what gives us indeterminism. It can take an arbitrary amount of time to come to a decision (with the probability of indecision decreasing exponentially over time), but it must decide.
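A loose software analogy, and no more than that, is two threads racing for a lock. The `Arbiter` class below is hypothetical, and a mutex is only a stand-in for the arbiter described in the talk. The inputs are named I0 and I1 as above; instead of separate outputs, the sketch simply records which input won.

```python
import threading


class Arbiter:
    """Software stand-in for an arbiter: many requests in, exactly one winner."""

    def __init__(self):
        self._lock = threading.Lock()
        self.winner = None

    def request(self, name):
        # The lock serialises simultaneous requests; the first one in wins.
        with self._lock:
            if self.winner is None:
                self.winner = name


arbiter = Arbiter()
threads = [threading.Thread(target=arbiter.request, args=(name,))
           for name in ("I0", "I1")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(arbiter.winner)   # "I0" or "I1"; the program itself does not say which
```

Nothing in the program text determines the winner. That is left to the scheduler, but once both threads have finished, exactly one input has been chosen.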

 

Imple­men­ta­tion

There’s an art to the imple­men­ta­tion of the Actor mod­el in pro­gram­ming lan­guages and there are many ways you can make mis­takes in the imple­men­ta­tion – by vio­lat­ing some of the fun­da­men­tal prin­ci­ples or by not tak­ing them seri­ous­ly.

 

The Actor mod­el is not the same as tail recur­sive calls (because it can change the state for the next mes­sage received) or event loops (because of the opti­miza­tions).

 

 

I hope I’ve done the talk justice with these short notes, and that you find them useful as you no doubt watch the talk over and over, as I have. Before we go, I’d like to leave you with yet another great quote:

“We don’t know much, and some of it is wrong” – Carl Hewitt