Understanding homoiconicity through Clojure macros

Having been primarily involved with .NET languages in my career so far, homoiconicity was a new idea to me when I first encountered it in Clojure (and also later in Elixir).

If you look it up on Wikipedia, you’ll find the usual wordy definition that vaguely makes sense.

“…homoiconicity is a property of some programming languages in which the program structure is similar to its syntax, and therefore the program’s internal representation can be inferred by reading the text’s layout…”

so, in short: code is data, data is code.

 

quote & eval

Take a simple example:

image

this line of code performs a temporary binding (binds x to the value 1) and then increments x to give the return value of 2.

So it’s code that can be executed and yields some data.

But equally, it can also be thought of as a list with three elements:

  • a symbol named let;
  • a vector with two elements – a symbol named x, and an integer;
  • a list with two elements – a symbol named inc, and a symbol named x.
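The screenshot isn't reproduced here, but going by the description the example is presumably (let [x 1] (inc x)). Here is a minimal sketch, assuming that form, that checks the three-element structure at a REPL. The name form is only for illustration, and the ' shorthand used to hold the code as data is the quote covered next.

```clojure
;; Assumed form, reconstructed from the description above.
(def form '(let [x 1] (inc x)))

(count form)             ;=> 3
(symbol? (first form))   ;=> true, the symbol let
(vector? (second form))  ;=> true, the vector [x 1]
(list? (nth form 2))     ;=> true, the list (inc x)
(eval form)              ;=> 2
```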

quote

You can use quote (strictly a special form rather than a function) to take some Clojure code and, instead of evaluating it, return it as data.

image
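In case the screenshot doesn't come through, here is a small sketch of quote in action. The name quoted is just for illustration.

```clojure
(+ 1 2)            ;=> 3, evaluated as usual
(quote (+ 1 2))    ;=> (+ 1 2), returned as a list instead
'(+ 1 2)           ;=> (+ 1 2), ' is reader shorthand for quote

;; The quoted code is ordinary data, so the usual sequence functions work on it.
(def quoted (quote (let [x 1] (inc x))))
(first quoted)     ;=> let
(second quoted)    ;=> [x 1]
(last quoted)      ;=> (inc x)
```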

sidebar: any F# developer reading this will notice the similarity to code quotations in F#, although the representation you get back is not nearly as easy to manipulate, nor is there a built-in way to evaluate it. That said, you do have some options, including:

 

eval

On the flip side, you have the eval function. It takes data and executes it as code.

image

After you have captured some executable code as data, you can also manipulate it before executing the transformed code. This is where macros come in.
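To make the "manipulate it, then execute it" idea concrete, here is a rough sketch that swaps inc for dec in the captured code before evaluating it. The names form and transformed are made up for this example.

```clojure
(def form '(let [x 1] (inc x)))
(eval form)          ;=> 2

;; Rebuild the list by hand, swapping inc for dec in the body.
(def transformed
  (list (first form)
        (second form)
        (cons 'dec (rest (nth form 2)))))

transformed          ;=> (let [x 1] (dec x))
(eval transformed)   ;=> 0
```

A macro does essentially this kind of list surgery for you, except that it runs when the code is compiled.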

 

Macros

clojure.test, for instance, is a unit test framework written with macros. You can do a simple assertion using the ‘is’ macro:

image

and contrast this with the error message we get from, say, NUnit.

image

Isn’t it great that the failing expressions are printed out so you can straight away see what was wrong? It’s much more informative than the generic message we get from NUnit, which forces us to dig around and figure out which line of the test failed.
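If you want to see this for yourself, a deliberately failing assertion at the REPL is enough. This is only a sketch (the assertion is my own, not necessarily the one from the original screenshot), and the exact layout of the report may vary between Clojure versions.

```clojure
(require '[clojure.test :refer [is]])

;; Fails on purpose: `is` returns false and prints a FAIL report that
;; includes the expected form, (= 2 (inc 2)), and what it actually
;; evaluated to, (not (= 2 3)).
(is (= 2 (inc 2)))
```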

Update 23/05/2015:


As Vasily pointed out in the comments, there is an assertion library for F# called Unquote which uses F# code quotations (mentioned above) and produces user-friendly error messages similar to clojure.test. It goes to show that, even without macros, just being able to easily capture code as data structures in your language can enable many use cases; Phil Trelford’s Foq mocking library is another good example.

 

Building an assert-equals macro

As a process of discovery, let’s see how this can be done via macros.

 

Version 1

To start off, we will define the simplest macro that might work:

image

oops, so that last case didn’t work.

That’s because the actual and expected values passed into the macro are code, not the integer value 2.

image
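The original screenshots aren't available, so what follows is a guess at what this first attempt might have looked like rather than the exact code. The names assert-equals-v1 and show-args are hypothetical.

```clojure
;; Hypothetical first attempt: compare the two arguments directly.
(defmacro assert-equals-v1 [actual expected]
  (= actual expected))

(assert-equals-v1 2 2)        ;=> true
(assert-equals-v1 (inc 1) 2)  ;=> false, the list (inc 1) is not the number 2

;; A helper macro that hands back its arguments exactly as it received them.
(defmacro show-args [actual expected]
  (list 'quote [actual expected]))

(show-args (inc 1) 2)         ;=> [(inc 1) 2]
```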

 

Version 2

So what if we just throw an eval in there?

image

that works, right? right?

Well, not quite.

Instead of manipulating the data representing our code, we have evaluated it at compile time (macros run at compile time).

You can verify this by using macroexpand:

image

so you can see that our macro has transformed the input code into the boolean value true and returned it as code.
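Again as a sketch only (the name assert-equals-v2 is my own, and the original may have differed in detail), the eval-based version and its expansion would look something like this:

```clojure
;; Hypothetical second attempt: evaluate both arguments inside the macro.
(defmacro assert-equals-v2 [actual expected]
  (= (eval actual) (eval expected)))

(assert-equals-v2 (inc 1) 2)                 ;=> true

;; The comparison has already happened by the time the macro returns,
;; so all that is left in the expanded code is the answer.
(macroexpand '(assert-equals-v2 (inc 1) 2))  ;=> true
```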

 

Version 3

What we ought to do is return the code we want to execute as data, which we know how to do already – using quote. In the returned code, we also need to raise an error when the assertion fails.

So starting with the code we want to execute given that:

image

well, we’d want to:

  • compare the evaluated values of actual and expected and throw an AssertionError if they are not equal
  • display the actual expression (inc 1) and expected expression (+ 0 1) in the error message
  • display the evaluated value for actual, i.e. 2

so something along the lines of the following, perhaps?

image
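Since the screenshot is missing, here is one way that hand-written target code could look. The exact wording of the message is my assumption (only the "FAIL in" prefix comes from the text further down), and the wrapper function check-by-hand exists purely so the snippet can be run.

```clojure
;; Hand-written version of what we would like the macro to generate
;; for the actual expression (inc 1) and the expected expression (+ 0 1).
(defn check-by-hand []
  (let [actual-value (inc 1)]
    (when-not (= actual-value (+ 0 1))
      (throw (AssertionError.
               (format "FAIL in %s\n expected: %s\n actual: %s"
                       '(inc 1) '(+ 0 1) actual-value))))))

;; Calling (check-by-hand) throws an AssertionError, because 2 is not 1.
```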

Now that we know our endgame we can work backwards to define our macro:

image

See the resemblance? The important thing to note here is that we have quoted the whole let block (via the ` syntax-quote shorthand). But in order to reference the actual and expected expressions and return them as they are, i.e. (inc 1) and (+ 0 1), we had to selectively unquote certain things using the ~ operator.

You can expand the macro and see that it’s semantically identical to the code that we wanted to output:

image
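For reference, here is a sketch of a macro along those lines, reconstructed from the description in this post rather than copied from the original screenshot. The name assert-equals-v3 and the message format are assumptions.

```clojure
(defmacro assert-equals-v3 [actual expected]
  `(let [~'actual-value ~actual]
     (when-not (= ~'actual-value ~expected)
       (throw (AssertionError.
                (format "FAIL in %s\n expected: %s\n actual: %s"
                        '~actual '~expected ~'actual-value))))))

;; Expanding it shows a let block that binds actual-value to (inc 1),
;; with core functions namespace-qualified by the syntax quote.
(macroexpand-1 '(assert-equals-v3 (inc 1) (+ 0 1)))

(assert-equals-v3 (inc 1) 2)   ;=> nil, the assertion holds
```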

Before we move on, you might be wondering about some of the quote-unquote, unquote-quote actions going on here, so let’s spend a few moments to delve into them.

Outputting the actual expression to be evaluated

Remember, the actual and expected arguments in our defmacro block are the quoted versions of (inc 1) and (+ 0 1).

We want to evaluate actual only once for efficiency, and in case it causes side effects, which is why we need to evaluate it and bind the result to a symbol.

In order to generate the output code (let [actual-value (inc 1)] …) which will evaluate (inc 1) at runtime, we need to reference the actual expression in its quoted form, hence ~actual.

Note the difference in the expanded code if we don’t unquote actual.

image

without the ~, the generated code would look for something called actual (which the syntax quote namespace-qualifies, e.g. user/actual), and that fails because it doesn’t exist.
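A cut-down sketch of the two variants makes the difference easy to see. The macro names with-unquote and without-unquote are made up for this comparison, and the namespace shown in the comments assumes you are at a REPL in user.

```clojure
(defmacro with-unquote [actual]
  `(let [~'actual-value ~actual] ~'actual-value))

(defmacro without-unquote [actual]
  `(let [~'actual-value actual] ~'actual-value))

(macroexpand-1 '(with-unquote (inc 1)))
;; (clojure.core/let [actual-value (inc 1)] actual-value)

(macroexpand-1 '(without-unquote (inc 1)))
;; (clojure.core/let [actual-value user/actual] actual-value)

(with-unquote (inc 1))   ;=> 2
```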

Outputting the actual-value symbol

In order to output the actual-value symbol in the let binding we had to write ~'actual-value, that is, (unquote (quote actual-value)).

image

I know, right!? Took me a while to get my head around it too.

Q. Can we not just write `(let [actual-value ~actual] …) ?

A. No, because it’ll translate to (let [user/actual-value (inc 1)]…) which is not a valid let binding.

Q. Ok, how about ~actual-value?

A. No, because the macro won’t compile as we’ll be looking for a non-existent local variable actual-value inside the scope of defmacro.

Q. Ok.. or 'actual-value?

A. No, because it’ll translate to (let [(quote user/actual-value) (inc 1)]…) which fails because a quoted form is not a valid binding target.

Q. So how does ~'actual-value work exactly?

A. The following:

  1. (quote actual-value) to capture the symbol actual-value
  2. unquote the symbol so that it appears as it is in the output code
Outputting the actual and expected expressions

Finally, when formulating the error message, we also saw '~actual and '~expected.

Here is the expanded code with and without the quote.

image

See the difference?

Without the quote, the generated code will have evaluated (inc 1) and printed FAIL in 2.

With the quote, it’d have printed FAIL in (inc 1) instead, which is what we want.
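Here is a stripped-down sketch that isolates just this part of the message. The two macro names are hypothetical.

```clojure
(defmacro describe-quoted [actual]
  `(str "FAIL in " '~actual))

(defmacro describe-unquoted [actual]
  `(str "FAIL in " ~actual))

(describe-quoted (inc 1))     ;=> "FAIL in (inc 1)"
(describe-unquoted (inc 1))   ;=> "FAIL in 2"
```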

Rule of thumb

  • to capture a symbol, use ~'symbol-name
  • to reference an argument to the macro and generate code that will be evaluated at runtime, use ~arg-name
  • to reference an argument to the macro and generate code that quotes it at runtime, use '~arg-name

 

Finally, let’s test out our new macro.

image

Sweet! So that’s it?

Almost.

There’s a minor problem with our macro here – it’s not safe from name collisions on actual-value.

image
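I can't say exactly what the original screenshot showed, but one way to trigger the collision, assuming a macro shaped like the sketch of version 3 (assert-equals-v3 is my name for it), is to have the caller use actual-value as a local of their own:

```clojure
(defmacro assert-equals-v3 [actual expected]
  `(let [~'actual-value ~actual]
     (when-not (= ~'actual-value ~expected)
       (throw (AssertionError.
                (format "FAIL in %s\n expected: %s\n actual: %s"
                        '~actual '~expected ~'actual-value))))))

;; The caller's actual-value (10) is shadowed by the macro's own binding,
;; so the generated code ends up comparing 2 with 2.
(def collision-result
  (let [actual-value 10]
    (assert-equals-v3 (inc 1) actual-value)))

collision-result   ;=> nil, no error even though 2 is not 10
```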

 

Version 4

If you see # at the end of a symbol inside a syntax-quoted form, it is used to automatically generate a new symbol with a unique name (an auto-gensym). This is useful in macros as it keeps the symbols declared in macros from leaking out.

So instead of using ~'actual-value in the let binding we might do the following instead:

image

When expanded, you can see the let binding is using an automatically generated symbol, actual-value__16087__auto__:

image

Not only is this version safer, it’s also more readable without the mind-bending (unquote (quote actual-value)) business!
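Putting it together, a sketch of this last version might look like the following. As before it is a reconstruction under my own assumptions about the message format, not the exact code from the screenshot.

```clojure
(defmacro assert-equals [actual expected]
  `(let [actual-value# ~actual]
     (when-not (= actual-value# ~expected)
       (throw (AssertionError.
                (format "FAIL in %s\n expected: %s\n actual: %s"
                        '~actual '~expected actual-value#))))))

(assert-equals (inc 1) 2)   ;=> nil

;; The caller's own actual-value is no longer shadowed, so comparing
;; (inc 1) against it now fails as it should:
;; (let [actual-value 10]
;;   (assert-equals (inc 1) actual-value))   ; throws AssertionError
```

Every occurrence of actual-value# within the same syntax-quoted form refers to the same generated symbol, which is why the binding and its two uses still line up.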

 

So there, a quick(-ish) introduction to homoiconicity and Clojure macros. Macros are a powerful tool to have in one’s toolbox, and allow you to extend the language in a very natural way as clojure.test does. I hope you find the idea interesting and that I have done the topic justice and explained it clearly enough.

Feel free to let me know in the comments if anything’s not clear.

 

Links