Understanding homoiconicity through Clojure macros

Having been primarily involved with .NET languages in my career so far, homoiconicity was a new idea to me when I first encountered it in Clojure (and also later in Elixir).

If you look it up on Wikipedia, you’ll find the usual wordy definition that vaguely makes sense.

“…homoiconicity is a property of some programming languages in which the program structure is similar to its syntax, and therefore the program’s internal representation can be inferred by reading the text’s layout…”

So, in short: code is data, data is code.

 

quote & eval

Take a simple example:

image

This line of code performs a temporary binding (binds x to the value 1) and then increments x to give the return value of 2.

So it’s code that can be executed and yields some data.

But equally, it can also be thought of as a list with three elements:

  • a symbol named let;
  • a vector with two elements – a symbol named x, and an integer;
  • a list with two elements – a symbol named inc, and a symbol named x.
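The screenshot for this example is not reproduced here. As a minimal sketch, assuming the form in question was (let [x 1] (inc x)), which matches the description above, you can read the text into a data structure with read-string and poke at the three elements:

```clojure
;; Assumption: this is the form from the missing screenshot.
;; read-string turns the text into ordinary Clojure data without running it.
(def form (read-string "(let [x 1] (inc x))"))

(count form)            ; 3 elements
(symbol? (first form))  ; true, the symbol let
(vector? (second form)) ; true, the vector [x 1]
(list? (nth form 2))    ; true, the list (inc x)
```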

quote

You can use quote (a special form, with ' as its reader shorthand) to take some Clojure code and, instead of evaluating it, return it as data.

image
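A small sketch of quoting, again assuming the (let [x 1] (inc x)) form from the description earlier. The names quoted and quoted-shorthand are only for illustration:

```clojure
;; quote returns the form itself instead of evaluating it.
(def quoted (quote (let [x 1] (inc x))))

;; The ' reader shorthand produces the same data structure.
(def quoted-shorthand '(let [x 1] (inc x)))

(= quoted quoted-shorthand) ; true
(first quoted)              ; the symbol let, not something executable
```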

sidebar: any F# developer reading this will notice the similarity to code quotations in F#, although the representation you get back is not nearly as easy to manipulate nor is there a built-in way to evaluate it. That said, you do have some options, including:

 

eval

On the flip side, you have the eval function. It takes data and executes it as code.

image

After you have captured some executable code as data, you can also manipulate it before executing the transformed code. This is where macros come in.
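To make that concrete, here is a hedged sketch of capturing code, rewriting it, and then evaluating it. The form is the assumed (let [x 1] (inc x)) example, and swapping inc for dec is just an arbitrary transformation picked for this illustration:

```clojure
(require '[clojure.walk :as walk])

;; Assumption: the same let form discussed in the text.
(def form '(let [x 1] (inc x)))

;; eval takes the data and runs it as code.
(def result (eval form)) ; 2

;; Because the code is just a list, we can rewrite it before running it.
;; Here every inc symbol is replaced with dec.
(def transformed (walk/postwalk-replace {'inc 'dec} form))
(def transformed-result (eval transformed)) ; 0
```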

 

Macros

clojure.test for instance, is a unit test framework written with macros. You can do a simple assertion using the ‘is’ macro:

image

and contrast this with the error message we get from, say, NUnit.

image

Isn’t it great that the failing expressions are printed out so you can straight away see what was wrong? It’s much more informative than the generic message we get from NUnit, which forces us to dig around and figure out which line of the test failed.
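The original screenshots are not available, so the following is only a sketch of what using the is macro looks like at the REPL. The specific expressions are my own assumptions, not the ones from the post:

```clojure
(require '[clojure.test :refer [is]])

;; A passing assertion returns true.
(def passing (is (= 2 (inc 1))))

;; A failing assertion returns false and prints a report that
;; includes the original expression, here (= 3 (inc 1)).
(def failing (is (= 3 (inc 1))))
```

The is macro can print the failing expression only because it receives (= 3 (inc 1)) as data rather than as an already computed boolean.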

Update 23/05/2015:

As Vasily pointed out in the comments, there is an assertion library for F# called Unquote which uses F# code quotations (mentioned above) and produces user-friendly error messages similar to clojure.test. It goes to show that, even without macros, just being able to easily capture code as data structures in your language can enable many use cases. Phil Trelford’s Foq mocking library is another good example.

 

Building an assert-equals macro

As a process of discovery, let’s see how this can be done via macros.

 

Version 1

To start off, we will define the simplest macro that might work:

image

Oops, so that last case didn’t work.

That’s because the actual and expected values passed into the macro are code, not the integer value 2.

image
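Since the screenshots are missing, here is a hypothetical reconstruction of what such a first attempt might look like. The -v1 suffix and the test inputs are assumptions made for this sketch:

```clojure
;; Hypothetical first attempt; the name assert-equals-v1 is only for this sketch.
(defmacro assert-equals-v1 [actual expected]
  ;; actual and expected arrive as unevaluated forms, so this compares
  ;; the code itself, not the values the code would produce.
  (= actual expected))

(def literals  (assert-equals-v1 2 2))       ; true, 2 equals 2
(def with-call (assert-equals-v1 (inc 1) 2)) ; false, the list (inc 1) is not 2
```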

 

Version 2

So what if we just throw an eval in there?

image

that works, right? right?

Well, not quite.

Instead of manipulating the data representing our code, we have evaluated it at compile time (macros run at compile time).

You can verify this by using macroexpand:

image

So you can see that our macro has transformed the input code into the boolean value true and returned it as code.
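A sketch of this eval-based version, assuming it simply wraps both arguments in eval (the -v2 suffix is only for illustration):

```clojure
;; Hypothetical second attempt; the name assert-equals-v2 is only for this sketch.
(defmacro assert-equals-v2 [actual expected]
  ;; eval runs while the macro is being expanded,
  ;; so the comparison never makes it into the output code.
  (= (eval actual) (eval expected)))

;; The expansion is just the boolean, with no trace of (inc 1) left.
(def expansion (macroexpand '(assert-equals-v2 (inc 1) 2))) ; true
```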

 

Version 3

What we ought to do is return the code we want to execute as data, which we know how to do already – by quoting it. In the returned code, we also need to throw an error when the assertion fails.

So starting with the code we want to execute given that:

image

Well, we’d want to:

  • compare the evaluated values of actual and expected and throw an AssertionError if they are not equal
  • display the actual expression (inc 1)  and expected expression (+ 0 1) in the error message
  • display the evaluated value for actual – 2

So something along the lines of the following, perhaps?

image
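With the screenshot missing, here is my best guess at the hand-written target code for (assert-equals (inc 1) (+ 0 1)). The exact message layout is an assumption, and the code is wrapped in a function (target-code, a made-up name) only so that loading it does not throw straight away:

```clojure
;; Hand-written version of what we would like the macro to generate.
(defn target-code []
  (let [actual-value (inc 1)]                 ; evaluate actual once
    (when-not (= actual-value (+ 0 1))        ; compare with expected
      (throw (AssertionError.
              (str "FAIL in " '(inc 1)        ; the actual expression, quoted
                   "\n expected: " '(+ 0 1)   ; the expected expression, quoted
                   "\n   actual: " actual-value))))))
```

Calling (target-code) throws an AssertionError because 2 is not equal to 1.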

Now that we know our endgame we can work backwards to define our macro:

image

See the resemblance? The important thing to note here is that we have quoted the whole let block (via the ` syntax-quote shorthand). But in order to reference the actual and expected expressions and return them as they are, i.e. (inc 1) and (+ 0 1), we had to selectively unquote certain things using the ~ operator.

You can expand the macro and see that it’s semantically identical to the code that we wanted to output:

image
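As a sketch (the original screenshots are not available), a macro along these lines could look like the following. The -v3 suffix and the message layout are assumptions:

```clojure
;; Hypothetical third attempt; the name assert-equals-v3 is only for this sketch.
(defmacro assert-equals-v3 [actual expected]
  `(let [~'actual-value ~actual]
     (when-not (= ~'actual-value ~expected)
       (throw (AssertionError.
               (str "FAIL in " '~actual
                    "\n expected: " '~expected
                    "\n   actual: " ~'actual-value))))))

;; Expanding once shows the generated let form.
(def expansion (macroexpand-1 '(assert-equals-v3 (inc 1) (+ 0 1))))
(second expansion) ; [actual-value (inc 1)]
```

A passing call such as (assert-equals-v3 (inc 1) 2) returns nil, while (assert-equals-v3 (inc 1) (+ 0 1)) throws an AssertionError whose message starts with FAIL in (inc 1).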

Before we move on, you might be wondering about some of the quote-unquote, unquote-quote actions going on here, so let’s spend a few moments to delve into them.

Outputting the actual expression to be evaluated

Remember, the actual and expected arguments in our defmacro block are the quoted versions of (inc 1) and (+ 0 1).

We want to evaluate actual only once, both for efficiency and in case it causes side effects, which is why we need to evaluate it and bind the result to a symbol.

In order to generate the output code (let [actual-value (inc 1)] …) which will evaluate (inc 1) at runtime, we need to reference the actual expression in its quoted form, hence ~actual.

Note the difference in the expanded code if we don’t unquote actual.

image

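Without the ~, the generated code would refer to a symbol called actual instead of the expression (inc 1). Syntax-quote also namespace-qualifies that symbol (e.g. user/actual), so it fails to resolve because nothing with that name exists where the macro is used.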
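You can see the difference without defining a macro at all. In this sketch a local named actual stands in for the macro argument, and expansions is a made-up name:

```clojure
;; A local called actual plays the role of the macro argument.
(def expansions
  (let [actual '(inc 1)]
    {;; ~actual splices in the form held by the local: (inc 1)
     :with-unquote    `(let [~'actual-value ~actual] ~'actual-value)
     ;; plain actual is just a symbol, namespace-qualified by syntax-quote
     :without-unquote `(let [~'actual-value actual] ~'actual-value)}))

(eval (:with-unquote expansions)) ; 2
;; Evaluating the :without-unquote form fails, because the qualified
;; symbol it refers to does not resolve to anything.
```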

Outputting the actual-value symbol

In order to output the actual-value symbol in the let binding we had to write ~'actual-value, that is, (unquote (quote actual-value)).

image

I know, right!? Took me a while to get my head around it too.

Q. Can we not just write `(let [actual-value ~actual] …) ?

A. No, because syntax-quote namespace-qualifies plain symbols, so it’ll translate to (let [user/actual-value (inc 1)]…) which is not a valid let binding.

Q. Ok, how about ~actual-value?

A. No, because the macro won’t compile as we’ll be looking for a non-existent local variable actual-value inside the scope of defmacro.

Q. Ok.. or 'actual-value?

A. No, because it’ll translate to (let [(quote actual-value) (inc 1)]…), which fails when the generated code is compiled because a quoted form is not a valid binding target.

Q. So how does ~'actual-value work exactly?

A.  The following:

  1. (quote actual-value) to capture the symbol actual-value
  2. unquote the symbol so that it appears as it is in the output code
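The alternatives from the Q&A above can be compared directly at the REPL. This is only a sketch: binding-forms and binding-target are names made up for this example, and the namespace prefix you see depends on where the code is read:

```clojure
(def binding-forms
  {;; plain symbol: syntax-quote namespace-qualifies it
   :plain         `(let [actual-value 1] nil)
   ;; unquote-quote: the symbol comes through untouched
   :unquote-quote `(let [~'actual-value 1] nil)
   ;; quote only: the binding target becomes a (quote ...) form
   :quoted        `(let ['actual-value 1] nil)})
;; The ~actual-value variant is left out because it would not compile here.

;; Pulls out whatever sits in the binding position of a let form.
(defn binding-target [form]
  (first (second form)))

(binding-target (:unquote-quote binding-forms)) ; actual-value
```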

Outputting the actual and expected expressions

Finally, when formulating the error message, we also saw '~actual and '~expected.

Here is the expanded code with and without the quote.

image

See the difference?

Without the quote, the generated code will have evaluated (inc 1) and printed FAIL in 2.

With the quote, it’d have printed FAIL in (inc 1) instead, which is what we want.
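Here is a small sketch of the two variants, with a local named actual standing in for the macro argument (message-forms is a made-up name):

```clojure
(def message-forms
  (let [actual '(inc 1)]
    {;; ~actual alone: the generated code will evaluate (inc 1)
     :without-quote `(str "FAIL in " ~actual)
     ;; '~actual: the generated code will keep (inc 1) as data
     :with-quote    `(str "FAIL in " '~actual)}))

(eval (:without-quote message-forms)) ; "FAIL in 2"
(eval (:with-quote message-forms))    ; "FAIL in (inc 1)"
```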

Rule of thumb

  • to capture a symbol, use ~'symbol-name
  • to reference an argument to the macro and generate code that will be evaluated at runtime, use ~arg-name
  • to reference an argument to the macro and generate code that quotes it at runtime, use '~arg-name

 

Finally, let’s test out our new macro.

image

Sweet! So that’s it?

Almost.

There’s a minor problem with our macro here – it’s not safe from name collisions on actual-value.

image
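The original screenshot is missing, so this is one possible way the collision could show up, using a hypothetical assert-equals-v3 written with ~'actual-value:

```clojure
;; Hypothetical macro using a fixed symbol name for its binding.
(defmacro assert-equals-v3 [actual expected]
  `(let [~'actual-value ~actual]
     (when-not (= ~'actual-value ~expected)
       (throw (AssertionError.
               (str "FAIL in " '~actual
                    "\n expected: " '~expected
                    "\n   actual: " ~'actual-value))))))

;; The caller's local actual-value is 10, but the macro's own let
;; shadows it with 2, so the check compares 2 with 2 and wrongly passes.
(def collision-result
  (let [actual-value 10]
    (assert-equals-v3 (inc 1) actual-value)))
```

Here collision-result is nil, meaning no error was thrown even though 2 is clearly not 10.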

 

Version 4

If you see # at the end of a symbol inside a syntax-quoted form, it is used to automatically generate a new symbol with a unique name (an auto-gensym). This is useful in macros as it keeps the symbols declared in macros from clashing with names in the calling code.

So instead of using ~’actual-value in the let binding we might do the following instead:

image

When expanded, you can see the let binding is using an automatically generated symbol, actual-value__16087__auto__:

image
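A sketch of how this version might look, assuming the same message layout as before (the -v4 suffix is only for illustration, and the number in the generated symbol will differ on your machine):

```clojure
;; Hypothetical fourth attempt using an auto-gensym for the binding.
(defmacro assert-equals-v4 [actual expected]
  `(let [actual-value# ~actual]
     (when-not (= actual-value# ~expected)
       (throw (AssertionError.
               (str "FAIL in " '~actual
                    "\n expected: " '~expected
                    "\n   actual: " actual-value#))))))

;; The binding now uses a generated name like actual-value__NNNN__auto__.
(def expansion (macroexpand-1 '(assert-equals-v4 (inc 1) 2)))
(def generated-symbol (first (second expansion)))

;; A caller-side local called actual-value is no longer shadowed,
;; so this comparison of 2 with 10 correctly fails.
(defn collision-check []
  (let [actual-value 10]
    (assert-equals-v4 (inc 1) actual-value)))
```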

Not only is this version safer, it’s also more readable without the mind-bending (unquote (quote actual-value)) business!

 

So there, a quick(-ish) introduction to homoiconicity and Clojure macros. Macros are a powerful tool to have in one’s toolbox, and allow you to extend the language in a very natural way, as clojure.test does. I hope you find the idea interesting and that I have done the topic justice and explained it clearly enough.

Feel free to let me know in the comments if anything’s not clear.

 

2 thoughts on “Understanding homoiconicity through Clojure macros”

  1. Vasily Kirichenko

    I think it’s not good to compare clojure’s `is` macro and NUnit assertions. It’s better to compare to Unquote F# library which produces error messages similar to clojure’s.

  2. Cheers for that. I wasn’t trying to cast a negative light on .NET-based test libraries; the same applies to most test frameworks I have seen. It was merely to point out what you can do if you are able to easily capture code as data, which of course also applies to F# code quotations.
