Seven ineffective coding habits many F# programmers don’t have

This post is part of the F# Advent Cal­en­dar in Eng­lish 2014 project. Check out all the oth­er great posts there!

Spe­cial thanks to Sergey Tihon for orga­niz­ing this.

A good coding habit is an incredibly powerful tool: it allows us to make good decisions with minimal cognitive effort and can be the difference between being a good programmer and a great one.

“I’m not a great programmer; I’m just a good programmer with great habits.”

- Kent Beck

A bad habit, on the oth­er hand, can for­ev­er con­demn us to repeat the same mis­takes and is dif­fi­cult to cor­rect.

 

I attend­ed Kevlin Henney’s “Sev­en Inef­fec­tive Cod­ing Habits of Many Pro­gram­mers” talk at the recent Build­Stuff con­fer­ence. As I sat in the audi­ence and reflect­ed on the times I exhib­it­ed many of the bad habits he iden­ti­fied and chal­lenged, some­thing hit me – I was cod­ing in C# in pret­ty much every case even though I also spend a lot of time in F#.

Am I just a bad C# devel­op­er who’s bet­ter at F#, or did the lan­guage I use make a big­ger dif­fer­ence than I real­ized at the time?

With this ques­tion in mind, I revis­it­ed the sev­en inef­fec­tive cod­ing habits that Kevlin iden­ti­fied and pon­dered why they are habits that many F# pro­gram­mers don’t have.

 

Noisy Code

Sig­nal-to-noise ratio some­times refers to the ratio of use­ful infor­ma­tion to false or irrel­e­vant data in a con­ver­sa­tion or exchange. Noisy code requires greater cog­ni­tive effort on the reader’s part, and is more expen­sive to main­tain.

We have a lot of habits which – without us realising it – add noise to our code. But often the language we use has a big impact on how much noise is added to our code, and the C-style syntax is a big culprit here.

Objective comparisons between languages are usually difficult because comparing different language implementations across different projects introduces too many other variables. Comparing different language implementations of a small project is achievable but cannot answer how well the solutions scale up to bigger projects.

For­tu­nate­ly, Simon Cousins was able to pro­vide a com­pre­hen­sive analy­sis of two code-bases writ­ten in dif­fer­ent lan­guages – C# and F# – imple­ment­ing the same appli­ca­tion.

The appli­ca­tion was non-triv­ial (~350k lines of C# code) and the num­bers speak for them­selves:

[images: charts from Simon Cousins’s comparison of the C# and F# code bases]

Not only is the F# imple­men­ta­tion short­er and gen­er­al­ly more use­ful (i.e. high­er sig­nal-to-noise ratio), accord­ing to Simon’s post it also took a frac­tion of the man-hours to pro­vide a more com­plete imple­men­ta­tion of the require­ments:

“The C# project took five years and peaked at ~8 devs. It never fully implemented all of the contracts.

The F# project took less than a year and peaked at three devs (only one had pri­or expe­ri­ence with F#). All of the con­tracts were ful­ly imple­ment­ed.”

In summary, by removing the need for { } as a core part of the language structure and not having nulls, F# removes a lot of the noise that is usually found in C# code.

 

Visual Dishonesty

“…a clean design is one that supports visual thinking so people can meet their informational needs with a minimum of conscious effort.”

- Daniel Hig­gin­both­am

When it comes to code, visual honesty is about laying out your code so that its visual relationships are obvious and accurate.

For exam­ple, when you put things above each oth­er then it sig­ni­fies hier­ar­chy. This is impor­tant, because you’re show­ing your read­er how to process the infor­ma­tion you’re giv­ing them. How­ev­er, you may not be aware that’s what you’re doing, which is why we end up with prob­lems.

“You convey information by the way you arrange a design’s elements in relation to each other. This information is understood immediately, if not consciously, by the people viewing your designs. This is great if the visual relationships are obvious and accurate, but if they’re not, your audience is going to get confused. They’ll have to examine your work carefully, going back and forth between the different parts to make sure they understand.”

- Daniel Hig­gin­both­am

Take the simple matter of how nested method calls are laid out in C#: they betray everything we have been taught about reading, because the order in which information needs to be processed has been reversed.


image image

Fortunately, F# introduced the pipeline operator |> which allows us to restore visual honesty in the way we lay out nested function calls.

image
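As a rough sketch of the difference (the data and the individual steps here are made up purely for illustration, they are not the code from the slides), compare the same computation written as nested calls and as a pipeline:

```fsharp
// Hypothetical data and steps, chosen only to illustrate the layout.
let numbers = [ 1 .. 10 ]

// Nested calls: to follow the data you have to read from the inside out.
let nested =
    List.sum (List.map (fun x -> x * x) (List.filter (fun x -> x % 2 = 0) numbers))

// Pipeline: the steps appear top to bottom, in the order they are applied.
let piped =
    numbers
    |> List.filter (fun x -> x % 2 = 0)   // keep the even numbers
    |> List.map (fun x -> x * x)          // square each of them
    |> List.sum                           // add the squares together

printfn "%d %d" nested piped
```

Both expressions compute the same value; only the second one lays the steps out in the order you would describe them out loud.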

In his talk Kevlin also touched on the place­ment of { } and how it affects read­abil­i­ty, using a rather sim­ple tech­nique:

image  image

and by doing so, it reveals inter­est­ing prop­er­ties about the struc­ture of the above code which we might not have noticed:

  1. we can’t tell where the argu­ment list ends and method body starts
  2. we can’t tell where the if con­di­tion ends and where the if body starts

These tell us that even though we are aligning our code, its structure and hierarchy are still not immediately clear without the aid of the curly braces. Remember, if the visual relationship between the elements is not accurate, it’ll cause confusion for your readers and they need to examine your code carefully to ensure they understand it correctly. In other words, you create additional cognitive burden on your readers when the layout of your code does not match the program structure.

Now con­trast with the fol­low­ing:

image image

where the structure and hierarchy of the code are much more evident. So it turns out that the placement style of { } is not just a matter of personal preference: it plays an important role in conveying the structure of your code, and there’s a right way to do it.

“It turns out that style matters in programming for the same reason that it matters in writing.

It makes for bet­ter read­ing.”

- Dou­glas Crock­ford

In hind­sight this seems obvi­ous, but why do we still get it wrong? How can so many of us miss some­thing so obvi­ous?

I think part of the problem is that you have two competing rules for structuring your code in C-style languages – one for the compiler and one for humans. { and } are used to convey the structure of your code to the compiler, but to convey structural information to humans, you use both { } and indentation.

Couple this with the eagerness to superficially reduce the line count or adhere to guidelines such as “methods shouldn’t be more than 60 lines long”, and we have the perfect storm that results in us sacrificing readability in the name of readability.

 

So what if you use whitespace for conveying structural information to both the compiler and humans? Then you remove the ambiguity and people can stop fighting about where to place their curly braces!

The above exam­ple can be writ­ten in F# as the fol­low­ing, using con­ven­tions in the F# com­mu­ni­ty:

image  image

notice how the code is not only much short­er, but also struc­tural­ly very clear.

 

In summary, F#’s pipes allow you to restore visual honesty with regards to the way nested function calls are arranged, so that the flow of information matches the way we read. In addition, whitespace provides a consistent way to depict hierarchy information to both the compiler and humans. It removes the need to argue over { } placement strategies whilst making the structure of your code clear to see at the same time.

 

Lego naming

Naming is hard, and as Kevlin points out, we so often resort to lego naming by gluing common words such as ‘create’, ‘process’, ‘validate’, ‘factory’ together in an attempt to create meaning.

This is not nam­ing, it is labelling.

Adding more words is not the same as adding mean­ing. In fact, more often than not it can have the oppo­site effect of dilut­ing the mean­ing of the thing we’re try­ing to name. This is how we end up with gems such as con­troller­Fac­to­ry­Fac­to­ry, where the mean­ing of the whole is less than the sum of its parts.

image

 

Nam­ing is hard, and hav­ing to give names to every­thing — every class, method and vari­able — makes it even hard­er. In fact, try­ing to give every­thing a mean­ing­ful name is so hard, that even­tu­al­ly most of us sim­ply stop car­ing, and lego nam­ing seems like the lazy way out.

 

In F#, and in functional programming in general, it’s common practice to use anonymous functions, or lambdas. Straight away you remove the need to come up with good names for a whole bunch of things. Often the meaning of these lambdas is created by the higher-order functions that use them, such as Array.map, Array.filter and Array.iter. For example, the function passed into Array.map is used to, surprise surprise, map values in an array!
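To make that concrete, here is a small sketch (the names array and the steps are my own invention for illustration) where none of the lambdas needs a name, because Array.filter, Array.map and Array.iter already say what each one is for:

```fsharp
// Hypothetical sample data.
let names = [| "ada"; "grace"; "alan"; "barbara" |]

let shortNamesUpper =
    names
    |> Array.filter (fun n -> n.Length <= 4)        // the lambda decides what to keep
    |> Array.map (fun n -> n.ToUpperInvariant())    // the lambda says how to transform

// The lambda passed to Array.iter is run once per element.
shortNamesUpper |> Array.iter (fun n -> printfn "%s" n)
```

In a C# class without LINQ, each of those three steps would quite likely have ended up as a named helper method.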

(Before you say it, yes, you can use anony­mous del­e­gates in C# too, espe­cial­ly when work­ing with LINQ. How­ev­er, when you use LINQ you are doing func­tion­al pro­gram­ming, and the use of lamb­das is much more com­mon in F# and oth­er func­tion­al-first lan­guages.)

 

Lego naming can also be the symptom of a failure to identify the right level of abstraction.

Just like nam­ing, com­ing up with the right abstrac­tions can be hard. And when the right abstrac­tion is a piece of pure func­tion­al­i­ty, we don’t have a way to rep­re­sent it effec­tive­ly in OOP (note, I’m not talk­ing about the object-ori­en­ta­tion that Alan Kay had in mind when he coined the term objects).

In sit­u­a­tions like this, the com­mon prac­tice in OOP is to wrap the func­tion­al­i­ty inside a class or inter­face. So you end up with some­thing that pro­vides the desired func­tion­al­i­ty, and the func­tion­al­i­ty itself. That’s two things to name instead of one, this is hard, let’s be lazy and com­bine some com­mon words togeth­er and see if they make sense…

public interface ConditionChecker
{
    bool CheckCondition();
}

The prob­lem here is that the right lev­el of abstrac­tion is small­er than an “object”, so we have to intro­duce anoth­er lay­er of abstrac­tion just to make it fit into our world view.

 

In F#, and in func­tion­al pro­gram­ming in gen­er­al, no abstrac­tion is too small and func­tions are so ubiq­ui­tous that all the OO pat­terns that we’re so fond of can be rep­re­sent­ed as func­tions.

Take the ConditionChecker example above: the essence of what we’re looking for is a condition that is evaluated without input and returns a boolean value. This can be represented as the following in F#:

type Condition = unit -> bool

Much more con­cise, wouldn’t you agree? And any func­tion that match­es the type sig­na­ture can be treat­ed as a Con­di­tion with­out hav­ing to explic­it­ly imple­ment some inter­face.
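Here is a minimal sketch of what that buys you. The helper names (alwaysTrue, neverTrue, allOf, runIf) are hypothetical and exist only for this example:

```fsharp
type Condition = unit -> bool

// Any function of the right shape is a Condition; nothing to implement or inherit.
let alwaysTrue : Condition = fun () -> true
let neverTrue () = false

// Conditions compose like any other function value.
let allOf (conditions: Condition list) : Condition =
    fun () -> conditions |> List.forall (fun condition -> condition ())

// Runs the action only when the condition holds, and reports whether it ran.
let runIf (condition: Condition) (action: unit -> unit) =
    if condition () then
        action ()
        true
    else
        false

let ranFirst = runIf alwaysTrue (fun () -> printfn "this runs")
let ranSecond = runIf (allOf [ alwaysTrue; neverTrue ]) (fun () -> printfn "this does not")
```

Note that neverTrue was never declared as a Condition; it simply has a matching signature, and that is enough.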

 

Another common practice in C# and Java is to label exception types with an Exception suffix, e.g. CastException, ArgumentException, etc. This is another symptom of our tendency to label things rather than naming them, out of laziness (and not the good kind).

If we had put more thought into them, then maybe we could have come up with more meaningful names, for instance:

image

 

In F#, the com­mon prac­tice is to define errors using the light­weight excep­tion syn­tax, and the con­ven­tion here is to not use the Excep­tion suf­fix since the lead­ing excep­tion key­word already pro­vides suf­fi­cient clue as to what the type rep­re­sents.

image
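In case the lightweight syntax is new to you, here is a small sketch. The InsufficientFunds error and the withdraw function are hypothetical, made up for this example:

```fsharp
// Hypothetical error type; note the absence of an Exception suffix.
exception InsufficientFunds of int * int   // balance * requested amount

let withdraw (balance: int) (amount: int) =
    if amount > balance then
        raise (InsufficientFunds (balance, amount))
    else
        balance - amount

// Exceptions defined this way can be pattern matched directly.
let describe balance amount =
    try
        sprintf "new balance: %d" (withdraw balance amount)
    with
    | InsufficientFunds (available, requested) ->
        sprintf "cannot withdraw %d from %d" requested available

printfn "%s" (describe 100 30)
printfn "%s" (describe 100 130)
```

The definition is a single line, and the handler reads like any other pattern match.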

 

In sum­ma­ry, whilst F# doesn’t stop you from lego nam­ing things, it helps because:

  • the use of anony­mous func­tions reduces the num­ber of things you have to name sig­nif­i­cant­ly;
  • being able to model your application at the right level of abstraction removes unnecessary layers of abstraction, and therefore reduces the number of things you have to name even further;
  • it’s eas­i­er to name things when they’re at the right lev­el of abstrac­tion;
  • con­ven­tion in F# is to use the light­weight excep­tion syn­tax to define excep­tion types with­out the Excep­tion suf­fix.

 

Under-Abstraction

In his pre­sen­ta­tion, Kevlin showed an inter­est­ing tech­nique of using a tag cloud to see what pops out from your code:

image           image

Comparing these two examples, you can see the domain of the second example surfacing through the tag cloud – paper, picture, printingdevice, etc. – whereas the first example shows raw strings and lists.

When we under-abstract, we often find ourselves with a long list of arguments to our methods/functions. When that list gets long enough, adding another one or two arguments is no longer significant.

“If you have a procedure with ten parameters, you probably missed some.”

- Alan Perlis

Unfortunately, F# can’t stop you from under-abstracting, but it has a powerful type system that provides all the necessary tools for you to create the right abstractions for your domain with minimal effort. Have a look at Scott Wlaschin’s talk on DDD with the F# type system for inspiration on how you might do that.

 

Unencapsulated State

If under-abstraction is like going to a job interview in your pyjamas then having unencapsulated state would be akin to wearing your underwear on the outside (which, incidentally, puts you in some rather famous circles…).

image

The exam­ple that Kevlin used to illus­trate this habit is dan­ger­ous because the inter­nal state that has been exposed to the out­side world is muta­ble. So not only is your under­wear worn on the out­side for all to see, every­one is able to mod­i­fy it with­out your con­sent… now that’s a scary thought!

In this exam­ple, what should have been done is for us to encap­su­late the muta­ble list and expose only the prop­er­ties that are rel­e­vant, for instance:

open System.Collections.Generic

type RecentlyUsedList () =
    let items = new List<string>()

    // expose a copy, so the outside world can’t mutate our internal state
    member this.Items = items.ToArray()
    member this.Count = items.Count
    …

Whilst F# has no way to stop you from exposing the items list publicly, functional programmers are very conscious of maintaining the immutability facade, so even if an F#’er is using a mutable list internally they would not allow it to leak outside.

In fact, an F#’er would probably have implemented a RecentlyUsedList differently, for instance:

type RecentlyUsedList (?items) =
    let items = defaultArg items [ ]

    member this.Items = List.toArray items
    member this.Count = List.length items
    member this.Add newItem =
        let newItems = newItem::(items |> List.filter ((<>) newItem))
        RecentlyUsedList newItems
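To show how the immutable version behaves, here is a self-contained sketch. It repeats the type with an explicit string list annotation (my assumption, to match the List<string> used earlier) and the file names are made up:

```fsharp
// Same idea as above, with the item type pinned to string (an assumption).
type RecentlyUsedList (?items: string list) =
    let items = defaultArg items []

    member this.Items = List.toArray items
    member this.Count = List.length items

    // Returns a new list with newItem at the front; the original is untouched.
    member this.Add (newItem: string) =
        let newItems = newItem :: (items |> List.filter ((<>) newItem))
        RecentlyUsedList newItems

let empty = RecentlyUsedList ()
let updated = empty.Add("a.txt").Add("b.txt").Add("a.txt")

printfn "%d %A" empty.Count updated.Items
```

Every call to Add hands back a fresh RecentlyUsedList, so anyone still holding on to empty sees exactly what they saw before.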

 

But there’s more.

Kevlin also touched on encap­su­la­tion in gen­er­al, and its rela­tion to a usabil­i­ty study con­cept called affor­dance.

“An affordance is a quality of an object, or an environment, which allows an individual to perform an action. For example, a knob affords twisting, and perhaps pushing, whilst a cord affords pulling.”

- Wikipedia

If you want the user to push, then don’t give them something that they can pull; that’d be bad usability design. The same principle applies to code: your abstractions should afford the right behaviours whilst making it impossible to do the wrong thing.

 

When modelling your domain with F#, since F# types do not allow null by default, it immediately eliminates the most common illegal state that you have to look out for. And since types are immutable by default, once they are validated at construction time you don’t have to worry about them entering into an invalid state later.

To make invalid states un-rep­re­sentable in your mod­el, a com­mon prac­tice is to cre­ate a finite, closed set of pos­si­ble valid states as a dis­crim­i­nat­ed union. As a sim­ple exam­ple:

type PaymentMethod =
    | Cash
    | Cheque of ChequeNumber
    | Card of CardType * CardNumber

Com­pared to a class hier­ar­chy, a dis­crim­i­nat­ed union type can­not be extend­ed and there­fore invalid states can­not be intro­duced at a lat­er date by abusing/exploiting inher­i­tance.
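Because the set of cases is closed, the compiler can also check that every case is handled. Here is a sketch; ChequeNumber, CardType and CardNumber are not defined in the snippet above, so the definitions below are my own guesses for illustration:

```fsharp
// Hypothetical supporting types; the original leaves them undefined.
type ChequeNumber = ChequeNumber of int
type CardType = Visa | MasterCard
type CardNumber = CardNumber of string

type PaymentMethod =
    | Cash
    | Cheque of ChequeNumber
    | Card of CardType * CardNumber

// Leave out a case here and the compiler warns about an incomplete match.
let describe payment =
    match payment with
    | Cash -> "paid in cash"
    | Cheque (ChequeNumber number) -> sprintf "paid by cheque #%d" number
    | Card (cardType, CardNumber number) -> sprintf "paid by %A card %s" cardType number

printfn "%s" (describe (Cheque (ChequeNumber 42)))
```

There is no way to construct a Cheque without a ChequeNumber or a Card without both a type and a number, which is the "invalid states are un-representable" idea in miniature.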

 

In summary, F# programmers are very conscious of immutability, so even if they are using mutable types to represent internal state it’s highly unlikely for them to expose their mutability and break the immutability facade they hold so dear.

And because types are immutable by default, and there are no nulls in F#, it’s also easy for you to ensure that invalid states are sim­ply un-rep­re­sentable when mod­el­ling a domain.

 

Getters and Setters

Kevlin chal­lenged the nam­ing of get­ters and set­ters, since in Eng­lish ‘get’ usu­al­ly implies side effects:

“I get married”

“I get money from the ATM”

“I get from point A to point B”

Yet, in pro­gram­ming, get implies a query with no side effects.

Also, get­ters and set­ters are oppo­sites in pro­gram­ming, but in Eng­lish, the oppo­site of set is reset or unset.

 

Secondly, Kevlin challenged the habit of always creating setters whenever we create getters. This habit is even enforced and encouraged by many modern IDEs that give you shortcuts to automatically create these getters and setters in pairs.

That’s great, now we have short­cuts to do the wrong thing.

“We used to have to type lots to do the wrong thing, not anymore.”

- Kevlin Hen­ney

And he talked about how we need to be more cau­tious and con­scious about what can change and what can­not.

“When it is not necessary to change, it is necessary not to change.”

- Lucius Cary

 

With F#, immutability is the default, so when you define a new record or value, it is immutable unless you explicitly say otherwise (with the mutable keyword). So to do the wrong thing, i.e. to define a corresponding setter for every getter, you have to do lots of extra work.

Every time you have to type the mutable keyword is another chance for you to ask yourself “is it really necessary for this field to change?”. In my experience it has provided sufficient friction and forced me to make very conscious decisions on what can change under what conditions.
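A quick sketch of that friction in practice (the Player and Counter records are hypothetical examples of mine):

```fsharp
// Hypothetical record; its fields are immutable unless marked otherwise.
type Player = { Name: string; Score: int }

let alice = { Name = "Alice"; Score = 10 }
// alice.Score <- 20   // does not compile: the field is not mutable

// "Changing" a record means creating a new value based on the old one.
let promoted = { alice with Score = 20 }

// Mutation has to be asked for explicitly, field by field.
type Counter = { mutable Hits: int }

let counter = { Hits = 0 }
counter.Hits <- counter.Hits + 1

printfn "%d %d %d" alice.Score promoted.Score counter.Hits
```

The copy-and-update expression covers most of what a setter would have been used for, without giving anyone the ability to change alice behind your back.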

 

Uncohesive Tests

Many of us have the habit of test­ing meth­ods – that is, for every method Foo we have a Test­Foo that invokes Foo and inspects its behav­iour. This type of test­ing cov­ers only the sur­face area of your code, and although you can achieve a high code cov­er­age per­cent­age this way (and keep the man­agers hap­py), that cov­er­age num­ber is only super­fi­cial.

Methods are usually called in different combinations to achieve some desired functionality, and many of the complexities and potential bugs lie in the way they work together. This is particularly true where state is concerned, as the order in which the state is updated might be significant, even more so once you bring concurrency into the equation.

Kevlin calls for an end of this prac­tice and for us to focus on test­ing spe­cif­ic func­tion­al­i­ties instead, and use our tests as spec­i­fi­ca­tions for those func­tion­al­i­ties.

“For tests to drive development they must do more than just test that code performs its required functionality: they must clearly express that required functionality to the reader. That is, they must be clear specification of the required functionality.”

- Nat Pryce and Steve Free­man

This is in line with Gojko Adzic’s Spec­i­fi­ca­tion by Exam­ple which advo­cates the use of tests as a form of spec­i­fi­ca­tion for your appli­ca­tion that is exe­cutable and always up-to-date.

 

But, even as we improve on what we test, we still need to have a sufficient number of tests to give us a reasonable degree of confidence. To put it into context, an exhaustive test suite for a function of the signature int -> int would need to have 4,294,967,296 (2^32) test cases, one for every possible 32-bit input. Of course, you don’t need an exhaustive test suite to reach a reasonable degree of confidence, but there’s a limitation on the number of tests that we will be able to write by hand because:

  • writing and maintaining a large number of tests is expensive
  • we might not think of all the edge cas­es

This is where prop­er­ty-based auto­mat­ed test­ing comes in, and that’s where the F# (and oth­er QuickCheck-enabled lan­guages such as Haskell and Erlang) com­mu­ni­ty is at with the wide­spread adop­tion of FsCheck. If you’re new to FsCheck or prop­er­ty-based test­ing in gen­er­al, check out Scott Wlaschin’s detailed intro­duc­to­ry post to prop­er­ty-based test­ing as part of the F# Advent Cal­en­dar in Eng­lish 2014.
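To give a flavour of the idea without pulling in a library, here is a deliberately naive, hand-rolled sketch. This is not FsCheck’s API: checkProperty is a hypothetical helper of mine, and FsCheck does all of this for you and much better (it generates inputs from the types and shrinks failing cases down to minimal examples):

```fsharp
// A hand-rolled stand-in for a property-based testing library (not FsCheck).
let rng = System.Random(42)   // fixed seed so that the run is repeatable

/// Runs the property against `count` random int lists and returns the failing inputs.
let checkProperty (count: int) (property: int list -> bool) =
    let randomList () =
        List.init (rng.Next(20)) (fun _ -> rng.Next(-1000, 1000))
    List.init count (fun _ -> randomList ())
    |> List.filter (fun input -> not (property input))

// Properties describe what must hold for every input, not one input/output pair.
let reversingTwiceIsIdentity (xs: int list) = List.rev (List.rev xs) = xs
let reversingKeepsLength (xs: int list) = List.length (List.rev xs) = List.length xs

let identityFailures = checkProperty 100 reversingTwiceIsIdentity
let lengthFailures = checkProperty 100 reversingKeepsLength

printfn "%d %d" identityFailures.Length lengthFailures.Length
```

An empty list of failures means the property held for every generated input. Two short properties stand in for however many hand-written cases you would otherwise have had the patience to type.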

 

Summary

We pick up habits – good and bad – over time and with practice. Practice doesn’t make perfect, it makes permanent; only perfect practice makes perfect. It is therefore important for us to acquire good practice in order to form and nurture good habits.

The pro­gram­ming lan­guage we use day-to-day plays an impor­tant role in this regard.

“Programming languages have a devious influence: they shape our thinking habits.”

- Dijk­stra

As anoth­er year draws to a close, let’s hope the year ahead is filled with the good prac­tice we need to make per­fect, and to ensure it let’s all write more F# :-P

 

Wish you all a mer­ry xmas!

 

Links

F# Advent Cal­en­dar in Eng­lish 2014

Kevlin Hen­ney – Sev­en inef­fec­tive cod­ing habits of many pro­gram­mers

Andreas Stefik – The programming language wars

Simon Cousins – Does the lan­guage you use make a dif­fer­ence (revis­it­ed)

Cod­ing Hor­ror – The best code is no code at all

F# for Fun and Prof­it — Cycles and mod­u­lar­i­ty in the wild

Being visu­al­ly hon­est with F#

Ian Bar­ber – Nam­ing things

Joshua Bloch – How to design a good API and why it mat­ters

Null Ref­er­ences : the Bil­lion dol­lar mis­take