QCon London 2015–Takeaways from “Code as Crime Scene”

The first day of this year’s QCon London is over, and it’s been a thoroughly enjoyable day of talks. Most of the talks are on softer, more philosophical topics, which is a nice change of pace from last week’s LambdaDays.

One of my favourite talks from today was Adam Tornhill’s Code as Crime Scene, and here are my key takeaways.


Many studies have shown that we spend most of our time making changes or fixing bugs, both of which always start with understanding what the code does. We should therefore optimize for that.

A com­mon prob­lem we face in today’s world is that soft­ware is pro­duced by many devel­op­ers across many teams, and no one has a holis­tic view of how the whole looks.


When it comes to measuring complexity, both lines-of-code and cyclomatic complexity are useful metrics to consider, even though neither provides a full picture of what we’re up against. They are useful because they fit nicely with our main constraint as developers: our working memory.

Since we don’t have a metric that can provide a complete and accurate view of complexity, some have advocated using good old human intuition to measure complexity instead. However, intuition is prone to social and cognitive bias, and it doesn’t scale well because of the same cognitive constraints that necessitate measuring complexity in the first place.

Instead, Adam shows us how tech­niques from foren­sic psy­chol­o­gy can be applied in soft­ware, specif­i­cal­ly the prac­tice of geo­graph­i­cal offend­er pro­fil­ing.


Most offend­ers behave like us most of the time, and that’s where they spot oppor­tu­ni­ties for crime. Hence there’s an over­lap between the offender’s area of activ­i­ty and the loca­tions of his/her crimes.

Whilst this technique does not necessarily give you the exact location of a suspect, it does help narrow down the search area. Using tools such as CodeCity, you can lay out a geography for your code that reflects its complexity.


But complexity alone is not the problem; it only becomes a problem when we have to deal with it.

If you overlay this geography with developer activity (i.e. commit history), you will be able to identify hotspots – complex code that we need to work with often.
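A minimal sketch of the overlay, not taken from the talk or from any tool: it uses lines of code as a crude complexity proxy and multiplies it by the number of commits touching each file. The file names, the numbers, and the scoring formula are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical inputs: lines of code per file (a crude complexity proxy)
# and the list of files touched by each commit.
lines_of_code = {"billing.py": 1200, "utils.py": 90, "report.py": 400}
commits = [
    ["billing.py", "utils.py"],
    ["billing.py"],
    ["billing.py", "report.py"],
    ["utils.py"],
]


def find_hotspots(lines_of_code, commits, top=2):
    """Rank files by (number of revisions * lines of code), highest first.

    The multiplication is an assumed scoring rule, chosen only to show
    the idea of combining complexity with change frequency.
    """
    revisions = Counter(path for files in commits for path in files)
    scores = {path: revisions[path] * loc for path, loc in lines_of_code.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]


hotspots = find_hotspots(lines_of_code, commits)
print(hotspots)
```

Here billing.py ranks first because it is both large and frequently changed, while utils.py drops out despite being changed twice because it is small.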



Defects tend to cluster, and if you overlay the areas of code where defects occur with the hotspots, you’re likely to find a high correlation between the two.

Adam also showed how you can track the complexity of hotspots over time and use that to project into the future with Complexity Trend analysis.



Tem­po­ral Cou­pling – by analysing your com­mit his­to­ry, you can find source files that are changed togeth­er in com­mits to iden­ti­fy depen­den­cies (phys­i­cal cou­pling), as well as ‘copy-and-paste’ code (log­i­cal cou­pling).
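As a rough illustration of how such an analysis could work, the sketch below counts how often two files appear in the same commit. The commit data is made up, and the coupling ratio (shared commits divided by commits touching either file) is an assumed definition, not necessarily the one used in the talk or by Code Maat.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history: each entry is the list of files in one commit.
commits = [
    ["order.py", "order_test.py"],
    ["order.py", "order_test.py", "invoice.py"],
    ["invoice.py"],
    ["order.py", "order_test.py"],
]


def temporal_coupling(commits):
    """Return {(file_a, file_b): ratio} for every pair seen together.

    ratio = commits touching both files / commits touching either file.
    """
    revisions = Counter()  # commits per file
    shared = Counter()     # commits per file pair
    for files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        shared.update(combinations(unique, 2))
    return {
        pair: count / (revisions[pair[0]] + revisions[pair[1]] - count)
        for pair, count in shared.items()
    }


coupling = temporal_coupling(commits)
for pair, ratio in sorted(coupling.items(), key=lambda item: -item[1]):
    print(pair, ratio)
```

A ratio close to 1 means the two files almost never change independently, which is worth a closer look regardless of whether the cause is a real dependency or duplicated code.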


And remember, if you have a dependency between software components developed by different people, then you essentially have a dependency on people.


When Conway’s law

orga­ni­za­tions which design sys­tems… are con­strained to pro­duce designs which are copies of the com­mu­ni­ca­tion struc­tures of these orga­ni­za­tions

- M. Con­way

is applied in reverse, it becomes a useful organizational tool, i.e. you organize your communication structure to fit the software you want to build. It’s worth mentioning that this mirrors the shift in organizational structure that is happening in the DevOps movement.


If you con­nect peo­ple who com­mit to the same code by build­ing links between them, then you build up a graph that tells you how the mem­bers of your team inter­act with each oth­er through the parts of your code­base that they need to work on.
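One way to sketch this graph, assuming you already have a list of (author, files) pairs extracted from your version control history. The developer names and file names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit history: (author, files changed in that commit).
commits = [
    ("alice", ["api.py"]),
    ("bob", ["api.py", "db.py"]),
    ("carol", ["db.py"]),
    ("dave", ["ui.py"]),
]


def developer_graph(commits):
    """Map each pair of developers to the set of files both have committed to."""
    authors_by_file = defaultdict(set)
    for author, files in commits:
        for path in files:
            authors_by_file[path].add(author)

    links = defaultdict(set)
    for path, authors in authors_by_file.items():
        # Every pair of authors of the same file gets an edge labelled with it.
        for pair in combinations(sorted(authors), 2):
            links[pair].add(path)
    return dict(links)


graph = developer_graph(commits)
for pair, files in graph.items():
    print(pair, sorted(files))
```

In this toy data dave has no edges at all, because nobody else has touched ui.py.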

You can then compare that with your real organizational structure to see how well it supports the way you actually work. In the example below, you can see that members of the 4 teams are highly connected to everyone else, which indicates that the team-level grouping does not reflect areas of responsibility, since everyone’s responsibilities overlap.



The number of programmers behind a piece of code is the most effective indicator of the number of defects in that code – more programmers = more defects. You should pay attention to code that is changed by a lot of developers, as it might be an indication that the code has too many responsibilities and therefore gives different developers many reasons to change it.


By show­ing the num­ber of com­mits each devel­op­er makes on a source file you can iden­ti­fy the knowl­edge own­ers of that part of your code­base.
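A small sketch of that counting, again over made-up (author, files) data. Treating the developer with the most commits as the single knowledge owner is a simplifying assumption; a real analysis might weigh lines changed or recency instead.

```python
from collections import Counter, defaultdict

# Hypothetical commit history: (author, files changed in that commit).
commits = [
    ("alice", ["parser.py"]),
    ("alice", ["parser.py", "lexer.py"]),
    ("bob", ["parser.py"]),
    ("bob", ["lexer.py"]),
    ("bob", ["lexer.py"]),
]


def commits_per_developer(commits):
    """Return {file: Counter({author: number of commits touching the file})}."""
    counts = defaultdict(Counter)
    for author, files in commits:
        for path in set(files):
            counts[path][author] += 1
    return counts


def knowledge_owners(commits):
    """Pick the developer with the most commits on each file.

    Ties go to whichever developer was seen first in the history.
    """
    counts = commits_per_developer(commits)
    return {path: by_author.most_common(1)[0][0] for path, by_author in counts.items()}


owners = knowledge_owners(commits)
print(owners)
```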


You can then build up a knowledge map of your organization and even group the knowledge owners by team to identify the relationship between code changes and teams.


In a perfect world, all knowledge owners for a component (source files for one project, for instance) would be concentrated within a team, which shows that the responsibility for that component is well defined and aligns with the organizational structure.

However, when you find components whose knowledge owners are scattered across your organization, then it might be an indication that:

  • you’re missing a team to take ownership of that component, or
  • that component has too many responsibilities and is in need of refactoring
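Spotting those scattered components could look something like the sketch below. It assumes you have already worked out the knowledge owners per component and have a developer-to-team mapping at hand; the names, teams, and the "more than one team" threshold are all hypothetical.

```python
# Hypothetical data: which team each developer belongs to, and the
# knowledge owners already identified for each component.
team_of = {
    "alice": "payments",
    "bob": "payments",
    "carol": "search",
    "dave": "mobile",
}
owners_by_component = {
    "checkout": ["alice", "bob"],
    "catalog": ["alice", "carol", "dave"],
}


def teams_per_component(owners_by_component, team_of):
    """Map each component to the sorted list of teams its owners belong to."""
    return {
        component: sorted({team_of[dev] for dev in devs})
        for component, devs in owners_by_component.items()
    }


def scattered_components(owners_by_component, team_of, max_teams=1):
    """Components whose owners span more than max_teams teams (assumed threshold)."""
    spread = teams_per_component(owners_by_component, team_of)
    return [name for name, teams in spread.items() if len(teams) > max_teams]


spread = teams_per_component(owners_by_component, team_of)
scattered = scattered_components(owners_by_component, team_of)
print(spread)
print(scattered)
```

The output only flags candidates. Whether a flagged component needs a new owning team or a refactoring is still a judgment call.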


Using the knowledge map, you can also identify key players in your organization – people who are knowledge owners in many areas of your codebase. This can help you identify the risk of knowledge loss should they ever leave, so you can mitigate that risk through planned knowledge sharing with other members of the team.

As often happens, when key players leave, they also leave behind dead spots in your codebase – entire components that are abandoned because the people who understand them are no longer around. I have personally witnessed this happening multiple times, and it’s often the reason why projects are “reinvented”.



Adam’s talk was awesome, and his book will be released on Amazon at the end of the month. You can also get the beta version of the eBook from The Pragmatic Bookshelf.



Adam Tornhill’s arti­cle on Code as Crime Scene

Your Code as a Crime Scene on Ama­zon

YouTube – Code as Crime Scene at TEDx

Code Maat – com­mand line tool to mine and analyse data from ver­sion con­trol sys­tems