Pseudo RealTime performance monitoring with AOP and AWS CloudWatch

This is something I’ve mentioned in my recent AOP talks, and I think it’s worthy of a wider audience, as it can be very useful to anyone who’s as obsessed with performance as I am.

At iwi, we take performance very seriously and are always looking to improve the performance of our applications. To identify the problem areas and focus our efforts on the big wins, we first need a way to measure and monitor the performance of the individual components inside our system, sometimes down to the method level.

Fortunately, with the help of AOP and AWS CloudWatch we’re able to get a pseudo-realtime view of how frequently a method is executed and how long it takes to execute, down to one-minute intervals:

image

With this information, I can quickly identify the methods that are the worst offenders and focus my profiling and optimization efforts on those particular methods/components.

Whilst I cannot disclose any implementation details in this post, I hope it’ll be sufficient to give you an idea of how you might implement a similar mechanism.

AOP

A while back I posted about a simple attribute for watching method execution time and logging a warning message when a method takes longer than some pre-defined threshold.

Now, it’s possible and indeed easy to modify this simple attribute to instead keep track of the execution times and bundle them up into average/min/max values for a given minute. You can then publish these minute-by-minute metrics to AWS CloudWatch from each virtual instance and let the CloudWatch service itself handle the task of aggregating all the data points.
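As a rough illustration of the bundling step, here is a minimal sketch. It is not the implementation used at iwi: it uses a Python decorator as a stand-in for the attribute, and the names `MinuteStats`, `ExecutionTimeTracker` and `track_execution_time` are made up for this example.

```python
import threading
import time
from collections import defaultdict
from functools import wraps


class MinuteStats:
    """Running count/sum/min/max of execution times (ms) within one minute."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = float("inf")
        self.maximum = 0.0

    def add(self, elapsed_ms):
        self.count += 1
        self.total += elapsed_ms
        self.minimum = min(self.minimum, elapsed_ms)
        self.maximum = max(self.maximum, elapsed_ms)

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0


class ExecutionTimeTracker:
    """Buckets timings by (method name, minute since the epoch)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the bucketing can be tested
        self._lock = threading.Lock()
        self._buckets = defaultdict(MinuteStats)

    def record(self, name, elapsed_ms):
        minute = int(self._clock() // 60)
        with self._lock:
            self._buckets[(name, minute)].add(elapsed_ms)

    def drain_completed(self):
        """Remove and return the buckets for minutes that have already ended."""
        current = int(self._clock() // 60)
        with self._lock:
            done = {k: v for k, v in self._buckets.items() if k[1] < current}
            for key in done:
                del self._buckets[key]
        return done


tracker = ExecutionTimeTracker()


def track_execution_time(func):
    """Decorator (hypothetical stand-in for the attribute): time every call."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            tracker.record(func.__qualname__, elapsed_ms)

    return wrapper
```

A background timer could call `tracker.drain_completed()` once a minute and hand the finished buckets to whatever publishes them, so the measured methods never wait on a network call.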

By encapsulating the logic of measuring execution time into an attribute, you can start measuring a particular method by simply applying the attribute to that method. Alternatively, PostSharp supports pointcuts and lets you multicast an attribute to many methods at once, filtering the target methods by name as well as visibility level. It is therefore possible for you to start measuring and publishing the execution time of ALL public methods in a class/assembly with only one line of code!
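PostSharp's multicasting is a C# feature, so the sketch below only imitates the idea in Python: a class decorator that wraps every public method with a timing decorator. The names `timed`, `time_public_methods` and `LeaderboardService` are hypothetical, and "public" here just means "does not start with an underscore".

```python
import inspect
import time
from functools import wraps

recorded = []  # (qualified method name, elapsed milliseconds)


def timed(func):
    """Time each call and append the result to `recorded`."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            recorded.append((func.__qualname__, elapsed_ms))

    return wrapper


def time_public_methods(cls):
    """Class decorator: wrap every public plain method defined directly on cls."""
    for name, member in list(vars(cls).items()):
        if inspect.isfunction(member) and not name.startswith("_"):
            setattr(cls, name, timed(member))
    return cls


@time_public_methods  # the "one line of code"
class LeaderboardService:  # hypothetical example class
    def get_top_players(self, count):
        return list(range(count))

    def _load_from_cache(self):  # private by convention, so not measured
        return None
```

The filter is deliberately simple: static methods, class methods and inherited methods are skipped, which a real pointcut would let you control explicitly.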

CloudWatch

The CloudWatch service should be familiar to anyone who has used AWS EC2 before. It’s a monitoring service primarily for AWS cloud resources (virtual instances, load balancers, etc.), but it also allows you to publish your own data about your application. Even if your application is not hosted inside AWS EC2, you can still make use of the CloudWatch service as long as you have an AWS account and a valid AWS access key and secret.
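To make the publishing step concrete, here is a minimal sketch that turns one minute of aggregated timings into a CloudWatch statistic set and sends it via `put_metric_data`. It assumes a boto3 CloudWatch client is passed in; the namespace `MyApp/Performance`, the metric name `ExecutionTime` and the `Method` dimension are invented for this example, and the batch size of 20 is just a conservative choice.

```python
from datetime import datetime, timezone


def build_metric_datum(method_name, minute, count, total_ms, min_ms, max_ms):
    """Build one MetricData entry summarising a minute of timings for a method."""
    return {
        "MetricName": "ExecutionTime",  # hypothetical metric name
        "Dimensions": [{"Name": "Method", "Value": method_name}],
        # `minute` is minutes since the epoch; CloudWatch wants a timestamp
        "Timestamp": datetime.fromtimestamp(minute * 60, tz=timezone.utc),
        "Unit": "Milliseconds",
        "StatisticValues": {
            "SampleCount": count,
            "Sum": total_ms,
            "Minimum": min_ms,
            "Maximum": max_ms,
        },
    }


def publish(cloudwatch, data, namespace="MyApp/Performance"):
    """Send the entries through a boto3 CloudWatch client in small batches."""
    for i in range(0, len(data), 20):
        cloudwatch.put_metric_data(Namespace=namespace, MetricData=data[i:i + 20])
```

With boto3 installed and credentials configured, usage would be along the lines of `publish(boto3.client("cloudwatch"), [build_metric_datum("OrderService.Checkout", minute, 2, 40.0, 10.0, 30.0)])`. Sending a statistic set rather than every raw timing keeps it to one data point per method per minute, and CloudWatch can still derive the average from `Sum` and `SampleCount`.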

Once published, you can visualize your data inside the AWS web console. Depending on the type of data you’re publishing, there are a number of different ways to view it: Average, Min, Max, Sum, Count, etc.

Note that, at the time of writing, AWS only keeps up to two weeks’ worth of data, so if you want to keep the data for longer you’ll have to query and store the data yourself. For instance, it makes sense to keep a history of hourly averages for the method execution times you’re tracking so that, in the future, you can easily see where and when a particular change has impacted the performance of those methods. After all, storage is cheap, and even with thousands of data points you’ll only be storing that many rows per hour.
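One way to keep that history is sketched below: pull hourly averages with `get_metric_statistics` and copy them into a local SQLite table. This is only an assumption about how you might do it. The client is expected to be a boto3 CloudWatch client, and the namespace, metric name, dimension and table layout are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def archive_hourly_averages(cloudwatch, db, method_name, end_time, hours=24,
                            namespace="MyApp/Performance"):
    """Copy hourly average execution times for one method into a local table."""
    response = cloudwatch.get_metric_statistics(
        Namespace=namespace,
        MetricName="ExecutionTime",  # hypothetical metric name
        Dimensions=[{"Name": "Method", "Value": method_name}],
        StartTime=end_time - timedelta(hours=hours),
        EndTime=end_time,
        Period=3600,  # one data point per hour
        Statistics=["Average"],
        Unit="Milliseconds",
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS hourly_average ("
        "method TEXT, hour TEXT, average_ms REAL, PRIMARY KEY (method, hour))"
    )
    rows = [
        (method_name, point["Timestamp"].isoformat(), point["Average"])
        for point in response["Datapoints"]
    ]
    # INSERT OR REPLACE makes re-running the job for overlapping hours harmless
    db.executemany("INSERT OR REPLACE INTO hourly_average VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)
```

Run from a scheduled job with something like `archive_hourly_averages(boto3.client("cloudwatch"), sqlite3.connect("metrics.db"), "OrderService.Checkout", datetime.now(timezone.utc))`, this stores one row per method per hour.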