I gave a talk about our use of F# at last year’s CodeMesh event, and the recording is now up on Vimeo.
You can also find the slides for the talk up on SlideShare:
If you have done any DevOps work on Amazon Web Services (AWS) then you should be familiar with Amazon CloudWatch, a service for tracking and viewing metrics (CPU, network in/out, etc.) about the various AWS services that you consume, or better still, custom metrics that you publish about your service.
On top of that, you can also set up alarms on any metric and send out alerts via Amazon SNS, which is a pretty standard way of monitoring your AWS-hosted application. There are of course many paid services such as StackDriver and New Relic that offer a host of value-added features; personally I was impressed with some of the predictive features from StackDriver.
The built-in Amazon management console for CloudWatch provides rudimentary functionality that lets you browse your metrics and view/overlay them on a graph, but it falls short once you have a decent number of metrics.
For starters, when browsing your metrics by namespace you’re capped at 200 metrics, so discovery is out of the question: you have to know what you’re looking for to be able to find it, which isn’t all that useful when you have hundreds of metrics to work with…
Also, there’s no way to filter metrics by their recorded datapoints, so to answer even a simple question such as
‘what other timespan metrics also spiked at mid-day when our service discovery latency spiked?’
you have to manually go through all the relevant metrics (and of course you have to find them first!) and then visually check each graph to try to spot any correlations.
After being frustrated by this manual process one last time, I decided to write some tooling to make my life (and hopefully others’) a bit easier. In comes Amazon.CloudWatch.Selector, a set of DSLs and a CLI for querying Amazon CloudWatch.
With this simple library you get two DSLs for querying your CloudWatch metrics: an internal DSL you can use from F# code, and an external DSL that also powers the CLI tool described below.
Both DSLs support the same set of filters:

| Filter | Description |
| --- | --- |
| NamespaceIs | Filters metrics by the specified namespace. |
| NamespaceLike | Filters metrics using a regex pattern against their namespaces. |
| NameIs | Filters metrics by the specified name. |
| NameLike | Filters metrics using a regex pattern against their names. |
| UnitIs | Filters metrics against the unit they’re recorded in, e.g. Count, Bytes, etc. |
| Average | Filters metrics by their recorded average data points, e.g. average > 300 looks for metrics whose average in the specified timeframe exceeded 300 at any time. |
| Min | Same as above, but for the minimum data points. |
| Max | Same as above, but for the maximum data points. |
| Sum | Same as above, but for the sum data points. |
| SampleCount | Same as above, but for the sample count data points. |
| DimensionContains | Filters metrics by the dimensions they’re recorded with; please refer to the CloudWatch docs on how dimensions work. |
| DuringLast | Specifies the timeframe of the query to be the last X minutes/hours/days. Note: CloudWatch only keeps up to 14 days’ worth of data, so there’s no point going back any further than that. |
| Since | Specifies the timeframe of the query to be from the specified timestamp until now. |
| Between | Specifies the timeframe of the query to be between the specified start and end timestamps. |
| IntervalOf | Specifies the ‘period’ into which the data points are aggregated, e.g. 5 minutes, 15 minutes, 1 hour, etc. |
Here’s a code snippet showing how to use both DSLs.
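For instance, a query for slow metrics might look along these lines (an illustrative sketch based on the filter table above; check the project’s README for the exact API):

```fsharp
open Amazon
open Amazon.CloudWatch
open Amazon.CloudWatch.Selector

// placeholder credentials, swap in your own
let awsKey     = "YOUR_AWS_KEY"
let awsSecret  = "YOUR_AWS_SECRET"
let cloudWatch = new AmazonCloudWatchClient(awsKey, awsSecret, RegionEndpoint.USEast1)

// internal DSL : filters are typed combinators composed together, e.g.
// find metrics recorded in milliseconds whose average exceeded 300ms
// at any point during the last 12 hours, at 15 minute intervals
let internalDslRes =
    cloudWatch.Select(unitIs "milliseconds" + average (>) 300.0
                      @ last 12 hours
                      |> intervalOf 15 minutes)

// external DSL : the same query expressed as a plain string, which is
// also the format the CLI tool accepts
let externalDslRes =
    cloudWatch.Select("unitIs 'milliseconds' and average > 300.0
                       duringLast 12 hours
                       at intervalOf 15 minutes")
```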
In addition to the DSLs, you’ll also find a simple CLI tool as part of the project, which you can start by setting your credentials in the start_cli.cmd script and running it. It lets you query CloudWatch metrics using the external DSL.
Here’s a quick demo of using the CLI to select some CPU metrics for ElastiCache and then plotting them on a graph.
As a side note, one of the reasons we have so many metrics is that we’ve made it super easy for ourselves to record new ones (see this recorded webinar for more information). This gives us a very granular set of metrics, so that any CPU-intensive or IO work is monitored, as well as all the top-level entry points to our services.
Some time ago I put together a small BERT serializer and BERT-RPC client for .Net called Filbert (filbert is another name for hazelnut; it contains the word ‘bert’ and starts with the letter F, and at the time every F# library had a leading F in its name!).
Admittedly, as an experimental project I hadn’t given too much thought to performance, and as you can see below, the numbers don’t make for flattering reading, even against the BCL’s BinaryFormatter!
I finally found some time to take a stab at the dreadful deserialization speed, and with a couple of small changes I was able to halve the deserialization time in a simple benchmark against a simple object. There are still a couple of low-hanging fruits that could improve things further.
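The kind of micro-benchmark involved is easy to sketch: serialize a simple object once, then time many deserializations of the same payload. The record type and iteration count below are illustrative, not the actual benchmark from the project:

```fsharp
open System
open System.Diagnostics
open System.IO
open System.Runtime.Serialization.Formatters.Binary

// an illustrative payload type, not the one used in the real benchmark
[<Serializable>]
type SimpleRecord = { Id : int; Name : string; Scores : int[] }

// time n deserializations of the same serialized payload
let timeDeserialization name n (deserialize : MemoryStream -> obj) (payload : byte[]) =
    let sw = Stopwatch.StartNew()
    for _ in 1 .. n do
        use stream = new MemoryStream(payload)
        deserialize stream |> ignore
    sw.Stop()
    printfn "%s: %d deserializations in %dms" name n sw.ElapsedMilliseconds

let record    = { Id = 42; Name = "filbert"; Scores = [| 1 .. 10 |] }
let formatter = BinaryFormatter()
let payload   =
    use ms = new MemoryStream()
    formatter.Serialize(ms, record)
    ms.ToArray()

// baseline: the BCL's BinaryFormatter; Filbert's decode call would
// slot into the same harness for the comparison
timeDeserialization "BinaryFormatter" 100000 (fun s -> formatter.Deserialize s) payload
```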
As an experiment, I have put together a simple, actor-based custom appender for log4net which lets you publish your log messages to a configured Kinesis stream. You can then have another cluster of machines fetch the data from the stream and do whatever processing or aggregation you like.
The implementation is done in F# in 100 lines of code, and as you can see it is very simple, easy to reason about, fully asynchronous and thread-safe.
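The gist of the approach is easy to sketch. The following is an illustrative reconstruction rather than the actual source (it also assumes the newer AWSSDK’s PutRecordAsync): log4net’s Append only posts the rendered message to an F# MailboxProcessor, and the agent pushes messages to Kinesis one at a time.

```fsharp
open System.IO
open System.Text
open Amazon.Kinesis
open Amazon.Kinesis.Model
open log4net.Appender
open log4net.Core

// sketch only - names like KinesisAppender and the partition key are
// illustrative; see the actual project for the real implementation
type KinesisAppender() as this =
    inherit AppenderSkeleton()

    let client = new AmazonKinesisClient()

    // the agent processes one message at a time, which makes the
    // appender thread-safe without any explicit locking
    let agent = MailboxProcessor<string>.Start(fun inbox -> async {
        while true do
            let! msg = inbox.Receive()
            use data = new MemoryStream(Encoding.UTF8.GetBytes msg)
            let req  = PutRecordRequest(StreamName   = this.StreamName,
                                        PartitionKey = "log4net",
                                        Data         = data)
            do! client.PutRecordAsync req |> Async.AwaitTask |> Async.Ignore
        })

    // set via the log4net XML config, e.g. <streamName value="my-stream" />
    member val StreamName = "" with get, set

    // Append only posts to the agent's queue, so logging never blocks
    override this.Append(loggingEvent : LoggingEvent) =
        agent.Post(this.RenderLoggingEvent loggingEvent)
```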
Once you have pushed your log messages into the stream, you’ll need to use the AWSSDK to fetch and process the data. For Java, there’s a client application which takes care of most of the heavy lifting, e.g. tracking your progress, handling failover and load balancing. Unfortunately, at the time of writing, there’s no equivalent client application for the current version of the .Net AWSSDK.
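To give a sense of what’s involved, here’s a bare-bones sketch of reading a single shard with the raw SDK (synchronous calls for brevity). Note that it does none of the progress tracking, failover or load balancing that the Java client handles for you:

```fsharp
open System.Text
open Amazon.Kinesis
open Amazon.Kinesis.Model

let readShard (client : IAmazonKinesis) streamName =
    // take the first shard of the stream for simplicity
    let describeRes = client.DescribeStream(DescribeStreamRequest(StreamName = streamName))
    let shard       = describeRes.StreamDescription.Shards.[0]

    // TRIM_HORIZON starts from the oldest record still in the stream
    let iterRes =
        client.GetShardIterator(
            GetShardIteratorRequest(StreamName        = streamName,
                                    ShardId           = shard.ShardId,
                                    ShardIteratorType = ShardIteratorType.TRIM_HORIZON))
    let mutable iterator = iterRes.ShardIterator

    // a null iterator means the shard has been closed
    while iterator <> null do
        let res = client.GetRecords(GetRecordsRequest(ShardIterator = iterator))
        for record in res.Records do
            printfn "%s" (Encoding.UTF8.GetString(record.Data.ToArray()))
        iterator <- res.NextShardIterator
```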
So, to make it easier for us .Net folks to build real-time data processing applications on top of Amazon Kinesis, I have started an Rx-based .Net client library called ReactoKinesiX (I really wanted to get Rx into the name!); more details to follow.
I think the introduction of Kinesis is very exciting and opens up many possibilities, and at its current pricing it also represents a very cost-effective alternative to some of the competing, more polished services out there.
I work with Amazon S3, and indeed many of Amazon’s cloud services, and one thing that often frustrates me in my day-to-day workflow is the need to keep jumping out of the IDE: find the file I’m looking for in an S3 explorer, grab its key, then go back to the IDE and write code against the AWS SDK to read the data from S3 and do something with it.
This inefficiency is magnified when working with buckets that contain a large number of keys and/or have versioning turned on.
With these in mind, I wanted to create a type provider which is:
plus all the usual goodness that comes with F# type providers, such as:
After a few nights of work I finally have something usable. Here is a quick demo video of the type provider in action:
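To give a flavour of the experience, usage looks roughly like this (a hypothetical sketch: the provider’s actual static parameters and member names may differ, and the bucket/key names are made up):

```fsharp
// hypothetical usage sketch - not the provider's confirmed API
type S3 = S3Provider.Account<"YOUR_AWS_KEY", "YOUR_AWS_SECRET">

// buckets, folders and keys show up as members, so finding a file is
// a matter of dotting through IntelliSense instead of leaving the IDE
let content = S3.``my-bucket``.``logs/2014-01-13.txt``.Content
```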
Go on, give it a play and let me know if you have any feedback!