The C# Dispose Pattern

The Dispose pattern is something we’ve all seen before, and it’s so tried and tested that most of us (myself especially!) have been more than happy to apply it without question.

Whilst reading various blogs and articles I came across some differing opinions about this well-known pattern and started to question what I had taken for granted myself.

After some more research and a question on the goldmine of knowledge that is Stack Overflow, I have shortlisted a few points you should consider when implementing the standard C# Dispose pattern:

  1. if your object doesn’t hold any IDisposable objects or unmanaged resources (a DB connection, for example) then you don’t need to implement IDisposable or a finalizer at all.
  2. if your object doesn’t hold any unmanaged resources then don’t implement a finalizer. The Garbage Collector only attempts to finalize your object (which has a performance hit) if you have implemented one.
  3. don’t forget to call Dispose() on each of the IDisposable objects your object holds in the Dispose(bool) method.
  4. if your object holds unmanaged resources, clean them up in the finalizer by calling Dispose(false), rather than duplicating cleanup code that already lives in the Dispose(bool) method.

So for a simple class with no unmanaged resources and a collection of IDisposable objects, your class might look something like this:

using System;
using System.Collections.Generic;

public sealed class MyClass : IDisposable
{
     private IList<MyObject> objects;  // MyClass holds a list of objects
     private bool _disposed;           // flag to stop the cleanup running twice

     public void Dispose()
     {
          Dispose(true);
          GC.SuppressFinalize(this);
     }

     private void Dispose(bool disposing)
     {
          if (!_disposed)
          {
               // call Dispose on each item in the list
               if (disposing)
               {
                    foreach (var o in objects)
                    {
                         // check if MyObject implements IDisposable
                         var d = o as IDisposable;
                         if (d != null) d.Dispose();
                    }
               }
               _disposed = true;
          }
     }
}

This is fairly similar to the standard C# Dispose pattern. The main difference is the lack of a finalizer: remember, implementing a finalizer will impact the performance of your type, so don’t implement one unless you need it.
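
For contrast, here is a minimal sketch of point 4, where a type does own an unmanaged resource. NativeBuffer is a hypothetical class made up for illustration; it assumes the resource is a block of memory obtained from Marshal.AllocHGlobal. The finalizer simply forwards to Dispose(false) so that the cleanup code lives in one place.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type: owns one block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr _buffer;
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        Dispose(true);
        // Cleanup has already run, so the finalizer is no longer needed.
        GC.SuppressFinalize(this);
    }

    // Finalizer: only runs if Dispose() was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;

        if (disposing)
        {
            // Dispose managed IDisposable fields here (none in this sketch).
        }

        // Unmanaged cleanup runs on both the Dispose() and finalizer paths.
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }

        _disposed = true;
    }
}
```

Callers would normally wrap such an object in a using block so that Dispose() runs deterministically and the finalizer never has to.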

Memory leak in ADO.NET DataSet

Over the last couple of years, there have been many discussions/debates on DataSet vs Collections, and there was a very good article in MSDN magazine on just that:

http://msdn.microsoft.com/en-gb/magazine/cc163751.aspx#S7

To add to the Dark Sides of DataSet, there is a little-known feature/bug/annoyance in the DataTable.Select() method: every time you call Select() it implicitly creates a new index without you having any control over it, and the index is not cleared until you call DataTable.AcceptChanges().

If your application has to deal with a large amount of data and has to use the Select() method repeatedly without calling AcceptChanges() then you might have a problem! Why? Consider these two factors:

1. the bigger the DataTable, the bigger the index, and if the index object is 85,000 bytes or larger it gets allocated on the Large Object Heap, which is only collected as part of a generation 2 collection and is not compacted by default, so it takes much longer to reclaim than small objects and can become fragmented

2. on a 32-bit Windows system, there’s a 2GB limit on the user-mode Virtual Address Space of each process, and in practice you will usually get an OutOfMemoryException when your process has used around 1.2GB to 1.5GB of memory

Combine them and it’s not hard to imagine a scenario where your process might actually run out of memory and crash out before it completes its task! (Believe me, it was a hard-learned lesson from my personal experience!)
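
To see the first factor in action, the sketch below asks the runtime which generation a freshly allocated array belongs to. It assumes the Microsoft CLR, where Large Object Heap allocations are reported as belonging to the highest generation; the class and method names are made up for illustration.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    // Returns the GC generation reported for a freshly allocated byte array.
    public static int GenerationOfNewArray(int length)
    {
        var data = new byte[length];
        return GC.GetGeneration(data);
    }

    public static void Main()
    {
        // Well under the 85,000 byte threshold: a normal small-object allocation,
        // which typically starts life in generation 0.
        Console.WriteLine(GenerationOfNewArray(1000));

        // Well over the threshold: allocated on the Large Object Heap,
        // which the CLR reports as generation 2.
        Console.WriteLine(GenerationOfNewArray(100000));
    }
}
```

An index over a big DataTable can easily cross that threshold, which is why repeated Select() calls hurt more as the table grows.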

Solutions:

1. unless you actually need some of the features DataSet offers, such as the ability to keep multiple versions of the same row (Original, Current, etc.), you might be better off using POCOs (plain old CLR objects) instead. They are simple and lightweight, and you can use LINQ to Objects with i4o to get some impressive performance improvements. After I implemented this change, my application went from crashing out with an OutOfMemoryException to maxing out at 70MB throughout its lifetime, and it finished in about 15% of the time it would have taken using DataSet.

2. if getting rid of DataSet altogether takes more time and effort than you can afford, there’s a quick workaround: use a DataView and change its RowFilter string dynamically wherever you would otherwise have called the Select() method.
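
A minimal sketch of the second workaround might look like the following. The table layout (MatchID, Name) and the helper names are assumptions made for illustration; the only real API surface used is DataTable, DataView, DataView.RowFilter and DataView.Count.

```csharp
using System;
using System.Data;

public static class DataViewFilterDemo
{
    // Builds a tiny in-memory table to filter against (illustrative data only).
    public static DataTable BuildMatches()
    {
        var table = new DataTable("Matches");
        table.Columns.Add("MatchID", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "First");
        table.Rows.Add(2, "Second");
        table.Rows.Add(3, "Third");
        return table;
    }

    // Reuses the same DataView and only swaps the filter expression,
    // instead of calling table.Select(filter) each time.
    public static int CountMatching(DataView view, string filter)
    {
        view.RowFilter = filter;
        return view.Count;
    }

    public static void Main()
    {
        var view = new DataView(BuildMatches());

        Console.WriteLine(CountMatching(view, "MatchID = 2"));  // prints 1
        Console.WriteLine(CountMatching(view, "MatchID >= 2")); // prints 2
    }
}
```

The filter expressions use the same syntax as DataTable.Select(), so existing filter strings can usually be carried over unchanged.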

If you wish to learn more about Garbage Collection in general, you should read Maoni’s WebLog, which covers all things CLR Garbage Collector! She also wrote a nice article focused on the Large Object Heap back in June 2008 which is well worth a read:

http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

Under the cover of i4o

I did some performance optimization work a little while back, and one of the changes which yielded a significant result was migrating some server-side components (which are CPU intensive and perform a large number of loops) from ADO.NET DataSets to POCOs (plain old CLR objects).

The looping was then done using LINQ to Objects, and I discovered a nice little extension to LINQ called i4o (which stands for Index for Objects) to help make the loops faster. However, I wasn’t able to observe any difference in performance, which contradicts the findings on Aaron’s Technology Musing.

Digging a little deeper into the i4o source code (admittedly I didn’t do this myself, credit to Mike Barker for doing this!), it turns out that there are a number of drawbacks in i4o which aren’t immediately obvious or mentioned anywhere in the documentation. The biggest problem for us was that it only supports equality comparison, which means it would simply ignore the index you have on the MatchID property if you try to run this query:

var result = from m in Matches where m.MatchID >= 1 select m;

but it’ll use the index on MatchID if you run this query instead:

var result = from m in Matches where m.MatchID == 1 select m;

The conclusion?

i4o is an awesome tool that can turbo boost your LINQ queries, but ONLY put indices on properties which you will be using in equality comparisons in your queries, otherwise you’ll just be wasting memory on indices which will never be used.
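
To see why an equality-only index cannot help a range query, consider the hand-rolled sketch below. This is not i4o’s implementation or API, just a plain hash-based lookup built with LINQ’s ToLookup, and the Match class is a stand-in for whatever type you are querying.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the type being queried (illustrative only).
public class Match
{
    public int MatchID { get; set; }
}

public static class EqualityIndexDemo
{
    // Builds a hash-based lookup keyed on MatchID.
    public static ILookup<int, Match> BuildIndex(IEnumerable<Match> matches)
    {
        return matches.ToLookup(m => m.MatchID);
    }

    public static void Main()
    {
        var matches = Enumerable.Range(1, 5)
                                .Select(i => new Match { MatchID = i })
                                .ToList();
        var index = BuildIndex(matches);

        // Equality: a single hash lookup goes straight to the matching items.
        var equal = index[3].ToList();

        // Range: hash keys carry no ordering, so there is nothing for the index
        // to offer and every item has to be inspected.
        var atLeast = matches.Where(m => m.MatchID >= 3).ToList();

        Console.WriteLine(equal.Count);   // prints 1
        Console.WriteLine(atLeast.Count); // prints 3
    }
}
```

Supporting range queries would need an ordered structure (a sorted list or tree) rather than a hash, which is a different trade-off in memory and insertion cost.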

Aspect Oriented Programming in .Net using PostSharp

I saw this article on D. Patrick Caldwell’s blog a little while back:

http://dpatrickcaldwell.blogspot.com/2009/03/validate-parameters-using-attributes.html

It was this article that got me interested in PostSharp and the possibilities that it can bring. PostSharp, in short, is a lightweight framework which introduces some Aspect-Oriented Programming into .Net.

Common usages I have seen include tracing, and the ‘memorizer’ (again, from D. Patrick Caldwell’s blog) is one of the more interesting. There is also a blog entry over at Richard’s Braindump which highlights how you can use PostSharp to implement the INotifyPropertyChanged interface.

One thing I’d like to point out, though, is that the parameter validation technique in D. Patrick Caldwell’s blog entry above should be used with care: avoid applying the [CheckParameters] attribute at class/assembly level, as it does carry a performance hit. After experimenting with it for a little while, I have settled on applying the [CheckParameters] attribute only to methods whose parameters require validation.

In my line of work, we have a lot of problems with deadlocks in the database due to the number of different applications using the same tables and the different ways they use them (some use nolock, others don’t). As a result, there is a lot of boilerplate code in the DAL classes which catches SqlExceptions and, in the case of deadlocks or connection timeouts, retries up to x number of times. This, of course, is a cross-cutting concern, and by employing PostSharp I am able to deal with it using a simple attribute like the one below instead of hundreds and hundreds of lines of code.

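using System;
using System.Data.SqlClient;
using PostSharp.Laos;   // assumes PostSharp 1.x, where the aspect base classes live in PostSharp.Laos

[Serializable]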
[AttributeUsage(AttributeTargets.Method)]
public class RetryOnSqlDeadLockOrConnectionTimeOutExceptionAttribute : OnMethodInvocationAspect
{
     [NonSerialized]
     private int CurrentAttempt;

     public override void OnInvocation(MethodInvocationEventArgs eventArgs)
     {
          CurrentAttempt++;

          try
          {
               eventArgs.Proceed();
          }
          catch (SqlException sqlException)
          {
               // -2 is a timeout; 1205 means this session was chosen as the deadlock victim
               if (sqlException.Number == -2 || sqlException.Number == 1205)
               {
                    // put retry logic here
               }
               else
               {
                    throw;
               }
          }
     }
}

and to use it:

[RetryOnSqlDeadLockOrConnectionTimeOutException]
public void SomeDataBaseBoundOperationWhichNeedsRetryOnDataBaseDeadLock()
{
     ...
}
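
The retry logic itself is left out above. As a rough idea of what it could look like, here is a minimal sketch written as a plain helper, independent of PostSharp. Retry.Execute, its parameters and the fixed delay are all assumptions made for illustration, not part of any library.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs an operation and retries it on transient failures.
public static class Retry
{
    public static void Execute(Action operation, int maxAttempts,
                               Func<Exception, bool> isTransient, TimeSpan delay)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (isTransient == null) throw new ArgumentNullException("isTransient");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                operation();
                return; // success, stop retrying
            }
            catch (Exception ex)
            {
                // Give up on the last attempt or on errors that a retry cannot fix.
                if (attempt >= maxAttempts || !isTransient(ex)) throw;

                Thread.Sleep(delay);
            }
        }
    }
}
```

For the SQL case described above, the isTransient predicate would check that the exception is a SqlException whose Number is -2 or 1205. Inside an aspect, the attempt counter should be a local variable as it is here, since a field on the attribute may be shared between calls to the same method.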