Testing with Pex

Some time ago I read about (it might have been on DotNetRocks) a little gem coming out of Microsoft’s research lab called Pex, a framework for automated white box testing in .NET. It’s still in its early days (despite having been around for more than 2 years now), but it packs a lot of potential judging by what I’ve seen of the demo materials and been able to use myself!

In short, Pex analyzes your methods, works out the boundary conditions and derives a series of tests that exercise each method with as high a coverage as possible. The download package also includes a lightweight framework for test stubs and detours (which basically allows you to replace any .NET method with your own delegate) called Stubs and Moles. It can also automatically generate a test project for you in MSTest or NUnit, though I haven’t tried the NUnit generation as it wasn’t supported the last time I played around with it.

I won’t go into detail on how to use it as there is a ton of documentation and demo material on its site and I have barely scratched the surface myself, but do check it out if you haven’t done so already!

DataContract Serialization by Reference using the IsReference Property

I came across this blog post the other day, which introduced me to a cool addition to the DataContract serializer, the ability to generate XML by reference rather than by value:

http://www.zamd.net/2008/05/20/DataContractSerializerAndIsReferenceProperty.aspx

Not much for me to add to it really, just read the blog to see how it works.
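
For a rough idea of what that looks like in practice, here is a minimal sketch. The Department and Employee types are made up for illustration and are not taken from the linked post; the only real API being shown is the IsReference property on DataContractAttribute.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical types for illustration only.
[DataContract(IsReference = true)]
public class Department
{
    [DataMember] public string Name { get; set; }
}

[DataContract]
public class Employee
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Department Department { get; set; }
}

public static class IsReferenceDemo
{
    // Serializes two employees that share one Department instance, then reads them back.
    public static Employee[] RoundTrip()
    {
        var dept = new Department { Name = "R&D" };
        var staff = new[]
        {
            new Employee { Name = "Ann", Department = dept },
            new Employee { Name = "Bob", Department = dept }
        };

        var serializer = new DataContractSerializer(typeof(Employee[]));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, staff);
            stream.Position = 0;
            return (Employee[])serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var copy = RoundTrip();
        // Because Department is marked IsReference = true, it is written once and
        // referenced afterwards, so both employees share one instance after reading.
        Console.WriteLine(ReferenceEquals(copy[0].Department, copy[1].Department));
    }
}
```

Without IsReference = true the Department would be written out twice by value, and the two employees would come back with separate copies.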

Dealing with Circular References in WCF

Are you using entity classes in your application, and is WCF complaining about the circular references between them? Well, I had the exact same problem not long ago, and I found this post on James Kovacs’s blog about circular references and how to get around them:

http://www.jameskovacs.com/blog/GoingAroundInCirclesWithWCF.aspx

The key things to note from this post are:

  1. WCF can handle circular references, but this is not switched on by default
  2. There is a boolean flag (preserveObjectReferences) in the constructor of the DataContractSerializer class which enables object references to be preserved
  3. We can tell WCF to switch this on by deriving from the DataContractSerializerOperationBehavior class (as shown in the blog post above)
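
To see the flag from point 2 in isolation, outside of WCF, here is a minimal sketch. The Order and OrderLine classes are hypothetical, and I’m assuming .NET 4.5 or later so that the flag can be set through DataContractSerializerSettings rather than the long constructor overload.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical entity classes with a parent <-> child cycle.
[DataContract]
public class Order
{
    [DataMember] public List<OrderLine> Lines { get; set; }
}

[DataContract]
public class OrderLine
{
    [DataMember] public Order Parent { get; set; }
}

public static class PreserveReferencesDemo
{
    // Builds an order whose single line points back at the order itself.
    public static Order BuildCyclicOrder()
    {
        var order = new Order { Lines = new List<OrderLine>() };
        order.Lines.Add(new OrderLine { Parent = order });
        return order;
    }

    public static Order RoundTrip(Order order)
    {
        // With the flag left off, WriteObject throws a SerializationException
        // because the object graph contains a cycle.
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        var serializer = new DataContractSerializer(typeof(Order), settings);
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, order);
            stream.Position = 0;
            return (Order)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var copy = RoundTrip(BuildCyclicOrder());
        // The cycle survives the round trip: the line's Parent is the deserialized order.
        Console.WriteLine(ReferenceEquals(copy.Lines[0].Parent, copy));
    }
}
```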

By this point you’re probably wondering why circular reference handling is not enabled by default. According to James Kovacs, it’s because WCF tries to stay interoperable by default:

Now why can’t WCF handle circular references out-of-the-box. The reason is that there is no industry-accepted, interoperable way of expressing anything but parent-child relationships in XML. You can use the ID/IDREF feature of XML or the key/keyref feature of XML Schema, but a lot of serializers don’t respect these attributes or handle them properly. So if you want to serialize circular references, you need to stray out of the realm of safe interoperability.

So here are the classes you need to create to extend DataContractSerializerOperationBehavior in order to preserve object references:

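// usings needed by the two classes in this section
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.Xml;

public class PreserveReferencesOperationBehavior : DataContractSerializerOperationBehavior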
{
     public PreserveReferencesOperationBehavior(OperationDescription operation) : base(operation)
     {
     }

     public PreserveReferencesOperationBehavior(
          OperationDescription operation, DataContractFormatAttribute dataContractFormatAttribute)
          : base(operation, dataContractFormatAttribute)
     {
     }

     public override XmlObjectSerializer CreateSerializer(
          Type type, XmlDictionaryString name, XmlDictionaryString ns, IList<Type> knownTypes)
     {
          return new DataContractSerializer(type, name, ns, knownTypes,
                                            0x7FFF /*maxItemsInObjectGraph*/,
                                            false/*ignoreExtensionDataObject*/,
                                            true/*preserveObjectReferences*/,
                                            null/*dataContractSurrogate*/);
     }
}

And the attribute to use on your operation contract:

public class PreserveReferencesAttribute : Attribute, IOperationBehavior
{
     public void AddBindingParameters(OperationDescription description,
                                      BindingParameterCollection parameters)
     {
     }

     public void ApplyClientBehavior(OperationDescription description, ClientOperation proxy)
     {
          IOperationBehavior innerBehavior = new PreserveReferencesOperationBehavior(description);
          innerBehavior.ApplyClientBehavior(description, proxy);
     }

     public void ApplyDispatchBehavior(OperationDescription description,
                                       DispatchOperation dispatch)
     {
          IOperationBehavior innerBehavior = new PreserveReferencesOperationBehavior(description);
          innerBehavior.ApplyDispatchBehavior(description, dispatch);
     }

     public void Validate(OperationDescription description)
     {
     }
}

which you apply like this:

[OperationContract]
[PreserveReferences]
MyClass RetrieveMyClass();

The C# Dispose Pattern

The Dispose pattern is something we’ve all seen before, and it’s so tried and tested that most of us (especially myself!) have been more than happy to apply it without question.

Whilst reading various blogs/articles I came across some differing opinions about this well-known pattern and started to question what I had taken for granted myself.

After some more research and a question on the goldmine of knowledge that is Stack Overflow, I have shortlisted a few points you should consider when implementing the standard C# dispose pattern:

  1. if your object doesn’t hold any IDisposable objects or unmanaged resources (DB connection, for example) then you don’t need to implement IDisposable or a finalizer at all
  2. if your object doesn’t hold any unmanaged resources then don’t implement a finalizer; the Garbage Collector won’t attempt to finalize your object (which has a performance hit) unless you have implemented one.
  3. don’t forget to call Dispose() on each of the IDisposable objects your object holds in the Dispose(bool) method.
  4. if your object holds unmanaged resources, clean them up in the finalizer by calling Dispose(false), so that the cleanup code already in Dispose(bool) isn’t written twice.

So for a simple class with no unmanaged resources and a collection of IDisposable objects, your class might look something like this:

public sealed class MyClass : IDisposable
{
     private readonly IList<MyObject> objects = new List<MyObject>();  // MyClass holds a list of objects
     private bool _disposed;   // boolean flag to stop us running the cleanup twice

     public void Dispose()
     {
          Dispose(true);
          GC.SuppressFinalize(this);
     }

     private void Dispose(bool disposing)
     {
          if (!_disposed)
          {
               // call Dispose on each item in the list
               if (disposing)
               {
                    foreach (var o in objects)
                    {
                         // check if MyObject implements IDisposable
                         var d = o as IDisposable;
                         if (d != null) d.Dispose();
                    }
               }
               _disposed = true;
          }
     }
}

This is fairly similar to the standard C# Dispose pattern. The main difference is the lack of a finalizer: remember, implementing a finalizer will impact the performance of your type, so don’t implement one unless you need it.
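
For the opposite case in point 4, where the class does own an unmanaged resource, a minimal sketch might look like the following. NativeBuffer is a hypothetical class, and I’m using a block of memory from Marshal.AllocHGlobal as a stand-in for whatever unmanaged resource you actually hold.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class that owns a block of unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _buffer;
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        Dispose(true);
        // Cleanup has already run, so the finalizer is no longer needed.
        GC.SuppressFinalize(this);
    }

    // Safety net for callers who forget to call Dispose().
    ~NativeBuffer()
    {
        Dispose(false);
    }

    private void Dispose(bool disposing)
    {
        if (_disposed) return;

        // Managed IDisposable fields would be disposed here, only when disposing is true.

        // Unmanaged cleanup lives in one place and runs on both paths.
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }

        _disposed = true;
    }

    public static void Main()
    {
        var buffer = new NativeBuffer(1024);
        buffer.Dispose();
        buffer.Dispose(); // a second call is a harmless no-op
        Console.WriteLine(buffer.IsDisposed);
    }
}
```

The finalizer contains no cleanup code of its own; it just delegates to Dispose(false), which is what keeps the cleanup logic from being written twice.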

Memory leak in ADO.NET DataSet

Over the last couple of years, there have been many discussions/debates on DataSet vs Collections, and there was a very good article in MSDN magazine on just that:

http://msdn.microsoft.com/en-gb/magazine/cc163751.aspx#S7

To add to the dark sides of DataSet, there is a little-known feature/bug/annoyance in the DataTable.Select() method: every time you call Select() it implicitly creates a new index without giving you any control over it, and the index is not cleared until you call DataTable.AcceptChanges().

If your application has to deal with a large amount of data and has to call Select() repeatedly without calling AcceptChanges(), then you might have a problem! Why? Consider these two factors:

1. the bigger the DataTable, the bigger the index, and if the index object is 85,000 bytes or larger it gets allocated on the Large Object Heap, which is only collected during generation 2 collections and is not compacted by default, so it takes much longer to clear than small objects

2. on a 32-bit Windows system, there’s a 2GB virtual address space limit for each process, and in practice you will usually get an OutOfMemoryException when your process has used around 1.2GB to 1.5GB of RAM

Combine them and it’s not hard to imagine a scenario where your process might actually run out of memory and crash out before it completes its task! (Believe me, it was a hard-learned lesson from my personal experience!)
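
If you want to see the Large Object Heap threshold for yourself, here is a small sketch. It relies on the fact that GC.GetGeneration reports generation 2 for objects on the Large Object Heap, and it assumes the default threshold has not been raised through configuration.

```csharp
using System;

public static class LohDemo
{
    // Allocates a byte array of the given length and asks the GC which generation it is in.
    public static int GenerationOfNewArray(int length)
    {
        var data = new byte[length];
        return GC.GetGeneration(data);
    }

    public static void Main()
    {
        // A small array starts life in generation 0 like any other new object.
        Console.WriteLine(GenerationOfNewArray(1000));

        // An array well over the threshold goes straight onto the Large Object Heap,
        // which the GC treats as part of generation 2.
        Console.WriteLine(GenerationOfNewArray(100000));
    }
}
```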

Solutions:

1. unless you actually need some of the features DataSet offers, such as the ability to keep multiple versions of the same row (Original, Current, etc.), you might be better off using POCOs (plain old CLR objects) instead, which are simple and lightweight, and you can use LINQ to Objects with i4o to get some impressive performance improvements. After I implemented this change, my application went from crashing out with OutOfMemoryException to maxing out at 70MB throughout its lifetime, and it finished in about 15% of the time it would have taken using DataSet.

2. if getting rid of DataSet altogether takes more time and effort than you can afford, then there’s a quick workaround: use a DataView and change its RowFilter string dynamically wherever you would otherwise call the Select() method.
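
The DataView workaround can be sketched roughly as follows. The table, its columns and the CountByRegion helper are all made up for illustration; the point is simply that one DataView is reused and only its RowFilter changes between lookups.

```csharp
using System;
using System.Data;

public static class DataViewDemo
{
    // Hypothetical sample data standing in for a much larger table.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Orders");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Region", typeof(string));
        table.Rows.Add(1, "North");
        table.Rows.Add(2, "South");
        table.Rows.Add(3, "North");
        return table;
    }

    // Reuses the same DataView and swaps the filter expression for each lookup,
    // instead of calling table.Select("Region = '...'") every time.
    public static int CountByRegion(DataView view, string region)
    {
        // Single quotes are doubled so the value is safe inside the filter expression.
        view.RowFilter = "Region = '" + region.Replace("'", "''") + "'";
        return view.Count;
    }

    public static void Main()
    {
        var view = new DataView(BuildTable());
        Console.WriteLine(CountByRegion(view, "North"));
        Console.WriteLine(CountByRegion(view, "South"));
    }
}
```

The matching rows are then available by iterating over the view (each item is a DataRowView), so most code that looped over the array returned by Select() only needs small changes.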

If you wish to learn more about garbage collection in general, you should read Maoni’s WebLog, which covers all things CLR Garbage Collector! She also wrote a nice article focused on the Large Object Heap back in June 2008 which is well worth a read:

http://msdn.microsoft.com/en-us/magazine/cc534993.aspx