For a while now I’ve been wondering why C#’s support for covariance does not cover value types, both in normal array covariance and in the covariance of generic type parameters introduced in C# 4:

    void Main()
    {
        int i = 0;
        string str = "hello world";

        TestMethod(i);       // legal
        TestMethod(str);     // legal
        TestMethod2(Enumerable.Empty<int>());           // illegal
        TestMethod2(Enumerable.Empty<string>());        // legal

        Console.WriteLine(i is object);                 // true
        Console.WriteLine(new int[0] is object[]);      // false
        Console.WriteLine(new string[0] is object[]);   // true
        Console.WriteLine(new uint[0] is int[]);        // false
    }

    public void TestMethod(object obj)
    {
        Console.WriteLine(obj);
    }

    public void TestMethod2(IEnumerable<object> objs)
    {
        Console.WriteLine(objs.Count());
    }

That was until I stumbled upon an old post by Eric Lippert on the topic of array covariance, which essentially points to a disagreement between the C# and CLI specifications on the rule for array covariance:

CLI

if X is assign­ment com­pat­i­ble with Y then X[] is assign­ment com­pat­i­ble with Y[]

C#

if X is a ref­er­ence type implic­itly con­vert­ible to ref­er­ence type Y then X[] is implic­itly con­vert­ible to Y[]

Whilst this doesn’t speak directly to the generics case with IEnumerable<out T>, one would expect the two to follow the same rule, and indeed variance conversions on generic type parameters only apply to reference types. Otherwise you would end up with different rules for int[] and IEnumerable<int>, even though (new int[0] is IEnumerable<int>) == true. Now that would be weird!
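The gap between the two rules can be sketched as follows. This is a minimal illustration rather than anything taken from either specification: CovarianceDemo and its helper methods are made-up names, and the results in the comments assume Microsoft’s CLR. Passing each array as object defers the type test to run time, where the CLI rule applies instead of the C# one.

```csharp
using System;
using System.Collections.Generic;

public static class CovarianceDemo
{
    // Taking the argument as object forces the type test to happen at run time.
    public static bool IsObjectArray(object value) { return value is object[]; }
    public static bool IsIntArray(object value) { return value is int[]; }
    public static bool IsObjectSequence(object value) { return value is IEnumerable<object>; }

    public static void Main()
    {
        Console.WriteLine(IsObjectArray(new string[0]));     // True: reference-type elements
        Console.WriteLine(IsObjectArray(new int[0]));        // False: value-type elements
        Console.WriteLine(IsIntArray(new uint[0]));          // True: the run-time check follows the CLI rule
        Console.WriteLine(IsObjectSequence(new string[0]));  // True
        Console.WriteLine(IsObjectSequence(new int[0]));     // False
    }
}
```

Note how the uint[] test gives the opposite answer to the compile-time `new uint[0] is int[]` check, which the C# compiler evaluates using its own, stricter rule.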

Ref­er­ences:

Eric Lip­pert – Why is covari­ance of value-typed arrays inconsistent?

Ques­tion on Stack­Over­flow – why does my C# array lose type sign infor­ma­tion when cast to object?


The other day I made an interesting observation about optional parameters in C# 4: if you specify a parameter as optional on an interface, you don’t actually have to make that parameter optional on any implementing class:

    public interface MyInterface
    {
        void TestMethod(bool flag = false);
    }

    public class MyClass : MyInterface
    {
        public void TestMethod(bool flag)
        {
            Console.WriteLine(flag);
        }
    }

Which means you won’t be able to use the imple­ment­ing class and the inter­face interchangeably:

    var obj = new MyClass();
    obj.TestMethod(); // compiler error

    var obj2 = new MyClass() as MyInterface;
    obj2.TestMethod(); // prints false

Naturally, this begs the question: why doesn’t the compiler force the implementation to match the default value specified by the contract?

Luckily, my subsequent question on SO was answered by Eric Lippert from the C# compiler team. Rather than waste time and effort repeating what’s already been said, I’d suggest you check out his answer; it makes the rationale clear, and shows why it would be impractical and inconvenient if the compiler did it differently.
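One consequence is worth seeing in code: default values are substituted by the compiler at the call site, based on the static type of the expression the method is called on. The sketch below is an illustration rather than an excerpt from that answer, and IGreeter, Greeter and the default strings are hypothetical names; it shows an interface and an implementation that both declare a default, but different ones.

```csharp
using System;

public interface IGreeter
{
    string Greet(string name = "interface default");
}

public class Greeter : IGreeter
{
    // The implementation is free to declare a different default (or none at all).
    public string Greet(string name = "class default")
    {
        return name;
    }
}

public static class OptionalParameterDemo
{
    public static void Main()
    {
        var greeter = new Greeter();
        IGreeter viaInterface = greeter;

        // Same object, same method body; only the static type of the receiver differs.
        Console.WriteLine(greeter.Greet());       // class default
        Console.WriteLine(viaInterface.Greet());  // interface default
    }
}
```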

Ref­er­ences:

My ques­tion on StackOverflow

Arti­cle on pos­i­tives and pit­falls of using optional parameters


LINQ is great, and it has transformed the way we interact with data in .NET to the point where many of us wonder how we managed without it all this time! There are, however, some pitfalls one can fall into, especially around delayed execution, which is easily the most misunderstood aspect of LINQ.

Despite feeling pretty competent with LINQ and having a decent understanding of how delayed execution works, I still find myself making the odd mistake, resulting in bugs that are hard to detect because they don’t tend to fail very loudly. So, as a reminder to myself and anyone who’s had a similar experience with these WTF bugs, here are some pitfalls to look out for.

Before we start, here’s a sim­ple Per­son class which we will reuse over and over:

    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }

Exhibit 1 – mod­i­fy­ing items in an IEnumerable

As you’re pass­ing IEnu­mer­able objects around in your code, adding pro­jec­tion to pro­jec­tions, this might be a pat­tern which you have wit­nessed before:

    public void ExhibitOne()
    {
        var names = new[] { "yan", "yinan" };     // define some names
        var persons = InitializePersons(names);

        foreach (var person in persons)
        {
            Console.WriteLine("Name: {0}, Age: {1}", person.Name, person.Age);
        }
    }

    public IEnumerable<Person> InitializePersons(IEnumerable<string> names)
    {
        // project the name strings to Person objects
        var persons = names.Select(n => new Person { Name = n });

        // set their age and return the Person objects
        SetAge(persons);

        return persons;
    }

    public void SetAge(IEnumerable<Person> persons)
    {
        // set each person's age to 28
        foreach (var person in persons)
        {
            person.Age = 28;
        }
    }

Any guess on the out­put of the ExhibitOne method?

    Name: yan, Age: 0
    Name: yinan, Age: 0

Not what you were expecting?

What happened here is that the loops in both ExhibitOne and SetAge iterate over a projection (from string to Person object) of the names array, and a projection is only evaluated, and its items created, at the point when it’s actually enumerated. As a result, each loop iterates over a brand new set of Person objects created by this line:

    // project the name strings to Person objects
    var persons = names.Select(n => new Person { Name = n });

which is why the work done by SetAge is not reflected in the ExhibitOne method.

Rem­edy

The fix here is simple: ‘materialize’ the Person objects in the InitializePersons method before passing them on to the SetAge method, so that when you modify the Person objects in the array you’re modifying the same objects that will be passed back to the ExhibitOne method:

    public IEnumerable<Person> InitializePersons(IEnumerable<string> names)
    {
        // project the name strings to Person objects
        var persons = names.Select(n => new Person { Name = n }).ToArray();

        // set their age and return the Person objects
        SetAge(persons);

        return persons;
    }

This outputs:

    Name: yan, Age: 28
    Name: yinan, Age: 28
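To see the re-evaluation directly, here is a minimal sketch that counts how many times a projection runs when the sequence is enumerated twice, with and without materializing it first. DeferredExecutionDemo and CountProjections are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Enumerates the sequence twice and returns how many times the projection ran.
    public static int CountProjections(bool materialize)
    {
        int projections = 0;
        var names = new[] { "yan", "yinan" };

        // Nothing runs yet: Select only describes the projection.
        IEnumerable<string> upper = names.Select(n => { projections++; return n.ToUpper(); });

        if (materialize)
        {
            upper = upper.ToArray();   // run the projection once, up front
        }

        foreach (var name in upper) { }   // first pass
        foreach (var name in upper) { }   // second pass

        return projections;
    }

    public static void Main()
    {
        Console.WriteLine(CountProjections(false));  // 4: both items are projected again on each pass
        Console.WriteLine(CountProjections(true));   // 2: projected once by ToArray, then reused
    }
}
```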

Ver­dict

Whilst SetAge will produce the expected result IF the persons parameter passed to it is an array or list of some sort, it leaves room for things to go wrong, and when they do it’s a pain to debug because the defect might manifest itself in all kinds of strange ways.

Therefore I would strongly suggest that any time you find yourself iterating through an IEnumerable collection and modifying its elements, you should substitute the IEnumerable type with either an array or a list.

Exhibit 2 – dan­ger­ous overloads

As you’re building libraries, it’s often useful to provide overloads that cater for a single item as well as for collections, for example:

    public void ExhibitTwo()
    {
        var persons = new[] {
                            new Person { Name = "Yan", Age = 28 },
                            new Person { Name = "Yinan", Age = 28 },
                        };
        Print(persons); // IEnumerable T?
        Print(persons.ToList()); // IEnumerable T?
    }

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

This outputs:

    Single T
    Single T

WTF? Yes, I hear you, but C#’s overload resolution algorithm determines that T is a better match than IEnumerable<T> for Person[] and List<Person> in these cases, because binding T to the argument’s exact type requires no implicit conversion, whereas the IEnumerable<T> overload needs one in order to treat the argument as an IEnumerable<Person>.

Rem­edy

    Print(persons.AsEnumerable());

This prints:

    IEnumerable T
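The same behaviour can be reproduced in a small self-contained sketch. OverloadDemo and Describe are hypothetical names; the methods return the label instead of printing it so that the results are easy to check. The last call shows another option, which is an addition to the remedy above rather than part of it: supplying the type argument explicitly leaves the single-item overload inapplicable, because an int[] is not an int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverloadDemo
{
    public static string Describe<T>(IEnumerable<T> items) { return "IEnumerable T"; }
    public static string Describe<T>(T item) { return "Single T"; }

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3 };

        // T is inferred as int[] for the second overload: an exact match, so it wins.
        Console.WriteLine(Describe(numbers));                 // Single T

        // The argument's static type is now IEnumerable<int>, so the first overload is the more specific match.
        Console.WriteLine(Describe(numbers.AsEnumerable()));  // IEnumerable T

        // With T fixed to int, only the IEnumerable<int> overload accepts an int[].
        Console.WriteLine(Describe<int>(numbers));            // IEnumerable T
    }
}
```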

Alternatively, you could add further overloads for arrays and lists:

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

    public void Print<T>(List<T> items)
    {
        Console.WriteLine("List T");
    }

    public void Print<T>(T[] items)
    {
        Console.WriteLine("Array T");
    }

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

then the right meth­ods will be called:

    public void ExhibitTwo()
    {
        var persons = new[] {
                            new Person { Name = "Yan", Age = 28 },
                            new Person { Name = "Yinan", Age = 28 },
                        };
        Print(persons);                 // prints Array T
        Print(persons.ToList());        // prints List T
        Print(persons.AsEnumerable());  // prints IEnumerable T
        Print(persons.First());         // prints Single T
    }

The obvi­ous down­side here is that you need to pro­vide an over­load for EVERY col­lec­tion type which is far from ideal!

Ver­dict

Obviously, expecting callers to always remember to cast their collection to an enumerable is unrealistic. In my opinion, it’s always better to leave as little room for confusion as possible, and therefore the approach I’d recommend is to rename the methods so that it’s clear to the caller which method they should be calling, e.g.:

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

    public void PrintBatch<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

or

    public void SinglePrint<T>(T item)
    {
        Console.WriteLine("Single T");
    }

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

Exhibit 3 – Enumerable.Except returns dis­tinct items

I have already cov­ered this topic ear­lier, read more about it here.

Ver­dict

One to remember: Enumerable.Except performs a set operation, and a set by definition contains only distinct items. Just keep this simple rule in mind the next time you use it.
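A quick sketch of what that means in practice. ExceptDemo and both helper methods are hypothetical names, and the duplicate-preserving filter is just one possible alternative, shown here as an assumption about what you might want instead rather than a recommendation from the earlier post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Set difference: duplicates in the first sequence are collapsed.
    public static int[] SetDifference(int[] first, int[] second)
    {
        return first.Except(second).ToArray();
    }

    // Filters out the excluded values but keeps duplicates from the first sequence.
    public static int[] FilterKeepingDuplicates(int[] first, int[] second)
    {
        var excluded = new HashSet<int>(second);
        return first.Where(x => !excluded.Contains(x)).ToArray();
    }

    public static void Main()
    {
        var first = new[] { 1, 1, 2, 2, 3 };
        var second = new[] { 3 };

        Console.WriteLine(string.Join(", ", SetDifference(first, second)));            // 1, 2
        Console.WriteLine(string.Join(", ", FilterKeepingDuplicates(first, second)));  // 1, 1, 2, 2
    }
}
```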

Exhibit 4 – Do you know your T from your object?

This one should be rare, but it’s interesting nonetheless:

    public void ExhibitFour()
    {
        var persons = new[] {
                            new Person { Name = "Yan", Age = 28 },
                            new Person { Name = "Yinan", Age = 28 },
                        };

        FirstMethod(persons);
    }

    public void FirstMethod(IEnumerable<object> items)
    {
        SecondMethod(items.ToList());
    }

    public void SecondMethod<T>(IList<T> items)
    {
        Console.WriteLine(typeof(T));
    }

What do you think this code prints? Object or Person? The answer is … Object (System.Object, to be precise)!

How is it that typeof(T) returns Object instead of Person? The reason is actually simple: items.ToList() returns a List<object>, because the compile-time type of items is IEnumerable<object> rather than IEnumerable<Person>, and T is inferred from compile-time types.

Rem­edy

Turn First­Method into a generic method and every­thing will flow naturally:

    public void FirstMethod<T>(IEnumerable<T> items)
    {
        SecondMethod(items.ToList());
    }
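One caveat, sketched below under hypothetical names (InferenceDemo, ElementType, ViaGeneric): T is still inferred from the compile-time type at each call site, so the generic version only helps if the caller’s own variable is typed precisely. The methods return the Type rather than printing it so that the result is easy to check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InferenceDemo
{
    public static Type ElementType<T>(IList<T> items) { return typeof(T); }

    // Generic pass-through: T flows from the caller's static type into ElementType.
    public static Type ViaGeneric<T>(IEnumerable<T> items)
    {
        return ElementType(items.ToList());
    }

    public static void Main()
    {
        var names = new[] { "yan", "yinan" };
        IEnumerable<object> asObjects = names;   // same array, less specific static type

        Console.WriteLine(ViaGeneric(names));      // System.String
        Console.WriteLine(ViaGeneric(asObjects));  // System.Object
    }
}
```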

Ref­er­ences:

Eric Lippert’s post on Gener­ics != Templates

My question on StackOverflow regarding overload resolution
