LINQ is great, and it has transformed the way we interact with data in .NET to the point where many of us wonder how we ever managed without it. There are, however, some pitfalls one can fall into, especially around deferred execution, which is easily the most misunderstood aspect of LINQ.

Despite feeling pretty competent with LINQ and having a decent understanding of how deferred execution works, I still find myself making the odd mistake, resulting in bugs that are hard to detect because they don’t tend to fail very loudly. So, as a reminder to myself and anyone who’s had a similar experience with these WTF bugs, here are some pitfalls to look out for.

Before we start, here’s a simple Person class which we will reuse over and over:

    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }

Exhibit 1 – modifying items in an IEnumerable

As you pass IEnumerable objects around in your code, layering projections on top of projections, this is a pattern you might have come across before:

    public void ExhibitOne()
    {
        var names = new[] { "yan", "yinan" };     // define some names
        var persons = InitializePersons(names);

        foreach (var person in persons)
        {
            Console.WriteLine("Name: {0}, Age: {1}", person.Name, person.Age);
        }
    }

    public IEnumerable<Person> InitializePersons(IEnumerable<string> names)
    {
        // project the name strings to Person objects
        var persons = names.Select(n => new Person { Name = n });

        // set their age and return the Person objects
        SetAge(persons);

        return persons;
    }

    public void SetAge(IEnumerable<Person> persons)
    {
        // set each person's age to 28
        foreach (var person in persons)
        {
            person.Age = 28;
        }
    }

Any guesses as to the output of the ExhibitOne method?

    Name: yan, Age: 0
    Name: yinan, Age: 0

Not what you were expecting?

What happened here is that the loops in both ExhibitOne and SetAge iterate over a projection (from string to Person object) of the names array, and that projection is only evaluated, and its items only created, at the point when they’re actually needed. Every enumeration runs the projection again, so each loop works through a brand new set of Person objects created by this line:

    // project the name strings to Person objects
    var persons = names.Select(n => new Person { Name = n });

which is why the work SetAge had done is not reflected in the ExhibitOne method.
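To make the re-evaluation visible, here is a minimal sketch that counts how many Person objects the projection creates when the same query is enumerated twice. The DeferredExecutionDemo class, the CountCreatedPersons method and the created counter are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class DeferredExecutionDemo
{
    // Hypothetical helper: returns how many Person objects the projection
    // created after the same query has been enumerated twice.
    public static int CountCreatedPersons()
    {
        var created = 0;
        var names = new[] { "yan", "yinan" };

        // Nothing runs here; Select only describes the projection.
        IEnumerable<Person> persons = names.Select(n =>
        {
            created++;
            return new Person { Name = n };
        });

        // First enumeration: creates two Person objects and sets their age.
        foreach (var person in persons) { person.Age = 28; }

        // Second enumeration: runs the projection again and creates two more,
        // so the ages set above are lost along with the first two objects.
        foreach (var person in persons) { }

        return created;
    }

    public static void Main()
    {
        Console.WriteLine(CountCreatedPersons()); // prints 4
    }
}
```

Two names enumerated twice gives four Person objects, none of which is shared between the two loops.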

Remedy

The fix here is simple: ‘materialize’ the Person objects in the InitializePersons method before passing them on to the SetAge method, so that when you modify the Person objects in the array you’re modifying the same objects that will be passed back to the ExhibitOne method:

    public IEnumerable<Person> InitializePersons(IEnumerable<string> names)
    {
        // project the name strings to Person objects
        var persons = names.Select(n => new Person { Name = n }).ToArray();

        // set their age and return the Person objects
        SetAge(persons);

        return persons;
    }

This outputs:

    Name: yan, Age: 28
    Name: yinan, Age: 28

Verdict

Whilst the original SetAge method will generate the expected result IF the persons parameter passed to it is an array or list of some sort, it does leave room for things to go wrong, and when they do it’s a pain to debug, as the defect might manifest itself in all kinds of strange ways.

Therefore I would strongly suggest that any time you find yourself iterating through an IEnumerable collection and modifying its elements, you should substitute the IEnumerable type with either an array or a list.
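As a rough sketch of that advice, the version below asks for IList<Person> instead of IEnumerable<Person>, so a lazy query no longer compiles as an argument and the caller has to materialize it first. MaterializedDemo is a made-up class name, and IList<Person> is just one reasonable choice (an array or List<Person> parameter would work equally well).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class MaterializedDemo
{
    // Taking IList<Person> means the elements already exist in memory,
    // so the changes made here are visible to the caller afterwards.
    public static void SetAge(IList<Person> persons)
    {
        foreach (var person in persons)
        {
            person.Age = 28;
        }
    }

    public static IList<Person> InitializePersons(IEnumerable<string> names)
    {
        // ToList() runs the projection exactly once and stores the results.
        var persons = names.Select(n => new Person { Name = n }).ToList();

        // Passing names.Select(...) directly would not compile here.
        SetAge(persons);

        return persons;
    }

    public static void Main()
    {
        foreach (var person in InitializePersons(new[] { "yan", "yinan" }))
        {
            Console.WriteLine("Name: {0}, Age: {1}", person.Name, person.Age);
        }
    }
}
```

The point of the design is that the mistake from Exhibit 1 becomes a compile error rather than a silent bug.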

Exhibit 2 – dangerous overloads

When you’re building libraries, it’s often useful to provide overloads that cater for a single item as well as collections, for example:

    public void ExhibitTwo()
    {
        var persons = new[] {
                          new Person { Name = "Yan", Age = 28 },
                          new Person { Name = "Yinan", Age = 28 },
                      };
        Print(persons);          // IEnumerable T?
        Print(persons.ToList()); // IEnumerable T?
    }

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

This outputs:

    Single T
    Single T

WTF? Yes, I hear you, but C#’s overload resolution algorithm determines that Person[] and List<Person> are better matched to T than to IEnumerable<T> in these cases, because with T inferred as Person[] or List<Person> the argument matches exactly and no implicit conversion is required (whereas the IEnumerable<T> overload needs one in order to make them match IEnumerable<Person>).
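To see what the compiler actually infers for T, here is a small sketch that returns the inferred type name instead of a fixed string. OverloadDemo and Describe are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverloadDemo
{
    // Collection overload: T is the element type.
    public static string Describe<T>(IEnumerable<T> items)
    {
        return "IEnumerable<" + typeof(T).Name + ">";
    }

    // Single-item overload: T is whatever the argument's compile-time type is.
    public static string Describe<T>(T item)
    {
        return "Single " + typeof(T).Name;
    }

    public static void Main()
    {
        var names = new[] { "yan", "yinan" };

        // T = string[] is an exact match, so the single-item overload wins.
        Console.WriteLine(Describe(names));                // prints Single String[]

        // T = List<string> is an exact match too.
        Console.WriteLine(Describe(names.ToList()));       // prints Single List`1

        // The compile-time type is IEnumerable<string>, so both overloads match
        // exactly and the more specific IEnumerable<T> parameter is preferred.
        Console.WriteLine(Describe(names.AsEnumerable())); // prints IEnumerable<String>
    }
}
```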

Remedy

Cast the collection to an enumerable at the call site:

    Print(persons.AsEnumerable());

This prints:

    IEnumerable T

Alternatively, if you were to add further overloads for arrays and lists:

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

    public void Print<T>(List<T> items)
    {
        Console.WriteLine("List T");
    }

    public void Print<T>(T[] items)
    {
        Console.WriteLine("Array T");
    }

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

then the right methods will be called:

    public void ExhibitTwo()
    {
        var persons = new[] {
                          new Person { Name = "Yan", Age = 28 },
                          new Person { Name = "Yinan", Age = 28 },
                      };
        Print(persons);                 // prints Array T
        Print(persons.ToList());        // prints List T
        Print(persons.AsEnumerable());  // prints IEnumerable T
        Print(persons.First());         // prints Single T
    }

The obvious downside here is that you need to provide an overload for EVERY collection type, which is far from ideal!

Verdict

Obviously, expecting callers to always remember to cast their collection as an enumerable is unrealistic. In my opinion, it’s always better to leave as little room for confusion as possible, and therefore the approach I’d recommend is:

  • rename the methods so that it’s clear to the caller which method they should be calling, e.g.

    public void Print<T>(T item)
    {
        Console.WriteLine("Single T");
    }

    public void PrintBatch<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

or

    public void SinglePrint<T>(T item)
    {
        Console.WriteLine("Single T");
    }

    public void Print<T>(IEnumerable<T> items)
    {
        Console.WriteLine("IEnumerable T");
    }

Exhibit 3 – Enumerable.Except returns distinct items

I have already covered this topic earlier; read more about it here.

Verdict

One to remember: Enumerable.Except performs a set operation, and a set by definition contains only distinct items, so any duplicates in the source sequence are dropped from the result. Just keep this simple rule in mind next time you use it.
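A quick sketch of the behaviour, using made-up sample numbers: the first method shows Except collapsing duplicates, and the second shows one possible alternative (my own suggestion, not something from the earlier post) for when you want to keep them. ExceptDemo and both method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Except treats both sequences as sets, so duplicates in the source
    // are collapsed as well as the excluded value being removed.
    public static int[] ExceptAsSet()
    {
        var numbers = new[] { 1, 1, 2, 2, 3 };
        return numbers.Except(new[] { 3 }).ToArray();
    }

    // Alternative sketch: filter with Where and a HashSet to remove the
    // excluded values while keeping duplicates in the source.
    public static int[] ExceptKeepingDuplicates()
    {
        var numbers = new[] { 1, 1, 2, 2, 3 };
        var excluded = new HashSet<int> { 3 };
        return numbers.Where(n => !excluded.Contains(n)).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ExceptAsSet()));             // prints 1, 2
        Console.WriteLine(string.Join(", ", ExceptKeepingDuplicates())); // prints 1, 1, 2, 2
    }
}
```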

Exhibit 4 – Do you know your T from your object?

This one should be rare, but it’s interesting nonetheless:

    public void ExhibitFour()
    {
        var persons = new[] {
                          new Person { Name = "Yan", Age = 28 },
                          new Person { Name = "Yinan", Age = 28 },
                      };

        FirstMethod(persons);
    }

    public void FirstMethod(IEnumerable<object> items)
    {
        SecondMethod(items.ToList());
    }

    public void SecondMethod<T>(IList<T> items)
    {
        Console.WriteLine(typeof(T));
    }

What do you think this code prints? Object or Person? The answer is … Object!

How is it that typeof(T) returns Object instead of Person? The reason is actually simple: items.ToList() returns List<object>, because the compile-time type of items is IEnumerable<object> rather than IEnumerable<Person>, and T is inferred from that compile-time type.
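The difference between the compile-time T and the runtime type of the objects can be sketched like this. StaticTypeDemo and ElementTypeSeenByCompiler are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class StaticTypeDemo
{
    // T is fixed by the compiler from the argument's compile-time type.
    public static Type ElementTypeSeenByCompiler<T>(IEnumerable<T> items)
    {
        return typeof(T);
    }

    public static void Main()
    {
        var persons = new[] { new Person { Name = "Yan", Age = 28 } };

        // Same array, but viewed through a less specific compile-time type.
        IEnumerable<object> asObjects = persons;

        Console.WriteLine(ElementTypeSeenByCompiler(persons));   // prints Person
        Console.WriteLine(ElementTypeSeenByCompiler(asObjects)); // prints System.Object

        // The objects themselves have not changed; only the static view has.
        Console.WriteLine(asObjects.First().GetType());          // prints Person
    }
}
```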

Remedy

Turn FirstMethod into a generic method and everything will flow naturally:

    public void FirstMethod<T>(IEnumerable<T> items)
    {
        SecondMethod(items.ToList());
    }

References:

Eric Lippert’s post on Generics != Templates

My question on StackOverflow regarding overload resolution


One Response to “LINQ — Some pitfalls to look out for”

  1. Alex says:

    Thanks for the interesting post, I learned a couple of things.
