I have heard a few peo­ple argue that when it comes to per­for­mance crit­i­cal code you should pre­fer arrays over other col­lec­tions (such as F#’s lists) as it ben­e­fits from sequen­tial reads (which is faster than seeks) and offers bet­ter mem­ory locality.

To test that the­ory some­what, I wanted to see if there is any dif­fer­ence in how fast you can iter­ate through an array ver­sus a list in F#, and how much faster you can map over an array com­pared to a list:

The result is a lit­tle sur­pris­ing, whilst I wasn’t expect­ing there to be a mas­sive dif­fer­ence in the iter­at­ing through the two types of col­lec­tions, I didn’t think map­ping over a list would be quite as slow in com­par­i­son. I knew that con­struct­ing a list is much heav­ier than con­struct­ing an array, but I didn’t think it’d take 22x as long in this case.

What was even more sur­pris­ing was how much slower the Seq.iter and Seq.map func­tions are com­pared to the Array and List mod­ule equiv­a­lents! This is, accord­ing to John Palmer:

Once you call in to Seq you lose the type infor­ma­tion — mov­ing to the next ele­ment in the list requires a call to IEnumerator.MoveNext. Com­pare to for Array you just incre­ment an index and for List you can just deref­er­ence a pointer. Essen­tially, you are get­ting an extra func­tion call for each ele­ment in the list.

The con­ver­sions back to List and Array also slow the code down for sim­i­lar reasons

Update 2012/06/04:

As a work around, you COULD shadow the Seq mod­ule with iter and map func­tions that adds sim­ple type check­ing and in the case of an array or list sim­ply call the cor­re­spond­ing func­tion in the Array or List mod­ule instead:

Whilst this approach will work to a cer­tain extend, you should be care­ful with which func­tions you shadow. For instance, it’s not safe to shadow Seq.map because it can be used in con­junc­tion with other func­tions such as Seq.takeWhile or Seq.take. In the base imple­men­ta­tion, a line of code such as:

arr |> Seq.map incr |> Seq.take 3

will not map over every ele­ment in the source array.

With the shad­owed ver­sion (see above) of Seq.map, how­ever, this would first cre­ate a new array by apply­ing the map­per func­tion against every ele­ment in the source array before dis­card­ing all but the first three ele­ments in the new array. This, as you can imag­ine, is far less effi­cient and requires much more mem­ory space (for the new array) and defeats the pur­pose of using Seq mod­ule func­tions in most cases.

Share

4 Responses to “F# – Speed test iter and map operations with array vs list”

  1. Gert-Jan says:

    Hi,

    I don’t think you’re being fair on the Seq in com­bi­na­tion with the List. Since the List is immutable, there’s no need to mate­ri­al­ize the List in the last step “|> Seq.ToList”. You can just pass around the lazy Seq over the Immutable list and be sure that you will always get the same result.
    The Array does not have that lux­ury, but this may not have to be an issue in most cases if there’s no con­cur­rency involved.

  2. theburningmonk says:

    @Gert-Jan — with­out the last Seq.toList step you will get a sequence back as opposed to a list, which is not the same as ‘lst |> List.map incr’ which returns a new list where each ele­ment is mapped from the source list.

    The test was purely intended to show if there’s any dif­fer­ence in how fast sim­ple iter and map oper­a­tions are per­formed on array and lists, and when using the Seq.iter and Seq.map functions.

  3. ray says:

    Nice tests! But your con­clu­sion is misleading.

    Your loop size is huge: 10,000,000! So the actual over­head for a sin­gle iter­a­tion is 10 mil­lion times less.
    If you take a rea­son­able loop size, say 1000, then your tim­ings are 10,000 times less.

    So, your tests actu­ally show, all things con­sid­ered, the rel­a­tive over­head for Seq, Loop, and Array are negligible.

  4. theburningmonk says:

    @ray — it’s more about rel­a­tive cost, actual impact depends upon usage in your appli­ca­tion of course. The dif­fer­ence is neg­li­gi­ble if you don’t have to deal with large col­lec­tions or many many small col­lec­tions on a reg­u­lar basis.

Leave a Reply