Yan Cui
In F#, you have the choice of using a struct or a record as a lightweight container for data. The similarities between the two are striking – both are immutable by default, neither can be inherited, and they both offer structural equality semantics by default too!
However, there’s a key difference between them: their performance characteristics.
When you’re dealing with tons of small objects, structs offer significant performance benefits because, as value types, they are stored inline (on the stack for locals, or directly inside the array or object that contains them), which avoids a separate heap allocation for each instance and leaves the garbage collector far less to track.
Records, on the other hand, are reference types and are therefore heap allocated (with only a pointer held on the stack or in the containing array), which is slower, and they require the extra step of garbage collection when they’re no longer referenced.
As a simple test, given these two identical types, one as a struct and one as a record:
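The original type definitions are not reproduced here. As a minimal sketch, assume a hypothetical pair of two-field types (the names `PointStruct` and `PointRecord` and their `int` fields are illustrative, not necessarily the ones used in the test):

```fsharp
// Hypothetical struct: a value type holding two ints.
[<Struct>]
type PointStruct(x : int, y : int) =
    member this.X = x
    member this.Y = y

// Hypothetical record with the same data: a reference type.
type PointRecord = { X : int; Y : int }
```

Both types get structural equality by default, so two instances with the same field values compare as equal.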
The snippet below constructs two arrays each with 10 million items, one with struct instances and the other with record instances:
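The original test code is not reproduced here either. A rough sketch of such a test, assuming the hypothetical `PointStruct` and `PointRecord` types and a small `time` helper built on `System.Diagnostics.Stopwatch` (all of these names are assumptions for illustration), might look like this:

```fsharp
open System.Diagnostics

// Hypothetical types, defined here so the snippet runs on its own.
[<Struct>]
type PointStruct(x : int, y : int) =
    member this.X = x
    member this.Y = y

type PointRecord = { X : int; Y : int }

// Runs f once and prints the elapsed wall-clock time (any GC pauses included).
let time label f =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    printfn "%s: %.3f s" label sw.Elapsed.TotalSeconds
    result

// Builds one array of structs and one array of records, each with n items.
let run n =
    let structs = time "structs" (fun () -> Array.init n (fun i -> PointStruct(i, i)))
    let records = time "records" (fun () -> Array.init n (fun i -> { X = i; Y = i }))
    structs, records

// 10 million items per array, as in the test described here.
run 10000000 |> ignore
```

The absolute timings will depend on your machine and runtime version, so treat any numbers you get as relative rather than as a reproduction of the figures quoted here.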
Over three runs, the structs array took an average of 0.146 seconds to construct whilst the records array took an average of 2.919 seconds!
Parting Thoughts…
Although this test shows that creating large numbers of records takes significantly longer than creating structs, in practice, would you really care if generating 10 million objects takes 3 seconds instead of 0.1? Is that likely to be the source of your performance issues?
All in all, the performance gain you get by using structs over records is negligible in most cases, because you won’t be generating large numbers of these objects frequently enough to notice the difference. Also, we haven’t even covered the cost of copying structs when passing them as parameters (value types are passed by value rather than by reference), which, if you’re not careful, can have a detrimental effect on the overall performance of your application.
Also, records have two very useful features for working with F# which structs don’t:
- type inference can infer a record’s type, no need for type annotation
- records can be used as part of standard pattern matching, no need for when guards
Both are big pluses in my book and worth considering when you’re choosing between records and structs.
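To illustrate the two points above, here is a small sketch using a hypothetical record and struct pair (the type names, fields and the `describe` functions are made up for this example):

```fsharp
type PointRecord = { X : int; Y : int }

[<Struct>]
type PointStruct(x : int, y : int) =
    member this.X = x
    member this.Y = y

// No annotation needed: the field labels X and Y let the compiler infer
// that p is a PointRecord, and the fields can be matched on directly.
let describeRecord p =
    match p with
    | { X = 0; Y = 0 } -> "origin"
    | { X = 0 } -> "on the Y axis"
    | { Y = 0 } -> "on the X axis"
    | _ -> "elsewhere"

// The struct parameter needs a type annotation, and matching on its
// properties falls back to when guards.
let describeStruct (p : PointStruct) =
    match p with
    | _ when p.X = 0 && p.Y = 0 -> "origin"
    | _ when p.X = 0 -> "on the Y axis"
    | _ when p.Y = 0 -> "on the X axis"
    | _ -> "elsewhere"
```

Both functions give the same answers; the record version just gets there with less ceremony.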
Hi,
there are some minor issues here. First, creating the objects on the heap is very cheap – it’s the GC step that’s causing the pain, and here I have to disagree with your conclusion.
3s vs. 0.1s is not a big deal? Really?
IMHO this is a very big deal. 10 million objects might be a lot if you’re thinking of big business objects, but from time to time you process *smaller* data like points or small values. These are the places where F# really shines – processing large quantities of data – and of course 3s vs. 0.1s matters a LOT in those cases.
NB: I tried it using the latest F# 4.1 feature that lets you tag a record with [<Struct>]. The ratio is still 0.15 (struct record, the same as a plain struct like yours) vs 1.2 (default record). Was there any improvement for reference types?
With struct records, you get the best of both worlds. Also, you forgot to mention that using a struct means a copy each time you pass it to a function… so in this case, if you have 10 copies of the same record, you get the same performance in the end. Besides, the test does not take into account that garbage collection is probably not included in the time measurement.
Hi Clement, since this post was written in 2011, there could have been improvements to ref types in .Net/.NetCore. As I haven’t used either for about 2 years now, I’m not well versed in what has changed in the latest iterations of the runtime. Matt Warren has been writing some pretty great stuff on .Net performance, so he might have some insights. You should check out his blog if you haven’t already – http://mattwarren.org
I didn’t forget to mention the pass-by-value nature of structs, I just assumed you know it :-P This simple test is only meant to show that there’s a measurable difference in the time to allocate a large no. of structs vs records (ref type, struct records weren’t a thing back then), as a reminder that whilst semantically a record is similar to a struct, it has the performance characteristics of a ref type. Obviously the introduction of struct records changes that!
As for GC time, it’ll be included in the test (wish I could have designed it out, but hey..) because of the no. of objects that are allocated – e.g. 10 million ref types, each with 10 bytes of data + 8 bytes of object headers and method table pointers (see https://theburningmonk.com/2015/07/smallest-net-ref-type-is-12-bytes-or-why-you-should-consider-using-value-types), which is 18 bytes * 10,000,000 ~= 170MB, which is way bigger than the size of Gen 1 (which is usually sized proportionally to your L3 cache).
In fact, several Gen 1, and possibly Gen 2 collections would have happened for both (you can probably see in the F# interactive if you turn on the #time directive), since we’re allocating into an array, which is a ref type, so is heap allocated and falls within the control of the GC. The difference being structs are stored as a tightly packed sequence of bytes, whereas records are stored as word-sized pointers.