Yan Cui
Whilst searching for an elegant way to apply string interning across a large number of classes (hundreds of them), it dawned on me that I could achieve this with ease using PostSharp’s LocationInterceptionAspect. All I needed was something along the lines of:
You can apply this attribute to a class or even a whole assembly, and it will ensure that every string assigned to a string field or property is interned, including string properties and fields defined by subclasses, which is exactly what I was after.
For example, take this trivial piece of code:
If you inspect the compiled code for the Base class in ILSpy you will see something along the lines of:
Notice how the setter for BaseStringProperty has been modified to invoke the OnSetValue method defined in our aspect above instead of assigning the value directly. In this case, OnSetValue calls String.Intern to retrieve a reference to the interned instance of the string and sets the property to that reference.
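To make the effect concrete without PostSharp, here is a hand-written sketch of roughly what an interning setter amounts to. The Player class, its Name property and the InternedSetterDemo helper are hypothetical names used for illustration only, and this is not the code PostSharp generates. The null check is an added assumption, because String.Intern throws an ArgumentNullException when given null.

```csharp
using System;

// Hypothetical class: a property whose setter interns the incoming value by hand.
public class Player
{
    private string name;

    public string Name
    {
        get { return name; }
        // Assumption of this sketch: null is passed through untouched,
        // since String.Intern(null) would throw.
        set { name = value == null ? null : string.Intern(value); }
    }
}

public static class InternedSetterDemo
{
    // Builds two equal strings at runtime (two distinct objects), assigns them
    // to two Player instances, and reports whether both properties now refer
    // to the same interned string instance.
    public static bool SharesOneInstance()
    {
        var first = new Player { Name = new string(new[] { 'm', 'o', 'n', 'k' }) };
        var second = new Player { Name = new string(new[] { 'm', 'o', 'n', 'k' }) };
        return ReferenceEquals(first.Name, second.Name);
    }
}
```

Writing this by hand for every string property across hundreds of classes is exactly the grunt work the aspect removes.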
For more details on PostSharp’s interception aspects, I recommend reading Dustin Davis’s excellent posts on the topic:
PostSharp Principles: Day 7 Interception Aspects – Part 1
PostSharp Principles: Day 8 Interception Aspects – Part 2
Because we set the multicast inheritance behaviour to multicast the attribute to members of the children of the original element, the string properties defined in the A and B classes are subject to the same string interning treatment, without us having to apply the InternAttribute to them explicitly:
F# Compatible
What’s more, this attribute also works with F# types, including record and discriminated union types. Take for instance:
If you look at the generated C# code for the discriminated union type, the internal MyDuType.CaseB type would look something like the following:
Notice how the setter methods of the two internal item1 and item2 properties have been modified in much the same way as in the C# examples above. The public Item1 and Item2 properties are read-only and get their values from the internal properties instead.
Indeed, when a new instance of the CaseB type is constructed, it is the internal properties whose values are initialized:
Finally, let’s look at the record type, which interestingly also defines a non-string field:
Because we have specified that the InternAttribute should only be applied to properties or fields of type string (via the CompileTimeValidate method, which is executed as part of the post-compilation weaving process rather than at runtime), the internal representation of the Age field is left unaltered.
The Name field, however, being of string type, was subject to the same transformation as all our other examples.
I hope this little attribute proves useful to you too; it has certainly saved me from an unbearable amount of grunt work!
What would be the purpose of implementing custom string interning? I believe this is the default behaviour in new .NET versions. Please forgive me if this is a naive question.
Not a criticism, just wondering! Great article.
@Sprague:
No worries, I welcome constructive criticism :-)
The CLR interns string literals only, not dynamically created strings. For instance, if you were to read a bunch of data from XML/JSON/etc. the string data that you come across will not be interned.
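A minimal sketch of that difference, using hypothetical helper names (LiteralVsDynamic and its methods are not from the original post). It builds the same text at runtime twice and compares references before and after calling String.Intern:

```csharp
using System;
using System.Text;

public static class LiteralVsDynamic
{
    // Builds "hello" at runtime so that the compiler cannot fold it into a literal.
    public static string BuildAtRuntime()
    {
        return new StringBuilder().Append("hel").Append("lo").ToString();
    }

    // Two identical literals in the same assembly refer to one string instance.
    public static bool LiteralsShareInstance()
    {
        string a = "hello";
        string b = "hello";
        return ReferenceEquals(a, b);
    }

    // Two strings built at runtime are separate objects, even with equal content.
    public static bool RuntimeCopiesShareInstance()
    {
        return ReferenceEquals(BuildAtRuntime(), BuildAtRuntime());
    }

    // Passing both copies through String.Intern yields a single shared instance.
    public static bool InternedCopiesShareInstance()
    {
        return ReferenceEquals(string.Intern(BuildAtRuntime()),
                               string.Intern(BuildAtRuntime()));
    }
}
```

Here LiteralsShareInstance and InternedCopiesShareInstance return true, while RuntimeCopiesShareInstance returns false. Strings parsed from XML or JSON behave like the runtime-built copies.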
When you’re dealing with large numbers of strings in your application and many of them are duplicates (keys to reference other domain objects with, for instance) then this can help reduce your memory footprint.
Also, if you find that many of these duplicates live beyond Gen0 collections then it could also have a positive impact on the GC with less data to collect or move during compaction.
@theburningmonk
This makes sense, as string interning is a feature of the C# compiler. I haven’t been able to find any information online saying that string interning isn’t available in the CLI in general, but it makes sense… I wonder how many people I’ve given bad advice to :X
@sprague
I remember reading about the string interning behaviour on Eric Lippert’s blog quite some time ago, and the MSDN documentation on the String.Intern method touched on it too.
Perhaps I should put together a follow-up post with examples before and after applying the attribute (you can check if a string is in the pool by using String.IsInterned and checking its result, which would be null if the string is not interned) and maybe some profiling results when dealing with reasonably large data sets with lots of duplicated strings.
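As a rough sketch of that check, assuming hypothetical helper names (InternPoolCheck, NewUniqueString and IsInPool are made up for this example), String.IsInterned can be wrapped like so:

```csharp
using System;

public static class InternPoolCheck
{
    // A fresh GUID gives a runtime-created string that has not been interned yet.
    public static string NewUniqueString()
    {
        return Guid.NewGuid().ToString();
    }

    // String.IsInterned returns null when the value is not in the intern pool,
    // and the pooled reference otherwise.
    public static bool IsInPool(string value)
    {
        return string.IsInterned(value) != null;
    }
}
```

For a string s returned by NewUniqueString, IsInPool(s) is false until String.Intern(s) has been called and true afterwards. The same check could be run against a property value before and after applying the attribute.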