Oct 19, 2015

Roslyn Adventures: Optimizing StringBuilder string interpolation

C# string interpolation is awesome. But we can make it even more awesome by making it less wasteful. Consider this line of code:
Currently Roslyn emits the following IL for this call:
You see what it does there? Let's translate that back to C#, to make it more obvious:
It allocates another string and possibly even another StringBuilder. The thing is, you wouldn't be using a StringBuilder if you weren't concerned about allocations. My initial idea of how to solve this was FormattableString. So basically something like this:
Unfortunately overload resolution doesn't work in favor of the method accepting the FormattableString whenever there is an overload for the string parameter. So the example above would write STRING: a test 42 to the console. If we could make the overload resolution smarter (e.g. make the compiler create FormattableString instances wherever there's an matching overload accepting FormattableString instead of a string argument) the solution would be as easy as creating Append/AppendLine(FormattableString) extension methods for StringBuilder.

Roslyn to the rescue

Luckily we can use Roslyn to do some metaprogramming magic and work around this. Basically we need to rewrite StringBuilder.Append/AppendLine calls to StringBuilder.AppendFormat:
... in IL speak:
Let's create a reusable Roslyn-based solution that knows how to do that optimization.
We can use the class above to rewrite each SyntaxTree in a CSharpCompilation:
Originally I wanted the optimization to do the same for TextWriter.Write/WriteLine and Console.Write/WriteLine calls, but it turns out that they actually call string.Format internally anyway. So, add that to the list of possible optimizations.

Conclusion

As you've seen by yourself, it's not particulairly difficult to optimize the back and forth between string interpolation and StringBuilder. I really think optimizations like that should be in Roslyn. As long as that's not implemented though, using string interpolation to build big string fragments (like for example HTML...) might be a bit more expensive than you might think.

Stay tuned for my next blog post, where I'll show you how to plug the optimization into the metaprogramming infrastructure of DNX and/or StackExchange.Precompilation, if you're not ready to migrate to vNext yet

1 comment:

fxfighter said...

This is pretty cool, this would be a good candidate for a compiler warning & code fix since the ease of implementing string interpolation anywhere will occasional lead to using it in the wrong places.