Jan 24, 2016

Roslyn Adventures: Metaprogramming with StackExchange.Precompilation

In this article we’ll wrap the StringBuilderInterpolationOptimizer from my previous article into a StackExchange.Precompilation module, and use it to optimize an existing C# project.

First things first, we start off with an empty console project, to which we add some sample StringBuilder.Append calls, passing interpolated strings as parameters:
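A minimal sketch of what such sample calls could look like. The class name, method name, and strings are made up for illustration and are not taken from the demo project:

```csharp
using System;
using System.Text;

public static class Demo
{
    // Hypothetical helper: builds a string by passing interpolated
    // strings straight to StringBuilder.Append/AppendLine.
    public static string Build(string name, int count)
    {
        var sb = new StringBuilder();
        sb.Append($"Hello {name}, ");
        sb.AppendLine($"you have {count} new messages.");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Build("World", 42));
    }
}
```

These are exactly the kind of calls the optimizer is meant to pick up: an interpolated string used directly as the argument of Append or AppendLine.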


Now, let's add the StackExchange.Precompilation.Build package to our demo project, and make sure it still builds. This package replaces csc.exe with a custom, hookable compiler, which is basically a wrapper around the Roslyn API. Unfortunately, Roslyn doesn't expose any hooks for adding custom processing from the command line (yet). Luckily the API lets you do all kinds of crazy stuff; it even comes with a parser for the csc.exe command line arguments...

After this package is successfully installed, you should see the precompiler executable being called instead of csc.exe in the build output window:


Next, we need to build the compilation module that can be loaded into the compilation. So we create an empty class library, and install the StackExchange.Precompilation.Metaprogramming package (contains all the required base types).

After we’ve pulled in the SyntaxTree rewriter from the last article, we can implement the hook. The interface we need to implement is ICompileModule. We want to implement the BeforeCompilation method, which allows us to inspect and modify our compilation before any IL is emitted. Inside this method we have full access to all syntax trees in the compilation and their semantic model. Note that CSharpCompilation is immutable, so we have to replace the one in the compilation context with the modified one:
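A rough sketch of such a module follows. It rests on assumptions: that the hook is declared as BeforeCompile(BeforeCompileContext), in the style of DNX's ICompileModule, that the interface also requires an AfterCompile method, and that the context exposes a settable Compilation property. Check these against the package version you have installed. The Optimize method is a placeholder standing in for the rewriter from the last article.

```csharp
using Microsoft.CodeAnalysis.CSharp;
using StackExchange.Precompilation;

// Sketch only: member names below are assumptions about the package's API.
public class StringBuilderInterpolationModule : ICompileModule
{
    public void BeforeCompile(BeforeCompileContext context)
    {
        // CSharpCompilation is immutable, so the optimized copy has to be
        // assigned back to the context for the compiler to use it.
        context.Compilation = Optimize(context.Compilation);
    }

    public void AfterCompile(AfterCompileContext context)
    {
        // Nothing to do once the IL has been emitted.
    }

    // Placeholder: the real implementation would run the syntax rewriter
    // over every tree and return the modified compilation.
    private static CSharpCompilation Optimize(CSharpCompilation compilation)
    {
        return compilation;
    }
}
```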


All that’s left to do now is to wire up our shiny new compile module, so it's picked up by the compiler. This is done in the app.config (or web.config for ASP.NET projects) file of the console application:


The modified build script forwards the project’s config file to the compiler (this is especially useful when it needs to actually precompile Razor views, but that is out of scope for this post). Also, be sure to add the precompilation module as a project reference to the console application.

After everything is set up correctly, a quick build and decompilation can verify that the emitted assembly contains optimized string builder calls:


The sources for this demo, along with commits reflecting the described steps, can be found on GitHub.

P.S. a useful trick for debugging your compilation module is to add a Debugger.Launch() call to attach your current VS instance to the currently executing compilation.
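One way to keep that call from firing on every build is to guard it. The sketch below is an illustration, not part of the demo: the helper name and the PRECOMPILER_DEBUG environment variable are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class CompileModuleDebugging
{
    // Call this at the top of the compile module's hook.
    // PRECOMPILER_DEBUG is a made-up opt-in switch for this sketch.
    public static bool LaunchDebuggerIfRequested()
    {
        var requested = Environment.GetEnvironmentVariable("PRECOMPILER_DEBUG") == "1";
        if (!requested || Debugger.IsAttached)
        {
            // Not asked for, or a debugger is already attached.
            return false;
        }

        // Asks the system to attach a debugger to the compiler process.
        return Debugger.Launch();
    }
}
```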

Oct 19, 2015

Roslyn Adventures: Optimizing StringBuilder string interpolation

C# string interpolation is awesome. But we can make it even more awesome by making it less wasteful. Consider this line of code:
Currently Roslyn emits the following IL for this call:
You see what it does there? Let's translate that back to C#, to make it more obvious:
It allocates another string and possibly even another StringBuilder. The thing is, you wouldn't be using a StringBuilder if you weren't concerned about allocations. My initial idea of how to solve this was FormattableString. So basically something like this:
Unfortunately overload resolution doesn't work in favor of the method accepting the FormattableString whenever there is an overload for the string parameter. So the example above would write STRING: a test 42 to the console. If we could make the overload resolution smarter (e.g. make the compiler create FormattableString instances wherever there's a matching overload accepting FormattableString instead of a string argument) the solution would be as easy as creating Append/AppendLine(FormattableString) extension methods for StringBuilder.
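The overload resolution behaviour described above can be reproduced with a small sketch. The Describe methods are hypothetical stand-ins, not the code from the original example:

```csharp
using System;

public static class OverloadDemo
{
    public static string Describe(string s)
    {
        return "STRING: " + s;
    }

    public static string Describe(FormattableString s)
    {
        // Format is the composite format string, before arguments are applied.
        return "FORMATTABLE: " + s.Format;
    }

    public static void Main()
    {
        // An interpolated string passed directly binds to the string overload.
        Console.WriteLine(Describe($"a test {42}"));   // STRING: a test 42

        // Only an explicitly typed FormattableString reaches the other one.
        FormattableString formattable = $"a test {42}";
        Console.WriteLine(Describe(formattable));      // FORMATTABLE: a test {0}
    }
}
```

The second call shows what we would like to get hold of: the format string and its arguments, still separate.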

Roslyn to the rescue

Luckily we can use Roslyn to do some metaprogramming magic and work around this. Basically we need to rewrite StringBuilder.Append/AppendLine calls to StringBuilder.AppendFormat:
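At the source level, the intended rewrite could look roughly like this. It is a hand-written before/after sketch with made-up method names, not output of the actual optimizer:

```csharp
using System;
using System.Text;

public static class RewriteDemo
{
    // What the developer writes.
    public static string Before(int value)
    {
        var sb = new StringBuilder();
        sb.Append($"a test {value}");
        sb.AppendLine($" and a line {value:X}");
        return sb.ToString();
    }

    // What the rewritten code would be equivalent to: the format string and
    // its arguments go straight to AppendFormat, with no intermediate string.
    public static string After(int value)
    {
        var sb = new StringBuilder();
        sb.AppendFormat("a test {0}", value);
        sb.AppendFormat(" and a line {0:X}", value).AppendLine();
        return sb.ToString();
    }
}
```

Both methods produce the same text. AppendLine has no format-taking overload, so the sketch chains AppendFormat with a parameterless AppendLine.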
... in IL speak:
Let's create a reusable Roslyn-based solution that knows how to do that optimization.
We can use the class above to rewrite each SyntaxTree in a CSharpCompilation:
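A minimal sketch of such a loop, assuming the rewriter is a CSharpSyntaxRewriter that needs the semantic model of the tree it visits. The RewriteAll name and the createRewriter factory are hypothetical, introduced only for this illustration:

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class CompilationRewriting
{
    // createRewriter receives the semantic model for one tree and returns
    // the rewriter to run over that tree.
    public static CSharpCompilation RewriteAll(
        CSharpCompilation original,
        Func<SemanticModel, CSharpSyntaxRewriter> createRewriter)
    {
        var result = original;
        foreach (var tree in original.SyntaxTrees)
        {
            // Query semantics against the original compilation, so that
            // trees replaced earlier cannot affect the model.
            var model = original.GetSemanticModel(tree);
            var oldRoot = tree.GetRoot();
            var newRoot = createRewriter(model).Visit(oldRoot);

            // A rewriter returns the same node instance when nothing changed.
            if (newRoot != oldRoot)
            {
                var newTree = tree.WithRootAndOptions(newRoot, tree.Options);
                // Compilations are immutable: this returns a new one.
                result = result.ReplaceSyntaxTree(tree, newTree);
            }
        }
        return result;
    }
}
```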
Originally I wanted the optimization to do the same for TextWriter.Write/WriteLine and Console.Write/WriteLine calls, but it turns out that they actually call string.Format internally anyway. So, add that to the list of possible optimizations.


As you've seen for yourself, it's not particularly difficult to optimize the back and forth between string interpolation and StringBuilder. I really think optimizations like that should be in Roslyn. As long as that's not implemented though, using string interpolation to build big string fragments (like for example HTML...) might be a bit more expensive than you might think.

Stay tuned for my next blog post, where I'll show you how to plug the optimization into the metaprogramming infrastructure of DNX and/or StackExchange.Precompilation, if you're not ready to migrate to vNext yet.

Nov 14, 2014

Localization Adventures: Walk the #line


With Roslyn, C# developers now have a powerful tool which makes modifying the source code a breeze. At Stack Exchange we’ve invented our own little set of extensions to C# for localization purposes, previously described on Matt Jibson’s blog. The project originally supported ASP.NET MVC views only, but we’ve expanded it to C# source files (.cs) because our projects have strings that simply don't fit into MVC views and rendering a view for each string is overkill. In this blog series I’ll try to highlight some of the fun things I've learned on this .cshtml -> .cs journey.

With code rewriting being the biggest problem, one can quickly forget to support other things in the development workflow, like debugging. Take ASP.NET MVC’s .cshtml views for example. Although they are compiled to C# source code (you get CodeDOM from Razor, ASP.NET vNext will change that) you can still step through breakpoints in the .cshtml file. In C# we have the #line preprocessor directive that enables us to do this. As trivial as it might seem, it’s not so easy to get right. The fun part is having more than one statement on a single line. Let’s say we have this simple program fragment that we need to rewrite:

… and the whole thing gets rewritten into:

The question is, how to place the #line directives to hit a breakpoint on the baz = Baz() statement? A clue lies in the breakpoint itself, which Visual Studio shows as Program.Original.cs, line 7 character 67. As you can see though, there is no way to specify character 67 using the #line directive. To solve the problem we need to be aware of two simple facts the MSDN page forgets to mention:
  • the same line number can be used by multiple #line directives
  • whitespace is important when using the #line directive
Which brings us to the solution:
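Here is a sketch of how those two facts can be combined. It is not the original fragment: the Foo/Bar/Baz bodies and the exact amount of padding are made up. The idea is that every rewritten statement gets its own #line directive pointing at the same original line, and leading whitespace pushes each statement to the column it had in the original file.

```csharp
using System;

public static class LineDirectiveDemo
{
    static int Foo() { return 1; }
    static int Bar() { return 2; }
    static int Baz() { return 3; }

    public static int Run()
    {
        int foo, bar, baz;
        // All three statements report line 7 of the original file.
        // #line only remaps the file and line; the column still comes from
        // the generated line, hence the (illustrative) padding.
#line 7 "Program.Original.cs"
        foo = Foo();
#line 7 "Program.Original.cs"
                     bar = Bar();
#line 7 "Program.Original.cs"
                                  baz = Baz();
#line default
        return foo + bar + baz;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```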

pics or it didn't happen

That's all for now. Next time in Localization Adventures, be ready to get your hands dirty with some Roslyn.