
In my previous post, I gave a quick overview of some new features of the aspect weaver.

Just as important as features: runtime performance.

PostSharp 1.5 already did a great job compared to other aspect frameworks. However, there was still room for great improvement. And this has been done in PostSharp 2.0.

Take, for instance, a rather simple aspect: a performance counter. Say we want to increment a counter every time a method is executed. The easiest and most efficient way with PostSharp 1.5 is to create an aspect of type OnMethodBoundaryAspect and implement the OnEntry handler:

using System.Diagnostics;   // PerformanceCounter
using PostSharp.Laos;       // PostSharp 1.5 aspect types

[Serializable]
public sealed class MethodInvocationCounterAttribute : OnMethodBoundaryAspect
{
    [NonSerialized]
    private PerformanceCounter performanceCounter;

    public MethodInvocationCounterAttribute(string category, string name)
    {
        this.Category = category;
        this.Name = name;
    }

    public string Category { get; private set; }
    public string Name { get; private set; }

    public override void RuntimeInitialize(System.Reflection.MethodBase method)
    {
        this.performanceCounter = new PerformanceCounter(this.Category, 
                                                         this.Name, false);
    }

    public override void OnEntry(MethodExecutionEventArgs eventArgs)
    {
        this.performanceCounter.Increment();
    }

}
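
To use it, you decorate the methods you want to instrument. The category and counter names below are placeholders for this example, and the performance counter must already exist on the machine:

```csharp
// Illustrative usage of the aspect defined above; "MyApp" and
// "Orders Processed" are made-up names for this example.
public class OrderProcessor
{
    [MethodInvocationCounter("MyApp", "Orders Processed")]
    public void ProcessOrder()
    {
        // business logic; the counter is incremented on every entry
    }
}
```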

Now let's apply this aspect on an empty method (and forbid inlining), and put it on the test bench (here on my AMD Phenom II X4 940 3.00 GHz):

Implementation          CPU Time (ns)   Overhead (ns)
Manual implementation   26              0
PostSharp 1.5           108             82

As you can see, with PostSharp 1.5 the overhead of using an aspect, compared to hand coding, is 108 - 26 = 82 ns (remember, a nanosecond is a billionth of a second). What's the problem? The aspect overhead is more than 3 times the cost of the aspect's effect itself! If the aspect has to be invoked thousands of times per second, its cost surely cannot be ignored.

Now look at the benchmark with PostSharp 2.0:

Implementation          CPU Time (ns)   Overhead (ns)
Manual implementation   26              0
PostSharp 2.0           29              3

This time, it's much better. For this specific aspect, PostSharp 2.0 is more than 25 times faster at runtime than PostSharp 1.5! Most importantly, the overhead of the aspect is now only a small fraction of the cost of the effect itself. So it now makes perfect sense to use an aspect for lightweight instrumentation.

How does PostSharp 2.0 achieve this performance gain? In short, by being smarter about the code it generates. Because PostSharp 1.5 did not look into the code of your aspects, it had to generate code for every offered feature, even the ones your aspect did not use. Therefore, PostSharp 1.5 generated a lot of useless instructions. This becomes clear if we look at the output assembly using Reflector:

Code generated by PostSharp 1.5

As you can see, there are instructions to pass the parameters of the method to the aspect, even though the aspect never uses them. The OnSuccess, OnException, and OnExit handlers are invoked, even though the aspect does not implement them. That's bad overhead: it does not translate into anything useful.
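
To make the cost concrete, here is roughly what the PostSharp 1.5 weaving pattern amounts to for our counter aspect. This is a hand-written sketch, not the literal generated code; names such as DoWork_Original and CreateEventArgs are made up:

```csharp
// Sketch of the PostSharp 1.5 pattern (hypothetical names). Every feature
// is wired in, whether or not the aspect actually uses it.
void DoWork(int a, string b)
{
    object[] arguments = new object[] { a, b };  // boxes 'a', allocates an array
    MethodExecutionEventArgs eventArgs = CreateEventArgs(this, arguments);
    aspect.OnEntry(eventArgs);        // the only handler our aspect implements
    try
    {
        DoWork_Original(a, b);
        aspect.OnSuccess(eventArgs);  // empty base implementation, still called
    }
    catch (Exception)
    {
        aspect.OnException(eventArgs); // empty base implementation, still called
        throw;
    }
    finally
    {
        aspect.OnExit(eventArgs);      // empty base implementation, still called
    }
}
```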

PostSharp 2.0 is way smarter: it analyzes the code of your aspect, figures out which features you actually use, and generates instructions only for those. Watch the difference:

Code generated by PostSharp 2.0

No wonder it's faster. Since the aspect does not even use the eventArgs object, why should we pass it? PostSharp 2.0 is smarter than you might think: it also looks at which members of MethodExecutionEventArgs you are using. So if you read the Instance property but not the Arguments property, the generated code populates Instance and skips Arguments.
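
In contrast, since our aspect implements only OnEntry and never touches eventArgs, the PostSharp 2.0 output boils down to something like this (again a hand-written sketch with made-up names, not the literal generated code):

```csharp
// Sketch of the PostSharp 2.0 pattern for this aspect: no event-args object,
// no argument array, no calls to unimplemented handlers.
void DoWork(int a, string b)
{
    aspectInstance.OnEntry(null);   // eventArgs is never used, so none is built
    DoWork_Original(a, b);
}
```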

Sure, there is a catch in this benchmark: I have intentionally chosen an example where the improvement is dramatic. But from today on, you know that you can achieve amazing performance with OnMethodBoundaryAspect, and that it's up to you to design aspects that are really lightweight at runtime. The learning curve of PostSharp has always been pay-as-you-consume. Now runtime performance is pay-as-you-consume too.

What about other aspects? Take OnMethodInvocationAspect. It has been reimplemented from scratch, is now named MethodInterceptionAspect, and is "just" 77% faster in PostSharp 2.0 than in PostSharp 1.5.

PostSharp 2.0 delivers better runtime performance by playing on several factors:

  • Adaptive Code Generation, as demonstrated above (major benefits in OnMethodBoundaryAspect, minor benefits elsewhere).
  • Use of generic tuples instead of untyped arrays to store arguments (no boxing-unboxing, no casting).
  • Use of binding classes instead of delegates.
  • Aggressively optimized design (we do consider a virtual call or a boxing/unboxing cycle as an expensive operation).
  • Cross-aspect optimizations, delivering great benefits when many aspects are applied to the same method (artifacts used by one aspect can be reused by the next aspects in the chain).
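
The second bullet is easy to demonstrate outside of PostSharp. The sketch below contrasts the two argument-passing styles; the class and method names are made up for this example, and PostSharp's own internal container types differ:

```csharp
using System;

// Simplified illustration of the untyped-array vs. generic-tuple trade-off.
public static class ArgumentDemo
{
    // PostSharp 1.5 style: value-type arguments are boxed into object[],
    // and reading them back requires an unboxing cast.
    public static int SumUntyped(object[] args)
    {
        return (int)args[0] + (int)args[1];   // two unboxing operations
    }

    // PostSharp 2.0 style: a generic container keeps the arguments strongly
    // typed, so there is no boxing on the way in and no casting on the way out.
    public sealed class Arguments<T0, T1>
    {
        public T0 Arg0;
        public T1 Arg1;
    }

    public static int SumTyped(Arguments<int, int> args)
    {
        return args.Arg0 + args.Arg1;         // plain field reads, no casts
    }

    public static void Main()
    {
        Console.WriteLine(SumUntyped(new object[] { 1, 2 }));  // boxes 1 and 2
        Console.WriteLine(SumTyped(new Arguments<int, int> { Arg0 = 1, Arg1 = 2 }));
    }
}
```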

Now, a last piece of code. What if a 3 ns overhead is still too much for your case? Look at the following piece of code: it has zero overhead.

using System.Diagnostics;   // PerformanceCounter

[Serializable]
public sealed class MethodInvocationCounter3Attribute : MethodLevelAspect
{
    [NonSerialized]
    private readonly static PerformanceCounter performanceCounter;

    static MethodInvocationCounter3Attribute()
    {
        if ( !PostSharpEnvironment.IsPostSharpRunning )
        {
            performanceCounter = new PerformanceCounter("Custom", "NumberOfItems64", false);
        }
    }

    [OnMethodEntryHandler, SelfSelector]
    public static void OnEntry(MethodExecutionArgs eventArgs)
    {
        performanceCounter.Increment();
    }
}

The OnEntry handler is a static method. So, when the aspect is applied to a target method, it gives the following:

Code generated by PostSharp 2.0 - Static Method

Since the OnEntry handler is inlined by the JIT compiler, this method is strictly equivalent to invoking performanceCounter.Increment manually from the instrumented method. Sure, your possibilities are very limited when you use a static method (you can't access instance fields of the aspect, so the counter name has to be the same for all methods using this aspect). But the promise holds: use nothing, pay nothing. Here, the aspect is free of any overhead.

Happy PostSharping!

-gael

PS. Kicking ass?

Comments (4) -

Alex Yakunin
9/25/2009 7:55:45 AM #

Gael, that's really great. We use ReporecessMethodBoundaryAspect in DO4 with a custom weaver that actually emits very similar, optimized code by inspecting what's overridden. Since it does not need arguments at all, it is really fast. So now I see there is a generic solution to this problem.

> Use of generic tuples instead of untyped arrays to store arguments

Beware of virtual generic method calls in this case. I assume we must parameterize the OnXxx methods by tuple type, so they must not be virtual.

> Use of binding classes instead of delegates.

Can you explain this?

Gael Fraiteur
9/25/2009 8:12:43 AM #

> Use of generic tuples instead of untyped arrays to store arguments

There is no generic method at all, just generic types.

> Use of binding classes instead of delegates.

Basically, we create a new binding class for each 'interception' aspect, deriving from an abstract class. The binding class binds the aspect to the next handler in the chain. The aspect class calls the abstract method. This is the only virtual call in the procedure of calling the underlying aspect. You'll see when the first beta is delivered.

Miguel Madero
9/27/2009 5:58:48 AM #

Gael,

I understand runtime is even more important, but unfortunately, compile time was really slow in PostSharp 1.5 and that's the main reason we had to stop using it: it was doubling our compile time from ~1 min to ~2 mins. This isn't a rant, but rather a question: how are the compile-time improvements looking in 2.0?

Gael Fraiteur
9/30/2009 5:43:39 PM #

When you were writing this message, I was implementing compile time improvements. PostSharp will run as a pipe server, so load time will be lower. Load time plays a big role principally when building a large number of small projects.
