Structured exception handling and defensive programming are the two pillars of robust software.

Both pillars fail, however, when it comes to handling internal faults: those that originate in software defects rather than in any external factors.

In this webinar, Zoran Horvat demonstrates advanced defensive coding techniques that can bring the quality of your code to an entirely new level.

Watch the webinar and learn:

  • When throwing an exception is the right thing to do
  • Why exceptions and defensive coding cannot be applied to recover from defects
  • How to handle situations when internal software defect is causing the fault
  • How to treat fault detection as an orthogonal concern to normal operation

Advanced Defensive Programming Techniques on Vimeo.

Download slides.

Download code samples.

Video Content

  1. Bird's View of Defensive Programming (3:21)
  2. Demo (6:47)
  3. Design by Contract (16:24)
  4. Q&A (50:34)

Webinar Transcript


Hello everyone and welcome to today's webinar on advanced defensive coding techniques. My name is Tony and I'll be your moderator today. I work as a software engineer here at PostSharp and I'm excited to be hosting this session today. I'm pleased to introduce today's speaker, Zoran Horvat. Zoran is CEO and principal consultant at Coding Helmet Consultancy, and today he's going to share what you should know about defensive programming and how to improve the internal quality of code. 

Before I hand the mic over to Zoran, I have a few housekeeping items to cover about this presentation. First, this webinar is brought to you by PostSharp, that's why this slide is there, this is not Zoran, nor me. PostSharp is an extension that adds support for patterns to C# and Visual Basic, so if you are tired of repeating yourself in your code, you may want to check it out, as did Microsoft, Intel, or Bank of America, who have been using PostSharp in their projects to save development and maintenance time. Customers typically cut down their code base by 15% by using our product, so feel free to go to our website for more details, or you can also get a free trial there. And now back to our webinar. Today's webinar is being recorded and the recording will be available after this live session, and all of you who have registered for the webinar will receive an email with the link to the recording. 

And last, during the webinar we would also love to hear from you, so if you have any questions for our speaker, feel free to send your questions through the question window at the bottom of your player, and all the questions will be answered at the end of the session. If we do not get to all of your questions, we will follow up after the webinar, so all of your questions will be answered in the end. So without any further ado, I'd like to kick things off by welcoming Zoran Horvat. Zoran, over to you.


Yeah. Thank you Tony. Hello everyone. As Tony has already told you, in this webinar I'm going to talk to you about certain aspects of defensive coding. You may know me mainly from Pluralsight courses, I have five courses published there, and the things that you will see in this webinar can generally be seen in all five of the courses I have published thus far. Those are general programming principles, applied here to a very specific and narrow goal of making code internally more stable. 

Now, without any delay, I will tell you precisely what this demonstration is going to be about, because the term defensive programming covers a large area, and I will identify a very small subset of problems that we are going to address today. 

Bird’s View of Defensive Programming

So, historically, you may be aware that the whole effort of making programs stable first started with the garbage-in-garbage-out concept. Which was very good in one respect that we programmers like a lot: you don't have to code anything to put garbage out. So a lot of software was written in a way where the code is not responsible for behaving in the face of invalid input, and that worked quite well for many years, only now and then producing disruption, but generally it was very good. 

But as time progressed, as years passed, programmers were more and more interested in having stable software, and that is how defensive coding came into being for the first time. Now, I'm pretty sure that you know more or less everything about defensive coding already, and you might be asking yourself: what am I going to tell you that's new today? Probably nothing, but anyway, let's draw the map of defensive coding.

For one thing, you want to defend from invalid input: to reject invalid input right away and to only let valid input in, or to sanitize communications with outer systems, which means not only other services or, I don't know, the network, things like that, or the database, but also the libraries that you might include in your own process. If you don't trust other people's code, you might sanitize the output from a library. However, there are more things there, and today I'm going to talk to you about things that don't have to do with the outer layers of your system or your component, or whatever you are writing. I'm going to talk to you about defending from yourself. 

Many people don't really understand that there is a concept of defending from your own code, and I will give you a hint now; you will have around 20 minutes to think about that hint before you hear the answer. For example, if you face a null reference exception, what do you do? How do you defend from that? If it's not something that came from third-party code, from a library, or from any network service or anything, if it originated from inside your code, how do you defend from that? And that is what we're going to be talking about today.


So I'm going to introduce a very small example. It's going to handle students and the subjects they are enrolled in, to listen to and to take exams. A student is going to be identified by name and is going to keep a list of grades from exams that the student has passed, and the student will expose three methods: add grade, remove last grade, and get average of grades. Now, as you look at this API, this very small API, you can already start thinking about what can go wrong with only three methods, and many things can go wrong already. And you will see even more things that can go wrong when we introduce training, which is, for example, a university subject, and students can enroll for the training. Training will expose another three methods. You will be able to add a student who enrolls for a training, and then two more complicated methods will come. Get top student will have to work through the list of students and sort them by average grade. And the last method, set grades, will be the most complicated of all. It will be invoked, supposedly, after the exam; it will receive a list of students' names and grades, and it will work through the list of students and add a grade to each of them. 
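As a sketch of that API (the member names and enum values here are my reconstruction from the description, not the webinar's exact code), the two classes might start out like this, in their naive form:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Grade { F = 1, E, D, C, B, A }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    public string Name { get; }

    public Student(string name) { this.Name = name; }

    public void AddGrade(Grade grade) => this.grades.Add(grade);

    // Naive: throws ArgumentOutOfRangeException when there are no grades.
    public void RemoveLastGrade() => this.grades.RemoveAt(this.grades.Count - 1);

    // Naive: LINQ's Average() throws on an empty list.
    public double GradesAverage => this.grades.Average(g => (double)g);
}

public class Training
{
    private readonly List<Student> students = new List<Student>();

    public void AddStudent(Student student) => this.students.Add(student);

    // Naive: throws for any student who has no grades yet.
    public IEnumerable<Student> GetTopStudents() =>
        this.students.OrderByDescending(s => s.GradesAverage);

    // Naive: throws if a name has no matching student.
    public void SetGrades(IEnumerable<(string Name, Grade Grade)> namesGrades)
    {
        foreach (var (name, grade) in namesGrades)
            this.students.First(s => s.Name == name).AddGrade(grade);
    }
}
```

The comments already hint at everything that can go wrong; the rest of the webinar walks through exactly those failure modes.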

And now, what can go wrong? This is usually the place where I apologize for using PowerPoint and step into Visual Studio. This is the code; I'm going to show you the code which implements all these six members. First let me show you the grade. Grade is just an enum, an enumeration defining grades, and the only working code is located in the student and training classes. So I will show you first the things that can go wrong in these two very small classes, and you can imagine how bad it could all be when applied to a large project.

Sometimes people tell me that I'm showing examples that are too simple and unrealistic, but you will see that only six very small functions, half of them one-liners, can cause you so much trouble that you will have to reconsider your entire coding practice, all the practices you're using every day, if you want to survive six simple functions.

So let's start. Grades average. There's the list of grades; down below I'm adding grades to the list, and this is the grades average. Every grade is first converted to double. That is my extension method; it is not important at all. What is important here is the Average extension method, which comes from the LINQ library. This method will throw an exception if the list is empty. So we already have one thing to defend from, and I'm even helping the cause by introducing a new member, HasGrades, which returns a boolean telling whether this list of grades is empty or not. It returns true if there are any grades in it.

So if you want to avoid the exception coming automatically from this property getter, you had better check whether this property returns true beforehand. 
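In code, the guard and the guarded property might look like this (HasGrades and GradesAverage are the member names used in the webinar; the bodies are my sketch):

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Grade { F = 1, E, D, C, B, A }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    // True if there is at least one grade to average.
    public bool HasGrades => this.grades.Any();

    // LINQ's Average() throws InvalidOperationException on an empty
    // sequence, so callers are expected to check HasGrades first.
    public double GradesAverage => this.grades.Average(g => (double)g);
}
```

A caller then writes `if (student.HasGrades) { var avg = student.GradesAverage; }` and never sees the exception.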

Now remember this moment: I'm telling the callers to make sure to check the potential outcome before using the member. That is hiding half of the answer to the great question of how to defend from the inside of our own application.


Tony: Excuse me Zoran, I have a question here about this grades average property. You say that we should check in advance whether the student has any grades or not. What if, instead of that, we return some special value from the grades average, like not-a-number, or if we make the double nullable and we return null?


Zoran: Yes. I could answer that question on several levels. On one level, you should ask yourself: what have you gained by returning, for example, a nullable double? Did you get anything more compared to a boolean and a double? No. Nullable is still a type which has a HasValue boolean property and a Value double property, so you still have the same thing in your hands. 

On a more elaborate level, if you return something that could mean one or the other thing, then you are placing a burden on the receiver of this result to be afraid of what he might get. I think it is much better to put the thing straight before the caller: if you call grades average, then make sure you will not receive an exception back. So if you get to the point of calling grades average, then you should have already made sure, either by adding a grade, and therefore making sure that there is something to average, or by asking whether there are grades or not; and after you have made clear that it is safe, then you call this, and then you get an exact result. No speculation beyond this point about what you have got back. I hope ...
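To make the comparison concrete, here is a sketch of a hypothetical nullable variant next to the boolean-plus-double contract; `GradesAverageOrNull` is my invented name, not a member from the webinar:

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Grade { F = 1, E, D, C, B, A }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    public bool HasGrades => this.grades.Any();
    public double GradesAverage => this.grades.Average(g => (double)g);

    // Hypothetical nullable variant: the caller still ends up asking the
    // same two questions, only bundled into one object:
    //   if (student.GradesAverageOrNull.HasValue) ... use .Value ...
    // versus
    //   if (student.HasGrades) ... use student.GradesAverage ...
    public double? GradesAverageOrNull =>
        this.grades.Any() ? this.grades.Average(g => (double)g) : (double?)null;
}
```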

Tony: Okay, thank you.

Zoran: ... This is enough.

Tony: Thank you.


Okay, so the add method. First, let me make a disclaimer here. This method is returning a flag as an indication of whether the operation went well or not. This is not how you should write code, in my opinion, and I will remove these flags by the end of this demonstration. I have started the demonstration with an ancient way of defending, which was based on status codes returned from methods. So people used to return, for example, false if the operation cannot be conducted to the end, and true if it went well.

So this is the old way of dealing with unknown input. This input was not sanitized before, and now we have to deal with it, and I have a piece of private logic here. It is very simple: I'm asking whether this grade is defined in this enumeration. So don't send me a grade value of 57, because it's not an existing grade. If something like that happens, I will just return false and be done with it, right?
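The status-code style of add grade might look roughly like this (my sketch of the pattern being described):

```csharp
using System;
using System.Collections.Generic;

public enum Grade { F = 1, E, D, C, B, A }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    // The "ancient" status-code style: false signals failure, and the
    // caller is left to figure out what to do about it.
    public bool AddGrade(Grade grade)
    {
        if (!Enum.IsDefined(typeof(Grade), grade))
            return false;              // e.g. (Grade)57 is not a real grade

        this.grades.Add(grade);
        return true;
    }
}
```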

Why is this bad? Again, this is bad on a couple of levels. On one level, I am making trouble for the caller. The caller doesn't know what's going to happen after he calls my method, and that is a bad thing. We don't like nondeterministic code. We don't want to call a function and not know what is going to happen after that. Imagine a list of five steps that you have to make one after the other. You made the first step, the second step, the third step; now the fourth step is calling this function, for example, and the fourth step returns false. That means that now you have to roll back the first three steps, which might be very complicated. It might include calling a dozen other methods to roll back the effects of those three steps.

So returning a flag is a bad thing to do from that aspect, but there is another aspect which is even more important. You see, if I give up at this point, it's not only that I give up. I must also recover from this situation somehow. It's not the end when you return from this method. Every error must have a recovery. If you don't recover, you must kill the process, alright? If you want to keep going, to keep running, you have to recover from every single error that appears during execution. How is add grade going to recover?

Design by Contract

So this is the first concept, the first important place where I will mention design by contract. If you haven't met design by contract, it was introduced by Bertrand Meyer quite a long time ago, and it's still not widely popular. I suppose it's not widely popular because it is a bit complicated, but I will try today to tell you the most important part of the theory, which will not make you understand it to its fullest extent, but it will make you start thinking in a different way. 

So here we have a function, add grade, which is defending from invalid input. And I say that is wrong, because this function doesn't know what it means to recover; it is easy to just give up. Now, how to recover? This function cannot recover, and do you know why? Because it doesn't know who called it. And that is an extraordinary idea. I myself was defending at the callee site, inside the method that was called, for many years, until I understood that the method that was called cannot defend itself, because it doesn't know the context in which it is executing.

If it is a web application, recovering from a fault could mean sending an HTTP status back: internal server error, 500, or temporarily unavailable, or not found, whatever; it needs to send an HTTP response back. If it is a desktop application, recovery means popping up a message box. Or if it is a unit test, it must assert false and make the test fail. So only the caller knows what it means to recover.

Okay. You keep thinking about this and I will keep talking about the current code. Remove last grade is removing the grade at the last position in the list, but only if there are grades. It also returns a boolean flag, and that's it. And now to the more complicated class, training. It keeps a list of students; it allows us to add a student, but we can only add non-null students, and you will see why. Here is get top student, the method which sorts the students by their average grade, descending. But now, we have two kinds of students: those that do have grades, and those that don't.

And now you see that story about the caller and the callee. The calling class, training, knows exactly how to recover from the situation when a student doesn't have any grades. It is very simple at this position: I can pick only the students that do have grades and sort them descending by their average grade, and then just concatenate those that do not have any grades. Do you see? This list is going to have all the graded students, descending by their grades average, and then all those who didn't have any exams yet at the end of the list, and it's never going to fail. This function is safe only because the student class has given me the means to test the students, so that I can recover from a potential error. It's still not an error, it is a condition; I can recover from something that might end up in an exception, and I would have to recover using other means if I had let the exception come out of the grades average. Now there's no trouble at all.
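The partition-then-concatenate idea described above might be sketched like this inside the Training class (assuming the Student members shown earlier):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Training
{
    private readonly List<Student> students = new List<Student>();

    public void AddStudent(Student student) => this.students.Add(student);

    // Never fails: only students with grades are averaged and sorted;
    // students without grades are simply concatenated at the end.
    public IEnumerable<Student> GetTopStudents() =>
        this.students
            .Where(s => s.HasGrades)
            .OrderByDescending(s => s.GradesAverage)
            .Concat(this.students.Where(s => !s.HasGrades));
}
```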

So this is the great concept: giving public members to the caller to test the state, and to prepare a defense from any condition in advance. 

Okay, let's move forward: set grades. This is a terrible, terrible piece of code. Don't ever write code like this, but I had to. I had to do this; let me show you what this function is doing. It receives a sequence of tuples, each tuple consisting of a string representing the name and the grade to give to the student with that name. 

Now, what can go wrong? A student might not exist in the list of students. There is a list here. So the student might not exist here, and this function should fail; I'm still returning boolean flags to indicate failure. And on a more detailed level, the student might exist, and we might try to set the grade for that student, but add grade could fail, saying, "No, no, I don't want this grade, goodbye." So we might face a failure deep inside the loop, and that is the scenario which I told you about already, like having five steps and one of them fails.

So look what I am doing here. I am first defending from the first situation. Forgive me the syntax; this is going to be much cleaner later. This is a query which says: if, for any tuple in the input, all students say that their name is different, meaning that there is a name which doesn't exist in the list of students, then give up and return false right away.

So I'm defending from one sort of trouble, but the other sort of trouble comes only inside the loop, much later, when I try to add a grade to a student and this might return false. Now if one student returns false, then I must walk through the list of all students who did accept their grades and roll back the change. That is the rollback which I mentioned, so you can see how terrible the code becomes if I try to defend from a flag returned from deep inside a loop. So this is the rollback list: any student who accepts a grade is put into the rollback list, and if any of the students later decides not to accept the grade, I must walk through all the students who were successful, roll back the changes, and then return false. So this code is terrible.
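Here is a sketch of that "terrible" flag-and-rollback version, inside the Training class, assuming the bool-returning Student.AddGrade from earlier (my reconstruction, not the webinar's exact code):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Training
{
    private readonly List<Student> students = new List<Student>();

    public bool SetGrades(IEnumerable<(string Name, Grade Grade)> namesGrades)
    {
        var tuples = namesGrades.ToList();

        // First sort of trouble: a name with no matching student.
        if (tuples.Any(t => this.students.All(s => s.Name != t.Name)))
            return false;

        // Second sort of trouble: AddGrade may refuse deep inside the
        // loop, forcing a rollback of every grade added so far.
        var rollback = new List<Student>();
        foreach (var (name, grade) in tuples)
        {
            Student student = this.students.First(s => s.Name == name);
            if (!student.AddGrade(grade))
            {
                foreach (Student done in rollback)
                    done.RemoveLastGrade();
                return false;
            }
            rollback.Add(student);
        }
        return true;
    }
}
```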

Alright. Let's start fixing it. The first thing: if you want to defend in your code using status codes, which I strongly advise you not to do, but if you insist on doing that, then it's better to do it right away at the beginning of each method. You see, it's better to ask all the students first whether they exist, and second whether they accept grades. So I would need another public member; this is the student class here. I would ask for a can add method on the student class so that I can ask the students up front, and if I do have that method, then all this doesn't have to look like this. It becomes much simpler, you see? I can just walk through the list of students, find each of them by name, picking the first one with that name, and add a grade to it. This time I know, first, that retrieving the student by name is not going to fail, and second, that every student will accept the grade, because I have asked both of these things up front.
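With the checks moved up front, the loop collapses to something like this sketch:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Training
{
    private readonly List<Student> students = new List<Student>();

    // Assumes the caller (or preconditions checked at the top of this
    // method) has already verified that every name exists and that
    // every student can accept its grade.
    public void SetGrades(IEnumerable<(string Name, Grade Grade)> namesGrades)
    {
        foreach (var (name, grade) in namesGrades)
            this.students.First(s => s.Name == name).AddGrade(grade);
    }
}
```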

Alright, now I need this can add method, so here it is. It is just repeating the logic that I already had in add grade. You see, this is the same thing, only negated. So I could raise this theory to a higher level and really introduce contracts at last, and say: the student and training classes should have an agreement about how they work together, and that agreement is called the contract. And the student class says: listen, don't call add grade unless you're sure that can add will return true. If you're not sure, ask. If you are sure, alright. And then, after having the can add method, I must change add grade to actually use that method. The previous code was this, and this is pretty much a private piece of logic. 

Now can add becomes part of the documentation for the add grade method, and the add grade method will check can add according to that documentation. I'm not sure whether you see the point yet, so let me rephrase: the contract will be defined in terms of public members only, and then the contract will be checked in terms of those same public members. 

Okay, now. Now we have a situation in which the add grade method is checking the public contract and giving up immediately if the contract is broken. But then, why return false? We can push contracts to an even higher level and say: the contract must never be broken. Never. The other class must not call add grade if can add is returning false. Add grade can throw an argument exception, for example, and stop execution not only of this method, but of the caller as well. The caller must make sure that the object and arguments are correct before calling the method.
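Putting those two steps together, the contract member and the throwing check might look like this sketch:

```csharp
using System;
using System.Collections.Generic;

public enum Grade { F = 1, E, D, C, B, A }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    // The contract, expressed as a public member the caller can query.
    public bool CanAdd(Grade grade) => Enum.IsDefined(typeof(Grade), grade);

    // The contract is checked in terms of the same public member; a
    // violation is no longer a status code but a broken agreement.
    public void AddGrade(Grade grade)
    {
        if (!this.CanAdd(grade))
            throw new ArgumentException("Grade cannot be added.", nameof(grade));
        this.grades.Add(grade);
    }
}
```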

And so, this remove, okay. Remove is going to test the public member HasGrades; it was public from the very beginning, and now HasGrades is a contract. It says: HasGrades must return true, or otherwise I will fail. So you may start thinking: but this is a bit too hard, maybe this is too much, why not keep returning boolean flags? I will show you.

Before that, let me fix these two complicated preconditions as well. The difference between these preconditions and those nice preconditions in the student class is, again, that I'm relying on private logic, something I cannot write into any documentation, because nobody has access to the students list, for example. So what I'm going to do is give access to the students. Not indiscriminate access; I will introduce a very concrete contains student method. So if you're not sure whether I contain a student with a given name, then ask, and after you have asked, you will be free to call set grades and pass that name as part of the list of names, because you know that it is inside. If you are not sure, I keep the right to throw, and I will stop your execution if you didn't obey my contract.

Similarly, this will have to be turned into a public contract again. I will imagine some accepts grade method on this class, which will receive the name of the student and the grade, and it will tell you, transitively, whether the student with this name will accept this grade. If yes, then you are free to include that tuple in the list and make the call. 

So I'm finally giving up all these return-flag instructions, and I am not going to return any flag from this method either. So all the methods that are doing stuff are just returning void in my solution this time. I need this accepts grade method; it is very simple. It looks up the student (now I know that the student exists, because that is a precondition for the precondition, so to say), and I ask the student whether it can accept the grade. 
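The two public contract members on Training might be sketched like this (again assuming the Student members shown earlier):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Training
{
    private readonly List<Student> students = new List<Student>();

    public bool ContainsStudent(string name) =>
        this.students.Any(s => s.Name == name);

    // Transitive contract: the student must exist AND accept the grade;
    // ContainsStudent is the precondition for this precondition.
    public bool AcceptsGrade(string name, Grade grade) =>
        this.ContainsStudent(name) &&
        this.students.First(s => s.Name == name).CanAdd(grade);
}
```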

So, we have moved this solution to a higher evolutionary step, and this is typically as far as many people would go. But now, if you were throwing argument exception, for example, or argument null exception, or index out of range, or invalid operation exception, things like that, then it is probably time for you to think about why you did that.

So, here is a question you can think about for yourself: did you ever throw an argument exception? You would probably say yes, all the time. Then I ask you the second question: did you ever catch an argument exception? I don't know, but I suppose that nobody has ever done that. The problem with argument exception, argument null exception, and similar exceptions is that nobody ever catches them. People catch the base Exception, and they do that at the very topmost level, before the exception goes out of our scope and becomes an unhandled exception. We are handling Exception just to protect our process from failing. We never catch argument exception. And the reason why is that we have no clue what to do with it. It is as simple as that. You cannot handle an argument exception, because handling and recovering from that exception would mean correcting the argument and making the call again. But then, if I knew what was wrong with the argument I am passing, I would pass the correct argument right away; I wouldn't cause the exception in the first place.

This is a mind-bending idea that might need to linger in your head for a while before you start thinking the same way I'm thinking. I don't know how to make you believe me in a one-hour session; that is my point. But anyway, suppose that I have persuaded you not to throw specific argument exceptions and not to try to handle them. What is the next evolutionary step?


Tony: Excuse me Zoran, before you go further, I have a question here. Wouldn't it help if you were more specific in these exceptions? Because here you just say student not found, but if someone at the top level catches the exception, wouldn't it help him to know, for example, which student hasn't been found, or which student has an unacceptable grade, or which grade?


Zoran: Mm-hmm, yes, so you are suggesting to make even more specific exceptions, like our own custom exceptions, and populate them with more specific information, right?

Tony: Exactly.


Okay, so again, what would we do with that? Suppose that there is a student name which exists here but doesn't exist in the students ... where is it ... the students list. Now we would include that student name in the exception. So what would we gain by doing that? The caller would have to catch the specific exception, to expect that exception, and then to dig for the specific information there, the student's name, and then do what? Once again, we have no clue what to do. So a student is not there. Are we going to add it? No, we don't have a grade for that student, so we have an impossible situation. So we are not going to recover in any way, even with a more specific exception. Plus, even if you do catch a more specific exception, you are still going to do the same things as in this contract scenario: we are going to learn that this student does not exist in that list, alright? But we could learn that by asking, for every student name, whether the list contains that student. 

So all the code is still there; it's only packaged in different places. You're not going to have any more functionality than you already have, and you are still not going to be able to recover from the error.

Tony: Okay, thank you.


And then, the revelation. If a precondition is violated, in any place in code, do you know what that is called? What do you call calling remove last grade when there are no grades? Or what do you call adding a grade which is not part of the enumeration, an invalid grade? You call it a bug. A precondition violation is identically equal to a bug. And if you ever wondered why you cannot recover from mistakes like having the name of a student which doesn't exist in the list, or having a grade which cannot be accepted by an existing student; if you wondered why you cannot recover and keep going, but must stop execution of the operation, of course, and just stop it, with no more execution of the operation, then the answer is: that happened because you have a bug.

And then, checking preconditions means catching bugs in your code. There's no recovery; just stop execution. That's why people throw exceptions rather than return status codes: there's no more execution beyond this point. And now, as good programmers, we can do a completely new thing and say: alright then, checking preconditions is a separate concern. Let's create a class. We are object-oriented programmers, and we resolve issues by creating new classes, alright. So here's the contract class. This class will be a library somewhere on the side, not part of the real production code. And in the contract class, I want to define a utility method, requires. It is even a static method. It will receive a predicate which must return true. I will call this predicate, and if it returns false, I will throw an exception. And listen, I don't care which exception; I will just throw an exception, because this is the end of the road. I'm reporting precondition violations; there is no more. No specific exceptions; they carry no information which has any meaning at this position.

And now, using this, for example in add grade, I will call requires and say: can add, called with this grade, must be true. Done. Remove last grade gets another precondition. And another, and another; let me find number five. Another precondition: has grades must be true, and that's it. 
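A minimal sketch of that contract class and its use (the exception message and predicate shape are my assumptions):

```csharp
using System;

public static class Contract
{
    // If the predicate fails, this is the end of the road: any exception
    // will do, because nobody is meant to recover from a bug.
    public static void Requires(Func<bool> predicate)
    {
        if (!predicate())
            throw new Exception("Precondition violation.");
    }
}

// Inside the Student class, preconditions then read as declarations:
//
//     public void AddGrade(Grade grade)
//     {
//         Contract.Requires(() => this.CanAdd(grade));
//         this.grades.Add(grade);
//     }
//
//     public void RemoveLastGrade()
//     {
//         Contract.Requires(() => this.HasGrades);
//         this.grades.RemoveAt(this.grades.Count - 1);
//     }
```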

Now look: the preconditions have been turned into declarative code. It's not imperative; it's just declaring what must be true in order to start doing this. Alright? Now, if this is the first time that you have seen preconditions, at least in this way, I suppose this is a bit too fast for you, but you will have access to the code and the code snippets, so you will be able to repeat this whole process and see the evolution of the code yourself. 

I will move faster towards the end. Now look at the training class. I have these complicated preconditions; they are complicated because they are working on a collection. They must work on all elements of the collection. So, now that we have moved the contracts to a separate class and declared them an orthogonal concern, we can even add a universally quantified precondition. Alright.

So contract extensions is going to be the class that will hold that. You see, this is another extension method; this time it is an extension on a sequence of whatever. It receives a predicate on that whatever, and it just calls requires for all elements in the sequence, nothing more. This is just a universal quantifier for preconditions. 

I was able to define this very easily because I have moved this separate concern into a separate class. So, back in the training class, let me rewrite these preconditions in terms of universal preconditions. Now this reads like: names and grades, and for them I require that all tuples satisfy that I contain a student with this name (this is the name of the constraint), and I require that I accept a grade for this student name and this grade (this is the name of the constraint).
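The universal quantifier extension might be sketched like this, with its declarative use in Training.SetGrades shown in comments:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContractExtensions
{
    // Universal quantifier for preconditions: the predicate must hold
    // for all elements of the sequence.
    public static void ForAll<T>(this IEnumerable<T> sequence,
                                 Func<T, bool> predicate) =>
        Contract.Requires(() => sequence.All(predicate));
}

// At the top of Training.SetGrades, the preconditions become:
//
//     namesGrades.ForAll(t => this.ContainsStudent(t.Name));
//     namesGrades.ForAll(t => this.AcceptsGrade(t.Name, t.Grade));
```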

Do you like this code? It is 100% declarative, and one even more important thing: it is encapsulated. It is encapsulated here, not in my production code, but in some other library. And now, what can we do with encapsulated code? We can change it without affecting the caller. That is what object-oriented programming is all about. 

So, look at this. Suppose that somebody said, "Listen, this is walking through the sequences, it's even walking through them twice. I don't want this to run on every call in production; this is slow." Alright. And I'm pretty sure that it will never happen that this precondition is false; we have really tested the code to the end. Alright. We want universal and existential quantifiers to be switched off in production. 

Do you know how you switch off this code in production? One way is to add conditional compilation; that is a feature of .NET. You can specify that this piece of code is going to be included in the assembly, but calls to this function will be removed when the caller is compiled, if this symbol, DEBUG, is not defined. 

So let me show you what happens if I end testing and decide to go to production. This is the debug configuration; now I turn to release, and this goes gray. The compiler removes both calls. So you can keep the quantifiers alive in debug and turn them off in production, making the code safe during testing and fast in production. That is one thing you can do when you treat a contract as a separate concern.
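The switch-off is done with the ConditionalAttribute; a sketch of the quantifier marked this way:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class ContractExtensions
{
    // The method body still ships in the assembly, but every CALL to it
    // is removed by the compiler when the DEBUG symbol is not defined,
    // i.e. in a typical release build.
    [Conditional("DEBUG")]
    public static void ForAll<T>(this IEnumerable<T> sequence,
                                 Func<T, bool> predicate) =>
        Contract.Requires(() => sequence.All(predicate));
}
```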

Or, even more: here in the requires method, you could do something like this. This is old-fashioned conditional compilation. You can say that in debug mode, I want to assert; I want to kill the process. Now, why would I kill the process? Why not keep throwing a plain exception? Because exceptions can be caught. Sometimes people inadvertently catch exceptions and do nothing with them. They catch exceptions because they fear that those exceptions would propagate up and do damage. Alright, that fear is not well placed; you should not catch an exception unless you know perfectly well what to do with it. But people are still used to not handling exceptions, just covering them up. And if you throw a precondition violation exception like this, then even in the testing code, it might happen that somebody has hidden the fact that the exception has been thrown. And then you have a bug, and your code has raised a flag saying, "This is a bug, you have a bug, fix it," but somebody has hidden that flag from you. And then you deliver buggy software to production.

So in order to avoid that, you may choose to assert in Debug mode. Assertion, if the predicate is false, will kill the process. There's no way for you to hide that, to catch an assertion; it will propagate to the operating system. And it is very often a good idea to assert on precondition violations during debugging, because then everybody on the team will know that there is a bug, and they will have to address it. And again, when you finish testing and switch the build configuration to Release, you switch off the assertion, and even without that, I was using the assertion from the System.Diagnostics.Debug class, so it wouldn't affect the Release build anyway. So you get back to throwing an exception, because, well, it's not nice to assert in production; you don't want big red screens and message boxes popping up at the user, alright. 
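The assert-in-Debug, throw-in-Release idea can be sketched like this (a minimal sketch; the `Precondition` name is mine, and the demo's actual helper may differ):

```csharp
using System;

public static class Precondition
{
    public static void Requires(bool predicate, string message)
    {
#if DEBUG
        // During testing: kill the process, so that no catch block
        // can hide the bug from the team.
        System.Diagnostics.Debug.Assert(predicate, message);
#else
        // In production: throw instead, because asserting (killing
        // the process) is hostile to end users.
        if (!predicate)
            throw new ArgumentException(message);
#endif
    }
}
```

Switching build configurations then changes the failure behavior without touching a single call site.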

So then you throw, even if that means that some exceptions will be silently ignored; just for the sake of end users, keep throwing the exception rather than asserting. And another thing, the last thing in this demonstration: there's an even more interesting idea. What exception should we throw? Not a plain Exception; a plain exception really says nothing. Maybe we could throw some exception like a contract exception, you see, and it says precondition violation, this precondition, alright. But what is this contract exception? 

I could define the contract exception as a private class inside my contract class, for example, so that nobody can see it; only I can see it. So when it comes to doing something, for example, I know this AddGrade might fail, I could wrap it in a try-catch. But catch what? ContractException? No. There's no such thing; it is a private class, so I cannot write a syntactically correct catch block which names ContractException. The only way to handle it is to catch the base Exception class, which is a bad practice, and I suppose none of you are doing that. You should only catch the general Exception at the very last level in your component, just to stop exceptions from propagating outside of it. So if you are writing a web service, you would catch the general exception there and return, say, 500 Internal Server Error, or Service Unavailable, or Not Found, things like that. 

If you are writing a desktop application, you would return some object which tells the user interface to pop up a message box and apologize to the user. That is the place where you catch general exceptions: the very limits of your system, not inside. On the inside, I suggest throwing a private exception class so that nobody can really catch it. And again, the reason is that any precondition violation equals a bug, and you cannot recover from a bug. You cannot fix an argument and try again. If you knew what was wrong with the argument, you wouldn't have caused the exception. That is the simplest way to look at precondition violations.
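The private-exception trick looks roughly like this (a sketch; `ContractGuard` and the message format are my assumptions):

```csharp
using System;

public static class ContractGuard
{
    // Private exception type: no catch block outside this class can
    // name it, so the only way to catch it is to catch System.Exception,
    // which disciplined code does only at the system boundary.
    private class ContractException : Exception
    {
        public ContractException(string message) : base(message) { }
    }

    public static void Requires(bool predicate, string message)
    {
        if (!predicate)
            throw new ContractException("Precondition violated: " + message);
    }
}
```

Inside the component, `catch (ContractException)` simply does not compile, which is exactly the point.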

Alright, this was the demonstration, I hope that you liked it. Now you can ask questions, I have left a bit more time for your questions, because I expect to have at least a couple of them, and I would like actually to hear you, and to try to respond to your questions now. 


Q: Would the debug assert be removed from release build even without the debug compile directive?

A: It is not an easy decision to remove parts of the logic that check the conditions. If you are absolutely certain that you have checked everything in testing, you can remove all precondition checks from your application and make it as fast as possible. If you are not sure, then you leave preconditions in production; both are valid options. There is also a gray area in between. You might add a lot of heavy checks that are good for debugging. Just imagine a sorting algorithm, an algorithm which sorts an array. Can you imagine how many checks you could perform during the execution of that algorithm? Including the final check that the array is really sorted at the end.

All that would make the sorting algorithm tremendously slow. And then you just remove all of that from production code with a compiler directive. You don't have to check the DEBUG symbol; you could introduce your own symbols, like a heavy-preconditions symbol, and switch those off not only in the Release build, but also in the build which goes to staging or production. Or, for example, in .NET Core you can check which environment you are running in, and have heavy checks in development, fewer checks in staging, and almost no checks in production. That is the point.
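A custom compilation symbol for such heavy checks might look like this (`HeavyContract` and the `HEAVY_CHECKS` symbol are illustrative names of my own):

```csharp
using System;
using System.Diagnostics;

public static class HeavyContract
{
    // Compiled into call sites only when the HEAVY_CHECKS symbol is
    // defined, e.g. in Debug or in a dedicated staging configuration.
    [Conditional("HEAVY_CHECKS")]
    public static void IsSorted(int[] array)
    {
        // An O(N) check, far too expensive to run on every
        // production call of, say, a binary search.
        for (int i = 1; i < array.Length; i++)
            if (array[i - 1] > array[i])
                throw new InvalidOperationException("Array is not sorted");
    }
}
```

You would then define `HEAVY_CHECKS` only in the build configurations that should pay the price.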

It is very important to understand the concept of preconditions as a separate concern, which is then turned into a pure configuration question. It is not part of your code anymore. You could see that I was changing the way preconditions are compiled without making a single change to the production code; that is the most important part.

Q: Why not use Microsoft code contracts? How does this presentation apply to PostSharp? 

A: Microsoft Code Contracts is a great library which is unfortunately having a hard time right now. Recently it was turned into a community project, and I don't see much activity there. For those who do not know, the Code Contracts library defines preconditions, postconditions, and invariants, and even precondition inheritance: you can define preconditions on an interface, not on a class, and the library would, after compilation, rewrite your IL so that the contract code is injected into all classes that implement that interface. It is a tremendous idea. 

Unfortunately it never got popular, and it carried a grain of poison from day one: that IL rewriting is a heavy hammer. You don't want your production code to be rewritten by any tool. If you imagine deploying such code to production, knowing that some tool has changed it after you, then you will probably decide not to do that. I like the Code Contracts library very much, it is a great idea, but it is not popular today, and it is not heading toward success right now.

PostSharp includes tools for code contracts. Right now the tool does not support all the concepts that I have shown; however, it hits the central point by turning contracts into a separate concern. In PostSharp, contracts are implemented as aspects, so again, they live outside of your production code. It is a very important idea not to keep contracts as part of your code.


PostSharp allows you to add patterns to C# without changing the language, and it comes with a ready-made contracts library which covers part of what you have seen in this presentation. But you are also free to create your own contract aspects, so if you don't like what we have prepared, you can still use PostSharp and make your own pattern implementation. 

Q: Is there any situation in which non private custom exceptions should be used? 

A: Yes, of course. Exceptions are what their name says. Just imagine network exceptions, for example. Network exceptions give you a way to handle an external situation, something that did not originate inside your own code but came from the outside: the network failed. You could write a library which terminates execution by throwing an exception; that is a perfectly valid solution. However, you must make a clear distinction between situations which are suspicious in the sense of looking like a bug, and situations where something has prevented you from completing the operation and nobody could have seen that in advance. 

It is easy to see that, for example, a user is blocked and you cannot move money to that user; it is trivial to see that. If somebody, even after that, calls you and says, "Now give money to this user," and you see that the user is blocked, there's no point in throwing an exception. Because that caller might handle the exception and keep going, thinking that the money is there, and it's not, it cannot be. So that is a bug, not an exceptional situation; it's a clear bug and you must stop execution right now. There's a popular saying: "A dead process makes less damage than a crippled one." If you catch a bug and let your process keep going, it is a crippled process; something has happened and you have no clue what, and if you let it keep working, it might damage the data, it might commit something to the database that it should have rolled back.

I hope you understand the distinction now. There are bugs, and there are exceptional situations, so throw exceptions only in the second category.
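The distinction can be illustrated with the blocked-user example from the answer (a sketch; the `Account` class and its members are illustrative, not from the demo):

```csharp
using System;

public class Account
{
    public bool IsBlocked { get; set; }
    public decimal Balance { get; private set; }

    public void Deposit(decimal amount)
    {
        // Calling this on a blocked account is a bug in the caller,
        // not an exceptional situation: the caller could (and must)
        // check IsBlocked in advance.
        if (IsBlocked)
            throw new InvalidOperationException("Bug: deposit to a blocked account");

        Balance += amount;
        // A network or database failure at this point, by contrast,
        // would be a true exceptional situation: nobody could have
        // seen it in advance.
    }
}
```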

Q: Doesn't the argument against the rewriter for Code Contracts also apply to PostSharp?

A: Actually, no – IL rewriting done by Code Contracts makes the code you see during debugging not the same as the code you execute. PostSharp respects the whole development chain, so even though it is rewriting the IL, the debugging experience remains untouched. You can debug not only your production code, but also the aspect code including even your custom build-time logic. And the PostSharp Visual Studio extension helps you to keep track of all the enhancements PostSharp does in your code.

Q: Why not using the standard Microsoft Code Contracts library instead?

A: Code Contracts are obtrusive, and my experience shows that development teams are not ready to take the risk. The teams I was leading have never even considered using the Code Contracts library, and I didn't want to push the matter either.
If I had to pick the sole reason for not advocating use of Code Contracts library, it would be the fact that it is modifying the IL code in a non-transparent manner. Even the original documentation states that IL code rewriting may be the reason for teams to avoid using Code Contracts.
In my opinion, Code Contracts could have been designed as a regular library, and that would remove that veil of mystery that surrounds the process of code rewriting. I suspect that primary concern was performance. But then, if you look at spectacular performance gains of ASP.NET Core, for example, we could question that reasoning on grounds that performance could have been improved using other means. We will probably never know.

Q: How does this relate to Code Contracts?

A: If you decided to try Code Contracts on this same example, you would find a strong correlation between the syntax I have shown and the Code Contracts syntax.
That is not a coincidence, because my example was strongly influenced by the Code Contracts library.
There are two reasons why I opted for custom contracts instead of a library. For one thing, I wanted to show you the bare bones, because that is more convincing; the core of the contracts implementation is, as you could see, no longer than ten lines of code. The second reason is that in previous demonstrations part of the audience complained that I was pursuing a lost cause, predicting a slow and painful death for Code Contracts. I am very fond of that library, it is very smart and also complete, but the current level of support truly doesn't fuel optimism.

Q: How does this presentation apply to PostSharp?

A: PostSharp tool supports code contracts as an aspect. You can add attributes to method arguments and describe preconditions that must be satisfied (e.g. that an argument must be non-null). It also supports custom extensions so that you can widen the use of aspects provided by the tool.
The theory which I have demonstrated is more general than what any tool (except Code Contracts) currently supports, and that goes for PostSharp as well. I have insisted on the unconstrained version of the theory so that you can see the direction in which this is going. Tools are getting better every year anyway.

Q: How to extend code in a way that client code will know that method does not return null or does not accept null as an argument for instance?

A: Expectations made by the method are called preconditions; promises made by the method are called postconditions. Failing to meet a precondition indicates a bug in the caller; failing to meet a postcondition indicates a bug in the called code. So much for terminology.
Preconditions and postconditions are part of documentation. You may wish to write unit tests which document them in an executable way, but there is currently no way to rely on, say, the language or Visual Studio for that.
There was an attempt to design a Visual Studio extension which would read Code Contracts and expose them as part of IntelliSense. That extension was unstable and I had to remove it from my copy of Visual Studio after trying it for some time. There are other similar attempts, but none which would make an impression.
There were requests to Microsoft to include contracts in Roslyn compiler, but the team has rejected that. It remains to wait for someone to develop an extension which would incorporate contracts validation into the build, I suppose.

Q: Are these given precondition contracts all we need, or do you use (many) different ones not shown here?

A: As in the previous question, we also have postconditions, the promises made by the method. For example, a method may promise to never return null, and then it asserts that the return value is non-null before exiting. Such a postcondition tells callers that they don't have to guard against null before accessing the result of our method, which is a useful hint indeed.
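A postcondition check of that kind might look like this (a sketch; `Ensures` and the repository classes are illustrative names, not an existing API):

```csharp
using System;

public static class Postcondition
{
    // Postcondition helper: assert a promise about the result
    // just before returning it to the caller.
    public static T Ensures<T>(T result) where T : class
    {
        if (result == null)
            throw new InvalidOperationException("Postcondition failed: result is null");
        return result;
    }
}

public class Student { }

public class StudentRepository
{
    public Student Find(int id)
    {
        Student student = Load(id) ?? new Student();
        // Callers can rely on a non-null result and skip their own
        // null guards; that is the value of the postcondition.
        return Postcondition.Ensures(student);
    }

    private Student Load(int id) => id > 0 ? new Student() : null;
}
```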
But generally, even if you decide not to use any tools, preconditions that I have shown in this demonstration are pretty much everything you might need. As any great idea, Design by Contract is very simple.
There is, however, one subtlety. Preconditions are inherited, so that derived classes are satisfying the Liskov Substitution Principle out of the box. It is forbidden to add more preconditions in the derived class. Now, that sounds easy when you have base and derived classes. The real problem is that all this stands for interface inheritance as well, and that is what makes Design by Contract messy in .NET.
Code Contracts library came with a powerful (and complicated) solution to contracts inheritance problem. You may refer to their documentation for details.

Q: Where can I get more instructions on using contracts?

A: The authoritative source of information on Design by Contract is Bertrand Meyer’s book Object-Oriented Software Construction. That is where Mr. Meyer has introduced and explained DbC theory in such terms that there is virtually nothing more to be added.

Q: Your idea of recovery is not technically sound... How this "separate" concern can work in multi-threaded environment?

A: Multithreading is another concern. It is not advisable to bake thread safety into any given piece of code. It is better to implement thread safety as a separate class which wraps a non-safe component. In that way, the code is much simpler and usually less error prone.

When things are put that way, it turns out that contracts remain in the component which doesn't deal with threads, and therefore the approach is 100% sound.
Take the TPL as an example: threading concerns were incorporated into tasks and task schedulers, leaving the logic intact. If you look at the way the Code Contracts library already handles async/await constructs, you will see that there is nothing missing in their solution.
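The wrapping approach from the answer can be sketched as a decorator (the repository interface and classes here are my own illustrative example):

```csharp
using System.Collections.Generic;

public interface IRepository
{
    void Add(string item);
    int Count { get; }
}

// Plain component: contracts and domain logic live here,
// with no threading code at all.
public class Repository : IRepository
{
    private readonly List<string> items = new List<string>();
    public void Add(string item) => items.Add(item);
    public int Count => items.Count;
}

// Thread safety as a separate, wrapping concern (a decorator).
public class SynchronizedRepository : IRepository
{
    private readonly IRepository inner;
    private readonly object sync = new object();

    public SynchronizedRepository(IRepository inner) => this.inner = inner;

    public void Add(string item) { lock (sync) inner.Add(item); }
    public int Count { get { lock (sync) return inner.Count; } }
}
```

Callers that need thread safety take the wrapper; everyone else uses the plain component, whose contracts stay untouched.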

Q: How practical is it to remove preconditions from production code? What should your method return in production if someone violates its preconditions?

A: Sometimes people decide to remove contracts checking from production code just for performance reasons. Note that contract clauses can also be viewed as part of a larger testing picture.
You can consider rigorous proofs as one level (which you run once, on paper), automated tests as another level (which you run sometimes), and contract checks as the third safety valve (which runs every time the code executes). Now, it's obvious that you can drop contract checks if you are pretty sure that automated tests have done a good job, or even if you have a formal proof that the contract check can never fail.
In about one month from now a new course of mine will be published at Pluralsight – titled Writing Highly Maintainable Unit Tests. In that course I am explaining in great depth relation between formal proofs, unit tests and code contracts. For those of you having Pluralsight subscription, it might be useful to watch that course once it goes live.

Q: If we have checks/contracts on debug or staging, how would we not want that on production? Ex: checking to see if the students exists would have to be checked on production as well right?

A: This relates to previous question, where I have argued that code contracts are only one level of safety, though the most rigid one. If your code is tested well, then you don’t even have to put explicit contract checks into your code. However, I don’t think that any production code is tested enough to truly remove contract checks, because contract violation would then pass unnoticed and it could possibly damage data.
Therefore the conclusion: it pays to consider removing some of the costly contract checks. For example, remove a quantifier check with O(N) cost from methods with sub-O(N) complexity, like methods that run in O(log N) or O(1); leaving such a check in place would make the code observably slower. You probably don't have to remove an O(N) contract check from a method which already has O(N) complexity or higher, like O(N log N) or O(N^2); there the check reduces performance by at most a constant factor, which we can survive.

Q: Wouldn't Debug.Assert be removed from the Release build even without the #if DEBUG compiler directive?

A: That is true in my example; I included the conditional build just to show the concept. But don't forget that there is also a Release-mode version of assertions in .NET (Trace.Assert). Some people opt to assert in the Release build as well. That makes sense in very sensitive code, where erroneous execution might cause unacceptable consequences.

Q: Is there any situation in which non-private custom Exceptions should be used?

A: Exceptions remain a valid solution in all exceptional situations. If you are writing a library, you would probably opt to throw a custom exception back to the caller rather than burden the caller with having to use a particular code contracts library (including your own).
However, inside your library, you could still be fine if you decide to institute contracts as the design strategy. That will at least save you from your own bugs, not related to any other code that might call your functions.
Other legitimate cases are already covered by the .NET Framework and other libraries and frameworks. Consider network exceptions as an example. Network communication is a typical example of the process which is full of exceptional cases, and that is where exceptions can be of great value.
Note, however, that there is one funny side of exceptions which can give us a hint about when they are a valid option. Although they represent exceptional events, we always expect exceptions. We expect network exceptions when working with remote services. We expect database exceptions, like concurrent edits, deadlocks or timeouts. We expect IO exceptions when working with files. If you have a situation in which you actually do not expect anything exceptional to happen, then you will probably be better off avoiding the introduction of an exception.

Q: Is it correct to say that catching or throwing exceptions in between the execution of a function is an anti-pattern?

A: Using exceptions to procure information or to alter control flow is definitely an anti-pattern. Just consider this case: you have caught a custom exception and you are ready to do what it says. Now, how do you know that the exception was thrown by the entity you expected? Exceptions can jump over activation frames. You are certainly not going to inspect the call stack from the exception object to discover which function has thrown it.
This analysis can go much deeper than this. Maybe it is enough to think of catching specific exceptions as breaking encapsulation of the called code and knowing too much about its implementation. That is rigid and fragile and it breaks principles of OO programming - which is not the problem in itself, but then you won’t have any benefits of OO like polymorphic execution.

Q: Since preconditions have been rephrased as declarative code, can we use attributes to define contracts? Any frameworks for doing this?

A: PostSharp tool is defining contracts through use of attributes. However, keep in mind that attributes must be compile-time constant, and therefore you cannot pass just any expression as an attribute-based contract.
However, it is possible to cope with that, at least in theory: you could use a static attribute to point to another method which contains the more elaborate contract verification code.
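A hand-rolled version of an attribute-based contract might look like this. It is only a reflection-based stand-in for what an AOP tool such as PostSharp weaves in automatically at build time; the `NotNullAttribute` and `Guard` names here are my own, not PostSharp's API:

```csharp
using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Parameter)]
public class NotNullAttribute : Attribute { }

public static class Guard
{
    // Validates the caller's arguments against [NotNull] attributes.
    public static void Check(MethodBase method, object[] args)
    {
        ParameterInfo[] parameters = method.GetParameters();
        for (int i = 0; i < parameters.Length; i++)
            if (Attribute.IsDefined(parameters[i], typeof(NotNullAttribute))
                    && args[i] == null)
                throw new ArgumentNullException(parameters[i].Name);
    }
}

public class Service
{
    public void Register([NotNull] string name)
    {
        Guard.Check(MethodBase.GetCurrentMethod(), new object[] { name });
        // ... normal operation ...
    }
}
```

An aspect weaver removes the need for the explicit `Guard.Check` call by injecting it at compile time.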

Q: Briefly, what was the purpose of the private ContractException?

A: The primary fear about throwing exceptions when a precondition or postcondition is violated is that somebody might catch the exception and forget to rethrow it. In that way, the operation might continue to run even though we have just found sufficient reason to terminate it.
That is the reason why in Debug mode many developers opt to assert, i.e. to kill entire process, rather than throw an exception. But then, in production, killing the process is not the most user friendly behavior we could come up with, and for that reason we may decide to throw exceptions instead.
But then, we don’t want to throw an exception which can be explicitly listed in the catch block. We don’t want anyone to catch that exception for any reason. In order to catch a private exception, you must catch its base System.Exception instead. And it makes sense to expect that nobody will ever catch System.Exception deep inside code. That is the justification behind private exception class.

Q: If you want to access a resource such as a database, and you split the access into two functions, one to check whether there is a value and another to fetch the value, this will use the expensive resource twice. So maybe returning a nullable value is OK here?

A: Database access is not subject to Design by Contract. The database is, to our code, just another external system, and we deal with it with all the precautions we would take in any other such case. You have to catch exceptions coming out of the database because you have no idea what might go wrong there (deadlocks, timeouts, or even plain failures such as a corrupt index).

Then the question of how to deal with a nonexistent value where one was expected becomes easier. You just query the database as it is. Then turn the raw data into concrete objects, and those objects are subject to Design by Contract. If an object requires a value and you got nothing from the database, then don't call the method; take a recovery course instead. Otherwise, if you invoke the method with a null reference just because the database returned no rows, the called object retains the right to explode.
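The recovery-course idea might be sketched like this (all class names here are illustrative; `TryFind` returning null stands in for "the query returned no rows"):

```csharp
public interface IStudentRepository
{
    Student TryFind(int id);   // may return null: "no rows"
}

public class Student
{
    public string FormatGrades() => "A, B";
}

public class StudentService
{
    private readonly IStudentRepository repository;

    public StudentService(IStudentRepository repository) =>
        this.repository = repository;

    public string DescribeGrades(int studentId)
    {
        // The database is an outer system: a missing row is expected,
        // so we take a recovery course instead of calling the domain
        // object with a null reference.
        Student student = repository.TryFind(studentId);
        if (student == null)
            return "No such student";

        // From here on, Design by Contract applies: the domain
        // object's preconditions are known to be satisfied.
        return student.FormatGrades();
    }
}

public class InMemoryStudentRepository : IStudentRepository
{
    public Student TryFind(int id) => id == 1 ? new Student() : null;
}
```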


About the speaker, Zoran Horvat


After fifteen years in development and after leading a dozen development teams, Zoran has turned to training, publishing and recording video courses for Pluralsight, eager to share practical experience with fellow programmers. Zoran is currently working as CEO and principal consultant at Coding Helmet Consultancy, a startup he established to streamline in-house training and the production of video courses. Zoran's blog.


Modern business applications rely heavily on rich domain classes, which in turn rely heavily on polymorphic execution, code reuse and similar concepts.

How can we extend rich domain classes to support complex requirements?

In this webinar, Zoran Horvat will show why an object composition approach is favored over class inheritance when it comes to code reuse and polymorphism.

Watch the webinar to learn:

  • How class inheritance can lead to combinatorial explosion of classes
  • What the limitations of object composition are
  • What design patterns help consume composed objects
  • Techniques for creating rich features on composed objects

Applying Object Composition to Build Rich Domain Models on Vimeo.

Download slides.

Download code samples.

Video Content

  1. Limitations of Inheritance Approach (3:00)
  2. Composition Approach (12:45)
  3. Visitor Pattern and Double Dispatch Principle (20:47)
  4. Difficulty with Visitor Pattern (36:49)
  5. More Complex Examples (40:07)
  6. Accumulating State of Visitor (43:15)
  7. Q&A (55:08)

Webinar Transcript


Hello, this is Zoran Horvat speaking. I will be delivering a webinar and here with me is Alex from PostSharp.

Alex: Hello everyone.


Well, as you can see, this webinar is on the topic of creating rich domain models. I believe most or all of you have already suffered a lot building rich domain models, so in this webinar, I will show you one approach to that problem, so let's get started.

You may know me from Pluralsight, where I have delivered five courses so far, and the sixth one is on the way. The most interesting one regarding this webinar would probably be the course on design patterns called Managing Responsibilities. If you want to see more, you can go to Pluralsight, search for my name, and get into the courses.

In this webinar, we will talk mostly about the Visitor design pattern. In this course on design patterns on Pluralsight, you would find visitor and also chain of responsibility, which may also be applied to the same problem that I will show today.

The problem. Let's start with an example. I have come up with an example which doesn't require any particular domain knowledge. It has to do with animals because I suppose everyone knows a lot about animals and you will not have to learn any specific business domain to follow this webinar.

Suppose that we have a company which deals with animals. It needs to categorize them to do stuff to them. Let's see where that will lead us. For example, we could have a number of objects, each representing one animal like a cow, horse or lizard, snail, catfish, parrot, eagle and now we have to do something with these objects. Suppose that the first requirement is to categorize them somehow and what we do, what we will try to do will be to create a hierarchy of classes which is categorizing the objects based on their features so that two objects would share the same base type either directly or indirectly and through that base type, we would be able to access common features of those two objects.

Limitations of Inheritance Approach

In the first part of the presentation, I will show you the approach with the class inheritance. You know the old saying in object oriented programming, favor composition over inheritance. Now I will show you why. Suppose we have started to categorize these animals into ground animals, water and airborne animals. Suppose that doesn't satisfy all the requirements right now. Suppose that we want to work with the mammals only, so here they are.

We can add another level of inheritance, another layer of derived classes, which only covers the two animals that we are interested in. At this level of complexity, we are quite happy. We can apply the business logic that we have for mammals, and that looks like this, for example: we might have some object of type Animal, but it's not really an Animal, it is some class derived from Animal, and now we check whether it is a Mammal. If it is, we can cast this abstract Animal object into a Mammal object, for example a Tiger or anything else, and then we can apply the domain logic which is strongly tied to mammals.

For example, our business might pull the tail of an animal and then, I don't know, run away. At this point, we are in a good situation as far as the program is concerned. We can access a concrete feature of a concrete class, despite the fact that we started from an abstract Animal object. So this looks good, but let's continue.
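The type test and cast being described can be sketched like this (the class names match the animal example; `PullTail` and `Zoo` are my illustrative additions):

```csharp
public abstract class Animal { }

public abstract class Mammal : Animal
{
    public string PullTail() => "ran away";
}

public class Tiger : Mammal { }
public class Lizard : Animal { }

public static class Zoo
{
    public static string Interact(Animal animal)
    {
        // Type test and cast in one step (C# 7 pattern matching);
        // older code would use 'is' followed by an explicit cast.
        if (animal is Mammal mammal)
            return mammal.PullTail();

        return "nothing to do";
    }
}
```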

You know about special cases of mammals, namely whales and dolphins. Well, they are also mammals and they don't live on the ground. They live in water. If we wanted to also implement the same feature on these two kinds of animals, then we would have to introduce mammals in one additional place in the class hierarchy. If you think about this situation here, if you tried to, I don't know, map it to any real business domain that you may be working in, you would recognize a situation that you have certainly seen before.

There is this first front of classes that derive directly from the animal and at this level, everything is straight. We have ground, water, air. Everything is right, but then if you want to add one more, one additional kind of feature, not one feature, but additional aspect of an animal, then we start splitting ground animals into mammals, water animals into mammals, and we start duplicating logic.

Now, both of these would have some features that their parents do not have, like giving birth to live younglings as opposed to laying eggs. We have the feature of the mammal, another feature of a mammal which doesn't exist in any common ancestor. If you get back to this piece of domain code, then we see that it is not complete. To complete it, we also have to check whether it is another kind of mammal and then to repeat everything, only this time, with a different type.

Now, observe that this is not code duplication because for example, water mammals do not have tails. They have fins and we are not going to run, but to swim in the sea, so obviously we cannot reuse this code here because we are not working with the same type. There's nothing in common that we could use in these two examples.


Alex: If I can ask a quick question here. What if we try to change the class hierarchy to fix this issue, for example pull the Mammal implementation up in the hierarchy, push down what is up, and try to somehow simplify it?


Yes. Yes, that is a good question. The approach would be to pull the mammal up and in that way, make both of these have the same class or the same ancestor. However, if you do that, then ground and water animals would have to split under them.

Alex: Yeah.


We would virtually end up with the same problem, only different set of objects. Basically the problem is that we are solving a problem that we have with an inappropriate tool. The problem is that we have distinct features of an animal and we are trying to turn those distinct, unrelated features into a hierarchy which they do not exhibit and so we suffer. We suffer in a very painful way, as it says here.

We are duplicating domain logic, but we are not duplicating the code. The code is not the same because it is operating on different classes which are having no common ancestor. We can come up with more counter examples. You know about emus and ostriches as flightless birds, so if you wanted to add them, they would fit into the ground animal category, but they are still birds, so we have birds here and we have flightless birds there.

If we wanted to do something with birds again, we have code duplication. If we wanted to add even more specific mammals, like a flying squirrel or a bat, which are proper mammals, we would see that they are airborne animals. Or if we wanted to add a flying fish: it is still a fish, but it sometimes glides outside of the water. Here is the hierarchy of classes that covers these features. For only this small, limited number of features of these dozen animals, we have a dozen classes, because we had to split every single class into two or three to cover specific and duplicated features in every part of the hierarchy.

My point here is that attempting to solve a complicated domain problem with a class hierarchy very quickly leads to the situation this small hierarchy is already in. Only the first layer is clean and everything under it is a mess. If you go to the second layer, you see duplication in three places, but if you go even lower, you see that even those are cut into smaller compartments, because we want to know that the flying squirrel doesn't really fly. It glides. It needs wind or something to fly, but a bat needs nothing. It has full-blown flying ability.

If you want to explain an entire domain model with a class hierarchy, it is very soon going to become unbearable. You won't be able to complete it. Now again, this is an example with animals. You can make it less funny by applying it to a financial market, a bank, an insurance company, or a production plant, and you would see dozens of special cases that you cannot fit anywhere. For example, just try to imagine amphibians. Where would I put amphibians here? I have no clue.

Now, imagine an amphibian which is a mammal that knows how to fly, and you will see that there is absolutely no place to put it in this hierarchy of classes. I have seen very large hierarchies in real business domains and they were really impossible to manage. There was so much duplicated or almost-duplicated code that it was impossible to manage.

Composition Approach

We come to the primary topic of this presentation, in which we are going to talk about a different approach to solving the same problem: composition. We are not going to inherit classes. None of these animals is going to derive from any of those ancestor classes. Each is going to be just an animal, and the animal is not going to derive from anything. It is going to contain features. We had a special kind of feature, which is classification, biological classification, and we had mammals, birds, bony fishes, reptiles, and gastropods. That is a snail.

This diagram says that an animal is going to contain an instance of a class deriving from classification; it is going to contain a mammal object, for example. Let's see another feature. We had the environment in which the animal lives: it could live on the ground or in the water, or it could be seen in the air. We would add an environment object. Even more, we could add multiple objects, because some animals, like amphibians, can be seen on the ground and in the water. No problem.

Some others, like birds, live on the ground and sometimes fly in the air. No problem, two objects again. Even more than that, we could refine some of these objects and say there are two kinds of water; not every fish lives in fresh water, or in salt water. The third aspect of those animals was their abilities. They could walk or run if they are on the ground. They could fly. Now, flying could come in two flavors, gliding or full flight. Swimming could be even more complicated.

Now, as you can see, the animal could also have multiple abilities. There's nothing to prevent that. As you can see, I have split that large and cumbersome hierarchy into three distinct, smaller hierarchies, and none of these hierarchies alone exhibits any of the problems of the previous one, because each of them deals with only one aspect of an animal. You can always add multiple objects if you want to represent multiple abilities or environments or whatever.
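The split described above can be sketched as three small, independent hierarchies, one per aspect. This is a minimal sketch; the class names follow the slides, but the downloadable samples may differ in detail.

```csharp
// One trait hierarchy per aspect of an animal, instead of one deep
// hierarchy that tries to encode all aspects at once.

// Aspect 1: biological classification.
public abstract class Classification { }
public class Mammal : Classification { }
public class Bird : Classification { }
public class BonyFish : Classification { }
public class Reptile : Classification { }
public class Gastropod : Classification { }

// Aspect 2: environment; water refines into two kinds.
public abstract class Environment { }
public class Ground : Environment { }
public class Air : Environment { }
public class Water : Environment { }
public class FreshWater : Water { }
public class SaltWater : Water { }

// Aspect 3: abilities; flying comes in two flavors.
public abstract class Ability { }
public class Walk : Ability { }
public class Run : Ability { }
public class Swim : Ability { }
public class Fly : Ability { }
public class Glide : Fly { }
public class FullFlight : Fly { }
```

Each hierarchy stays shallow, and an animal can hold any number of objects from each of them, which is how a salmon can have both a FreshWater and a SaltWater environment.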

Now, these hierarchies can also start growing and become hard to manage; the abilities are already quite complicated. If that happens, then just apply composition to the abilities again. You could have, I don't know, ground abilities, air abilities, and water abilities, like a product between these hierarchies, and you could again compartmentalize the object and compose it from multiple smaller objects that describe it more closely.

This looks like a solution, right? And it is really not. The problem is that the animal class is now encapsulating its traits. We don't know what is inside, and if we follow the encapsulation principle, which is a very important principle in programming (I will say a sentence on that in a moment), then we are in trouble: we don't know whether this animal is a mammal. We cannot see that. If the animal broke encapsulation and let us access its classification directly from the outside, then we come to the reason why encapsulation is good: the animal could never again change its implementation of classification, because somebody would be depending on a concrete implementation of this feature.

That is why encapsulation is good. We are keeping the right to change the way the animal finds out who it is. For example, this animal could contain a classification object, but it could also contain a reference, a dependency, on some classifier. Whenever we ask the animal what biological class it belongs to, that animal object could contact its dependency and say: look, somebody's asking; you tell me, who am I? That object would answer that it's a mammal or a bird or whatever, and the animal would return that. That is what we get if we preserve encapsulation.

We need some, how should I say, good, legal, legitimate way to poke into the animal object without violating its encapsulation. One of the ways to deal with that is the visitor pattern.

Now, before explaining the visitor pattern, I will show you what these animal objects will look like, so if you're already familiar with the visitor, that will give you an idea of what we will try to visit. If you have a cow, its class is mammal, its environment is ground, and it has two abilities, to walk and to run. The horse is the same.

Emu and ostrich are almost the same, only they are birds. Then we have lizards, which can walk. We have the snail, which is a gastropod, and I don't know what to call its movement, whether it is a walk. Now, whale and dolphin are mammals. Here you can see an example of what was almost impossible in that hierarchy, the approach with inheritance: they are mammals, like the cow and the horse, but they share nothing else with them. They dwell in water. They can dive and live under water, so there is no resemblance to the cow and the horse.

That is exactly what I wanted to achieve with composition. That is something that you cannot do with inheritance. We have two kinds of fish, birds, and the flying squirrel and the bat, which also share only a few common features. We have this table which explains how we can construct an animal. We can construct a lizard by saying that its biological class is reptile and that it lives on the ground, which means: give it an object of type reptile, give it an object of type ground, give it an object of type walk, and then the lizard will be able to walk. And so comes the visitor.

The Visitor Pattern and Double Dispatch Principle

Here is the principal idea of the visitor pattern. We can talk about a hierarchy, like these abilities: running, flying, and swimming. Now, take these three concrete classes. They will have to do something. What will they do? For example, we might say that ability defines an interface, use. Like run: okay, start running. Or fly: start flying. Or start swimming. This is the common feature of all elements of the hierarchy.

However, after a while, we come up with a different idea: abilities might sometimes be suspended. I could break a leg and not be able to run. So ability might also have a suspend feature. Alright, we implement three more functions and we are fine. Now we have six functions in total. However, later on, we come up with another idea: advertise an ability. In the mating season, you can see someone running and jumping around for no obvious reason. It would not be the same as use; the animal could still run, fly, or swim, but in a completely different setting. Where does this lead us? If we have built this hierarchy of classes, we don't want to change them every now and then.

If you have this hierarchy, which might be overwhelmed with new features that use it, then adding all those features to those classes might not be the best idea. Instead, we could turn this hierarchy sideways: view the using feature as something that develops on its own, and then let the using feature use the run ability, or the fly ability, or the swim ability. The features become classes, and the abilities become their arguments. That is how we come to the visitor.

Some ability visitor would visit the ability class, and the class would tell it: okay, now use me; do whatever you do to this object. This class, AbilityVisitor, would now have one method for each element of the hierarchy. Do you see how things are skewed by 90 degrees? Now we have one visit method, but in three flavors: the run flavor, the fly flavor, and the swim flavor. Each visitor would implement an entire feature for each of the classes in the hierarchy.

Now, I said that the problem with the original design, where the ability base class had all three methods, is that when we add a new feature, we have to add it to each of the classes. It's obvious that visitors suffer the dual problem: if we add a new class, then we have to add a new visit method. It is not really better than the previous solution. It is just a different view on the methods.

We come to concrete visitors, like UseVisitor, SuspendVisitor, or AdvertiseVisitor. Those are the three visitors which implement the three features, but each of them implements an entire feature for the entire hierarchy, for each type in it. How does the visitor work? We take some ability. We don't know its concrete type; it is an ability object, and we call accept, passing a visitor, which is still something derived from ability visitor. All three members of the hierarchy have to override the accept method.

In each of them, we just say: visitor, whoever you are, visit this. Now guess what? Since this class is concrete, this call will statically link to this method, visit of run, because we are in the run class. The accepting method would look the same in fly, but it would call visit of fly instead. This is the basic principle of the visitor.
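The static linking described above can be sketched in a few lines. This is a minimal double-dispatch sketch; the names RunAbility, FlyAbility, and UseVisitor are illustrative, not taken from the demo code.

```csharp
// The visitor declares one overload per concrete class.
public abstract class AbilityVisitor
{
    public virtual void Visit(RunAbility run) { }
    public virtual void Visit(FlyAbility fly) { }
}

public abstract class Ability
{
    public abstract void Accept(AbilityVisitor visitor);
}

public class RunAbility : Ability
{
    // Inside RunAbility, 'this' has static type RunAbility, so the
    // compiler binds this call to Visit(RunAbility) at compile time.
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

public class FlyAbility : Ability
{
    // Same shape, but statically bound to Visit(FlyAbility).
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

// A concrete visitor implements one feature across the whole hierarchy.
public class UseVisitor : AbilityVisitor
{
    public string Used { get; private set; } = "";
    public override void Visit(RunAbility run) => Used = "running";
    public override void Visit(FlyAbility fly) => Used = "flying";
}
```

Calling `ability.Accept(use)` on an `Ability` reference dispatches virtually to the concrete class's `Accept`, which then statically picks the matching `Visit` overload; that pair of dispatches is the double dispatch.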

It is called double dispatch. We have one dispatch to let two objects meet each other: a concrete object derived from this class meets an abstract visitor, and then the concrete visitor, which must be something concrete, meets the concrete run object. That is how these two objects finally find each other, and then you can implement the feature on the run object, on a strongly typed run object. That is the major benefit that comes with the visitor design pattern. That is the major benefit, and now you know everything else is probably drawbacks.

As with any design pattern, this looks deceptively simple. It looks like you could implement the entire universe by applying the visitor pattern to everything. However, you can't. I already said we have the same problem with added classes. If you have a dynamic hierarchy, if you have interfaces here and there and you don't know who is going to implement them, this is not going to work. However, when it comes to those animals, I will show you a situation in which the visitor can be applied, and applied pretty well.

The rest of the demonstration will be code, so let's get started. This is the animal. The animal class has a public name, but the name doesn't do anything. The more important parts are private: one classification, a list of environments where the animal can be seen, and the list of abilities it has. The animal constructor requires a name, a classification, and one environment. Later on, we can add new environments, as many as we like, and we can add as many abilities as we like.
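A minimal sketch of the animal class just described, with small trait stubs so it compiles on its own. The method names AddEnvironment and AddAbility are my guesses; the downloadable samples may name them differently.

```csharp
using System.Collections.Generic;

// Trait stubs, just enough to compose an animal.
public abstract class Classification { }
public class Mammal : Classification { }

public abstract class Environment { }
public class Ground : Environment { }

public abstract class Ability { }
public class Walk : Ability { }
public class Run : Ability { }

public class Animal
{
    // The name is public, but carries no behavior.
    public string Name { get; }

    // The traits are private: composition with encapsulation.
    private readonly Classification classification;
    private readonly List<Environment> environments = new List<Environment>();
    private readonly List<Ability> abilities = new List<Ability>();

    // The constructor requires a name, a classification, and one environment.
    public Animal(string name, Classification classification, Environment environment)
    {
        Name = name;
        this.classification = classification;
        environments.Add(environment);
    }

    // More environments and abilities can be added later, as many as we like.
    public Animal AddEnvironment(Environment environment)
    {
        environments.Add(environment);
        return this;
    }

    public Animal AddAbility(Ability ability)
    {
        abilities.Add(ability);
        return this;
    }
}
```

A cow, as in the talk, would then be composed as `new Animal("Cow", new Mammal(), new Ground()).AddAbility(new Walk()).AddAbility(new Run())`.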

These are the animals. This interface lets me compose an animal from its features. I have prepared this static utility class which shows how you can actually compose a cow. Its name is cow. It is a mammal: it has an object of type mammal inside of it. It has an object of type ground inside of it. It has the ability to walk, which is a walk object, and another object of type run, so it can run. That is the cow. You have the horse, emu, ostrich, everything. All the animals are here.

Now, of course, you would never construct objects this way. I have just tried to mimic a repository of animals. The only really important member in this utility class is the static property which returns all the animals. You can imagine that you have loaded all these objects and attributes from a database and constructed each of these animals. This will be the entry point for us: I have a sequence containing all the animals. Those are all the animals from the diagram, plus a salmon, which I have added because it is going to provoke a bug, and I will show you that in a minute.

Let's try to use all these animals to perform a couple of tasks. One task will be to find all the mammals and say hello to them. This is what I want to do. I want to iterate. The example is obviously in C#, but I don't think I have used any specific features of the language; whatever your native language is, you should be able to follow the example. Here I'm iterating through the collection of all animals.

Now, let's explain this. This is the visitor. It is a visitor which says hello to mammals. How does it do that?

This is the classification visitor. Now, a word of warning: if you have used the visitor before, or if you have learned it from the Gang of Four book or any other book on the subject, you probably didn't see an implementation like this one. I don't like to tie myself too much to any concrete implementation of any design pattern. The point is to recognize the core of the pattern, and the core of the visitor pattern is the double dispatch mechanism. That is why the pattern is also called the double dispatch pattern. If you strip everything else from any explanation of the visitor pattern, including the diagram I have shown you, that core is what remains; this implementation is not equal to that diagram.

Just as a demonstration of how much I do not want to tie myself to any concrete implementation of a design pattern: if you strip off everything else and only keep the double dispatch mechanism, then you are suddenly free to implement the pattern in a way which is just right for your problem at hand. One of the problems I had was that I couldn't find a common ancestor for two classes. You will see in this implementation that we can now visit a common ancestor of two features of an animal. It will be like magic.

Now, the difference between this implementation and what you could find in the literature is that I also have a visit method for the base class: not only the five concrete biological classes, but also the base. If you just want to see whether an animal has a classification at all, or whether it is some alien, then you would listen to that one, not to any of the concrete ones.

Okay, now the concrete visitors. This is the classification visitor. You can see that the classification abstract class has a virtual, overridable accept method. It accepts any classification visitor and calls its visit with this. Let me go to the definition. Where is go to definition here? Here it is.

It goes to the concrete visit of classification, none other but this concrete one. Now, let's see more. Classification has, for example, a descendant, mammal. Mammal is derived from classification. It overrides accept and then says: base, do whatever you like, which means that we end up in the base method first, but then: visit me, and I am a mammal. Look where that ends up if I go to the definition: I'm invoking the concrete visit method which accepts a mammal, so it goes to the concrete classification visitor.

Now, I want to say hello to all the mammals, so all I do is accept an animal to visit, and then in this public say hello method, I say: you, animal, whoever you are, accept me, the concrete visitor. Then the animal does the magic. Let's see. This is the animal class. It says: my classification, you accept this visitor. We get to a concrete classification, which might be a mammal. If it is a mammal, it will again call visit of mammal, but not on the base visitor; on the concrete visitor.

I don't know if you could grasp all of that. It looks like magic, but that is the basic double dispatch principle. We had a concrete classification object and a concrete classification visitor, and in the end the concrete visitor meets the concrete classification. If we ever get into this method, that means that the animal object had a mammal object inside of it, which means that we can say hello to it, because it is a mammal. If we do not get into the visit of mammal method for this animal target, then the object we are visiting is not a mammal. That is how the visitor pattern works.
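The whole round trip just described can be sketched end to end. This is a minimal reconstruction, not the demo code itself: the visitor base has one empty virtual Visit per concrete class plus one for the base class, and each concrete classification announces its base first and then itself, as in the demo.

```csharp
using System.Collections.Generic;

// One virtual method per class in the hierarchy, plus one for the base.
// None are abstract, so a concrete visitor overrides only what it needs.
public abstract class ClassificationVisitor
{
    public virtual void Visit(Classification classification) { }
    public virtual void Visit(Mammal mammal) { }
    public virtual void Visit(Bird bird) { }
}

public abstract class Classification
{
    // Base implementation announces the object as plain Classification.
    public virtual void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}

public class Mammal : Classification
{
    public override void Accept(ClassificationVisitor visitor)
    {
        base.Accept(visitor);   // first: visit me as the base class...
        visitor.Visit(this);    // ...then: visit me, and I am a mammal
    }
}

public class Bird : Classification
{
    public override void Accept(ClassificationVisitor visitor)
    {
        base.Accept(visitor);
        visitor.Visit(this);
    }
}

public class Animal
{
    public string Name { get; }
    private readonly Classification classification;

    public Animal(string name, Classification classification)
    {
        Name = name;
        this.classification = classification;
    }

    // Encapsulation-preserving access: hand the visitor to the private trait.
    public void Accept(ClassificationVisitor visitor) => classification.Accept(visitor);
}

public class SayHelloToMammals : ClassificationVisitor
{
    private Animal target;
    public List<string> Greeted { get; } = new List<string>();

    public void SayHello(Animal animal)
    {
        target = animal;
        animal.Accept(this);   // double dispatch happens inside
    }

    // Reached only when the animal contains a Mammal object.
    public override void Visit(Mammal mammal) => Greeted.Add("Hello, " + target.Name + "!");
}
```

Iterating the visitor over `new Animal("Cow", new Mammal())` and `new Animal("Emu", new Bird())` greets only the cow, because only the cow's trait routes the dispatch into `Visit(Mammal)`.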

Difficulty with Visitor Pattern


Okay, maybe I can just ask a quick question. First, a note that we actually have a number of questions from our listeners, but we will answer them at the end of the presentation, so please keep asking questions; we will read them and answer them. But regarding my question, I think you already touched upon it in the slides. I noticed that in the base class you basically have a method for every classification; we have the classification visitor base class. It means you need to maintain this class when you are adding new classification subclasses, right? That is a negative side of this: we basically need to add a method every time.


Yes, yes. I have only touched on that question, and here's a good point to show the consequences. If you add another type to the classification hierarchy, we must add one more method like this for the new type, which has a couple of drawbacks. One is that we have to change a class which is not really related to the new type, which is bad. The other important aspect is that now everybody else would have to be aware of the new class. But I have made that a little bit easier by giving these base methods empty implementations. They are not abstract, so nobody has to override them, only those who are interested.


If you take this mammals visitor, it overrides only one of them, and the net result is that if you add a new class, and this visitor has no business with that new kind of animal or classification, we won't have to change this concrete visitor. So it is somewhere in between.

Alex: Yeah, yeah. You don't need to change all the classes.


That is the major difficulty with the visitor.

Alex: Okay, thank you.


Thanks. I will run this and you will see. I hope the font is not too small. You will see that I'm saying hello actually to all the ... Oh, this is terrible. Let's go back. Alright, so it has picked the cow, horse, whale, dolphin, flying squirrel, and bat. All those objects from distinct parts of that class hierarchy from the beginning are now in the same bag, and they have been recognized because each of them had a mammal object inside of it, and no other animal object had a mammal object inside of it. That is how the visitor works.

More Complex Examples

Okay, I will show you more complicated things. Now I want to say hello to all the animals under the sea. That is a completely different thing; that is the environment, and I have an environment visitor which visits all of these kinds of environments. Now I have a water creatures visitor, which is catching, you see who? Water. Now, water is derived from environment, but it has a saltwater descendant and a freshwater descendant. I don't want to have to override visit freshwater and then separately visit saltwater. I don't want to duplicate logic in two places just because there are two kinds of water.

I want to catch their base class and visit that. That is the change I made to the original visitor pattern in this example. That is why, when saltwater for example accepts a visitor (this is another implementation, just a different way to call the base, no problem), it calls visit with this as water, the base class, and then visit with this as the saltwater, so we will have two invocations on the visitor for the same object. Now, let's see how it works.

I'm using the same sequence of operations. I'm just creating a different visitor, the water creature visitor, giving it each of the animals and trying to say hello to each of them. What will happen? Now we have the whale, dolphin, flying fish, catfish, and two salmons. We said hello to the salmon twice because, if you take a look at the salmon animal, in the animals it is somewhere here. Salmon. Look, it has two environments: freshwater is one, and the other is saltwater, because it lives its entire life in the sea and then goes back to the river.

I said hello to the salmon twice.

Accumulating State of Visitor

There is one very interesting variation of the visitor pattern. It is called the accumulating visitor. I will show you that in the next example. An accumulating visitor does not perform its operation as soon as it can; instead, it accumulates the object it has visited. I will show you on the abilities visitor. Let me just find any of the abilities. Oh, I cannot find it.

Oh, here it is. This will be the next exercise: we want to take a picture of anything that flies. Now, we have two kinds of flying: gliding and full flight. I'm visiting the base class again. It is fly, which derives from ability, but that class has two descendants. One is glide, as you can see, and the other is full flight, so if you take an animal which can both glide and fly, which knows both things, then we would again visit it twice. Well, now, instead of taking a picture of the animal here, I will accumulate it in this private property and just keep going. Then I will overwrite it when I see another flying ability on the same target. I will overwrite it as many times as we get into visit fly, and act only after the fact.

Here's the take picture method. After all the visitor methods have unfolded, I remain with this recognized target, which is either null, if I didn't recognize any flying ability, or not null. If it is not null, then this animal has at least one flying ability and I can take a picture of it. I hope you followed this. It is another piece of convoluted logic, not quite easy to follow. However, this example does precisely that, and I will take a picture of each of the flying animals, including the mammal bat, the squirrel which is also a mammal, a fish, and two proper birds.
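The accumulating idea can be sketched like this. It is a minimal reconstruction with illustrative names (TakePictureOfFlying, Pictures); the key point is that Visit may fire several times per animal, but the action runs once, after the visit has unfolded.

```csharp
using System.Collections.Generic;

public abstract class AbilityVisitor
{
    public virtual void Visit(Ability ability) { }
    public virtual void Visit(Fly fly) { }
}

public abstract class Ability
{
    public virtual void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

public class Run : Ability { }

public class Fly : Ability
{
    public override void Accept(AbilityVisitor visitor)
    {
        base.Accept(visitor);   // announce as plain Ability...
        visitor.Visit(this);    // ...and as the Fly base of Glide/FullFlight
    }
}

public class Glide : Fly { }
public class FullFlight : Fly { }

public class Animal
{
    public string Name { get; }
    private readonly List<Ability> abilities = new List<Ability>();

    public Animal(string name, params Ability[] abilities)
    {
        Name = name;
        this.abilities.AddRange(abilities);
    }

    public void Accept(AbilityVisitor visitor) =>
        abilities.ForEach(ability => ability.Accept(visitor));
}

public class TakePictureOfFlying : AbilityVisitor
{
    private Animal target;
    private Animal recognized;
    public List<string> Pictures { get; } = new List<string>();

    public void TakePicture(Animal animal)
    {
        target = animal;
        recognized = null;
        animal.Accept(this);        // Visit(Fly) may overwrite 'recognized' repeatedly
        if (recognized != null)     // act only once, after the fact
            Pictures.Add(recognized.Name);
    }

    public override void Visit(Fly fly) => recognized = target;
}
```

A hypothetical animal with both a Glide and a FullFlight ability triggers `Visit(Fly)` twice, yet ends up in `Pictures` only once; an animal with no flying ability leaves `recognized` null and is skipped.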

I have found all kinds of flying abilities in all of these animals. Now, there's another exercise: telling all the mammals to run. Let me show that very quickly; I think you have already grasped the basic idea. I have a scare mammals visitor; however, when it visits a mammal, it uses another visitor to scare it. I must explain what it means to scare an animal: it means to make it run, so there's no point in scaring an animal that would stay in place, like scaring a snail.

Scare animal is different. It is an ability visitor which tries to see whether the animal can run, and then, you know what it does, it calls a feature of the run object, because in this method we have a strongly typed run object and we can use the feature which only the run class has in this entire system. It is just going to say that it is now running. You will have the opportunity to download this source code, and that will let you take more time to go through all this convoluted logic.

I will just run this example and show you. The task was to scare mammals, so I have found a cow, a horse, and a flying squirrel. Those are all the mammals that can run. I didn't try to scare the others, because they don't run. This would be a two-level visitor, so to call it.


Before you go to the next example, can I ask a question, because this looks interesting. This scare mammals visitor, it looks like it could be implemented as a chain instead. If you have a concept of a chain of visitors, you can say the first visitor will filter out the mammals, then you just have another visitor which will, for example, tell them to run, and so you can dynamically build a chain of visitors.

Zoran: Yeah, that question is funny because that is the next task.

Alex: Okay.


I'm doing precisely that. Yes, this scare mammals visitor, which is tightly coupled to the scare animal visitor, which in turn works on the abilities: this code is very hard to follow. It's very hard to understand how it works and why it works. It is basically ugly if you have to construct all these objects and match their types. It is hard to manage. If you could encapsulate this convoluted logic into something which is easier to use, that would be great. Here's what I have prepared for you: a solution which says, all animals of classification mammal. I believe this looks much better than using the visitor directly.

Now, in this example, I just want to join all the mammal names and print them out. Let's just find the mammals. This OfClassification is an extension method on a sequence of animals. Internally, it uses a filter that I have made. The classification filter is a new kind of visitor, and it visits some type derived from classification which I don't know in advance. It is a generic class.

Then this concrete visitor expects a classification and checks whether its runtime type is equal to what I expect. If it does match, it constructs the result, which contains only the animal I am visiting. This is extremely ugly code and I never want to see it in my application code. However, this is a utility class, and it has great value because I can hide it, along with the enumerable extensions class which uses it. They're really ugly, but they are utilities that I will write now, test now, and never touch again.

I won't have to write another visitor which just looks for the classification of an animal. This is done forever. An interesting aspect of this solution is that it has ultimately given up the idea of the visitor, because I am receiving the type T, which I don't know in advance, and I'm not visiting any concrete classification. This is not really the visitor anymore, although it derives from the visitor class. But now you can see how good it is to have something like that, because when I have all the animals, I can say of classification mammal and it will return, you can see, a sequence of animals, but now I know that only the mammals will be in that sequence, and it really looks like magic.
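The generic filter plus extension method can be sketched as follows, with minimal stubs for the types involved. The names ClassificationFilter and OfClassification follow the talk, but the bodies are my reconstruction, not the downloadable code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stubs: the filter listens to the base-class visit overload.
public abstract class ClassificationVisitor
{
    public virtual void Visit(Classification classification) { }
}

public abstract class Classification
{
    public virtual void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}

public class Mammal : Classification { }
public class Bird : Classification { }

public class Animal
{
    public string Name { get; }
    private readonly Classification classification;

    public Animal(string name, Classification classification)
    {
        Name = name;
        this.classification = classification;
    }

    public void Accept(ClassificationVisitor visitor) => classification.Accept(visitor);
}

// Ugly on purpose: a runtime type check, confined to this utility class
// and written, tested, and hidden once.
public class ClassificationFilter<T> : ClassificationVisitor where T : Classification
{
    private readonly Animal animal;
    public IEnumerable<Animal> Result { get; private set; } = Enumerable.Empty<Animal>();

    public ClassificationFilter(Animal animal) => this.animal = animal;

    public override void Visit(Classification classification)
    {
        if (classification.GetType() == typeof(T))
            Result = new[] { animal };
    }
}

public static class EnumerableExtensions
{
    public static IEnumerable<Animal> OfClassification<T>(this IEnumerable<Animal> animals)
        where T : Classification =>
        animals.SelectMany(animal =>
        {
            var filter = new ClassificationFilter<T>(animal);
            animal.Accept(filter);
            return filter.Result;
        });
}
```

Client code then reads as a plain chained call, `allAnimals.OfClassification<Mammal>()`, and never sees the visitor protocol underneath.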

I can pick the names of all the animals I got back, and those will be the names of the mammals. Let's run it, and it will be correct, you see: only cow, horse, whale, dolphin, flying squirrel, and that in a chained sequence of calls. The last example in this presentation will be the same thing, only scaring the mammals. Now I have a filter of mammals, and I have another extension method which uses an ability filter. I'm looking for the ability to run, which is ability of T; I don't know which ability it is. It works exactly the same as the mammal filter, the classification filter: the same idea behind it and the same runtime implementation. At this point, after the classification, I still have a sequence of animals.

Now comes use ability. I want to use the ability run, and this method asks me to supply a lambda which receives an animal that can run, together with the run object. Guess what I am doing? I'm telling that animal that I'm now scaring it away. I have access to a concrete animal object which can run and is a mammal, and I can use its own internal feature to make it run.

Now, this is magic. This is something that does not violate encapsulation. It has its problems if you imagine a new kind of animal; alright, it has its limitations, but this is really magic. I can run this, and it says: alright, cow, horse, and flying squirrel, each of you run, and they say, we started running. So this is obviously working, and it is certainly readable.
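The use-ability part of the chain can be sketched the same way: an ability filter in the spirit of the classification filter, wrapped in an extension method that hands the caller a strongly typed ability. The names AbilityFilter, UseAbility, and StartRunning are my guesses at the sample code, and unlike the exact-type classification filter, this sketch uses an `is` check, so derived abilities would also match.

```csharp
using System;
using System.Collections.Generic;

public abstract class AbilityVisitor
{
    public virtual void Visit(Ability ability) { }
}

public abstract class Ability
{
    public virtual void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

public class Walk : Ability { }

public class Run : Ability
{
    // The feature that only the Run class has in this entire system.
    public string StartRunning() => "We started running";
}

public class Animal
{
    public string Name { get; }
    private readonly List<Ability> abilities = new List<Ability>();

    public Animal(string name, params Ability[] abilities)
    {
        Name = name;
        this.abilities.AddRange(abilities);
    }

    public void Accept(AbilityVisitor visitor) =>
        abilities.ForEach(ability => ability.Accept(visitor));
}

// Runtime type matching, ugly but written once and hidden from clients.
public class AbilityFilter<T> : AbilityVisitor where T : Ability
{
    public List<T> Found { get; } = new List<T>();

    public override void Visit(Ability ability)
    {
        if (ability is T typed)
            Found.Add(typed);
    }
}

public static class AnimalExtensions
{
    // Hands the caller the animal together with its strongly typed ability.
    public static void UseAbility<T>(
        this IEnumerable<Animal> animals, Action<Animal, T> action)
        where T : Ability
    {
        foreach (var animal in animals)
        {
            var filter = new AbilityFilter<T>();
            animal.Accept(filter);
            foreach (var ability in filter.Found)
                action(animal, ability);
        }
    }
}
```

With a cow that can run and a snail that can only walk, `animals.UseAbility<Run>((animal, run) => ...)` invokes the lambda for the cow only, and inside the lambda the `run` parameter is a strongly typed `Run` whose `StartRunning` feature can be used directly.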

I would stop the presentation at this point. I hope you have enjoyed it. Now you can ask a few questions, and all the questions that we cannot answer right now will be answered in textual form after the webinar.



Q: Is it possible to implement the same interface, say IMammal, on both Mammal and WaterMammal, and then cast to each of the concrete types? This is about using interfaces in this hierarchy of animals.

A: Yes, that is a good question, and that is what many people actually do, so let me disappoint you: they do that only to postpone the disaster. Here's the problem. Basically, the question relates to single-inheritance languages, and this whole presentation was in one of those, C#; the same holds for Java. Now, you can implement interfaces on classes that are not related through a common ancestor, but then you have to implement the same logic twice. So the first point is that you still have duplicate logic, only it has moved into the classes. You're not going to reduce your code.

The second thing is that, very often, you do not have an interface which is the same for two distinct members of the large hierarchy, because doing stuff to objects may involve a sequence of calls, a calling protocol, to complete an operation. That calling protocol is not going to be the same in one case and the other, and you will actually not have a common interface. It might get you somewhere for a while, and then you will find yourself in quicksand. You won't be able to implement more features.

Q: Why do you show such unrealistic examples? No serious developer would create objects in property getters.

A: Because I didn't have a database underneath. I just constructed the objects in those getters to have them at hand. You can ignore all those property getters and only look at the last one, which returns the sequence of them all. That is the only one I actually used. I know it's not realistic, but it resembles a repository, for example.

Q: How do you persist encapsulated domain entities to database without breaking encapsulation? 

A: Yes, you have many problems if you try to persist a domain model once the domain model becomes complicated. If you have got to the point where you cannot extend your huge domain model anymore, then you have already been suffering a lot with persisting that model. The answer is that, at some prior stage, you will probably already have separated the domain model from the persistence model; the persistence model would break encapsulation and let the data out, while the domain model would expose only methods.

Q: How do you ensure that the visitor always matches the concrete object and that there is no implicit conversion?

A: The visitor is designed to match concrete objects. If there is a need to support out-of-band objects, like objects from one hierarchy that can be converted to objects from a different hierarchy, then the compiler will have a hard time deciding which method of the visitor to invoke.

The major selling point of the Visitor pattern is that it makes obvious to the compiler which method to invoke in every possible situation. The variation, namely which concrete implementation of a method to invoke, has been turned into an object, a visitor object. That makes it possible to reach enormous flexibility at run time while remaining within the premises of strong typing all the time. Not many designs can do that.
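Since the webinar's C# samples are not reproduced in this transcript, here is a minimal Python sketch of that double-dispatch mechanism; all class and method names are illustrative:

```python
class Animal:
    def accept(self, visitor):
        raise NotImplementedError

class Dog(Animal):
    def accept(self, visitor):
        return visitor.visit_dog(self)   # first dispatch: the concrete animal picks the method

class Cat(Animal):
    def accept(self, visitor):
        return visitor.visit_cat(self)

class SoundVisitor:                      # second dispatch: the concrete visitor supplies the behavior
    def visit_dog(self, dog):
        return "woof"
    def visit_cat(self, cat):
        return "meow"

animals = [Dog(), Cat()]
sounds = [a.accept(SoundVisitor()) for a in animals]  # ["woof", "meow"]
```

The compiler (or here, the interpreter) never has to guess: each concrete class routes the call to exactly one visitor method, so the two concrete objects, animal and visitor, meet with no type tests in the client code.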

Q: What happened to simplicity? It seems like you are making an alternative to LINQ-to-Objects to avoid having to deal with a solution that you call ugly. It seems over-complicated.

A: There are no elegant solutions in complex domains, and certainly no simple ones. There are only different places to hold the complexity. The goal of the game is to make the client code simple.

In the Visitor pattern, concrete visitors encapsulate the complexity. The chained-calls approach has only saved the caller from having to know the visitor protocol.

One alternative was mentioned by a viewer: make classes with no common base class implement common interfaces. This is a viable solution unless the distinct classes come with different method dependencies and different calling protocols. At that moment, the common-interface idea is off the table.

Another alternative is the Chain of Responsibility pattern, which I have applied to this same problem in the Tactical Design Patterns in .NET: Managing Responsibilities course on Pluralsight. It works nicely, but it fails to differentiate subtle cases, like what happens if an object contains more than one component of the requested type.

The worst options are a bare hierarchy of classes and public composition (in which the object exposes its content). A class hierarchy promotes code duplication, and public composition prevents future maintenance of the composed class.

Q: Would it be possible for the OfClassification&lt;T&gt; method to return an IEnumerable&lt;T&gt; instead of IEnumerable&lt;Animal&gt;?

A: Classification objects do not reference their containing Animal; that was a design decision. It might be better to have OfClassification&lt;T&gt; return IEnumerable&lt;Tuple&lt;Animal, T&gt;&gt;, which gives the entire information to the consumer.

Another question here is how this method would behave if one animal object contained multiple classification objects of the same type: would it return multiple records then? That is one of the difficulties in the approach I have shown in the webinar.
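Both points raised above, the (animal, component) pair and the multiple-records behavior, can be sketched in Python; `of_classification` here is a hypothetical stand-in for the webinar's OfClassification&lt;T&gt;, and the class names are made up:

```python
class Animal:
    def __init__(self, name, components):
        self.name = name
        self.components = components

class Mammal: pass
class Swimmer: pass

def of_classification(animals, cls):
    # Yield an (animal, component) pair for every component of the requested
    # type; an animal holding two such components therefore yields two records.
    for animal in animals:
        for component in animal.components:
            if isinstance(component, cls):
                yield (animal, component)

dolphin = Animal("dolphin", [Mammal(), Swimmer()])
duck    = Animal("duck", [Swimmer(), Swimmer()])   # two components of one type
records = list(of_classification([dolphin, duck], Swimmer))  # 3 records: 1 + 2
```

Returning the pair keeps the containing animal available to the consumer even though the classification object itself does not reference it.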

Q: Why not use dynamic to implement double dispatch instead of the visitor pattern?

A: Because you don’t know whether the dynamic object will even have the desired feature (like the Start() method, which only exists on the Run class). The fundamental feature of the double-dispatch mechanism is that it lets two concrete objects meet each other.

Q: How can the Visitor pattern and composition be applied together with dependency injection?

A: A visitor can be injected using common DI techniques. It boils down to selecting a concrete visitor, which is exactly what DI frameworks are good at. There are subtle issues here, though they are not hard to resolve. For example, I have requested target objects in all visitor constructors. That would have to be redesigned to fit DI, but, as I said, it's not too hard.

Composition does not fit the DI pattern, since there is no object that implements a certain type; it is rather an object that is constructed in a certain way. We could argue that composite objects are closer to the Builder pattern, which codifies the process of building a complex object. In that respect, an IoC container (which implements DI) would rather resemble an Abstract Factory, which is of a lower complexity level than the Builder.

Q: How do you persist encapsulated domain entities to database without breaking encapsulation?

A: The persistence problem is not related to the application of the Visitor pattern or object composition, as it exists in class hierarchies as well.

There are two levels at which we can attack it. One is to use an ORM that can access the private data members of a class; Entity Framework and Hibernate can both access private members. You can use that approach to some extent.

In really complex domains, the differences between the OO model and the relational model are significant. When you see that persistence is affecting the domain model in adverse ways, it is probably time to separate the domain model from the persistence model. You can then apply a mapping framework to convert the proper OO model into a flat persistence model and vice versa.

Q: Does this pattern stay within the borderlines of the SOLID principles?

A: Depending on the application, the Visitor pattern may be seen as an enforcement of the SOLID principles or as an adversary to them. Here are some guidelines on that:

S – Single responsibility – Each concrete visitor deals with only one feature. The Visitor pattern enforces SRP more than the original hierarchy of classes, each of which was doing more than one thing.

O – Open for extension, closed for modification – If you have a fixed hierarchy, then you will never have to modify the abstract visitor. That is the primary niche of the Visitor pattern: it is natively applicable to fixed hierarchies, like classes of animal species or types of bank accounts. It also supports easy extension, because extending the system boils down to implementing a new concrete visitor.
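Extension without modification can be shown in a few lines; this is a Python sketch with hypothetical names, not code from the webinar:

```python
# The hierarchy is fixed: Dog and Cat never change after this point.
class Dog:
    def accept(self, visitor): return visitor.visit_dog(self)

class Cat:
    def accept(self, visitor): return visitor.visit_cat(self)

# Extending the system = writing a brand-new visitor class.
# No existing class is modified, which is the open/closed principle at work.
class LegCountVisitor:
    def visit_dog(self, dog): return 4
    def visit_cat(self, cat): return 4

total_legs = sum(a.accept(LegCountVisitor()) for a in [Dog(), Cat()])  # 8
```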

L – Liskov substitution principle – Methods of the original concrete classes are now moved to concrete visitors. LSP is violated if a derived class adds method preconditions. Therefore, if the original classes had new preconditions added, then the concrete visitors will have to add them too. In other words, the Visitor pattern will violate LSP to exactly the same extent as the original classes did.

I – Interface segregation – The original class can implement one interface per concrete operation. With the Visitor pattern, each concrete visitor represents one such interface. Segregation then boils down to selecting a visitor object, which is even more flexible than compile-time interface segregation. This means that concrete visitors will satisfy ISP at least to the same extent as the original classes did.

D – Dependency inversion – Since each concrete visitor is a separate class, and all visitors share the same base type, it is possible to inject them as polymorphic dependencies. That is the premise of the inversion of control principle. The original classes had the same feature, so DI is supported both by the class hierarchy and by the corresponding hierarchy of visitors.

The conclusion is that the only true difficulty with the Visitor pattern appears when the hierarchy of classes is not stable. If we have to keep adding new classes to the hierarchy, then we will have trouble implementing and maintaining the hierarchy of visitors.

Q: How will application performance be affected by this approach when we need to create several methods?

A: From the source code perspective, nothing will change. The solution will contain exactly the same number of methods; they will only be moved to different classes.

From the run-time performance point of view, there is only a negligible penalty: where one virtual method used to be called, we will now have two virtual method calls. That is not sufficient motivation to worry about performance loss due to applying the Visitor pattern.

About the speaker, Zoran Horvat

After fifteen years in development and after leading a dozen development teams, Zoran has turned to training, publishing and recording video courses for Pluralsight, eager to share practical experience with fellow programmers. Zoran is currently working as CEO and principal consultant at Coding Helmet Consultancy, a startup he established to streamline in-house training and the production of video courses. Zoran's blog.


Starting with the premise that "Performance is a Feature", Matt Warren will show you how to measure, what to measure and how to get the best performance from your .NET code.

We will look at real-world examples from the Roslyn code-base and StackOverflow (the product), including how the .NET Garbage Collector needs to be tamed!

Watch the webinar to learn:

  • Why we should care about performance
  • Pitfalls to avoid when measuring performance
  • How the .NET Garbage Collector can hurt performance
  • Real-world performance lessons from Open-source code

Performance is a Feature on Vimeo.

You can find the slide deck here:

Video Content

  1. Why Does Performance Matter? (4:36)
  2. What to Measure? (11:48)
  3. When to Measure? (21:06)
  4. How to Identify Performance Issues? (25:31)
  5. Alternatives (36:41)
  6. StringConcat versus StringBuilder (41:49)
  7. Garbage Collection (46:21)
  8. Stack Overflow Performance Lessons (50:04)
  9. Roslyn Performance Lessons (50:59)
  10. Q&A (56:52)

Webinar Transcript

Hi. Good afternoon. My name is Matt Warren and this is a webinar I'm doing alongside PostSharp. We have Tony from PostSharp on the webinar as well, who will handle answering your questions. Let's make a start. Just to start things off, these are my details, and I'm on Twitter, like most people. I have a blog where I write about similar things to this talk, certainly around the idea of performance and a bunch of things around the internals of .NET. That's the kind of thing I talk about a lot.

I currently have a little poll on my Twitter account, if people want to take a look at that at some point in the next half an hour. I would like to get some idea of how many people's current projects have performance requirements and those sorts of things. If you have the chance and want to go to my Twitter account, see the poll there and answer it, it would be good to get an idea of how those things work out for different people's projects. That's me. Let's get into the main part of the presentation.

I have to put this upfront to really say that, unfortunately, I'm not eloquent enough to come up with this really nice title of Performance as a Feature, mostly because you can type it into Google and this is the first result returned. If you're not familiar with the Coding Horror blog, it's Jeff Atwood's; he's the co-founder of StackOverflow, and that's where I first heard the term, and probably one of its most popular recent uses. It's where we're going with this talk. He talks about it in his blog post, and this talk covers the same ideas: we treat security as a feature, we treat usability as a feature, we treat, obviously, functionality as a feature, otherwise there's nothing much left. But do we treat performance as a feature? Should we treat performance as a feature? What does it look like if we do treat performance as a feature? That's where we're going with this talk today.

Just to give a little bit more context, there's obviously a whole range of areas within general .NET applications, web applications, client apps and others. Many different levels are involved: the UI, whether that's on a phone, an app or your web UI; quite often a database and a caching layer; and the .NET CLR.

This talk looks at performance within the .NET CLR and the specifics of that. There are a lot of resources out there on the other areas: for front-end stuff there are great books on getting better performance, and database and caching are fairly standard as well. We'll touch on those. The bottom box, if people aren't familiar with this idea of mechanical sympathy: it actually comes from motor racing originally. I don't know if any of you are into your cars, petrol-heads. This guy here, Guy Martin, popularized the quote. He's saying, basically, you've got to have a level of mechanical sympathy, don't you, or otherwise you're just a bull in a china shop.

The basic idea from motor racing is that to be a good driver, you have to understand the mechanics of the car; you have to have sympathy for the mechanics of the car to get the best out of it. A guy called Martin Thompson co-opted the term. He's mostly from the Java space and his blog is called Mechanical Sympathy. If you want to find out more about performance at the level below the CLR, things like CPU caches and stuff that's almost outside of the CLR, then mechanical sympathy is a good term, and that blog is a good place to start to find out what's going on there.

On to the agenda that we'll be covering today. We start with why does performance matter: why should we take performance seriously, why should we care about performance? Then what do we need to measure as part of that, and how can we fix the issues, with some real-world examples of how some of these issues were fixed, what can be done, and where we need to worry about these types of performance issues. So: why, what and how.

Why Does Performance Matter?

Why should we care about performance? Why do we need to take performance seriously? I think there are a few reasons. One is that in this day and age, with a lot of things being cloud hosted, there's actually a monetary saving to be made. If you were able to improve the performance of your application by 20% or 30%, that might mean you could go to your boss on Monday morning and say, "We can save on our hosting bill, on AWS or whatever it might be."

I don't know what sort of relationship you have to your boss and whether you saying that is going to get any money passed back to you, I don't know how that works, but potentially there's savings for the companies anyway. Even if you're not in a hosted situation like that, where you're looking to spec machines yourself, you can still spec lower-cost machines or things like that. There's an idea of saving money. 

I think another one is saving power, particularly around constrained devices: phones, tablets, these sorts of things. There's a level where saving power is very useful for our users; it makes for happy users, if you like. I don't know how many of you have installed an app on your phone and got rid of it within a week, basically because you realized it was draining your battery 10 times faster than before, so there's that idea.

At the other end of the scale are, I guess, companies like Google and Amazon, which host data centers, and for them every amount of power they can save is vital. We're probably not at the extremes of that, but somewhere in the middle. Again, the idea is that good performance equals a saving of power.

I think one of the main ones, actually, is that for a lot of our users, bad performance basically equals broken. As developers we might understand that in reality the page is just loading slowly, or that the button click takes a long time to render the response, whatever it might be, because of bad performance, but users don't really think in those terms. They just think this site is running slowly, this site's not responsive, this app takes too long; every time I click a button I get frustrated. For them, bad performance equals broken. The worst end of that is customers who don't come back, or customers who never buy our products, or maybe just unhappy customers. Either way, it's not a good experience for our customers. Bad performance, at some level, equals broken for our customers.

A classic example of this: Google did a study in which they artificially introduced a half-second delay, and for them that caused a 20% drop-off in traffic. Obviously, that's the extreme end, but for us maybe there's a level at which it becomes a problem. Maybe the customer demo goes badly wrong because of bad performance and the customer never buys your product, or maybe a fairly influential person on Twitter has a bad experience with your product, tweets about it and gives you some bad press, whatever it might be. We're probably not going to see the same drop-off in traffic as Google, but at some level we're going to have lost customers or unhappy customers, customers who aren't going to buy our products or aren't going to buy again, that type of thing.

I think there are a few reasons there. Maybe all of them apply to the sorts of products you work on, maybe just some of them, but those are some reasons why we should take performance seriously. Another one is almost a matter of pride for us as software developers. There's a quote from a guy called Henry Petroski, an engineer who has written a lot of books about mechanical and civil engineering, a professor at Duke University. It says that we in the software industry, which a lot of you on this webinar are part of, are doing our level best to cancel out the steady gains of the hardware industry. We're probably not being that deliberate about it, but the idea applies. Hardware has generally been getting faster. We're not seeing the same single-core CPU increases, but with multi-core, what hardware can do is still increasing at a fairly large rate.

But potentially, software is treading water or causing that to slow down. We sort of know that ourselves, don't we? If we get annoyed with Word, the latest version of Word and we say, "Oh, on my old 386 PC, Word 98 was lightning fast."

The new version of Word on my quad-core PC with SSDs and everything is running ridiculously slowly. We kind of know. There's that idea as well. It wouldn't be a talk about performance without the famous Donald Knuth quote that premature optimization is the root of all evil. That may be true, but a lot of the time that quote is actually misquoted, as you can probably guess where this is going. The entire quote looks like this: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

So yes, there are times when optimization is premature: when we're doing it for the wrong reasons, optimizing just for the sake of it, whatever it might be. But the quote is also saying that there are opportunities, the critical 3%. It's interesting because it implies that we need to measure. What is the critical 3%? What is the 97% we can ignore? We need to know these things; we can't guess at them. At least if people are going to quote that premature optimization is the root of all evil, it's nice to know the full quote around it, where it fits in, and what he was saying a bit more.

One bit to sum this all up for me: there's a developer named Rico Mariani, an architect at Microsoft, who did a lot of work on, I believe, the version of Visual Studio that added a lot of WPF capability and lots of nice UI, but slowed things down a lot. I believe he had a hand in making the performance of that better. He sums it up like this: "Never give up your performance accidentally."


The idea is that there's always going to be a trade-off. We won't always have the time to make everything as performant as possible, and that's not always a worthwhile endeavor. There's a point where the performance is good enough for business reasons, for customer reasons, whatever it might be. But at least let's not give up our performance accidentally. Let's know where these places are, have them measured, and make sure we understand: yes, this bit of the code is not as fast as it could be, but we've measured it, we understand it's as fast as we need it to be for our situation, and any extra optimization would, we believe, take too much time or not be worth it. We're not treating this blindly; we're saying we're going to understand where we might have performance issues and be deliberate about the places we do and don't fix them.

What Do We Need to Measure?

So on to the what side of things. I have a little section in this talk about how averages are bad. I don't want to just flash something up in red in a webinar and leave it there, so I'm going to explain it in a bit more detail. Generally, when we're measuring, it's not that averages are inherently bad; it's that on their own they don't give us the whole picture. I'm going to demonstrate that now and give it a little bit more context.

If you remember back to your math days at school, if that's the last time you've done math, or maybe for some of you this is more familiar: this is what's often known as a normal distribution, a nice bell curve, and here the average sits right in the middle, at the peak of the curve. Because it's a normal distribution, we know that 95 out of 100 people are going to fall within the dark blue area right in the middle. If you're wondering, that strange circle symbol is the standard deviation. We know that only four out of 100 people are going to fall further out, more than two standard deviations from the average, plus or minus, and that only three out of 1,000 will fall into the very end pieces, the pink pieces right at the end.

So given an average value in this sort of scenario, the average in effect just tells us where the middle of the curve sits. If the data is normally distributed, we also know how the tails behave. If we're talking about measuring response times on a web page, we know we're not going to get extreme outliers: if the times fit a normal distribution, there won't be surprise values where some people have a really bad experience. Given the average value, we have a rough idea of where all the values might fall. Unfortunately, this doesn't cover all scenarios. To put it a different way, I always like a good quote, and this guy, Hans Rosling (I'll talk a bit more about him in a moment) came up with a fantastic one: "Most people have more than the average number of legs."

I'll give you a little while to process that in your minds. I'm not going to do the math, I haven't got a whiteboard or anything, but the rough math is this: in the population of any country, almost everyone has two legs, and some smaller number of people have fewer than two legs, for a variety of reasons I won't go into, but you can imagine them. If we then calculate the total number of legs divided by the number of people, we're going to get a number slightly less than two: 1.99, 1.998, whatever it might be. But we've said that by far the majority of people have two legs, so most people have more than the average number of legs. It's just a way of showing that in certain situations averages can be misleading and not really give us the whole picture.
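The rough math above can be checked directly; the population numbers here are made up purely for illustration:

```python
# 1,000 people: almost everyone has two legs, a few have fewer.
population = [2] * 997 + [1, 1, 0]

average = sum(population) / len(population)   # 1996 legs / 1000 people = 1.996

# 997 out of 1,000 people have more legs than the average.
above_average = sum(1 for legs in population if legs > average)
```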

Hans Rosling, just as a very short aside if you're into stats or not even at all into stats, but want to learn a bit more about stats, he has some amazing Ted talks and links there at the bottom of the screen or you can search for Hans Rosling Ted talks. He has a fantastic way of bringing statistics alive in a way that few people do. If you've seen a talk with a guy jumping around, pointing at bubbles on the screen, you've seen some of his talks and I'm sure they'll be familiar to you.

But we're not generally measuring numbers of legs. That's a nice quote and a nice aside that shows the point, but for some of us at least, we're probably measuring something like this. These are response times of a web page, though it could apply to a variety of other scenarios; we'll just focus on this one. Again, if you don't remember your math from school: histograms are, in effect, buckets. The very left-hand bar is the bucket from zero to five milliseconds, and we know that 21 responses fell in that bucket. We don't know where they fell within those zero to five milliseconds, just that there were 21. The next bucket is five to 10, then 10 to 15, and so on across the bottom, with the height of each bar giving the count. In this case most items fell between 20 and 25 milliseconds, if I'm reading it right; that bar is 31 high.

These response times are actually quite a classic pattern. The reason we have the large group of response times on the left, happening in under 40 milliseconds, that blob of bars on the left-hand side, is that they are hitting the cache. They fetch the value very quickly out of an in-memory cache, or whatever level of caching it might be. Most of our responses hit the cache; I think in this scenario it's around five out of every six, or maybe six out of every seven, something like that. Anyway, a majority of them; we can see that from the graph.

We can actually see quite clearly in this case that our cache is working. The little group on the right-hand side, taking around 100 to 140 milliseconds, is the requests that don't hit the cache, because the value isn't in the cache. We then have to make a network call, for instance, to go and get the value from a backend service, or a database call, or something. That's why there's nothing really going on in the middle: the majority of requests hit the cache and return very quickly, and the others pay for a network call or a slower lookup and take longer.

Now that I've explained it, or probably even from the point I showed the slide, a lot of you can understand this. You can see whether this is acceptable for your users or not. You'd say, "Well actually, yes, the caching is definitely working; more than the majority of our hits go to the cache, and that's what we want."

We can also see that no one is getting a response of more than 130 milliseconds, the final bar, and by the time you get down there, there are very few requests; only one, I think, fell between 125 and 130. So we know the worst case, and that's often what we care about, the worst-case scenario. I've done this backwards deliberately. This is the histogram. If you were to try to imagine the average from it (I'm not going to ask for answers), it may be hard to work backwards, so I'll help you out: the average value is 38.3. Going from the more detailed, more informative histogram to the average is easy enough. But if I had worked the other way around, if I had given you only the average first and not the histogram, you might not have imagined that this is how the response times actually pan out. You might not have imagined that some people were getting response times of over 100 milliseconds when I told you the average is 38.3.

You might have imagined the nice bell curve we talked about a few slides back, trailing off nicely, with by far the majority clustered around 38. Actually, the story is a lot more complex than that, and for those into their math, this distribution is known as bimodal: it has two modes. It's not a normal distribution. Normal distributions happen in things like the heights or weights of a population; response times of applications often don't fit a normal distribution. That's the main reason why, when we're measuring performance, and particularly things like website or application response times, we need to look at things like histograms.
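A small Python sketch makes the point with made-up numbers shaped like the histogram just described: the mean lands in the empty gap between the two modes, where no real request falls.

```python
import statistics

# Synthetic response times: ~85% fast cache hits, ~15% slow backend calls.
# All numbers are illustrative, chosen to echo the 38.3 ms example.
cache_hits   = [20 + i % 10 for i in range(85)]    # roughly 20-29 ms
cache_misses = [110 + i % 20 for i in range(15)]   # roughly 110-124 ms
times = cache_hits + cache_misses

mean = statistics.mean(times)                      # about 38 ms

# No single request actually took anything close to the mean.
near_mean = [t for t in times if abs(t - mean) < 5]  # empty list
```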

Histograms are all well and good, but they take a bit of space to plot, and they don't help much for tracking things over time. What are some alternatives? This is from the Application Insights Analytics tool in Microsoft Azure (you can actually use it outside of Azure as well). It uses what are called percentiles. Again, without getting too far into the math, I know that's not everyone's cup of tea, percentiles work like this: take all the responses in a certain period of time and rank them from lowest to highest. If you had exactly 100 responses, the 95th percentile would be the 95th one from the bottom. That's what it means.

Again, it tells you how people are experiencing things at the worst end. Response times can never be lower than zero, that's fixed, but they can tail off at the top end. The 95th and 99th percentiles are linked to the idea of four nines and five nines and all that sort of stuff. This graph shows it quite nicely, because the huge peak we see about a third away from the right-hand side only shows up at the high percentile; it is completely lost at the other percentiles and would be lost in the average. Whether that matters or not is another discussion, but we can certainly see that some number of customers had a very bad experience at that particular time. If I were showing only the average, you would see, in effect, an almost straight bar across the bottom, much like the 50th percentile, which is very similar to the average. That's why a lot of tools take the idea behind histograms and display it as percentiles: you can track them over time.
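That ranking definition corresponds to the nearest-rank percentile, which can be sketched in a few lines of Python (illustrative code, not what Application Insights actually runs):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: rank all samples from lowest to highest and
    take the smallest value that is >= p percent of the samples."""
    ranked = sorted(samples)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)   # 0-based rank
    return ranked[k]

# With exactly 100 response times, the 95th percentile is the 95th from the bottom.
times = list(range(1, 101))          # 1, 2, ..., 100 ms
p95 = percentile(times, 95)          # 95
p50 = percentile(times, 50)          # 50, close to the mean for this flat data
```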

When Should We Measure Performance?

This leads us on to when. When should we be measuring this? Hopefully you've had the chance to answer my Twitter poll, and I'll see later how the responses match up in terms of performance requirements, but I would argue that for a lot of this we need to be doing it in production. If you've done web apps, or apps on phones, these sorts of things, you can do all the testing you want with all the different handsets and all the different browsers, but there's always that one person with the one setup you couldn't possibly have imagined, which all your fantastic test team and all the work you did before production wouldn't have caught; or you'd have had to try 10 times harder, and the cost would have been prohibitive.

As useful as it is to measure this stuff before production, at some level you want to use some type of tool, and there are lots of tools out there that allow this, to see it in production. The other argument is that your users are seeing this. If there's a performance problem in production, one or several users are seeing it, and you'd like to know before they tell you, because they possibly won't. They might just never come back. Some level of monitoring in production, wherever possible, is a good thing to have.

I would also say that you're unlikely to see performance issues, or only very few of them, in unit testing. This is not to knock unit testing in any way; I think it's a fantastic tool for what it does, which is testing a unit of your program and making sure the functionality works. You can use unit testing frameworks to give you some idea about performance, but you'd want to write a different type of test. The basic reason is that in unit tests we put in mock data, and that mock data is just enough to make the test do what we want it to do, to exercise the path. It's unlikely to be the same amount of data that runs through our production systems. An algorithm that works fine for 10 items in a list might fall over and have horrible performance when there are 1,000 items in the list. That's why you'll generally see no, or very few, performance issues during unit testing.
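A toy Python example of that last point: a duplicate check that sails through a 10-item unit test but does quadratic work at production sizes (both functions are hypothetical):

```python
# O(n^2): fine in a unit test with 10 items, but with 1,000 items it
# performs roughly 500,000 comparisons, and the cost keeps growing.
def has_duplicates_quadratic(items):
    return any(items[i] == items[j]
               for i in range(len(items))
               for j in range(i + 1, len(items)))

# Same contract, O(n): the version you'd want under production-sized data.
def has_duplicates_linear(items):
    return len(set(items)) != len(items)
```

Both return the same answers, so a functional unit test cannot tell them apart; only a test with realistic data volumes (or production monitoring) exposes the difference.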

Also, back to my first point, I don't think you'll see all performance issues in development, or you'll have to try very hard to get to that level. There are always going to be ones that only come up outside of development, in production. You can have a great testing team, they can test out a lot of things, and I've seen examples where that works, but there are always times when things come out in production that you couldn't have imagined otherwise.

Tony: Excuse me, Matt, I have a question here.

Matt: Sure.


Tony: Can you show us some real-world examples where unit tests didn't catch a performance issue, and it was only seen in production, or with a performance test or performance measurement?


Matt: Yeah, I had one on a previous project we worked on. We were using an off-the-shelf IoC container, but we were customizing it a bit for our needs. It worked absolutely fine in our unit testing, and it worked fine when we were testing with a single person using the application, but as soon as we put any load through the system it fell over, because we were using it in the wrong way, in effect. It looked absolutely fine in all stages of our development until we did our real long-term perf tests over multiple days, and it showed up over that time as, in effect, a huge memory leak, which caused a pretty bad knock-on effect on our response times and things like that. I've definitely seen that happen, and that's a classic example. Everything looked fine under small load, because the app hadn't been running long enough, there wasn't the usage of the IoC framework, and stuff wasn't getting exercised in unit tests because it was starting up with a clean container each time. But when it had been running for a while and we had a more realistic load through it, we definitely saw a big difference. Fortunately, in that case, we caught it in pre-production.

Tony: Okay. Thank you.

How to Identify Performance Issues?

Okay, so how can we go about this? How can we identify performance issues? Measure, measure, measure. Measure once, measure twice, however you want to think about it. You really need to be measuring this sort of stuff, but more than that, you want to measure to identify the bottlenecks in the first place. We'll talk about some tools that can help you with that in a moment. Equally important, you want to measure to verify that an optimization works. A lot of us have built up knowledge over the years of things that are, or aren't, more performant, in .NET particularly but in other frameworks as well, and some of those things that were true five years ago, in that version of the framework, aren't true nowadays.

You don't want to just blindly apply what we think is an optimization; we want to measure. Measure at the beginning, measure during, and certainly measure afterwards to verify our optimizations work. Just get away from the idea of blindly applying stuff that we may have read elsewhere. As for the tools we can use to do this, one of the best ones I've come across is a tool called MiniProfiler. The development team at Stack Overflow developed it for themselves and then, fortunately for the rest of us, made it available. It's a great tool. Initially when you run it, you don't get this whole popup, just the little red section in the top right-hand corner. It integrates with ASP.NET MVC web applications, and actually with a whole range of things: there are versions for Ruby, and you can run it in console applications. It's quite a wide-ranging tool, but the main use case is ASP.NET.

It puts this little render into the top right-hand corner of your pages when you have it turned on, or just for certain users, say if you want your developers to see it but not your customers, however you want to set that up. That gives you the page rendering time. The quote at the bottom really sums up to me why it's so useful: having that in the top right-hand corner during development is pretty useful for developers. I know I'd much rather see straight away that I've made a certain page on the website slower by something I've just changed. I'd like to see it before anyone else sees it, and certainly before it goes to production. The idea is having these numbers up front, not in some log that developers need to go and look at; every time the page is rendered, it's there.

It gives you more than just the total time; it gives you a great drill-down, as we can see here, into SQL calls and page render times. It gives you quite detailed information about parts of the MVC pipeline: rendering pages, the actual controller action. You can insert your own timings if there are bits of your code that you particularly want a number for, and it will tell you the time for database calls. It integrates into things like Entity Framework and, I believe, others as well; it provides wrappers to give you this. One other thing, which is not completely obvious: down in the bottom right-hand corner you've got this "sql" in red, and that's telling you when you have things like select N+1 queries or duplicated queries. It's a pretty informative tool.

I know for a fact that at Stack Overflow they run this in production. We don't get to see it when we visit the site, but their developers see it on any page when they visit the site. They also store the numbers used to create this rendering, in aggregate, so they can query them and come back to look at them later. I believe it's not for every request but for some sample of the requests; certainly they're happy having this running in production, and I agree. I think it's quite a useful tool, and there's a lot of information you can get from there. So check out MiniProfiler; a search for MiniProfiler on their site explains more of the features in detail.

Again from Stack Overflow, there's another tool they've made available, their Opserver monitoring tool. There are lots of monitoring tools that give you this sort of dashboard; I just picked this one because it's open source and, again, it's one where Stack Overflow, which I believe is a top-50 website, certainly a very high-traffic one in terms of page views, wrote their own tool because existing ones maybe didn't fit their needs. This tool is at least a bit more … for the type of scenarios they're in, and it runs on a busy website. You can go and see more about Opserver there. It integrates quite nicely with MiniProfiler. What I quite like is that this screenshot shows they actually use MiniProfiler, as you can see in the top left-hand corner: they run MiniProfiler on their own Opserver tool to make sure the Opserver pages are rendering reasonably quickly, which I guess makes some sense. If the dashboard is rendering slowly because of performance issues, it's not going to help you as a dashboard; you want it updating reasonably quickly, or reasonably frequently.

At some point, particularly as this talk really focuses on stuff inside the CLR, that level of performance, and we'll come to the real-world examples in a moment, you get to the topic of micro-benchmarks. I would always say that with these, you want to be profiling first to identify the places that are an issue, and then do your micro-benchmarks. The problem with micro-benchmarks is you get into a situation where you pick some bit of code that you think is running slowly, you run a micro-benchmark and say, "Oh yeah, that's running in 20 milliseconds," or whatever it might be.

You then improve that bit of code to run 10 times faster, or whatever it might be, but you lose the context of where it fits in the application. You lose any acknowledgment of whether that's a part of your application that runs repeatedly, or a part that runs just once a day. Does the speed improvement, the optimization you've made, have any effect on the real production system, or is it just quicker when you're testing it in a micro-benchmark?

Whilst they are useful tools, micro-benchmarks, by their very nature, lose the context of where the code runs in the whole system. I would always say you want to be doing the profiling first.


Tony: Excuse me, can I have one question about profiling here? When we use a profiler, it will certainly show us a lot of issues, and there's also this 80/20 rule in software development telling us that by modifying 20% of the code, we usually solve 80% of the troubles. Does it apply in this case too? Which problems would you recommend solving?


Matt: It's a good question. Generally, you should always be starting with the most expensive thing, the thing at the top of the profile. You should be fixing that first, and the reason is quite simple: if you fix that one first, assuming it's one you can fix, it might make some of the other ones go away, because they might have been dependent on the first one. So if at all possible, you should always be starting with the most expensive thing, which fits in with the 80/20 rule if you like: the thing that's taking the time. In the performance issues I've seen, there's often one thing that stands out as the worst case, and you should make your best effort to fix that first, if at all possible. Then, once you've optimized that, run your tests again and see.

A similar sort of idea applies: if you're going to bother profiling, you shouldn't just pick the thing that you most want to fix. Try to start with the thing that's taking the most time, at the top, get rid of that first, and then see what the profile looks like after that.

Tony: Okay, thanks.


Matt: Okay, I just briefly flashed this up. It's a little bit tongue-in-cheek, but the idea behind it is that when I show code samples ... a lot of presentations show code samples and expect you to go away and use them. In this presentation, it's almost the opposite. I will show you some code samples of performance issues, the before and after and the changes that were made, but hopefully what I've got across already is that I really don't want people to go away and blindly change their code because it was in my presentation. I'm showing you the tools and some of the areas of code that can be performance bottlenecks, but you should certainly not be changing any of your code based on things you see on the slides coming up unless you've identified that as a bottleneck for your particular application.

The reason, mostly, is that a lot of the time high-performance code is harder to read, harder to understand, and less intuitive. If it wasn't, you probably would have written it that way in the first place. It means you're potentially making the code base, as a general thing, worse for the sake of performance, and if that's needless, it's not a good thing to be doing. That's what I, half tongue-in-cheek but half seriously, hope people take away, particularly when you see some of the things later on: you shouldn't be changing things just because I told you to.

With micro-benchmarking, I worked on a library called BenchmarkDotNet, and there's a nice write-up about it on Scott Hanselman's blog. Working on it with a guy called Andrey and a guy called Adam, we have attempted to make a library that makes micro-benchmarking as easy as possible for you. I'm going to click on the next slide to show an example; we're going to look at this one. This is a benchmark of reflection. The standard approach for most of these tools, and there are other tools available, is that in the functions you write what you want to benchmark. You put the Benchmark attribute onto them, and some other ones as well, like Baseline, which is true in this case, and then the last bit of code is really just asking BenchmarkDotNet to run the benchmark. That's as simple as we want it to be. It then does the work of giving you accurate numbers, in a nice format, and things we'll show later on, trying to get an accurate representation of the functions you're asking it to run.

Alternatives


Tony: Could you please mention some other tools similar to BenchmarkDotNet and tell us briefly how it compares to those?


Matt: Yeah, sure. A few that I've come across ... there's one called NBench, which is from the people who made Akka.NET. It's interesting, actually, because as far as I understand the story, they had a particular release of their software with a performance regression and they wanted to make sure that didn't happen again, so they devised NBench, which is focused around writing performance tests. They look like unit tests and run in a unit test runner, but they are tailored, or designed, to be performance tests. As I said before, you're probably not going to catch performance issues by accident with a unit test, but you can if you craft a specific unit/performance test, and that's what they've done.

That runs on every build and they have assertions in there that say, "Did this particular code take longer than x amount of milliseconds? Does this bit of code allocate more than this much memory?"

I believe you can do, "Does this bit of code run faster than this other bit of code?"

Anyway, they have those sorts of ideas, and those tests will fail if they run too slowly. The idea is to pick up performance regressions. BenchmarkDotNet is not focused around that; it's more of a console-running tool. It could be extended, but we don't have that at the moment. That's the main difference from NBench.
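The NBench-style idea, a test that asserts a time budget and fails the build when it's blown, can be sketched in a few lines. This is a hedged illustration in Java rather than NBench's actual C# API; the method names and the budget are made up.

```java
public class PerfAssertion {
    // Runs the task `iterations` times and returns total elapsed milliseconds.
    static long timeMillis(Runnable task, int iterations) {
        task.run();                         // warm-up: let the JIT see it once
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // NBench-style assertion: fail loudly if the time budget is exceeded.
    static void assertFasterThan(Runnable task, int iterations, long budgetMillis) {
        long elapsed = timeMillis(task, iterations);
        if (elapsed > budgetMillis)
            throw new AssertionError("Took " + elapsed + " ms, budget was " + budgetMillis + " ms");
    }

    public static void main(String[] args) {
        // Deliberately generous budget: 100k builder appends, 10 times over,
        // should finish well inside 2 seconds on any modern machine.
        assertFasterThan(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) sb.append(i);
        }, 10, 2000);
        System.out.println("perf assertion passed");
    }
}
```

As the talk notes, budgets like this only stay meaningful if every run uses comparable hardware.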

There's also another tool called xunit.performance. That's used by some of the Microsoft projects like Roslyn and CoreFX, and they have a similar idea: it runs on every build, and they've crafted performance tests for specific parts of the code. They want to spot regressions, so they track the xunit.performance numbers over time and can see how fast these bits of code run in this build and that build, and whether over time they're getting slower or faster. The main issue with these sorts of tools is that you need to make sure you're running on the same hardware if you're going to compare runs like that.

Another tool I should mention is etimo.benchmarks; I only learned about it the other day. It's very similar, much more aligned to BenchmarkDotNet and similar tools. There are other tools out there, too. All the ones I've come across do the accuracy bit; it's not straightforward, but they all do it, otherwise people wouldn't be using them, because you want accurate results. They just vary in terms of their focus: whether they're focused on performance tests that fail the build, or on running in a console when you want to try things out and get results like that. Each has a different focus.

Tony: Okay, thank you.


Matt: Yeah, no problem. I picked this example because reflection is often talked about, and we say, "Reflection is slow."

We're just looking here at the System.Uri object. We're doing a regular property call, uri.Host, in the first benchmark, and in the second one we're getting the same value via reflection, and we'll see the difference. Just a really quick recap: in BenchmarkDotNet we go all the way down to reporting results in nanoseconds, which is a billionth of a second. There are also microseconds and milliseconds, just as a quick refresher on the terminology, if you're not familiar with it. When you're talking about stuff inside the CLR, you can very easily get down into nanoseconds; it's not inconceivable to have stuff running in the nanosecond range, and hopefully the tools will report that.

What does this look like for the benchmark we talked about before? This just shows the output of BenchmarkDotNet. As I said, we mostly focus on console output, although we provide tables you can paste elsewhere to get help in other places. These are the numbers: the regular property call comes in at 13 nanoseconds and reflection at 230, so reflection is clearly slower, there's no argument about that. It's roughly 18 times slower. In this case, the Uri property is not the simplest property; there's a bit more going on, so the regular property call is a bit slower than what you'd expect if it was just directly getting a backing field, which skews things a little. Anyway, you get a rough idea of the timings here. That's why in BenchmarkDotNet we like to report both the scaled number and the absolute timings, to give you an idea. So yes, reflection is definitely slower, but how much that matters depends on how often you're doing it and in what ways. This is just a simple property access; if you're doing more complex reflection, that's going to add up. If you're doing it lots of times a second, that's going to add up. The idea is not to say the general advice is wrong, because reflection is slower, but how much slower is worth figuring out for your example, your scenario.
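For a crude back-of-the-envelope version of the same comparison, here's a hand-rolled sketch in Java (the talk's benchmark is C# with a proper harness, which gives far more trustworthy numbers than this; the method names here are my own). The point survives translation: a reflective member lookup plus invoke costs far more than a direct call.

```java
import java.lang.reflect.Method;
import java.net.URI;

public class ReflectionBench {
    static String directHost(URI uri) {
        return uri.getHost();                      // regular method call
    }

    static String reflectedHost(URI uri) throws Exception {
        Method m = URI.class.getMethod("getHost"); // lookup + invoke: the slow part
        return (String) m.invoke(uri);
    }

    public static void main(String[] args) throws Exception {
        URI uri = URI.create("https://example.com/path");
        int n = 100_000;

        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) directHost(uri);
        long direct = System.nanoTime() - t1;

        long t2 = System.nanoTime();
        for (int i = 0; i < n; i++) reflectedHost(uri);
        long reflected = System.nanoTime() - t2;

        // Typically the reflective path is around an order of magnitude slower.
        System.out.printf("direct: %d ns/op, reflection: %d ns/op%n",
                direct / n, reflected / n);
    }
}
```

Both paths return the same value; only the cost differs, which is exactly why a benchmark, not a unit test, is needed to see it.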

StringConcat vs. StringBuilder 

Onto another one: StringConcat versus StringBuilder. It's always said that you should be using StringBuilder, and that's a great general rule, we're not going to argue against it, but in terms of performance, what's the difference? Interestingly enough, there's a link at the bottom to the Roslyn issue about trying to introduce StringBuilder in more places, and you can read the issue if you're interested to see where that went, because generally StringBuilder is better: you're doing fewer allocations. The issue, if you look at what we're doing with StringConcat in this case, is that we're taking the string and creating a new string, but each time we're throwing away the previous string, because next time around the loop we're adding a new string to it. Strings in .NET are immutable. If you can concat them all in one go, great, but if you concat like this in a loop, you're basically taking a string that contains the number zero, adding to it a string that contains the number one, and making a new string which is "01". Next time around it takes "01" and adds two, and so on. There's a lot of waste in this particular example of StringConcat.

StringBuilder doesn't have that issue; we only build the string fully at the end, when we call ToString. Anyway, what does the actual difference look like? This is output from BenchmarkDotNet. It shows you a lot of detailed information around the allocations, and I realize it's hard to look at, so I'm just showing it briefly; this is the raw stuff we give you, but much more useful are some graphs that show it in a better way. The long and short of it is that in a lot of cases the performance isn't hugely different, but there comes a point, depending on how many times you're concatenating strings, where the performance of StringBuilder is way better. That's mostly because it's doing fewer temporary allocations, so there's less work for the garbage collector to do. The difference in raw speed is not huge; it's all about those temporary strings.

This is one example, and it's potentially a contrived one, because we're concatenating a lot of very small strings. If you were to concat a small number of larger strings, your results would be different. The point is not to take this as a general rule, but to measure this sort of stuff when it matters, and get some truth around these ideas of whether StringBuilder is better than StringConcat.
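The two loop shapes being compared can be sketched in a few lines. This is an illustrative Java version of the pattern the slide shows in C# (Java strings are likewise immutable, so the same waste occurs); the method names are mine.

```java
public class ConcatVsBuilder {
    // Each += allocates a brand-new string holding everything so far,
    // so building an n-part string creates O(n) throwaway intermediate strings.
    static String concatLoop(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;            // new string each time; the old one becomes garbage
        }
        return s;
    }

    // One growable buffer; the only string created is the one at the end.
    static String builderLoop(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatLoop(5));   // 01234
        System.out.println(builderLoop(5));  // 01234
    }
}
```

The results are identical; the difference is purely in how many temporary strings the garbage collector has to clean up.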


Tony: Since this kind of performance issue is visible directly in source code, should we consider it when doing code reviews?


Matt: Yes. I think the general rule of using StringBuilder over StringConcat is great, and I think it should basically always be applied, because, if we flick back to the code, you're not making the code more complex. You're not using some handwritten class to get absolute performance; StringBuilder is built into the .NET runtime. It's designed for this sort of thing, tailor-made for it. This is a great example for the whole idea of premature optimization or not: actually, we want to be writing the best code we can from the start, based on our knowledge and best practice and all that sort of stuff, and I think this is a good case of that. But I would say, be careful in code reviews about going beyond this. StringConcat versus StringBuilder is on quite good ground, but go back to the reflection example: to say blindly that we should never use reflection ... actually, sometimes you have no choice, so that balances that one. In other situations it's: how much slower is reflection? Is it a worthwhile trade-off in our case? Banning reflection at the code review stage ... I think, as with a lot of these things, it's a balance. But with StringBuilder versus StringConcat, going back to your actual question, there's no real downside to changing the code to StringBuilder: it's no more complex, you're not writing code that won't be understandable by someone else, things like that, so I think it's worthwhile there. Beyond that, I guess the thing to do is to make sure you have someone on the team who's taken the time to check these things and understand what's going on in different scenarios, so there's a bit more knowledge and data behind it. That would be my recommendation.

Tony: Okay. Thank you.

Garbage Collection 

Onto some other things. We touched on it in the benchmark just now, but the .NET garbage collector is a fantastic aid to programming on the .NET runtime. It takes away so many issues that you have to worry about in languages or runtimes that don't have one, and it's true that allocating is very cheap. The main issue is the cleaning up afterwards: to make allocations cheap, the .NET GC has to do work in the background. It has to compact; it has to search for objects that are still reachable, this sort of stuff. It's sometimes difficult to measure the impact, because some of that work is, in effect, asynchronous: when you write a bit of code, it's not at that point that the garbage collector kicks in. The garbage collector kicks in when it feels it needs to, and at that point you might get GC pauses. That's the main issue.

There are a few tools that can help you understand when there's an excessive amount of GC. Very simply, with a perf counter tool such as Perfmon or the Sysinternals tools, you can watch the "% Time in GC" counter. It's hard to put an exact number on it, but I would say once you get above 50% of the time in GC, there's a problem, because you're spending more time doing garbage collection than running your program. I've heard numbers saying that 10% or above in GC is a cause for concern. Certainly, a sustained high amount of time in GC is a red flag, and you want to investigate it more.

Another tool that lets you investigate further is PerfView. I always say that PerfView wins the prize for being the most useful but possibly ugliest-looking tool; it's at that end of the scale. I'm sure we've all used tools that look amazing but give you no useful functionality. PerfView is the complete opposite. Don't be put off by what it looks like: it's a functional UI, it does exactly what it needs to, and it can give you some very useful low-level information. It works on top of ETW events, Event Tracing for Windows, and it's designed to be very fast. They do say it can be used on production apps for short periods of time with minimal impact. That's not saying you'd want it turned on all the time, but you can turn it on for a while for an investigation. Please test it out before turning it on in your production app, though.

In terms of GC, what it gives us, in the chart at the bottom, is the max pause time, and GC pause time equals time when your application isn't running. Here we have a quite small pause time of eight milliseconds, though it varies a bit with the GC mode, whether it's workstation or server, background or foreground. At certain points in time the GC kicks in and stops the world, or nearly stops the world, and while it's doing that, none of your code can run. If that pause takes 100 milliseconds and your SLA is 100 milliseconds, you've blown it, because any responses that were in flight at that time will be paused until the GC has finished.

Fortunately, over the last releases of .NET, the GC has gained more and more features, so it does more and more of this in the background; server GC mode now has a background mode in recent .NET versions. So the times when your entire application is paused are becoming less and less frequent, but they're still definitely a possibility.

Stack Overflow saw these huge spikes in GC pauses, at least over one second and up to four seconds, and they generally render their pages in under 100 milliseconds. So for them this was a bad experience for users; you can see the link at the bottom for the full details of what happened and how they fixed it.

Stack Overflow Performance Lessons 

There are also some nice performance lessons from Stack Overflow. Somewhat controversially, they say: use static classes. For them, the performance benefit of having static classes versus […] classes all the time was found to have a measurable impact on their application.

Again, I'm not saying that's a general rule, but for them they found it worked well. They're also not afraid to write their own tools when off-the-shelf tools don't give them what they need, or don't give them the performance they need: Dapper, their micro ORM, has very high performance; Jil, a JSON serializer, is again tuned for high performance; and MiniProfiler we talked about before. For a lot of this, you need to understand the platform. The CLR is not a black box: there's stuff going on in there, particularly around the garbage collector, and if you want to get the most performance out of .NET, you need to try to understand what's going on.

Roslyn Performance Lessons 

Again, on to the code samples, and to finish with this last section, some examples from the Roslyn codebase. There's an entire talk on this; this is just a small sample, and you can see the link at the bottom. The thing I find most interesting is that these are the people who write C#, the C# compiler team. Some of the stuff they came up against is interesting, because some of it then fed back into the language: they were seeing these as performance issues in the product they were writing, the Roslyn C# compiler, and so, where applicable, they fed fixes back into the language.

With all these examples, you can assume it's a bit of code that's running a lot. This is a logger class, and the fix in this case, what they changed, is that they had all this boxing, and they added the ToString calls you see there. There are details in the pull request linked at the bottom explaining what's going on, a bit more context, and how in some ways it has been fed back into the compiler, but can't be in all cases ... the link explains it better. What's interesting is that if you use ReSharper or other tools, they tell you to remove the ToString calls, because they are technically redundant, but not if you care about the boxing.

Really, this isn't one you should be applying unless you've profiled, as I said. It makes the code, I guess, uglier; it's not intuitive why you're doing it, and unless that code is being called a lot, the overhead of boxing won't be noticeable. But it is one they came across in Roslyn.
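Java has the same pitfall: passing a primitive to an Object varargs parameter, as logging and formatting APIs do, boxes it on every call, and converting it to a string first avoids that allocation. Here's a hedged sketch with a hypothetical format helper of my own (not Roslyn's actual logger, which is C#); note that small integers in Java come from a cache, so the value is chosen outside that range to make the boxing real.

```java
public class BoxingDemo {
    // A typical logging-style API: Object varargs means every primitive
    // argument is boxed (an allocation) before the call.
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder(template);
        for (Object a : args) sb.append(' ').append(a);
        return sb.toString();
    }

    public static void main(String[] args) {
        int requestId = 1234;   // outside the Integer cache, so boxing allocates

        // Boxes requestId to an Integer on every call:
        String a = format("handled request", requestId);

        // A conversion a lint tool might flag as redundant, but it swaps
        // the Integer allocation for a direct int-to-String conversion:
        String b = format("handled request", Integer.toString(requestId));

        System.out.println(a.equals(b)); // true: same text either way
    }
}
```

The output is identical either way, which is exactly why lint tools call the conversion redundant; only the allocation profile differs.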

Another one they came across as a performance issue: this is finding a matching symbol in the compiler, and you can assume it's called a lot, a lot of calls over a period of time. The Roslyn compiler isn't just running when we build our projects in Visual Studio; it's constantly running in the background of Visual Studio to power IntelliSense and syntax highlighting. So there are bits of Roslyn, in effect, running continuously in the background whilst we develop in Visual Studio.

Interestingly, their fix in this case was to not use LINQ. This is really the one I would hate for anyone to go away and take out LINQ over. I think LINQ is a fantastic feature. It makes code much more understandable and concise; almost anyone can read LINQ, though it takes a bit more understanding, potentially. But there's an overhead to LINQ. It doesn't come for free: there's stuff going on with the compiler to make it possible, stuff going on in the background. It's basically, again, an extra allocation with LINQ. They found that the old iterative way, a simple foreach loop doing the same thing, worked better for them in this case.

To give you an idea, the actual difference in timing between the iterative version and the LINQ version isn't huge; it isn't even double in terms of raw speed. The main issue is the number of gen-zero collections, caused by the number of allocations. The iterative version will almost never allocate; the LINQ version almost always will, and it's those allocations that make the garbage collector do more work. If you call this sort of code enough, that causes the difference in times.
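The same declarative-versus-iterative trade-off exists in Java with streams, so as a cross-language sketch (my own method names, not Roslyn's code): both versions below return the same result, but the stream pipeline allocates a handful of objects per call while the plain loop is effectively allocation-free.

```java
import java.util.*;

public class DeclarativeVsLoop {
    // Stream pipeline: concise and readable, but each call allocates the
    // stream object, the filter stage, and an Optional for the result.
    static int firstMatchStream(List<Integer> items, int threshold) {
        return items.stream()
                    .filter(x -> x > threshold)
                    .findFirst()
                    .orElse(-1);
    }

    // Plain loop: same result, effectively allocation-free per call,
    // so it puts no extra pressure on the young-generation collector.
    static int firstMatchLoop(List<Integer> items, int threshold) {
        for (int x : items) {
            if (x > threshold) return x;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 5, 9, 12);
        System.out.println(firstMatchStream(items, 6)); // 9
        System.out.println(firstMatchLoop(items, 6));   // 9
    }
}
```

As the talk stresses, the loop version is only worth the readability cost on a hot path that profiling has actually identified.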

The final one from Roslyn: again, this is a bit of code that runs a lot. It's working with generic types in Roslyn, and it's already using StringBuilder, which as we discussed before is good practice: it's not allocating lots of temporary strings, it's building the string up bit by bit and then creating it at the end. In terms of string processing, this is about as efficient as you can currently get in .NET. But they found that in this case they wanted object pooling: the allocation of a new StringBuilder object was costing them every time, if you do it a lot. StringBuilder, by default, I think pre-sizes to around 16 characters, so a certain number of bytes is allocated every time you make a StringBuilder, even before you've added anything to it. So they made a simple object pooling cache; this is the code they used to do it. The cached StringBuilder is thread-static; when they Acquire, they look for the one on the current thread. If it's not there, because it hasn't been created yet, they create a new one, clear it out, and return it. When they're finished with it, they call GetStringAndRelease and put it, in effect, back in the cache. It's not a pool in the classic sense, because it's per-thread and there's a single one.

There are a couple of things to call out before people even consider object pooling. One is that if you're going to pool objects, you need to make sure you return them to a clean state when you're finished with them, because when we allocate new objects, that's normally done for us behind the scenes and we get an object in a known state, zeroed out, nulled out if you like. With StringBuilder that's nice and easy, because you can call Clear, but in other cases it's not. This is a simple cache because it's thread-static, so it's per thread. The downside is that the StringBuilder stays on that thread for the lifetime of the thread; if you have lots of threads, you have lots of StringBuilders. The other common solution is a global object pool, but that requires locking and is much more complex. There's a trade-off in both scenarios, and you need to be careful: you don't want your object pool growing too large, because there's a cost to storing objects too. It has to be tuned and thought about before being implemented.
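The per-thread cache pattern described above can be sketched compactly. Roslyn's actual code is C# with a [ThreadStatic] field; this is an analogous Java sketch using ThreadLocal, with my own class and method names, showing the two points just made: the mandatory reset on acquire, and the fact that the builder simply lives on its thread rather than in a shared pool.

```java
public class StringBuilderCache {
    // One cached builder per thread, analogous to a [ThreadStatic] field.
    private static final ThreadLocal<StringBuilder> CACHE =
        ThreadLocal.withInitial(() -> new StringBuilder(64));

    // Acquire: reuse this thread's builder, reset to a known-clean state.
    static StringBuilder acquire() {
        StringBuilder sb = CACHE.get();
        sb.setLength(0);   // the crucial reset before reuse
        return sb;
    }

    // "Release": take the result; the builder just stays on the thread,
    // so there's no locking and no shared pool to manage.
    static String getStringAndRelease(StringBuilder sb) {
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = acquire();
        sb.append("List<").append("String").append('>');
        System.out.println(getStringAndRelease(sb)); // List<String>

        // A second use on the same thread reuses the same builder object,
        // so no new StringBuilder (or its backing array) is allocated.
        StringBuilder sb2 = acquire();
        System.out.println(sb == sb2); // true
    }
}
```

The caveats from the talk apply directly: each thread keeps its builder alive for the thread's lifetime, and anything pooled this way must be cheap to reset.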

That's brought me to the end of the talk. Hopefully that's been useful for people. I don't know if there's any questions that have come up as I've been going along.


Q: Can BenchmarkDotNet be used in continuous integration, to fail the build on a regression?

A: It's not something we currently support, if you like, out of the box with the main product. It's something we keep thinking about adding, and there have been a few community contributions getting us there, but we're not there yet; it's certainly something we plan to have. You can take the raw numbers BenchmarkDotNet gives you and certainly build that yourself, but it's not something we provide as yet. The main issue with providing it is that if you're going to do the whole thing, you need to worry about storing your results, running on consistent hardware and a lot of other stuff that sits outside the tool itself. It may be something we get to in the future, but it's not something you can do straight away with BenchmarkDotNet; you can certainly use it to give you the raw numbers and then build that tooling on top of it, yes.

Q: How is benchmarking different than finding the difference of time span between start and end of a function?

A: How does BenchmarkDotNet do it differently? One of the main things is that we use Stopwatch, which is a bit more accurate than just relying on TimeSpan. We also run the code multiple times. There aren't lots of things you have to do, but you certainly want to call the function once first to let it be jitted, because you pay the cost of jitting the function the first time it's called. You don't want to measure that as part of the real performance, because it only happens once, so you want to get it out of the way. Then you generally want to run the function multiple times in batches and take the timings on the batches, because there's a limit to what you can measure: if we're talking about something that takes nanoseconds, you can't just measure that with a before and after.

You need to run it multiple times in a batch, record the length of the batch and then work out the per-iteration time. It's doing a bit more, but it basically boils down to running a function multiple times, jitting it first of all. The one other thing we do concerns the JIT compiler in .NET: if it sees that you're calling a function but not doing anything with the result, it might remove the call entirely as dead code, on the grounds that the result doesn't go into anything. We make sure that doesn't happen; we prevent it so that the code you think you're benchmarking is actually benchmarked. So those are the main sorts of things we do.
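The three steps described above (warm up to pay the JIT cost, time whole batches, consume the result so the JIT cannot eliminate the call) can be sketched by hand like this. This is a simplified illustration of the technique, not BenchmarkDotNet's actual implementation:

```csharp
using System;
using System.Diagnostics;

// Hand-rolled micro-benchmark sketch: warm-up, batching, and result consumption.
public static class MicroBench
{
    private static int sink; // writing results here defeats dead-code elimination

    public static double MeasureNanosPerCall(Func<int> action, int batchSize = 1_000_000)
    {
        sink = action(); // warm-up call: one-time JIT cost happens outside the timer

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < batchSize; i++)
            sink = action(); // time the whole batch, not a single call
        sw.Stop();

        // Per-iteration time = batch duration / iteration count, in nanoseconds.
        return sw.Elapsed.TotalMilliseconds * 1_000_000.0 / batchSize;
    }
}
```

Even this simple version fixes the two biggest mistakes of the "TimeSpan around one call" approach: it never times the first (jitted) call, and a nanosecond-scale body is measured over a batch long enough for Stopwatch's granularity to be meaningful.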

Q: Does the LINQ performance problem arise only in Roslyn, or does it also occur on older .NET platforms?

A: With regards to LINQ, there are no major differences in the Roslyn compiler compared to the older one. So performance issues can exist in LINQ across all .NET compiler versions. The example was included because it was a change made to the Roslyn code base itself, not to the code it produces when compiling LINQ.
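To make that concrete, here is an illustrative example of the kind of LINQ cost that exists on any .NET version (this is not the specific Roslyn change, just a demonstration of the general issue): on a hot path, the LINQ query allocates an iterator object and a delegate on every call, while the hand-written loop allocates nothing.

```csharp
using System.Linq;

// Same result, very different allocation behavior on a hot path.
public static class LinqVsLoop
{
    // Allocates a Where-iterator and a delegate each time it's called.
    public static int SumOfEvensLinq(int[] values) =>
        values.Where(v => v % 2 == 0).Sum();

    // Allocation-free equivalent.
    public static int SumOfEvensLoop(int[] values)
    {
        int sum = 0;
        foreach (int v in values)
            if (v % 2 == 0)
                sum += v;
        return sum;
    }
}
```

Neither version is "wrong"; the LINQ form is usually preferable for readability, and rewriting it as a loop only pays off in code that runs very frequently, which is exactly why the Roslyn team made such a change in their own compiler code base.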

Q: What is the best setting for GC on Win Server 2012 R2 which hosts 250 ASP.NET MVC apps on IIS. I am talking about gcConcurrent and gcServer in aspnet.config file.

A: Generally, server mode is best when you are running on a server, but you actually don't need to do anything here, because it's the default mode for ASP.NET apps. You can also specify server garbage collection via the unmanaged hosting interfaces. Note that ASP.NET and SQL Server enable server garbage collection automatically if your application is hosted inside one of these environments.
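Since the host chooses the mode for you, it can be worth verifying at runtime which mode you actually got. The `System.Runtime.GCSettings` API reports this directly (in a standalone app you would opt in with `<gcServer enabled="true"/>` in the config file):

```csharp
using System.Runtime;

// Reports which GC mode the current process is actually running with.
public static class GcModeCheck
{
    public static string Describe() =>
        (GCSettings.IsServerGC ? "server" : "workstation")
        + " GC, latency mode: " + GCSettings.LatencyMode;
}
```

Logging this once at startup is a cheap way to confirm that the 250 hosted apps are really getting server GC rather than silently falling back to workstation mode.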

Q: Do we get hints, how to measure or which tools are recommended to measure performance issues?

A: I really like PerfView; it takes a while to find your way around it, but it's worth it. This post by Ben Watson will help you get started, plus these tutorials. I've also used the JetBrains and RedGate profiling tools, and they are all very good.

Q: What other techniques can be applied for embedded .NET, single-use applications, to avoid unpredictable GC hesitation? Consider that the embedded device is not memory constrained.

A: Cut down your unnecessary allocations: take a look with a tool like PerfView (or any other .NET profiler) to see what's being allocated and whether you can remove it. This post by Ben Watson will help you get started with PerfView. PerfView will also tell you how long the GC pauses are, so you can confirm whether they are really impacting your application or not.
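As a rough in-code complement to a profiler, you can also count how many gen-0 collections a piece of code triggers, which is a quick proxy for its allocation pressure. A minimal sketch:

```csharp
using System;

// Counts gen-0 collections caused while an action runs: a crude but
// useful proxy for how much allocation pressure the code creates.
public static class GcPressure
{
    public static int Gen0CollectionsDuring(Action action)
    {
        int before = GC.CollectionCount(0);
        action();
        return GC.CollectionCount(0) - before;
    }
}
```

A hot path that repeatedly shows a non-zero count here is a good candidate for the allocation-reduction techniques discussed in the talk; zero (or near-zero) counts mean the GC pauses you're seeing are coming from somewhere else.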

Q: How is benchmarking different from finding the difference of timespan between start and end of a function?

A: BenchmarkDotNet does a few things to make its timings as accurate as they can be:

  1. Using Stopwatch rather than TimeSpan, as it’s more accurate and has less overhead.
  2. Call the [Benchmark] method once, outside the timer, so that the one-time effects of JITting the method are not included in the timings.
  3. Call the [Benchmark] method several times, in a loop. Even Stopwatch has limited granularity, which has to be accounted for when the method only takes several nanoseconds to execute.

Q: This is not a question, but an answer to the earlier question about examples of unit tests not showing performance issues: we need to load data from Azure / SQL Server in production, but in unit tests we have a mock service that responds immediately.

A: Thanks, that’s another great example, a mock service is going to perform much quicker than a real service!

Q: What measure could point me in the right direction to know that I could benefit from object pooling?

A: See Tip 3 and Tip 4 in this blog post by Ben Watson for a really great discussion of object pooling.

Q: How does benchmarking work on asynchronous code?

A: Currently we don’t do anything special in BenchmarkDotNet to help you benchmark asynchronous code. It’s actually a really hard thing to do accurately and so we are waiting till after the 1.0 release before we tackle it, sorry!


About the speaker, Matt Warren

Matt Warren

Matt is a C# dev who loves nothing more than finding and fixing performance issues. He's worked with Azure, ASP.NET MVC and WinForms on projects such as a web-site for storing government weather data, medical monitoring devices and an inspection system that ensured kegs of beer didn't leak! He’s an Open Source contributor to BenchmarkDotNet and RavenDB. Matt currently works on the C# production profiler at ca and blogs at