In this webinar, PostSharp’s founder and principal architect Gael Fraiteur demonstrates some of the most exciting new features of PostSharp 5.0:

  • Logging: PostSharp 5.0 comes with a fully revamped logging aspect, which allows for much more flexibility than before. Key new features include complete customizability, support for semantic logging, and super-fast performance.
  • Caching: You can now cache method return values and invalidate the cache easily, with custom attributes. PostSharp 5.0 will include support for MemoryCache and Redis, two-layered caches, and local caches with pub/sub invalidation.
  • Async methods: We have filled the gaps in OnMethodBoundaryAspect and MethodInterceptionAspect for async methods.

Watch the webinar and see what's coming up in the new version:

PostSharp 5.0 Sneak Preview Logging, Caching and Async on Vimeo.

Download source code of the examples.

About the speaker, Gael Fraiteur

Gael has been passionately programming since childhood; building and selling his first commercial software at age 12. He is President and Chief Architect at PostSharp Technologies based in Prague, Czech Republic. Gael is a widely recognized expert in aspect-oriented programming and speaks at developer conferences in Europe and the United States.

 

Structured exception handling and defensive programming are the two pillars of robust software.

Both pillars fail however when it comes to handling internal faults, those that normally originate in software defects rather than in any external factors.

In this webinar, Zoran Horvat demonstrates advanced defensive coding techniques that can bring the quality of your code to an entirely new level.

Watch the webinar and learn:

  • When throwing an exception is the right thing to do
  • Why exceptions and defensive coding cannot be applied to recover from defects
  • How to handle situations when an internal software defect is causing the fault
  • How to treat fault detection as an orthogonal concern to normal operation

Advanced Defensive Programming Techniques on Vimeo.

Download slides.

Download code samples.

Video Content

  1. Bird's View of Defensive Programming (3:21)
  2. Demo (6:47)
  3. Design by Contract (16:24)
  4. Q&A (50:34)

Webinar Transcript

Tony:

Hello everyone and welcome to today's webinar on advanced defensive coding techniques. My name is Tony and I'll be your moderator. I work as a software engineer here at PostSharp and I'm excited to be hosting this session today. I'm pleased to introduce today's speaker, Zoran Horvat. Zoran is CEO and principal consultant at Coding Helmet Consultancy, and today he's going to share what you should know about defensive programming and how to improve the internal quality of code.

Before I hand the mic over to Zoran, I have a few housekeeping items to cover about this presentation. First, this webinar is brought to you by PostSharp, that's why this slide is there; this is not Zoran, nor me. PostSharp is an extension that adds support for patterns to C# and Visual Basic, so if you are tired of repeating yourself in your code, you may want to check it out, as did Microsoft, Intel, or Bank of America, who have been using PostSharp in their projects to save development and maintenance time. Customers typically cut down their code base by 15% by using our product, so feel free to go to our website, www.postsharp.net, for more details, or you can also get a free trial there. And now back to our webinar. Today's webinar is being recorded and the recording will be available after this live session, and all of you who have registered for the webinar will receive an email with the link to the recording.

And last, during the webinar we'd also love to hear from you, so if you have any questions for our speaker, feel free to send them through the question window at the bottom of your player, and all the questions will be answered at the end of the session. If we do not get to all of your questions, we will also follow up after the webinar, so all of your questions will be answered in the end. So without any further ado, I'd like to kick things off by welcoming Zoran Horvat. So, Zoran, over to you.

Zoran:

Yeah. Thank you, Tony. Hello everyone. As Tony has already told you, in this webinar I'm going to talk to you about certain aspects of defensive coding. You may know me mainly from Pluralsight courses, I have five courses published there, and things that you will see in this webinar can be seen in general in all five of the courses I have published thus far. Those are general programming principles, applied to a very specific and narrow goal of making code internally more stable.

Now, without any delay, I will just tell you precisely what this demonstration is going to be about, because the term defensive programming is talking about a large area, and I will identify a very small subset of problems that we are going to address today. 

Bird’s View of Defensive Programming

So, historically, you are probably aware that the whole thing about making programs stable first started with the garbage in, garbage out concept, which was very good in one respect that we programmers like a lot: you don't have to code anything to put garbage out. So a lot of software was written in a way that code is not responsible for behaving in the face of invalid input, and that worked quite well for many years, only now and then producing destruction, and it was generally very good.

But as time progressed, as years passed, programmers were more and more interested in having stable software, so that is how defensive coding came into being for the first time. And then I'm pretty sure that you know more or less everything about defensive coding already, and you might be asking yourself what I am going to tell you that is new today. Probably nothing, but anyway, let's draw the map of defensive coding.

For one thing, you want to defend from invalid input: to reject invalid input right away and only let valid input in, or to sanitize communications with outer systems, which means not only other services, or, I don't know, the network, things like that, or a database, but also the libraries that you might include in your own process. If you don't trust other people's code, you might sanitize the output from a library. However, there are more things there, and today I'm going to talk to you about things that don't have to do with the outer layers of your system or your component, or whatever you are writing. I'm going to talk to you about defending from yourself.

Many people don't really understand that there is a concept of defending from your own code, and I will give you a hint now, and you will have around 20 minutes to think about that hint. For example, if you face a null reference exception, what do you do? How do you defend from that? If it's not something that came from third-party code, from a library, or from any network service or anything, if it originated from inside your code, how do you defend from that? And that is what we're going to be talking about today.

Demo

So I'm going to introduce a very small example. It's going to handle students and the subjects they are enlisted in to listen and to have exams. A student is going to be identified by name and is going to keep a list of grades from exams that the student has passed, and the student will expose three methods: add grade, remove last grade, and get average of grades. Now as you look at this API, this is a very small API, you can already start thinking about what can go wrong with only the three methods we have, and many things can go wrong already. And you will see even more things go wrong when we introduce training, which is, for example, a university subject, and students can enlist for the training. Training will expose another three methods. You will be able to add a student to enlist for a training, and then two more complicated methods will come. Get top student will have to walk through the list of students and sort them by average grade. And the last method, add grades, will be the most complicated of all. It will be invoked, supposedly, after the exam; it will receive the list of students' names and grades, and it will walk through the list of students and add a grade to each of them.

And now, what can go wrong? This is usually a place where I apologize for using PowerPoint and step into Visual Studio. This is the code, I'm going to show you the code which is implementing all these six members. First let me show you the grade, grade is just an enum. It's an enumeration defining grades, and the only working code is located in the student and training classes. So I will show you first things that can go wrong in these two small, very small classes, and you can imagine how bad it could all be when applied to a large project.

Sometimes people tell me that I'm showing too simple examples and non-realistic, but you will see that only six very small functions, half of them one-liners, can cause you so much trouble that you will have to reconsider your entire coding practices, all the practices you're using every day, if you want to survive six simple functions.

So let's start. Grades average. There's the list of grades, down below I'm adding grades to the list, and this is the grades average. Every grade is first converted to double, this is my extension method, it is not important at all. What is important here is the Average extension method, which comes from the LINQ library. This method will throw an exception if the list is empty. So we already have one thing to defend from, and I'm even helping the caller by introducing a new member, HasGrades, which returns a Boolean telling whether this list of grades is empty or not. It returns true if there are any grades in it.

So if you want to avoid an exception coming automatically from this property getter, you'd better check whether HasGrades returns true beforehand.

Now remember this moment: I'm telling the callers to make sure to check potential output before using the member. That is hiding half of the answer to the great question of how to defend from the inside of our application.
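The pair of members described here might look roughly like the following. This is a minimal sketch, not the code shown in the webinar: the `Grade` values, the member names `HasGrades`, `GradesAverage` and `AddGrade`, and the inline cast (standing in for the speaker's own conversion extension method) are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical grade scale; the webinar only says that Grade is an enum.
public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    public string Name { get; }

    public Student(string name)
    {
        Name = name;
    }

    // Lets the caller ask up front whether GradesAverage is safe to call.
    public bool HasGrades => grades.Count > 0;

    // Enumerable.Average throws InvalidOperationException on an empty sequence,
    // so callers are expected to check HasGrades first.
    public double GradesAverage => grades.Average(grade => (double)(int)grade);

    // Validation of the grade is left out of this sketch.
    public void AddGrade(Grade grade) => grades.Add(grade);
}
```

A caller that checks `HasGrades` before reading `GradesAverage` never sees the exception; a caller that skips the check on a student without grades does.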

Tony:

Excuse me Zoran, I have a question here about this grades average property. You say that we should check in advance whether the student has any grades or not. What if we, instead of that, return some special value from the grades average, like not a number, or if we make the double nullable and we return null?

Zoran:

Yes. I could answer that question on several levels. On one level, you should ask yourself, what have you gained by returning, for example, a nullable double? Did you get anything more compared to a Boolean and a double? No. Nullable is still a type which has a HasValue Boolean property and a Value double property, so you still have the same thing in your hands.

On a more elaborate level, if you return something that could mean one or the other thing, then you are placing a burden on the receiver of this result, to be afraid of what he might get. I think it is much better to put the thing straight before the caller. Now, if you call grades average, then make sure not to receive an exception back. So if you get to the point of calling grades average, then you should have already made sure, either by adding a grade, and therefore making sure that there is something to average, or by asking whether there are grades or not, and after you have made clear that it is safe, then you call this, and then you get an exact result. No speculation beyond this point about what you have got back. I hope ...

Tony: Okay, thank you.

Zoran: ... This is enough.

Tony: Thank you.

Zoran:

Okay, so the add method. First, let me make a disclaimer here: this method is returning a flag as an indication of whether the operation passed well or not. This is not how you should write code, in my opinion, and I will remove these flags by the end of this demonstration. I have started the demonstration with an ancient way of defending, which was based on status codes returned from methods. So people used to return, for example, false if the operation cannot be conducted to the end, and true if it went well.

So this is the old way of dealing with unknown input, so this input was not sanitized before, and now we have to deal with it, and I have a piece of private logic here, it is very simple, I'm asking if this grade is defined in this enumeration, so don't send me a grade value of 57, because it's not an existing grade. If something like that happens, I will just return false and be done with it, right?
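The status-code style being criticized can be sketched as follows. The names and the `Grade` values are assumptions made for illustration; only the use of an "is this value defined in the enumeration" check and the boolean return come from the talk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical grade scale; 57 is deliberately not one of its values.
public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 }

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    public int GradesCount => grades.Count;

    // Status-code style: the method silently gives up and returns false
    // when the value is not a member of the Grade enumeration.
    public bool AddGrade(Grade grade)
    {
        if (!Enum.IsDefined(typeof(Grade), grade))
            return false;

        grades.Add(grade);
        return true;
    }
}
```

Nothing forces the caller to look at the returned flag, which is part of the problem discussed next.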

Why is this bad? Again, this is bad on a couple of levels. On one level, I am making trouble for the caller. The caller doesn't know what's going to happen after he calls my method, and that is a bad thing. We don't like non-deterministic code. We don't want to call a function and not know what is going to happen after that. Imagine a list of five steps that you have to make one after the other. You made the first step, the second step, the third step, now the fourth step is calling this function, for example, and the fourth step returns false. That means that now you have to roll back the first three steps, which might be very complicated. It might include calling a dozen other methods to roll back the effects of these three steps.

So returning a flag is a bad thing to do from that aspect, but there is another aspect which is even more important. You see, if I give up at this point, it's not only that I give up. I must also recover from this situation somehow. It's not the end by returning from this method. Every error must have a recovery. If you don't recover, you must kill the process, alright? If you want to keep going, to keep running, you have to recover from every single error that appears during execution. How is add grade going to recover?

Design by Contract

So this is the first concept, the first important place where I will mention design by contract. If you haven't met design by contract, it was introduced by Bertrand Meyer quite a long time ago, and it's still not widely popular. I suppose it's not widely popular because it is a bit complicated, but I will try today to tell you the most important part of the theory, which will not make you understand it to its fullest extent, but it will make you start thinking in a different way.

So here we have a function, add grade, which is defending from invalid input. And I say that is wrong, because this function doesn't know what it means to recover, it is easy to just give up. Now, how to recover. This function cannot recover, do you know why? Because it doesn't know who called it. And that is an extraordinary idea. I myself was defending at the called site, so inside the method that was called, for many years, until I understood that the method that was called cannot defend because it doesn't know the context in which it is executing.

Is it a web application? Recovering from an error could mean sending an HTTP status, internal server error, 500, or temporarily unavailable, or not found, whatever, so it needs to send an HTTP response back. If it is a desktop application, recovery means to pop up a message box, or if it is a unit test, it has to assert false and make the test fail. So only the caller knows what it means to recover.

Okay. You keep thinking about this and I will keep talking about current code. Remove last grade is removing the grade at the last position in the list, but only if there are grades. It also returns boolean flag and that's it. And now to the more complicated class, training. It keeps a list of students, it allows us to add a student, but we can only add non-null students, you will see why. Here is the get top student, this is the method which is sorting the students by their average grade descending. But now, we have two kinds of students, those that do have grades, and those that don't.

And now you see that story about the caller and the callee. The calling class, training, knows exactly how to recover from the situation when the student doesn't have any grades. It is very simple at this position. I can pick only the students that do have grades, and sort them descending by their average grade. And then, just concatenate those that do not have any grades. Do you see? This list is going to have students all descending by their grades average, and then all those who didn't have any exams yet at the end of the list, and it's never going to fail. This function is safe, only because the student class has given me the means to test them, so that I can recover from a potential... it's still not an error, it is a condition. I can recover from something that might end up in an exception, and I would have to recover using other means if I had the exception from the grades average. Now there's no trouble at all.

So this is the great concept, giving public members to the caller to test the state, and prepare defense from any condition in advance. 
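The sorting step described above can be sketched with LINQ as below. Class and member names (`Training`, `GetTopStudents`, `HasGrades`, `GradesAverage`) are assumptions reconstructed from the spoken description, and the boolean-returning `Add` mirrors the flag style the demo still uses at this stage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();
    public string Name { get; }
    public Student(string name) { Name = name; }

    public bool HasGrades => grades.Count > 0;
    public double GradesAverage => grades.Average(g => (double)(int)g);
    public void AddGrade(Grade grade) => grades.Add(grade);
}

public class Training
{
    private readonly List<Student> students = new List<Student>();

    // Null students are refused, otherwise the query below could not run.
    public bool Add(Student student)
    {
        if (student == null)
            return false;
        students.Add(student);
        return true;
    }

    // Students with grades come first, best average on top; students without
    // grades follow. GradesAverage is only read where HasGrades is true,
    // so the empty-sequence exception cannot occur here.
    public IEnumerable<Student> GetTopStudents() =>
        students
            .Where(s => s.HasGrades)
            .OrderByDescending(s => s.GradesAverage)
            .Concat(students.Where(s => !s.HasGrades));
}
```

The caller-side check is cheap here because `Student` exposes `HasGrades` publicly; without it, `Training` would have to catch the exception instead.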

Okay, let's move forward, set grades. This is a terrible, terrible piece of code, don't write code like this ever, but I had to. I had to do this, let me show you what this function is doing. It is receiving a sequence of tuples, each tuple consisting of a string representing the name and the grade to give to a student with this name. 

Now, what can go wrong? A student might not exist in the list of students. There is a list here. So the student might not exist here, and this function should fail. I'm still returning boolean flags to indicate failure. And on a more detailed level, the student might exist, and we might try to set the grade to that student, but add grade could fail, saying, "No, no, I don't want this grade, goodbye." So we might face a failure deep inside the loop, and that is the scenario which I told you about already, like having five steps and one of them fails.

So look what I am doing here. I am first defending from that situation; forgive me the syntax, this is going to be much cleaner later. This is a query which says: if for any tuple in the input all students say that their name is different, meaning that there is a name which doesn't exist in the list of students, then give up and return false right away.

So I'm defending from one sort of troubles, but the other sort of troubles comes only inside a loop. Much later, when I try to add a grade to a student and this might return false. Now if one student returns false, then I must work through the list of all students who did accept their grades and roll back the change. That is that roll back which I mentioned, so you can see how terrible code is produced if I try to defend from a flag returned from deep inside a loop. So this is the roll back list, any student who accepts a grade is put into the roll back list, if any of the students later decides not to accept the grade, I must walk through all the students who were successful, and roll back changes, and then return false. So this code is terrible.
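For readers who cannot see the screen, the flag-and-rollback version might look something like this. It is a reconstruction under assumptions: the method and member names are hypothetical, and C# value tuples are used where the original may have used another tuple type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();
    public string Name { get; }
    public Student(string name) { Name = name; }
    public int GradesCount => grades.Count;

    public bool AddGrade(Grade grade)
    {
        if (!Enum.IsDefined(typeof(Grade), grade)) return false;
        grades.Add(grade);
        return true;
    }

    public bool RemoveLastGrade()
    {
        if (grades.Count == 0) return false;
        grades.RemoveAt(grades.Count - 1);
        return true;
    }
}

public class Training
{
    private readonly List<Student> students = new List<Student>();
    public void Add(Student student) => students.Add(student);

    public bool SetGrades(IEnumerable<(string Name, Grade Grade)> namesAndGrades)
    {
        var items = namesAndGrades.ToList();

        // First defense: some name in the input matches no enlisted student.
        if (items.Any(item => students.All(s => s.Name != item.Name)))
            return false;

        // Second defense happens deep inside the loop, so every successful
        // change has to be remembered in case a later one fails.
        var rollback = new List<Student>();
        foreach (var item in items)
        {
            Student student = students.First(s => s.Name == item.Name);
            if (!student.AddGrade(item.Grade))
            {
                foreach (Student changed in rollback)
                    changed.RemoveLastGrade();
                return false;
            }
            rollback.Add(student);
        }
        return true;
    }
}
```

Most of `SetGrades` is bookkeeping for undoing work, which is exactly the cost of discovering a failure in the middle of an operation.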

Alright. Let's start fixing it. The first thing: if you want to defend in your code using status codes, which I strongly advise against, and you insist on doing that, then it's better to do it right away at the beginning of each method. You see, it's better to ask all the students first whether they exist, and second whether they accept grades. So I would need another public member, this is the student class here. I would ask for a can add method on the student class so that I can ask them up front, and if I do have that method, then all this doesn't have to look like this. It becomes much simpler, you see? I can just walk through the list of students and find each of them by name, picking the first one with that name, and adding a grade to it. This time I know, first, that this query for the student by name is not going to fail, and second, that every student will accept the grade, because I have asked both of these things up front.

Alright, now I need this can add method, so here it is. It is just repeating the logic that I already had in the add grade. You see, this is the same thing, only this is the negation. So, I could raise this theory to a higher level and really introduce contracts at last, and say, student and training classes should have an agreement on how they work together, and that agreement is called the contract. And the student class says, listen, don't call add grade unless you're sure that can add grade will return true. If you're not sure, ask. If you are sure, alright. And then, after having the can add method, I must change add grade to actually use that method. Previous code was this, and this is pretty much a private piece of logic.

Now can add becomes part of the documentation for the add grade method, and add grade method will check can add according to documentation. I'm not sure whether you understand this, but the idea is to ... Let me rephrase, contract will be defined in terms of public members only, and then the contract will be checked in terms of those same public members. 

Okay, now. Now we have a situation in which the add grade method is checking the public contract and giving up immediately if the contract is broken. But then, why return false? We can push contracts to an even higher level and say, the contract must never be broken. Never. The other class must not call add grade if can add is returning false. It can throw an argument exception, for example, and stop execution not only of this method, but of the caller as well. The caller must be sure that the object and arguments are correct before calling the method.
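A sketch of this stage, with the contract expressed through public members and violations turned into exceptions, could look like the following. The names `CanAddGrade`, `AddGrade`, `HasGrades` and `RemoveLastGrade` and the exception messages are assumptions; the talk only describes the idea.

```csharp
using System;
using System.Collections.Generic;

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    // Public part of the contract: callers can ask before they call AddGrade.
    public bool CanAddGrade(Grade grade) => Enum.IsDefined(typeof(Grade), grade);

    // The precondition is checked in terms of the same public member.
    // No flag is returned; a broken contract stops the caller as well.
    public void AddGrade(Grade grade)
    {
        if (!CanAddGrade(grade))
            throw new ArgumentException("Grade is not acceptable.", nameof(grade));

        grades.Add(grade);
    }

    public bool HasGrades => grades.Count > 0;

    // Same idea: HasGrades must be true before this call.
    public void RemoveLastGrade()
    {
        if (!HasGrades)
            throw new InvalidOperationException("There are no grades to remove.");

        grades.RemoveAt(grades.Count - 1);
    }
}
```

Because `CanAddGrade` is public, the precondition of `AddGrade` can be stated in its documentation and checked by any caller in advance.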

And so, this remove, okay. Remove is going to test public member, HasGrades, it was public from the very beginning, and now HasGrades is a contract. It says HasGrades must return true, or otherwise I will fail. So you may start thinking, but this is a bit too hard, maybe this is too much, why not keep returning boolean flags? I will show you.

Before that, let me fix these two complicated preconditions as well. The difference between these preconditions and those nice preconditions in the student class is that, again, I'm having private logic, something I cannot write into any documentation, because nobody has access to students, for example. So what I'm going to do is give access to the students, not just indiscriminate access: I will introduce a very concrete contains student method, so if you're not sure whether I contain a student with a given name, then ask, and after you have asked, you will be free to call set grades and pass that name as part of that list of names, because you know that it is inside. If you are not sure, I keep the right to throw, and I will stop your execution if you didn't obey my contract.

Similarly, this will have to be turned into public contract again, I will imagine some accept grade method on this class, which will receive name of the student and the grade, and it will tell you transitively whether the student with this name will accept this grade, if yes, then you are free to include that tuple in this list and call. 

So I'm finally giving up all these return flag instructions, then I am going to not return any flag from this method as well. So all the methods that are doing stuff are just returning void in my solution this time. I need this accept grade method, it is very simple, it is looking up the student, now I know that the student exists, because that is a precondition for the precondition, so to say, and I ask every student whether it can accept a grade. 
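Put together, the training class at this stage might be sketched like this. `ContainsStudent`, `AcceptsGrade` and the exception messages are hypothetical names chosen to match the spoken description, not the webinar's actual source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();
    public string Name { get; }
    public Student(string name) { Name = name; }
    public int GradesCount => grades.Count;

    public bool CanAddGrade(Grade grade) => Enum.IsDefined(typeof(Grade), grade);

    public void AddGrade(Grade grade)
    {
        if (!CanAddGrade(grade))
            throw new ArgumentException("Grade is not acceptable.", nameof(grade));
        grades.Add(grade);
    }
}

public class Training
{
    private readonly List<Student> students = new List<Student>();

    public void Add(Student student)
    {
        if (student == null) throw new ArgumentNullException(nameof(student));
        students.Add(student);
    }

    // Public contract members: callers can test both conditions in advance.
    public bool ContainsStudent(string name) => students.Any(s => s.Name == name);

    // Precondition of this precondition: ContainsStudent(name) is true.
    public bool AcceptsGrade(string name, Grade grade) =>
        students.First(s => s.Name == name).CanAddGrade(grade);

    // No flag, no rollback list: everything is checked before any change.
    public void SetGrades(IEnumerable<(string Name, Grade Grade)> namesAndGrades)
    {
        var items = namesAndGrades.ToList();

        if (!items.All(item => ContainsStudent(item.Name)))
            throw new ArgumentException("Student not found.");
        if (!items.All(item => AcceptsGrade(item.Name, item.Grade)))
            throw new ArgumentException("Grade not accepted.");

        foreach (var item in items)
            students.First(s => s.Name == item.Name).AddGrade(item.Grade);
    }
}
```

Since both checks run before the loop, a failure can no longer leave some students updated and others not.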

So, we have moved this solution to a higher evolutionary step, and this is typically as far as many people would go. But now, if you were throwing argument exception, for example, or argument null exception, or index out of range, or invalid operation exception, things like that, then it is probably time for you to think: why did you do that?

So, the question is, you can think for yourself, did you ever throw argument exception? And you probably say yes, all the time. Then I ask you the second question, did you ever catch argument exception? And I don't know, I suppose that nobody has ever done that. The problem with argument exception, argument null exception, and similar exceptions is that nobody ever catches them. People catch Exception, and they do that at the very topmost level, before the exception goes out of our scope and becomes an unhandled exception. We are handling the exception just to protect our process from failing. We are never catching argument exception. And the reason is that we have no clue what to do with it. It is as simple as that. You cannot handle argument exception, because handling and recovering from that exception would mean to correct that argument and make the call again. But then, if I knew what's wrong with the argument I am passing, I would pass the correct argument right away, I wouldn't cause the exception in the first place.

This is a mind-bending idea that might need to mingle in your head for a while before you start thinking the same way I'm thinking. I don't know how to make you believe me in a one hour session. That is my point. But anyway, we could ... Suppose that I have persuaded you not to throw concrete argument exceptions and not to try to handle them, what is the next evolutionary step?

Tony:

Excuse me Zoran, before you go farther, I would have a question here. Wouldn't it help if you are more specific in these exceptions, because here you just say student not found, but if someone on the top level catches the exception, wouldn't it help him to know, for example, which student hasn't been found, or which student has an unacceptable grade, or which grade?

Zoran:

Mm-hmm, yes, so you are suggesting to make even more specific exceptions, like our own custom exceptions, and populate them with more specific information, right?

Tony: Exactly.

Zoran:

Okay, so again, what would we do with that? Suppose that there is a student name which exists here but doesn't exist in the student's ... Where is it, student's list. Now we would include that student name in the exception. So what would we get with doing that? The caller would have to catch specific exception, to expect that exception, and then to dig for specific information there, our student's name, and then do what? Once again, we have no clue what to do. So a student is not there. Is it going to add it? No, it doesn't have a grade for that student, so we have an impossible situation. So we are not going to recover in any ways, including more specific exception, plus, even if you do catch a more specific exception you are still going to do the same things as in this contract scenario, so we are going to understand that this student does not exist in that list, alright? But we could do that by asking whether this list contains that student for every student name. 

So all the code is still there, it's only packaged in different places. So you're not going to have any functionality more than you already have, and you are still not going to be able to recover from the error.

Tony: Okay, thank you.

Zoran:

And then, the revelation. If a precondition is violated, in any place in code, do you know what that is called? What do you call it when you remove the last grade when there are no grades? Or what do you call adding a grade which is not part of the enumeration, which is an invalid grade? You call it a bug. Precondition violation is identically equal to a bug, and if you ever wondered why you cannot recover from mistakes like having a name of a student which doesn't exist in a list, or having a grade which cannot be accepted by an existing student, if you wondered why you cannot recover and keep going, but must stop execution of that operation, of course, and just stop it, there's no more execution of that operation, then the answer is: that happened because you have a bug.

And then, checking preconditions means to catch bugs in your code. There's no recovery, just stop execution, that's why people throw exceptions, they don't return status codes, there's no more execution beyond this point. And now, as good programmers, we could do a new thing, a completely new thing, and say, "Alright then, checking preconditions is a separate concern. Let's create a class," we are object oriented programmers, and we resolve issues by creating new classes, alright. So here's the contract class. This class will be a library somewhere on the side, not part of real production code. And in the contract class, I want to define a utility called Requires. It is even a static method; it will receive a predicate which must return true. I will call this predicate; if it returns false, I will throw an exception, and listen, I don't care what exception, I will just throw an exception, because this is the end of the road, I'm writing precondition violations, there is no more. No specific exceptions, they carry no information which has any meaning at this position.

And now, … using this, for example in the add grade, I will call Requires and say this is ... can add, called with this grade, this must be true. Done. Remove last grade would be another. And another, and another, let me find number five. Another precondition, has grades, it must be true, and that's it.
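The helper just described can be sketched in a few lines. This `Contract` class is a hypothetical stand-in written for illustration; it is unrelated to `System.Diagnostics.Contracts.Contract`, and taking the predicate as a `Func<bool>` is an assumption based on the phrase "I will call this predicate".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper living outside production code.
public static class Contract
{
    // A violated precondition is treated as a bug: any exception will do,
    // because nobody is expected to catch and recover from it.
    public static void Requires(Func<bool> predicate)
    {
        if (!predicate())
            throw new Exception("Precondition violated.");
    }
}

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();

    public bool CanAddGrade(Grade grade) => Enum.IsDefined(typeof(Grade), grade);
    public bool HasGrades => grades.Count > 0;

    public void AddGrade(Grade grade)
    {
        Contract.Requires(() => CanAddGrade(grade)); // declarative precondition
        grades.Add(grade);
    }

    public void RemoveLastGrade()
    {
        Contract.Requires(() => HasGrades);
        grades.RemoveAt(grades.Count - 1);
    }
}
```

Each method now opens with a one-line statement of what must hold, and the decision about how to react to a violation lives in one place.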

Now look, preconditions have been turned into declarative code. It's not imperative, it is just declaring what must hold in order to start doing this. Alright? Now I suppose, if this is the first time that you see preconditions, at least in this way, that this is a bit too fast for you. You will have access to the code and to code snippets, so you will be able to repeat this whole process and see the evolution of the code yourself.

I will move faster towards the end. Now look in the training class, I have these complicated preconditions; they are complicated because they are working on a collection. They must work on all elements of the collection. So now that we have moved the contracts to a separate class and declared them an orthogonal concern, we can even add a universal quantifying precondition. Alright.

So contract extensions is going to be the class that will hold that. You see, this is another extension method, this time it is an extension on a sequence of whatever, it receives predicate on that whatever, and it just calls, requires for you to … elements in the sequence, nothing more. This is just a universal quantifier for preconditions. 

I was able to define this very easily because I have moved this different concern into a separate class. So, back at the training class, let me rewrite these preconditions in terms of universal preconditions. Now this reads like names and grades, for them I require that all tuples satisfy that I contain a student with this name, this is the name of the constraint, and I request that I'm accepting a grade for this student name and this grade, this is the name of the constraint.
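A universally quantified precondition of this kind might be written as an extension method. The name `RequireAll`, the `Contract` helper, and the rest of the types below are assumptions made so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Contract // hypothetical helper, as in the talk's description
{
    public static void Requires(Func<bool> predicate)
    {
        if (!predicate()) throw new Exception("Precondition violated.");
    }
}

public static class ContractExtensions
{
    // Universal quantifier: the predicate must hold for every element.
    public static void RequireAll<T>(this IEnumerable<T> sequence, Func<T, bool> predicate)
    {
        Contract.Requires(() => sequence.All(predicate));
    }
}

public enum Grade { E = 1, D = 2, C = 3, B = 4, A = 5 } // hypothetical scale

public class Student
{
    private readonly List<Grade> grades = new List<Grade>();
    public string Name { get; }
    public Student(string name) { Name = name; }
    public int GradesCount => grades.Count;
    public bool CanAddGrade(Grade grade) => Enum.IsDefined(typeof(Grade), grade);
    public void AddGrade(Grade grade)
    {
        Contract.Requires(() => CanAddGrade(grade));
        grades.Add(grade);
    }
}

public class Training
{
    private readonly List<Student> students = new List<Student>();
    public void Add(Student student) => students.Add(student);

    public bool ContainsStudent(string name) => students.Any(s => s.Name == name);
    public bool AcceptsGrade(string name, Grade grade) =>
        students.First(s => s.Name == name).CanAddGrade(grade);

    public void SetGrades(IEnumerable<(string Name, Grade Grade)> namesAndGrades)
    {
        var items = namesAndGrades.ToList();
        items.RequireAll(item => ContainsStudent(item.Name));
        items.RequireAll(item => AcceptsGrade(item.Name, item.Grade));

        foreach (var item in items)
            students.First(s => s.Name == item.Name).AddGrade(item.Grade);
    }
}
```

The two `RequireAll` lines read almost like the sentence in the transcript: for all tuples, the training contains the student and accepts the grade.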

Do you like this code? It is 100% declarative, and one even more important thing, it is encapsulated. It is encapsulated here, not in my production code, but in some other library. And now, what can we do with the encapsulated code? We can change it without affecting the caller. That is what object oriented programming is all about. 

So, look at this. Suppose that somebody has said ... I'll return to this. Somebody said, "Listen, this is walking through the sequences, it's even walking twice, I don't want this to run on every call in production, this is slow." Alright. And I'm pretty sure that this will never happen, that this precondition is false, we have really tested the code to the end. Alright. We want universal and existential quantifiers to be switched off in production.

Do you know how you switch off this code in production? One way is to add conditional compilation, that is a feature of .NET. You can specify that this piece of code is going to be included in the assembly, but then when just in time compilation is performed, calls to this function will be removed if this symbol, debug, is not defined. 

So let me show you what happens if I end testing and decide to go to production. So this is debug configuration, now I turn to release, and this goes gray. The compiler removes both calls. So you can keep quantifiers alive in debug and turn them off in production, making code safe during testing and making it fast during execution. That is one thing you can do when you treat a contract as a separate concern.
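
As a rough illustration of this switch, `System.Diagnostics.ConditionalAttribute` can be placed on the quantifier. The class and method names below are assumptions; only the attribute itself is part of .NET. With it, the C# compiler omits calls to the method (and the evaluation of their arguments) from any caller compiled without the `DEBUG` symbol, while the method body stays in the assembly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DebugOnlyQuantifiers
{
    // Conditional methods must return void. Callers built without DEBUG
    // contain no call to this method at all.
    [Conditional("DEBUG")]
    public static void RequireForAllInDebug<T>(
        this IEnumerable<T> sequence, Func<T, bool> predicate, string message)
    {
        foreach (T item in sequence)
        {
            if (!predicate(item))
                throw new ArgumentException("Precondition violated: " + message);
        }
    }
}

public static class Training
{
    public static int CountGrades(int[] grades)
    {
        // O(N) check in Debug builds, no cost in Release builds.
        grades.RequireForAllInDebug(g => g >= 1 && g <= 10, "grade is in range");
        return grades.Length;
    }
}
```

The production method `CountGrades` is the same source text in both configurations; only the build symbol decides whether the check runs.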

Or, even more, you could change the Requires method itself and do something like this. This is the old-fashioned conditional compilation. You can say that in debug mode, I want to assert, I want to kill the process. Now, why would I kill the process? Why not keep throwing a plain exception? Because exceptions can be caught. Sometimes people inadvertently catch exceptions and do nothing with them. They catch exceptions because they fear that those exceptions would propagate up and do damage. Alright, that fear is not well placed, you should not catch an exception unless you know perfectly well what to do with it, but people still tend not to handle exceptions, but just to cover them up. And if you throw a precondition violation exception like this, then even in the testing code, it might happen that somebody has hidden the fact that an exception has been thrown. And then, you have a bug, and your code has raised a flag saying, "This is a bug, you have a bug, fix it," but somebody has hidden that flag from you. And then you deliver buggy software to production.

So in order to avoid that, you may choose to assert in debug mode. With an assertion, if the predicate is false, the assertion will kill the process. There's no way for you to hide that, to catch an assertion, it will propagate to the operating system. And it is very often a good idea to assert on precondition violations during debugging, because then everybody on the team will know that there is a bug, and they will have to address it. And again, if you go to build configuration, when you finish testing and go to release, then you switch off the assertion, and even without that, I was using this assertion from the Debug class, so this wouldn't affect the release build anyway, so you get back to throwing an exception, because, well, it's not nice to assert in production, you don't like big red screens, windows and message boxes on the screen, alright.
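
A minimal sketch of this debug/release split, assuming a hypothetical `AssertingContract` class. `Debug.Assert` is the real .NET API; what a failed assertion does exactly (dialog, fail-fast, trace output) depends on the runtime and the configured trace listeners.

```csharp
using System;
using System.Diagnostics;

public static class AssertingContract
{
    public static void Requires(bool condition, string message)
    {
#if DEBUG
        // Debug build: a failed assertion cannot be swallowed by a catch block.
        Debug.Assert(condition, "Precondition violated: " + message);
#else
        // Release build: fall back to an exception for the sake of end users.
        if (!condition)
            throw new InvalidOperationException("Precondition violated: " + message);
#endif
    }
}
```

`Debug.Assert` is itself compiled out of Release builds, so the `#if` is needed only to select the throwing branch.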

So then you throw, even if that means that some exceptions would be silently ignored. Just for the sake of end users, keep throwing the exception rather than asserting. And another thing, the last thing in this demonstration, there's an even more interesting idea. What exception should we throw? Not a plain exception, a plain exception really says nothing. Maybe we could throw some exception like contract exception, you see, and it says precondition violation, this precondition, alright, but what is the contract exception?

I could define contract exception as a private class inside my contract class, for example. So that nobody can see it, only I can see it. So when it comes to doing something, for example, I know this add grade might fail, I could wrap it in a try catch, but catch what? Contract exception? No. There's no such thing, it is a private class, I cannot write a syntactically correct catch block which includes contract exception. I cannot handle contract exception, the only way to handle it is to handle the base exception, which is a bad practice, and I suppose none of you is doing that. You should only catch the general exception on the very last level in your component, just to stop exceptions from propagating outside of your component. So you would generally catch the general exception and, say, if you are writing a service, you would return either 500 internal error, or whatever code, or temporarily unavailable, or I don't know, not found, things like that.

If you are writing a desktop application, you would return some object which indicates to the user interface that it should pop up a message box and apologize to the user, so that is the place where you catch general exceptions. At the very limits of your system, not inside. On the inside, I would suggest you throw a private exception class so that nobody can really catch it. And again, the reason is that any precondition violation equals a bug, and you cannot recover from a bug. You cannot fix an argument and try again. If you knew what's wrong with the argument, you wouldn't cause the exception, so that is the simplest way to look at precondition violations.
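
The private exception idea can be sketched as follows. The class name `StrictContract` is an assumption; `ContractException` follows the name used in the talk, but its shape here is illustrative only.

```csharp
using System;

public static class StrictContract
{
    // Private nested type: code outside StrictContract cannot name it,
    // so "catch (ContractException)" does not compile anywhere else.
    private sealed class ContractException : Exception
    {
        public ContractException(string message) : base(message) { }
    }

    public static void Requires(bool condition, string message)
    {
        if (!condition)
            throw new ContractException("Precondition violated: " + message);
    }
}

public static class ServiceBoundary
{
    // Only the outermost layer catches System.Exception and turns it into
    // a generic failure result for the client.
    public static string Execute(Action operation)
    {
        try
        {
            operation();
            return "OK";
        }
        catch (Exception)
        {
            return "Internal error";
        }
    }
}
```

Inside the component the only way to intercept the violation is `catch (Exception)`, which is exactly the catch block that is expected to exist only at the system boundary.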

Alright, this was the demonstration, I hope that you liked it. Now you can ask questions, I have left a bit more time for your questions, because I expect to have at least a couple of them, and I would like actually to hear you, and to try to respond to your questions now. 

Q&A

Q: Would the debug assert be removed from release build even without the debug compile directive?

A: It is not an easy decision to remove parts of the logic that is checking the conditions. If you are absolutely certain that you have checked everything in testing, you can remove all precondition checks from your application and make it absolutely fast, as fast as possible. If you are not sure, then you leave preconditions in production, but that is one of the concepts. Also, there is that gray area in between. You might add a lot of heavy checks that are good for debugging, like just imagine a sorting algorithm, an algorithm which sorts the array. Can you imagine how many checks you could perform during the execution of that algorithm? Including the final check that the array is really sorted at the end.

All that would make the sorting algorithm tremendously slow. And then you just remove all of that with a compiler directive from production code. You don't have to check the debug symbol. You could introduce your own symbols, like heavy precondition, or things like that, and switch that off not only in release build, but in the build which is going to be on staging or in production. Then, for example in .NET core, you can use environment and check what environment it is, and then to have heavy checks in development, less checks in staging, no checks or almost no checks in production, so that is the point.

It is very important to understand the concept of preconditions as a separate concern which is now turned into a 100% configuration question. It is not a question of your code, it is not part of your code anymore. You could see that I was changing the way preconditions are compiled without making a single change in production code, that is the most important part.

Q: Why not use Microsoft code contracts? How does this presentation apply to PostSharp? 

A: Zoran:

Microsoft Code Contracts is a great library, which unfortunately is having a hard time right now. Recently it was turned into a community project and I don't see much activity there. For those who do not know, the Code Contracts library is defining preconditions, postconditions, and invariants, even precondition inheritance. You can define preconditions on an interface, not on a class, and the compiler ... Not the compiler, the Code Contracts library would jump in after the byte code is built and rewrite your byte code so that, for all classes that implement a certain interface with a contract, the contract code will be injected into them after compilation, so it is a tremendous idea.

Unfortunately it never gained popularity, and it had a grain of poison from day one. That byte code rewriting is a heavy hammer. You don't want your production code to be rewritten by any tool. If you imagine deploying such code to production, and knowing that some tool has changed it after you, then you would probably decide not to do that. The Code Contracts library, I like it very much, it is a great idea, unfortunately it is not popular today. And it's not going in any direction towards success right now.

PostSharp includes tools for code contracts. Right now the tool does not support all these concepts that I have shown, however it is hitting the central point by turning contracts into a separate concern. In PostSharp, contracts are implemented as aspects, so again, they are outside of your production code. It is a very important idea to not keep contracts as part of your code.

Tony:

PostSharp allows you to add patterns to C# without changing your language, and it comes with some ready made contracts library, which covers part of what you have seen in this presentation. But you are also free to create your own contract aspects, so if you don't like how we have prepared this, you can still use PostSharp and make your own pattern implementation. 

Q: Is there any situation in which non private custom exceptions should be used? 

A: Yes, of course. Exceptions are what their name says. Just imagine, I don't know, network exceptions for example. Network exceptions are giving you the way to handle an external situation, something that did not originate and end inside of your inner code, but it came from the outside, the network failed. You could write a library which terminates execution by throwing an exception, alright, that is a perfectly valid solution. However, you must make a clear distinction between situations which are suspicious in terms of looking like a bug, and situations where something has prevented you from completing the operation and nobody could see that in advance.

It is easy to see that, for example, a user is blocked and you cannot move money to that user, so it is trivial to see that. If somebody, even after that, calls you and asks, "Now give money to this user," and you see that the user is blocked, there's no point in throwing an exception. Because that caller might handle the exception and keep going, thinking that the money is there, and it's not, it cannot be. So that is a bug, that is not an exceptional situation, it's a clear bug and you must stop execution right now. There's one popular saying, which says, "Dead process makes less damage than the crippled one." If you catch a bug and let your process keep going, it is a crippled process, something has happened and you have no clue what, and then if you let it keep working, this might damage the data, it might commit something in the database that it should have rolled back.

I hope you understand the distinction now. There are bugs, and there are exceptional situations, so throw exceptions only in the second category.

Q: Doesn't the argument against the rewriter for Code Contracts also apply to PostSharp?

A: Actually, no – IL rewriting done by Code Contracts makes the code you see during debugging not the same as the code you execute. PostSharp respects the whole development chain, so even though it is rewriting the IL, the debugging experience remains untouched. You can debug not only your production code, but also the aspect code including even your custom build-time logic. And the PostSharp Visual Studio extension helps you to keep track of all the enhancements PostSharp does in your code.

Q: Why not using the standard Microsoft Code Contracts library instead?

A: Code Contracts are obtrusive, and my experience shows that development teams are not ready to take the risk. The teams I was leading have never even considered using Code Contracts library, and I didn’t want to push the matters either.
If I had to pick the sole reason for not advocating use of Code Contracts library, it would be the fact that it is modifying the IL code in a non-transparent manner. Even the original documentation states that IL code rewriting may be the reason for teams to avoid using Code Contracts.
In my opinion, Code Contracts could have been designed as a regular library, and that would remove that veil of mystery that surrounds the process of code rewriting. I suspect that primary concern was performance. But then, if you look at spectacular performance gains of ASP.NET Core, for example, we could question that reasoning on grounds that performance could have been improved using other means. We will probably never know.

Q: How does this relate to Code Contracts?

A: If you decided to try Code Contracts on this same example, then you would find that there is correlation between syntax I have shown and Code Contracts syntax.
That is not a coincidence, because my example was strongly influenced by the Code Contracts library.
There are two reasons why I have opted to use custom contracts instead of a library. For one thing, I wanted to show you the bare bones, because that looks more convincing. Core of the contracts implementation is, as you could see, no longer than ten lines of code. The second reason is that in previous demonstrations part of the audience was complaining that I am pursuing a lost cause, predicting slow and painful death to Code Contracts. I am very fond of that library, it is very smart and also complete, but current level of support truly doesn’t fuel optimistic feelings.

Q: How does this presentation apply to PostSharp?

A: PostSharp tool supports code contracts as an aspect. You can add attributes to method arguments and describe preconditions that must be satisfied (e.g. that an argument must be non-null). It also supports custom extensions so that you can widen the use of aspects provided by the tool.
The theory which I have demonstrated is more general than any tool (except Code Contracts) currently supports, and that stands for PostSharp as well. I have insisted on unconstrained version of the theory so that you can see the direction in which that is going. Tools are getting better with every year anyway.

Q: How to extend code in a way that client code will know that method does not return null or does not accept null as an argument for instance?

A: Expectations made by the method are called preconditions; promises made by the method are called postconditions. Failing to meet a precondition indicates a bug in the caller; failing to meet the postcondition indicates a bug in the called code. So much about terminology.
Preconditions and postconditions are part of documentation. You may wish to write unit tests which would document them in an executable way, but there is currently no way to rely on, say, language or Visual Studio about that.
There was an attempt to design a Visual Studio extension which would read Code Contracts and expose them as part of IntelliSense. That extension was unstable and I had to remove it from my copy of Visual Studio after trying it for some time. There are other similar attempts, but none which would make an impression.
There were requests to Microsoft to include contracts in Roslyn compiler, but the team has rejected that. It remains to wait for someone to develop an extension which would incorporate contracts validation into the build, I suppose.

Q: Are these given precondition contracts all we need, or do you use (many) different ones not shown here?

A: As in previous question, we also have postconditions – promises made by the method. For example, method should never return null, and then it asserts that the return value is non-null before exiting. Such postcondition tells callers that they don’t have to guard against null before accessing the result of our method – a useful hint indeed.
But generally, even if you decide not to use any tools, preconditions that I have shown in this demonstration are pretty much everything you might need. As any great idea, Design by Contract is very simple.
There is, however, one subtlety. Preconditions are inherited, so that derived classes are satisfying the Liskov Substitution Principle out of the box. It is forbidden to add more preconditions in the derived class. Now, that sounds easy when you have base and derived classes. The real problem is that all this stands for interface inheritance as well, and that is what makes Design by Contract messy in .NET.
Code Contracts library came with a powerful (and complicated) solution to contracts inheritance problem. You may refer to their documentation for details.
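
The non-null postcondition described in this answer can be sketched as a helper that checks the result just before it leaves the method. All names below (`Postconditions`, `EnsuresNotNull`, `StudentRegistry`) are assumptions for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class Postconditions
{
    // Returns the result unchanged when the promise holds; a failure here
    // indicates a bug in the called code, not in the caller.
    public static T EnsuresNotNull<T>(T result, string message) where T : class
    {
        if (result == null)
            throw new InvalidOperationException("Postcondition violated: " + message);
        return result;
    }
}

public class StudentRegistry
{
    private readonly Dictionary<string, string> emails =
        new Dictionary<string, string>();

    public void Add(string name, string email)
    {
        emails[name] = email;
    }

    // Promise to callers: the result is never null, so they need no null guard.
    public string GetEmailOrEmpty(string name)
    {
        string email;
        string result = emails.TryGetValue(name, out email) ? email : string.Empty;
        return Postconditions.EnsuresNotNull(result, "result is non-null");
    }
}
```

If some other method of the class stored a null email by mistake, the postcondition would expose that defect at the point where the promise is broken instead of letting null leak to callers.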

Q: Where can I get more instructions on using contracts?

A: The authoritative source of information on Design by Contract is Bertrand Meyer’s book Object-Oriented Software Construction. That is where Mr. Meyer has introduced and explained DbC theory in such terms that there is virtually nothing more to be added.

Q: Your idea of recovery is not technically sound... How this "separate" concern can work in multi-threaded environment?

A: Multithreading is another concern. It is not advisable to enforce thread safety on any given piece of code. It is better to implement thread safety as a separate class which wraps a non-safe component. In that way, code is much simpler and usually less error prone.

When things are put that way, it turns out that contracts will remain in the component which doesn’t deal with threads and therefore it is 100% sound.
Take TPL as an example – threading was incorporated in Tasks and Task Managers, leaving logic intact. If you take a look at the way Code Contracts library is already handling async/await constructs, you will see that there is nothing missing in their solution.
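
A minimal sketch of that separation, with hypothetical `Counter` and `SynchronizedCounter` classes: the inner class carries the contract and knows nothing about threads, while the wrapper adds locking and nothing else.

```csharp
using System;

// Plain component: holds the logic and its precondition, not thread-safe.
public class Counter
{
    private int total;

    public int Total
    {
        get { return total; }
    }

    public void Add(int amount)
    {
        if (amount <= 0)
            throw new ArgumentException("Precondition violated: amount must be positive");
        total += amount;
    }
}

// Wrapper: thread safety as a separate concern around the plain component.
public class SynchronizedCounter
{
    private readonly Counter inner = new Counter();
    private readonly object gate = new object();

    public int Total
    {
        get { lock (gate) { return inner.Total; } }
    }

    public void Add(int amount)
    {
        lock (gate) { inner.Add(amount); }
    }
}
```

The precondition is evaluated inside the lock by the single-threaded component, so the contract itself never has to reason about concurrency.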

Q: How practical it is to remove precondition from production code? What should your method returns in production if someone violate your preconditions for method?

A: Sometimes people decide to remove contracts checking from production code just for performance reasons. Note that contract clauses can also be viewed as part of a larger testing picture.
You can consider rigorous proofs as one level (which you run once – on paper), automated tests as another level (which you run sometimes), and then contract checks as the third safety valve (which you run every time the code executes). Now, it’s obvious that you can drop contract checks if you are pretty sure that automated tests have done the good job, or even that you have a formal proof that contract check will never fail.
In about one month from now a new course of mine will be published at Pluralsight – titled Writing Highly Maintainable Unit Tests. In that course I am explaining in great depth relation between formal proofs, unit tests and code contracts. For those of you having Pluralsight subscription, it might be useful to watch that course once it goes live.

Q: If we have checks/contracts on debug or staging, how would we not want that on production? Ex: checking to see if the students exists would have to be checked on production as well right?

A: This relates to previous question, where I have argued that code contracts are only one level of safety, though the most rigid one. If your code is tested well, then you don’t even have to put explicit contract checks into your code. However, I don’t think that any production code is tested enough to truly remove contract checks, because contract violation would then pass unnoticed and it could possibly damage data.
Therefore the conclusion. It pays to consider removing some of the costly contract checks. For example, remove quantifier which has O(N) complexity from methods with sub-O(N) complexity – like methods that have O(logN) or O(1) complexity. Leaving contract checks there would make the code observably slower. You probably don’t have to remove an O(N) contract check from a method which already has O(N) complexity or higher – like O(NlogN) or O(N^2). Contract check would then reduce performance by at most a constant factor, which we can survive.
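
As an illustration of a costly check on a cheap method, here is a sketch of an O(log N) binary search guarded by an O(N) "array is sorted" precondition. The `HEAVY_CONTRACTS` symbol and all names are assumptions; the check is compiled into callers only when that symbol is defined for the build.

```csharp
using System;
using System.Diagnostics;

public static class SortedSearch
{
    // O(N) precondition: would dominate the O(log N) search below,
    // so it is enabled only in builds that define HEAVY_CONTRACTS.
    [Conditional("HEAVY_CONTRACTS")]
    private static void RequireSorted(int[] values)
    {
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i - 1] > values[i])
                throw new ArgumentException("Precondition violated: array must be sorted");
        }
    }

    // Binary search: returns the index of target, or -1 when it is absent.
    public static int IndexOf(int[] values, int target)
    {
        RequireSorted(values);

        int low = 0;
        int high = values.Length - 1;
        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            if (values[mid] == target) return mid;
            if (values[mid] < target) low = mid + 1;
            else high = mid - 1;
        }
        return -1;
    }
}
```

The same O(N) check on a method that already sorts or scans the whole array would cost only a constant factor and could reasonably stay enabled.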

Q: Wouldn't Debug.Assert be removed from Release build even without the #if DEBUG compiler directive?

A: That is true in my example. I have included the conditional build just to show you the concept. But don’t forget that there is the Release version of assertions in .NET. Some people opt to assert in Release build as well. That makes sense in very sensitive code, where erroneous execution might cause unacceptable consequences.

Q: Is there any situation in which non-private custom Exceptions should be used?

A: Exceptions remain a valid solution in all exceptional situations. If you are writing a library, you would probably opt to throw a custom exception back to the caller, rather than burden the caller with having to use a certain code contracts library (including your own).
However, inside your library, you could still be fine if you decide to institute contracts as the design strategy. That will at least save you from your own bugs, not related to any other code that might call your functions.
Other legitimate cases are already covered by the .NET Framework and other libraries and frameworks. Consider network exceptions as an example. Network communication is a typical example of the process which is full of exceptional cases, and that is where exceptions can be of great value.
Note, however, that there is one funny side of exceptions, which can give us a hint about when exceptions are a valid option. Although they are representing exceptional events, we always expect exceptions. We expect network exceptions when working with remote services. We expect database exceptions, like concurrent edit, deadlocks or timeouts. We expect IO exceptions when working with files. If you have a situation in which you actually do not expect anything exceptional to happen, then you will probably be better off if you avoid introducing an exception.

Q: Is it correct to say that catching or throwing exceptions in between the execution of a function is an anti-pattern?

A: Using exceptions to procure information or to alter control flow is definitely an anti-pattern. Just consider this case. You have caught a custom exception and you are ready to do what it says. Now – how do you know that the exception was thrown by the entity you expected? Exceptions can jump over activation frames. You are certainly not going to inspect the call stack from the exception object to discover which function has thrown it.
This analysis can go much deeper than this. Maybe it is enough to think of catching specific exceptions as breaking encapsulation of the called code and knowing too much about its implementation. That is rigid and fragile and it breaks principles of OO programming - which is not the problem in itself, but then you won’t have any benefits of OO like polymorphic execution.

Q: Since preconditions have been rephrased as declarative code, can we use attributes to define contracts? Any frameworks for doing this?

A: PostSharp tool is defining contracts through use of attributes. However, keep in mind that attributes must be compile-time constant, and therefore you cannot pass just any expression as an attribute-based contract.
However, it is possible to cope with that (in theory). You could use static attribute to somehow point to another method which contains more elaborate contract verification code.

Q: Briefly, what was the purpose of the private ContractException?

A: Primary fear from throwing exceptions when precondition/postcondition is violated is that somebody might catch that exception and forget to rethrow it. In that way, operation might continue to run, even though we have just found sufficient reason to terminate it.
That is the reason why in Debug mode many developers opt to assert, i.e. to kill entire process, rather than throw an exception. But then, in production, killing the process is not the most user friendly behavior we could come up with, and for that reason we may decide to throw exceptions instead.
But then, we don’t want to throw an exception which can be explicitly listed in the catch block. We don’t want anyone to catch that exception for any reason. In order to catch a private exception, you must catch its base System.Exception instead. And it makes sense to expect that nobody will ever catch System.Exception deep inside code. That is the justification behind private exception class.

Q: If you want to access a resource such as database and you split it to two function: One to check whether there is a value, and another to fetch the value. This will use expensive resource twice, so maybe here return nullable value is ok?

A: Database access is not subject to Design by Contract. Database is to our code just another outer system and we are dealing with it with all precautions we would make in any other case. You have to catch exceptions coming out from the database because you have no idea what might go wrong there (like deadlocks or timeouts, or even plain failures such as corrupt index).

Then the question how to deal with a nonexistent value where it was expected becomes easier. You just query the database as it is. Then turn concrete data into concrete objects. And then those objects would be subject to Design by Contract. If an object requires a value, and you got nothing from the database, then don’t call the method and take a recovery course instead. Otherwise, if you try to invoke the method with null reference, just because database returned no rows, the called object would preserve right to explode.

 

About the speaker, Zoran Horvat

After fifteen years in development and after leading a dozen of development teams, Zoran has turned to training, publishing and recording video courses for Pluralsight, eager to share practical experience with fellow programmers. Zoran is currently working as CEO and principal consultant at Coding Helmet Consultancy, a startup he has established to streamline in-house training and production of video courses. Zoran's blog.

 

Modern business applications rely heavily on rich domain classes, which in turn rely heavily on polymorphic execution, code reuse and similar concepts.

How can we extend rich domain classes to support complex requirements?

In this webinar, Zoran Horvat will show why an object composition approach is favored over class inheritance when it comes to code reuse and polymorphism.

Watch the webinar to learn:

  • How class inheritance can lead to combinatorial explosion of classes
  • What the limitations of object composition are
  • What design patterns help consume composed objects
  • Techniques for creating rich features on composed objects

Applying Object Composition to Build Rich Domain Models on Vimeo.

Download slides.

Download code samples.

Video Content

  1. Limitations of Inheritance Approach (3:00)
  2. Composition Approach (12:45)
  3. Visitor Pattern and Double Dispatch Principle (20:47)
  4. Difficulty with Visitor Pattern (36:49)
  5. More Complex Examples (40:07)
  6. Accumulating State of Visitor (43:15)
  7. Q&A (55:08)

Webinar Transcript

Zoran:

Hello, this is Zoran Horvat speaking. I will be delivering a webinar and here with me is Alex from PostSharp.

Alex: Hello everyone.

Zoran:

Well, as you can see, this webinar is on the topic of creating rich domain models. I believe most or all of you have already suffered a lot building rich domain models, so in this webinar, I will show you one approach to that problem, so let's get started.

You may know me from Pluralsight, where I have delivered five courses thus far and the sixth one is en route. The most interesting one regarding this webinar would probably be this course on design patterns called Managing Responsibilities. If you want to watch more of this, you can go to Pluralsight and you can search my name and get into the courses.

In this webinar, we will talk mostly about the Visitor design pattern. In this course on design patterns on Pluralsight, you would find visitor and also chain of responsibility, which may also be applied to the same problem that I will show today.

The problem. Let's start with an example. I have come up with an example which doesn't require any particular domain knowledge. It has to do with animals because I suppose everyone knows a lot about animals and you will not have to learn any specific business domain to follow this webinar.

Suppose that we have a company which deals with animals. It needs to categorize them to do stuff to them. Let's see where that will lead us. For example, we could have a number of objects, each representing one animal like a cow, horse or lizard, snail, catfish, parrot, eagle and now we have to do something with these objects. Suppose that the first requirement is to categorize them somehow and what we do, what we will try to do will be to create a hierarchy of classes which is categorizing the objects based on their features so that two objects would share the same base type either directly or indirectly and through that base type, we would be able to access common features of those two objects.

Limitations of Inheritance Approach

In the first part of the presentation, I will show you the approach with the class inheritance. You know the old saying in object oriented programming, favor composition over inheritance. Now I will show you why. Suppose we have started to categorize these animals into ground animals, water and airborne animals. Suppose that doesn't satisfy all the requirements right now. Suppose that we want to work with the mammals only, so here they are.

We can add another level of inheritance, another layer of derived classes, which only covers these two animals that we are interested in. At this level of complexity, we are quite happy. We can apply business logic that we have with mammals and that looks like this for example. We might have some object of type animal, but it's not really animal. It is some class derived from animal and now we check whether that is the mammal. If it is, we can cast this abstract animal object into a mammal object. For example, a tiger or anything else and then we can apply the domain logic, which is strongly tied to mammals.
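
The check-and-cast step described here could look roughly like the sketch below. The class names, the `PullTail` method and the returned strings are assumptions, since the slides are not part of the transcript.

```csharp
namespace InheritanceDemo
{
    public abstract class Animal
    {
        public abstract string Name { get; }
    }

    public abstract class GroundAnimal : Animal { }

    // Second layer of the hierarchy: ground animals that are mammals.
    public abstract class Mammal : GroundAnimal
    {
        public string PullTail()
        {
            return "pulled the tail of the " + Name;
        }
    }

    public sealed class Cow : Mammal
    {
        public override string Name { get { return "cow"; } }
    }

    public sealed class Lizard : GroundAnimal
    {
        public override string Name { get { return "lizard"; } }
    }

    public static class DomainLogic
    {
        // Starts from an abstract Animal, then downcasts to reach
        // the feature that exists only on mammals.
        public static string Handle(Animal animal)
        {
            if (animal is Mammal mammal)
                return mammal.PullTail() + " and ran away";
            return "left the " + animal.Name + " alone";
        }
    }
}
```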

For example, our business might pull the tail of an animal and then I don't know, run away. At this point, we are in a good situation when talking about the program. We can access a concrete feature of a concrete object, concrete class. No matter the fact that we have started from an abstract animal object. In the end, we could pick a concrete feature of a concrete class, so this looks good, but let's continue.

You know about special cases of mammals, namely whales and dolphins. Well, they are also mammals and they don't live on the ground. They live in water. If we wanted to also implement the same feature on these two kinds of animals, then we would have to introduce mammals in one additional place in the class hierarchy. If you think about this situation here, if you tried to, I don't know, map it to any real business domain that you may be working in, you would recognize a situation that you have certainly seen before.

There is this first front of classes that derive directly from the animal and at this level, everything is straight. We have ground, water, air. Everything is right, but then if you want to add one more, one additional kind of feature, not one feature, but additional aspect of an animal, then we start splitting ground animals into mammals, water animals into mammals, and we start duplicating logic.

Now, both of these would have some features that their parents do not have, like giving birth to live younglings as opposed to laying eggs. We have the feature of the mammal, another feature of a mammal which doesn't exist in any common ancestor. If you get back to this piece of domain code, then we see that it is not complete. To complete it, we also have to check whether it is another kind of mammal and then to repeat everything, only this time, with a different type.

Now, observe that this is not code duplication because for example, water mammals do not have tails. They have fins and we are not going to run, but to swim in the sea, so obviously we cannot reuse this code here because we are not working with the same type. There's nothing in common that we could use in these two examples.
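
Once mammals appear under both ground and water animals, the domain code has to be repeated for each branch. The sketch below uses assumed names (`GroundMammal`, `WaterMammal`, `PullTail`, `PullFin`) to show that the logic is duplicated even though no line of code is literally shared.

```csharp
namespace SplitMammalsDemo
{
    public abstract class Animal
    {
        public abstract string Name { get; }
    }

    public abstract class GroundAnimal : Animal { }
    public abstract class WaterAnimal : Animal { }

    // The same concept, mammal, now lives in two unrelated branches.
    public abstract class GroundMammal : GroundAnimal
    {
        public string PullTail() { return "pulled the tail of the " + Name; }
    }

    public abstract class WaterMammal : WaterAnimal
    {
        public string PullFin() { return "pulled the fin of the " + Name; }
    }

    public sealed class Cow : GroundMammal
    {
        public override string Name { get { return "cow"; } }
    }

    public sealed class Dolphin : WaterMammal
    {
        public override string Name { get { return "dolphin"; } }
    }

    public static class DomainLogic
    {
        public static string Handle(Animal animal)
        {
            // One branch per place where mammals occur in the hierarchy.
            if (animal is GroundMammal groundMammal)
                return groundMammal.PullTail() + " and ran away";
            if (animal is WaterMammal waterMammal)
                return waterMammal.PullFin() + " and swam away";
            return "left the " + animal.Name + " alone";
        }
    }
}
```

Every further split of the hierarchy (flying mammals, flightless birds) would add another branch of the same kind to `Handle`.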

Alex:

If I can ask a quick question here. What if we try to change the class hierarchy for example to fix this issue and for example, pull mammal implementation up in the hierarchy and then push what is up down and try to somehow simplify it?

Zoran:

Yes. Yes, that is a good question. The approach would be to pull the mammal up and in that way, make both of these have the same class or the same ancestor. However, if you do that, then ground and water animals would have to split under them.

Alex: Yeah.

Zoran:

We would virtually end up with the same problem, only different set of objects. Basically the problem is that we are solving a problem that we have with an inappropriate tool. The problem is that we have distinct features of an animal and we are trying to turn those distinct, unrelated features into a hierarchy which they do not exhibit and so we suffer. We suffer in a very painful way, as it says here.

We are duplicating domain logic, but we are not duplicating the code. The code is not the same because it is operating on different classes which have no common ancestor. We can come up with more counterexamples. You know about emus and ostriches as flightless birds, so if you wanted to add them, they would fit into the ground animal category, but they are still birds, so we have birds here and we have flightless birds there.

If we wanted to do something with birds again, we have code duplication. If you wanted to add even more specific mammals like a flying squirrel or a bat, which are proper mammals, we would see that they are airborne animals. Or if we wanted to add a flying fish, it is still a fish, but it sometimes glides outside of the water. Here is the hierarchy of classes that is covering these features. For only this small, limited number of features of these dozen animals, we have a dozen classes, because we have to split every single class into two or three of them to cover specific and duplicated features in every part of the hierarchy.

My point here is that attempting to solve a complicated domain problem with a class hierarchy very quickly leads to the situation this small hierarchy is already in. Only the first layer is clean and everything under that is a mess. If you go to the second layer, you see duplication in three places, but if you go even lower, you see that even those are cut into smaller compartments, because we want to know that the flying squirrel doesn't really fly. It glides. It needs wind or something to fly, but a bat needs nothing. It has full-blown flying ability.

If you want to explain an entire domain model with a class hierarchy, it is very soon going to become unbearable. You won't be able to complete it. Now again, this is an example with animals. You can make it less funny by applying it to a financial market or to a bank or to an insurance company or, I don't know, a production plant, and you would see numbers. I mean, dozens of special cases that you cannot fit anywhere. For example, just try to imagine amphibians. Where would I put amphibians here? I have no clue.

Now, imagine an amphibian which is a mammal who knows how to fly and you will see that there is absolutely no place where to put it in this hierarchy of classes. I have seen very large hierarchies in real business domains and they were really impossible to manage. There was so much duplicated or almost duplicated code that it was impossible to manage.

Composition Approach

We come to the primary topic of this presentation, in which we are going to talk about a different approach to solving the same problem. It is composition. We are not going to inherit classes. None of these animals is going to be an instance of all of these ancestor classes. It is going to be just an animal, and now animal is not going to derive from whatever. It is going to contain features, so we had a special kind of feature which is classification. It is biological classification and we had mammals, birds, bony fishes or reptiles or gastropods. That is a snail.

This diagram says that animal is going to contain an instance of a class deriving from classification. It is going to contain an object mammal. Let's see another. We had environment in which the animal is living, so it could live on the ground, in the water. It could be seen in the air. We would add an object of environment, even more, we could add multiple objects because some animals, like amphibians could be seen on the ground and in the water, no problem.

Some others like birds are living on the ground and sometimes flying in the air, no problem. Two objects again. Even more than that, we could refine some of these objects and say, there are two kinds of water. Not every fish is living in the fresh water or in the salt water. The third aspect of those animals that I had were their abilities. They could walk or run if they are on the ground. They could fly. Now, fly could come in two flavors, gliding or full flight. Swimming could be even more complicated.

Now, as you can see, the animal could also have multiple abilities. There's nothing to prevent that. As you can see, I have split that large and cumbersome hierarchy into three distinct, smaller hierarchies and none of these hierarchies alone is exhibiting any of the problems of the previous one because each of them is dealing with only one aspect of an animal, and you can always add multiple objects if you want to represent multiple abilities or environments or whatever.

Now, these hierarchies can also start growing and become not easy to manage, like abilities are very complicated right now. If that happens, then just apply composition to the abilities again. You could have, I don't know, ground abilities, air abilities, and water abilities, like a product between these two hierarchies, and you could again compartmentalize the object and compose it from multiple smaller objects that would describe it more closely.

This looks like a solution, right? And it is really not. The problem here is that now the animal class is encapsulating its traits. We don't know what is inside, and encapsulation is a very important principle in programming. Okay, I will give a sentence on that just after. If you follow encapsulation, then we are in trouble: we don't know whether this animal is a mammal. We cannot see that. If animal broke encapsulation and let us access its classification directly from the outside, then we come to the reason why encapsulation is good. Animal could never change its implementation of classification in the future, because somebody is depending on a concrete implementation of this feature.

That is why encapsulation is good. We are keeping the right to change the way animal finds out who it is. For example, this animal could contain an object of classification, but also it would contain a reference, a dependency on some classifier. Something, I don't know and whenever we ask the animal what class you are, what biological class do you belong to, that animal object could contact its dependency and say, now, look, somebody's asking. You tell me, who am I? That object would tell it's a mammal or a bird or whatever and the animal would return that. That is what we get if we preserve encapsulation. 

We need some, I don't know how to say, a good, legal, legitimate way to poke into the animal object without violating its encapsulation. One of the ways to deal with that is the visitor pattern.

Now, before explaining the visitor pattern, I will show you what these animal objects will look like. If you're already familiar with the visitor, that will give you the idea of what we will try to visit. If you have a cow, its class is mammal, its environment is ground. It has two abilities, to walk and run. The horse is the same.

Emu and ostrich are almost the same, only they are birds. Now we have lizards, which can walk. We have a snail, which is a gastropod, and I don't know what to call its movements, whether they are a walk. Now, whale and dolphin are mammals. Now you can see an example of what was almost impossible in that hierarchy, the approach with the inheritance. Now, they are also mammals, as cow and horse, but they share nothing else with them. They dwell in water. They can dive, live under water, anything, so there's no resemblance with cow and horse.

That is exactly what I wanted to achieve with the composition. That is something that you cannot do with inheritance. We have, I don't know, two kinds of fish, birds, and a flying squirrel and a bat, which also share few common features. We have this table which is explaining how we can construct an animal. We can construct a lizard by telling that its biological class is reptile and that it lives on the ground, which means give it an object of type reptile, give it an object of type ground, give it an object of type walk, and then the lizard will be able to walk. And so comes the visitor.

The Visitor Pattern and Double Dispatch Principle

Here is a principal idea of the visitor pattern. There we can talk about a hierarchy, like these abilities, running, flying and swimming. Now, take these three concrete classes. They will have to do something. What will they do? For example, we might say that ability is defining an interface use. Like run, okay, start running. Or fly, start flying. Or start swimming. This is the common feature of all elements of the hierarchy.

However, after a while, we come up with a different idea that abilities might sometimes be suspended. Like I could break a leg and not be able to run. Ability might also have a suspend feature. Alright, we implement three more functions here and we are fine. Now we have six functions in total. However, later on, we come with another idea, advertise an ability. Like in the mating season, you can see someone running and jumping around with no obvious reason, so it would not be the same as use, but it will be, I don't know. It could still run, fly or swim, but in a completely different setting. Where does this lead us? If we have built this hierarchy of classes, we don't want to change them every now and then.

If you have this hierarchy, which might be overwhelmed with new features that are using them, then adding all those features to those classes might not be the best idea. Instead, we could turn this hierarchy or view it sideways, like let's see using feature, like something that develops here and then let the using feature use the run ability or use the fly ability or use the swim ability. The features are becoming classes and these are becoming their arguments. That is how we come to the visitor.

Some AbilityVisitor would visit the ability class and the class would tell, "Okay, now use me. Do whatever you do to myself, to this object." This class, AbilityVisitor, would now have one method for each element of the hierarchy. Do you see how things are turned by 90 degrees? You can see that now we have one visitor method, but in three flavors: run flavor, fly flavor and swim flavor. Each concrete visitor would implement an entire feature for each of the classes in the hierarchy.

Now, I said that the problem with the original design, where the Ability base class had all three methods, is that when we add new features, we have to add them to each of the classes. It's obvious that visitors are suffering the same kind of problem. If we add a new class, then we have to add a new visit method. It is not really better than the previous solution. It is just a different view on the methods.

We come to concrete visitors. Like UseVisitor or SuspendVisitor or AdvertiseVisitor. Those are the three visitors, which are implementing the three functions, the three features, but each of them is implementing an entire feature for an entire hierarchy, each type in the hierarchy. How does the visitor work? We take some ability. We don't know what concrete type this ability object is, and we call accept, passing a visitor. This is still something that is derived from AbilityVisitor. All three members of the hierarchy would have to override the accept method.

In each of them, we just say: visitor, whoever you are, visit me. Now guess what? Since this class is concrete, this will statically link to this method, visit here, visit of run, because we are in the run class. The accept method would look the same in the fly class, but it will call visit of fly instead. This is the basic principle of the visitor.

It is called double dispatch. We have one dispatch to let two objects meet each other, like concrete object derived from this class, meeting an abstract visitor and then the concrete visitor, this one. It must be something concrete, is meeting concrete object run and that is how these two objects finally find each other and then you can implement the feature on the run object, on a strongly typed run object. That is the major benefit that comes with the visitor design pattern. That is the major benefit and now you know everything else is probably drawbacks.
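The double dispatch mechanism described above can be sketched as follows. This is a minimal sketch of the textbook form of the pattern, assuming the names from the slides (`Ability`, `Run`, `Fly`, `Swim`, `AbilityVisitor`, `UseVisitor`, `SuspendVisitor`); the `Result` property and its strings are invented for illustration and are not from the webinar's source code.

```csharp
using System;

abstract class AbilityVisitor
{
    // One Visit overload per concrete class in the Ability hierarchy.
    public abstract void Visit(Run run);
    public abstract void Visit(Fly fly);
    public abstract void Visit(Swim swim);
}

abstract class Ability
{
    // First dispatch: the virtual call lands in the concrete ability's override.
    public abstract void Accept(AbilityVisitor visitor);
}

class Run : Ability
{
    // Second dispatch: `this` is statically a Run, so the compiler binds to
    // Visit(Run); the virtual call then selects the concrete visitor's override.
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

class Fly : Ability
{
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

class Swim : Ability
{
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

// Each concrete visitor implements one feature for the entire hierarchy.
class UseVisitor : AbilityVisitor
{
    public string Result { get; private set; } = "";
    public override void Visit(Run run) => Result = "started running";
    public override void Visit(Fly fly) => Result = "started flying";
    public override void Visit(Swim swim) => Result = "started swimming";
}

class SuspendVisitor : AbilityVisitor
{
    public string Result { get; private set; } = "";
    public override void Visit(Run run) => Result = "cannot run for now";
    public override void Visit(Fly fly) => Result = "cannot fly for now";
    public override void Visit(Swim swim) => Result = "cannot swim for now";
}

static class Program
{
    static void Main()
    {
        Ability ability = new Fly();   // the caller only knows it is some Ability
        var use = new UseVisitor();
        ability.Accept(use);           // ends up in UseVisitor.Visit(Fly)
        Console.WriteLine(use.Result);
    }
}
```

Adding a new feature means adding a new visitor class and leaving the hierarchy alone; adding a new ability class means touching `AbilityVisitor` and every concrete visitor, which is the drawback mentioned above.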

As with any design pattern, this looks deceptively simple. It looks like you could implement the entire universe by applying the visitor pattern to everything. However, you can't. I already said we have the same problem with added classes. If you have a dynamic hierarchy, if you have interfaces here and there and you don't know who's going to implement them, this is not going to work. However, when it comes to those animals, I will show you the situation in which the visitor can be applied and it will be applied pretty well.

The rest of the demonstration will be code and let's get started. This is the animal. The animal class is having a public name, but it doesn't do anything. More important parts are private. One classification, a list of environments where it can be seen and the list of abilities it has. I mean, the animal constructor requires name, classification, and one environment. Later on, we can add a new environment. We can add as many environments as we like and we can add as many abilities as we like.

These are the animals. This interface is letting me compose an animal, compose it from its features. I have prepared this static utility class in which I have shown you how you can actually compose a cow. Its name is cow. It is a mammal. It has an object of type mammal inside of it. It has an object of type ground inside of it. It has the ability to walk, which is an object walk and another object of type run, so it can run. That is the cow. You have horse, emu, ostrich, everything. All the animals are here.
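A composed animal along the lines just described might look like the sketch below. It follows the transcript for the overall shape (public name, private classification, lists of environments and abilities, a constructor taking name, classification and one environment). The fluent `Add` overloads, the `Describe` method and the `Zoo` class name are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

abstract class Classification { }
class Mammal : Classification { }
class Bird : Classification { }

// Declared in the global namespace, so it takes precedence over System.Environment.
abstract class Environment { }
class Ground : Environment { }
class Water : Environment { }

abstract class Ability { }
class Walk : Ability { }
class Run : Ability { }

class Animal
{
    public string Name { get; }

    // The traits stay private: callers cannot inspect them directly.
    private readonly Classification classification;
    private readonly List<Environment> environments = new List<Environment>();
    private readonly List<Ability> abilities = new List<Ability>();

    public Animal(string name, Classification classification, Environment environment)
    {
        Name = name;
        this.classification = classification;
        environments.Add(environment);
    }

    // Hypothetical fluent API: each Add returns the animal so calls can be chained.
    public Animal Add(Environment environment) { environments.Add(environment); return this; }
    public Animal Add(Ability ability) { abilities.Add(ability); return this; }

    // Illustrative only: reports counts without exposing the trait objects.
    public string Describe() =>
        $"{Name}: {environments.Count} environment(s), {abilities.Count} ability(ies)";
}

// Stand-in for a repository, as in the talk; real code would load from storage.
static class Zoo
{
    public static Animal Cow =>
        new Animal("Cow", new Mammal(), new Ground()).Add(new Walk()).Add(new Run());

    public static Animal Emu =>
        new Animal("Emu", new Bird(), new Ground()).Add(new Walk()).Add(new Run());

    public static IEnumerable<Animal> All => new[] { Cow, Emu };
}

static class Program
{
    static void Main()
    {
        foreach (var animal in Zoo.All)
            Console.WriteLine(animal.Describe());
    }
}
```

Because the trait fields are private, nothing outside `Animal` can ask whether the cow is a mammal, which is the encapsulation problem the visitor is brought in to solve.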

Now, of course you would never construct objects in this way. I have just tried to mimic a repository of animals. The only really important member in this utility class is this static property which is getting all the animals. You can imagine that this is the day and you have loaded all these objects and attributes from the database and constructed each of these animals. This will be the entry point for us. I have a sequence containing all the animals. Those are all these animals from the diagram plus a salmon, which I have added because it is going to provoke a bug and I will show you that in a minute.

Let's try to use these animals, all animals to perform a couple of tasks. One thing will be to find all the mammals and say hello to them. This is what I want to do. I want to iterate. The example is obviously in C#, but I think I haven't used any specific features of the language, whatever your native language is, you should be able to follow the example. Here I'm iterating through the collection of all animals and this and this.

Now, let's explain this. This is the visitor. It is a visitor which says hello to mammals. How does it do that?

This is the classification visitor. Now, a word of warning: if you have been using visitor before, or if you have learned it from the Gang of Four book or any other book on the subject, you probably didn't see an implementation like this one. I don't like to tie too much to any concrete implementation of any concrete design pattern. The point here is to recognize what is the core of the pattern, and the core of the visitor pattern is the double dispatch mechanism. That is why the pattern is also called the double dispatch pattern. If you remove everything else from any explanation of the visitor pattern, including that diagram that I've shown you, this implementation is not equal to that diagram.

Just as a demonstration, how much I do not want to tie to any concrete implementation of any design pattern. If you strip off everything else and only leave the double dispatch mechanism, then you are suddenly free to implement the design pattern in a way which is just right for your problem at hand. One of the problems that I had was that I couldn't find common ancestor for the two classes. You will see in this implementation that now we can visit a common ancestor of two features of an animal. It will be like magic.

Now, the difference between this implementation and what you could find in literature is that now I also have visit base class. Not only five concrete biological classes, but also visit the base. If you want to just see whether an animal has a classification or it is, I don't know, some alien, then you would listen to this, not to any of these.

Okay, now concrete visitors, this is the classification. This is the classification visitor there. You can see that the classification, abstract class has a virtual, which is overridable method accept. It is accepting any classification visitor and it is calling its visit this. If I choose to go to the definition. Where is go to definition here? Here it is.

It will go to concrete visit of classification. None other but this concrete one. Now, let's see more. Classification, for example, has a descendant, mammal. Mammal is derived from classification. It overrides accept and then says, base, do whatever you like, which means that we will end up in this method first, but then visit me and I am a mammal. This ends up. Look where it is if I go to definition. I'm visiting concrete visit method. I'm invoking concrete visit method, which accepts mammal, so it goes to concrete classification visitor.

Now, I want to say hello to all the mammals, so what I do is accept an animal that I have to visit, and then in this public say hello method, I say, "You animal, whoever you are, accept me, concrete visitor." Then the animal will do the magic. Let's see. This is the animal class. It says my classification, you accept this visitor. We get to a concrete classification and it might be a mammal. If it is a mammal, it will again call this visit of mammal, but not in the base visitor, but in the concrete visitor.

I don't know if you could get a grasp of what I did. This looks like magic, but that is the basic double dispatch principle, so we had concrete classification object and we have concrete classification visitor, so in the end, concrete visitor will meet concrete classification. If we ever get into this method, that means that the object animal had an object mammal inside of it, which means that we can say hello to it because it is a mammal. If we do not get into the visit of mammal method for this animal target, because we are visiting that object, it's not a mammal. That is how a visitor pattern works.
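The walk-through above could be reconstructed roughly like this. It is a sketch under assumptions, not the downloadable source: the transcript confirms a classification visitor with a visit method for the base class as well as for concrete classes, non-abstract base implementations, and a `Mammal.Accept` that calls the base first. The names `MammalGreeter`, `SayHello` and `Greetings` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

class ClassificationVisitor
{
    // Empty virtual methods: a concrete visitor overrides only what it needs.
    public virtual void Visit(Classification classification) { }
    public virtual void Visit(Mammal mammal) { }
    public virtual void Visit(Bird bird) { }
}

abstract class Classification
{
    // Here `this` is statically a Classification, so this binds to Visit(Classification).
    public virtual void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}

class Mammal : Classification
{
    public override void Accept(ClassificationVisitor visitor)
    {
        base.Accept(visitor);   // first: visit me as a plain Classification
        visitor.Visit(this);    // then: visit me as a Mammal
    }
}

class Bird : Classification
{
    public override void Accept(ClassificationVisitor visitor)
    {
        base.Accept(visitor);
        visitor.Visit(this);
    }
}

class Animal
{
    private readonly Classification classification;
    public string Name { get; }

    public Animal(string name, Classification classification)
    {
        Name = name;
        this.classification = classification;
    }

    // The animal forwards the visitor without revealing its classification.
    public void Accept(ClassificationVisitor visitor) => classification.Accept(visitor);
}

class MammalGreeter : ClassificationVisitor
{
    private string targetName = "";
    public List<string> Greetings { get; } = new List<string>();

    public void SayHello(Animal animal)
    {
        targetName = animal.Name;
        animal.Accept(this);
    }

    // Reached only when the visited animal holds a Mammal object.
    public override void Visit(Mammal mammal) => Greetings.Add($"Hello, {targetName}!");
}

static class Program
{
    static void Main()
    {
        var greeter = new MammalGreeter();
        greeter.SayHello(new Animal("Cow", new Mammal()));
        greeter.SayHello(new Animal("Emu", new Bird()));
        greeter.SayHello(new Animal("Whale", new Mammal()));
        Console.WriteLine(string.Join(" ", greeter.Greetings));
    }
}
```

Since the base visitor methods are empty rather than abstract, adding a new classification later would not force `MammalGreeter` to change.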

Difficulty with Visitor Pattern

Alex:

Okay, so if I can just maybe have a quick question. First, a note that we actually have a number of questions from our listeners, but we will answer them at the end of the presentation, so please keep asking questions. We will read them and answer then. Regarding the question I had, I think you already touched upon it in the slides. I noticed that in the base class, you basically have a method for every classification. We have the classification visitor base class. It means you need to maintain this class when you are adding new classification subclasses, right? This is some negative part of that, or that we need basically to add a method every time you-

Zoran:

Yes, yes. I have only touched that question and here's the good point to show the consequences. If you add another type to the classification hierarchy, we must add one more method like this for that new type, which has a couple of drawbacks. One is that we have to change a class which is not really related to the new type, which is bad. The other important aspect is that now everybody else would have to be aware of the new class, but I have made that a little bit easier by making these base implementations actually have no implementation. These are not abstract, so nobody has to override them. Only whoever is interested does.

If you take this mammals one, it overrides one of them, and then the net result is that if you add a new class and this visitor knows nothing about it, it has no business with that new kind of animal or new kind of classification. We won't have to change this concrete visitor. It is somewhere in between.

Alex: Yeah, yeah. You don't need to change all the classes.

Zoran:

That is the major difficulty with the visitor.

Alex: Okay, thank you.

Zoran:

Thanks. I will run this and you will see that it will actually, if you see. I hope the font is not too small, you will see that I'm saying hello actually to all the ... Oh, this is terrible. Let's go back. Alright, so I have picked cow, horse, whale, dolphin, flying squirrel, and bat. All those objects from distinct parts of that class hierarchy from the beginning are now in the same bag and they have been recognized because each of them had a mammal object inside of it and none other animal object had the mammal object inside of it. That is how the visitor works.

More Complex Examples

Okay, I will show you more complicated things. Now, I want to say hello to all the animals under the sea, so that is a completely different thing that is environment and I have environment visitor, which is visiting all of these kinds of environments and now I have a lot of water creatures, which is catching, you see who? Water. Now, water is derived from environment, but it has salt descendant and it has freshwater descendant. Now, I don't want to have to override visit freshwater and then separately have to visit saltwater. I don't want to have to duplicate logic in two places just because there are two kinds of water.

I want to catch their base class and then visit that. That is the change that I made to the original visitor pattern in this example. That is why saltwater for example, when it accepts a visitor calls, this is another implementation, just a different way to call the base, no problem. It calls visit with this as water, as a base class and then visits this as the saltwater, so we will have two invocations on the visitor for the same object. Now, let's see how it works.

I'm using the same sequence of operations. I'm just creating a different visitor, like water creature visitor, giving it each of the animals and to each of them, trying to say hello. What will happen? Now we have whale, dolphin, flying fish, catfish and two salmons, so we said hello to the salmon twice because if you take a look at the salmon animal, it has been ... In the animals, it is somewhere here. Salmon. Look, it has two environments, freshwater and that is one and the other is saltwater because it lives its entire life in the sea and then goes back to the river.

I said hello to the salmon twice.
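The salmon being greeted twice follows directly from visiting the `Water` base class. Here is a minimal sketch of that behaviour, assuming class names `Water`, `Saltwater`, `Freshwater` and a hypothetical `WaterCreatureGreeter`; the `params` constructor is a shortcut used only in this example.

```csharp
using System;
using System.Collections.Generic;

class EnvironmentVisitor
{
    public virtual void Visit(Ground ground) { }
    public virtual void Visit(Water water) { }
    public virtual void Visit(Saltwater saltwater) { }
    public virtual void Visit(Freshwater freshwater) { }
}

// Declared in the global namespace, so it takes precedence over System.Environment.
abstract class Environment
{
    public abstract void Accept(EnvironmentVisitor visitor);
}

class Ground : Environment
{
    public override void Accept(EnvironmentVisitor visitor) => visitor.Visit(this);
}

abstract class Water : Environment { }

class Saltwater : Water
{
    public override void Accept(EnvironmentVisitor visitor)
    {
        visitor.Visit((Water)this);  // reported once as the Water base class
        visitor.Visit(this);         // and once as the concrete Saltwater
    }
}

class Freshwater : Water
{
    public override void Accept(EnvironmentVisitor visitor)
    {
        visitor.Visit((Water)this);
        visitor.Visit(this);
    }
}

class Animal
{
    private readonly List<Environment> environments = new List<Environment>();
    public string Name { get; }

    public Animal(string name, params Environment[] environments)
    {
        Name = name;
        this.environments.AddRange(environments);
    }

    public void Accept(EnvironmentVisitor visitor)
    {
        foreach (var environment in environments)
            environment.Accept(visitor);
    }
}

class WaterCreatureGreeter : EnvironmentVisitor
{
    private string targetName = "";
    public List<string> Greetings { get; } = new List<string>();

    public void SayHello(Animal animal)
    {
        targetName = animal.Name;
        animal.Accept(this);
    }

    // One override covers both kinds of water, but fires once per Water object.
    public override void Visit(Water water) => Greetings.Add($"Hello, {targetName}!");
}

static class Program
{
    static void Main()
    {
        var greeter = new WaterCreatureGreeter();
        greeter.SayHello(new Animal("Cow", new Ground()));
        greeter.SayHello(new Animal("Whale", new Saltwater()));
        greeter.SayHello(new Animal("Salmon", new Freshwater(), new Saltwater()));
        Console.WriteLine(string.Join(" ", greeter.Greetings));
    }
}
```

The greeter reacts per environment object rather than per animal, so an animal with two water environments is greeted twice.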

Accumulating State of Visitor

There is one very interesting feature of the visitor pattern. It is called accumulating visitor. I will show you that in the next example. Accumulating visitor does not do its operation as soon as it can, but it accumulates an object which it has visited. I will show you on the abilities visitor. You will see that. Let me just find any of the abilities. Oh, I cannot find.

Oh, here it is. This will be the next exercise. We want to take a picture of anything that flies. Now, we have two kinds of flying. We have gliding and full flight. I'm visiting the base class again. It is fly, which derives from ability, but that class has two descendants. One is glide, as you can see, and another is full flight, so if you take an animal which can glide and fly, which knows both things, then we would again visit it twice. Well, now instead of taking a picture of this animal here, I will accumulate it in this private property and just keep going. Then I will overwrite it again when I see another flying ability in the same object, in the same target. I will overwrite it as many times as it gets into the visit fly, and take the picture only after the fact.

Here's the take picture method. After all the visitor methods have been unfolded, I will remain with this recognized target, which is either null, if I didn't recognize any flying ability or not a null. If it is not a null, then this animal has at least one flying ability and I can take a picture of it. I hope you followed this. It is another piece of convoluted logic. Not easy, not quite easy to follow. However, this is the example which is doing precisely that, and I will take a picture of each of the flying animals, including mammal bat and the squirrel which is also a mammal and a fish and two proper birds.
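An accumulating visitor of the kind described could be sketched as follows. The class name `FlyingAnimalPhotographer`, the `Pictures` list and the `params` constructor are assumptions for illustration, and the animal named "Gull" with both flying abilities is invented to exercise the double-visit case.

```csharp
using System;
using System.Collections.Generic;

class AbilityVisitor
{
    public virtual void Visit(Run run) { }
    public virtual void Visit(Fly fly) { }
    public virtual void Visit(Glide glide) { }
    public virtual void Visit(FullFlight fullFlight) { }
}

abstract class Ability
{
    public abstract void Accept(AbilityVisitor visitor);
}

class Run : Ability
{
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

abstract class Fly : Ability { }

class Glide : Fly
{
    public override void Accept(AbilityVisitor visitor)
    {
        visitor.Visit((Fly)this);   // base class first
        visitor.Visit(this);
    }
}

class FullFlight : Fly
{
    public override void Accept(AbilityVisitor visitor)
    {
        visitor.Visit((Fly)this);
        visitor.Visit(this);
    }
}

class Animal
{
    private readonly List<Ability> abilities = new List<Ability>();
    public string Name { get; }

    public Animal(string name, params Ability[] abilities)
    {
        Name = name;
        this.abilities.AddRange(abilities);
    }

    public void Accept(AbilityVisitor visitor)
    {
        foreach (var ability in abilities)
            ability.Accept(visitor);
    }
}

class FlyingAnimalPhotographer : AbilityVisitor
{
    private Animal target;
    private Animal recognizedTarget;   // stays null unless a Fly ability is seen
    public List<string> Pictures { get; } = new List<string>();

    public void TakePicture(Animal animal)
    {
        target = animal;
        recognizedTarget = null;
        animal.Accept(this);            // may call Visit(Fly) zero or more times

        // Act once, after all visits have finished.
        if (recognizedTarget != null)
            Pictures.Add($"Picture of {recognizedTarget.Name}");
    }

    // Only records the target; overwriting it repeatedly is harmless.
    public override void Visit(Fly fly) => recognizedTarget = target;
}

static class Program
{
    static void Main()
    {
        var photographer = new FlyingAnimalPhotographer();
        photographer.TakePicture(new Animal("Cow", new Run()));
        photographer.TakePicture(new Animal("Bat", new FullFlight()));
        photographer.TakePicture(new Animal("Gull", new Glide(), new FullFlight()));
        Console.WriteLine(string.Join(", ", photographer.Pictures));
    }
}
```

Separating recognition (inside `Visit`) from the action (after `Accept` returns) is what keeps an animal with two flying abilities from being photographed twice.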

I have found all kinds of flying abilities in all of these animals that I have. Now, there's another exercise which is telling all the mammals to run. Now, let me show that very quickly. I think you have already got grasp of the basic idea. I have scare mammals visitor, however, when it visits mammal, it uses another visitor to scare it, which I must explain what it means to scare an animal. It means to make it run, so there's no point in scaring an animal which would stay in place, like scaring a snail.

Scare animal is different. It is ability visitor, which is trying to see if the animal can run and then you know what it does, it calls a feature of the run object because in this object, in this method, we have a strongly typed run object and we can use the feature which only the run class has in this entire system. It's just going to say that it is now running. You will have the opportunity to download this source code and that will help you take more time to see all this convoluted logic.

I will just run this example and show you that really, the task was to scare mammals, so I have found a cow, a horse and a flying squirrel. Those are all the mammals that can run. I didn't try to scare away, oh, because it doesn't run. This would be two level visitor, so to call it.
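The two-level visitor could look roughly like the sketch below. The visitor names follow the transcript's wording (`ScareMammalsVisitor`, `ScareAnimalVisitor`); the `Run.Start` method, the shared log list and the `params` constructor are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

class ClassificationVisitor
{
    public virtual void Visit(Mammal mammal) { }
    public virtual void Visit(Bird bird) { }
}

abstract class Classification
{
    public abstract void Accept(ClassificationVisitor visitor);
}

class Mammal : Classification
{
    public override void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}

class Bird : Classification
{
    public override void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}

class AbilityVisitor
{
    public virtual void Visit(Run run) { }
    public virtual void Visit(Swim swim) { }
}

abstract class Ability
{
    public abstract void Accept(AbilityVisitor visitor);
}

class Run : Ability
{
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);

    // A feature that exists only on Run (hypothetical signature).
    public string Start(string who) => $"{who} started running";
}

class Swim : Ability
{
    public override void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}

class Animal
{
    private readonly Classification classification;
    private readonly List<Ability> abilities = new List<Ability>();
    public string Name { get; }

    public Animal(string name, Classification classification, params Ability[] abilities)
    {
        Name = name;
        this.classification = classification;
        this.abilities.AddRange(abilities);
    }

    public void Accept(ClassificationVisitor visitor) => classification.Accept(visitor);

    public void Accept(AbilityVisitor visitor)
    {
        foreach (var ability in abilities)
            ability.Accept(visitor);
    }
}

// Second level: works on abilities and uses the strongly typed Run object.
class ScareAnimalVisitor : AbilityVisitor
{
    private readonly Animal target;
    private readonly List<string> log;

    public ScareAnimalVisitor(Animal target, List<string> log)
    {
        this.target = target;
        this.log = log;
    }

    public override void Visit(Run run) => log.Add(run.Start(target.Name));
}

// First level: recognizes mammals, then hands the animal to the second visitor.
class ScareMammalsVisitor : ClassificationVisitor
{
    private Animal target;
    public List<string> Log { get; } = new List<string>();

    public void Scare(Animal animal)
    {
        target = animal;
        animal.Accept(this);
    }

    public override void Visit(Mammal mammal) =>
        target.Accept(new ScareAnimalVisitor(target, Log));
}

static class Program
{
    static void Main()
    {
        var scarer = new ScareMammalsVisitor();
        scarer.Scare(new Animal("Cow", new Mammal(), new Run()));
        scarer.Scare(new Animal("Whale", new Mammal(), new Swim()));
        scarer.Scare(new Animal("Emu", new Bird(), new Run()));
        Console.WriteLine(string.Join(", ", scarer.Log));
    }
}
```

The two visitors are tightly coupled to each other, which is the readability problem raised in the discussion that follows.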

Alex:

Before you go to the next example, can I also ask a question, because this looks interesting. This scare mammals visitor looks like it could be implemented as a chain instead, so if you have a concept of a chain of visitors, the first visitor will say, "Filter out mammals." Then you just have another visitor, and that one will, for example, tell them to run, and so you can dynamically build the chain of visitors.

Zoran: Yeah, that question is funny because that is the next task.

Alex: Okay.

Zoran:

I'm doing precisely that. Yes, this scare mammals visitor which is tightly coupled to scare animal visitor, which is then working on the abilities, this code is very hard to follow. It's very hard to understand how it works and why it works. It is basically ugly if you have to construct all these objects and match their types. It is hard to manage. If you could, I don't know, close this, encapsulate this convoluted logic into something which is easier to use, that will be great. Here's what I have prepared for you. I have prepared a solution, which says, all animals of classification mammal. I believe this looks much better than using the visitor.

Now, in this example, I just want to join all the mammal names and to print them out. Let's just find mammals. This OfClassification is an extension method on the sequence of animals. Internally, it is using some filter that I have made. Now, classification filter is the new kind of visitor that I made and it is visiting classification, some type, which is derived from classification, which I don't know in advance. It is a generic class.

Then this concrete visitor is expecting this classification and then checking if that has a runtime type, which is equal to what I expect. If it does, then add it or construct the result, which contains only that animal, which I am visiting. This is extremely ugly code and I don't ever want to see it in my custom code. However, this is the utility class and it has a great value because it can hide this class and this Enumerable extensions class which is using it. They're really ugly, but they are utilities that I will write now, test now and never touch again.

I won't have to rely on any visitor which is just looking for a classification of an animal. This is done forever. An interesting aspect of this solution is that it has ultimately given up the idea of the visitor because I am receiving the type T, which I don't know what it is and I'm not visiting any concrete classification, so you could even I don't know. This is not the visitor, although it derives from the visitor class, but now, you can see how good it is to have something like that because when I have all the animals, I can say of classification mammal and it will return me. You can see, it will return a sequence of animals, but now I know that only the mammals will be in that sequence and it really looks like magic.

I can pick names of all those animals that I have got back and those will be the names of mammals. Let's run it and it will be correct, you see, only cow, horse, whale, dolphin, flying squirrel and bat, in a chained sequence of calls. The last example in this presentation will be the same thing, only to scare the mammals. Now I have a filter of mammals. I have another extension method which is using an ability filter. I'm looking for the ability to run, which is ability of T. I don't know which ability it is. It works exactly the same as the mammal filter or the classification filter. It is the same idea behind and the same run time implementation. At this point, after the classification filter, I still have a sequence of animals.

Now it comes with use ability. I want to use ability run and this method is asking me to give a lambda, which receives an animal, which would run, and the run object. Guess what I am doing? I'm telling that now I'm scaring away this animal. I have access to a concrete animal object which can run and is a mammal, and I can use its own internal feature to make it run.

Now, this is magic. This is something that is not violating encapsulation. It has those problems if you imagine a new kind of animal. Alright, it has its limitations, but this is really magic. I can run this and it says, "Alright, cow, horse, and flying squirrel." Each of you run and they say, "We started running," so this is obviously working and it is certainly readable.
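The chained calls described above can be approximated with a few extension methods. This is a sketch under assumptions: only the name `OfClassification` appears in the transcript, while `WithAbility`, `UseAbility`, `ClassificationFilter<T>`, `AbilityFinder<T>` and `Run.Start` are hypothetical names. The sketch also uses an `is T` check, which accepts subclasses of `T`, where the talk mentions comparing runtime types for equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ClassificationVisitor { public virtual void Visit(Classification classification) { } }
abstract class Classification
{
    public void Accept(ClassificationVisitor visitor) => visitor.Visit(this);
}
class Mammal : Classification { }
class Bird : Classification { }

class AbilityVisitor { public virtual void Visit(Ability ability) { } }
abstract class Ability
{
    public void Accept(AbilityVisitor visitor) => visitor.Visit(this);
}
class Run : Ability { public string Start(string who) => $"{who} started running"; }
class Swim : Ability { }

class Animal
{
    private readonly Classification classification;
    private readonly List<Ability> abilities;
    public string Name { get; }

    public Animal(string name, Classification classification, params Ability[] abilities)
    {
        Name = name;
        this.classification = classification;
        this.abilities = abilities.ToList();
    }

    public void Accept(ClassificationVisitor visitor) => classification.Accept(visitor);
    public void Accept(AbilityVisitor visitor) => abilities.ForEach(a => a.Accept(visitor));
}

// Generic visitors: written once, then hidden behind the extension methods.
class ClassificationFilter<T> : ClassificationVisitor where T : Classification
{
    public bool Matched { get; private set; }
    public override void Visit(Classification classification)
    {
        if (classification is T) Matched = true;
    }
}

class AbilityFinder<T> : AbilityVisitor where T : Ability
{
    public T Found { get; private set; }
    public override void Visit(Ability ability)
    {
        if (ability is T match) Found = match;
    }
}

static class AnimalExtensions
{
    public static IEnumerable<Animal> OfClassification<T>(this IEnumerable<Animal> animals)
        where T : Classification =>
        animals.Where(animal =>
        {
            var filter = new ClassificationFilter<T>();
            animal.Accept(filter);
            return filter.Matched;
        });

    private static T Find<T>(Animal animal) where T : Ability
    {
        var finder = new AbilityFinder<T>();
        animal.Accept(finder);
        return finder.Found;
    }

    public static IEnumerable<Animal> WithAbility<T>(this IEnumerable<Animal> animals)
        where T : Ability =>
        animals.Where(animal => Find<T>(animal) != null);

    // Hands the caller both the animal and its strongly typed ability object.
    public static void UseAbility<T>(this IEnumerable<Animal> animals, Action<Animal, T> use)
        where T : Ability
    {
        foreach (var animal in animals)
        {
            T ability = Find<T>(animal);
            if (ability != null) use(animal, ability);
        }
    }
}

static class Program
{
    public static List<Animal> SampleAnimals() => new List<Animal>
    {
        new Animal("Cow", new Mammal(), new Run()),
        new Animal("Whale", new Mammal(), new Swim()),
        new Animal("Emu", new Bird(), new Run()),
    };

    static void Main()
    {
        var animals = SampleAnimals();
        Console.WriteLine(string.Join(", ", animals.OfClassification<Mammal>().Select(a => a.Name)));
        animals.OfClassification<Mammal>()
               .WithAbility<Run>()
               .UseAbility<Run>((animal, run) => Console.WriteLine(run.Start(animal.Name)));
    }
}
```

The calling code never sees a visitor, yet the `Animal` fields remain private; the cost is that the generic filters give up per-type `Visit` overloads and fall back on a runtime type test.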

I would stop the presentation at this point. I hope you have enjoyed it. Now you can ask a few questions and all the questions that we cannot answer right now will be answered in textual form after the webinar.

 

Q&A

Q: Is it possible to implement the same interface, called IMammal? This is regarding mammal and water mammal, when you tried to cast to each of the concrete types. This is about using interfaces in this hierarchy of animals.

A: Yes, that is a good question and that is what many people actually do and let me disappoint you. They do that to postpone the disaster in fact. Here's the problem. Basically the question is related to single inheritance languages and all this presentation was in one of those, C#, and the same stands for Java. Now, if you do implement interfaces on classes that are not related to a common ancestor, you can do that, but then you have to implement the same logic twice in them. So the first point is that you still have duplicate logic, only it has moved to classes. You're not going to reduce your code.

The second thing is that very often, you do not have an interface which is the same for two distinct members of the large hierarchy, because doing stuff to objects may involve a sequence of calls, a calling protocol to complete an operation. Well, that calling protocol is not going to be the same in one and the other case and you will actually not have the common interface. It might get you to some extent and then you will end up in quicksand, actually. You won't be able to implement more features.

Q: Why do you show such unreal examples? No serious developer would create objects in property getters?

A: Yeah, because I didn't have a database underneath. So I just constructed objects in those getters to have them. You can ignore all those property getters and only view the last one, which returns the sequence of them all. That is the only one which I actually used. I know it's not realistic, but it resembles a repository, for example.

Q: How do you persist encapsulated domain entities to database without breaking encapsulation? 

A: Yes, you have many problems if you try to persist a domain model when the domain model becomes complicated. If you have got to the point where you cannot extend your huge domain model anymore, then you have already been suffering a lot with persisting that model before. The answer is that, probably at some prior stage, you would already have separated the domain model from the persistence model. The persistence model would then break encapsulation and let the data out, and the domain model would try to expose only methods.

Q: How to control if visitor always matches concrete object and there is no implicit conversion?

A: Visitor is designed to match concrete objects. If there is the need to support out-of-band objects, like objects from one hierarchy that can be converted to objects from a different hierarchy, then the compiler will have hard time trying to understand which method of the visitor to invoke.

Major selling point of the Visitor pattern is that it makes obvious to the compiler which method to invoke in every possible situation. The variation – which concrete implementation of a method to invoke – has been turned into an object, a visitor object. That makes it possible to reach enormous flexibility at run time, while remaining within premises of strong types all the time. Not so many designs can do that.
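To make the compile-time binding concrete, here is a minimal sketch of the double dispatch behind the Visitor pattern. The class names (`Cow`, `Dolphin`, `DescribeVisitor`) are illustrative assumptions, not the classes used in the webinar.

```csharp
using System;

// Abstract visitor: one overload per concrete class in the fixed hierarchy
public abstract class AnimalVisitor
{
    public abstract void Visit(Cow cow);
    public abstract void Visit(Dolphin dolphin);
}

public abstract class Animal
{
    // First dispatch: a virtual call selects the concrete animal
    public abstract void Accept(AnimalVisitor visitor);
}

public class Cow : Animal
{
    // Inside Cow, 'this' is statically a Cow, so the compiler binds to Visit(Cow)
    public override void Accept(AnimalVisitor visitor) => visitor.Visit(this);
}

public class Dolphin : Animal
{
    // Same text, but here the compiler binds to Visit(Dolphin)
    public override void Accept(AnimalVisitor visitor) => visitor.Visit(this);
}

// Second dispatch: a virtual call selects the concrete visitor
public class DescribeVisitor : AnimalVisitor
{
    public string Result { get; private set; } = "";

    public override void Visit(Cow cow) => Result = "cow walks";
    public override void Visit(Dolphin dolphin) => Result = "dolphin swims";
}
```

Calling `Accept` on a variable of the base type `Animal` that holds a `Dolphin` ends up in `Visit(Dolphin)` with no casts and no type checks in the client code.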

Q: What happened to simplicity? Seems like you are making an alternative to LINQ-to-Objects to avoid having to deal with a solution that you call ugly. It seems over-complicated.

A: There are no elegant solutions in complex domains, and certainly no simple ones. It’s only different places to hold complexity. The goal of the game is to make client code simple.

In Visitor pattern, concrete visitors will encapsulate complexity. Chaining calls approach has only saved the caller from having to know the visitor protocol.

One alternative was mentioned by a viewer: Make classes with no common base class implement common interfaces. This is a viable solution unless distinct classes come with different method dependencies and different calling protocols. At that moment, common interface idea is off.

Another alternative is Chain of Responsibility pattern, which I have applied to this same problem in the Tactical Design Patterns in .NET: Managing Responsibilities course on Pluralsight. It works nicely, but fails to differentiate subtle cases, like what happens if an object contains more than one component of requested type.

The worst options are bare hierarchy of classes and public composition (in which the object exposes its content). Class hierarchy promotes code duplication. Public composition prevents future maintenance of the composed class.

Q: Would it be possible for OfClassification<T> method to return an IEnumerable<T> instead of IEnumerable<Animal>?

A: Classification objects do not reference their containing Animal – that was the design decision. It might be better to expect OfClassification<T> to return IEnumerable<Tuple<Animal, T>>, which then gives entire information to the consumer.

Another question here is how will this method behave if one animal object contains multiple classification objects of the same type – will it return multiple records then? That is one of the difficulties in approach I have shown in the webinar.
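One possible shape of such a method is sketched below. Everything except the name `OfClassification<T>` is an assumption made for illustration (the `PairWith<T>` helper, the component classes and the private list of components), so treat it as a sketch rather than the webinar's implementation. It also shows the multiple-records behavior raised above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classification components; they hold no reference to the Animal
public class Mammal { }
public class Run { }

public class Animal
{
    private readonly List<object> classifications;

    public string Name { get; }

    public Animal(string name, params object[] classifications)
    {
        Name = name;
        this.classifications = classifications.ToList();
    }

    // Pairs this animal with each of its components of type T
    public IEnumerable<Tuple<Animal, T>> PairWith<T>() =>
        classifications.OfType<T>().Select(c => Tuple.Create(this, c));
}

public static class AnimalSequenceExtensions
{
    // Returns one (animal, component) record per matching component,
    // so an animal with two components of type T appears twice
    public static IEnumerable<Tuple<Animal, T>> OfClassification<T>(
        this IEnumerable<Animal> animals) =>
        animals.SelectMany(animal => animal.PairWith<T>());
}
```

With this shape the consumer gets both the component and its owner, at the price of duplicate animals whenever one animal carries several components of the requested type.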

Q: Why not use dynamic to implement double dispatch instead of the visitor pattern?

A: Because you don’t know whether the dynamic object will even have the desired feature (like the Start() method which only exists on the Run class). Fundamental feature of double dispatch mechanism is that it lets two concrete objects meet each other.
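The risk with `dynamic` can be shown in a few lines. In this sketch the `Swim` class and the `DynamicDemo` helper are hypothetical; `RuntimeBinderException` is the exception the C# runtime binder throws when a member cannot be resolved.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public class Run
{
    public string Start() => "started running";
}

// Hypothetical component that has no Start() method
public class Swim { }

public static class DynamicDemo
{
    public static string TryStart(object ability)
    {
        dynamic target = ability;
        try
        {
            // Compiles for any object; the member lookup happens only at run time
            return target.Start();
        }
        catch (RuntimeBinderException)
        {
            // The compiler could not warn us that Swim has no Start()
            return "no Start() method";
        }
    }
}
```

`DynamicDemo.TryStart(new Swim())` compiles without complaint and fails only when executed, which is exactly the kind of mistake that a visitor turns into a compile-time error.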

Q: How to apply the Visitor pattern and composition altogether with dependency injection?

A: Visitor can be injected using common DI techniques. It boils down to selecting a concrete visitor, which is exactly what DI frameworks are good at. There are subtle issues here, though, but they are not hard to resolve. For example, I have requested target objects in all visitor constructors. That would have to be redesigned to fit DI, but as I said, it’s not too hard.
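A minimal sketch of an injected visitor, using plain constructor injection and no particular DI framework, might look like this. The `Farm` consumer and `SoundVisitor` are hypothetical names; note that the visitor here does not request target objects in its constructor, which is the kind of redesign mentioned above.

```csharp
using System;
using System.Collections.Generic;

public abstract class AnimalVisitor
{
    public abstract void Visit(Cow cow);
    public abstract void Visit(Horse horse);
}

public abstract class Animal
{
    public abstract void Accept(AnimalVisitor visitor);
}

public class Cow : Animal
{
    public override void Accept(AnimalVisitor visitor) => visitor.Visit(this);
}

public class Horse : Animal
{
    public override void Accept(AnimalVisitor visitor) => visitor.Visit(this);
}

// One concrete visitor; a container would choose which one to supply
public class SoundVisitor : AnimalVisitor
{
    public List<string> Sounds { get; } = new List<string>();

    public override void Visit(Cow cow) => Sounds.Add("moo");
    public override void Visit(Horse horse) => Sounds.Add("neigh");
}

// The consumer depends only on the abstract visitor, received through its constructor
public class Farm
{
    private readonly AnimalVisitor visitor;

    public Farm(AnimalVisitor visitor)
    {
        this.visitor = visitor ?? throw new ArgumentNullException(nameof(visitor));
    }

    public void Process(IEnumerable<Animal> animals)
    {
        foreach (Animal animal in animals)
            animal.Accept(visitor);
    }
}
```

Selecting the concrete visitor then becomes a registration decision in whichever container is used, and `Farm` stays unaware of it.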

Composition does not fit the DI pattern since there is no object that implements certain type. It is rather an object which is constructed in certain way. We could argue that composite objects are closer to the Builder pattern which codifies the process of building a complex object. In that respect, IoC container (which implements DI) would rather resemble Abstract Factory, which is of lower complexity level compared to the Builder.

Q: How do you persist encapsulated domain entities to database without breaking encapsulation?

A: The persistence problem is not related to application of Visitor pattern, or object composition, as it exists in class hierarchies as well.

There are two levels at which we can attack it. One is to use ORM which can access private data members of a class. Entity Framework and Hibernate can both access private members. You can use that approach to some extent.

In really complex domains differences between OO model and relational model are significant. When you see that persistence is affecting domain model in adverse ways, it is probably time to separate domain model from persistence model. You can then apply a mapping framework to convert proper OO model into flat persistence model and vice versa.
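As a rough illustration of that separation, the sketch below keeps an encapsulated domain class next to a flat persistence class and maps between them by hand. The `Account` and `AccountRecord` classes are hypothetical, and a mapping framework could take the place of the two hand-written mapping methods.

```csharp
using System;

// Flat persistence model: plain data, no behavior, safe for an ORM to fill
public class AccountRecord
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
}

// Domain model: state is private, only behavior is public
public class Account
{
    private readonly int id;
    private decimal balance;

    private Account(int id, decimal balance)
    {
        this.id = id;
        this.balance = balance;
    }

    public static Account Open(int id) => new Account(id, 0m);

    public void Deposit(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentException("Amount must be positive.", nameof(amount));
        balance += amount;
    }

    // Mapping boundary: the only places where state crosses between the two models
    public AccountRecord ToRecord() => new AccountRecord { Id = id, Balance = balance };

    public static Account FromRecord(AccountRecord record) =>
        new Account(record.Id, record.Balance);
}
```

The rest of the application works with `Account` and its methods only, while the storage layer sees nothing but `AccountRecord`.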

Q: Can we call this pattern in borderlines of SOLID principles?

A: Depending on application, Visitor pattern may be seen as enforcement to SOLID principles, or as an adversary. Here are some guidelines on that:

S – Single responsibility – Each concrete visitor deals with only one feature. Visitor pattern enforces SRP more than the original hierarchy of classes which were doing more than one thing each.

O – Open for extension, closed for modification – If you have a fixed hierarchy, then you will never have to modify abstract visitor. That is the primary niche of the Visitor pattern. It is natively applicable to fixed hierarchies – like classes of animal species, or types of bank accounts, etc. It also supports easy extension, because extending the system boils down to implementing new concrete visitor.

L – Liskov substitution principle – Methods of original concrete classes are now moved to concrete visitors. LSP is violated if a derived class adds method preconditions. Therefore, if original classes had new preconditions added, then concrete visitors will have to add them too. Conversely, Visitor pattern will violate LSP to exactly the same extent as original classes did.

I – Interface segregation – Original class can implement one interface per concrete operation. With Visitor pattern, each concrete visitor would represent one such interface. Segregation then boils down to selection of a visitor object, which is even more flexible than compile-time interface segregation. This means that concrete visitors will satisfy ISP at least to the same extent as original classes did.

D – Dependency inversion – Since each concrete visitor is a separate class, and all visitors share the same base type, it is possible to inject them as polymorphic dependencies. That is the premise of inversion of control principle. Also, this was the case with original classes which had the same feature. Therefore, DI is supported by both the class hierarchy and by the corresponding hierarchy of visitors.

Conclusion is that the only true difficulty with the Visitor pattern is when the hierarchy of classes is not stable. If we have to add new classes to the hierarchy, then we will have troubles implementing and maintaining the hierarchy of visitors.

Q: How will application performance be affected by this approach when we need to create several methods?

A: From source code perspective nothing will change. Solution will contain exactly the same number of methods, only they will be moved to different classes.

From run time performance point of view, there is only a negligible penalty. Where one virtual method used to be called, now we will have two virtual methods called. That does not give sufficient motivation to consider performance loss due to application of the Visitor pattern.

About the speaker, Zoran Horvat

Zoran Horvat

After fifteen years in development and after leading a dozen development teams, Zoran has turned to training, publishing and recording video courses for Pluralsight, eager to share practical experience with fellow programmers. Zoran is currently working as CEO and principal consultant at Coding Helmet Consultancy, a startup he has established to streamline in-house training and production of video courses. Zoran's blog.