Starting with the premise that "Performance is a Feature", Matt Warren will show you how to measure performance, what to measure, and how to get the best performance from your .NET code.

We will look at real-world examples from the Roslyn code-base and StackOverflow (the product), including how the .NET Garbage Collector needs to be tamed!

Watch the webinar to learn:

  • Why we should care about performance
  • Pitfalls to avoid when measuring performance
  • How the .NET Garbage Collector can hurt performance
  • Real-world performance lessons from Open-source code

You can watch Performance is a Feature on Vimeo.

You can find the slide deck here: http://www.slideshare.net/sharpcrafters/adsa-69947835

Video Content

  1. Why Does Performance Matter? (4:36)
  2. What to Measure? (11:48)
  3. When to Measure? (21:06)
  4. How to Identify Performance Issues? (25:31)
  5. Benchmark.net Alternatives (36:41)
  6. StringConcat versus StringBuilder (41:49)
  7. Garbage Collection (46:21)
  8. Stack Overflow Performance Lessons (50:04)
  9. Roslyn Performance Lessons (50:59)
  10. Q&A (56:52)

Webinar Transcript

Hi. Good afternoon. My name is Matt Warren and this is a webinar I'm doing alongside PostSharp. We also have Tony from PostSharp on the webinar, who will handle answering your questions. Let's make a start. Just to start things off, these are my details. I'm on Twitter, like most people, and I have a blog where I write about similar things to this talk, certainly around the idea of performance and the internals of .NET. That's the kind of thing I talk about a lot.

I currently have a little poll on my Twitter account, if people want to take a look at that at some point in the next half an hour. I'd like to get some idea of how many of people's current projects have performance requirements and how those sorts of things work out. If you have the chance, go to my Twitter account, see the poll there and answer it; it would be good to get an idea of how these things play out on different people's projects. That's me. Let's get into the main part of the presentation.

I have to put this upfront and admit that, unfortunately, I'm not eloquent enough to have come up with this really nice title of Performance is a Feature — mostly because you can type it into Google and this is the first result returned. If you're not familiar with the Coding Horror blog, it's Jeff Atwood's — he's one of the founders of Stack Overflow — and that's where I first heard the term; his post is probably one of its most popular recent uses. It's where we're going with this talk. He talks about it in his blog post, and this talk covers the same ideas: we treat security as a feature, we treat usability as a feature, we obviously treat functionality as a feature, otherwise there's not much left. But do we treat performance as a feature? Should we treat performance as a feature? What does it look like if we do? That's where we're going with this talk today.

Just to give a little bit more context: there's obviously a whole range of areas within a general .NET application, whether it's a web application, a client application or something else. Many different levels are involved: the UI, whether that's on a phone, in an app or your web UI; quite a lot of the time a database and caching layer; and then the .NET CLR itself.

This talk looks at performance within the .NET CLR, the specifics of that level and what sits there. There are a lot of resources out there on the other areas — there are great books on getting better front-end performance, and database and caching advice is fairly standard stuff as well. We'll touch on those sides of things. The bottom box, if people aren't familiar with it, is this idea of mechanical sympathy, which actually comes from motor racing originally. I don't know if any of you are into your cars, petrol-heads. This guy here, Guy Martin, popularized the quote. He says, basically, you've got to have a level of mechanical sympathy, don't you, or otherwise you're just a bull in a china shop.

The basic idea in motor racing is that to be a good driver, you have to understand the mechanics of the car; you have to have sympathy for the mechanics of the car to get the best out of it. A guy called Martin Thompson co-opted this term. He's mostly from the Java space and his blog is called Mechanical Sympathy. If you want to find out more about performance at the level below the CLR — we'll be talking about things like CPU caches and stuff that's almost outside the CLR — then mechanical sympathy is a good term, and that blog is a good place to start to find out what's going on there.

On to the agenda we'll be covering today. We start with why does performance matter: why should we take performance seriously, why should we care about it? Then what we need to measure as part of that, and how we can fix the issues — with some real-world examples of how some of these issues were fixed, what can be done, and where we need to worry about these types of performance issues. So: why, what and how.

Why Does Performance Matter?

Why should we care about performance? Why do we need to take performance seriously? I think there are a few reasons. One is that, in this day and age of a lot of things being cloud hosted, there's actually a monetary saving to be made. If you were able to improve the performance of your application by 20% or 30%, that might mean you could go to your boss on Monday morning and say, "We can save on our hosting bill, or AWS, or whatever it might be."

I don't know what sort of relationship you have with your boss and whether saying that is going to get any money passed back to you, but potentially there are savings for the company anyway. Even if you're not in a hosted situation like that and you're speccing machines yourself, you can still spec lower-cost machines or things like that. So there's the idea of saving money.

I think another one is saving power, particularly around constrained devices: phones, tablets, these sorts of things. There's a level where saving power is very useful for our users. It makes for happy users, if you like. I don't know how many of you have installed an app on your phone and within a week got rid of it because you realized it was draining your battery 10 times faster than without the app. So there's that idea.

The other end of the scale is, I guess, people like Google and Amazon who are hosting data centers, and for them every amount of power they can save is even more vital. We're probably not at either extreme, but somewhere in the middle. Again, good performance equals a saving of power.

I think one of the main ones is that, for a lot of our users, bad performance basically equals broken. We might, as developers, understand that in reality the page is just loading slowly, or that the button click takes a long time to render the response, because of bad performance — but users don't really think in those terms. They just think this site's running slowly, this site's not responsive, this app takes too long; every time I click a button I get frustrated. For them, bad performance equals broken. The worst end of that is customers who don't come back, or customers who never buy our products, or maybe just unhappy customers. Either way, it's not a good experience for our customers. Bad performance, at some level, equals broken for our customers.

A classic example of this: Google did a study where they artificially introduced a half-second delay, and for them that caused a 20% drop-off in traffic. Obviously, that's the extreme end, but maybe there's a level at which it's a problem for us too. Maybe the customer demo goes badly wrong because of bad performance and the customer never buys your product, or maybe a fairly influential person on Twitter has a bad experience with your product, tweets about it and gives you some bad press, whatever it might be. We're probably not going to see the same level of drop-off in traffic as Google, but there's some level, I think, at which we're going to have lost customers or unhappy customers — customers who aren't going to buy our products, or aren't going to buy again, that type of thing.

So there are a few reasons. Maybe all of them apply to the sorts of products you work on, maybe just some of them, but those are reasons why we should be taking performance seriously. I think another one, as well, is almost a matter of pride for us as software developers. This quote is from a guy called Henry Petroski, an engineer who has written a lot of books about mechanical and civil engineering, a professor at Duke University. It says, basically, that we — and a lot of you on this webinar are part of the software industry — are doing our level best to cancel out the steady gains of the hardware industry. We're probably not being that deliberate about it, but the idea applies. Hardware has generally been getting faster. We're not seeing the same single-core CPU increases any more, but with multi-core, what hardware can do is generally still increasing at a fairly large rate.

But potentially, software is treading water, or cancelling that out. We sort of know that ourselves, don't we? We get annoyed with the latest version of Word and say, "Oh, on my old 386 PC, Word 98 was lightning fast."

The new version of Word on my quad-core PC with SSDs and stuff is running ridiculously slowly. We kind of know this. And it wouldn't be a talk about performance without the famous Donald Knuth quote, that premature optimization is the root of all evil. That may be true, but a lot of the time that quote is actually misquoted, as you can probably guess where this is going. The entire quote looks like this: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

So yes, there are times when the warning against premature optimization is valid — when we're optimizing for the wrong reasons, optimizing just for the sake of it, whatever it might be. But the quote is also saying that there are times when there are opportunities: the critical 3%. It's interesting because it implies, in a way, that we need to measure. What is the critical 3%? What is the 97% we can ignore? We need to know these things; we can't guess at them. At the very least, if people are going to quote that premature optimization is the root of all evil, it's nice to know the full quote around it, where it fits in and what he was actually saying.

One bit to sum this all up for me: there's a developer called Rico Mariani, an architect at Microsoft, who did a lot of work on, I believe, the version of Visual Studio where they added a lot of WPF capability and lots of nice UI, but it slowed things down a lot. I believe he had a hand in making the performance of that better. He sums it up like this: "Never give up your performance accidentally."

There's always going to be a trade-off. We don't always have the time to make everything run as fast as possible, and that's not always a worthwhile endeavor. There's a point where the performance is good enough, for business reasons, for customer reasons, whatever it might be. But at least let's not give up our performance accidentally. Let's know where these places are, have them measured, and make sure we understand: yes, this bit of the code is not as fast as it could be, but we've measured it, we understand it's as fast as we need it to be for our situation, and any extra optimizations, we believe, would take too much time or not be worth it. We're not treating this blindly; we're saying we're going to understand where we might have performance issues, and be deliberate about the places we do and don't fix them.

What Do We Need to Measure?

So, on to the what. I have a little section in this talk about how averages are bad. I don't want to just flash something up in red in the webinar and leave it there, so I'm going to explain it in a bit more detail: generally, when we're measuring, averages aren't bad as such, but they don't give us the whole picture. I'm going to demonstrate that now and give it a little bit more context.

Think back to your math days at school, if that's the last time you did math — or maybe for some of you this is more familiar. This is what's often known as a normal distribution, a nice bell curve, and here the average sits right in the middle, at the peak of the curve. Because it's a normal distribution, we know that 95 out of 100 people — 95% — are going to fall within the dark blue area right in the middle. If you're wondering, that strange circle symbol is sigma, the standard deviation. We know that only four out of 100 people are going to fall further out, at the extremes, more than two standard deviations away, plus or minus, and we know that only three out of 1,000 will fall into the very end pieces, the pink pieces right at the end.

So given an average value in this sort of scenario, the average, in effect, just tells us where the middle of the curve sits. If the data is normally distributed, we know what the tails look like. Whether we're measuring response times on a web page or rendering times in an application, whatever it might be, if it fits a normal distribution, we know there aren't going to be big outliers — and outliers are the problem, because outliers mean some people are having a really bad experience. So given the average value, we have a rough idea of where all the values might fall. Unfortunately, this doesn't cover all scenarios. To put it a different way — I always like a good quote, and this guy, Hans Rosling, who I'll talk a bit more about in a moment, came up with a fantastic one: "Most people have more than the average number of legs."

I'll give you a little while to process that. I'm not going to do the math — I haven't got a whiteboard or anything — but the rough math is basically this: in the population of any country, there are a lot of people with two legs, and a much smaller number of people with fewer than two legs, for a variety of reasons I won't go into, but you can imagine them. If we then calculate the average — the total number of legs divided by the number of people — we're going to get a number less than two: 1.99, 1.998, whatever it might be, but less than two. Yet we've said that by far the majority of people have two legs. So most people have more than the average number of legs. It's just a way of showing that sometimes, in certain situations, averages can be misleading and not give us the whole picture.
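A quick sanity check of that claim, with made-up numbers rather than real census data:

```csharp
// A toy check of the quote (the population figures are invented for illustration):
// 1,000 people, 5 of whom have one leg, the other 995 have two.
using System;

double averageLegs = (995 * 2 + 5 * 1) / 1000.0;
Console.WriteLine(averageLegs); // 1.995 — yet 995 of the 1,000 people have 2 legs, above the average
```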

Hans Rosling, just as a very short aside: if you're into stats, or not at all into stats but want to learn a bit more, he has some amazing TED talks — there are links at the bottom of the screen, or you can search for Hans Rosling TED talks. He has a fantastic way of bringing statistics alive in a way that few people do. If you've ever seen a talk with a guy jumping around, pointing at bubbles on the screen, you've seen one of his talks and I'm sure they'll be familiar to you.

But we're not generally measuring numbers of legs. That's a nice quote, a nice aside, and it shows the point, but for some of us at least, we're actually measuring something like this. These are response times of a web page, though it could apply to a variety of other scenarios; we're just going to focus on this one. Again, in case you don't remember your math because you haven't done it since school: histograms are, in effect, buckets. The very left-hand bar is the bucket from zero to five milliseconds, in this case, and we know that 21 responses fell into that bucket. We don't know where they fell within the zero to five milliseconds, just that there were 21. The next bucket is five to 10, then 10 to 15, and so on across the scale along the bottom, and the height of each bar is the count for that bucket. In this case, most items fell between 20 and 25, if I'm reading it right — that bar is 31 high.

These response times are actually quite a classic scenario. The reason we have the large number of response times on the left, happening in under 40 milliseconds — that blob of bars on the left-hand side — is because those requests are hitting the cache, in effect. They're fetching the value very quickly out of an in-memory cache, or wherever it might be, some level of caching. Most of our responses hit the cache — I think in this scenario it's around five out of every six, or maybe six out of every seven, something like that. Anyway, the majority of them. We can see that from the graph.

We can actually see quite clearly in this case that our cache is working. The little group on the right-hand side, taking around 100 to 140 milliseconds, is the requests that don't hit the cache, because the value isn't in the cache, in effect. We then have to do a network call, for instance, to go and get the value from a back-end service, or make a database call, or something. That's why there's nothing much going on in the middle: the majority hit the cache very quickly, without a network call or with a much quicker lookup, and the rest take longer.

Now that I've explained it — or even from the point I showed the slide — a lot of you can understand this. You can see whether this is acceptable for your users or not. You say, "Well actually, yeah, the caching is definitely working; the majority of our hits are going to the cache, and that's what we want."

We can also see that no one's getting a response over about 130 milliseconds, the final bar, and by the time you get down there, there are very few users — only one, I think, got up around 125. We know that's the worst case, and that's often what we care about, the worst-case scenario. I've done this backwards deliberately: this is the histogram, and if you were to try and imagine what the average would be — I'm not going to ask for questions and answers on this, and it may be hard to work backwards, so I'll help you out — the average value of this is 38.3. We've gone from the more detailed, more informative histogram to the average; that's fine, we can see that. But if I'd worked the other way around, if I'd given you the average first and only the average, not the histogram, you might not have imagined that this was how the response times panned out. You might not have imagined that some people were getting response times of over 100 milliseconds when I told you the average was 38.3.

You might have imagined the nice bell curve we talked about a few slides back, trailing off nicely with by far the majority clustered around 38. Actually, the story is a lot more complex than that, and for those into their math, this is known as bimodal: there are two modes to it. It's not a normal distribution. Normal distributions happen in things like the height or weight of a population; response times of applications don't often fit a normal distribution. That's the main reason why, when we're measuring performance — particularly things like website response times or application render times — we need to look at things like histograms.
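For the curious, bucketing raw timings into a histogram like the one described is only a few lines (the sample data and 5 ms bucket size here are made up to match the chart's shape):

```csharp
// A minimal sketch of turning raw response times into histogram buckets.
using System;
using System.Linq;

double[] responseTimesMs = { 3.2, 7.1, 22.5, 23.9, 24.6, 104.0, 131.5 };
const double bucketSize = 5.0;

var buckets = responseTimesMs
    .GroupBy(t => (int)(t / bucketSize)) // which 5 ms bucket each time falls into
    .OrderBy(g => g.Key);

foreach (var bucket in buckets)
{
    double start = bucket.Key * bucketSize;
    Console.WriteLine($"{start,4:F0}-{start + bucketSize,3:F0} ms : {bucket.Count()}");
}
```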

Histograms are all well and good, but they take a bit of space to plot, and they don't help for tracking things over time. So what else is there? This is from the Application Insights Analytics tool from Microsoft Azure — and you can actually use it outside of Azure as well. It uses what are called percentiles. Not to get too much into the math side of things, I know that's not everyone's cup of tea, but very simply: to get percentiles, you take all the responses in a certain period of time and rank them from lowest to highest. The 95th percentile, if you had 100 responses, would be the 95th one in that ranking — 95% of responses are at or below that value. That's what it means.

Again, it tells you how people are experiencing things at the worst end. Responses can never be lower than zero, that's fixed, but they can tail off at the top. The 95th percentile, the 99th percentile — this is linked to the idea of five nines and four nines and all that sort of stuff. This graph shows it quite nicely, because the huge peak we see about a third of the way in from the right-hand side only shows up at the high percentile; it's completely lost at the other percentiles and would be lost in the average. Whether that matters or not is another discussion, but we can certainly see that some number of customers had a very bad experience at that particular time. If I were only showing the average, we'd see, in effect, the flat bar across the bottom, pretty much the 50th percentile, which is very similar to the average. This is why a lot of tools take the idea behind histograms and display it as percentiles: because percentiles can be tracked over time.
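To make the definition concrete, here's a minimal sketch of a percentile calculation using the simple nearest-rank method (real monitoring tools use more sophisticated estimators):

```csharp
// Rank the values lowest to highest and take the one at the requested rank.
using System;
using System.Linq;

static double Percentile(double[] values, double percentile)
{
    double[] sorted = values.OrderBy(v => v).ToArray();
    int rank = (int)Math.Ceiling(percentile / 100.0 * sorted.Length) - 1;
    return sorted[Math.Max(rank, 0)];
}

// With 100 response times, Percentile(times, 95) returns the 95th-ranked one:
// 95% of responses were at or below this value.
```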

When Should We Measure Performance?

This leads us on to when. When should we be measuring this? When should we be looking at this sort of thing? (Hopefully you've had the chance to answer my Twitter poll; I'll see later how the responses match up in terms of performance requirements.) I would argue that for a lot of this, we need to be doing it in production. If you've done web apps, or apps on phones, or these sorts of things, you can do all the testing you want with all the different handsets and all the different browsers, but there's always that one person with the one setup you couldn't possibly have imagined, which all your fantastic test team and all the work you did before production wouldn't have shown up — or you'd have had to try 10 times harder and the cost would have been prohibitive.

As useful as it is to measure this stuff before production, at some level you want to use a tool — and there are lots of tools out there that allow this — to see it in production. The other way to argue it is that your users are seeing this. If there's a performance problem in production, a user, or several users, is seeing it, and you'd like to know before they tell you — because they possibly won't. They might just never come back. Some level of monitoring in production, wherever possible, is a good thing to have.

I would also say that you're unlikely to see perf issues, or you'll see very few of them, in unit testing. This is not to knock unit testing in any way. I think it's a fantastic tool for what it does, which is testing a unit of your program and making sure the functionality works. You can use unit testing frameworks to give you some idea about performance, but you'd want to be writing a different type of test. The basic reason is that a lot of the time in unit tests we put in some mock data, and that mock data is just enough to make the test exercise the path we want. It's unlikely to be the same amount of data that runs through our production systems. An algorithm that works fine for 10 items in a list might fall over and have horrible performance when there are 1,000 items in the list. That's why you'll generally see no, or very few, performance issues during unit testing.
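To make that last point concrete, here's a hedged, hypothetical example — not from any real codebase — of code whose unit tests pass happily but whose performance collapses at production data sizes:

```csharp
// This de-duplication is O(n²), because List.Contains scans the whole list
// on every call. A unit test with 10 mock items passes instantly; the same
// code fed 100,000 production items can take seconds and dominate a request.
using System.Collections.Generic;

static List<string> Deduplicate(IEnumerable<string> input)
{
    var seen = new List<string>();
    foreach (string item in input)
    {
        if (!seen.Contains(item)) // O(n) scan per item -> O(n²) overall
            seen.Add(item);
    }
    return seen;
}
// Swapping the List<string> for a HashSet<string> makes each lookup O(1)
// and the whole method O(n) — but only measuring reveals that you need to.
```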

Also, to my first point, I don't think you'll see all performance issues in development either, or you'll have to try very hard to get to that level. There are always going to be ones that only come up outside of development, in production. You can have a great testing team who test out a lot of things — and I've seen examples where that works — but there are always times where things come out in production that you couldn't have imagined otherwise.

Tony: Excuse me, Matt, I have a question here.

Matt: Sure.

Tony:

Can you show us some real-world examples of when unit tests did not catch a performance issue, and the issue was only seen in production, using some performance test or performance measurement?

Matt:

Yeah, I had one on a previous project we worked on. We were using an off-the-shelf IoC container, but we were customizing it a bit for our needs, and it worked absolutely fine in our unit testing; it worked fine when a single person was testing the application. But as soon as we put any load through the system, it fell over, because we were using it in the wrong way, in effect. It looked absolutely fine at all stages of our development until we did our real long-term perf tests over multiple days, and over that time it showed up as, in effect, a huge memory leak, with a pretty bad knock-on effect on our response times and things like that. I've definitely seen that happen, and it's a classic example. Everything looked fine under small load: the app hadn't been running long enough, there wasn't much usage of the IoC container, and it wasn't getting exercised in unit tests because they started up with a clean container each time. But when it had been running for a while and we had a more realistic load through it, we definitely saw a big difference. Fortunately, in that case, we caught it in pre-production.

Tony: Okay. Thank you.

How to Identify Performance Issues?

Okay, so how can we go about this? How can we identify performance issues? Measure, measure, measure. Measure once, measure twice, however you want to think about it. You really need to be measuring this sort of stuff, but more than that: you want to measure to identify the bottlenecks in the first place — we'll talk about some tools that can help with that in a moment. Equally important, you want to measure to verify that the optimization works. A lot of us have built up knowledge over the years of things that are, or aren't, more performant, in .NET particularly or in other frameworks, and some of those things that were true five years ago, in that version of the framework, aren't true nowadays.

You don't want to just blindly apply what you think is an optimization; you want to measure. Measure at the beginning, measure during, and certainly measure afterwards, to verify that the optimizations work — get away from the idea of blindly applying stuff we may have read elsewhere. As for the tools we can use to do this: one of the best I've come across is a tool called MiniProfiler. The development team at Stack Overflow built it for themselves and then, fortunately for the rest of us, made it available. It's a great tool. Initially, when you run it, you don't get this whole popup, just the little red section in the top right-hand corner. It integrates with ASP.NET MVC web applications — it integrates with a whole range of things, actually: there are versions for Ruby, and you can run it in console applications. It's quite a wide-ranging tool, but the initial, or main, use case is ASP.NET.

It puts this little render into the top right-hand corner of your pages when you have it turned on — or just for certain users, if you want your developers to see it but not your customers, however you want to set that up. That gives you the page rendering time. The quote at the bottom really sums up for me why it's so useful: having that in the top right-hand corner during development is pretty useful for developers. I know I'd much rather see straight away that I've made a certain page slower by something I've just changed. I'd like to see it before anyone else does, and certainly before it goes to production. The numbers are up front, not in some log that developers need to go and look at; every time the page is rendered, it's there.

It gives you more than just the total time. It gives you a great drill-down — we can see here SQL calls and page render times. It gives you quite detailed information about the parts of the MVC pipeline: rendering pages, the actual controller action. You can insert your own timings if there are bits of your code you particularly want a number for, and it will tell you the time for database calls. It integrates into things like Entity Framework and, I believe, other ORMs as well, with wrappers to give you this. One other thing, which isn't completely obvious: down in the bottom right-hand corner, you've got this "sql" marked in red, and that's telling you when you have things like select N+1 queries or duplicated queries. It's a pretty informative tool.
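Inserting your own timings, as mentioned above, looks roughly like this sketch (the step names and the code inside them are illustrative):

```csharp
// MiniProfiler's Step API wraps a block of code in a named timing that
// shows up in the drill-down; Step() is null-safe, so this is harmless
// for users who have profiling turned off.
using StackExchange.Profiling;

var profiler = MiniProfiler.Current; // null when profiling is disabled
using (profiler.Step("Load customer"))
{
    // ... database call ...
}
using (profiler.Step("Render sidebar"))
{
    // ... view rendering ...
}
```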

I know for a fact that Stack Overflow run this in production. We don't get to see it when we visit the site, but their developers see it on any page when they visit. They also store the numbers used to create this rendering in aggregate, so they can query them and come back to look at them later. I believe it's not for every request but for some sample of the requests; either way, they're happy having this running in production, and I agree. I think it's quite a useful tool, and there's a lot of information you can get from it. So check out MiniProfiler — the site explains its features in more detail.

Again from Stack Overflow, there's another tool they've made available: their Opserver monitoring tool. There are lots of monitoring tools that give you this sort of dashboard; I picked this one because it's open source, and because Stack Overflow — which I believe is a top-50 website, certainly a very high-traffic one in terms of page views — wrote their own tool because existing ones didn't quite fit their needs. It's at least well suited to the type of scenarios they're in, and it runs on a busy website. You can go and see more about Opserver there. It integrates quite nicely with MiniProfiler, and what I quite like, as this screenshot shows, is that they actually run MiniProfiler on their own Opserver tool — you can see it in the top left-hand corner — to make sure the Opserver pages themselves render reasonably quickly, which I guess makes sense. If the dashboard renders slowly because of performance issues, it's not going to help you as a dashboard; you want it updating reasonably quickly, or reasonably frequently.

At some point — particularly as this talk focuses on performance inside the CLR, and we'll come to the real-world examples in a moment — you get to the topic of micro-benchmarks. I would always say that with these, you want to profile first to identify the places that are an issue, and then do your micro-benchmarks. The problem with micro-benchmarks is you can get into a situation where you pick some bit of code you think is running slowly, you run a micro-benchmark and say, "Oh yeah, that's running in 20 milliseconds," or whatever it might be.

You then make that bit of code run 10 times faster, or whatever it might be, but you lose the context of where it fits in the application. You lose any acknowledgment of whether it's a part of your application that runs repeatedly, or a part that runs just once a day. Does the optimization you've made have any effect on the real production system, or is it just quicker in a micro-benchmark?

So whilst micro-benchmarks are useful tools, by their definition they lose the context of where the code runs in the whole system. I would always say you want to start with profiling first.

Tony:

Excuse me, can I ask one question about profiling here? When we use a profiler, it will certainly show us a lot of issues, and there's also this 80/20 rule in software development telling us that by modifying 20% of the code we usually solve 80% of the troubles. Does that apply in this case too? Which problems would you recommend solving?

Matt:

It's a good question. Generally, you should always start with the most expensive thing, the thing at the top of the profile. You should fix that first, and the reason is quite simple: if you fix that one first — if it's one you can fix — it might make some of the other ones go away, because they might have been dependent on the first one. So if at all possible, start with the most expensive thing, which fits in with the 80/20 rule, if you like: the thing that's taking the time. A lot of the time when I've seen performance issues, there's one thing that stands out as the worst, and you should make your best effort to fix that first, if at all possible. Then, once you've optimized it, run your tests again and see.

A similar idea applies generally: if you're going to bother profiling, you shouldn't just pick the thing you most want to fix. Start with the thing that's taking the most time, at the top, get rid of that first, and then see what the profile looks like after that.

Tony: Okay, thanks.

Matt:

Okay, I'll just briefly flash this up. It's a little bit tongue in cheek, but the idea is this: a lot of presentations show code samples and expect you to go away and use them. In this presentation, it's almost the opposite. I will show you some code samples of performance issues — the before and after, and the changes that were made — but hopefully what I've got across already is that I really don't want people to go away and blindly change their code because it was in my presentation. I'm showing you the tools, and some of the areas of code that can be performance bottlenecks, but you should certainly not change any of your code based on things you see on the slides coming up unless you've identified that as a bottleneck for your particular application.

The reason, mostly, is that a lot of the time high-performance code is harder to read, harder to understand and less intuitive — if it wasn't, you probably would have written it that way in the first place. It means you're potentially making the code base worse, as a general thing, for the sake of performance, and if that's needless, it's not a good thing to be doing. That's what I hope — half tongue in cheek but half seriously — people take away, particularly when you see some of the things later on: you shouldn't be changing code just because I told you to.

On the subject of micro-benchmarks: I work on a library called Benchmark.net — there's a nice write-up about it on Scott Hanselman's blog — together with a guy called Andrey and a guy called Adam, and we have attempted to make a library that makes micro-benchmarking as easy as possible for you. I'm going to click on to the next slide to show an example. This is a benchmark of reflection. As with most of the other tools available, really what Benchmark.net asks is that you write, in plain functions, the code you want to benchmark. You put the Benchmark attribute on them — plus some other settings; Baseline is set to true on one of them in this case — and the last bit of code asks Benchmark.net to run the benchmark. That's as simple as we want it to be. It then does the work of giving you accurate numbers, in a nice format, and things we'll show later on — trying to get an accurate representation of the functions you're asking it to run.
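As a rough sketch, reconstructed from the description that follows, the benchmark looks something like this (the exact class and names on the slide may differ):

```csharp
// A minimal Benchmark.net benchmark: regular property access vs. reflection.
using System;
using System.Reflection;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ReflectionVsPropertyCall
{
    private readonly Uri uri = new Uri("http://example.com/path");
    private readonly PropertyInfo hostProperty = typeof(Uri).GetProperty("Host");

    [Benchmark(Baseline = true)]
    public string RegularPropertyCall() => uri.Host;

    [Benchmark]
    public string ReflectionPropertyCall() => (string)hostProperty.GetValue(uri);
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<ReflectionVsPropertyCall>();
}
```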

Benchmark.net Alternatives

Tony:

Could you please mention some other tools similar to benchmark.net and tell us briefly how benchmark.net compares to them?

Matt:

Yeah, sure. A few that I've come across: there's one called NBench, which is from the people who made Akka.NET. It's interesting, actually, because as far as I understand the story, they had a particular release of their software with a performance regression, and they wanted to make sure that didn't happen again, so they devised NBench, which is focused around writing performance tests. They look like unit tests and they run as part of a unit test runner, but they are tailored, or designed, to be performance tests. As I said before, you're probably not going to catch performance issues by accident with a unit test, but you can if you craft a specific performance test, and that's what they've done.

Those run on every build, and they have assertions in there that say, "Did this particular code take longer than x milliseconds? Does this bit of code allocate more than this much memory?"

I believe you can also do, "Does this bit of code run faster than this other bit of code?"

Anyway, they have those sorts of ideas, and those tests will fail if they run too slowly. The idea is to pick up performance regressions. Benchmark.net is not focused around that; it's more of a console running tool. It could be extended, but we don't have that at the moment. That's the main difference from NBench.

There's also another tool called xunit.performance, which is used by some of the Microsoft projects like Roslyn and CoreFX. They have a similar idea: it runs on every build, they've crafted performance tests for specific parts of the code, and they want to spot regressions. xunit.performance gives them tracking over time, so they can see how fast these bits of code run in this build and that build, and whether over time they're regressing — getting slower or faster. The main issue with these sorts of tools is that you need to make sure you're running on the same hardware if you're going to compare runs like that.

Another tool I should mention is etimo.benchmarks, which I only learned about the other day; it's very similar, much more aligned to benchmark.net. There are other tools out there too. Most of them — all the ones I've come across — do the accuracy bit. That's not straightforward, but they all do it, otherwise they wouldn't be worth using, because you want accurate results. They just vary in their focus: whether it's performance tests that fail a build, or running in a console to try things out and get results that way. Benchmark.net has its own focus.

Tony: Okay, thank you.

Matt:

Yeah, no problem. I picked this example because reflection is often talked about, and we say, "Reflection is slow."

We're just looking here at the System.Uri object. We do a regular property call — uri.Host — in the first benchmark, and in the second one we get the same value via reflection, and we see the difference. Just a really quick recap: in Benchmark.net we report results all the way down to nanoseconds, a billionth of a second; there are also microseconds and milliseconds, just as a quick refresher on the terminology if you're not familiar with it. A lot of the time, when you're talking about stuff inside the CLR, you can easily get down into nanoseconds — it's not inconceivable to have stuff running in the nanosecond range — and hopefully the tools will report that.

So what does this look like for the benchmark we talked about before? This shows the output of Benchmark.net — as I said, we mostly focus on console output, although we provide tables and stuff you can paste elsewhere to get help. These are the numbers: the regular property call comes in at 13 nanoseconds and reflection comes in at 230, so reflection is clearly slower; there's no argument about that. It's roughly 18 times slower. In this case, the regular property on Uri is not the simplest of properties — there's a bit more going on — so the regular property call is a bit slower than what you'd expect if it was just directly reading a backing field, which skews things a little. Anyway, you get a rough idea of the timings here; that's why in Benchmark.net we like to report the scaled number as well as the absolute timings. So yes, reflection is definitely slower. How much that matters depends on how often you're doing it and in what way. This is just a simple property access; if you're doing more complex reflection, that's going to add up, and if you're doing it lots of times a second, that's going to add up. The idea is not to say the general advice is wrong — reflection is slower — but how much slower, and whether it matters, is worth figuring out for your scenario.

StringConcat vs. StringBuilder 

On to another one: StringConcat versus StringBuilder. It's always said that you should be using StringBuilder, and that's a great general rule — we're not going to argue against it — but in terms of performance, what's the difference? Interestingly enough, there's a link to a Roslyn issue at the bottom about introducing StringBuilder in more places, and you can read the issue if you're interested to see where that went, because generally StringBuilder is better: you're doing fewer allocations. The issue with string concat, in this case, is that each time around the loop we're creating a new string and throwing away the previous one. Strings in .NET are immutable. If you can concat them all in one go, great; but if you concat like this in a loop, you're taking a string containing "0", adding a string containing "1" to it, and making a new string, "01". Next time around, it takes "01" and adds "2", and so on. Basically, there's a lot of waste in this particular example of string concat.
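A minimal sketch of the two patterns being compared — the loop count here is arbitrary:

```csharp
// Concatenation in a loop: each += allocates a brand-new string and the
// previous one becomes garbage for the GC to collect.
using System.Text;

string concat = string.Empty;
for (int i = 0; i < 100; i++)
{
    concat += i.ToString();
}

// StringBuilder: appends into an internal buffer, far fewer temporary allocations.
var builder = new StringBuilder();
for (int i = 0; i < 100; i++)
{
    builder.Append(i);
}
string result = builder.ToString(); // the full string is only materialized once, at the end
```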

StringBuilder doesn't have that issue: it only builds the full string up at the end, when we call ToString. Anyway, what does the actual difference look like? This is output from Benchmark.net. It shows you a lot of detailed information around the allocations — I realize it's hard to look at; I'm just briefly showing the raw stuff we give you, and much more useful are some graphs that show it in a better way. Basically, the long and short of it is that in a lot of cases the performance isn't hugely different, but there comes a point, depending on how many times you're concatenating strings, where the performance of StringBuilder is way better. That's mostly because it's doing fewer temporary allocations, so there's less work for the garbage collector to do. The raw difference is not huge; it's all about those temporary strings.

This is one example, and it's potentially a contrived one, because we're concatenating a lot of very small strings. If you were to concat a small number of larger strings, your results would be different. The point is not to take this as a general rule, but to measure this sort of stuff when it matters, and get some truth around these ideas of whether StringBuilder is better than string concat.

Tony:

Since this kind of performance issue is visible directly in the source code, should we consider it when doing code reviews?

Matt:

Yes. I think the general rule of using StringBuilder over string concat is a good one, and it should generally be applied, basically because — if we flick back to the code — you're not making the code more complex. You're not using some hand-written class to squeeze out absolute performance; StringBuilder is built into the .NET runtime and is tailor-made for this sort of thing. This is a good example of the whole premature-optimization-or-not idea: we want to be writing the best code we can from the start, based on our knowledge and best practice. So I think this one is good ground for a code review, but I would say be careful going beyond it. StringConcat versus StringBuilder is on quite solid ground; but go back to the reflection example. To say blindly that we should never use reflection — well, sometimes you have no choice, so that balances that one. In other situations it's: how much slower is reflection? Is it a worthwhile trade-off in our case? Banning reflection at the code review stage ... like a lot of these things, it's a balance. Coming back to your actual question: with StringBuilder versus string concat, there's no real downside to changing the code. It's no more complex, and you're not writing code that won't be understandable by someone else, things like that, so I think it's worthwhile there. I guess the thing to do is make sure you have someone who's taken the time to check these things and understand what's going on in different scenarios, so there's a bit more knowledge and data behind it — that would be my recommendation.

Tony: Okay. Thank you.

Garbage Collection 

On to some other things. We touched on it in the benchmark just now, but the .NET Garbage Collector is a fantastic aid to programming on the .NET runtime. It takes away so many issues you have to worry about in languages, or runtimes, that don't have one, and it's true that allocating is very cheap. The main issue is the cleaning up afterwards: to make allocations cheap, the .NET GC has to do work in the background. It has to compact; it has to search for objects that are still alive, that sort of stuff. It's sometimes difficult to measure the impact, because some of that work is, in effect, asynchronous: the Garbage Collector doesn't kick in at the point where you write the allocating code. It kicks in when it feels it needs to, and at that point you might get GC pauses. That's the main issue.

There are a few tools that can help you understand when there's an excessive amount of GC. Very simply, with Perfmon or the Sysinternals tools, you can watch the "% Time in GC" counter. It's hard to put an exact number on it, but I would say once you get above 50% of the time in GC, there's a problem, because you're spending more time doing garbage collection than running your program. I've heard numbers saying 10% or above in GC is another cause for concern. Certainly, a sustained high amount of time in GC is a red flag and you'd want to investigate further.
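If you want to watch that counter from code rather than in Perfmon, one way looks like this (the instance name is a placeholder for your own process):

```csharp
// Read the same "% Time in GC" counter that Perfmon shows.
using System;
using System.Diagnostics;
using System.Threading;

var timeInGc = new PerformanceCounter(".NET CLR Memory", "% Time in GC", "MyApp");
timeInGc.NextValue();          // the first read always returns 0, so prime the counter
Thread.Sleep(1000);            // let it gather a sample interval
Console.WriteLine($"% Time in GC: {timeInGc.NextValue():F1}%");
```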

Another tool that lets you investigate further is PerfView. I always say PerfView wins the prize for being the most useful but possibly ugliest-looking tool. It's at that end of the scale: I'm sure we've all used tools that look amazing but give you no useful functionality, and PerfView is the complete opposite. Don't be put off by how it looks. It has a functional UI; it does exactly what it needs to, and it can give you some very useful, low-level information. It works on top of ETW events — Event Tracing for Windows — and it's designed to be very fast. They do say it can be used on production apps for short periods of time with minimal impact. That's not to say you'd want it turned on all the time, but you can turn it on for a while for an investigation. Please do test it out before turning it on against your production app, though.

In terms of GC, what it gives us in the chart at the bottom is the max pause time, and GC pause time is time when your application isn't running. Here we have quite a small max pause of eight milliseconds, though it varies a bit with the GC mode — workstation or server, background or foreground. At certain points in time, the GC kicks in and stops the world, or nearly stops the world, and while it's doing that, none of your code can run. If that pause takes 100 milliseconds and your SLA is 100 milliseconds, you've blown it, because of GC pauses: any responses or button clicks that were in flight at that moment are paused until the GC has finished.

Fortunately, over the last few releases of .NET, the GC has gained more and more features, so it does more and more of this in the background — the server GC mode now has a background mode in recent .NET versions. So the times when your entire application is paused are becoming less and less frequent, but it's still definitely a possibility.

Stack Overflow saw these huge spikes in GC pauses — at least one second, up to four seconds — and they generally render their pages in under 100 milliseconds. So for them, this was a bad experience for users. You can see the link at the bottom for the full details of what happened and how they fixed it.

Stack Overflow Performance Lessons 

There are also some nice performance lessons from Stack Overflow. Somewhat controversially, they say to use static classes: for them, the performance benefit of static classes versus instantiating classes all the time was found to have a measurable impact on their application.

I'm not, again, saying that's a general rule, but for them it worked well. They're also not afraid to write their own tools when off-the-shelf tools don't give them what they need, or don't give them the performance they need: Dapper, their micro-ORM with very high performance; Jil, a JSON serializer, again tuned for high performance; and MiniProfiler, which we talked about before. For a lot of this, you need to understand the platform. The CLR is not a black box; there's stuff going on in there, particularly around the garbage collector and things like that, and if you want to get the most performance out of .NET, you need to try to understand what's going on.

Roslyn Performance Lessons 

Again, on to code samples, and to finish, this last section talks through some examples from the Roslyn codebase. There's an entire talk on this — this is just a small sample; you can see the link at the bottom. The thing I find most interesting is that these are the people who write C#, the C# compiler team, and some of the stuff they came up against is interesting because some of it then fed back into the language. They were seeing these as performance issues in the product they were writing, the Roslyn C# compiler, and so, where applicable, they fed the fixes back into the language.

With all these examples, you can assume it's a bit of code that's running a lot. This is a logger class, and the fix in this case — what they changed — addressed the boxing: they added the ToString calls you can see there. There are details in the pull request linked at the bottom explaining what's going on and giving a bit more context, including how, in some ways, this has since been addressed in the compiler, but not in all cases — the link explains it better. What's interesting here is that if you use ReSharper or similar tools, they'll tell you to remove the ToString calls because they are technically redundant — but not if you care about the boxing.
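As a hedged sketch of the pattern being described (the names are illustrative, not Roslyn's actual logger):

```csharp
public static class Logger
{
    public static string FormatEntry(int id, bool enabled)
    {
        // Before: int and bool are value types, so passing them as object
        // arguments to string.Format boxes them — two small heap allocations per call.
        // return string.Format("Id = {0}, Enabled = {1}", id, enabled);

        // After: calling ToString() first avoids the boxing. Tools like ReSharper
        // flag these ToString() calls as redundant, but removing them brings the boxing back.
        return string.Format("Id = {0}, Enabled = {1}", id.ToString(), enabled.ToString());
    }
}
```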

Really, this isn't one you should apply unless you've profiled. It makes the code, I guess, uglier; it's not intuitive why you're doing it, and unless that code is being called a lot, the overhead of boxing won't be noticeable. But it is one they came across in Roslyn.

Another one they came across as a performance issue: this is finding a matching symbol in the compiler, and you can assume it's being called a lot of times over a period of time. The Roslyn compiler isn't just running when we build our projects in Visual Studio; it's constantly running in the background of Visual Studio to power IntelliSense and syntax highlighting. So bits of Roslyn are, in effect, running continuously in the background whilst we develop in Visual Studio as well.

Interestingly, their fix in this case was to not use LINQ. This is really the one I would hate for anyone to go away and take as "remove LINQ". I think LINQ is a fantastic feature. It makes code much more understandable and concise; almost anyone can read LINQ, though it takes a bit more understanding, potentially. But there's an overhead to LINQ. It doesn't come for free: there's stuff going on with the compiler to make it possible, stuff going on in the background — basically, extra allocations. They found that the old iterative way, a simple foreach loop doing the same thing, worked better for them in this case.
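A sketch of the shape of that change, with hypothetical names rather than Roslyn's actual code:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SymbolLookup
{
    // LINQ version: concise, but each call allocates a closure for the lambda
    // (it captures 'name') plus an enumerator — gen 0 garbage on every call.
    public static string FindWithLinq(List<string> symbols, string name)
        => symbols.FirstOrDefault(s => s == name);

    // Iterative version: the same behavior, with no allocations.
    public static string FindIteratively(List<string> symbols, string name)
    {
        foreach (string s in symbols)
        {
            if (s == name)
                return s;
        }
        return null;
    }
}
```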

And to give you an idea, the actual difference in timing between the iterative version and the LINQ version isn't huge — it isn't even double in terms of raw speed. The main issue is the number of gen 0 collections, basically caused by the number of allocations. The iterative version will almost never allocate; the LINQ version almost always will, and it's those allocations that make the garbage collector do more work and, if you call this sort of code enough, cause the difference in times.

The final one from Roslyn. Again, this is a bit of code that runs a lot. It's working with generic types, and it's already using StringBuilder, which, as we discussed before, is best practice: it's not allocating lots of temporary strings, it's building the string up bit by bit and only creating the final string at the end. In terms of string processing, this is about as efficient as you can currently get in .NET. But they found that, in this case, the allocation of a new StringBuilder object was costing them every time, if you do it a lot. StringBuilder, by default, I think pre-sizes to around 16 characters, so there's a certain number of bytes allocated every time you make a StringBuilder, even before you've added anything to it. So they made a simple object-pooling cache, and I'll show you the code they used to do that. The pool is a thread-static field: when they acquire a StringBuilder, they look for the one on the current thread; if it's not there, because it hasn't been created yet, they create a new one, clear it out and return it. When they've finished with it, they call GetStringAndRelease and put it, in effect, back in the cache. It's not a pool in the classic sense, because it's per thread and there's a single one.
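A simplified sketch of that cache, based on the StringBuilderCache pattern in the Roslyn/.NET sources (details simplified here):

```csharp
using System;
using System.Text;

internal static class StringBuilderCache
{
    [ThreadStatic]
    private static StringBuilder cachedInstance;   // one cached builder per thread

    public static StringBuilder Acquire()
    {
        StringBuilder sb = cachedInstance;
        if (sb != null)
        {
            cachedInstance = null;   // take it off the thread while it's in use
            sb.Clear();              // return it to a known, empty state
            return sb;
        }
        return new StringBuilder();  // nothing cached on this thread yet
    }

    public static string GetStringAndRelease(StringBuilder sb)
    {
        string result = sb.ToString();
        cachedInstance = sb;         // put it back in the per-thread cache
        return result;
    }
}
```

Usage then follows the acquire/release pairing: `var sb = StringBuilderCache.Acquire();` ... `return StringBuilderCache.GetStringAndRelease(sb);`.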

There are a couple of things to call out before people even consider object pooling. One is that if you're going to pool objects, you need to make sure you return them to a clear state when you're finished with them, because when we allocate new objects, that's normally done for us behind the scenes and we get an object in a known state: zeroed out, nulled out if you like. With StringBuilder that's nice and easy, because you can call Clear; in other cases it's not. This is a simple cache because it's thread-static, so it's per thread. The downside is that it stays on that thread for the lifetime of the thread — the StringBuilder is still there — so if you have lots of threads, you have lots of StringBuilders. The other solution often used is a global object pool, but that requires locking and is much more complex. There are trade-offs in both scenarios, and you need to be careful: you don't want your object pool growing too large and storing too much, because there's a cost to storing it. It has to be tuned and thought about before being implemented.

That brings me to the end of the talk. Hopefully it's been useful. I don't know if any questions have come up as I've been going along.

Q&A

Q: Can benchmark.net be used in continuous integration, e.g. to fail a build?

A: It's not something we currently support out of the box with the main product. It's something we keep thinking about adding, and there have been a few community contributions getting us there, but we're not there yet; it's certainly something we plan to have. You can take the raw results benchmark.net gives you and certainly build that yourself, but it's not something we provide as yet. The main issue with providing it is that if you're going to do the whole thing, you need to worry about storing your results, running on consistent hardware and a lot of other stuff that's outside the scope of benchmark.net. It may be something we get in the future, but it's not something you can do straightaway with benchmark.net. You can certainly use benchmark.net to give you the raw numbers and then build that tooling on top of it, yes.

Q: How is benchmarking different from finding the difference of time span between the start and end of a function?

A: How does benchmark.net do it differently? One of the main things is we use Stopwatch, which is a bit more accurate than just relying on TimeSpan. We also run the code multiple times. There aren't lots of things you have to do, but you certainly want to call the function once first to let it be jitted, because you pay the cost of jitting a function the first time it's called. You don't want to measure that as part of the real performance, because it only happens once; you want to get it out of the way. Then you generally want to run the function multiple times in batches and take the timings of the batches, because if we're talking about something that takes nanoseconds, you can't just measure a single call with a before and after.

You need to run it multiple times in a batch until you can actually record the length of the batch, and then work out the per-iteration time. It's doing a bit more, but it basically boils down to running a function multiple times, and jitting it first of all. The one other thing we do concerns the JIT compiler in .NET: if it sees that you're calling a function but not doing anything with the result, it might remove the call, saying there's no need to do that because it's dead code; it doesn't feed into anything. We make sure that doesn't happen, so that the code you think you're benchmarking is actually benchmarked. Those are the main sorts of things we do.
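
As a minimal sketch of what that looks like in practice with BenchmarkDotNet (assuming the current attribute-based API): marking methods with [Benchmark] gets you the warm-up, batching and dead-code protection described above, and returning the result gives the framework a value to consume:

    // Minimal BenchmarkDotNet usage sketch comparing string concatenation
    // with StringBuilder. Returning the result prevents the JIT from
    // eliminating the benchmarked code as dead.
    using System.Text;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    public class StringConcatVsStringBuilder
    {
        [Benchmark]
        public string Concat()
        {
            string s = string.Empty;
            for (int i = 0; i < 100; i++)
                s += "x";
            return s;
        }

        [Benchmark]
        public string Builder()
        {
            var sb = new StringBuilder();
            for (int i = 0; i < 100; i++)
                sb.Append("x");
            return sb.ToString();
        }
    }

    class Program
    {
        static void Main() => BenchmarkRunner.Run<StringConcatVsStringBuilder>();
    }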

Q: Does the LINQ performance problem arise only in Roslyn, or does it also apply to older .NET platforms?

A: With regards to LINQ, there are no major differences in the Roslyn compiler compared to the older one, so performance issues can exist in LINQ across all .NET compiler versions. The example was included because it was a change made to the Roslyn code base itself, not to the code Roslyn produces when compiling LINQ.

Q: What is the best setting for GC on Windows Server 2012 R2 hosting 250 ASP.NET MVC apps on IIS? I am talking about gcConcurrent and gcServer in the aspnet.config file.

A: Generally, server mode is best when you are running on a server, but you actually don't need to do anything, because it's the default mode for ASP.NET apps. From https://msdn.microsoft.com/en-us/library/ee787088(v=vs.110).aspx:
You can also specify server garbage collection with unmanaged hosting interfaces. Note that ASP.NET and SQL Server enable server garbage collection automatically if your application is hosted inside one of these environments.
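
For reference, this is what those settings look like; they live in the <runtime> section of the configuration file, and are only needed when your host doesn't already enable server GC for you:

    <configuration>
      <runtime>
        <gcServer enabled="true"/>
        <gcConcurrent enabled="true"/>
      </runtime>
    </configuration>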

Q: Do we get hints on how to measure, or which tools are recommended for measuring performance issues?

A: I really like PerfView. It takes a while to find your way around it, but it's worth it. This post by Ben Watson will help you get started: http://www.philosophicalgeek.com/2012/07/16/how-to-debug-gc-issues-using-perfview/, plus these tutorials: https://channel9.msdn.com/Series/PerfView-Tutorial. I've also used the JetBrains and Red Gate profiling tools, and they are all very good.

Q: What other techniques can be applied for embedded .NET, single-use applications, to avoid unpredictable GC hesitation? Consider that the embedded device is not memory constrained.

A: Cut down your unnecessary allocations. Take a look with a tool like PerfView (or any other .NET profiler) to see what's being allocated and whether you can remove it. This post by Ben Watson will help you get started with PerfView: http://www.philosophicalgeek.com/2012/07/16/how-to-debug-gc-issues-using-perfview/. PerfView will also tell you how long the GC pauses are, so you can confirm whether they are really impacting your application or not.

Q: How is benchmarking different from finding the difference of timespan between start and end of a function?

A: BenchmarkDotNet does a few things to make its timings as accurate as they can be (see the sketch after this list):

  1. Use Stopwatch rather than TimeSpan, as it's more accurate and has less overhead.
  2. Call the [Benchmark] method once, outside the timer, so that the one-time cost of JITting the method is not included in the timings.
  3. Call the [Benchmark] method several times, in a loop. Even Stopwatch has limited granularity, which has to be accounted for when the method only takes a few nanoseconds to execute.
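
A hand-rolled version of those three points might look like the following sketch: warm up once outside the timer, time a large batch and divide, and consume each result so the JIT can't eliminate the call:

    // Sketch of manual micro-benchmarking: warm-up, batching, and a
    // "sink" so the measured call is never dead code.
    using System;
    using System.Diagnostics;

    class ManualTiming
    {
        static int Work() => 21 * 2;   // stand-in for the code under test

        static void Main()
        {
            const int iterations = 10_000_000;

            Work();                    // warm-up: pay the JIT cost outside the timer

            var sw = Stopwatch.StartNew();
            long sink = 0;
            for (int i = 0; i < iterations; i++)
                sink += Work();        // consume the result
            sw.Stop();

            double nsPerOp = sw.Elapsed.TotalMilliseconds * 1_000_000 / iterations;
            Console.WriteLine($"{nsPerOp:F2} ns/op (sink={sink})");
        }
    }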

Q: This is not a question, but an answer to the earlier question about examples of unit tests not showing performance issues: we need to load data from Azure / SQL Server in production, but in unit tests we have a mock service that responds immediately.

A: Thanks, that's another great example. A mock service is going to perform much quicker than a real service!

Q: What measure could point me in the right direction to know that I could benefit from object pooling?

A: See Tip 3 and Tip 4 in this blog post by Ben Watson http://www.philosophicalgeek.com/2012/06/04/4-essential-tips-for-high-performance-garbage-collection-on-servers/ for a really great discussion on ‘object pooling'.

Q: How does benchmarking work on asynchronous code?

A: Currently we don’t do anything special in BenchmarkDotNet to help you benchmark asynchronous code. It’s actually a really hard thing to do accurately and so we are waiting till after the 1.0 release before we tackle it, sorry!


About the speaker, Matt Warren

Matt Warren

Matt is a C# dev who loves nothing more than finding and fixing performance issues. He's worked with Azure, ASP.NET MVC and WinForms on projects such as a website for storing government weather data, medical monitoring devices and an inspection system that ensured kegs of beer didn't leak! He's an open-source contributor to BenchmarkDotNet and RavenDB. Matt currently works on the C# production profiler at CA and blogs at www.mattwarren.org.


How do you avoid the dreaded "this is not what we asked for" and ensure customer satisfaction when building a new system?

In this webinar, Dino Esposito demonstrates a top-down methodology, sometimes mistaken for plain common sense and often boldly ignored, called UX-Driven Design (UXDD).

UXDD means coming to a visual agreement with customers by using wireframing tools to iterate on sketches of the new system before building it. Then, rather than building the system from the data model, you proceed in a top-down fashion instead. The resulting system may be slow or even inefficient but it will never be the “wrong” system! In addition, UXDD leads to clarity on a few of today’s most popular patterns that are sometimes difficult to understand like CQRS and Event Sourcing.

Watch the webinar and learn:

  • An introduction to Wireframing tools
  • Proven ways to save on post-first deployment costs
  • Insights into better customer relationships
  • Better focus on development without 'analysis paralysis'

Building Better Architecture with UX-Driven Design on Vimeo.

You can find the slide deck here: https://www.slideshare.net/sharpcrafters/building-better-architecture-with-uxdriven-design

Video Content

  1. Need for Two Architect Roles (10:12)
  2. UX-Driven Design in Three Steps (19:02)
  3. UXDD Summary (24:35)
  4. Related Terminology (32:33)
  5. Three Levels of Prototyping (37:05)
  6. Bottom-Up vs. Top-Down Approach (44:27)
  7. Q&A (53:46)

Webinar Transcript

Dino:

Hi, everybody. I'm here today to share a few ideas about how to take advantage of techniques that I collectively call UX-Driven Design, to help us build better software architectures. The first issue that every developer and every architect runs into in any software project is making sense of estimates and making sense of requirements. We are all very familiar with a statement like the one you can see here, taken from an interesting book called 'The No Estimates Book', which attempts to promote a slightly different, alternative approach to making estimations around software projects. Anyway, the quote says, "A good software project must, like a house, start on the strong foundation of good architecture and good requirements", and I guess everyone would agree with that. But the problem is that in software, nobody asks you to simply build a house. You know how it works.

The customer, or maybe even worse the marketing people, will approach you and try to convince you that all the customer needs is something small, a very simple thing, very basic, nothing really serious to be worried about. But soon it grows slightly bigger and, more importantly, requires a more solid foundation, and this is what we call a hut. But then the hut is not enough, because these days it has to be mobile. It has to move around: across platforms, across devices, across whatever. So from the hut we move up to the caravan. You know how it works: in the end they ask us to build a castle. But the problem is, we never know from the beginning where we are expected to go. All we get from customers is words, and as architects we must make sense of those words. But words are of different types.

They express different concepts as far as requirements are concerned. There are conscious requirements, where users express everything that is relevant because they know all the details. But then there are unconscious requirements as well, in which a few details are reckoned so obvious that users don't mention them; they just omit them, even though they are fundamental for the architect. And finally, there are dreams. Everyone who has been a developer or an architect at some point knows about users' dreams: those things they wish they could have, but never mention. In the end, if we want to make more financial sense of software projects, we must, in my humble opinion, find a way to improve the software development process. In other words, we need a better way to learn. Now, which way?

Here are a couple of funny laws; funny because they express a concept that, as architects and developers, we are pretty familiar with. But the Mr Humphrey mentioned here is a scientist, so he is not a funny person; the laws are funny because of the effect they may have on us, but they are serious. Humphrey's Law says that the user of the software won't know what she wants until she sees the software. I'm sure that everyone here is smiling: oh, I know that. Attached to Humphrey's Law is Wegner's Lemma: an interactive system can never be fully specified, nor can it ever be fully tested. We may smile and laugh at these things, but at the end of the day they are absolute pieces of truth, so we must cope with these two pillars and build our new way of learning about software development taking these two facts into account.

There is another fact to think about, and it drives us straight towards the foundation of this webinar: if you wait until the last minute to complete the user interface, at that point it only takes a minute. This is the foundation of the message, the sense of this UX-Driven Design approach. We never, ever spend enough time on the building and, more importantly, on the thinking of the user interface. The main problem here is not so much the user interface intended as a collection of graphical things (artifacts, colors, styles, CSS, this or that), but the definition of the interactions expected between the user, any user of the system, and the system itself. In a way, the concept that I like to associate with the popular idea of the user interface here is close to the idea of the use case diagrams you may know from your past exposure to the UML modeling language.

It's fundamental to be clear about which interactions should take place between users and the system. You know that many great software ideas have first been sketched out on paper napkins, or just paper and a pen. Even in the days in which everything is done through a computer, paper and pen still play a role, because they are extremely intuitive ways to jot down ideas. But in software, jotting down ideas is great, it's still effective, but to paraphrase a popular sentence of Linus Torvalds: "Talk is cheap, just show me the product", and this is what customers want. How can we show customers the product, given that talk is cheap? How can we do that in a way that is cheap for us as well? If we could find a way to show customers, in a cheap or at least financially sustainable way, the kind of software we are going to write and the kind of interface we are going to produce, we would gain significantly in terms of the immediacy of the message.

If there are missed points in our understanding of the system, there is a significantly higher chance that those missing points will be caught at a good time, before they compromise the project and make development more expensive. In doing so, if we focus on providing screens, wireframes and sketches of the user interface, whether on paper or not, we learn about the system in a way that is strictly focused on the tasks and actions the user has to take with our system. We learn a lot more about the processes that we are then called to implement; not so much in terms of the physical workflow we have to implement in the backend, but essentially and primarily from the perspective of the impact that those processes have on the way users work with the application.

We write software essentially to help users do things with computers. We are not necessarily expected to change the processes, the way in which users do their own things; we have to learn those processes and then simply map them to software artifacts we can work with. In other words, the front end and the backend of a software system must match. If I have to find an abstract reason why many software projects are not financially sustainable these days, it's just that the front end and the backend don't match the way they should.

Need for Two Architect Roles

Let me confess that I really love quotes. Here is another one, this time taken from The Matrix, the popular movie. The speaker is Morpheus, and at some point this character says, "Remember, all I'm offering is the truth. Nothing more." That's what I'm trying to do now. The piece of truth I want to share with you today is that we need, ideally, two architect roles around our team. I said roles, and I said the word 'architect'. The first type of architect role is essentially the software architect: the classical, canonical software architect, facing the painful truth of user requirements. The other one is the UX architect, the new entry, who is expected to face the blissful simplicity of the user interface. The software architect collects business requirements with the purpose of building the best possible domain layer, whereas the UX architect collects usability requirements to build the best possible user experience for the presentation layer. We are talking about two different architect roles, focusing on different layers of a software system: the domain layer, the backend, where we have the business logic and business rules implemented, and the presentation layer, where we have the face of the system shown to the face of users, which we want to be as smooth, as pleasant, as comfortable as possible. These two architects will work together.

Tony: Excuse me, I have a question here.

Dino: Yeah, sure.

Tony: You are talking about two roles here, is it possible that these two roles are actually represented by one person?

Dino:

It is possible, definitely. Not coincidentally, I titled the slide 'two architect roles'. A role is a type of work that must be done around a software project. It can definitely be the same person with enough skills, or it could be two different people. I'm not saying that every team should hire a new person, but every team should have a person who is sensitive to the needs of usability as well as the needs of the business rules. So yes, to answer your question.

Tony: Okay, thank you.

Dino:

You're welcome. The UX architect is a new role, so let's find out more about the responsibilities. What is the job associated with this additional architect role in a software project? Primarily, the UX architect is responsible for something we can call the architecture of the information: the way in which information is distributed across the screens and dispatched to the various personas. A persona, in the jargon of UX, is essentially a type of user and user-machine interaction. How we distribute information, how we organize information and how we dispatch information to users, using the kind of interaction that users love. It's about asking users: how do you want to do this task? What is your natural way of performing it, and what kind of information do you need at every step of the procedure that represents the process?

This apparently is something totally, completely obvious, but in practice it has very little to do with how we do things today. Today, we essentially focus too much, at the beginning of our tasks, on understanding what could be the ideal way, or the coolest technology we would like to use in a project. So we focus on how to persist and lay out our data: the data that represents and configures the state of the system. We typically, as software architects, ignore until the last minute what the really required, ideal architecture of the information is for users: the architecture that would make the interaction between users and machines as pleasant as it should be. How can we verify the goodness of any idea, any type of user interface we may offer?

There are usability reviews, and usability reviews are a new entry. There is unit testing, which is good for validating the code; usability reviews are essentially the unit testing of usability and the presentation layer. I'm not talking, or not necessarily and not only, about automated UI tests. Those could help, but this is something that happens at a higher level of abstraction. It's about learning from users at work with our prototypes: if they love it, if they like it, if they feel comfortable, if they find it simple. When you send an email and ask users for feedback, and the answer you receive is "it's so simple", then you have done a great job, regardless of whether it was really easy or not to make it that simple. Making it simple for users is the goal; it's the absolute measure of performance.

It's the measure of performance for good software architects these days. So what does it mean, evaluating the usability of a software system? Essentially, you do that by looking at users while they work with the system; even recording them, even filming them, and learning from the body language. There is a little bit of cognitive science here, some of those principles. It may be you, or a team of experts, who extracts the real feedback; the way in which you interpret the body language could be just reading through the emails, or looking into their faces if you can see them, or hiring a separate team of people who can do the analysis for you. Another important, practical approach you can take is monitoring the timing of operations, or at least of the operations you reckon important in the software.

This is something you can even do via software, by placing code around your sensitive calls that logs the start and the end of operations, then reports those times to some remote database for further analysis. In some cases it's just a visual kind of analysis; in other cases it could mean delegating the task to a separate team of experts, with psychologists and experts in the cognitive sciences. In other cases it can just be extra software: logging and reporting software, not measuring the performance of the software itself, but measuring the time it takes to perform certain visual operations.

UX-Driven Design in Three Steps

These concepts altogether form something I like to call UXDD or, in full, UX-Driven Design. In three steps, UXDD can be summarized as follows. Step number one: create screens as users love them. Iterate on this process until you reach the stage where users really tell you they love the screens you have.

Second step: once you have finalized those screens, it means you know, with an acceptable degree of certainty, what goes in and out of each screen you are expected to have. For every piece of user interface you present to users, because you have a sketch of that screen, you know exactly the data flow in and out: what the user has to type or enter in that screen, and what the user is expected to see after the operation started from there has completed. You know exactly the input and the output of the workflow you have to trigger on the backend to serve the input from the UI. At that point, you have the screen; you trigger a workflow that takes just the input data as users have entered it into the screen, and the workflow at the end has to generate the next screen, presenting exactly the information you can learn from the screen you have. Once you have this point of contact between the UI and the topmost part of the backend, whatever lies underneath that level is the pure backend.


The third step just consists essentially of attaching some code and some business logic to those workflows; it's about filling out the workflows with logic, code and action. These three steps ideally don't take place sequentially, and not necessarily in a single cycle. Ideally, I envision UXDD split into two parts. There is a sequence of sprints done initially, where you iterate using the classic agile feedback loop to work and rework, learn and rework the screens. At that stage, in the first step of UXDD, you hardly write any code; you typically work with sketches and with tools that can help you deal with digital sketches. There is a good, and still growing, number of new tools emerging in this area, tools for wireframing; I will mention a few of them later on. The breakpoint that I put on the screen here, after "create screens as users love them", is the point at which you can sign off, whatever that means, with your users: we now start coding, and we code exactly to that set of screens.

Sign-off can mean everything and nothing. The term sign-off in this context can be intended as the signature put on some binding contract if you are, for example, a software consulting company; but it could even be just a verbal agreement with customers, if the customers are, for example, internal customers and you are essentially a corporate developer. So I see a logical breakpoint between the first step and the other two, while the actual, concrete implementation of this breakpoint may differ from project to project and from team to team. After that, the second half of the UXDD methodology is about writing code: essentially coding the backend, and also coding the mechanics you need on the user interface. The second part is just coding in the regular agile feedback loop. The sign-off is the guarantee that you have the best possible chance of delivering, at the end of the day, software that is really close to the real expectations.

UXDD Summary

In summary, UXDD can be described as a two-phase waterfall methodology: two-phase, because there is a logical breakpoint between the initial wireframe analysis and the actual coding phase. The benefit is that the extra step, the wireframing step that is receiving a lot more attention today, is a low-cost activity. Apparently we spend more, but what we spend more on is cheap, because we use wireframes and no code; the design and analysis of the front end is essentially a low-cost thing. In return for this small extra step, we are all set for a straight implementation of the backend, with great chances of getting it right the first time. In summary: slightly longer than the classic bottom-up architecture approach but, on the upside, nearly no post-deployment costs. Yes, no post-deployment costs, and this is the real point that could make the difference.

Tony: I have one more question here about this.

Dino: Absolutely, sure. Yeah, sure.

Tony: You say there are no post-deployment costs. Do you mean just the costs of fixing issues which were not caught during development, or is it also about support, which usually doesn't have anything to do with bad design?

Dino: I think it's both things. Primarily, in my opinion and in my experience (because, by the way, I'm practicing this every day), it's the cost of fixing things because they don't let users work exactly in the way they need. It's about work taken out of support, and also the work done fixing small, little things in the UI: just add this piece of information here, can I have the list pre-filled, can this information slide in at a certain point? It's both about support and about quick maintenance; the type of quick maintenance you typically do right after deploying.

Tony: Okay, right.

Dino:

You're welcome. I think it was a couple of years ago that I saw the picture that is going to appear on screen in a few seconds, which was tweeted by a European MSDN account. I don't remember whether it was MSDN Italy, France or Germany, but it was one of the MSDN accounts, and it presented this picture to say: this is how you see your system, as a tidy, ordered sequence of layers. The data; then, on top of the data, the business logic; finally, on top of the business logic, the presentation. You see, this is the classic bottom-up way of development: data, business, presentation. This is how we, as smart software architects, think software should be done, and this is instead how users see our system: a layer of interface and, underneath that, under the covers, under the surface, just black magic. The funny thing is that when this tweet was made, the sense of the tweet itself was just that of a joke.

We are so smart that we know how to design systems, and let's look at how dumb users can be, because all they are interested in is the user interface; all the effort we put into the system is, to their eyes, just black magic. But that's the key point: that's exactly how users see our system, yet that's not how we actually build the system. Let's take another perspective on the same point. Imagine we talk about carrots. If you are a developer, a carrot is for the most part under the soil: every carrot has a huge taproot and just small leaves above the soil. But if you are a user, or a designer, it's quite the other way around: the leaves are not simply leaves but an actual tree, and the taproot is whatever else you don't see, because it's under the soil. But the real world is neither the world as developers see it nor the world as users and designers see it. The real world shows that a carrot has a significantly long taproot and significant leaves and, more importantly, the two things grow together.

At the end of the day, UX-Driven Design is about mimicking how a carrot's leaves and taproot grow together in the real world. This leads us straight to formulating the definition of user experience. User experience is nothing fancy; it is simply the experience that users go through when they interact with the application, and we want that experience to be as pleasant as possible. Otherwise we have failed, and failing here means essentially that our company has to spend more time, more money, more effort, more resources on fixing those things, and experience shows that companies lose good money on that extra effort. There is frustration for the developers involved, and there is frustration on the users' end too, because they paid for something they can hardly use; they have to adapt themselves to the thing we have delivered, and nobody is happy. It's a lose-lose kind of thing. Paying a lot more attention to user experience has every chance of really being a win-win kind of thing: we learn more, we do better, and everyone is happy.

Related Terminology

When it comes to user experience and the way we can learn our way through it, there are a few terms, a few expressions, that look related and are often used interchangeably; those terms, however, have specific meanings. The first term I want to call your attention to is sketch. A sketch is defined as a freehand drawing done primarily with the purpose of jotting down ideas; it's the classic thing we do on the paper napkin of some cafeteria. Wireframe is another term, which essentially identifies a more precise sketch that contains more information: precisely, it extends the concept of a sketch with information about the layout, the navigation and the content we are going to present in each and every screen. The third term is mockup, which is nothing more than a wireframe where a lot of the UI's graphical details have been specified.

A mockup is a wireframe with a sample UI attached. That said, these are the real meanings of the three levels of wireframing activity, but be aware that, depending on the context, these three terms may be used interchangeably to mean something that in the end is very close to the idea of a wireframe. Of these three terms, the central one, strictly related to UXDD and in general to learning about processes through the UI, is wireframe. Three other related terms are proof of concept, prototype and pilot. A proof of concept typically identifies a small exercise with code, done to verify the truthfulness or just the viability of an assumption; another pretty common scenario for a proof of concept is getting yourself started and familiar with a new technology, just to see whether it could work and be used to implement certain features of your system. Prototype is yet another term, but a prototype is not a small exercise.

It's a fake system that simulates the behavior of the real system to be built. When you build a prototype, you are miles away from the real system, even though the goodness of a prototype lies in making the fake thing look like the real one. If you have built a good prototype, the sign that proves it is when users say "it's half done" or "it's already done". The hardest part, when you have a good prototype in your hands, is telling users it's just a fake thing and the real one is miles away. The pilot is instead a production-ready system; the difference between the pilot and the real system is essentially not in the functionality, but in the fact that it's tested and put into action against a subset of the intended audience or the intended data. Anyway, proof of concept, prototype and pilot are, again, terms that, depending on the jargon spoken in each company, can be used interchangeably. But the most important term is prototype, and prototype ties up nicely with the concept of wireframing.

Three Levels of Prototyping

In the end, UXDD identifies three levels of prototyping. Basic understanding of functions, which is typically done via sketches: very simple, very basic wireframes. Basic prototyping, which is when you go deeper and rework sketches into something much closer to the real sequence of steps in the process; you get a lot more granular when you move from sketches to wireframes. And then, sometimes, this is not enough. Not always, not necessarily; sometimes you can go and talk to users just about wireframes. Wireframes are nice; they are PDF files, PowerPoint kinds of things. But there is no action: even the most detailed wireframe is essentially a storyboard that users can view as a sequence of hyperlinked screens. There is no action, no fake data, nothing there that gives the sense of the real application. The third level is advanced prototyping, and advanced prototyping is done with code. So we are back to square one, back to the issue that it is expensive to do that.

Tony: I actually have a question here.

Dino: Yeah?

Tony:

You say that we need to create prototypes during the UXDD process. Where are the savings here, actually?

Dino:

The saving comes if you write the prototype because it's required, because users ask you to show something that looks like the real thing, and this happens very often; not always, but very often. Then the challenge is not to use real code to build the prototype: ideally, to build the prototype without even creating a Visual Studio project. This opens up an entire new world of tools that allow you to reach the level of a prototype from wireframes. The trick, the lever, that makes UXDD really affordable is using a different type of tool side by side with the classic Visual Studio or IntelliJ, whatever kind of framework you use to write your software: having a dual set of tools that work together, so that the wireframe created with one tool, if that tool is particularly smart, can simply be turned into the prototype. If you're unlucky, or if the project is particularly large, okay, the prototype means writing code. But anyway, experience proves that any effort you put in up front tends to be paid back by the savings at the end of the project. And the work you do in advance is much easier to charge to customers; that's not a secondary point.

Tony: Okay, thank you.

Dino: You're welcome. Here are a few products that can help you get some preliminary work done around wireframing and quick prototypes. The entry-level tool in this area is Balsamiq, a relatively cheap piece of software that allows you to work in a PowerPoint style. You drag and drop artifacts, graphical shapes, onto a drawing surface; you compose and group those things together, and the most you can do is link some of those elements to another view, so you can simulate a storyboard. The effectiveness of Balsamiq is that you can save those things as a PDF file, taking advantage of the internal hyperlinking feature of PDF files. All you send to users is a PDF file, and you can have users navigate through the screens, getting an idea of the way in which they will do the real thing. Nothing is real here: there is no data, nothing is live, it's entirely static, but it gives an idea. And if you need more, there are three other tools that I'm aware of; there might be other tools out there in the industry as well.

Axure, UXPin and Justinmind. These three tools do nearly the same thing, and they do more than Balsamiq; they are also more expensive than Balsamiq in terms of licenses and prices. I'm particularly familiar with Axure. With Axure, you can create wireframes in much the same way you do in Balsamiq, but you can easily turn those things into working prototypes delivered as HTML, JavaScript and CSS websites. From the tool itself you can upload and share the prototype, via a dedicated cloud, for users to play with. The functions the tool makes available allow you to have canned data, pre-filling screens with random data automatically, and you can even add some quick actions, essentially using JavaScript code. You don't have to be a developer to use Axure, and usually Axure is not used by developers. Yet with Axure you can produce something that, in terms of colors, in terms of layout, in terms of navigation,

in terms of the real user interface, looks like the real application; once approved, it can even be the starting point for building the real facade: the real presentation, the real front end for the real system.

Bottom-Up vs. Top-Down Approach

To finish off, I just want to show you the difference between the bottom-up and the top-down approach, so that you see once more, from a different perspective, what ideally and hopefully comes out of UX-Driven Design as a benefit. The bottom-up approach, which everyone is familiar with I guess, starts from requirements; then, based on the requirements, we create a data model; on top of the data model we build up the business logic, and then we just stick some user interface on top of the business logic. It works, so where is the problem? For many years, the problem didn't exist. It started existing in recent years when, probably following the introduction of the iPhone, the entire mass of users out there started changing their mindset about the way in which they interact with software. Users started feeling, rightly, the need to have data at hand.

"Information at your fingertips", Bill Gates used to say many years ago, but in my opinion that concept never quite turned into reality. What is the problem when the user interface is stuck on top of the business logic in a bottom-up manner? The user interface doesn't necessarily match the bits coming out of the business logic, so you are facing a possible, even likely, model mismatch. When this happens, when the format and shape of the data you work with in the backend is significantly different from the data expected to be consumed in the user interface, you need to have, at some point, in some way, adapters. Adapters must be written, adapters must be tested, adapters must be debugged, and that's a cost. And those adapters can't necessarily retrieve the data at the aggregation level required by the user interface.

When this happens, you know what the next point is? The next point is that you have to spoil the purity, the beauty, of your model, leaving holes here and there for some code to connect through the layers to retrieve the data: the queries, the inner joins, the aggregations that weren't planned right from the bottom, because building from the bottom is never as optimal as doing it the other way around. Top-down: we have requirements and, let's say, we start right from the user interface, according to the UXDD idea. Then, once we know exactly what comes in and out of each screen we are going to have, we know all the possible ways the front end can interact with the backend. We can create the business logic, the topmost part of the backend, and give it a model that is cut to fit perfectly with the bits coming out of the user interface. At that point, the data model is just a detail, because we create the data model.

We persist the data in the way that suits our needs. We can do that in total freedom, regardless of how we are retrieving data; we can even choose the framework, the persistence technology, that suits best. The business logic and the data model together are a single element; they are a piece of black magic. In this way, we are back to the ideal vision of the software that users have: just an interface, and whatever sits under the surface, under the covers of the user interface, is black magic. You can have this black-magic kind of thing only if you go top-down, and UX-Driven Design offers you a set of practices, techniques and suggestions on how to achieve that. To summarize, there are some other things that fit nicely into this big design, if only you take it a little bit further.

The user interface is a collection of screens with pins to connect to the backend, and the application layer and business logic are built to fit the user interface. Screens and workflows: it's clear in the interaction what the input model is that flows out of the user interface and becomes the input for the application layer, and what the view model is that becomes the output of the application layer and the input of the next screens we present to users. We have here four different types of data models: there is the persistence model, how we save data; the domain model, how we deal with data in terms of business processes; the view model, how we render data out; and the input model, how we receive data from the user interface. The domain layer and the infrastructure layer live underneath the application layer.

The user experience we deliver is just what users want, and we can happily and comfortably build a backend that is cut to fit, to support just the user experience that users want. There is one more benefit here: CQRS. When you learn about a system in terms of tasks, every task tends to be either a command that alters the state of the system or a query that reports the current state of the system. In this context, CQRS makes total sense and is a nice fit. CQRS recommends that you split your backend into two distinct vertical stacks: the command stack and the query stack. The nice thing is that, because they are separated, whatever way you choose to persist and read data is transparent to the front end of the system, so there is no reason not to use the technology that fits best. It could be SQL, NoSQL, events, in-memory caches, whatever. It can be, for example, an event store for the command stack, while the read model for the query stack may be a relational, SQL model.
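
In code, the split can be as simple as two families of handler interfaces. A minimal, illustrative sketch (the names are mine, not a prescribed API):

    // Illustrative CQRS sketch: commands mutate state, queries only read,
    // and each stack is free to use a different persistence technology.
    public interface ICommand { }

    public interface ICommandHandler<TCommand> where TCommand : ICommand
    {
        void Handle(TCommand command);      // alters the state of the system
    }

    public interface IQuery<TResult> { }

    public interface IQueryHandler<TQuery, TResult> where TQuery : IQuery<TResult>
    {
        TResult Handle(TQuery query);       // reports state, never changes it
    }

    // Example messages: the command side might append events to an event
    // store, while the query side is served from a relational read model.
    public sealed class PlaceOrder : ICommand
    {
        public string OrderId;
        public string Product;
    }

    public sealed class GetOrderSummary : IQuery<string>
    {
        public string OrderId;
    }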

In the end, UX-Driven Design just helps us save costs, especially post-deployment costs in maintenance. But it also helps us learn about the domain through its processes, which in turn makes us able to suggest new features and grow our business. And it connects very well and very clearly with a couple of buzzwords that are very popular these days: CQRS and event sourcing. All this said, the best way to save money on software projects is to learn as much as possible about the domain and about user expectations. I really don't know if someone has said this sentence before; probably yes, probably no, I don't know and I don't even care, but it represents an absolute piece of truth. Okay, that's all from my end. I will be happy to take a few questions, if there are any.

I just want to thank PostSharp for organizing this webinar and for giving me the chance to talk about UX-Driven Design, so thank you to the people at PostSharp; which, by the way, is a cool piece of software that can help in some ways to write software in a cost-effective manner. For those of you interested in a follow-up, or in some other resource that goes beyond the boundaries of this webinar, I have a couple of Pluralsight courses. One is available today and is in the top hundred of Pluralsight courses: Modern Software Architecture, where I talk about the domain model, CQRS and event sourcing. There is another one, more specific to UX-Driven Design, which is expected to ship pretty soon. I was hoping to have it available by the end of the year, but it will probably slip to January; it's a matter of weeks, anyway, before UX-Driven Software Design is available in the Pluralsight library. What else? If you don't mind, follow me on Twitter, and thank you very much for your time.

Q&A

Q: Maybe the issue is that we should stop trying to give the customer what she wants and, as architects, give the customer what she needs; then work to persuade her why that's better.

A: Sometimes this works, and sometimes not. Yes, understanding what the customer needs is the ultimate purpose of the architect; however, what we believe the customer needs is not necessarily exactly what the customer needs. It depends on how deep our knowledge is, on how well we manage to understand what the customer really needs. Persuade the customer that something is better? You can do that, but I have a recent experience, just this week, of trying to convince a customer that a different architecture of information would help her work more effectively. She comes from a mindset that has led her, and all of the team, to expect data to be laid out in a certain way; no matter the confusion that can result on screen, that's the way they love it. There's no way to persuade these people. So if you can, definitely try; otherwise, it depends, it usually depends.

Q: I'm not sure the front end and backend should match; surely the reason for each is very different. The UX is contextual and should be tuned to the needs of the now, whereas the backend should be focused on stability, extensibility and future-proofing the data. The front end should be specific and the backend generic.

A: In a way, yes. This fits nicely with some of the concepts borrowed from domain-driven design, in particular the layered architecture, and the fact that the domain logic, the part of the business logic that never changes and is not dependent on processes, should be isolated from persistence and from processes. In this context, the business rules go into an isolated layer, which in domain-driven design is called the domain layer. That logic is persisted to typical databases, but it's also injected into something called the application layer, which is the point of contact, the facade, the proxy between the front end (the presentation) and the domain logic. The domain logic is static, in the sense that it's shared and common; everything around it is about the processes, which can be optimized for whatever you need: for business, for performance, for reliability and whatever else. The processes should be taken out of the domain logic and made to measure, coordinating the needs of the backend with the expectations of users. The application layer plays a key role here, because it's the facade between the front end and the backend.

Q: What happens in the middle of the backend sprint, if you realize you are missing a screen? Should you have it designed within the sprint, or wait for the next one?

A: It depends on how relevant the screen you realize you're missing is. Probably, yes, I would try to get it fixed in the sprint you're in, the development sprint, without waiting for completion. The purpose is delivering what they need as soon as possible, so if a screen is missed, I would try to violate, in a way, the sequentiality of the two waterfall steps and have the screen added, with all the details, in the context of the development sprint. Unless there are contractual issues: if you signed off and there's a contract that financially binds you to deliver something, then if you miss a screen it's a matter for legal, for lawyers or experts, to kick in and see how the contract has to be rewritten to let you do that extra work.

Q: If it's agile, then how common is it to see no post-deployment cost, or no post-deployment phase at all, in this process?

A: It was more or less the promise of agile: if you talk to customers during the entire cycle, then you deliver what they need at the end of the final sprint. But actually this is not what most people are experiencing, so there are post-deployment costs. If someone doesn't experience that, okay, good, it's great, fantastic; but that's not what I see. In my experience, this is the major source of extra costs in a project.

Q: Is UXDD applicable for data driven apps such as most enterprise apps, like in banks?

A: It's mostly about your interaction with users, whether consumers or corporate or whatever. I've seen UXDD-like approaches applied in a large (very large) pharma company. The missing link was that they simply split the design of the UX and the design of the code into two steps, but the design of the code was still done bottom-up. In general, I see benefits everywhere, mostly because it helps you take a task-focused approach to the way you learn about the domain and processes. That's particularly relevant, I guess, in enterprise business.

Q: What is the best system development process to use with UXDD - e.g. DSDM?

A: I see UXDD as two agile steps, and agile means what it usually means to you. But in some organizations distinct steps are problematic; in that case, I tend to see the first UXDD step (wireframes) as a sort of sprint zero. Possibly a repeated sprint zero: not repeated for each sprint, but done more than once.

Q: How best to handle a situation where the cost to implement a screen is very high? What if there is a better way to implement it at a cheaper cost, but at a sacrifice of usability?

A: As long as it keeps customers happy, any UX is fine. But users increasingly appreciate a UX that is super simple, easy to understand, tailor-made to their processes and friendly.

Q: I've heard a lot from managers that the UI can be generated from data. So the data is essential, and for hardcore enterprise it is tabular data. There are a number of solutions for that approach: devs model the data and the domain, and then get the UI for free, from the managers' perspective. Any advice on how to convince those managers?

A: Unfortunately, data is less and less simply tabular, and users are less and less willing to accept whatever UI a system can autogenerate. That's precisely the mindset UXDD attempts to address. The simple answer (and the most concrete) is that you save money after deployment: money in terms of support work, fixing bugs, fixing logical bugs. If you don't run into these problems, you're probably OK as is.

Q: How are multiple UX designs handled for a single data model? I mean, if I have different UX for different devices rendering the same data, how is that handled in UXDD?

A: Multiple front ends (like web and mobile) talk to possibly different application layers, and the application layer is the topmost part of the backend. Under the application layer you find the reusable domain logic. Different UX may refer to different processes that orchestrate common pieces of business in different ways, so different front ends match possibly different application layers. (The application layer is one of the layers of the DDD layered architecture: the place that orchestrates use cases.)

Q: Can you please give an example of working with the top-down UXDD approach and adjusting the business logic requirements? What is an example of adjusting the business logic to fit the user interface? For example, if we learn that what users need is out of scope with the project scope/timeline/resources, how can we affect the business logic?

A: The key fact I see behind UXDD is simply that everything starts from the top, and only once you're satisfied with the UX do you start arranging the backend. How can you effectively design the BLL if you're not sure about what users want? The BLL is just the next step after the UX, or the part of the UX that fits your scope/timeline/resources.

Q: When does performance tuning come into place? During development, or at late stages of development when you have the UI defined completely?

A: The UI and the backend share input and output, and defining those things is the primary purpose of UXDD. At that point, the UI and the backend can be optimized separately.

Q: Wasn't UML supposed to serve this purpose? And now MS is removing it?

A: UXDD screens are essentially implementations of use cases, and use-case diagrams are a foundation of UML. But in my opinion UML is too much and too cumbersome: too many diagrams, too much formalism. But yes, there's some reasonable conceptual overlap.

Q: What's an example of an Event Store?

A: An event store is a database where you store the state of the system in the form of events (this is event sourcing: having events as the data source of the application). Technically, an event store can be a NoSQL store (MongoDB, DocumentDB, RavenDB), a JSON-friendly store (SQL Server 2016) or a specific product like EventStore. Learning about event sourcing is helpful.
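
As a minimal sketch of the idea (illustrative types, not a specific product's API), the current state is not stored directly; it is rebuilt by replaying the stream of events:

    // Minimal event-sourcing sketch: state is rebuilt from events.
    using System;
    using System.Collections.Generic;

    public abstract class OrderEvent
    {
        public DateTime At = DateTime.UtcNow;
    }

    public sealed class OrderCreated : OrderEvent { public string Product; }
    public sealed class OrderShipped : OrderEvent { }

    public class Order
    {
        public string Product { get; private set; }
        public bool Shipped { get; private set; }

        // Replay the event stream (loaded from the event store) to
        // reconstruct the current state of the aggregate.
        public static Order Replay(IEnumerable<OrderEvent> stream)
        {
            var order = new Order();
            foreach (var e in stream)
            {
                switch (e)
                {
                    case OrderCreated c: order.Product = c.Product; break;
                    case OrderShipped _: order.Shipped = true; break;
                }
            }
            return order;
        }
    }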

Q: What do you think about an "advanced prototyping" level with the following workflow (e.g. we are working with ASP.NET MVC):

  • 1) First we implement the prototype in ASP.NET MVC views with faked ViewModels.
  • We use fake ViewModel repositories, which we will later replace with real ones.
  • Not repositories of domain objects, but repositories of view models.
  • At this time there are no domain objects and no logic.
  • 2) After some iterations we have a working prototype, and then we implement the domain objects and logic, and connect the domain model with the view model.

A: What you describe is what I have done, no more than a couple of times. Today, ad hoc tools exist (e.g. Axure) that can do most of the work for you and generate HTML front ends with fake data, in a PowerPoint-style way and with a bit of JavaScript knowledge; work that non-developers can do. At the same time, these tools can create only storyboards: sequences of hyperlinked pages. If you need a bit of logic and realistic data in the prototype, then it's a real prototype, and what you suggest is probably the least expensive way.


About the speaker, Dino Esposito

Dino Esposito

Since 2003, Dino has been the voice of Microsoft Press to web developers and the author of many popular books on ASP.NET and software architecture, including "Architecting Applications for the Enterprise", "Modern Web Development" and the upcoming "Programming ASP.NET Core". When not training on-site or online for Pluralsight, Dino serves as the CTO of Crionet, a fast-growing firm that specializes in web and mobile solutions for the world of professional sports.


Localization is crucial for reaching a global audience; however, it's often an afterthought for most developers and non-trivial to implement. Traditionally, game developers have outsourced this task due to its time-consuming nature.

But it doesn’t have to be this way.

Yan Cui will show you a simple technique his team used at GameSys which allowed them to localize an entire story-driven, episodic MMORPG (with over 5000 items and 1500 quests) in under an hour of work and 50 lines of code, with the help of PostSharp.

Watch the webinar and learn:

  • The common practices of localization
  • The challenges and problems with these common practices
  • How to rethink the localization problem as an automatable implementation pattern
  • Pattern automation using PostSharp

Solving localization challenges with design pattern automation on Vimeo.

You can find the slide deck here: http://www.slideshare.net/sharpcrafters/solving-localization-challenges-with-design-pattern-automation

Video Content

  1. Six Sins of Traditional Approach to Localization (6:20)
  2. Automating Patterns with PostSharp (16:08)
  3. Q&A (20:50)

Webinar Transcript

Hi, everyone. Good evening to those of you in the UK. My name is Yan Cui, and I often go by the online alias of The Burning Monk, because I'm a massive fan of the '90s rock band Rage Against the Machine.

I'm actually joined here by Alex from PostSharp as well. 

Alex: Hi, everyone. 

Yan:

And before we start, a quick bit of housekeeping. If you have any questions, feel free to enter them in the questions box in the GoToWebinar control panel. We'll try to answer as many of them as we can at the end of the session, and anything we can't cover, we'll try to get back to you about via email later on.

We're going to be talking about some of the work I did while I was working for a gaming company called Gamesys in London, up until October last year. One of the games I worked on was an MMORPG (massively multiplayer online RPG) called Here Be Monsters. One interesting thing about Here Be Monsters is that it has lots of content, and when it came time to localize the whole game, we had some interesting challenges that we wanted to find a novel way of solving. Just on this simple screen, you can see a couple of pieces of text: the name of the character, the dialogue, as well as a UI control here that says "out of bait". All of these need to be localized, for the entire game. And as I mentioned, this game is full of content; in fact, in terms of text, we have more text than the first three Harry Potter books combined. There are many different screens in the game, one of which is what we call the almanac; you can think of it as an in-game Wikipedia of sorts, where you can find information about different items or monsters in the game. Here is an example of the almanac page for Santa's Gnome, which is only available during Christmas.

So anyway, there are a couple of pieces of information about the monster itself: a name, a description, its type, et cetera, et cetera. Those all need to be localized, as well as all the UI elements: the labels for bait, the text at the bottom, et cetera. So even for a very simple screen like this, there are actually a lot of different places where you need to apply localization.

A few years back, Atlus, who make very popular, very niche RPG games, did a post explaining why localization is such a painful process that can sometimes take four to six months, and it touched on many different aspects that are involved in the localization process. You can see from their list that, by their estimation, programming alone takes between one and one and a half months with the traditional approach to localization, where each of your platforms, each client, will ingest a gettext file, which contains a bunch of translations in a very plain text format like this. You've got, basically, key-value pairs of what the original text is and what the localized text should be.
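For illustration, a gettext entry is just that kind of key-value pair; the identifier comment and the Portuguese translation below are invented for this example:

```
# quest.105.dialogue  (a comment, which can carry a unique ID for the string)
msgid "Out of bait"
msgstr "Sem iscas"
```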

Alex:

Yan, so what is this PO file format? Is this the standard for localization, or do you have some tools for that? 

Yan:

Yep. So the gettext file is the industry standard for localization. Beyond that, I don't know of any standard tooling for translators. The translators that we were working with had internal tools to help them work more effectively with the gettext file format. And for different languages, there are also libraries available for you to consume those gettext files. We'll look at one for .NET later on. 

Alex: Okay. Yeah, thanks. 

Yan:

Once you've consumed those translation files, you then need to substitute all the text that you have with the localized versions of that text. You've got buttons that display some text. This is just demo code, not taken from our real code base, but it gives you an idea of where you need to apply localization: to labels and buttons and so on, as well as to your domain objects. So where you've got a domain object that represents a monster, the names and descriptions, et cetera, will need to be localized.
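As a rough sketch of that manual approach (these names are invented, not from the Here Be Monsters code base), every label, button, and domain object ends up wrapped in a translate call, on every platform:

```csharp
using System.Collections.Generic;

public static class Translator
{
    // Key-value pairs loaded from the gettext file: original text -> localized text.
    private static readonly Dictionary<string, string> Catalog =
        new Dictionary<string, string>();

    public static string Translate(string text) =>
        Catalog.TryGetValue(text, out var localized) ? localized : text;
}

// ...repeated by hand across every screen, on every client platform:
//   outOfBaitLabel.Text = Translator.Translate("Out of bait");
//   monsterName.Text    = Translator.Translate(monster.Name);
```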


Once you've done that, you do your data binding. Then, assuming you haven't lost anything while localizing, this screen shows you all the information about Santa's Gnome. I think this is in Brazilian Portuguese. I can't read any of it, so I don't know how accurate the translations are, but at least you can see the localization has been applied in all the different places. So you pat yourself on the back for a job well done, probably go get a beer with your colleagues, and then you realize: oh wait, what happens if we make some changes or add new, additional types to our domain? You're going to have to keep doing this work over and over, for each platform that you support.

And then, to rub salt into the wound, look at how much time Atlus reckons we tend to spend on QA. The reason it takes so much time is that there's a massive scope to test. Not only do you need to spot-check to make sure the translations are not drastically wrong, but there are also loads of bugs that can creep in during the client integration work: maybe somebody missed a particular screen, or some button was left in English. And you've got to do this for every single platform. Because you're doing releases so frequently, or at least I hope you are, that means you have to put in repeated effort to test localization whenever you make any kind of change on the client, which broadens the scope of your regular testing and puts a lot more pressure on your QA teams. 


Six Sins of the Traditional Approach to Localization 

Here's pretty much a laundry list of the problems that I tend to find with the traditional approach to localization. There's a lot of up-front development effort, and the team keeps doing more work as you introduce more domain types and extend your game. It's also hard to test, and it's prone to regressions. You see, it's normal to feel doom and gloom whenever localization is mentioned in the company, because it's a pain, and once it's there, it just doesn't go away. Which is why, when it came time for us to implement localization in our game, we decided to think outside the box and see whether or not we could do it better, in a way that would be easier and more maintainable for our team. 

To give you a bit of background on what our pipeline looked like at the time: we built a custom CMS, a content management system, which we internally call TNT. It's really just a very thin layer on top of Git, where all the game design data about the monsters, the different quests, the locations, is stored as JSON files, and we built in some integration with the Git flow branching strategy, so that we can apply the same Git flow branching strategy that we already use for developers, and get our testers to do the same thing.

Alex:

By the way, why did you choose to build a custom CMS instead of using something else? And what's the benefit of Git and Git flow in this case?

Yan:

Right. We decided to build a custom CMS because we also wanted to bake into the CMS some basic validations that apply to our particular domain. And the reason we have a thin layer on top of Git is that we want to have source control for all our game data. We do that for all our source code, and the game data is really part of the source code for your game, which can't exist without the data, the things that make up the content of the game. 

And Git flow is just a way to allow our game designers to work in tandem with each other, with a well-understood process of how to merge things, and how to release things when they get merged back to master. So that when you look at the master branch, you know you're looking at exactly what is deployed to production, and so on. 

We had a team of game designers who work on different branches of stories. One person may be working on a storyline that's due next week, whilst another person may be working on a storyline that's coming up in a month's time, and you want them to be able to work in parallel without stepping on each other's toes. Git flow comes into play as part of the mechanism for allowing them to do that. 

Does that answer your question?

Alex:

Yeah. That seems to work well, yes. Good idea. Thanks. 

Yan:

Cool. 

So inside TNT, you have some very simple UI controls with which the game designers can do cherry-picks, as well as merges of different branches. Once they're happy with the game design work they've done, they can then publish it to a particular environment, so that they can test it out in that environment and see whether the quest is interesting, and whether all the mechanics are in place.

Right, so at this point the custom CMS packages up all the JSON files and sends them to a publisher service, which performs deeper validation against the game rules. For example, if an item is of a higher level than the player could be at a particular quest, then that quest shouldn't be able to give out the item as a reward. We also do quite a few pre-computations, and we transform the format from the original JSON into a more suitable format to be consumed by the different client platforms. 
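As an illustration of that kind of game-rule validation (all the types and names below are invented, not from the actual publisher service), a rule might flag quests that reward items above the player's plausible level:

```csharp
using System.Collections.Generic;

public class Item
{
    public string Name { get; set; }
    public int Level { get; set; }
}

public class Quest
{
    public string Name { get; set; }
    public int ExpectedPlayerLevel { get; set; }
    public List<Item> Rewards { get; } = new List<Item>();
}

public static class GameRules
{
    // Flag quests that hand out items of a higher level than the player
    // could plausibly be at when reaching that quest.
    public static IEnumerable<string> ValidateRewards(Quest quest)
    {
        foreach (var item in quest.Rewards)
            if (item.Level > quest.ExpectedPlayerLevel)
                yield return $"Quest '{quest.Name}' rewards '{item.Name}' (level {item.Level}) too early.";
    }
}
```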

Now, all of that then gets pushed to S3, and from TNT, as a publisher, I press a button and I can see everything that's happened. As you can see from the logs, we do this an awful lot, which is also one of the reasons why we decided to invest the effort into building tooling such as the CMS: so that we can constantly iterate on our game design, just as much as on our code. 

Once the publisher has done its job, it writes the game specs, as we call them, into S3, into versioned folders. As a game designer, I just press a button to publish my work, and I'll get an email with a link at the top, so that I can click that and load up the web version of the game with just the changes from my branch. That way, if someone else is using the same test environment to test their changes, I won't be stepping on their toes. 

You may notice that there's a link here to run an economy report. This refers to some other piece of work that we've done, using a graph database to help us understand how different aspects of the game connect with each other. An item could be used in a recipe to make another item, which can then be used to catch a monster, which then drops loot, which can then be used in another quest, and so on and so forth. So our domain is very highly connected, and even a small change, like upping the price of water, can have a huge knock-on effect that goes through the entire economy of the game. So we use the graph database to automate a lot of those validations and the auto-balancing. 

Further down the email, you can also see the results of the game rule validations, which we report back to the game designer. At this point, all the specs are ready. The server, the Flash client, as well as the iPad client will be able to consume that data in their different formats, and you'll be able to load up the game and test out the changes. 

Alex:

Another question here; so why do you need to produce different formats for all those different platforms? Is that a requirement? 

Yan:

Right. So for example, the server application doesn't really care about the name of a monster, or the description of a monster. It helps to reduce the size of the file, how long it takes to load it, as well as the memory footprint of the server application, so we strip that client-only information out. We also precalculate a bunch of secret values - coefficients and things like that - and bake them into the server spec, but don't make them available in the client spec. 

Of course, the client specs are public, so anyone who's a bit more tech savvy would be able to download the spec, work out its format, and discover any secret values we had embedded into our domain. They'd be able to cheat in the game, essentially. 

Alex: Uh-huh, okay. 

Yan:

And also Flash, because it's all web based, prefers to load the whole file as one big zip, whereas the iPad prefers to have smaller file sizes. But many of them -

Alex: Okay, that makes sense. 

Yan:

So at this point, we thought, “well, if we do localization, what about bundling it into our publishing process, so that by the time all the files have been generated, they're already localized for the client?” We wouldn't have to do some of the things we saw earlier, where you have to apply localization to your domain objects all the time. You can then publish the localized versions of the game specs to language-specific folders. Notice that, as I mentioned earlier, the server doesn't care about most of our text being localized, so we don't actually need to apply that same localization pass to the server spec. 

So with that, you remove the duplicated effort on each of the client platforms. At the same time, you reduce the number of things that can change: they're automatically changed with each release, because you have an automated process for doing this, so there are fewer things to test. But we still have this problem of having to spend a large amount of effort up front. All the things that were being done on the client before now have to be done by something else; in this case, the server team, which has to, like I said before, ingest a gettext file to load all the translations, then check the domain objects for string fields and properties that need to be localized, and apply localization when transforming those domain objects into DTOs. And then again, do the same thing for multiple languages if you're localizing for different targets.

Automating Patterns with PostSharp 

But notice that steps two and three are actually just an implementation pattern that can be automated, to help future-proof yourself against changes: as you add more domain objects, you get localization for free. 

And in .NET, you can consume a gettext file and get translations from it using the SecondLanguage package. And obviously, because we're here, I'm going to be talking about the implementation patterns and how to automate them with PostSharp. 

So for those of you who are not familiar with PostSharp: you can write different aspects, which then apply post-compilation modifications to your code, so that you can bake additional logic and behavior into your code. Here, what I've got is a very simple aspect which is applied only to fields or properties of type string, so that when you call a setter on those properties or fields, this bit of code will run. And as part of that, we check against the LocalizationContext object to see whether or not we are in a localization context. If not, we move on. 
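Here's a minimal sketch of what such an aspect can look like, using PostSharp's LocationInterceptionAspect; the exact code from the talk isn't reproduced here, and the LocalizationContext type is an assumption, sketched a little further down:

```csharp
using System;
using PostSharp.Aspects;
using PostSharp.Extensibility;
using PostSharp.Reflection;

[Serializable]
[MulticastAttributeUsage(MulticastTargets.Field | MulticastTargets.Property)]
public sealed class LocalizeAttribute : LocationInterceptionAspect
{
    // At compile time, keep the aspect only on fields/properties of type string.
    public override bool CompileTimeValidate(LocationInfo locationInfo) =>
        locationInfo.LocationType == typeof(string);

    // Runs whenever the setter of a targeted field/property is called.
    public override void OnSetValue(LocationInterceptionArgs args)
    {
        var context = LocalizationContext.Current;
        if (context != null && args.Value is string text)
        {
            // Proceed as if the setter had been called with the localized string.
            args.Value = context.Translate(text);
        }
        args.ProceedSetValue();
    }
}
```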

Alex:

So I can see here that the LocalizationContext is actually doing all the translation work. How do you set it up, or how do you initialize it? Because here, you just see that you call Translate. 

Yan:

Yep. So as I mentioned, we just use the gettext translation files. Imagine the custom CMS, TNT, calls the service with a big zip file; the publisher will then unpackage that, and inside that package, you will find those PO files. For each of those files, the publisher loads it with SecondLanguage and then creates a localization context. And within that context, it then transforms the domain objects created from those JSONs into DTOs. 

So when the DTO transformation is happening, and you're creating new DTO objects and setting the string values for their fields and properties, this code will kick in. And because it's called inside a localization context, that context contains the information that we have loaded from the gettext file. So the next line, this guy, all he's doing is checking against the gettext file: “do we have a match for the string that you're trying to localize?” If there is, then we use that localized string instead. So what we're doing here is proceeding with calling the setter as if you'd called it with the localized string instead of the original string. 
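To make the flow concrete, here's one possible shape for that LocalizationContext; this is an assumption based on the description, with the translation dictionary loaded from the PO file (for example, via the SecondLanguage library mentioned above):

```csharp
using System;
using System.Collections.Generic;

public sealed class LocalizationContext : IDisposable
{
    [ThreadStatic]
    private static LocalizationContext current;

    private readonly IReadOnlyDictionary<string, string> translations;

    private LocalizationContext(IReadOnlyDictionary<string, string> translations)
    {
        this.translations = translations;
        current = this;
    }

    // The aspect reads this to decide whether localization is in progress.
    public static LocalizationContext Current => current;

    public static LocalizationContext Enter(
        IReadOnlyDictionary<string, string> translations) =>
        new LocalizationContext(translations);

    // Return the localized string if the PO file has a match, else the original.
    public string Translate(string text) =>
        translations.TryGetValue(text, out var localized) ? localized : text;

    public void Dispose() => current = null;
}
```

The publisher would then wrap each DTO transformation in a using block over LocalizationContext.Enter(...), once per target language.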

Does that make sense?

Alex: Yeah, that's perfect. 

Yan:

So with this, we can then just multicast the aspect to all of our DTO objects, which have the convention of having the suffix VO, for legacy reasons. And through this one line of code, plus the 30 we just saw, it pretty much covers over 90% of the localization work we had to do. And as we create new domain objects and new types, those types will be localized automatically, without us having to do additional work. 
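That one line of multicasting looks roughly like this (the namespace here is invented):

```csharp
// Apply LocalizeAttribute to every string field/property of every type
// whose name ends in "VO", across the whole assembly.
[assembly: Localize(AttributeTargetTypes = "HereBeMonsters.DTOs.*VO")]
```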

So with that, we can eliminate the whole up-front development cost, because the whole thing took me less than an hour to implement. And because we're multicasting the attribute to all DTO types, any time we add a new DTO type in the future, it will be localized automatically by default. 

Again, you have more automation, so there's less chance for regressions to creep in, because people are not having to constantly implement new things by hand. You can still have regressions, but in my experience it's far less likely. And since we implemented localization this way, we actually haven't had any localization-related regressions or bugs at all, which is pretty cool for not a lot of work. The combined effect of all of these changes is far less pressure on your QA team to test the changes that you are making to the game: new quest lines, new storylines, as well as UI changes, server changes, and localization as well. So they can better focus their time and effort on testing things that have actually changed and are likely to cause problems. 

Q&A

“Okay, well,” you may ask, “how do I exclude a DTO type from the localization process?” Fortunately, there's a built-in mechanism for doing that, where you can just use the AttributeExclude property on particular types. In this case, I know the leaderboard player DTO only has IDs, such as the profile ID, and the name of the user, none of which should be localized. Therefore, we can simply exclude this guy from the whole localization process.
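For example (the type and its members here are guesses based on the description):

```csharp
// Opt this DTO out of the multicast localization aspect:
// IDs and user names must never be translated.
[Localize(AttributeExclude = true)]
public sealed class LeaderboardPlayerVO
{
    public string ProfileId { get; set; }
    public string UserName { get; set; }
}
```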

Then, you may ask, “well, where do you get these gettext files from?”, which is a great question. As I mentioned earlier, we actually store those gettext files as part of TNT, so that when we do a publish from there, the localization files are included as well. And to get those files into TNT, there's actually a page in the tool where the game designer can go in and make the changes once they're happy with all the content. At that point, we say, “okay, now let's localize all the new quest lines that we've just created.” 

And there's a button that we click, which takes the existing localization file, because we don't want to re-localize the same text if it hasn't changed. We actually use comments to put a unique identifier on each piece of text, so that we can tell when a particular dialogue, or name, or description has changed, and reset its entry in the gettext file. The new gettext file is then sent over to the translators, who, with their tools, are able to pick out just the new strings that they need to translate. They only charge us for the new strings that they have to translate, not for everything else that we send them. 

Once they send back the translated PO file, we upload it into TNT, and when we do the next publish, it will have all the localization for the new content. When we release new content, there is a bit of time where the English version is ahead of the Brazilian Portuguese version; so if a player is up to the latest quest, chances are they will end up playing part of the game in English instead of the translated version. 


With that, that's everything I've got, and thank you very much for listening. 


About the speaker, Yan Cui

Yan Cui

Yan Cui is a Server Architect Developer at Yubl and a regular speaker at code camps and conferences around the world, including NDC, QCon, Code Mesh, and Build Stuff. Day-to-day, Yan works primarily in a mixture of C# and F#, but he has also built some components in Erlang. He's a passionate coder and takes great pride in writing clean, well-structured code.
Yan's blog.