Do you know how to write very fast C# code? Here's a sobering fact: many schools and universities only teach how to write valid C# code, and not how to write fast and efficient code.

Did you know that adding strings together inefficiently can slow down your code by a factor of more than two hundred? And ’swallowing’ exceptions will make your code run a thousand times slower than normal.

Slow C# code is a big problem. Slow code on the web will not scale to thousands of users. Slow code will make your Unity game unplayable. Slow code will have your mobile apps catching dust in the app store.

In this session, our guest speaker Mark Farragher will show you many common performance bottlenecks and how to fix them. We’ll introduce each problem, write a small test program to measure the baseline performance, and then learn how you can radically speed up the code.

Watch the webinar and learn:

  • The low-hanging fruit: basic optimizations
  • How to read compiled MSIL code
  • The struct versus class debate
  • Optimize for the garbage collector
  • Writing directly into memory with unsafe pointers
  • Use dynamic delegates to dramatically speed up reflection

 

For source code of the examples, please email Mark at mark@mdfarragher.com

Q & A

Q: Why do 9% of the numbers throw exceptions?

A: Several viewers have pointed out that the 9% number I mention in the webinar is incorrect. Here is the correct calculation:

I’m building numbers from individual digits. There are 11 possible digits: 0-9 and the letter ‘X’. So, the chance of a single digit being invalid is 1/11. A number consists of 5 digits, so the chance of a number containing at least one invalid digit is 1 - (10/11)^5, or about 38%. (The simpler estimate (1/11) * 5 = 45% counts numbers with more than one invalid digit several times.) The loop in my code will fail roughly 38% of the time and throw an exception.

Q: How to get mastery in reflection and dynamic code?

A: By practicing a lot. Write lots of code that uses reflection and dynamic emitting. Experiment, measure performance, see how far you can go optimizing your code. Play around and discover what works and what doesn’t. Plus: read lots of blog posts and articles.

Q: Why would it not be beneficial to use structs for all simple business objects? Is there a point of degradation or some limitation over a class? Is a struct usable with Entity Framework to represent database objects?

A: The .NET Runtime makes certain assumptions about structs and classes, specifically that structs will be very small (in terms of memory space) and have a short lifetime, and classes will either be small or large and have a long lifetime. Simply replacing all classes with structs in your code is dangerous because you will go against these assumptions. For example, if you change a long-living object to a struct, it will get boxed on the heap and your code will be even slower than when using classes. A struct also gets copied during each method call, so passing a very large struct to many different methods will slow down your code a lot.

The rule of thumb here is to always start with classes, and only use structs when it makes sense to do so.

The Entity Framework does not support structs.

Q: Can we use the DynamicMethod trick on AOT platforms (via Mono)?

A: Nope. The ILGenerator class is missing, so you can’t emit your own CIL code into the dynamic method. Makes sense, right? It couldn’t possibly work with AOT.

Q: CIL stuff is really interesting. Perhaps worth mentioning that string interpolation and string.Format use StringBuilder, so you don't always need to explicitly use StringBuilder. Also, StringBuilder has a little overhead, so for <4 strings something like str1 + str2 + str3 is faster, I think.

A: Correct! String interpolation ($"yadda {yadda}") compiles to a String.Format call, so it’s exactly the same thing. I always use interpolation because it’s so much easier to type.

You’re also spot-on with the string versus StringBuilder comment. A StringBuilder has some initialization overhead, so it is actually slower for a small number of additions. The cutoff point is at 3 additions. For zero to three additions the string is faster, for four and more the StringBuilder is faster. For larger numbers of additions, they start to diverge very quickly.

In my logging and diagnostic code, I always use strings (string interpolation) because I usually stay below the 3-addition limit, and it makes my code so much easier to read.
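As a rough illustration of the two approaches being compared (a minimal sketch, not the code from the webinar; the class and method names are made up for this example), both methods below build the same string. The first allocates a new string on every addition, the second appends into a buffer:

```csharp
using System.Text;

public static class ConcatDemo
{
    // Builds a string of `count` characters by repeated string addition.
    public static string WithString(int count)
    {
        string result = string.Empty;
        for (int i = 0; i < count; i++)
            result += "a";              // allocates a new string each time
        return result;
    }

    // Builds the same string with a StringBuilder.
    public static string WithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
            builder.Append("a");        // appends into an internal buffer
        return builder.ToString();
    }

    // A handful of pieces: interpolation keeps the code readable.
    public static string LogLine(string user, int items) =>
        $"User {user} has {items} items";
}
```

Where exactly the two approaches cross over depends on the runtime and the string sizes, so it is worth timing both on your own data.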

Q: For exceptions, what if TryParse is not available? What needs to be done for user-defined types instead of primitives?

A: You need to do the same thing TryParse does internally: scan the input data first, and only start parsing if the scan says it’s okay. Also make sure you report a parsing failure through the return value (i.e. a bool) instead of throwing a FormatException.

An easy way to scan is by using a precompiled regular expression to make sure the input data doesn’t contain any invalid characters. Regular expressions are super-fast.
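Here is a minimal sketch of that pattern for a user-defined type. The `ProductCode` type and its "AB-1234" format are hypothetical and only serve as an illustration; the point is the scan-then-parse structure and the bool return value:

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical user-defined type: a product code such as "AB-1234".
public sealed class ProductCode
{
    // Precompiled pattern: two uppercase letters, a dash, four digits.
    // \z anchors at the very end of the input (no trailing newline allowed).
    private static readonly Regex Pattern = new Regex(
        @"^[A-Z]{2}-[0-9]{4}\z",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    public string Prefix { get; }
    public int Number { get; }

    private ProductCode(string prefix, int number)
    {
        Prefix = prefix;
        Number = number;
    }

    // Scans the input first and reports failure through the return value
    // instead of throwing a FormatException.
    public static bool TryParse(string input, out ProductCode result)
    {
        result = null;
        if (input == null || !Pattern.IsMatch(input))
            return false;

        // The scan has passed, so the parsing below cannot fail.
        string prefix = input.Substring(0, 2);
        int number = int.Parse(input.Substring(3), CultureInfo.InvariantCulture);
        result = new ProductCode(prefix, number);
        return true;
    }
}
```

Callers then write `if (ProductCode.TryParse(text, out var code)) { ... }` and no exception is thrown on the failure path.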

Q: Any comment about the differences between copying arrays, lists, C# hash tables, etc. on the heap?

A: In terms of memory layout, there’s not that much difference between an array, a list, or a hashtable. All three use arrays internally to hold the data. A hashtable is optimized for key/value lookup, whereas list and array are intended for indexed access.

They all have a CopyTo method that attempts to block-copy all data in one go. If you’re storing value types, you will see great performance for all three.
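To make the comparison concrete, here is a small sketch (helper names are invented for this example) that copies an array with an indexed loop and with CopyTo, and a list with its own CopyTo:

```csharp
using System.Collections.Generic;

public static class CopyDemo
{
    // Copies element by element with an indexed loop.
    public static int[] CopyWithLoop(int[] source)
    {
        var target = new int[source.Length];
        for (int i = 0; i < source.Length; i++)
            target[i] = source[i];
        return target;
    }

    // Copies with Array.CopyTo, which hands the whole copy to the runtime.
    public static int[] CopyWithCopyTo(int[] source)
    {
        var target = new int[source.Length];
        source.CopyTo(target, 0);
        return target;
    }

    // List<T> offers the same CopyTo pattern over its internal array.
    public static int[] CopyList(List<int> source)
    {
        var target = new int[source.Count];
        source.CopyTo(target);
        return target;
    }
}
```

All three produce the same result; only the amount of per-element work done in managed code differs.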

Q: Are you going to review LINQ / Parallel performance someday?

A: That’s a great idea! Thanks for the suggestion. I have an existing course already that scratches the surface of LINQ versus PLINQ performance, but I’d love to go deeper.

Q: Nice talk. BTW, StringBuilder may not be the fastest; it depends on the size etc. You have to calculate the GC allocations also. The best tool for that is BenchmarkDotNet with the memory diagnoser on Windows! It is a fantastic tool. General rule of thumb: whatever you do, you have to measure in order to see performance benefits.

A: Thanks for the suggestion. I’ll check out BenchmarkDotNet. And you’re right about the rule of thumb – you always have to do actual measurements, you can’t rely on just theoretical knowledge to optimize your code.

Q: Just a question on array.CopyTo(...) where Mark said that the memory copy was done out of process by the OS (in C libs, guessing "memcpy"). In the profiling application during the webcast, array.CopyTo(...) executed in 32ms, whereas the copy via index and loops took >300ms. In other words, using array.CopyTo is an order of magnitude faster with OSX as the OS. Is the 10-fold difference "about" the same with .NET on Windows? Different OS, different ratio?

A: Yes, the ratio is roughly the same. The speed of a memory copy is more or less the same for all operating systems, whereas you might see small differences in 1-dimensional array performance. I’ve noticed that .NET Core tends to be slightly faster than Mono in handling arrays, because it’s much better optimized.

Q: I measured. GetType() is 171 ms vs. typeof() at 6 ms in a test of a million iterations.

A: That’s because typeof() is processed at compile-time, whereas GetType() is processed at runtime.
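A tiny sketch of the difference (names are made up for this example): `typeof` takes a type name that is fixed in the source code, while `GetType()` asks the object itself, so it can return a more specific type than the declared one.

```csharp
using System;

public static class TypeLookup
{
    // typeof takes a type name that is fixed at compile time.
    public static Type StaticType() => typeof(object);

    // GetType() is a method call on the instance and returns its actual type.
    public static Type RuntimeType(object value) => value.GetType();
}
```

Passing a string to `RuntimeType` returns `typeof(string)`, even though the parameter is declared as `object`.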

Q: How do you keep yourself up to date on the latest and greatest technology?

A: I read lots of technical blogs, and when I’m preparing for a new course or webinar, I do a lot of research and write small test programs to experiment. And I probably have a talent for learning new stuff very quickly.

Q: Would you use some form of multi-dimensional converter to convert a single dimensional array back to a multi-dimension array or would you take another approach?

A: It depends on the use case. I usually just wrap a 1-dimensional array so from the outside it looks like the original multi-dimensional array. The disadvantage of converting the other way is that you’re slowing the code down again, so I am a bit hesitant to use any kind of converter.
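One way such a wrapper could look (a sketch under my own assumptions; the `Grid` name and row-major layout are illustrative, not taken from the webinar code):

```csharp
using System;

// Hypothetical wrapper: stores a rows x columns grid in one flat array,
// but exposes a two-dimensional indexer to callers.
public sealed class Grid
{
    private readonly int[] _data;

    public int Rows { get; }
    public int Columns { get; }

    public Grid(int rows, int columns)
    {
        if (rows < 0 || columns < 0)
            throw new ArgumentOutOfRangeException();
        Rows = rows;
        Columns = columns;
        _data = new int[rows * columns];
    }

    public int this[int row, int column]
    {
        get => _data[Index(row, column)];
        set => _data[Index(row, column)] = value;
    }

    // Row-major mapping from (row, column) to the flat index.
    private int Index(int row, int column)
    {
        if ((uint)row >= (uint)Rows || (uint)column >= (uint)Columns)
            throw new IndexOutOfRangeException();
        return row * Columns + column;
    }
}
```

Callers write `grid[1, 2] = 42;` exactly as they would with an `int[,]`, while the storage stays one-dimensional.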

Q: Do you have any advice for Parallel.ForEach?

A: Yeah, use it! Parallel.ForEach is great for parallelizing regular for or foreach loops. It is my first step in parallelizing code, and quite often it’s all I need to do.

Two years ago, I wrote an app that processes SharePoint documents. I had a for-loop in my code that would process each document individually. I parallelized the code simply by replacing my for-loop with a Parallel.ForEach. This drop-in replacement to make code multi-threaded is really nice.
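The drop-in replacement could look roughly like this. It is a sketch only: `Process` is a stand-in for the real per-document work, and the documents are plain strings here. Note that the loop body runs on several threads at once, so anything it shares needs a thread-safe update:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class DocumentProcessor
{
    // Placeholder for the real per-document work (hypothetical).
    private static int Process(string document) => document.Length;

    // Original version: one document at a time.
    public static long TotalSequential(IEnumerable<string> documents)
    {
        long total = 0;
        foreach (string document in documents)
            total += Process(document);
        return total;
    }

    // Parallel version: same loop body, run concurrently.
    public static long TotalParallel(IEnumerable<string> documents)
    {
        long total = 0;
        Parallel.ForEach(documents, document =>
        {
            // Shared state must be updated atomically.
            Interlocked.Add(ref total, Process(document));
        });
        return total;
    }
}
```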

Q: Have you tried these performance tests on .NET Core?

A: Yeah. Everything I show you in the webinar is running on .NET Core 1.1.

Q: Do foreach loops have a performance advantage over for loops in cases where the collection is already an enumeration or a function that uses yield return?

A: No. Enumerations or methods with yield return cannot be indexed and they don’t have a well-defined upper limit, so there’s no benefit using a for-loop with them. If you do try to use a for loop, you’d have to manually access MoveNext() and Current, and this would be the exact same code the compiler produces when you use foreach.
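The following sketch shows the equivalence (method names are invented for this example). `SumManually` is roughly what the compiler generates for the `foreach` in `SumWithForeach`:

```csharp
using System.Collections.Generic;

public static class EnumerationDemo
{
    // A yield-return method: it has no indexer and no Count to loop over.
    public static IEnumerable<int> Squares(int count)
    {
        for (int i = 1; i <= count; i++)
            yield return i * i;
    }

    public static int SumWithForeach(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int value in source)
            sum += value;
        return sum;
    }

    // Manual version using MoveNext() and Current.
    public static int SumManually(IEnumerable<int> source)
    {
        int sum = 0;
        using (IEnumerator<int> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
                sum += enumerator.Current;
        }
        return sum;
    }
}
```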

Q: Is there any significant difference between pre- and post-increment operations? In C++ I am accustomed to always doing ++i in preference to i++, but I rarely see this being done by C# developers.

A: It works exactly the same as in C++. The difference between the two is the return value: i++ returns i before the increment, and ++i returns i after the increment.
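A short sketch that makes the two return values visible (the helper names are hypothetical):

```csharp
public static class IncrementDemo
{
    // Post-increment: the expression yields the old value of i.
    public static (int Result, int After) Post(int i)
    {
        int result = i++;
        return (result, i);
    }

    // Pre-increment: the expression yields the new value of i.
    public static (int Result, int After) Pre(int i)
    {
        int result = ++i;
        return (result, i);
    }
}
```

In both cases the variable ends up one higher; only the value of the expression differs.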

Q: Does the performance benefits you described for structs vs classes get lost when comparing the performance of passing classes vs structures to other functions (excluding cases where structs are being passed by reference)?

A: Passing structs to functions will slow down your code, because structs are copied by value. For every method call the entire struct will be cloned in memory. When you’re using classes, only the reference to the object instance is copied into the method.

So yes, for large structs with lots of fields you’ll see a measurable slowdown when doing lots of method calls with struct parameters.
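The copy semantics can be seen in a small sketch (the type and method names are made up for illustration). Passing the struct by value hands the method its own copy, passing it with `ref` avoids the copy, and passing a class instance only copies the reference:

```csharp
public struct PointStruct
{
    public int X;
    public int Y;
}

public sealed class PointClass
{
    public int X;
    public int Y;
}

public static class PassingDemo
{
    // Receives a copy of the struct: the caller's value is unchanged.
    public static void MoveByValue(PointStruct point) { point.X += 10; }

    // Receives a reference to the caller's struct: no copy is made.
    public static void MoveByRef(ref PointStruct point) { point.X += 10; }

    // Receives a copy of the reference: the caller's object is modified.
    public static void Move(PointClass point) { point.X += 10; }
}
```

With only two fields the copy is cheap; the slowdown described above shows up when the struct has many fields and the method is called very often.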

Q: What is the difference between the heap and the stack?

A: The stack is a highly-optimized block of memory intended for data with a very short lifetime, just for the duration of a single method call. Stack memory is created when you enter a method and gets cleaned up when you exit out of a method. The stack is also fairly small, typically on the order of 1MB per thread, depending on the platform. It’s optimized for a manageable number of small objects (thousands, not millions) with a very short lifetime.

The heap is a very large block of memory (multiple GBs) optimized for long-term storage. You can easily put millions of objects on the heap, and they can be either small or large. The heap has a special internal process for archiving long-lived data, and there’s a separate process called the Garbage Collector that cleans up objects that are no longer in use.

As a rule of thumb, the stack is slightly faster than the heap. It can also very quickly initialize new data by writing zeroes directly to memory (the heap calls the constructor of each object individually). The disadvantage of the stack is that it’s relatively small, and it assumes your data will be short-lived. The stack can also slow down if you have a very deep chain of nested method calls.

Q: Why and when we use reflection?

A: We use reflection when we want to dynamically access object fields or call object methods. By ‘dynamically’ I mean based on data that is not known at compile time. For example, when we store database configuration data in a configuration file. The configuration file might say we need an OracleConnection or a SQLiteConnection. With reflection, we can read this configuration field and then dynamically instantiate the correct object.

Basically, any time an object type, property, field or method appears somewhere in text format, we’re going to need reflection to perform instantiation, access fields and properties, or execute a method call.
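As a minimal sketch of this idea (helper names are invented; a real application would read the type name from its configuration file), the code below creates an object and reads a property using only names supplied as text:

```csharp
using System;
using System.Reflection;

public static class ReflectionDemo
{
    // Creates an instance from a type name that is only known as text.
    // Assumes the type has a public parameterless constructor.
    public static object CreateByName(string typeName)
    {
        // throwOnError: true raises an exception if the name cannot be resolved.
        Type type = Type.GetType(typeName, throwOnError: true);
        return Activator.CreateInstance(type);
    }

    // Reads a public property whose name is only known as text.
    public static object GetProperty(object target, string propertyName)
    {
        PropertyInfo property = target.GetType().GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("Unknown property: " + propertyName);
        return property.GetValue(target);
    }
}
```

For example, `ReflectionDemo.CreateByName("System.Text.StringBuilder")` returns a new StringBuilder. Types outside the core library generally need an assembly-qualified name.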

Q: What does emit mean?

A: Emit means injecting a single CIL instruction into a dynamic method.
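For illustration, here is a small sketch (not the example from the webinar) that emits four CIL instructions into a DynamicMethod to build a function that adds two integers. It needs a platform with a JIT compiler, so it will not work under AOT:

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Builds a delegate equivalent to (a, b) => a + b by emitting CIL.
    public static Func<int, int, int> BuildAdder()
    {
        var method = new DynamicMethod(
            "Add", typeof(int), new[] { typeof(int), typeof(int) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push the first argument
        il.Emit(OpCodes.Ldarg_1);   // push the second argument
        il.Emit(OpCodes.Add);       // pop both, push their sum
        il.Emit(OpCodes.Ret);       // return the value on the stack

        return (Func<int, int, int>)method.CreateDelegate(
            typeof(Func<int, int, int>));
    }
}
```

Each `il.Emit(...)` call adds exactly one instruction to the method body.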

Q: What do you mean by baseline test?

A: A baseline test is a performance test of un-optimized code, to get a baseline performance value.
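A baseline measurement can be as simple as the sketch below. The `Baseline` helper is hypothetical, and the warm-up run and best-of-N strategy are my own assumptions rather than the method used in the webinar:

```csharp
using System;
using System.Diagnostics;

public static class Baseline
{
    // Runs the action once as a warm-up, then `repetitions` more times,
    // and returns the fastest timed run.
    public static TimeSpan Measure(Action action, int repetitions = 5)
    {
        if (action == null)
            throw new ArgumentNullException(nameof(action));
        if (repetitions < 1)
            throw new ArgumentOutOfRangeException(nameof(repetitions));

        action();   // warm-up so JIT compilation is not part of the timing

        TimeSpan best = TimeSpan.MaxValue;
        for (int i = 0; i < repetitions; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            if (stopwatch.Elapsed < best)
                best = stopwatch.Elapsed;
        }
        return best;
    }
}
```

You would call `Baseline.Measure` once on the un-optimized code and again after each optimization, then compare the two numbers.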

 

About the speaker, Mark Farragher

Mark Farragher is a blogger, investor, serial entrepreneur, and the author of 10 successful IT courses in the Udemy marketplace. His IT career spans 2 decades and he has worn many different hats over the years.

Mark started using C# and the .NET framework 15 years ago, and creates online courses that make complex C# programming topics easy to understand and accessible to anyone.