SlimTune Profiler for .NET

I basically took last week off from blogging. Time to try and get some new entries out! Things have been very SlimDX focused, but what did you expect? It’s what I do. Maybe today’s will inspire a bit more general interest.

As a creature of GameDev.Net, I get to see lots and lots of discussions questioning whether or not C# and .NET are “fast enough” for games. What I don’t see much of is people actually analyzing and tuning the performance of their .NET code to see what’s going on. I’m not sure why this is, but I have a theory that it’s partly because of the sorry state of available performance tools. The only version of VS that has a profiler is Team Edition, which damned near nobody has. Other commercial offerings are also seriously expensive. There are only two free profiling tools that are really available for use: CLR Profiler and NProf. (I’ve seen a few other tools, but it’s clear that they’re fringe tools that aren’t well supported.)

CLR Profiler is written by Microsoft, and it’s a pretty good tool. They’ve even released the source code, although the licensing is vague. It has a few drawbacks though. First of all, it only does memory profile analysis. It does a very good job of tracking allocations and garbage collections, and the visualizations are very well done too. But that’s all you get — no timings of any kind, let alone a breakdown of where time is being spent. Also, it hasn’t been updated since late 2005.

Then there’s NProf. Oh dear. The good news is it works, barely. The bad news is that’s the only favorable comment I have about it. It does simple sampling-based profiling only, and will show you a simple tree-based breakdown of time spent. It’s not that NProf is useless; I’ve done lots of good performance tuning with it. But this is literally all it can do, and there’s a lot more you want from a profiler. The last release was December 2006; there’s been some scattered SVN traffic since then, but it’s basically dead. Support for x64 is apparently doable if you compile from source. I looked at the source, which is also poorly written. I decided immediately that I could do better than this toy, and now I’m putting my money where my mouth is.

I’m working on a new open source profiler tool right now called the SlimTune Profiler. It will probably release in early September, and the initial feature set is taking direct aim at NProf. The initial version will support sampling and instrumentation profiling for .NET 2.0 and above on local and remote machines. A little later on, you’ll be able to profile-enable a long running process at zero performance cost, and then profile it in real time for short periods. Imagine running a production server, and actually connecting with the profiler while it’s serving real requests to see what’s happening.
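To make the sampling side concrete: a sampling profiler periodically captures the running program’s call stack and tallies where the time goes, instead of instrumenting every method. Here is a toy sketch of the technique in Python (the real SlimTune backend is native code built on the .NET profiling API, so everything below, down to the function names, is purely illustrative):

```python
import collections
import sys
import threading
import time

def sample_profile(fn, interval=0.001):
    """Run fn() while a background thread samples the main thread's
    stack, counting which function is executing at each tick."""
    counts = collections.Counter()
    main_id = threading.get_ident()
    done = threading.Event()

    def sampler():
        while not done.is_set():
            # Snapshot the main thread's current frame and tally the
            # function on top of its stack.
            frame = sys._current_frames().get(main_id)
            if frame is not None:
                counts[frame.f_code.co_name] += 1
            time.sleep(interval)

    t = threading.Thread(target=sampler, daemon=True)
    t.start()
    try:
        fn()
    finally:
        done.set()
        t.join()
    return counts

def busy():
    # Deliberately CPU-heavy so the sampler catches it often.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

profile = sample_profile(busy)
print(profile.most_common(3))  # 'busy' should dominate the samples
```

A real sampler records whole stacks and aggregates them into the call tree a profiler displays; this sketch only counts the topmost function, but the principle, and the reason sampling barely perturbs the target, is the same.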

On the front-end, data will be collected from the profiling backend and dropped into an embedded relational database. There will be some preset views of the data, but the idea here is that you should be able to apply your own queries to the data and get results that are useful to you. Reporting is not expected for the initial version, but it will be supported eventually as well. I imagine you’ll be able to create various tables, graphs, etc and export them, although I’m not sure exactly what format that’ll be in. PNG and Excel seem reasonable. I’m hoping that you’ll be able to combine results from multiple runs, which would allow you to make all kinds of snappy graphs to show off to your boss.
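As a sketch of what “your own queries” could look like: once samples land in a relational table, a custom view is just SQL. The schema and numbers below are entirely made up for illustration, using SQLite from Python:

```python
import sqlite3

# Hypothetical schema (my invention, not SlimTune's actual one):
# one row per profiler sample, recording the function hit and its cost.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (function TEXT, ms REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("Render", 12.0), ("Render", 9.5), ("Update", 3.0),
     ("Render", 11.0), ("Physics", 6.5)],
)

# A custom view: hit count and total time per function, hottest first.
rows = conn.execute("""
    SELECT function, COUNT(*) AS hits, SUM(ms) AS total_ms
    FROM samples
    GROUP BY function
    ORDER BY total_ms DESC
""").fetchall()
for function, hits, total_ms in rows:
    print(f"{function}: {hits} hits, {total_ms:.1f} ms")
```

The appeal of the design is that any aggregation the tool’s authors didn’t anticipate, per-thread breakdowns, filtering by module, comparing runs, is one GROUP BY away.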

It’s been my plan for some time now to expand beyond SlimDX, and create a suite of Slim software. We’ve got a good reputation and lots of respect for our work, and I’m looking to build on that. SlimTune is the first step. It probably won’t be able to compete with the commercial offerings — but RedGate ANTS runs $400 or more per license. SlimTune will blow NProf out of the water in a scant two months, and it won’t cost you a dime. The feature set is pretty well specified, and the profiler already works in prototype form. The work over the coming weeks is in building a product instead of a project.

And yes, I know I’m a tease. It’ll be worth the wait.


9 thoughts on “SlimTune Profiler for .NET”

  1. Really loved the post. I’ve often noticed the lack of good profiling tools. =/ I was spoiled while working at the local courthouse, which had a license to pretty much every Microsoft product, including VSTeam. Profiling, the class diagramming stuff, metrics like cyclomatic complexity… It was a while before I could do any programming independently again. Good to see that you’re taking a step to break their monopoly. 🙂

  2. This looks and sounds really awesome. I have been using NProf recently, and while it is functional, it is quite lacking. Keep up the good work; I can’t wait to give it a try!

  3. I wasn’t previously aware of it. Having tried it, they have some nice UI touches — and holy mother of god it’s slow. The SlimTune test app is a modified version of the Agg.Xna sample that is very CPU heavy. Each frame takes 200ms to render under normal conditions, and SlimTune in sampling mode does not perturb that.

    Profiling it under EQATEC, each frame takes 30 seconds to render. Sorry, but I’m not impressed when I have to profile at 2 frames per minute. And this is a really simple app, too. (SlimTune’s instrumentation engine isn’t done, but the estimated overhead is 10x or less, versus EQATEC’s 150x.)

  4. I ended up dropping $150 for a copy of DotTrace so I could reliably profile my XNA games. If you can offer something that’s better than NProf and make the source available, I’ll gladly start using it and contribute bug fixes/improvements.

    If you can keep the instrumentation cost low, that will be especially useful: most instrumenting profilers bring my game’s framerate down so significantly that they’re not useful, and the overhead of the instrumentation also throws off the results a lot. Either way, as long as you have a good sampling mode, I will be able to get use out of it.

    Also, if you can find a way to make it possible to run your instrumentation engine in an offline mode, that would be amazing – even if I have to do some legwork to make it work for my game, that would let me run an instrumented build on the 360 to get timing data, which would be a godsend since there are no profiling tools for the 360 yet.

  5. A few comments on the EQATEC Profiler. (disclaimer: I’m the author of it)

    Yes, CPU-intensive apps can cause a big overhead. For “ordinary” .NET apps the overhead is far, far lower than this reported 150x (sometimes even barely noticeable), but maybe games/XNA programming presents a hard challenge for this instrumenting profiler. Anyway, I have a few hints that might make it useful after all:

    1) You don’t need to profile everything, but can refrain from instrumenting individual assemblies, classes, or methods.
    2) Profiling of extremely small methods (typically set/get of a single variable) is by default turned off. If you’re dealing with CPU-intensive apps, then make sure you’ve not accidentally turned this option on, as it could have just the effect you’re seeing, i.e. the profiled code grinding to a halt.
    3) The profiler does by default work in “offline mode” and will save its profiling report, in plaintext xml, to a file and folder of your choice. This might be useful for you, Kevin.

    1. “Yes, CPU-intensive apps can cause a big overhead. For “ordinary” .NET apps the overhead is far, far lower than this reported 150x”

      It seems to me that CPU-intensive apps are the types of apps that people are generally going to be profiling – that is, if I’m working on something, and it’s *NOT* CPU-intensive, I probably have no need whatsoever to run it through a profiler. People generally don’t profile something unless it’s using up all of the CPU 🙂

      It seems to me that, what you’re saying is, the profiler is going to add a ton of overhead, but only in the cases in which the profiler is actually needed.

      Don’t get me wrong – I’ve never used EQATEC, so I can’t speak as to the quality of it…it’s just that this particular statement struck me as somewhat funny 🙂

      1. In a sense instrumentation does add a ton of overhead, yes, about 70-80 CIL instructions of timing-code for every method. That’s why it makes little sense to instrument single-variable get/set-methods, as they are typically around 3-5 CIL instructions long and are never in themselves the bottleneck.

        And frankly, given this overhead I was originally quite surprised that profiled apps ran okay at all, and most often even ran quite fast. For instance, I’ve profiled the profiler’s instrumentation-phase itself, where it certainly does a lot of work, and the profiled version was only about 50% slower than the non-profiled version. (It also revealed a bottleneck that allowed us to improve performance of that phase significantly.)

        So while adding all this extra timing-code theoretically seems like a big overhead it apparently (and again, frankly to my own surprise) isn’t a problem in practice for many, many users.
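To illustrate the point being made in this thread: the injected timing code is essentially a prologue and epilogue wrapped around every method, and for a trivial getter that wrapper can cost far more than the method body itself. A Python decorator mimics the idea (the real mechanism is injected CIL, and real profilers time far more carefully; all names here are invented):

```python
import time

timings = {}  # per-function (accumulated seconds, call count)

def instrument(fn):
    """Mimic an instrumenting profiler: time every call to fn.
    This prologue/epilogue is the overhead under discussion; for a
    tiny getter it can dwarf the method body itself."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()                # injected prologue
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start  # injected epilogue
            total, calls = timings.get(fn.__name__, (0.0, 0))
            timings[fn.__name__] = (total + elapsed, calls + 1)
    return wrapper

@instrument
def get_x():
    return 42  # a trivial getter: the timing code outweighs the body

for _ in range(3):
    get_x()
print(timings["get_x"])  # (accumulated seconds, 3)
```

This is also why skipping tiny get/set methods, as described above, matters so much: the fixed per-call cost is the same whether the body does real work or not.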
