PhysX and Hardware Acceleration

A more accurate title would be PhysX and its lack of hardware acceleration. In retrospect this is largely my fault for believing marketing hype without looking into details, but I figured that I’d discuss it. Finding resources on the subject was difficult, so I think a lot of people may be suffering from the same delusion as I was.

Super short version: PhysX computes its rigid body simulation strictly in software. The GPU never gets involved.

Let’s step back for a second and look at the history. It starts with a library called NovodeX, which was a popular physics package around the same time Half-Life 2 launched with Havok. A semiconductor company called Ageia acquired it to build Physics Processing Unit (PPU) cards, an idea that got a lot of attention at the time. The hardware was overpriced, mismanaged, and probably doomed in the end regardless. The effort started sinking pretty quickly until NVIDIA swooped in and bought the whole thing. NV then announced that the PPU project would be scrapped and that hardware acceleration of physics would be done on the GPU.

The current state of PhysX is somewhat confusing. There are plenty of PhysX-accelerated games. Benchmarks of these games show dramatic performance gains when running with hardware-supported physics. There have been recent allegations that PhysX’s CPU performance is unreasonably poor, possibly to strengthen the case for hardware acceleration. I don’t have any comment on that mess, but it’s part of the evidence suggesting that hardware is the real focus of PhysX and NVIDIA. That’s not a surprise.

What is a surprise is that the rigid body simulation, the important part of the physics, is not hardware accelerated. Apparently it was when the PPU rolled out, but the GPU-based acceleration does not support it. Look at any game that advertises PhysX and you’ll notice gobs of destruction, cloth, fluids, etc. That’s because those are the only GPU-accelerated effects PhysX supports (plus a few miscellaneous things like soft bodies). The big tip-off is probably that none of these effects require forces to be imparted back to the main physics scene in any way. This is strictly eye candy.

So the interactive rigid body simulation, the part that actually affects gameplay, is completely in software. And if you believe the claims, it’s not even done well in software. All these problems will apparently be fixed in a magical 3.0 release, coming at some vague point in the future. Why? My best guess is that no one has paid any attention to the core PC code in six years. I’d wager that everyone has been so obsessed with hardware acceleration, and that the basic problem of writing a rigid body solver is considered so stupidly easy, that we’re simply coasting on the same 2004-era NovodeX code that made the library popular in the first place. Version 3.0 is probably a ground-up rewrite.

Don’t get me wrong. PhysX is not bad. It is simply stagnant. Take a recent game and strip away the GPU-driven effects candy. What exactly is left in the interactive part of the simulation that wasn’t already in Half-Life 2? That was also a 2004 release. NVIDIA did what they do best: visual effects. Marketing also did what it does best: letting everybody assume something untrue without ever actually saying it.

Anyway, now you know where hype ends and fact begins. Rigid body game physics is not hardware accelerated; only special effects that fall pretty loosely into the category of “physics” are. Maybe that’s common knowledge, but it was news to me.

Update: This presentation about Bullet’s GPU acceleration is a good read.
Update 2: I’m wrong on one technical point — PhysX’s hardware accelerated systems can impart forces back to the main scene, and their cloth shows off this capability in the samples.

NHibernate Is Pretty Cool

My last tech post was heavily negative, so today I’m going to try and be more positive. I’ve been working with a library called NHibernate, which is itself a port of a Java library called Hibernate. Both are very mature, long-standing object/relational mapping systems that I’ve only recently started exploring.

Let’s recap. Most high-end storage requirements, and nearly all web site storage, are handled using relational database management systems, or RDBMS for short. Work on these began around 1970, along with the now-ubiquitous SQL language for working with them. The core SQL standard was laid down in 1992, though most vendors provide various extensions for their specific systems. Ignoring some recent developments, SQL is the gold standard for working with relational databases.

When I set out to build SlimTune, one of my ideas was to eschew the fairly crude approach most performance tools take to storage and build it around a fully relational database. I bet that I could make it fast enough to be usable for profiling while also being more expressive and flexible. The ability to view the profile live as it evolves derives directly from this design choice. Generally speaking I’m really happy with how it turned out, but there was one mistake I didn’t recognize at the time.

SQL is garbage. (Dammit, I’m being negative again.)

I am not bad at SQL, I don’t think. I know for certain that I am not good at it, but I can write reasonably complex queries and I’m fairly well versed in the theory behind relational databases. The disturbing part is that SQL is very inconsistent across database systems. The standard is missing a lot of useful functionality (string concatenation, result pagination, and so on), and when you’re using embedded databases like SQLite or SQL Server Compact, various pieces of the language are just plain missing. Databases also differ in more subtle ways: which operations may or may not be allowed, how joins are set up, and even syntactic details like how to refer to tables.
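To make the pain concrete, here is a minimal sketch of the dialect problem. The table and column names (Samples, ClassName, MethodName, HitCount) are hypothetical, not SlimTune’s actual schema; the point is how the “same” query shifts between engines.

```csharp
// The "same" top-twenty query, written twice.
static class Queries
{
    // SQLite: string concatenation is ||, pagination is LIMIT (and OFFSET).
    public const string TopSamplesSqlite =
        "SELECT ClassName || '.' || MethodName AS FullName, HitCount " +
        "FROM Samples ORDER BY HitCount DESC LIMIT 20";

    // SQL Server family: concatenation is +, and pagination typically means TOP.
    public const string TopSamplesSqlServer =
        "SELECT TOP 20 ClassName + '.' + MethodName AS FullName, HitCount " +
        "FROM Samples ORDER BY HitCount DESC";
}
```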

SQL is immensely powerful if you can choose to support only a limited subset of database engines, or if your query needs are relatively simple. Tune started running into problems almost immediately. The visualizers in the released version rely on a carefully balanced subset of SQL that works just so on the two embedded engines it ships with. That’s not really a livable development model, especially as the number of visualizers and database engines increases. I needed something that would let me handle databases in a more implementation-agnostic way.

After some research it became clear that what I needed was an object/relational mapper, or ORM. Now, an ORM does not exist to make databases consistent; that’s mostly a side effect of what it actually does, which is to hide the database system entirely. ORMs are the most popular form of persistence layer. A persistence layer exists to let you convert “transient” data living in your code into “persistent” data living in a data store, and back again. Most code is object oriented and most data stores are relational, hence the popularity of object/relational mapping.
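The object side of the mapping is typically just a plain class. The one below is hypothetical rather than SlimTune’s real model; the virtual members are there because mappers like NHibernate generate lazy-loading proxies by subclassing entities.

```csharp
// A transient object that an ORM can persist as a row in a table.
public class FunctionInfo
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public virtual int HitCount { get; set; }
}
```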

After some reading, I picked NHibernate as my ORM of choice, augmented by Fluent NHibernate to get away from the XML mapping mess that NH normally uses. It’s gone really well so far, but over the course of all this I’ve learned that it’s very important to understand one thing about persistence frameworks: they are not particularly generalized tools, by design. Every framework, NH included, has very specific ideas about how the world ought to work. They tend to offer various degrees of customization, but you’re expected to adhere to a specific model, and straying too far from that model will result in pain.
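Roughly, a Fluent mapping for the hypothetical entity above looks like this; in plain NHibernate the same information would live in an .hbm.xml file. The table name is an assumption.

```csharp
using FluentNHibernate.Mapping;

// Maps the FunctionInfo class onto a (hypothetical) Functions table.
public class FunctionInfoMap : ClassMap<FunctionInfo>
{
    public FunctionInfoMap()
    {
        Table("Functions");
        Id(x => x.Id);
        Map(x => x.Name);
        Map(x => x.HitCount);
    }
}
```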

Persistence frameworks are very simple and effective tools, but they sacrifice both performance and flexibility to get there. (Contrast with SQL, which is fast and flexible but a PITA to use.) Composite keys? Evil! Dynamic table names? No way! I found that NHibernate was among the best when it came to letting me bend the rules, or flat out break them. Even so, Tune is a blend of NH and native database code, falling back to RDBMS-specific techniques in areas that are performance sensitive or outside the ORM’s worldview. For example, I use database-specific SQL queries to clone tables for snapshots; that’s not something you can do in NH, because the table itself is an implementation detail. I also use database-specific techniques for high-volume work, since NH is explicitly meant for OLTP rather than major bulk operations.
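When I do need to drop down, NHibernate at least makes the escape hatch convenient: you can push raw SQL through the same session. A rough sketch, with a hypothetical table name and engine-specific syntax:

```csharp
using NHibernate;

static class SnapshotHelper
{
    // Clone a table for a snapshot by handing the engine a native statement.
    // CREATE TABLE ... AS SELECT works on SQLite; other engines need their own syntax.
    public static void CloneTableForSnapshot(ISession session)
    {
        session.CreateSQLQuery(
                "CREATE TABLE Samples_Snapshot1 AS SELECT * FROM Samples")
            .ExecuteUpdate();
    }
}
```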

Despite all the quirks, I’ve been really pleased with NHibernate. It’s solved some major design problems in a relatively straightforward fashion, despite the somewhat awkward learning curve and plenty of bizarre problem solving caused by my habit of using a relational database as a relational database. It provides a query language that is largely consistent across databases, and very effective tools for building queries dynamically without error-prone string processing. Most importantly, it makes writing visualizers for Tune much smoother all around, and that means more features more quickly.
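As a sketch of what “building queries dynamically” means in practice (using the hypothetical entity from earlier), a visualizer can compose filters and ordering with the criteria API instead of gluing SQL strings together:

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Criterion;

static class VisualizerQueries
{
    public static IList<FunctionInfo> HotFunctions(ISession session, int minHits)
    {
        ICriteria query = session.CreateCriteria<FunctionInfo>();

        // Filters get added conditionally, based on UI state, with no string surgery.
        if (minHits > 0)
            query.Add(Restrictions.Ge("HitCount", minHits));

        return query.AddOrder(Order.Desc("HitCount"))
                    .SetMaxResults(20)
                    .List<FunctionInfo>();
    }
}
```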

So yeah, I like NHibernate. That said, I also like this rant. Positive thinking!

Next Step for SlimTune

Okay, it’s been a long time since I touched Tune. In fact I think I’m averaging one major blast of work about every six months. That’s terrible, but I don’t really know what to do about it. I don’t have the bandwidth to run a company and two open source projects at once. Even my involvement with SlimDX is much weaker than it used to be. The difference is that DX has other very competent developers to take care of it; Tune is all on me.

All the same, I’m back and working on improving the thing once again. It’s been long enough that I can’t remember exactly what the roadmap was or what I was trying to accomplish. (Protip: write down your roadmaps, people!) That gave me an opportunity to step back and really examine the state of Tune. A lot of people were pulling for memory profiling support, which I had slated for 0.3.x and planned to start fairly soon. I no longer think that’s a good idea. It’s more in my (and hopefully your) interest to make the 0.2.x series as strong as possible.

The basic problem is that SlimTune is a really, really cool program with a very mediocre implementation. I still think the ideas in there blow away a lot of the commercial profiling tools out there. All the same, I’m one person and the code has accumulated about eight weeks of full-time work since its beginning (8*40 = 320 man-hours total). It’s still a prototype. Before I can do any significant expansion of features, I first need to rebuild the foundation in a stable fashion. The database code is basically a complete loss. The UI is a rough draft. The backend is actually pretty solid, as far as I know, but it isn’t good with error reporting. Things just stop working sometimes.

This is stuff that needs to be tackled long before adding more features. That means I’m not going to see very many new users soon, and corporate adoption will be slow at best. I have several good friends on commercial products who need memory tracking, end of story. But honestly, I have a reputation for building quality work, and it’s not casually earned.

In short, the next phase of development is to build the best damn sampling profiler out there.