Promit's Ventspace

September 26, 2011

Neuroscience Meets Games

Filed under: Non-technical,Software Engineering — Promit @ 3:43 pm

It’s been a long time since I’ve written anything, so I thought I’d drop off a quick update. I was in San Francisco last week for a very interesting and unusual conference: ESCoNS. It’s the first meeting of the Entertainment Software and Cognitive Neurotherapy Society. Talk about a mouthful! The attendance was mostly doctors and research lab staff, though there were people in from Activision, Valve, and a couple more industry representatives. The basic idea is that games can have a big impact on cognitive science and neuroscience, particularly as applies to therapy. This conference was meant to get together people who were interested in this work, and at over 200 people it was fairly substantial attendance for what seems like a rather niche pursuit. For comparison’s sake, GDC attendance is generally in the vicinity of 20,000 people.

The seminal work driving this effort is really the findings by Daphne Bavelier at the University of Rochester. All of the papers are available for download as PDF, if you are so inclined. I imagine some background in psychology, cognitive science, or neurology is helpful to follow everything that’s going on. The basic take-away, though, is that video games can have dramatic and long-lasting positive effects on our cognitive and perceptual abilities. Here’s an NPR article that is probably more helpful to follow as a lay-person with no background. One highlight:

Bavelier recruited non-gamers and trained them for a few weeks to play action video games. [...] Bavelier found that their vision remained improved, even without further practice on action video games. “We looked at the effect of playing action games on this visual skill of contrast sensitivity, and we’ve seen effects that last up to two years.”

Another rather interesting bit:

Brain researcher Jay Pratt, professor of psychology at the University of Toronto, has studied the differences between men and women in their ability to mentally manipulate 3-D figures. This skill is called spatial cognition, and it’s an essential mental skill for math and engineering. Typically, Pratt says, women test significantly worse than men on tests of spatial cognition.

But Pratt found in his studies that when women who’d had little gaming experience were trained on action video games, the gender difference nearly disappeared.

As it happens, I’ve wound up involved in this field as well. I had the good fortune to meet a doctor at the Johns Hopkins medical center/hospital who is interested in doing similar research. The existing work in the field is largely focused on cognition and perception; we’ll be studying motor skills. Probably lots of work with iPads, Kinect, Wii, PS Move, and maybe more exotic control devices as well. There are a lot of potential applications, but one early angle will be helping stroke patients to recover basic motor ability more quickly and more permanently.

There’s an added component as to why we’re doing this research. My team believes that by studying the underlying neurology and psychology that drives (and is driven by) video games, we can actually turn the research around and use it to produce games that are more engaging, more interactive, more addictive, and just more fun. That’s our big gambit, and if it pans out we’ll be able to apply a more scientific and precise eye to the largely intuitive problem of making a good game. Of course the research is important for its own sake and will hopefully lead to a lot of good results, but I went into games and not neurology for a reason ;)

March 9, 2011

Understanding Subversion’s Problems

Filed under: Software Engineering — Promit @ 6:13 pm

I’ve used Subversion for a long time, even CVS before that. Recently there’s a lot of momentum behind moving away from Subversion to a distributed system, like git or Mercurial. I myself wrote a series of posts on the subject, but I skipped over the reasons WHY you might want to switch away from Subversion. This post is motivated in part by Richard Fine’s post, but it’s a response to a general trend and not his entry specifically.

SVN is a long-time stalwart as version control systems go, created to patch up the idiocies of CVS. It’s a mature, well understood system that has been and continues to be used in a vast variety of production projects, open and closed source, across widely divergent team sizes and workflows. Never mind the hyperbole, SVN is good by practically any real world measure. And like any real world production system, it has a lot of flaws in nearly every respect. A perfect product is a product no one uses, after all. It’s important to understand what the flaws are, and in particular I want to discuss them without advocating for any alternative. I don’t want to compare to git or explain why it fixes the problems, because that distorts the view of the actual problems and also implies that distributed version control is the solution. It can be a solution, but the problems reach a bit beyond that.

Committing vs publishing
Fundamentally, a commit creates a revision, and a revision is something we want as part of the permanent record of a file. However, a lot of those revisions are not really meant for public consumption. When I’m working on something complex, there are a lot of points where I want to freeze frame without actually telling the world about my work. Subversion understands this perfectly well, and the mechanism for doing so is branches. The caveat is that this always requires server round-trips. That’s fine as long as you’re in the office with a fast, highly available server, but it fails the moment you’re traveling or your connection to the server drops for whatever reason. Subversion cannot queue up revisions locally. It has exactly two reference points: the revision you started with and the working copy.

In general though, we are working in high availability environments and making a round trip to the server is not a big deal. Private branches are supposed to be the solution to this problem of work-in-progress revisions. Do everything you need, with as many revisions as you want, and then merge to trunk. Simple as that! If only merges actually worked.

SVN merges are broken
Yes, they’re broken. Everybody knows merges are broken in Subversion and that they work great in distributed systems. What tends to happen is people gloss over why they’re broken. There are essentially two problems in merges: the actual merge process, and the metadata about the merge. Neither works in SVN. The fatal mistake in the merge process is one I didn’t fully understand until reading HgInit (several times). Subversion’s world revolves around revisions, which are snapshots of the whole project. Merges basically take diffs from the common root and smash the results together. But the merged files didn’t magically drop from the sky — we made a whole series of changes to get them where they are. There’s a lot of contextual information in those changes which SVN has completely and utterly forgotten. Not only that, but the new revision it spits out necessarily has to jam a potentially complicated history into a property field, and naturally it doesn’t work.

For added impact, this context problem shows up without branches if two people happen to make more than trivial unrelated changes to the same trunk file. So not only does the branch approach not work, you get hit by the same bug even if you eschew it entirely! And invariably the reason this shows up is because you don’t want to make small changes to trunk. Damned if you do, damned if you don’t.

Newer version control systems are typically designed around changes rather than revisions. (Critically, this has nothing at all to do with decentralization.) By defining a particular ‘version’ of a file as a directed graph of changes resulting in a particular result, there’s a ton of context about where things came from and how they got there. Unfortunately the complex history tends to make assignment of revision numbers complicated (and in fact impossible in distributed systems), so you are no longer able to point people to r3359 for their bug fix. Instead it’s a graph node, probably assigned some arcane unique identifier like a GUID or hash.
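To make the "directed graph of changes" idea concrete, here is a minimal sketch in Python. It is a toy model, not how git or Mercurial actually store history: the ChangeGraph class, its method names, and the 12-character truncated SHA-1 identifier are all assumptions made for illustration. It shows why a change ends up named by a hash instead of a sequential revision number, and how the graph keeps enough context to find where two lines of work diverged.

```python
import hashlib


class ChangeGraph:
    """Toy history: each change records its parents and is named by a hash."""

    def __init__(self):
        self.parents = {}  # change id -> tuple of parent ids

    def add_change(self, description, parents=()):
        # The id is derived from the description plus the parent ids, so no
        # central server is needed to hand out sequential revision numbers.
        payload = description + "|" + ",".join(parents)
        change_id = hashlib.sha1(payload.encode("utf-8")).hexdigest()[:12]
        self.parents[change_id] = tuple(parents)
        return change_id

    def ancestors(self, change_id):
        """Return change_id together with everything it descends from."""
        seen = set()
        stack = [change_id]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen

    def common_ancestors(self, a, b):
        """Changes shared by both histories: the context a merge can use."""
        return self.ancestors(a) & self.ancestors(b)


graph = ChangeGraph()
root = graph.add_change("initial import")
fix = graph.add_change("fix crash in loader", [root])
feature = graph.add_change("add new renderer", [root])
merged = graph.add_change("merge renderer into fix", [fix, feature])
```

The merge node simply has two parents, so the full path that led to it is still there to be walked later, which is exactly the context a snapshot-only model throws away.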

File system headaches
.svn. This stupid little folder is the cause of so many headaches. Essentially it contains all of the metadata from the repository about whatever you synced, including the unmodified versions of files. But if you forget to copy it (because it’s hidden), Subversion suddenly forgets all about what you were doing. You just lost its tracking information, after all. Now you get to do a clean update and a hand merge. Overwrite it by accident, and now Subversion will get confused. And here’s the one that gets me every time with externals like boost: copy folders from a different repository, and all of a sudden Subversion sees folders from something else entirely and will refuse to touch them at all until you go through and nuke the folders by hand. Nope, you were supposed to SVN export it, never mind that the offending files are marked hidden.

And of course because there’s no understanding of the native file system, move/copy/delete operations are all deeply confusing to Subversion unless it’s the one who handles those changes. If you’re working with an IDE that isn’t integrated into source control, you have another headache coming because IDEs are usually built for rearranging files. (In fact I think this is probably the only good reason to install IDE integration.)

It’s not clear to me if there’s any productive way to handle this particular issue, especially cross platform. I can imagine a particular set of rules — copying or moving files within a working copy does the same to the version control, moving them out is equivalent to delete. (What happens if they come back?) This tends to suggest integration at the filesystem layer, and our best bet for that is probably a FUSE implementation for the client. FUSE isn’t available on Windows, though apparently a similar tool called Dokan is. Its maturity level is unclear.

Changelists are missing
Okay, this one is straight out of Perforce. There’s a client side and a server side to this, and I actually have the client side via my favorite client SmartSVN. The idea on the client is that you group changed files together into changelists, and send them off all at once. It’s basically a queued commit you can use to stage. Perforce adds a server side, where pending changelists actually exist on the server, you can see what people are working on (and a description of what they’re doing!), and so forth. Subversion has no idea of anything except when files are different from their copies living in the .svn shadow directory, and that’s only on the client. If you have a couple different live streams of work, separating them out is a bit of a hassle. Branches are no solution at all, since it isn’t always clear upfront what goes in which branch. Changelists are much more flexible.
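The client-side half of changelists is easy to picture as a data structure. The sketch below is a toy model in Python, assuming a hypothetical WorkingCopy class; it is not the API of Perforce, SmartSVN, or Subversion. It only shows the grouping idea: pending files get sorted into named buckets after the fact, and each bucket is committed on its own.

```python
from collections import defaultdict


class WorkingCopy:
    """Toy client: tracks modified files and groups them into changelists."""

    def __init__(self):
        self.modified = set()
        self.changelists = defaultdict(set)  # changelist name -> file paths

    def edit(self, path):
        self.modified.add(path)

    def assign(self, path, changelist):
        if path not in self.modified:
            raise ValueError(f"{path} has no pending changes")
        # A file belongs to at most one changelist at a time.
        for files in self.changelists.values():
            files.discard(path)
        self.changelists[changelist].add(path)

    def commit(self, changelist):
        """Send off one changelist and leave every other pending edit alone."""
        files = self.changelists.pop(changelist, set())
        self.modified -= files
        return sorted(files)


wc = WorkingCopy()
for path in ("renderer.cpp", "renderer.h", "README.txt"):
    wc.edit(path)
wc.assign("renderer.cpp", "renderer-fix")
wc.assign("renderer.h", "renderer-fix")
wc.assign("README.txt", "docs")
committed = wc.commit("renderer-fix")
```

Unlike a branch, nothing has to be decided before the edits are made; the grouping happens afterwards, once it is clear which files belong together.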

Locking is useless
The point of a lock in version control systems is to signal that it’s not safe to change a file. The most common use is for binary files that can’t be merged, but there are other useful situations too. Here’s the catch: Subversion checks locks when you attempt to commit. That’s how it has to work. In other words, by the time you find out there’s a lock on a file, you’ve already gone and started working on it, unless you obsessively check repository status for files. There’s also no way to know if you’re putting a lock on a file somebody has pending edits to.
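The difference between checking a lock at commit time and checking it when editing starts can be shown with a small toy server. This is a sketch only: the Server class, open_for_edit, and pending_editors are hypothetical names, and neither Subversion nor Perforce is claimed to work exactly like this.

```python
class LockedError(Exception):
    pass


class Server:
    """Toy server that knows about locks and about who has files open."""

    def __init__(self):
        self.locks = {}       # path -> owner of the lock
        self.open_edits = {}  # path -> set of users with pending edits

    def lock(self, path, user):
        self.locks[path] = user

    def _check(self, path, user):
        owner = self.locks.get(path)
        if owner is not None and owner != user:
            raise LockedError(f"{path} is locked by {owner}")

    def commit(self, path, user):
        # Commit-time check: the user finds out only after doing the work.
        self._check(path, user)
        return True

    def open_for_edit(self, path, user):
        # Edit-time check: the conflict is reported before any work is done,
        # and the server now knows this user has a pending edit.
        self._check(path, user)
        self.open_edits.setdefault(path, set()).add(user)

    def pending_editors(self, path):
        return sorted(self.open_edits.get(path, set()))


server = Server()
server.lock("logo.psd", "alice")
try:
    server.open_for_edit("logo.psd", "bob")
    blocked_early = False
except LockedError:
    blocked_early = True
```

With the edit-time path, pending_editors also answers the other question from above: whether somebody already has changes in flight on a file you are about to lock.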

The long and short of it is if you’re going to use a server, really use it. Perforce does. There’s no need to have the drawbacks of both centralized and distributed systems at once.

I think that’s everything that bothers me about Subversion. What about you?

November 10, 2010

Windows Installer: Worse Than I Thought

Filed under: Software Engineering — Promit @ 4:59 pm

I hate Windows Installer. I really hate it. So you have to understand, I was a little surprised to discover that it is actually worse than I thought.

I do not, generally speaking, have much of a problem with Microsoft or Windows. They’ve done a lot for me, I’ve done a few things for them, and it’s been good. However, my feelings about Windows are continuing to degrade from “generally decent” to “least unpleasant”. But that is a digression from the point at hand, which is a specific infuriating problem. Let me explain my machine’s hard drive setup:
* 60 GB SSD. This is for OS and ‘core’ software.
* 400 GB magnetic drive for ‘non-core’ software.
* 1 TB drive for personal data.
I figured that 60 GB would surely be plenty for the OS and small programs. The Windows folder now accounts for a mind boggling 25.7 GB of usage on that drive. I was wondering why.

Our culprits are apparently Installer and WinSxS. I’ll leave the latter for another time and dig into why Installer is so big.

As I’ve discussed in the past, Installer files (msi, msp) are essentially embedded database files that store everything that is required in order to install or uninstall a program. Notice the flip side of the coin here — the only way to uninstall (or repair for that matter) is to save the installer package. And these packages always get saved in the same spot, regardless of where you actually installed the program.

The short version is that if you have a small C drive but run lots of software, you will eventually run out of space even if all the software lives elsewhere. At this point I’m left to throw up my hands and ask the same thing I’ve asked many times before. What idiot was in charge of designing Windows Installer? The engineering here is slipshod and frankly embarrassing.

On the bright side, the solution to this problem is actually relatively straightforward. There’s nothing stopping you from moving the contents of Installer to another drive and creating a junction to fill in for it. Just like that, I can cross off 8 GB that definitely don’t need to be on my two-dollar-per-gigabyte SSD.
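For the curious, the move-and-junction trick can be scripted. The following is a minimal sketch in Python, assuming Windows, an elevated prompt, and that nothing is using the folder at the time. The function names are my own and any paths you pass in are up to you; mklink /J is the cmd.exe built-in that creates a directory junction.

```python
import shutil
import subprocess


def junction_command(link, target):
    """Return the cmd.exe invocation that makes `link` a junction to `target`."""
    # mklink is a cmd.exe built-in, so it has to be run through "cmd /c".
    return ["cmd", "/c", "mklink", "/J", link, target]


def relocate_with_junction(source, destination, run=subprocess.run):
    """Move `source` to `destination`, then leave a junction at `source`.

    `run` is injectable so the command can be inspected without executing it.
    """
    shutil.move(source, destination)
    return run(junction_command(source, destination), check=True)
```

Passing a stand-in for run is a cheap way to see exactly what would be executed before letting it loose on a system folder.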

October 10, 2010

Evaluation: Git

Filed under: Software Engineering — Promit @ 6:59 pm

Last time I talked about Mercurial, and was generally disappointed with it. I also evaluated Git, another major distributed version control system (DVCS).

Short Review: Quirky, but a promising winner.

Git, like Mercurial, was spawned as a result of the Linux-BitKeeper feud. It was written initially by Linus Torvalds, apparently during quite a lull in Linux development. It is for obvious reasons a very Linux focused tool, and I’d heard that performance is poor on Windows. I was not optimistic about it being usable on Windows.

Installation actually went very smoothly. Git for Windows is basically powered by MSYS, the same Unix tools port that accompanies the Windows GCC port called MinGW. The installer is neat and sets up everything for you. It even offers a set of shell extensions that provide a graphical interface. Note that I opted not to install this interface, and I have no idea what it’s like. A friend tells me it’s awful.

Once the installer is done, git is ready to go. It’s added to PATH and you can start cloning things right off the bat. Command line usage is simple and straightforward, and there’s even a ‘config’ option that lets you set things up nicely without having to figure out what config file you want and where it lives. It’s still a bit annoying, but I like it a lot better than Mercurial. I’ve heard some people complain about git being composed of dozens of binaries, but I haven’t seen this on either my Windows or Linux boxes. I suspect this is a complaint about old versions, where each git command was its own binary (git-commit, git-clone, git-svn, etc), but that’s long since been retired. Most of the installed binaries are just the MSYS ports of core Unix programs like ls.

I was also thrilled with the git-svn integration. Unlike Mercurial, the support is built in and flat out works with no drama whatsoever. I didn’t try committing back into the Subversion repository from git, but apparently there is fabulous two way support. It was simple enough to create a git repository but it can be time consuming, since git replays every single check-in from Subversion to itself. I tested on a small repository with only about 120 revisions, which took maybe two minutes.

This is where I have to admit I have another motive for choosing Git. My favorite VCS frontend comes in a version called SmartGit. It’s a stand-alone (not shell integrated) client that is free for non commercial use and works really well. It even handled SSH beautifully, which I’m thankful about. It’s still beta technically, but I haven’t noticed any problems.

Now the rough stuff. I already mentioned that Git for Windows comes with a GUI that is apparently not good. What I discovered is that getting git to authenticate from Windows is fairly awful. In Subversion, you actually configure users and passwords explicitly in a plain-text file. Git doesn’t support anything of the sort; their ‘git-daemon’ server allows fully anonymous pulls and can be configured for anonymous-push only. Authentication is entirely dependent on the filesystem permissions and the users configured on the server (barring workarounds), which means that most of the time, authenticated Git transactions happen inside SSH sessions. If you want to do something else, it’s complicated at best. Oh, and good luck with HTTP integration if you chose a web server other than Apache. I have to imagine running a Windows based git server is difficult.

Let me tell you about SSH on Windows. It can be unpleasant. Most people use PuTTY (which is very nice), and if you use a server with public key authentication, you’ll end up using a program called Pageant that provides that service to various applications. Pageant doesn’t use OpenSSH compatible keys, so you have to convert the keys over, and watch out because the current stable version of Pageant won’t do RSA keys. Git in turn depends on a third program called Plink, which exists to help programs talk to Pageant, and it finds that program via the GIT_SSH environment variable. The long and short of it is that getting Git to log into a public key auth SSH server is quite painful. I discovered that SmartGit simply reads OpenSSH keys and connects without any complications, so I rely on it for transactions with our main server.

I am planning to transition over to git soon, because I think that the workflow of a DVCS is better overall. It’s really clear, though, that these are raw tools compared to the much more established and stable Subversion. It’s also a little more complicated to understand; whether you’re using git, Mercurial, or something else it’s valuable to read the free ebooks that explain how to work with them. There are all kinds of quirks in these tools. Git, for example, uses a ‘staging area’ that clones your files for commit, and if you’re not careful you can wind up committing older versions of your files than what’s on disk. I don’t know why — seems like the opposite extreme from Mercurial.
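The staging area quirk is easier to see in a toy model. The sketch below is plain Python with a hypothetical ToyRepo class; it does not call git and does not mirror git's real index format. It only demonstrates the behavior described above: staging captures the file contents at that moment, so a later edit on disk is not part of the commit unless the file is staged again.

```python
class ToyRepo:
    """Toy model of a working directory, a staging area, and commits."""

    def __init__(self):
        self.disk = {}     # path -> current contents on disk
        self.index = {}    # path -> contents captured when the file was staged
        self.commits = []  # list of committed snapshots

    def write(self, path, contents):
        self.disk[path] = contents

    def add(self, path):
        # Staging copies the contents as they are right now.
        self.index[path] = self.disk[path]

    def commit(self):
        # The commit is built from the staging area, not from the disk.
        snapshot = dict(self.index)
        self.commits.append(snapshot)
        return snapshot


repo = ToyRepo()
repo.write("main.c", "version 1")
repo.add("main.c")
repo.write("main.c", "version 2")  # edited again after staging
snapshot = repo.commit()
```

Here the commit holds "version 1" while the disk holds "version 2", which is the surprise waiting for anyone who expects a commit to pick up whatever is currently in the working directory.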

It’s because of these types of issues that I favor choosing the version control system with the most momentum behind it. Git and Mercurial aren’t the only two DVCS out there; Bazaar, monotone, and many more are available. But these tools already have rough (and sharp!) edges, and by sticking to the popular ones you are likely to get the most community support. Both Git and Mercurial have full blown books dedicated to them that are available electronically for free. My advice is that you read them.

October 6, 2010

Evaluation: Mercurial

Filed under: Software Engineering — Promit @ 11:13 am

I’ve been a long time Subversion user, and I’m very comfortable with its quirks and limitations. It’s an example of a centralized version control system (CVCS), which is very easy to understand. However, there’s been a lot of talk lately about distributed version control systems (DVCS), of which there are two well known examples: git and Mercurial. I’ve spent a moderate amount of time evaluating both, and I decided to post my thoughts. This entry is about Mercurial.

Short review: A half baked, annoying system.

I started with Mercurial, because I’d heard anecdotally that it’s more Windows friendly and generally nicer to work with than git. I was additionally spurred by reading the first chapter of HgInit, an e-book by Joel Spolsky of ‘Joel on Software’ fame. Say what you will about Joel — it’s a concise and coherent explanation of why distributed version control is, in a general sense, preferable to centralized. Armed with that knowledge, I began looking at what’s involved in transitioning from Subversion to Mercurial.

Installation was smooth. Mercurial’s site has a Windows installer ready to go that sets everything up beautifully. Configuration, however, was unpleasant. The Mercurial guide starts with this as your very first step:

As first step, you should teach Mercurial your name. For that you open the file ~/.hgrc with a text-editor and add the ui section (user interaction) with your username:

Yes, because what I’ve always wanted from my VCS is for it to be a hassle every time I move to a new machine. Setting up extensions is similarly a pain in the neck. More on that in a moment. Basically Mercurial’s configurations are a headache.
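If you move between machines a lot, that step can at least be scripted. Here is a minimal sketch using Python's configparser, assuming the [ui] section and username key named in the guide; the write_username helper and the example identity are made up. Real hgrc files can contain constructs that configparser does not understand, so treat this as suitable for a fresh file only.

```python
import configparser
import os
import tempfile


def write_username(hgrc_path, username):
    """Set the username in the [ui] section of an INI-style hgrc file."""
    config = configparser.ConfigParser()
    config.read(hgrc_path)  # a missing file is silently skipped
    if not config.has_section("ui"):
        config.add_section("ui")
    config.set("ui", "username", username)
    with open(hgrc_path, "w", encoding="utf-8") as handle:
        config.write(handle)


# Demonstration against a throwaway file rather than the real ~/.hgrc.
demo_path = os.path.join(tempfile.mkdtemp(), ".hgrc")
write_username(demo_path, "Jane Doe <jane@example.com>")
with open(demo_path, encoding="utf-8") as handle:
    hgrc_text = handle.read()
```

The resulting file contains a [ui] header followed by a username line, which is all that first step of the guide is asking for.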

Then there’s the actual VCS. You see, I have one gigantic problem with Mercurial, and it’s summed up by Joel:

Whereas, in Mercurial, all commands always apply to the entire tree. If your code is in c:\code, when you issue the hg commit command, you can be in c:\code or in any subdirectory and it has the same effect.

This is an incredibly awkward design decision. The basic idea, I guess, is that somebody got really frustrated about forgetting to check in changes and decided this was the solution. My take is that this is a stupid restriction that makes development unpleasant.

When I’m working on something, I usually have several related projects in a repository. (Mercurial fans freely admit this is a bad way to work with it.) Within each project, I usually wind up making a few sets of parallel changes. These changes are independent and shouldn’t be part of the same check-in. The idea with Mercurial is, I think, that you simply produce new branches every time you do something like this, and then merge back together. Should be no problem, since branching is such a trivial operation in Mercurial.

So now I have to stop and think about whether I should be branching every time I make a tweak somewhere?

Oh but wait, how about the extension mechanism? I should be able to patch in whatever behavior I need, and surely this is something that bothers other people! As it turns out, that’s definitely the case. Apart from the branching suggestions, there’s not one but half a dozen extensions to handle this problem, all of which have their own quirks and pretty much all of which involve jumping back into the VCS frequently. This is apparently a problem the Mercurial developers are still puzzling over.

Actually there is one tool that’s solved this the way you would expect: TortoiseHg. Which is great, save two problems. Number one, I want my VCS features to be available from the command line and front-end both. Two, I really dislike Tortoise. Alternative Mercurial frontends are both trash, and an unbelievable pain to set up. If you’re working with Mercurial, TortoiseHg and command line are really your only sane options.

It comes down to one thing: workflow. With Mercurial, I have to be constantly conscious about whether I’m in the right branch, doing the right thing. Should I be shelving these changes? Do they go together or not? How many branches should I maintain privately? Ugh.

Apart from all that, I ran into one serious show stopper. Part of this test includes migrating my existing Subversion repository, and Mercurial includes a convenient extension for it. Wait, did I say convenient? I meant borderline functional:

Subversion’s Python bindings are a prerequisite. The bindings (generated with SWIG) are installed separately on Windows, and can be found on http://subversion.tigris.org/ . Note that you can’t do this with the Win32 Mercurial binaries — there’s no way to install the Subversion bindings into its built-in Python library. So you’ll need to use a Mercurial installed on top of a stand-alone Python, and you may also need to do something like “set HG=python c:\Python25\Scripts\hg” to override the default Win32 binaries if you have those installed also. For Mac OS X, the easiest way is to install the CollabNet Subversion build, and then copy the content of /opt/subversion/lib/svn-python to the site-package directory of the python installation.

The silver lining is there are apparently third party tools to handle this that are far better, but at this point Mercurial has tallied up a lot of irritations and I’m ready to move on.

Spoiler: I’m transitioning to git. I’ll go into all the gory details in my next post, but I found git to be vastly better to work with.

September 19, 2010

Lua and LuaBind Binaries for VS 2010

Filed under: Software Engineering — Promit @ 10:40 am

I’ve been working with Lua recently, and apparently neither Lua nor LuaBind comes in binary distributions. They’re also kind of a pain to build, so I thought I’d go ahead and post fully prepped binaries, built by and for VC 2010.
Lua 5.1
LuaBind 0.9

A few things to be aware of:

  • These are just prebuilt 32-bit binaries and nothing else. You will still need project sources for the headers.
  • Lua is built as a release mode only DLL.
  • LuaBind comes in debug and release variants as a static library.
  • These are built using Multithreaded DLL and Multithreaded Debug DLL. You’ll get link errors if your project is statically linking to the CRT.

July 22, 2010

PhysX and Hardware Acceleration

Filed under: Software Engineering — Promit @ 2:49 pm

A more accurate title would be PhysX and its lack of hardware acceleration. In retrospect this is largely my fault for believing marketing hype without looking into details, but I figured that I’d discuss it. Finding resources on the subject was difficult, so I think a lot of people may be suffering from the same delusion as I was.

Super short version: PhysX computes its rigid body simulation strictly in software. The GPU never gets involved.

Let’s step back for a second and look at the history. It starts with a library called NovodeX which was a popular physics package around the same time Half-Life 2 launched with Havok. A semiconductor company called Ageia acquired the company to build Physics Processing Unit (PPU) based cards, an idea that got a lot of attention at the time. The hardware was over priced, mismanaged, and probably doomed in the end regardless. The effort started sinking pretty quickly until NVIDIA swept in and bought the whole thing. NV then announced that the whole PPU project would be scrapped and that hardware acceleration of physics would be done on the GPU.

The current state of PhysX is somewhat confusing. There are plenty of PhysX accelerated games. Benchmarks of these games show dramatic performance gains when running with hardware supported physics. There have been recent allegations that PhysX’s CPU performance is unreasonably poor, possibly to strengthen the case for hardware acceleration. I don’t have any comment on that mess, but it’s part of the evidence suggesting that hardware is the real focus of PhysX and NVIDIA. That’s not a surprise.

What is a surprise is that the rigid body simulation, the important part of the physics, is not hardware accelerated. Apparently it was when the PPU rolled out, but the GPU based acceleration does not support it. Look at any PhysX based game that advertises hardware acceleration and you’ll notice gobs of destruction, cloth, fluids, etc. That’s because those are the only GPU-accelerated effects PhysX supports (plus a few misc things like soft body). Probably the big tip off is that none of these effects require forces to be imparted back to the main physics scene in any way. This is strictly eye candy.

So the interactive rigid body simulation, the part that actually affects gameplay, is completely in software. And if you believe the claims, it’s not even done well in software. All these problems will apparently be fixed in a magical 3.0 release, coming at some vague point in the future. Why? My best guess is that no one has paid any attention to the core PC code in six years. I’d wager that everyone’s been so obsessed with hardware acceleration, and that the basic problem of writing a rigid body solver is so stupidly easy, that we’re simply coasting on the same 2004 NovodeX era code that made the library popular in the first place. Version 3.0 is probably a ground up rewrite.

Don’t get me wrong. PhysX is not bad. It is simply stagnant. Take a recent game and strip away the GPU driven effects candy. What exactly is left in the interactive part of the simulation that wasn’t already in Half Life 2? That was also a 2004 release. NVIDIA did what they do best, visual effects. Marketing also did what they do best, letting everybody assume something untrue without actually ever saying it.

Anyway, now you know where hype ends and fact begins. Rigid body game physics is not hardware accelerated; only special effects that fall pretty loosely into the category of “physics” are. Maybe that’s common knowledge, but it was news to me.

Update: This presentation about Bullet’s GPU acceleration is a good read.
Update 2: I’m wrong on one technical point — PhysX’s hardware accelerated systems can impart forces back to the main scene, and their cloth shows off this capability in the samples.

July 15, 2010

NHibernate Is Pretty Cool

Filed under: SlimTune,Software Engineering — Promit @ 3:41 pm

My last tech post was heavily negative, so today I’m going to try and be more positive. I’ve been working with a library called NHibernate, which is itself a port of a Java library called Hibernate. These are very mature, long-standing object relational mapping systems that I’ve started exploring lately.

Let’s recap. Most high end storage requirements, and nearly all web site storage, are handled using relational database management systems, RDBMS for short. These things were developed starting in 1970, along with the now ubiquitous SQL language for working with them. The main SQL standard was laid down in 1992, though most vendors provide various extensions for their specific systems. Ignoring some recent developments, SQL is the gold standard for handling relational database systems.

When I set out to build SlimTune, one of the ideas I had was to eschew the fairly crude approach that most performance tools take with storage and build it around a fully relational database. I bet that I could make it work fast enough to be usable for profiling, and simultaneously more expressive and flexible. The ability to view the profile live as it evolves is derived directly from this design choice. Generally speaking I’m really happy with how it turned out, but there was one mistake I didn’t understand at the time.

SQL is garbage. (Damnit, I’m being negative again.)

I am not bad at SQL, I don’t think. I know for certain that I am not good at SQL, but I can write reasonably complex queries and I’m fairly well versed in the theory behind relational databases. The disturbing part is that SQL is very inconsistent across database systems. The standard is missing a lot of useful functionality — string concatenation, result pagination, etc — and when you’re using embedded databases like SQLite or SQL Server Compact, various pieces of the language are just plain missing. Databases also have more subtle expectations about what operations may or may not be allowed, how joins are set up, and even syntactical details about how to refer to tables and so on.
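A small illustration of the string concatenation and pagination problem, sketched in Python with the built-in sqlite3 module. The people table and the QUERIES dictionary are invented for this example. Only the SQLite query is executed; the SQL Server string is included purely for contrast, uses syntax from newer SQL Server versions, and is not run here.

```python
import sqlite3

# Same intent, two dialects: join two columns into one string and fetch a page.
QUERIES = {
    "sqlite": (
        "SELECT given_name || ' ' || family_name FROM people "
        "ORDER BY id LIMIT ? OFFSET ?"
    ),
    # Shown for contrast only, never executed in this sketch.
    "sqlserver": (
        "SELECT given_name + ' ' + family_name FROM people "
        "ORDER BY id OFFSET ? ROWS FETCH NEXT ? ROWS ONLY"
    ),
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, given_name TEXT, family_name TEXT)"
)
conn.executemany(
    "INSERT INTO people (given_name, family_name) VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Alan", "Turing"), ("Grace", "Hopper")],
)

# Page size 2, skipping the first row. Note the parameter order is
# (limit, offset) here but (offset, fetch) in the other dialect.
page = [row[0] for row in conn.execute(QUERIES["sqlite"], (2, 1))]
```

Even in this trivial case the operator, the paging clause, and the parameter order all differ, so every visualizer query would need one variant per engine.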

SQL is immensely powerful if you can choose to only support a limited subset of database engines, or if your query needs are relatively simple. Tune started running into problems almost immediately. The visualizers in the released version are using a very careful balance of the SQL subset that works just so on the two embedded engines that are in there. It’s not really a livable development model, especially as the number of visualizers and database engines increases. I needed something that would let me handle databases in a more implementation-agnostic way.

After some research it became clear that what I needed was an object/relational mapper, or ORM. Now an ORM does not exist to make databases consistent; that’s mostly a side effect of what it actually does, which is to hide the database system entirely. ORMs are actually the most popular form of persistence layer. A persistence layer exists to allow you to convert “transient” data living in your code to “persistent” data living in a data store, and back again. Most code is object oriented and most data stores are relational, hence the popularity of object/relational mapping.
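For a feel of what a persistence layer boils down to, here is a deliberately tiny hand-rolled sketch in Python over SQLite. It is not NHibernate and not how SlimTune does it; the `Sample` class, the `SampleStore` wrapper, and the `samples` table are hypothetical names made up for the example. A real ORM generates this kind of mapping for you instead of making you write it by hand.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Sample:
    """Transient, in-memory object (hypothetical profiler record)."""
    function_name: str
    hit_count: int


class SampleStore:
    """Minimal persistence layer: callers see objects, never SQL."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS samples ("
            "function_name TEXT PRIMARY KEY, hit_count INTEGER NOT NULL)"
        )

    def save(self, sample):
        # Object -> row. INSERT OR REPLACE is SQLite-specific upsert syntax.
        self.conn.execute(
            "INSERT OR REPLACE INTO samples (function_name, hit_count) VALUES (?, ?)",
            (sample.function_name, sample.hit_count),
        )

    def load(self, function_name):
        # Row -> object, or None when nothing is stored under that key.
        row = self.conn.execute(
            "SELECT function_name, hit_count FROM samples WHERE function_name = ?",
            (function_name,),
        ).fetchone()
        return Sample(*row) if row else None
```

Usage is just `store.save(Sample("main", 3))` followed by `store.load("main")`. Note that the engine-specific SQL is confined to one class, which is the side effect described above.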

After some reading, I picked NHibernate as my ORM of choice, augmented by Fluent mapping to get away from the XML mess that NH normally uses. It’s gone really well so far, but over the course of all this I’ve learned it’s very important to understand one thing about persistence frameworks. They are not particularly generalized tools, by design. Every framework, NH included, has very specific ideas about how the world ought to work. They tend to offer various degrees of customization, but you’re expected to adhere to a specific model and straying too far from that model will result in pain.

Persistence frameworks are very simple and effective tools, but they sacrifice both performance and flexibility to do so. (Contrast to SQL, which is fast and flexible but a PITA to use.) Composite keys? Evil! Dynamic table names? No way! I found that NHibernate was amongst the best when it came to allowing me to bend the rules — or flat out break them. Even so, Tune is a blend of NH and native database code, falling back to RDBMS-specific techniques in areas that are performance sensitive or outside of the ORM’s world-view. For example, I use database specific SQL queries to clone tables for snapshots. That’s not something you can do in NH because the table itself is an implementation detail. I also use database specific techniques to perform high-volume database work, as NH is explicitly meant for OLTP and not major bulk operations.
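As an example of the kind of thing that has to live outside the ORM, here is a rough Python sketch of cloning a table for a snapshot using SQLite’s `CREATE TABLE ... AS SELECT`. The `snapshot_table` function and the table names are assumptions for illustration, not Tune’s actual code, and other engines spell this differently. Since table names can’t be passed as query parameters, the sketch validates them before building the statement.

```python
import re
import sqlite3

# Conservative whitelist for identifiers, since they are spliced into the SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def snapshot_table(conn, source, snapshot):
    """Copy every row of `source` into a new table named `snapshot` (SQLite only).

    CREATE TABLE ... AS SELECT copies the data but not indexes or constraints.
    """
    for name in (source, snapshot):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe table name: {name!r}")
    conn.execute(f'CREATE TABLE "{snapshot}" AS SELECT * FROM "{source}"')
```

After calling `snapshot_table(conn, "samples", "samples_snap1")`, later writes to `samples` leave the snapshot untouched.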

Despite all the quirks, I’ve been really pleased with NHibernate. It’s solved some major design problems in a relatively straightforward fashion, despite the somewhat awkward learning curve and lots of bizarre problem solving due to my habit of using a relational database as a relational database. It provides a query language that is largely consistent across databases, and very effective tools for building queries dynamically without error-prone string processing. Most importantly, it makes writing visualizers for Tune much smoother all around, and that means more features more quickly.

So yeah, I like NHibernate. That said, I also like this rant. Positive thinking!

June 30, 2010

Windows Installer is Terrible

Filed under: Software Engineering — Promit @ 10:57 pm

I find Windows Installer to be truly baffling. It’s as close to the heart of Windows as any developer tool gets. It is technology which literally every single Windows user interacts with, frequently. I believe practically every single team at Microsoft works with it, and that even major applications like Office, Visual Studio, and Windows Update are using it.

So I don’t understand. Why is Installer such a poorly designed, difficult to use, and generally infuriating piece of software?

Let’s recap on the subject of installers. An installer technology should facilitate two basic tasks. One, it should allow a developer to smoothly install their application onto any compatible system, exposing a UI that is consistent across every installation. Two, it should allow the user to completely reverse (almost) any installation at will, in a straightforward and again consistent fashion. Windows, Mac OSX, and Linux take three very different approaches to this problem, with OSX being almost indisputably the most sane. Linux is fairly psychotic under the hood, but the idea of a centralized package repository (almost like an “app store” of some kind) is fairly compelling and the dominant implementations are excellent.

And then we have Windows. The modern, recommended approach is to use MSI based setup files, which are basically embedded databases and show a mostly similar UI. And then there’s InstallShield, NSIS, InnoSetup, and half a dozen other installer technologies that are all in common use. Do you know why that is? It’s because Windows Installer is junk.

Let us start with the problem of consistency. This is our very nice, standard looking SlimDX SDK installation package:

And this is what it looks like if you use Visual Studio to create your installer:

Random mix of fonts? Check. Altered dialog proportions for no reason? Check. Inane question that makes no sense to most users? Epic check. Hilariously amateur looking default clip art? Of course.

Okay, so maybe you don’t think the difference is that big. Microsoft was never Apple, after all. But how many of those childish looking VS based installers do you see on a regular basis? It’s not very many. That’s because the installer creation built into Visual Studio, Microsoft’s premier idol of the industry development tool, is utter garbage. Not only is the UI for it awful, it fails to expose most of the useful things MSI can actually do, or most developers want to do. Even the traditionally expected “visual” half-baked dialog editor never made it into the oven. You just get a series of bad templates with static properties. Microsoft also provides an MSI editor, which looks like this:

Wow! I’ve always wanted to build databases by hand from scratch. Why not just integrate the functionality into Access?

In fact, Microsoft is now using external tools to build installers. Office 2007’s installer is written using the open source WiX toolset. Our installer is built using WiX too, and it’s an unpleasant but workable experience. WiX essentially translates the database schema verbatim into an XML schema, and automates some of the details of generating unique IDs etc. It’s pretty much the only decent tool for creating MSI files of any significant complexity, especially if buying InstallShield is just too embarrassing (or expensive, $600 up to $9500). By the way, Visual Studio 2010 now includes a license for InstallShield Limited Edition. I think that counts as giving up.

Even then, the thing is downright infuriating. You cannot tell it to copy the contents of a folder into an installer. There is literally no facility for doing so. You have to manually replicate the entire folder hierarchy, and every single file, interspersed with explicit uniquely identified Component blocks, all in XML. And all of those components have to be explicitly referenced in Feature blocks. SlimDX now ships a self extracting 7-zip archive for the samples mainly because the complexity of the install script was unmanageable, and had to be rebuilt with the help of a half-baked C# tool each release.
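To give a sense of what that kind of generator tool has to do, here is a rough Python sketch that walks a folder and emits the `Directory`, `Component`, `File`, and `ComponentRef` elements by hand. This is not the SlimDX tool (that one was C#); the `harvest` function and the ID scheme are assumptions made up for the example, and the output is only a fragment you would still have to paste into a real WiX source file.

```python
import os
import uuid
from xml.sax.saxutils import quoteattr


def _stable_id(prefix, rel_path):
    """Derive a repeatable identifier (letters, digits, underscore) from a path."""
    return f"{prefix}_{uuid.uuid5(uuid.NAMESPACE_URL, rel_path).hex}"


def harvest(root):
    """Walk `root` and return (directory_xml, component_ref_xml) as strings."""
    tree_lines, ref_lines = [], []

    def walk(folder, depth):
        pad = "  " * depth
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            if os.path.isdir(path):
                dir_id = _stable_id("dir", rel)
                tree_lines.append(f'{pad}<Directory Id="{dir_id}" Name={quoteattr(name)}>')
                walk(path, depth + 1)
                tree_lines.append(f"{pad}</Directory>")
            else:
                # One component per file, with a GUID that stays the same
                # between runs as long as the relative path does not change.
                comp_id = _stable_id("cmp", rel)
                guid = str(uuid.uuid5(uuid.NAMESPACE_URL, "component:" + rel)).upper()
                file_id = _stable_id("fil", rel)
                tree_lines.append(f'{pad}<Component Id="{comp_id}" Guid="{guid}">')
                tree_lines.append(
                    f'{pad}  <File Id="{file_id}" Source={quoteattr(path)} KeyPath="yes" />'
                )
                tree_lines.append(f"{pad}</Component>")
                # Every component must also be listed inside a Feature block.
                ref_lines.append(f'<ComponentRef Id="{comp_id}" />')

    walk(root, 0)
    return "\n".join(tree_lines), "\n".join(ref_lines)
```

That is roughly forty lines of script to express “copy this folder”, and it still has to be rerun every time a file is added or renamed.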

Anyone with half a brain might observe at this point that copying a folder on your machine to a user’s machine is mostly what an installer does. In terms of software design, it’s the first god damned use case.

Even all of that might be okay if it weren’t for one critical problem. Lots of decent software systems have no competent toolset. Unfortunately it turns out that the underlying Windows Installer engine is also a piece of junk. The most obvious problem is its poor performance (I have an SSD, four cores, and eight gigabytes of RAM — what is it doing for so long before installation starts?), but even that can be overlooked. I am talking about one absolutely catastrophic, completely unacceptable design flaw.

Windows Installer cannot handle dependencies.

Let that sink in. Copying a local folder to the user’s system is use case number one. Setting up dependencies is, I’m pretty sure, the very next thing on the list. And Windows Installer cannot even begin to contemplate it. You expect your dependencies to be installed via MSI, because it’s the standard installer system, and they usually are. Except…Windows Installer can’t chain MSIs. It can’t run one MSI as a child of another. It can’t run one MSI after another. It sure can’t conditionally install subcomponents in separate MSIs. Trying to run two MSI installs at once on a single system will fail. (Oh, and MS licensing doesn’t even allow you to integrate any of their components directly in DLL form, the way OSX does. Dependencies are MSI or bust.)

The way to set up dependencies is to write your own custom bootstrap installer. Yes, Visual Studio can create the bootstrapper, assuming your dependencies are one of the scant few that are supported. However, we’ve already established that Visual Studio is an awful choice for any installer-related tasks. In this case, the bootstrapper will vomit out five mandatory files, instead of embedding them in setup.exe. That was fine when software was still on media, but it’s ridiculous for web distribution.

Anyway, nearly any interesting software requires a bootstrapper, which has to be pretty much put together from scratch, and there are no guidelines or recommended approaches or anything of the sort. You’re on your own. I’ve tried some of the bootstrap systems out there, and the best choice is actually any competing installer technology; I use Inno. Yes, the best way to make Windows Installer workable is to actually wrap it in a third party installer. And I wonder how many bootstrappers correctly handle silent/unattended installations, network administrative installs, logging, UAC elevation, patches, repair installs, and all the other crazy stuff that can happen in installer world.

One more thing. The transition to 64 bit actually made everything worse. See, MSIs can be built as 32 bit or 64 bit, and of course 64 bit installers don’t work on 32 bit systems. 32 bit installers are capable of installing 64 bit components though, and can be safely cordoned off to exclude those pieces when running on a 32 bit system. Except when they can’t. I’m not sure exactly how many cases of this there are, but there’s one glaring example — the Visual C++ 2010 64 bit merge module. (A merge module is like a static library, but for installers.) It can’t be included in a 32 bit installer, even though the VC++ 2008 module had no problem. The recommended approach is to build completely separate 32 and 64 bit installers.

Let me clarify the implications of that statement. Building two separate installers leaves two choices. One choice is to let the user pick the correct installation package. What percentage of Windows users do you think can even understand the selection they’re supposed to make? It’s not Linux, the people using the system don’t know arcane details like what bit-size their OS installation is. (Which hasn’t stopped developers from asking people to choose anyway.) That leaves you one other choice, which is to — wait for it — write a bootstrapper.
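The decision such a bootstrapper has to make is small. Here is a minimal Python sketch of just that logic, assuming the standard Windows environment variables `PROCESSOR_ARCHITECTURE` and `PROCESSOR_ARCHITEW6432` (the latter is set for a 32 bit process running on a 64 bit OS). The function names and the two MSI file names are hypothetical, and a real bootstrapper would more likely be native code or an Inno script than Python.

```python
import os

# Architecture strings that indicate a 64 bit Windows installation.
_SIXTY_FOUR_BIT = {"AMD64", "IA64", "ARM64"}


def is_64bit_windows(env=None):
    """Guess the OS bit-size from the process environment.

    A 32 bit process on 64 bit Windows sees PROCESSOR_ARCHITECTURE=x86, with
    the real architecture in PROCESSOR_ARCHITEW6432, so that is checked first.
    """
    env = os.environ if env is None else env
    arch = env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE", "")
    return arch.upper() in _SIXTY_FOUR_BIT


def pick_installer(env=None):
    """Return the (hypothetical) MSI file name to launch for this machine."""
    return "setup_x64.msi" if is_64bit_windows(env) else "setup_x86.msi"
```

The user never sees the question, which is the whole point. The catch is that you have to write and ship this wrapper yourself.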

Alright, now I’m done. Despite all these problems, apparently developers everywhere just accept the status quo as perfectly normal and acceptable. Or maybe there’s a “silent majority” not explaining to Microsoft that their entire installer technology, from top to bottom, is completely mind-fucked.

March 4, 2010

Selling Middleware

So a few days ago, we published a video demo of our BioReplicant technology. In particular, we published it without saying much. No explanation of how it works, what problems it solves, or how it could be used. That was a very important and carefully calculated decision. I felt it was critical that people be allowed to see our technology without any tinting or leading on our part. Some of the feedback was very positive, some very negative, and a whole lot in between. I’m sure we’ll get an immense amount more from GDC, but this initial experience has been critical in understanding what people want and what they think we’re offering.

To a large extent, people’s expectations do not align with what BioReplicants actually does. Our eventual goal is to meet those expectations, but in the meantime there is a very tricky problem of explaining what our system actually does for them. I think that will continue to be a problem, exacerbated by the fact that on the surface, we seem to be competing with NaturalMotion’s Euphoria product, and in fact we’ve encouraged that misconception.

In truth, it’s not the case. We aren’t doing anything like what NM does internally, and all we’re really doing is trying to solve the same problem every game has to solve. Everybody wants realistic, varied, complex, and reactive animations for their game. Everybody! And frankly, they don’t need Euphoria or BioReplicants to do it. There are at least three GDC talks this year on the subject. That’s why it’s important to step back and look at why middleware even exists.

The very first thing to realize is that games are hard to make. There is a wide array of complex intersecting problems that go into the production of every last title you see on the shelf, in Steam, or anywhere else.

  1. Engineering a game’s underlying systems is complicated in the best of situations, and every title evolves dramatically, throwing half your previous work out the door every six months. Developers do not want anything that makes their lives harder. Developers love almost anything that makes their lives easier!
  2. Art production for a game is incredibly time consuming. Time is money. Streamlined production is worth a lot of money.
  3. Games are expected to run on very limited hardware. How much RAM do you guys have in your laptops? A PlayStation 3 has 200 MB. Two hundred.
  4. Designing a fun game that isn’t a rehash of everything before it is very, very tricky. Companies go a long way to make a game not look like a rehash. Here’s a hint: developer diaries are rarely produced for any other reason. Nobody would ever have noticed NM Euphoria in Force Unleashed otherwise.
  5. If there’s one thing harder than making a game, it’s selling it. Each console gets a couple hundred new game releases every year and most of them represent a substantial loss. Publishers are willing to do an awful lot to avoid that loss. A new gimmick might flop, but when you drop $30M to ship a game, $32M instead is probably worth the odds.

The point of middleware is simply to alleviate one or more of these problems. That’s all. Selling middleware, then, is mainly a matter of convincing people in games that you can tackle these problems in a net positive way. Most game developers are also gamers, and our instinct is to focus on 5 and to a lesser extent 4 (the two are somewhat intertwined). Middleware developers usually try to convince game developers that their games will be more fun with the middleware product. We’ve been doing the same. It’s very possible that it’s true, but it’s just one piece of the puzzle.

So in moving from a concept to a product, it’s critical to start with a very solid understanding of these points, and to pick exactly which ones the product attacks. Nobody hits all five. With Force Unleashed, LucasArts designed a game based almost entirely around Euphoria, and you know what? It’s a terrible game. Once you get past the vaguely clever physics, there is no substance there. Asking people to design games around BioReplicants is a tall order. NM has been forced into the position of doing it for themselves.

“Every tackle is different.” As if EA etc haven’t thought of this already. A big failing of NaturalMotion Euphoria is that it is an epic amount of work to integrate. It’s very easy to spot a single animation being overused in a game. But throw in five animations and behave vaguely smart about which one you pick and when, and suddenly no one can tell the difference. Put in some parametric control over details of the animation, and you can probably turn those five animations into twenty. All it cost you was extra staff to do the animations and a programmer to figure out how to squeeze that into memory. Kind of expensive, but a hell of a lot easier to do than Euphoria integration. We’re not competing against NaturalMotion. The two of us are competing against the status quo.

A shooter now will have half a dozen knock-back animations for hits to different body parts, another half dozen for falls, and so on. Somebody sits down and animates them. Can we make that animator’s job more efficient? I bet we can. How much memory do you suppose those animations eat? If we can do them on the fly and open up that memory for other things, engineering staff will trip over themselves to thank us. I bet a BioReplicant walk takes up a lot less memory than a keyframed walk, and I bet we can replace a dozen hand-drawn animations with two physically driven animations.

Revolutionizing how people play games is glamorous and tempting. It’s an important goal. Full blown rigid-body physics made the jump between five and ten years ago. But that’s just not how you sell a middleware product. On the other hand, if you can inflict a moderately sized change in game development, developers will happily pay you. After that, you can start changing the actual games.
