Hooray for Gluttony

I was going to write up a little thing about some of the goals I have for the upcoming November release of SlimDX, but then I remembered I’m going out to dinner with my girlfriend tonight. Strictly speaking it’s been ten months since our first date, but it was around mid October that I first started getting to know her. More to the point, I picked up some Restaurant.com gift certificates super cheap ($25 certificates for $2 each), which she doesn’t know about yet. Basically we’re going to the local Indian place and eating all super nice like. These certificates have a $35 minimum and it’s your typical $12 entree type restaurant, so drinks and dessert are a definite yes.

And she just called me. I’m out.

Promit’s Tips for Life (random selection part 1)

I truly regret watching this video.

#5: If someone is looking at you funny, consider the possibility that they may suffer from chronic creepiness, rather than intending actual funny looks.
#23: Generally speaking, it’s best to avoid bringing lions, tigers, or bears to the mall, even if they’re just babies. Oh my.
#49: Getting up close to someone’s face and singing Nine Inch Nails lyrics is not a good way to make friends.
#70: Yes, you can still rock out like it’s 1998, even if it was ten years ago. Intergalactic planetary, planetary intergalactic…
#91: Don’t bother trying to come up with the stupidest comment ever made to someone on the internet. You’re way out of your league.

C Versus Reality

So it looks like polls across the country, from all races (presidential, Senate, House), have swung violently towards the Democrats. And I do mean violently. The Georgia Senate race, for example, has swung 15 points over two weeks, leaving a normally solid Republican state tied between the two candidates. I guess the economic meltdown of the country isn’t a total loss.

I originally intended for my next technical post to be a continuation of the test driven development discussion, but motivated by this thread, I’d like to take a moment and examine just how native code (the kind a C or C++ compiler generates) relates to the reality of how things work. People (who I think are idiots and should feel free to get the hell out of that thread) like to say that C++ is closer to the hardware or more low level or other similar kinds of bullshit. I want to dissect that in a more methodical, fair manner.

The basic misconception seems to be centered around the mysterious gods that C++ noobs worship, pointers. I believe that pointers are basically taught the same way to everybody — they store memory addresses of things, so you use the dereference operator to access what is actually at that memory address. This isn’t wrong, exactly. The problem is that it ignores a number of important details. C (and by extension C++) are very careful to avoid specifying any kind of detail about how their underlying memory architecture works, or saying much of anything about pointers beyond what is necessary to actually define the language behavior.

So what are pointers in C++ land? I think ToohrVyk described it quite well:

A pointer-to-X rvalue, where X is an actual first-class type, can be one of three distinct things:

1. the ‘null pointer’ for the type X, which represents the absence of any value. The null pointer evaluates to false in a boolean context (while all other pointers evaluate to true), and the integer constant zero evaluates to the null pointer in a pointer context.

2. an lvalue of the type X. This is the usual ‘points at an object of type X’. The definition of lvalue says everything there is to know here. The lvalue and its corresponding rvalue can be accessed through dereferencing (*ptr).

3. a past-the-end pointer. Unlike the null pointer, past-the-end pointers are many, and they differ from each other through ‘==’ comparison. They cannot be dereferenced.

Then, there’s the grouping of lvalues: they are grouped in buffers containing zero or more lvalues. A pointer to an lvalue can be incremented or decremented, changing its rvalue to the previous or next lvalue in the buffer if it exists, otherwise resulting in either that buffer’s past-the-end pointer (if incrementing) or in undefined behaviour (if decrementing). Decrementing a past-the-end pointer yields the last lvalue in the associated buffer, or undefined behaviour if the buffer is empty. Such buffers are created every time you allocate data on the stack or heap, with pointers to the first lvalue being returned in the latter case, or obtained with &var in the former.

The matters are further complexified by the notion of memory layout compatibility, which allows one to see a buffer of X lvalues as a buffer of Y lvalues, under certain conditions of alignment, padding and size. These, I will not go into here, but they are the fundamental element behind casting structures to a buffer of bytes, or behind unions.

The usual ‘pointers are addresses’ works fine, as long as you consider an address to be a synonym for an lvalue or past-the-end, though it does miss on a lot of subtleties described above. And as soon as you get the strange notion that addresses are numbers, which is almost universally inflicted upon beginners by tutorials and books, you’re off course. Unlike numbers, pointers can only be compared for order in very specific cases: when they’re within the same buffer. Unlike numbers, pointers cannot reliably be converted to and from numbers (though C99 has done some efforts to solve this) and can certainly not respond correctly to arithmetics on numbers. The list of discrepancies goes on. Ultimately, code such as z[1337]++; actually consists in incrementing an lvalue, not accessing a memory address and incrementing the value found there.
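To make the three kinds of pointer value a bit more concrete, here is a minimal sketch. It assumes a C++11 or later compiler (for nullptr), and the names sum_buffer and pointer_kinds_demo are made up for illustration. It sticks to the operations the description above allows: walking a buffer by incrementing, comparing against the past-the-end pointer, and ordering pointers only within one buffer.

```cpp
#include <cassert>

// Hypothetical helper: sums a buffer using only operations the description permits.
int sum_buffer(const int* first, const int* past_the_end) {
    int total = 0;
    // Incrementing moves from one lvalue to the next. Comparing against the
    // past-the-end pointer is fine; dereferencing it would not be.
    for (const int* p = first; p != past_the_end; ++p) {
        total += *p;
    }
    return total;
}

// Hypothetical demonstration of the three kinds of pointer value.
void pointer_kinds_demo() {
    int buffer[4] = {1, 2, 3, 4};

    int* null_ptr = nullptr;         // 1: the null pointer, false in a boolean context
    int* first = &buffer[0];         // 2: refers to an lvalue of type int
    int* past_the_end = buffer + 4;  // 3: past-the-end, comparable but not dereferenceable

    assert(!null_ptr);
    assert(first);
    assert(first != past_the_end);

    // Ordering with < is only meaningful because both pointers belong to the
    // same buffer. Ordering pointers into unrelated buffers is not.
    assert(first < past_the_end);

    // Decrementing a past-the-end pointer yields the last lvalue in the buffer.
    assert(*(past_the_end - 1) == 4);

    assert(sum_buffer(first, past_the_end) == 10);
}
```

Notice that nothing in there treats a pointer as a number, and nothing needs to.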

And that doesn’t even touch on the complexities of pointer-to-member constructs, function pointers, and other wonky details of the language. (And don’t forget that the 0 value for a pointer is symbolic and the real address assigned to it may be something else!) Even once you get past all that, you’ve got yet another hurdle: this flat memory layout is a lie. Your “memory address” could be a reference to a value in a register, or it could be replaced outright by a constant expression. And even if it is a memory address, it will be a virtual address, which could refer to main memory, a page file, memory on some other device (video card, memory mapped file), or really anything else a driver or the kernel chooses to expose. It might not be memory; it might not even exist. (Consider the case of memory mapping /dev/zero on a Linux machine.)

That doesn’t leave us with much. Nearly all of the things you can abuse pointers for are either implementation defined or outright illegal, and could do damn near anything outside carefully controlled circumstances. That’s probably why modern C++ code doesn’t really use pointers much anymore, preferring auto_ptr, shared_ptr, intrusive_ptr, weak_ptr, vector, or whatever standard class is most appropriate for representing a certain object. The simple fact is that if you are using pointers to any great effect in your code, you are probably Doing It Wrong, and creating opportunities for subtle and dangerous bugs to wreak havoc throughout your code. Here at Day 1, any use of raw pointers is immediately suspect and examined carefully in code reviews. (We do occasionally use and store raw pointers, but it’s really not a preferred approach.)
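As a rough illustration of what that looks like, here is a small sketch that lets standard classes express ownership instead of raw pointers. It assumes a C++11 or later compiler, where shared_ptr and weak_ptr live in the std namespace (auto_ptr was later deprecated, so it is left out). The Texture type and the function names are hypothetical and are not taken from any real codebase.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical resource type, used only for illustration.
struct Texture {
    int width;
    int height;
};

// Returns true if the observed object still exists.
bool is_alive(const std::weak_ptr<Texture>& observer) {
    return !observer.expired();
}

// Hypothetical demo: ownership and lifetime are carried by the types.
bool ownership_demo() {
    // vector owns its buffer, so there is no new[]/delete[] to get wrong.
    std::vector<int> pixels(16, 0);
    pixels.at(3) = 255;  // at() throws on a bad index instead of undefined behaviour

    std::weak_ptr<Texture> observer;
    {
        std::shared_ptr<Texture> owner = std::make_shared<Texture>(Texture{64, 32});
        observer = owner;  // observes without extending the lifetime

        assert(is_alive(observer));
        assert(observer.lock()->width == 64);
    }  // the last owner goes away here and the Texture is destroyed

    // The observer can tell the object is gone; a raw pointer could not.
    return !is_alive(observer) && pixels.at(3) == 255;
}
```

The interesting part is the last line: a weak_ptr knows when its target has died, whereas a stored raw pointer would simply dangle.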

Tina Fey Is Incredible

In this clip, Tina Fey does Palin in SNL’s parody of the vice presidential debate. It’s amazing.

I was going to write a post and then I didn’t

So have this instead.

Unit Testing

This is still a technical blog first and a personal blog second, so despite a VP debate yesterday, the first Fracture reviews coming out yesterday, and other junk I’d like to talk about, this blog needs a technical post and I’m not going to let reality delay that.

Commonly abbreviated TDD, test driven development is one of the more recent techniques to show up in software engineering. It’s an interesting approach because it takes the normal process of software development and turns it on its head. In order to be able to discuss TDD effectively, though, it’s necessary to make sure everybody is on the same page regarding testing itself, both in the basic process and how it alters the dynamics of software development.

Traditionally, you develop software by iteratively writing pieces, then compiling and running it to confirm that the code you wrote is working as expected. If you’re at a company, there’s probably a QA department who is also poking at the software to try and find holes. If not, well then it’s up to your developer instinct to test likely spots for bugs and ensure that everything is fine. No matter how well organized the QA team is, this testing is still going to be relatively haphazard and time consuming. The reasons that’s not desirable are obvious, but it’s still important to have a test department. They are good at poking at software because it’s their job, and they have a lot more time to dedicate to it than the engineers. They can also identify bugs that are not code bugs. For example, a feature may work exactly as intended, but the actual intention can be flawed or misguided. A good QA department will identify things that are “weird”, not just things that are broken.

It’s also interesting to keep in mind that this does not test the code. It tests the software, which is the end result of a whole lot of code interacting. Most of these interactions are intentional, but some may not be. For example, there might be a block of code to load a configuration file and check that it is sane. This code will sit on top of a function to actually open the file, which should raise an exception if it fails. Now suppose the configuration loader has a bug where it doesn’t correctly check for that error. It would let that exception spill out into its parent code, which would probably result in an application crash or similarly undesirable and obvious behavior. But if the file open function is itself broken so that it doesn’t create that exception the way it’s supposed to, the configuration loader might just keep going as if everything is fine, which could lead to subtle bugs that are not noticed for a long time. If that scenario sounds overly contrived, then you probably don’t have much software development experience. This stuff happens all the time.
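Here is a minimal sketch of that scenario. Everything in it is hypothetical: the fake in-memory Disk, the two file-opening functions, the loader, and the settings.cfg name are all invented for illustration. The loader never checks for failure, so with a correct opener the error is loud, and with a broken opener it quietly carries on with a default.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the file system: path -> file contents.
using Disk = std::map<std::string, std::string>;
using OpenFn = std::string (*)(const Disk&, const std::string&);

// Correct opener: raises an exception when the file is missing.
std::string open_file_correct(const Disk& disk, const std::string& path) {
    Disk::const_iterator it = disk.find(path);
    if (it == disk.end()) {
        throw std::runtime_error("cannot open " + path);
    }
    return it->second;
}

// Broken opener: silently returns empty contents instead of throwing.
std::string open_file_broken(const Disk& disk, const std::string& path) {
    Disk::const_iterator it = disk.find(path);
    return it == disk.end() ? std::string() : it->second;
}

const int kDefaultVolume = 50;

// Hypothetical loader with the bug described above: it never handles a
// failed open, and treats empty contents as "use the default".
int load_volume(OpenFn open_file, const Disk& disk) {
    const std::string text = open_file(disk, "settings.cfg");
    return text.empty() ? kDefaultVolume : std::stoi(text);
}

// Returns true if loading from an empty disk fails visibly.
bool loader_fails_loudly(OpenFn open_file) {
    Disk empty_disk;
    try {
        load_volume(open_file, empty_disk);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

void check_scenario() {
    assert(loader_fails_loudly(open_file_correct));   // obvious failure
    assert(!loader_fails_loudly(open_file_broken));   // subtle: keeps going
}
```

Testing the finished software only ever sees the combination. A test aimed directly at the opener would catch the missing exception no matter what the loader does with it.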

That’s where automated code testing comes in, also known as unit testing. This type of testing works by writing extra code that actually uses the code being tested in a synthetic setup, where it’s isolated and run in a specific way to yield specific results. The test passes if the expected result matches the actual result, and fails otherwise. It’s important for tests to be repeatable, independent, and compact. When tests are repeatable, they will always have the same exact results. That’s important for being able to nail down bugs quickly and efficiently. Independent tests will not interact with each other, which avoids nasty interactions between tests that can create all sorts of bugs (or hide bugs, which is equally bad). And tests need to be compact, because that ensures that they test very specific blocks of code independently, thus minimizing the potential “surface area” for a problem to occur in that test. These tests are typically part of your build system. The compiler checks that your code fits the rules of your language; the tests check that your code fits the rules of your specifications. The specifications, in turn, have now been converted from the original loosely descriptive English explanation into rigorous mathematical definitions of what is actually expected from correctly written code.
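A bare-bones sketch of what repeatable, independent, and compact can look like follows. It uses no test framework at all; clamp_percent and the tiny runner are hypothetical and exist only for illustration. A real project would more likely use an existing framework.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical code under test: clamps a value into the range 0..100.
int clamp_percent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

struct TestResult {
    std::string name;
    bool passed;
};

// Each test is compact (one behaviour), independent (no shared state),
// and repeatable (fixed inputs, no clock, no files, no randomness).
bool test_value_in_range_passes_through() { return clamp_percent(42) == 42; }
bool test_negative_clamps_to_zero() { return clamp_percent(-5) == 0; }
bool test_large_clamps_to_hundred() { return clamp_percent(250) == 100; }

std::vector<TestResult> run_all_tests() {
    return {
        {"value in range passes through", test_value_in_range_passes_through()},
        {"negative clamps to zero", test_negative_clamps_to_zero()},
        {"large clamps to hundred", test_large_clamps_to_hundred()},
    };
}

int count_failures(const std::vector<TestResult>& results) {
    int failures = 0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        if (!results[i].passed) ++failures;
    }
    return failures;
}

// A build step could call this and fail the build when a test fails.
void check_suite() {
    assert(count_failures(run_all_tests()) == 0);
}
```

Because each test touches exactly one behaviour, a failure names the problem almost by itself.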

In order to be effective, unit tests need to be small and numerous, like little Zerglings attacking your code. What that means is that writing tests really sucks. It’s tedious, it’s repetitive, and it can feel pretty silly to be writing tests instead of going through your code for places to refactor and examine. I’ve never liked doing it and I’ll usually avoid it. Recently though, Josh Petrie added NUnit to the SlimDX build process, and I decided to take a couple hours to write some unit tests for the oldest and most used class in it, the Direct3D 9 Device. By the time I was done, I’d identified and fixed three bugs.

It’s hard to argue with results. These were bugs in the single most popular class in a production library that gets several thousand downloads every release. Some of the bugs weren’t even terribly obscure; one only showed up with one particular generic type parameter, and one was actually an interface design mistake. (The set function took two arguments, but the get function returned them as a single bitwise ORed value like the native library does.) As a result, the plan is to significantly expand the number of tests in SlimDX. It’s not fun, but when it comes to effectively writing code that actually works reliably, unit tests are absolutely necessary. It even forces you to examine your interface from the perspective of someone using your code, and as a result you can identify mistakes in it that made sense from the perspective of writing that code, but not using it.
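To show the kind of interface mismatch a round-trip test exposes, here is a sketch in C++. It is emphatically not the SlimDX API: the class names, the sampler methods, and the flag values are all invented for illustration. The first class mirrors the shape of the mistake (two arguments in, one ORed value out), and the second shows one possible symmetric alternative.

```cpp
#include <cassert>

// Hypothetical flag values, not taken from any real API.
const unsigned kFilterLinear = 0x01;
const unsigned kWrapClamp = 0x10;

// Shape of the mistake: set takes two values, get returns them ORed together.
class AsymmetricDevice {
public:
    void set_sampler(unsigned filter, unsigned wrap) { packed_ = filter | wrap; }
    unsigned get_sampler() const { return packed_; }

private:
    unsigned packed_ = 0;
};

struct SamplerState {
    unsigned filter;
    unsigned wrap;
};

// One possible fix: get hands back the same two values that set accepted.
class SymmetricDevice {
public:
    void set_sampler(unsigned filter, unsigned wrap) {
        state_ = SamplerState{filter, wrap};
    }
    SamplerState get_sampler() const { return state_; }

private:
    SamplerState state_ = SamplerState{0, 0};
};

// Writing this test is what makes the awkwardness visible.
void round_trip_checks() {
    AsymmetricDevice old_device;
    old_device.set_sampler(kFilterLinear, kWrapClamp);
    // The caller has to know the packing to make sense of the result.
    assert(old_device.get_sampler() == (kFilterLinear | kWrapClamp));

    SymmetricDevice new_device;
    new_device.set_sampler(kFilterLinear, kWrapClamp);
    assert(new_device.get_sampler().filter == kFilterLinear);
    assert(new_device.get_sampler().wrap == kWrapClamp);
}
```

The first assertion is the giveaway: the test has to rebuild the packed value by hand just to check that set and get agree.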

As I said before, tests provide a rigorous definition of how you expect code to behave. They provide a proper mathematical specification, and writing the tests requires you to take a good hard look at how your specification translates into actually using the code. But what I’ve described so far isn’t what test driven development means. There are a lot of implications to TDD, and my next technical entry will take a good look at some of those implications. The basic concept, though, is dead simple. I’ll sum it up in one sentence: What happens if you write your tests before writing the code they’re intended to test?

Welcome to Ventspace

I’ve played with blogging multiple times before. My original blog was Element 61, which gained some decent traffic thanks to dissecting some of the Quake 3 codebase that had recently been released. I got bored of it fairly quickly, though, after failing to write a coherent discussion of my terrain engine. More recently I maintained GameDev.Net Journal, which I got access to after being granted moderator privileges. I made a number of posts there, almost entirely technical. I lost steam on that several months ago. And so now I’m here, on attempt three, with Ventspace.

Why?

A couple reasons. There are any number of idiots out there who can code, and well. For me, that’s not enough. It’s inadequate to just code. One of the key aspects of being a software developer is being able to communicate effectively. And in an increasingly electronic world, writing is key to communication. So first and foremost, I’m writing here because that’s the only way I can improve as a writer. Then there’s the idea that I have something to share with other developers, and that other developers can gain insight into their own work by reading what I have to say. It’s why I read The Old New Thing and Coding Horror. (Although Atwood’s been slacking ever since the Stack Overflow project started.) Last, there’s the simple fact that people seem to enjoy my writing. People seem to like my weekly entries for The Daily GameDev.Net, and my regular forum posting as well. Unfortunately that forum posting has (somewhat necessarily, due to being a moderator now) lost its edge, and I’m thinking I’ll use this blog to get that feel back.

But why here?

Why not my journal, or my older Blogspot site? For one thing, I have technical objections against both those blogs, and it looks like WordPress is a far more promising alternative. GameDev’s real advantage is that my direct, best audience is right there and able to comment easily, but it is lacking in features. (Plus, I don’t like the URL format it uses.) Blogspot’s integration with Google is handy, but other than that I’m just more impressed by what I see with WordPress. The main reason, though, is that I’ve decided to try and draw a line in the sand. I think I read this most recently on Coding Horror, but it’s advice that you will get from practically any experienced writer/blogger — the only way to successfully run a blog is to post regularly, and often. That’s the way you get the rhythm of writing down, become more efficient at writing, and frankly it’s the only way to build an audience. Unfortunately, I’m terrible with schedules. Truly terrible. It’s a miracle I paid my rent today. So I’m not committing to specific update days. However, I’ve decided that I will, as much as possible, put three new entries a week into Ventspace, starting now.

Quality probably won’t be consistent, and I guarantee you I can’t sustain a rate of three insightful, technical posts a week. Some of them are going to be stupid. I might recycle a few from the older blogs. That’s fine though; that’s how a blog works. Sometimes I will post a magnificent essay on SlimDX’s development process, and sometimes I will post funny pictures. But one way or another, I will keep new content flowing, and hopefully people will read it. Ventspace IS a technical blog by a professional programmer, but I think a more interesting mix can’t hurt. And to start off that more interesting mix, allow me to reintroduce myself.

I’m Promit Roy. I live in Baltimore, and work for Day 1 Studios as a Core Technology Engineer. My focus has historically been in games, and to a lesser extent tools and other client applications. Right now I am mainly in charge of build system development/maintenance (a role I share with my immediate boss) and tools development. I’m kind of a “whatever needs to get done” infrastructure Engineer at Day 1. This is my blog, Ventspace, and it has no affiliation with Day 1 and does not represent their views. My interests in software development are not terribly specific, but I like mid to low level architecture stuff, and I hate the web. But I’m pragmatic, and the web seems rather more important these days, so I’m planning to learn that too. I also love cars and talking about cars. I’m writing because unlike most of the software developers out there, I’m actually good at it and I’m not going to let that ability rot.

Welcome to Ventspace.