Games Look Bad, Part 1: HDR and Tone Mapping

This is Part 1 of a series examining techniques used in game graphics and how those techniques fail to deliver a visually appealing end result. See Part 0 for a more thorough explanation of the idea behind it.

High dynamic range. First experienced by most consumers in late 2005, with Valve’s Half Life 2: Lost Coast demo. Largely faked at the time due to technical limitations, but it laid the groundwork for something we take for granted in nearly every blockbuster title. The contemporaneous reviews were nothing short of gushing. We’ve been busy making a complete god awful mess of it ever since.

Let’s review, very quickly. In the real world, the total contrast ratio between the brightest highlights and darkest shadows during a sunny day is on the order of 1,000,000:1. We would need 20 bits of just luminance to represent those illumination ranges, before even including color in the mix. A typical DSLR can record 12-14 bits (16,000:1 in ideal conditions). A typical screen can show 8 (curved to 600:1 or so). Your eyes… well, it’s complicated. Wikipedia claims 6.5 (100:1) static. Others disagree.

Graphics programmers came up with HDR and tone mapping to solve the problem. Both film and digital cameras have this same issue, after all. They have to take enormous contrast ratios at the input, and generate sensible images at the output. So we use HDR to store the giant range for lighting computations, and tone maps to collapse the range to screen. The tone map acts as our virtual “film”, and our virtual camera is loaded with virtual film to make our virtual image. Oh, and we also throw in some eye-related effects that make no sense in cameras and don’t appear in film for good measure. Of course we do.

And now, let’s marvel in the ways it goes spectacularly wrong.

battlefield_1_1020170716-rural-08_1500890393cod13232272656804_ccca70cc7e_o

In order: Battlefield 1, Uncharted: Lost Legacy, Call of Duty: Infinite Warfare, and Horizon Zero Dawn. HZD is a particular offender in the “terrible tone map” category and it’s one I could point to all day long. And so we run head first into the problem that plagues games today and will drive this series throughout: at first glance, these are all very pretty 2017 games and there is nothing obviously wrong with the screenshots. But all of them feel videogamey and none of them would pass for a film or a photograph. Or even a reasonably good offline render. Or a painting. They are instantly recognizable as video games, because only video games try to pass off these trashy contrast curves as aesthetically pleasing. These images look like a kid was playing around in Photoshop and maxed the Contrast slider. Or maybe that kid was just dragging the Curves control around at random.

The funny thing is, this actually has happened to movies before.

maxresdefault

Hahaha. Look at that Smaug. He looks terrible. Not terrifying. This could be an in-game screenshot any day. Is it easy to pick on Peter Jackson’s The Hobbit? Yes, it absolutely is. But I think it serves to highlight that while technical limitations are something we absolutely struggle with in games, there is a fundamental artistic component here that is actually not that easy to get right even for film industry professionals with nearly unlimited budgets.

Allow me an aside here into the world of film production. In 2006, the founder of Oakley sunglasses decided the movie world was disingenuous in their claims of what digital cameras could and could not do, and set out to produce a new class of cinema camera with higher resolution, higher dynamic range, higher everything than the industry had and would exceed the technical capabilities of film in every regard. The RED One 4K was born, largely accomplishing its stated goals and being adopted almost immediately by one Peter Jackson. Meanwhile, a cine supply company founded in 1917 called Arri decided they don’t give a damn about resolution, and shipped the 2K Arri Alexa camera in 2010. How did it go? 2015 Oscars: Four of the five nominees in the cinematography category were photographed using the ARRI Alexa. Happy belated 100th birthday, Arri.

So what gives? Well, in the days of film there was a lot of energy expended on developing the look of a particular film stock. It’s not just chemistry; color science and artistic qualities played heavily into designing film stocks, and good directors/cinematographers would (and still do) choose particular films to get the right feel for their productions. RED focused on exceeding the technical capabilities of film, leaving the actual color rendering largely in the hands of the studio. But Arri? Arri focused on achieving the distinctive feel and visual appeal of high quality films. They better understood that even in the big budget world of motion pictures, color rendering and luminance curves are extraordinarily difficult to nail. They perfected that piece of the puzzle and it paid off for them.

Let’s bring it back to games. The reality is, the tone maps we use in games are janky, partly due to technical limitations. We’re limited to a 1D luminance response where real film produces both hue and saturation shifts. The RGB color space is a bad choice to be doing this in the first place. And because nobody in the game industry has an understanding of film chemistry, we’ve all largely settled on blindly using the same function that somebody somewhere came up with. It was Reinhard in years past, then it was Hable, now it’s ACES RRT. And it’s stop #1 on the train of Why does every game this year look exactly the goddamn same?

The craziest part is we’re now at the point of real HDR televisions showing game renders with wider input ranges. Take this NVIDIA article which sees the real problem and walks right past it. The ACES tone map is destructive to chroma. Then they post a Nikon DSLR photo of a TV in HDR mode as a proxy for how much true HDR improves the viewing experience. Which is absolutely true – but then why does the LDR photo of your TV look so much better than the LDR tone map image? There’s another tone map in this chain which nobody thought to examine: Nikon’s. They have decades of expertise in doing this. Lo and behold, their curve makes a mockery of the ACES curve used in the reference render. Wanna know why that is? It’s because the ACES RRT was never designed to be an output curve in the first place. Its primary design goal is to massage differences between cameras and lenses used in set so they match better. You’re not supposed to send it to screen! It’s a preview/baseline curve which is supposed to receive a film LUT and color grading over top of it.

“Oh, but real games do use a post process LUT color grade!” Yeah, and we screwed that up too. We don’t have the technical capability to run real film industry LUTs in the correct color spaces, we don’t have good tools to tune ours, they’re stuck doing double duty for both “filmic look” as well as color grading, the person doing it doesn’t have the training background, and it’s extraordinary what an actual trained human can do after the fact to fix these garbage colors. Is he cheating by doing per-shot color tuning that a dynamic scene can’t possibly accomplish? Yes, obviously. But are you really going to tell me that any of these scenes from any of these games look like they are well balanced in color, contrast, and overall feel?

Of course while we’re all running left, Nintendo has always had a fascinating habit of running right. I can show any number of their games for this, but Zelda: Breath of the Wild probably exemplifies it best when it comes to HDR. double_1487330294849_file_the_legend_of_zelda_-_breath_of_the_wild_screenshot___3__

No HDR. No tone map. The bloom and volumetrics are being done entirely in LDR space. (Or possibly in 10 bit. Not sure.) Because in Nintendo’s eyes, if you can’t control the final outputs of the tone mapped render in the first place, why bother? There’s none of that awful heavy handed contrast. No crushed blacks. No randomly saturated whites in the sunset, and saturation overall stays where it belongs across the luminance range. The game doesn’t do that dynamic exposure adjustment effect that nobody actually likes. Does stylized rendering help? Sure. But you know what? Somebody would paint this. It’s artistic. It’s aesthetically pleasing. It’s balanced in its transition from light to dark tones, and the over-brightness is used tastefully without annihilating half the sky in the process.

Now I don’t think that everybody should walk away from HDR entirely. (Probably.) There’s too much other stuff we’ve committed to which requires it. But for god’s sake, we need to fix our tone maps. We need to find curves that are not so aggressively desaturating. We need curves that transition contrast better from crushed blacks to mid-tones to blown highlights. LUTs are garbage in, garbage out and they cannot be used to fix bad tone maps. We also need to switch to industry standard tools for authoring and using LUTs, so that artists have better control over what’s going on and can verify those LUTs outside of the rendering engine.

In the meantime, the industry’s heavy hitters are just going to keep releasing this kind of over-contrasty garbage.

45hfdmf

Before I finish up, I do want to take a moment to highlight some games that I think actually handle HDR very well. First up is Resident Evil 7, which benefits from a heavily stylized look that over-emphasizes contrast by design.

image_resident_evil_7_32138_3635_0003

That’s far too much contrast for any normal image, but because we’re dealing with a horror game it’s effective in giving the whole thing an unsettling feel that fits the setting wonderfully. The player should be uncomfortable with how the light and shadows collide. This particular scene places the jarring transition right in your face, and it’s powerful.

Next, at risk of seeming hypocritical I’m going to say Deus Ex: Mankind Divided (as well as its predecessor).

041599

The big caveat with DX is that some scenes work really well. The daytime outdoors scenes do not. The night time or indoor scenes that fully embrace the surrealistic feeling of the world, though, are just fantastic. Somehow the weird mix of harsh blacks and glowing highlights serves to reinforce the differences between the bright and dark spots that the game is playing with thematically throughout. It’s not a coincidence that Blade Runner 2049 has many similarities. Still too much contrast though.

Lastly, I’m going to give props to Forza Horizon 3.

forzahorizon39_25_201xiyor

 

Let’s be honest: cars are “easy mode” for HDR. They love it. But there is a specific reason this image works so well. It is low contrast. Nearly all of it lives in the mid-tones, with only a few places wandering into deep shadow (notably the trees) and almost nothing in the bright highlights. But the image is low contrast because cars themselves tend to use a lot of black accents and dark regions which are simply not visible when you crush the blacks as we’ve seen in other games. Thus the toe section of the curve is lifted much more than we normally see. Similarly, overblown highlights mean whiting out the car in the specular reflections, which are big and pretty much always image based lighting for cars. It does no good to lose all of that detail, but the entire scene benefits from the requisite decrease in contrast. The exposure level is also noticeably lower, which actually leaves room for better mid-tone saturation. (This is also a trick used by Canon cameras, whose images you see every single day.) The whole image ends up with a much softer and more pleasant look that doesn’t carry the inherent stress we find in the images I criticized at the top. If we’re looking for an exemplar for how to HDR correctly in a non-stylized context, this is the model to go by.

Where does all this leave us? With a bunch of terrible looking games, mostly. There are a few technical changes we need to make right up front, from basic decreases in contrast to simple tweaks to the tone map to improved tools for LUT authoring. But as the Zelda and Forza screenshots demonstrate, and as the Hobbit screenshot warns us, this is not just a technical problem. Bad aesthetic choices are being made in the output stages of the engine that are then forced on the rest of the creative process. Engine devs are telling art directors that their choices in tone maps are one of three and two are legacy options. Is it bad art direction or bad graphics engineering? It’s both, and I suspect both departments are blaming the other for it. The tone map may be at the end of graphics pipeline, but in film production it’s the first choice you make. You can’t make a movie without loading film stock in the camera, and you only get to make that choice once (digital notwithstanding). Don’t treat your tone map as something to tweak around the edges when balancing the final output LUT. Don’t just take someone else’s conveniently packaged function. The tone map’s role exists at the beginning of the visual development process and it should be treated as part of the foundation for how the game will look and feel. Pay attention to the aesthetics and visual quality of the map upfront. In today’s games these qualities are an afterthought, and it shows.

UPDATE: User “vinistois” on HackerNews shared a screenshot from GTA 5 and I looked up a few others. It’s very nicely done tone mapping. Good use of mid-tones and contrast throughout with great transitions into both extremes. You won’t quite mistake it for film, I don’t think, but it’s excellent for something that is barely even a current gen product. This is proof that we can do much better from an aesthetic perspective within current technical and stylistic constraints. Heck, this screenshot isn’t even from a PC – it’s the PS4 version.

place

Advertisements

Games Look Bad, Part 0: Explanation and Defense

I’m about to start a series of blog posts called Games Look Bad. Before I start throwing stones from my glass house over here, I wanted to offer an explanation of what I’m doing and a defense of why I’m doing it.

There’s no doubt that we’ve seen a sustained and significant period of improvement in real-time computer graphics over the past three decades. We’ve made significant advances in nearly every aspect of visual look and feel,  drawing quite a bit from the film industry in the process. So why the heck do most games look so bad?

Games are technically much more sophisticated than ever before, but I’m going to stake out a claim: aesthetically something has gone quite wrong, and the products don’t live up to the hype. Show me a next-gen, cutting edge game and I will show you an image that no competent film industry professional would ever deem acceptable. Why not? The answer lives at the crossroads of art and technology, a strange neglected intermediary which we in the industry tend to avoid talking about. Particularly in the last ten years, several new techniques have appeared that are foundational to practically every high end game on the market. These are well documented from a technical standpoint, and it’s generally assumed that graphics programmers who have stayed current are fluent in at least the basic goals and implementations of these techniques, if not the finer points of them. I won’t labor to build a complete list, but you likely know them: normal maps, HDR/tonemaps, physically based shading, volumetrics, DoF/bokeh, etc.

What’s extremely difficult to find, though, is a discussion of how to make these techniques visually appealing. Oh sure, we’ll sort of handwave it from time to time, but graphics programmers as a set don’t like talking about visual appeal in the way that artists do. It’s much easier to build the tools and then let the artists make it pretty. Except the artists, even the tech artists, don’t always have the know-how or mathematical tools to solve that problem. Sometimes we end up borrowing our looks from someone else – how many of you have googled FilmLut.tga? How many of you are using Unreal’s tone map operator, tweaked or even verbatim?

This series is going to take a sharply critical tone towards most AAA games being shipped today, because it’s my belief that there are fundamental problems with many of the techniques we’re using today that reach beyond strictly technical constraints. Graphics programmers and engines are implementing many techniques for new effects without taking the time or energy to properly refine the visual and aesthetic aspects of those effects. Marketing tells us we should be impressed by all the new features, yet when you take a step back from the fact that these are games and evaluate the images without that context, they look horrible. This is a problem that is fixable today, with current technology.

I don’t know if my thesis here is particularly well developed, but it’s a good excuse for the meat of this series. I don’t want to talk about how to implement techniques. There are many people who have done an excellent job of that and you should have that background coming in. I’m going to talk about the visual choices we make in these techniques, how they make our games better, how they make our games worse, and whether we’re using them well. I’m going to encourage everyone to think critically about why and how we’re implementing the things that make modern games tick, and examine the tunnel vision that has afflicted that process maybe since the beginning. And in the process, I’m going to criticize people’s work which far exceeds my own in every respect, while largely failing to provide solutions to problems. I know that and I accept it. And that is where we shall start.

Game Debugging And Tweaking Via MIDI Controller

Today, we’re going to talk about how to configure a MIDI controller to act as a debugging aid for a game – or any software development project.

Why a MIDI controller?

1024px-livid_alias8_midi_controller
Photo credit: Wikimedia Commons / CC BY-SA 3.0

Why would you want to do this? It’s very common to have a set of internal variables that we want to tweak on the fly, while the game is running, until things are just right. There’s animation, physics, AI, game play and difficulty, graphics, UI, and more. Typically people end up homebrewing a variety of tricks to edit these variables. You might ‘borrow’ a couple keyboard keys and throw an on screen text display in. You might have a developer console command and type in the values. If you really get fancy, you might even have a TCP debug channel that can control things from another UI, maybe even on another computer.

These options all have their pros and cons. Why add MIDI as another option to the mix?

  • It’s really easy and safe to disable for releases. We’re probably not using MIDI for any legitimate user purpose, after all.
  • It doesn’t require any UI, on any screen.
  • Supports platforms like mobile where keystroke debugging can be particularly problematic.
  • It’s quick to get to edit a large range of parameters in tandem.
  • Editing without explicit numerical values avoids the problem of sticking to “convenient” numbers.
  • It’s easy to give this setup to a non-technical person and let them toy around.
  • Tactile editing is just plain fun.

How does a MIDI controller work?

Over the MIDI protocol, naturally. MIDI was standardized in 1983 as a way for musical tools like keyboards, samplers, sequencers, and all sorts of computerized devices to communicate with each other. While in ye olden days the connection was made over a 7 pin DIN cable, many modern controllers have USB interfaces. These are class compliant devices, which means you can simply plug them into a computer and go, without any drivers or mandatory support software.

MIDI itself is a slightly wonky protocol, based strictly around events. There’s a wide range of events, but for our purposes we only really need four: Note On, Note Off, Control Change, and Program Change. Note On/Off are easy – they’re your key down/up equivalents for a MIDI controller. Control Change (CC) represents a value change on an analog control, like a knob or slider. Program Change changes an entire bank or preset.

Note what’s not in there: state. The whole thing is designed to be stateless, which means no state queries either. But many MIDI controllers do have state, in the physical position of their knobs or sliders. That leads us to the dreaded MIDI CC jump. In short, what happens when a value in our software is set to 0 (min) and the user twists a knob is at 127 (max)? The knob will transmit a CC with a value of 126 attached, causing the variable it’s attached to to skyrocket. This desynchronization between software and hardware state can be confusing, inconvenient, and downright problematic.

Choosing a MIDI controller

x-touch-mini_p0b3m_top_l
Photo Credit: Behringer/MUSIC Group

Enter the Behringer X-Touch Mini. For a trivial $60 street at time of writing, take a look at what it gives us:

  • 8 rotary encoders with LED collars for value display
  • Volume slider
  • 24 push buttons (8 push-knob, 16 LED) with configurable behavior (momentary/toggle)
  • Dual control layers, so multiply everything above by two
  • Simple class-compliant USB connectivity

Now back up and read that again. Rotary encoders. LED collars. Get it? Encoders aren’t knobs – they spin freely. The value isn’t based on the position of a potentiometer, but digitally stored and displayed using the LEDs around the outside of each encoder. Same goes for the button states. This is a controller that can not only send CC messages but receive them too, and update its own internal state accordingly. (The volume slider is not digital and will still cause CC jumps.)

Setting up the X-Touch Mini

xtouchedit
Photo Credit: Promit Roy / CC BY-SA 3.0

Behringer offers a program called X-Touch Editor which you’ll want to download. Despite the terrible English and oddball UI, It allows you to configure the X-Touch Mini’s behavior and save it persistently onto the controller’s internal memory. You can change the display style of the LEDs (I recommend fan) and change the buttons between momentary (like a keyboard button) and toggle (push on, push again off). It also offers other options for MIDI behavior, but I recommend leaving that stuff where it is. For my purposes, I set the center row of buttons to act as toggles, and left the rest of it alone.

Understanding the MIDI protocol

At this stage it might be helpful to download a program called MIDI-OX. This is a simple utility that lets you send and monitor MIDI messages, which is very useful in understanding exactly what’s happening when you are messing with the controller. Note that MIDI devices are acquired exclusively – X-Touch Edit, MIDI-OX, and your own code will all lock each other out.

The MIDI messages we’ll be using have a simple structure: a four bit message ID, a four bit channel selection, followed by one or two data bytes. The X-Touch Mini is configured to use channel 11 for its controls out of the box. Here are the relevant message IDs:

enum MidiMessageId
{
    MIDI_NoteOn = 144,
    MIDI_NoteOff = 128,
    MIDI_ControlChange = 176,
    MIDI_ProgramChange = 192,
};

Channels are 0 based in the protocol, so you’ll add 10 to these values when submitting them. NoteOn and NoteOff are accompanied by a note ID (like a key code) and a velocity (pressure). The key codes go from 0-23 for layer A, representing the buttons left to right and top to bottom. When you switch to layer B, they’ll switch over to 24-47. You’ll receive On and Off to reflect the button state, and you can also send On and Off back to the controller to set Toggle mode buttons on or off. This is really handy when hooked up to internal boolean variables: light on = true, light off = false. We’re not interested in the pressure field, but it does need to be set correctly. The values are 127 for NoteOn and 0 for NoteOff.

ControlChange (CC) will appear any time the knobs are rotated, with the knob ID and the current value as data bytes. The default range is 0-127; I recommend leaving it that way and rescaling it appropriately in software. For layer A, the knob IDs go from 1-8 (yes, they’re one based) and the slider is assigned to 9. On layer B that’ll be 10-17 and 18. You can transmit identical messages back to the controller to set the value on its end, and the LED collar will automatically update to match.

The X-Touch will never send you a ProgramChange (PC) by default. However on the current firmware, it will ignore messages that don’t apply to the currently active layer. You can send it a Program Change to toggle between layer A (0) and B (1), and then send the rest of your data for that layer to sync everything properly. PC only has a single data byte attached, which is the desired program.

Writing the MIDI glue code

Go ahead and grab RtMidi. It’s a nice little open source library, which is really just composed of a single header and source file pair, but supports Windows, Linux, and Mac OSX. (I have an experimental patch for iOS support as well that may go up soon.) I won’t cover how to use the library in detail here, as that’s best left to the samples – the code you need is right on their homepage – but I will give a quick overview.

You’ll need to create two objects: RtMidiIn for receiving data, and RtMidiOut for sending it. Each of these has to be hooked to a “port” – since multiple MIDI devices can be attached, this allows you to select which one you want to communicate with. The easiest thing to do here is just to search the port lists for a string name match.

At this point it’s just a question of signing up for the appropriate callbacks and parsing out their data, and then sending the correct messages back. The last step is to bidirectionally synchronize variables in your code to the MIDI controller. I did it with some template/macro nonsense:

void GameStage::MidiSyncOut()
{
	if(_midiOut)
	{
		_midiOut->Send(MIDI_ProgramChange, 0);
		MidiVars(1, 0, 0, 0);
		_midiOut->Send(MIDI_ProgramChange, 1);
		MidiVars(1, 0, 0, 0);
		_midiOut->Send(MIDI_ProgramChange, 0);
	}
}

void GameStage::NoteOn(unsigned int channel, unsigned int note, unsigned int velocity)
{
	MidiVars(0, 0, note, velocity);
}

void GameStage::NoteOff(unsigned int channel, unsigned int note, unsigned int velocity)
{
	MidiVars(0, 0, note, velocity);
}

void GameStage::ControlChange(unsigned int channel, unsigned int control, unsigned int value)
{
	MidiVars(0, 1, control, value);
}

template<typename T1, typename T2, typename T3> void MidiVariableOut(const T1& var, T2 min, T3 max, unsigned int knob, MidiOut* midiOut)
{
	float val = (var - min) / (max - min);
	val = clamp(val, 0.0f, 1.0f);
	unsigned char outval = unsigned char(val * 127);
	midiOut->Send(MIDI_ControlChange + 10, knob, outval);
}

template<typename T1, typename T2, typename T3> void MidiVariableIn(T1& var, T2 min, T3 max, unsigned int knob, unsigned int controlId, unsigned int noteOrCC, unsigned char value)
{
	if(noteOrCC && knob == controlId)
	{
		float ratio = value / 127.0f;
		var = T1(min + (max - min) * ratio);
	}
}

void MidiBoolOut(const bool& var, unsigned int button, MidiOut* midiOut)
{
	if(var)
		midiOut->Send(MIDI_NoteOn + 10, button, 127);
	else
		midiOut->Send(MIDI_NoteOff + 10, button, 0);
}

void MidiBoolIn(bool& var, unsigned int button, unsigned int controlId, unsigned int noteOrCC, unsigned char value)
{
	if(!noteOrCC && button == controlId)
	{
		var = value > 0;
	}
}

#define MIDIVAR(var, min, max, knob) inout ? MidiVariableOut(var, min, max, knob, _midiOut) : MidiVariableIn(var, min, max, knob, controlId, noteOrCC, value)
#define MIDIBOOL(var, button) inout ? MidiBoolOut(var, button, _midiOut) : MidiBoolIn(var, button, controlId, noteOrCC, value)
void GameStage::MidiVars(unsigned int inout, unsigned int noteOrCC, unsigned int controlId, unsigned int value)
{
	if(!_midiOut)
		return;

	MIDIVAR(_fogTopColor.x, 0.0f, 1.0f, 1);
	MIDIVAR(_fogTopColor.y, 0.0f, 1.0f, 2);
	MIDIVAR(_fogTopColor.z, 0.0f, 1.0f, 3);
	MIDIVAR(_fogBottomColor.x, 0.0f, 1.0f, 4);
	MIDIVAR(_fogBottomColor.y, 0.0f, 1.0f, 5);
	MIDIVAR(_fogBottomColor.z, 0.0f, 1.0f, 6);
	MIDIVAR(CoCScale, 1.0f, 16.0f, 8);

	MIDIBOOL(MUSIC, 8);
	MIDIBOOL(RenderEnvironment, 9);
	MIDIBOOL(DepthOfField, 10);
}

Probably not going to win any awards for that, but it does the trick. MidiVars does double duty as an event responder and a full data uploader. MidiSyncOut just includes the PC messages to make sure both layers are fully updated. Although I haven’t done it here, it would be very easy to data drive this, attach variables to MIDI controls from dev console, etc.

Once everything’s wired up, you have a completely independent physical display of whatever game values you want, ready to tweak and adjust to your heart’s content at a moment’s notice. If any of you have particularly technically minded designers/artists, they’re probably already using MIDI controllers along with independent glue apps that map them to keyboard keys. Why not cut out the middle man and have explicit engine/tool support?

Sony A77 Mark II: EVF Lag and Blackout Test

I’m planning to review this camera properly at some point, but for the time being, I wanted to do a simple test of what the parameters of EVF lag and blackout are.

Let’s talk about lag first. What do we mean? The A77 II uses an electronic viewfinder, which means that the viewfinder is a tiny LCD panel, showing a feed of what the imaging sensor currently sees. This view takes camera exposure and white balance into exposure, allowing you to get a feel for what the camera is actually going to record when the shutter fires. However, downloading and processing the sensor data, and then showing it on the LCD, takes time. This shutter firing needs to compensate for this lag; if you hit the shutter at the exact moment an event occurs on screen, the lag is how late you will actually fire the shutter as a result.

How do we test the lag? Well, the A77 II’s rear screen shows exactly the same display as the viewfinder, presumably with very similar lag. So all we have to do is point the camera at an external timer, and photograph both the camera and the timer simultaneously. And so that’s exactly what I did.
P1030514-screen
Note that I didn’t test whether any particular camera settings affected the results. The settings are pretty close to defaults. “Live View Display” is set to “Setting Effect ON”. These are the values I got, across 6 shots, in millseconds:
32, 16, 17, 34, 17, 33 = 24.8 ms average
I discarded a few values due to illegible screen (mid transition), but you get the picture. The rear LCD, and my monitor, are running at a 60 Hz refresh rate, which means that a new value appears on screen every ~16.67 ms. The lag wobbles between one and two frames, but this is mostly due to the desynchronization of the two screen refresh intervals. It’s not actually possible to measure any finer using this method, unfortunately. However the average value gives us a good ballpark value of effectively 25 ms. Consider that a typical computer LCD screen is already going to be in the 16ms range for lag, and TVs are frequently running in excess of 50ms. This is skirting the bottom of what the fastest humans (pro gamers etc) can detect. Sony’s done a very admirable job of getting the lag under control here.

Next up: EVF blackout. What is it? Running the viewfinder is essentially a continuous video processing job for the camera, using the sensor feed. In order to take a photo, the video feed needs to be stopped, the sensor needs to be blanked, the exposure needs to be taken, the shutter needs to be closed, the image downloaded off the sensor into memory, then the shutter must open again and the video feed must be resumed. The view of the camera goes black during this entire process, which can take quite a long time. To test this, I simply took a video of the camera while clicking off a few shots (1/60 shutter) in single shot mode. Here’s a GIFed version at 20 fps:
P1030523
By stepping through the video, I can see how long the screen is black. These are the numbers I got, counted in 60 Hz video frames:
17, 16, 16, 17, 16, 16 = 272 ms average
The results here are very consistent; we’ll call it a 0.27 second blackout time. For comparison, Canon claims that the mirror blackout on the Canon 7D is 0.055 seconds, so this represents a substantial difference between the two cameras. It also seems to be somewhat worse than my Panasonic GH4, another EVF based camera, although I haven’t measured it. I think this is an area which Sony needs to do a bit more, and I would love to see a firmware update to try and get this down at least under 200 ms.

It’s worth noting that the camera behaves differently in burst mode, going to the infamous “slideshow” effect. At either 8 or 12 fps settings, the screen shows the shot just taken rather than a live feed. This quantization makes “blackout time” slightly meaningless, but it can present a challenge when tracking with erratically moving subjects.

Time Capsule Draft: “Speculating About Xbox Next”

I was digging through my Ventspace post drafts, and I found this writeup that I apparently decided not to post. It was written in March of 2012, a full year and a half before the Xbox One arrived in the market. In retrospect, I’m apparently awesome. On the one hand, I wish I’d posted this up at the time, because it’s eerily accurate. On the other hand, the guesses are actually accurate enough that this might have looked to Microsoft like a leak, rather than speculation. Oh well. Here it is for your amusement. I haven’t touched a thing about it.


I’ve been hearing a lot of rumors, though the credibility of any given information is always suspect. I have some supposed info about the specs on the next Xbox, but I’m not drawing on any of that info here. I’m dubious about at least some of the things I heard, and it’s not good to spill that kind of info if you’re trying to maintain a vaguely positive relationship with a company anyway. So what I’m presenting here is strictly speculation based on extrapolation of what we’ve seen in the past and overall industry and Microsoft trends. I’m also assuming that MS is fairly easy to read and that they’re unlikely to come out of left field here.

  • 8 GB shared memory. The original Xbox had 64 MB of shared memory. The Xbox 360 has 512, a jump of 8x. This generation is dragging along a little longer, and memory prices have dropped violently in the last year or so. I would like to see 16 GB actually, but the consoles always screw us on memory and I just don’t think we’ll be that lucky. 4 GB is clearly too low, they’d be insane to ship a console with that now. As for the memory type, we’re probably talking simple (G)DDR3 shared modules. The Xboxes have always been shared memory and there’s no reason for them to change that now. Expect some weird addressing limitations on the GPU side.
  • Windows 8 kernel. All indications are that the WinCE embedded kernel is being retired over the next two years (at least for internal use). There’s a substantial tech investment in Windows 8, and I think we’re going to see the desktop kernel roll out across all three screens. (HINT HINT.) iOS and Android are both running stripped desktop kernels, and the resources in current mobile platforms make WinXP’s minimum hardware requirements look comically low. There is no reason to carry the embedded kernel along any longer. I wouldn’t want to be a CE licensee right now.
  • x86-64, 8×2 threads, out of order CPU. There are three plausible CPU architectures to choose from: x86, ARM, and PowerPC. Remember what I said about the Windows 8 kernel? There’s no Windows 8 PPC build, and we’re not going to see PowerPC again here. ARM is of course a big focus right now, but the design parameters of the current chips simply won’t accommodate a console. They’re not fast enough and that can’t be easily revised. That pretty much leaves us with x86. The only extant in-order x86 architecture is Intel Atom, which sucks. I think they’ll get out of order for free from the existing architectures. As far as the CPU, 8 core is essentially the top of the market right now, and I’m assuming they’ll hyperthread it. They’ll probably steal a core away from the OS, and I wouldn’t be surprised if they disable another core for yield purposes. That means six HT cores, which is a simple doubling of the current Xbox. I have a rumored clock-speed, but have decided not to share. Think lower rather than higher.
  • DirectX 11 GPU — AMD? DX11 class should be blatantly obvious. I have reason to believe that AMD is the supplier, and I did hear a specific arch but I don’t believe it. There’s no word in NVIDIA land about a potential contract, either. No idea if they’re giving the design ownership to MS again or anything like that, all I know is the arrows are all pointed the same way. There are some implications for the CPU here.
  • Wifi N and Gigabit ethernet. This is boring standard consumer networking hardware. No surprises here.
  • Optical drive? — I don’t think they want to have one. I do think they have to have one, though you can definitely expect a stronger push towards digital distribution than ever. There’s no choice but to support Blu-ray at this point. Top tier games simply need the space. I suspect that we’ll see a very large (laptop grade) hard drive included in at least some models. Half terabyte large, with larger sizes later in the lifecycle. That is purely a guess, though.
  • AMD Fusion APU? — I’m going to outlandishly suggest that a Fusion APU could be the heart of this console. With an x86 CPU and a mainstream Radeon core in about the right generation, the existing Fusion product could be retooled for use in a console. Why not? It already has the basic properties you want in a console chip. The big sticking points are performance and heat. It’s easy to solve either one but not both at once, and we all know what happened last time Microsoft pushed the heat envelope too far. If it is Fusion architecture, I would be shocked if they were to actually integrate the CPU and GPU dies.
  • Kinect. — Here’s another outlandish one: Every Xbox Next will include a Kinect (2?), in the box. Kinect has been an enormous winner for Microsoft so far on every single front, and this is where they’re going to draw the battle lines against Nintendo and Sony. Nintendo’s control scheme is now boring to the general public, with the Wii U being introduced to a resounding “meh”. PS Move faded into irrelevance the day it was launched. For the first time in many years, the Xbox is becoming the casual gamers’ console and they’re going to hammer that advantage relentlessly. Microsoft is also pushing use of secondary features (eg microphone) for hardcore games — see Mass Effect 3.
  • $500. Yes, it’s high, although not very high once you adjust for inflation. The Xbox 360 is an extremely capable device, especially for the no-so-serious crowd. It’s also pure profit for Microsoft, and really hitting its stride now as the general public’s long tail console. There’s no need to price its successor aggressively, and the stuff I just described is rather expensive besides. A $600 package option at launch would not be surprising.
  • November 2013. As with the last two Xboxes, it will be launched for the holiday season. Some people were saying it would be announced this year but the more I think about it, the less it makes sense to do so. There’s no way it’s launching this year, and they’re not going to announce it a year and some ahead of time. E3 2013 will probably be the real fun.

There are some problems with the specs I’ve listed so far. AMD doesn’t produce the CPU I described. Not that the rumors match any other known CPU, but Intel is closer. I don’t think one of the Phenom X6 designs is a credible choice. The Xbox 360 CPU didn’t match any existing chips either, so this may not really be a problem. The total package price would have to be quite high with a Kinect 2 included. The Xbox 360 may function as a useful buffer against being priced out of the market.

Quick tip: Retina mode in iOS OpenGL rendering is not all-or-nothing

Some of you are probably working on Retina support and performance for your OpenGL based game for iOS devices. If you’re like us, you’re probably finding that a few of the devices (*cough* iPad 3) don’t quiiite have the GPU horsepower to drive your fancy graphics at retina resolutions. So now you’re stuck with 1x and 4x MSAA, which performs decently well but frankly looks kind of bad. It’s a drastic step down in visual fidelity, especially with all the alpha blend stuff that doesn’t antialias. (Text!) Well it turns out you don’t have to choose such a drastic step. Here’s the typical enable-retina code you’ll find on StackOverflow or whatever:

if([[UIScreen mainScreen] respondsToSelector:@selector(scale)] && [[UIScreen mainScreen] scale] == 2)
{
self.contentScaleFactor = 2.0;
eaglLayer.contentsScale = 2.0;
}


//some GL setup stuff
...

//get the correct backing framebuffer size
int fbWidth, fbHeight;
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_WIDTH, &fbWidth);
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_HEIGHT, &fbHeight);

The respondsToSelector bit is pretty token nowadays – what was that, iOS 3? But there’s not much to it. Is the screen a 2x scaled screen? Great, set our view to 2x scale also. Boom, retina. Then we ask the GL runtime what we are running at, and set everything up from there. The trouble is it’s a very drastic increase in resolution, and many of the early retina devices don’t have the GPU horsepower to really do nice rendering. The pleasant surprise is, the scale doesn’t have to be 2.0. Running just a tiny bit short on fill?

if([[UIScreen mainScreen] respondsToSelector:@selector(scale)] && [[UIScreen mainScreen] scale] == 2)
{
self.contentScaleFactor = 1.8;
eaglLayer.contentsScale = 1.8;
}

Now once you create the render buffers for your game, they’ll appear at 1.8x resolution in each each direction, which is very slightly softer than 2.0 but much, much crisper than 1.0. I waited until after I Am Dolphin cleared the Apple App Store approval process, to make sure that they wouldn’t red flag this usage. Now that it’s out, I feel fairly comfortable sharing it. This can also be layered with multisampling (which I’m also doing) to fine tune the look of poly edges that would otherwise give away the trick. I use this technique to get high resolution, high quality sharp rendering at 60 fps across the entire range of Apple devices, from the lowly iPhone 4S, iPod 5, and iPad 3 on up.

I Am Dolphin – Kinect Prototype

I’d hoped to write up a nice post for this, but unfortunately I haven’t had much time lately. Releasing a game, it turns out, is not at all relaxing. Work doesn’t end when you hit that submit button to Apple.

In the meantime, I happened to put together a video showing a prototype of the game, running off Kinect control. I thought you all might find it interesting, as it’s a somewhat different control than the touch screen. Personally I think it’s the best version of the experience we’ve made, and we’ve had several (touch screen, mouse, PS Move, Leap, etc). Unlike the touch screen version, you get full 3D directional control. We don’t have to infer your motion intention. This makes a big difference in the feeling of total immersion.

Our New Game: I Am Dolphin

After an incredibly long time of quiet development, our new game, I Am Dolphin, will be available this Thursday, October 9th, on the Apple/iOS App Store. This post will be discussing the background and the game itself; I’m planning to post more technical information about the game and development in the future. This depends somewhat on people reading and commenting – tell me what you want to know about the work and I’m happy to answer as much as I can.

For those of you who may not have followed my career path over time: A close friend and I have spent quite a few years doing R&D with purely physically driven animation. There’s plenty of work out there on the subject; ours is not based on any of it and takes a completely different approach. About three years ago, we met a neurologist at the Johns Hopkins Hospital who helped us set up a small research group at Hopkins to study biological motion and create a completely new simulation system from the ground up, based around neurological principles and hands-on study of dolphins at the National Aquarium in Baltimore. Unlike many other physical animation systems, including our own previous work, the new work allows the physical simulation to be controlled as a player character. We also developed a new custom in-house framework, called the Kata Engine, to make the simulation work possible.

One of the goals in developing this controllable simulation was to learn more about human motor control, and specifically to investigate how to apply this technology to recovery from motor impairments such as stroke. National Geographic was kind enough to write some great articles on our motivations and approach:

Virtual Dolphin On A Mission

John Krakauer’s Stroke of Genius

Although the primary application of our work is medical and scientific, we’ve also spent our spare time to create a game company, Max And Haley LLC, and a purely entertainment focused version of the game. This is the version that will be publicly available in a scant few days.

Here is a review of the game by AppUnwrapper.

I got my hands on the beta version of the game, and it’s incredibly impressive and addictive. I spent two hours playing right off the bat without even realizing it, and have put in quite a few more hours since. I just keep wanting to come back to it. iPhones and iPads are the perfect platform for the game, because they allow for close and personal, tactile controls via simple swipes across the screen.

I now have three shipped titles to my name; I’d say this is the first one I’m really personally proud of. It’s my firm belief that we’ve created something that is completely unique in the gaming world, without being a gimmick. Every creature is a complete physical simulation. The dolphins you control respond to your swipes, not by playing pre-computed animation sequences but by actually incorporating your inputs into the drive parameters of the underlying simulation. The end result is a game that represents actual motion control, not gesture-recognition based selection of pre-existing motions.

As I said at the beginning of the post, this is mostly a promotional announcement. However, this is meant to be a technical blog, not my promotional mouthpiece. I want to dig in a lot to the actual development and technical aspects of this game. There’s a lot to talk about in the course of developing a game with a three person (2x coder, 1x artist) team, building a complete cross-platform engine from nothing, all in the backdrop of an academic research hospital environment. Then there’s the actual development of the simulation, which included a lot of interaction with the dolphins, the trainers, and the Aquarium staff. We did a lot of filming (but no motion capture!) in the course of the development as well; I’m hoping to share some of that footage moving forward.

Here’s a slightly older trailer – excuse the wrong launch date on this version. We decided to slip the release by two months after this was created – that’s worth a story in itself. It is not fully representative of the final product, but our final media isn’t quite ready.

Game Code Build Times: RAID 0, SSD, or both for the ultimate in speed?

I’ve been in the process of building and testing a new machine using Intel’s new X99 platform. This platform, combined with the new Haswell-E series of CPUs, is the new high end of what Intel is offering in the consumer space. One of the pain points for developers is build time. For our part, we’re building in the general vicinity of 400K LOC of C++ code, some of which is fairly complex — it uses standard library and boost headers, as well as some custom template stuff that is not simple to compile. The worst case is my five year old home machine, an i5-750 compiling to a single magnetic drive, which turns in a six minute full rebuild time. Certainly not the biggest project ever, but a pretty good testbed and real production code.

I wanted to find out what storage system layout would provide the best results. Traditionally game developers used RAID 0 magnetic arrays for development, but large capacity SSDs have now become common and inexpensive enough to entertain seriously for development use. I tested builds on three different volumes:

  • A single Samsung 850 Pro 512 GB (boot)
  • A RAID 0 of two Crucial MX100 512 GB
  • A RAID 0 of three WD Black 4 TB (7200 rpm)

Both RAID setups were blank. The CPU is an i7-5930k hex-core (12 threads) and I’ve got 32 GB of memory on board. Current pricing for all of these storage configurations is broadly similar. Now then, the results. Will the Samsung drive justify its high price tag? Will the massive bandwidth of two striped SSDs scream past the competitors? Can the huge magnetic drives really compete with the pinnacle of solid state technology? Who will win?

Drumroll…

They’re all the same.

All three configurations run my test build in roughly 45 seconds, the differences between them being largely negligible. In fact it’s the WD Blacks that posted the fastest time at 42s. The obvious takeaway is that all of these setups are past the threshold where something else is the bottleneck. That something in this case is the CPU, and more specifically the overall hardware thread count. Overclocking the CPU from 3.5 to 4.5 did nothing to help. I’ve heard of some studios outfitting their engineers with dual Xeon setups, and it’s not looking so crazy to do so when employee time is on the line. (The potential downside is that the machine starts to stray significantly from what the game will actually run on.) Given the results, and the sizes of modern game projects, I’d recommend using an inexpensive 500 GB SSD for a boot drive (Crucial MX100, Sandisk Ultra II, 840 EVO), and stocking up on the WD Blacks for data. Case closed.

But… as long as we’re here, why don’t we take a look at what these drives are benchmarking at? The 850 Pro is a monster of a drive. Those striped MX100s might be the real heros though; ATTO shows them flirting with a full gigabyte per second of sequential transfer. Here are the raw CrystalDiskMark numbers for all three:

Samsung 850 Pro:

Sequential Read : 520.557 MB/s
Sequential Write : 489.836 MB/s
Random Read 512KB : 407.993 MB/s
Random Write 512KB : 465.648 MB/s
Random Read 4KB (QD=1) : 24.216 MB/s [ 5912.1 IOPS]
Random Write 4KB (QD=1) : 71.216 MB/s [ 17386.7 IOPS]
Random Read 4KB (QD=32) : 398.378 MB/s [ 97260.3 IOPS]
Random Write 4KB (QD=32) : 331.571 MB/s [ 80950.0 IOPS]

2x Crucial MX100 in RAID 0:

Sequential Read : 898.908 MB/s
Sequential Write : 905.506 MB/s
Random Read 512KB : 695.787 MB/s
Random Write 512KB : 854.666 MB/s
Random Read 4KB (QD=1) : 26.271 MB/s [ 6413.8 IOPS]
Random Write 4KB (QD=1) : 110.554 MB/s [ 26990.8 IOPS]
Random Read 4KB (QD=32) : 430.077 MB/s [104999.3 IOPS]
Random Write 4KB (QD=32) : 413.606 MB/s [100978.0 IOPS]

3x WD Black 4TB in RAID 0:

Sequential Read : 530.522 MB/s
Sequential Write : 494.534 MB/s
Random Read 512KB : 61.752 MB/s
Random Write 512KB : 162.619 MB/s
Random Read 4KB (QD=1) : 0.724 MB/s [ 176.7 IOPS]
Random Write 4KB (QD=1) : 4.461 MB/s [ 1089.1 IOPS]
Random Read 4KB (QD=32) : 5.090 MB/s [ 1242.8 IOPS]
Random Write 4KB (QD=32) : 5.307 MB/s [ 1295.6 IOPS]

I don’t claim that these numbers are reliable or representative. I am only posting them to provide a general sense of the performance characteristics involved in each choice. The SSDs decimate the magnetic drive setup for random ops, though the 512 KB values are respectable. I had expected the 4K random read, for which SSDs are known, to have a significant impact on build time, but that clearly isn’t the case. The WDs are able to dispatch 177 of those per second; despite being 33x slower than the 850 Pro, this is still significantly faster than the compiler can keep up with. Even in the best case scenarios, a C++ compiler won’t be able to clear out more than a couple dozen files a second.