September 27, 2012

7 Problems Raytracing Doesn't Solve

I see a lot of people get excited about extreme concurrency in modern hardware bringing us closer to the magical holy grail of raytracing. It seems that everyone thinks that once we have raytracing, we can fully simulate entire digital worlds, everything will be photorealistic, and graphics will become a "solved problem". This simply isn't true, and in fact highlights several fundamental misconceptions about the problems faced by modern games and other interactive media.

For those unfamiliar with the term, raytracing is the process of rendering a 3D scene by tracing the path of a beam of light after it is emitted from a light source, calculating its properties as it bounces off various objects in the world until it finally hits the virtual camera. At least, you hope it hits the camera. You see, to be perfectly accurate, you have to cast a bajillion rays of light out from the light sources and then see which ones end up hitting the camera at some point. This is obviously a problem, because most of the rays don't actually hit the camera, and are simply wasted. Because this brute force method is so incredibly inefficient, many complex algorithms (such as photon-mapping and Metropolis light transport) have been developed to yield approximations that are thousands of times more efficient. These techniques are almost always focused on attempting to find paths from the light source to the camera, so rays can be cast in the reverse direction. Some early approximations actually cast rays out from the camera until they hit an object, then calculated the lighting information from the distance and angle, disregarding other objects in the scene. While highly efficient, this method produced extremely inaccurate results.

It is with a certain irony that raytracing is touted as being a precise, super-accurate rendering method when all raytracing is actually done via approximations in the first place. Pixar uses photon-mapping for its movies. Most raytracers operate on stochastic sampling approximations. We can already do raytracing in realtime, if we get approximate enough, it just looks boring and is extremely limited. Graphics development doesn't just stop when someone develops realtime raytracing, because there will always be room for a better approximation.

1. Photorealism

The meaning of photorealism is difficult to pin down, in part because the term is inherently subjective. If you define photorealism as being able to render a virtual scene such that it precisely matches a photo, then it is almost impossible to achieve in any sort of natural environment where the slightest wind can push a tree branch out of alignment.

This quickly gives rise to defining photorealism as rendering a virtual scene such that it is indistinguishable from a photograph of a similar scene, even if they aren't exactly the same. This, however, raises the issue of just how indistinguishable it needs to be. This seems like a bizarre concept, but there are different degrees of "indistinguishable" due to the differences between people's observational capacities. Many people will never notice a slightly misaligned shadow or a reflection that's a tad too bright. For others, they will stand out like sore thumbs and completely destroy their suspension of disbelief.

We have yet another problem in that the entire concept of "photorealism" has nothing to do with how humans see the world in the first place. Photos are inherently linear, while human experience a much more dynamic, log-based lighting scale. This gives rise to HDR photography, which actually has almost nothing to do with the HDR implemented in games. Games simply change the brightness of the entire scene, instead of combining the brightness of multiple exposures to brighten some areas and darken others in the same photo. If all photos are not created equal, then exactly which photo are we talking about when we say "photorealistic"?

2. Complexity

Raytracing is often cited as allowing an order of magnitude more detail in models by being able to efficiently process many more polygons. This is only sort of true in that raytracing is not subject to the same computational constraints that rasterization is. Rasterization must render every single triangle in the scene, whereas raytracing is only interested in whether or not a ray hits a triangle. Unfortunately, it still has to navigate through the scene representation. Even if a raytracer could handle a scene with a billion polygons efficiently, this raises completely unrelated problems involving RAM access times and cache pollution that suddenly become actual performance bottlenecks instead of micro-optimizations.

In addition, raytracing approximation algorithms almost always take advantage of rays that degrade quickly, such that they can only bounce 10-15 times before becoming irrelevant. This is fine and dandy for walking around in a city or a forest, but what about a kitchen? Even though raytracing is much better at handling reflections accurately, highly reflective materials cripple the raytracer, because now rays are bouncing hundreds of times off a myriad of surfaces instead of just 10. If not handled properly, it can absolutely devastate performance, which is catastrophic for game engines that must maintain constant render times.

3. Scale

How do you raytrace stars? Do you simply wrap a sphere around the sky and give it a "star" material? Do you make them all point sources infinitely far away? How does this work in a space game, where half the stars you see can actually be visited, and the other half are entire galaxies? How do you accurately simulate an entire solar system down to the surface of a planet, as the Kerbal Space Program developers had to? Trying to figure out how to represent that kind of information in a meaningful form with only 64 bits of precision, if you are lucky, is a problem completely separate from raytracing, yet of increasingly relevant concern as games continue to expand their horizons more and more. How do we simulate an entire galaxy? How can we maintain meaningful precision when faced with astronomical scales, and how does this factor in to our rendering pipeline? These are problems that arise in any rendering pipeline, regardless of what techniques it uses, due to fundamental limitations in our representations of numbers.

4. Materials

Do you know what methane clouds look like? What about writing an aerogel shader? Raytracing, by itself, doesn't simply figure out how a given material works, you have to tell it how each material behaves, and its accuracy is wholly dependent on how accurate your description of the material is. This isn't easy, either, it requires advanced mathematical models and heaps of data collection. In many places we're actually still trying to figure out how to build physically correct material equations in the first place. Did you know that Dreamworks had to rewrite part of their cloud shader1 for How To Train Your Dragon? It turns out that getting clouds to look good when your character is flying directly beneath them with a hand raised is really hard.

This is just for common lighting phenomena! How are you going to write shaders for things like pools of magic water and birefringent calcite crystals? How about trying to accurately simulate circular polarizers when most raytracers don't even know what polarization is? Does being photorealistic require you to simulate the Tyndall Effect for caustics in crystals and particulate matter? There are so many tiny little details all around us that affect everything from the color of our iris to the creation of rainbows. Just how much does our raytracer need to simulate in order to be photorealistic?

5. Physics

What if we ignored the first four problems and simply assumed we had managed to make a perfect, magical photorealistic raytracer. Congratulations, you've managed to co-opt the entirety of your CPU for the task of rendering a static 3D scene, leaving nothing left for the physics. All we've managed to accomplish is taking the "interactive" out of "interactive media". Being able to influence the world around us is a key ingredient to immersion in games, and this requires more and more accurate physics, which are arguably just as difficult to calculate as raytracing is. The most advanced real-time physics engine to-date is the Lagoa Multiphysics, and it can only just barely simulate a tiny scene in a well-controlled environment before it completely decimates a modern CPU. This is without any complex rendering at all. Now try doing that for a scene with a radius of several miles. Oh, and remember our issue with scaling? This applies to physics too! Except with physics, its an order of magnitude even more difficult.

6. Content

As many developers have been discovering, procedural generation is not magic pixie dust you can sprinkle on problems to make them go away. Yet, without advances in content generation, we are forced to hire armies of artists to create the absurd amounts of detail required by modern games. Raytracing doesn't solve this problem, it makes it worse. In any given square mile of a human settlement, there are billions of individual objects, ranging from pine cones, to rocks, to TV sets, to crumbs, all of which technically have physics, and must be kept track of, and rendered, and even more importantly, modeled.

Despite multiple attempts at leveraging procedural generation, the content problem has simply refused to go away. Until we can effectively harness the power of procedural generation, augmented artistic tools, and automatic design morphing, the advent of fully photorealistic raytracing will be useless. The best graphics engine in the world is nothing without art.

7. AI

<Patrician|Away> what does your robot do, sam
<bovril> it collects data about the surrounding environment, then discards it and drives into walls
Bash.org quote #240849
Of course, while we're busy desperately trying to raytrace supercomplex scenes with advanced physics, we haven't even left any CPU time to calculate the AI! The AI in games is so consistently terrible its turned into its own trope. The game industry spends all its computational time trying to render a scene, leaving almost nothing left for the AI routines, forcing them to rely on techniques from 1968. Think about that - we are approaching the point where AI in games comes down to a 50-year old technique that was considered hopelessly outdated before I was even born. Oh, and I should also point out that Graphics, Physics, Art, and AI are all completely separate fields with fundamentally different requirements that all have to work together in a coherent manner just so you can shoot headshots in Call of Duty 22.

I know that raytracing is exciting, sometimes simply as a demonstration of raw computational power. But it always disheartens me when people fantasize about playing amazingly detailed games indistinguishable from real life when that simply isn't going to happen, even with the inevitable development2 of realtime raytracing. By the time it becomes commercially viable, it will simply be yet another incremental step in our eternal quest for infinite realism. It is an important step, and one we should strive for, but it alone is not sufficient to spark a revolution.

1 Found on the special features section of the How To Train Your Dragon DVD.
2 Disclaimer: I've been trying to develop an efficient raytracing algorithm for ages and haven't had much luck. These guys are faring much better.

September 23, 2012

Teenage Rebellion as a Failure of Society

Historians have noticed that the concept of teenage rebellion is a modern invention. Young adults often (but not always) have a tendency to be horny and impulsive, but the flagrant and sometimes violent rejection of authority associated with teenagers is a stereotype unique to modern culture. Many adults incorrectly assume this means we have gotten "too soft" and need to bring back spanking, paddles, and other harsher methods of punishment. As any respectable young adult will tell you, that isn't the answer, and in fact highlights the underlying issue of ageism that is creating an aloof, frustrated, and repressed youth.

The problem is that adults refuse to take children seriously. Until puberty, kids are often aware of this, but most simply don't care (and sometimes take advantage of it). As they develop into young adults, however, this begins to clash with their own aspirations. They want to be in control of their own lives, because they're trying to figure out what they want their lives to be. They want to explore the world and decide where to stand and what to avoid. Instead, they are locked inside a school for 6-7 hours and spoon-fed buckets of irrelevant information, which they must then regurgitate on a battery of tests that have no relation to reality. They are not given meaningful opportunities to prove themselves as functional members of society. Instead, they are explicitly forbidden from participating in the adult world until the arbitrary age of 18, regardless of how mature or immature they are. They are told that they can't be an adult not because of their behavior, but simply because they aren't old enough. The high school dropout and the valedictorian get to be adults at exactly the same time - 18.

Our refusal to let young adults prove how mature they can be is doubly ironic in the context of a faltering global economy in desperate need of innovative new technologies to create jobs. Teenagers are unrestricted by concepts of impossibility, and free from the consequences of failed experiments. They don't have to worry about acquiring government funding or getting published in a peer-reviewed journal. They just want to make cool things, and that is exactly what we need. So obviously, to improve student performance in schools, our politicians tie school funding to test scores. You can't legislate innovation, you can only inspire it. Filling in those stupid scantron forms is not conducive to creative thinking. Our hyper-emphasis on test scores has succeeded only in ensuring that the only students who get into college are ones that are good at taking tests, not inventing things.

Young adults are entirely capable of being mature, responsible members of society if we just give them the chance to be adults instead of using a impartial age barrier that serves only to segregate them from the rest of society. They are doomed to be treated as second-class citizens not because they are behaving badly, but because they aren't old enough. Physical labor and repetitive jobs are being replaced by automated machines, and these jobs aren't coming back. The new economy isn't run by office drones that follow instructions like robots, but by technological pioneers that change the world. You can't institutionalize creativity, or put it on a test. You can't measure imagination or grade ingenuity.

So what do we do? We cut funding for creative art programs and increase standardized testing. Our attempts to save our educational system are only ensuring its imminent demise as it prepares kids to live in a world that no longer exists.

The most valuable commodity in this new economy will be your imagination - the one thing computers can't do. Maybe if we actually treated young adults like real people, their creativity could become the driving force of economic prosperity.

September 19, 2012

Analyzing XKCD: Click and Drag

Today, xkcd featured a comic with a comically large image that is navigated by clicking and dragging. In the interests of SCIENCE (and possibly accidentally DDoSing Randall's image server - sorry!), I created a static HTML file of the entire composite image.1

The collage is made up of 225 images2 that stretch out over a total image area 79872 pixels high and 165888 pixels wide. The images take up 5.52 MB of space and are named with a simple naming scheme "ydxd.png" where d represents a cardinal direction appropriate for the axis (n for north, s for south on the y axis and e for east, w for west on the x axis) along with the tile coordinate number; for example, "1n1e.png". Tiles are 2048x2048 png images with an average size of 24.53 KB. If you were to try and represent this as a single, uncompressed 32-bit 79872x165888 image file, it would take up 52.99 GB of space.

Assuming a human's average height is 1.8 meters, that would give this image a scale of about 1 meter per 22 pixels. That means the total composite image is approximately 3.63 kilometers high and 7.54 kilometers wide. It would take an average human 1.67 hours to walk from one end of the image to the other. Note that the characters at the far left say they've been walking for 2 miles - they are 67584 pixels from the starting point, which translates to 3.072 km or ~1.9 miles, so this seems to indicate my rough estimates here are reasonably accurate.

If Randall spent, on average, one hour drawing each frame, it would take him 9.375 days of constant, nonstop work to finish this. If he instead spent an average of 10 minutes per frame, it would take ~37.5 hours, or almost an entire 40-hour work week.

Basically I'm saying Randall Munroe is fucking insane.

1 If you are on firefox or chrome, right-clicking and selecting "Save as" will download the HTML file along with all 225 images into a separate folder.
2 There are actually 3159 possible images (39 x 81), but all-white and all-black images are not included, instead being replaced by either the default white background or a massive black <div> representing the ground, located 28672 pixels from the top of the image, with a height of 51200.