December 24, 2011

Your Esoteric Language is Useless

You'd think that programmers would get over these ridiculous language wars. The consensus should be that any one programmer is going to use whatever language they are most comfortable with that gets the job done most efficiently. If someone knows C/C++, C#, and Java, they're probably going to use C++ to write a console game. You can argue that language [x] is terrible because of [x], but the problem is that ALL languages are terrible for one reason or another. Every aspect of a languages design is a series of trade-offs, and if you try to criticize a language that is useful in one context because it isn't useful in another, you are ignoring the entire concept of a trade-off.

These arguments go on for hours upon hours, about what exactly is a trade-off and what languages arguably have stupid features and what makes someone a good programer and blah blah blah blah SHUT UP. I don't care what language you used, if your program is shit, your program is shit. I don't care if you wrote it in Clojure or used MongoDB or used continuations and closures in whatever esoteric functional language happens to be popular right now. Your program still sucks. If someone else writes a better program in C without any elegant use of anything, and it works better than your program, they're doing their job better than you.

I don't care if they aren't as good a programmer as you are, by whatever stupid, arbitrary standards you've invented to make yourself feel better, they're still doing a better job than you. I don't care if your haskell editor was written in haskell. Good for you. It sucks. It is terribly designed. It's workflow is about as conducive as a blob of molasses on a mountain in January. I don't care if you are using a fantastic stack of professionally designed standard libraries instead of re-inventing the wheel. That guy over there re-invented the wheel the wrong way 10 times and his program is better than yours because it's designed with the user in mind instead of a bunch of stupid libraries. I don't care if you're using Mercurial over SVN or Git on Linux using Emacs with a bunch of extensions that make you super productive. Your program still sucks.

I am sick and tired of people judging programmers on a bunch of rules that don't matter. Do you know functional programming? Do you know how to implement a LAMP stack? Obviously you don't use C++ or anything like that, do you?

These programmers have no goddamn idea what they're talking about. But that isn't what concerns me. What concerns me is that programmers are so obsessed over what language is best or what tool is best or what library they should use when they should be more concerned about what their program actually DOES. They get so caught up in building whatever elegant crap they're trying to build they completely forget what the end user experience is, especially when the end user has never used the program before. Just as you are not a slave to your tools, your program is not enslaved to your libraries.

Your program's design should serve the user, not a bunch of data structures.

December 2, 2011

The Irrationality of Idiots

If Everyone Else is Such an Idiot, How Come You're Not Rich? - Megan McArdle
I run around calling a whole lot of people and/or things stupid, dump, moronic, or some other variation of idiot. As the above quote exemplifies, saying such things tends to be a bit dangerous, since if everyone else was an idiot, you should be rich as hell. My snarky reaction to that, of course, would be that I'm not rich yet (and even then, "rich" in the sense of the quote is really just a metaphor for success, depending on how you define it for yourself), but in truth there are very specific reasons I call someone an idiot, and they don't necessarily involve actual intelligence.

To me, someone is an idiot if they refuse to argue in a rational manner. If you ignore evidence or use nonsensical reasoning and logical fallacies to support your beliefs, you're an idiot. If you don't like me calling you an idiot, that's just fine, because I acknowledge your existence about as much as I acknowledge the existence of dirty clothes on my bedroom floor. It's only when there is such a pile of dirty laundry lying around that it impedes movement that I really notice and clean it up. In the case of suffocating amounts of stupidity, I usually just go somewhere else. The rest of the time, stupid people can only serve to grudgingly function in a society, not take part in running it. This is because designing and running a society requires rational thinking and logical arguments, or nothing gets done.

I can get really angry about certain things, but I must yield to opinions that have a reasonable basis, if only to acknowledge that I might be wrong, even if I think I'm not. Everything I say or do must have some sort of logical basis, even if it originated from pure intuition. So long as you can poke legitimate holes in an accepted theory, you can hold some pretty crazy opinions that can't be considered illogical, though perhaps still incredibly risky or unlikely.

All the other times I call someone an idiot, I'm usually being lazy when I should really be calling the action idiotic. For example, I can't legitimately call Mark Zuckerberg an idiot. If I call him an idiot, I'm not forming a legitimate opinion, and its probably because he did something that pissed me off and I'm ranting about it, and you are free to ignore my invalid opinion, at least until I clarify that what he did was idiotic, not him. Of course, sometimes people repeatedly do things that are just so mind-bogglingly stupid that it is entirely justified to actually call them a moron, because they are displaying a serious lack of bona fide intelligence. Usually, though, most people are entirely capable of rational thought, but simply do not care enough to exercise it, in which case their idiocy stems from an unwillingness to use rationality, not actual intelligence.

I bring this up, because it seems to be a serious problem. What happens when we lose rationality? People can't compromise anymore, and we get a bunch of stupendously idiotic proposals borne out of ignorance that no longer has to pass through a filter of logical argumentation. All irrational disputes become polarized because neither side is willing to listen to the other, and the emotions that are intrinsically tied to the dispute prevent any meaningful progress from being made. Society breaks down in the face of irrationality because irrationality refuses to acknowledge things like, people are different.

Well gee, that sounds like our current political mess.

I am an aggressive supporter of educational reform, and one of the things that I believe should be taught in schools is not only rational thought and logical arguments, but how rational thought can complement creativity and irrational emotions. We cannot rid ourselves of illogical beliefs, because then we've turned into Vulcans, but we must learn, as a species, when our emotions are appropriate, and when we need to exercise our ability to be rational agents. As it is, we are devolving into a prehistoric mess of irrational demands and opinions that only serve to drag society backwards, just as we begin unlocking the true potential of our technology.

Relevent:

December 1, 2011

The Great Mystery of Linear Gradient Lighting

A long, long time ago, in pretty much the same place I'm sitting in right now, I was learning how one would do 2D lighting with soft shadows and discovered the age old adage in 2D graphics: linear gradient lighting looks better than mathematically correct inverse square lighting.

Strange.

I brushed it off as artistic license and perceptual trickery, but over the years, as I dug into advanced lighting concepts, nothing could explain this. It was a mystery. Around the time I discovered microfacet theory I figured it could theoretically be an attempt to approximate non-lambertanian reflectance models, but even that wouldn't turn an exponential curve into a linear one.

This bizarre law even showed up in my 3D lighting experiments. Attempting to invoke the inverse square law would simply result in extremely bright and dark areas and would look absolutely terrible, and yet the only apparent fix I saw anywhere was simply calculating light via linear distance in clear violation of observed light behavior. Everywhere I looked, people calculated light on a linear basis, everywhere, on everything. Was it the equations? Perhaps the equations being used operated on linear light values instead of exponential ones and so only output the correct value if the light was linear? No, that wasn't it. I couldn't figure it out. Years and years and years would pass with this discrepancy left unaccounted for.

A few months ago I noted an article on gamma correction and assumed it was related to color correction or some other post process effect designed to compensate for monitor behavior, and put it as a very low priority research point on my mental to-do-list. No reason fixing up minor brightness problems until your graphics engine can actually render everything properly. Yesterday, though, I happened across a Hacker News posting about learning modern 3D engine programming. Curious if it had anything I didn't already know, I ran through its topics, and found this. Gamma correction wasn't just making the scene brighter to fit with the monitor, it was compensating for the fact that most images are actually already gamma-corrected.

In a nutshell, the brightness of a monitor is exponential, not linear (with a power of about 2.2). The result is that a linear gradient displayed on the monitor is not actually increasing in brightness linearly. Because it's mapped to a curve, it will actually increase in brightness exponentially. This is due to the human visual system processing luminosity on a logarithmic scale. The curve in question is this:

Gamma Response Curve
Source: GPU Gems 3 - Chapter 24: The Importance of Being Linear

You can see the effect in this picture, taken from the article I mentioned:
Linear Curve

The thing is, I always assumed the top linear gradient was a linear gradient. Sure it looks a little dark, but hey, I suppose that might happen if you're increasing at 25% increments, right? WRONG. The bottom strip is a true linear gradient1. The top strip is a literal assignment of linear gradient RGB values, going from 0 to 62 to 126, etc. While this is, digitally speaking, a mathematical linear gradient, what happens when it gets displayed on the screen? It gets distorted by the CRT Gamma curve seen in the above graph, which makes the end value exponential. The bottom strip, on the other hand, is gamma corrected - it is NOT a mathematical linear gradient. It's values go from 0 to 134 to 185. As a result, when this exponential curve is displayed on your monitor, it's values are dragged down by the exact inverse exponential curve, resulting in a true linear curve. An image that has been "gamma-corrected" in this manner is said to exist in sRGB color space.

The thing is, most images aren't linear. They're actually in the sRGB color space, otherwise they'd look totally wrong when we viewed them on our monitors. Normally, this doesn't matter, which is why most 2D games simply ignore gamma completely. Because all a 2D engine does is take a pixel and display it on the screen without touching it, if you enable gamma correction you will actually over-correct the image and it will look terrible. This becomes a problem with image editing, because digital artists are drawing and coloring things on their monitors and they try to make sure that everything looks good on their monitor. So if an artist were visually trying to make a linear gradient, they would probably make something similar to the already gamma-corrected strip we saw earlier. Because virtually no image editors linearize images when saving (for good reason), the resulting image an artist creates is actually in sRGB color space, which is why only turning on gamma correction will usually simply make everything look bright and washed out, since you are normally using images that are already gamma-corrected. This is actually good thing due to subtle precision issues, but it creates a serious problem when you start trying to do lighting calculations.

The thing is, lighting calculations are linear operations. It's why you use Linear Algebra for most of your image processing needs. Because of this, when I tried to use the inverse-square law for my lighting functions, the resulting value that I was multiplying on to the already-gamma-corrected image was not gamma corrected! In order to do proper lighting, you would have to first linearize the gamma-corrected image, perform the lighting calculation on it, and then re-gamma-correct the end result.

Wait a minute, what did we say the gamma curve value was? It's $$x^{2.2}$$, so $$x^{0.45}$$ will gamma-correct the value $$x$$. But the inverse square law states that the intensity of a light is actually $$\frac{1}{x^2}$$, so if you were to gamma correct the inverse square law, you'd end up with: \[ {\bigg(\frac{1}{x^2}}\bigg)^{0.45} = {x^{-2}}^{0.45} = x^{-0.9} ≈ x^{1} \]
That's almost linear!2

OH MY GOD
MIND == BLOWN

That's it! The reason I saw linear curves all over the place was because it was a rough approximation to gamma correction! The reason linear lighting looks good in a 2D game is because its actually an approximation to a gamma-corrected inverse-square law! Holy shit! Why didn't anyone ever explain this?!3 Now it all makes sense! Just to confirm my findings, I went back to my 3D lighting experiment, and sure enough, after correcting the gamma values, using the inverse square law for the lighting gave correct results! MUAHAHAHAHAHAHA!

For those of you using OpenGL, you can implement gamma correction as explained in the article mentioned above. For those of you using DirectX9 (not 10), you can simply enable D3DSAMP_SRGBTEXTURE on whichever texture stages are using sRGB textures (usually only the diffuse map), and then enable D3DRS_SRGBWRITEENABLE during your drawing calls (a gamma-correction stateblock containing both of those works nicely). For things like GUI, you'll probably want to bypass the sRGB part. Like OpenGL, you can also skip D3DRS_SRGBWRITEENABLE and simply gamma-correct the entire blended scene using D3DCAPS3_LINEAR_TO_SRGB_PRESENTATION in the Present() call, but this has a lot of caveats attached. In DirectX10, you no longer use D3DSAMP_SRGBTEXTURE. Instead, you use an sRGB texture format (see this presentation for details).


1 or at least much closer, depending on your monitors true gamma response
2 In reality I'm sweeping a whole bunch of math under the table here. What you really have to do is move the inverse square curve around until it overlaps the gamma curve, then apply it, and you'll get something that is roughly linear.
3 If this is actually standard course material in a real graphics course, and I am just really bad at finding good tutorials, I apologize for the palm hitting your face right now.

November 24, 2011

Signed Integers Considered Stupid (Like This Title)

Unrelated note: If you title your article "[x] considered harmful", you are a horrible person with no originality. Stop doing it.

Signed integers have always bugged me. I've seen quite a bit of signed integer overuse in C#, but it is most egregious when dealing with C/C++ libraries that, for some reason, insist on using for(int i = 0; i < 5; ++i). Why would you ever write that? i cannot possibly be negative and for that matter shouldn't be negative, ever. Use for(unsigned int i = 0; i < 5; ++i), for crying out loud.

But really, that's not a fair example. You don't really lose anything using an integer for the i value there because its range isn't large enough. The places where this become stupid are things like using an integer for height and width, or returning a signed integer count. Why on earth would you want to return a negative count? If the count fails, return an unsigned -1, which is just the maximum possible value for your chosen unsigned integral type. Of course, certain people seem to think this is a bad idea because then you will return the largest positive number possible. What if they interpret that as a valid count and try to allocate 4 gigs of memory? Well gee, I don't know, what happens when you try to allocate -1 bytes of memory? In both cases, something is going to explode, and in both cases, its because the person using your code is an idiot. Neither way is more safe than the other. In fact, signed integers cause far more problems then they solve.

One of the most painfully obvious issues here is that virtually every single architecture in the world uses the two's complement representation of signed integers. When you are using two's complement on an 8-bit signed integer type (a char in C++), the largest positive value is 127, and the largest negative value is -128. That means a signed integer can represent a negative number so large it cannot be represented as a positive number. What happens when you do (char)abs(-128)? It tries to return 128, which overflows back to... -128. This is the cause of a host of security problems, and what's hilarious is that a lot of people try to use this to fuel their argument that you should use C# or Java or Haskell or some other esoteric language that makes them feel smart. The fact is, any language with fixed size integers has this problem. That means C# has it, Java has it, most languages have it to some degree. This bug doesn't mean you should stop using C++, it means you need to stop using signed integers in places they don't belong. Observe the following code:
  if (*p == '*')
  {
    ++p;
    total_width += abs (va_arg (ap, int));
  }
This is retarded. Why on earth are you interpreting an argument as a signed integer only to then immediately call abs() on it? So a brain damaged programmer can throw in negative values and not blow things up? If it can only possibly be valid when it is a positive number, interpret it as a unsigned int. Even if someone tries putting in a negative number, they will serve only to make the total_width abnormally large, instead of potentially putting in -128, causing abs() to return -128 and creating a total_width that is far too small, causing a buffer overflow and hacking into your program. And don't go declaring total_width as a signed integer either, because that's just stupid. Using an unsigned integer here closes a potential security hole and makes it even harder for a dumb programmer to screw things up1.

I can only attribute the vast overuse of int to programmer laziness. unsigned int is just too long to write. Of course, that's what typedef's are for, so that isn't an excuse, so maybe they're worried a programmer won't understand how to put a -1 into an unsigned int? Even if they didn't, you could still cast the int to an unsigned int to serve the same purpose and close the security hole. I am simply at a loss as to why I see int's all over code that could never possibly be negative. If it could never possibly be negative, you are therefore assuming that it won't be negative, so it's a much better idea to just make it impossible for it to be negative instead of giving hackers 200 possible ways to break your program.


1 There's actually another error here in that total_width can overflow even when unsigned, and there is no check for that, but that's beyond the scope of this article.

November 7, 2011

Why Kids Hate Math

They're teaching it wrong.

And I don't just mean teaching the concepts incorrectly (although they do plenty of that), I mean their teaching priorities are completely backwards. Set Theory is really fun. Basic Set Theory can be taught to someone without them needing to know how to add or subtract. We teach kids Venn Diagrams but never teach them all the fun operators that go with them? Why not? You say they won't understand? Bullshit. If we can teach third graders binary, we can teach them set theory. We take forever to get around to teaching algebra to kids, because its considered difficult. If something is a difficult conceptual leap, then you don't want to delay it, you want to introduce the concepts as early as possible. I say start teaching kids algebra once they know basic arithmetic. They don't need to know how to do crazy weird stuff like x * x = x² (they don't even know what ² means), but you can still introduce them to the idea of representing an unknown value with x. Then you can teach them exponentiation and logs and all those other operators first in the context of numbers, and then in the context of unknown variables. Then algebra isn't some scary thing that makes all those people who don't understand math give up, its something you simply grow up with.

In a similar manner, what the hell is with all those trig identities? Nobody memorizes those things! You memorize like, 2 or 3 of them, and almost only ever use sin² + cos² = 1. In a similar fashion, nobody ever uses integral trig identities because if you are using them you should have converted your coordinate system to polar coordinates, and if you can't do that then you can just look them up for crying out loud. Factoring and completing the square can be useful, but forcing students to do these problems over and over when they almost never actually show up in anything other than spoon-fed equations is insane.

Partial Fractions, on the other hand, are awesome and fun and why on earth are they only taught in intermediate calculus?! Kids are ALWAYS trying to pull apart fractions like that, and we always tell them to not do it - why not just teach them the right way to do it? By the time they finally got around to teaching me partial fractions, I was thinking that it would be some horrifically difficult, painful, complex process. It isn't. You just have to follow a few rules and then 0 out some functions. How can that possibly be harder than learning the concept of differentiation? And its useful too!

Lets say we want to teach someone basic calculus. How much do they need to know? They need to know addition, subtraction, division, multiplication, fractions, exponentiation, roots, algebra, limits, and derivatives. You could teach someone calculus without them knowing what sine and cosine even are. You could probably argue that, with proper teaching, calculus would be about as hard, or maybe a little harder, than trigonometry. Trigonometry, by the way, has an inordinate amount of time spent on it. Just tell kids how right triangles work, sine/cosine/tangent, SOHCAHTOA, a few identities, and you're good. You don't need to know scalene and isosceles triangles. Why do we even have special names for them? Who gives a shit if a triangle has sides of the same length? Either its a right triangle and its useful or its not a right triangle and you have to do some crazy sin law shit that usually means your algorithm is just wrong and so the only time you ever actually need to use it you can just look up the formula because it is a obtuse edge case that almost never comes up.

Think about that. We're grading kids by asking them to solve edge cases that never come up in reality and grading how well they are in math based off of that. And then we're confused when they complain about math having no practical application? Well duh. The sheer amount of time spent on useless topics is staggering. Calculus should be taught to high school freshman. Differential equations and complex analysis go to the seniors, and by the time you get into college you're looking at combinatorics and vector analysis, not basic calculus.

I have already seen some heavily flawed arguments against this. Some people say that people aren't interested in math, so this will never work. Since I'm saying that teaching kids advanced concepts early on will make them interested in math, this is a circular argument and invalid. Other people claim that the kids will never understand because of some bullshit about needing logical constructs, which just doesn't make sense because you should still introduce the concepts. Introducing a concept early on and having the student be confused about it is a good thing because it means they'll try to work it out over time. The more time you give them, the more likely it will click. Besides, most students aren't understanding algebra with the current system anyway, so I fail to see the point of that argument. It's not working now so don't try to change it or you'll make it worse? That's just pathetic.

TL;DR: Stop teaching kids stupid, pointless math they won't need and maybe they won't rightfully conclude that what they are being taught is useless.

September 23, 2011

Don't Work on Someone Else's Dream

When I complain to my friends about a recent spat of not being productive, they often remind me of the occasional 10 hours I spend forgetting to eat while hunting down a bug. When introducing myself, I am always clear that, most of the time, I am either busy, or trying to be busy. Everything to me is work, everything that makes me proud of myself is work, everything in my future will, hopefully, be more work. The entire concept of retiring to me is madness. I never want to stop working.

This is often mistaken as an unhealthy obsession with work, which is not entirely true. I am not torturing myself every day for 10 hours just so I can prove myself, I'm doing exactly what I want to do. I'm 21 years old, I can drink and smoke (but never do), I can drive (but I take the bus anyway), I go to college (but rarely attend classes), and in general am supposed to be an adult. Most people my age are finishing college and inevitably taking low paying jobs while they search for another low paying internship at a company so they can eventually get a high paying job that actually uses what they learned in college after they're too old to care.

If I really wanted, I could be at Facebook or Microsoft right now. I even had a high school internship at Microsoft, and probably could have gotten a college one too. I could have spent my time learning all the languages the companies want you to learn, and become incredibly well-versed in everything that everyone else already knows. I could have taught myself proper documentation and proper standards and proper guidelines and kept up my goody two-shoes act for the rest of my fucking life and get congratulated for being such a well-behaved and successful clone.

Fuck that.

I am 21 years old, and I'm going to spend it doing what I like doing, working on the projects I want to work on, and figuring out a way to make a living out of it even if I have to live out of my parents house for another 6 months. I am not going to get a job doing what other people tell me is important. While I am often very critical of myself as a person, realistically speaking, my only regrets are the moments I spent not working, or wasting time on things that weren't important. It doesn't matter that I've been working on a project most people dismiss as a childish fantasy since I was 18. It doesn't matter that I have no income and no normal job and no programming skills that would get me hired at a modern tech company because everyone hates C++ and only cares about web development.

I'm not working on something a CEO thinks is important, I'm working on something I think is important. I'm going to start a company so I can continue to work on what I think is important, and every single employee I will ever hire will work on something they think is important. This doesn't necessarily mean its fun - finding a rogue typecast is anything but fun - but rather its something that you are willing to ride the highs and lows through because it is intrinsically important to you, as a person. You should not wait until you're 35 with a family and a wife to worry about. Do it now. Do whatever is necessary to make it possible for you start working on whatever you think is important and then do it so hard you can make a living out of it.

Don't waste the best 10 years of your life working on someone else's dream.



(don't waste 10 years of your life forgetting to eat, either. That just isn't healthy)

September 9, 2011

C# to C++ Tutorial - Part 3: Classes and Structs and Inheritance (OH MY!)

[ 1 · 2 · 3 · 4 · 5 · 6 · 7 ]

Classes in C#, like most object-oriented languages, are very similar to their C++ counterparts. They are declared with class, exist between brackets and inherit classes using a colon ':'. Note, however, that all classes in C++ must end with a semicolon! You will forget this semicolon, and then all the things will break. You can do pretty much everything you can do with a C# class in a C++ class, except that C++ does not have partial classes, and in C++ classes themselves cannot be declared public, protected or private. Both of these features don't exist because they are made irrelevant with how classes are declared in header files.

In C# you usually just have one code file with the class declared in it along with all the code for all the functions. You can just magically use this class everywhere else and everything is fun and happy with rainbows. As mentioned before, C++ uses header files, and they are heavily integrated into the class system. We saw before how in order to use a function somewhere else, its prototype must first be declared in the header file. This applies to both classes and pretty much everything else. You need to understand that unlike C#, C++ does not have magic dust in its compiler. In C++, it just goes down the list of .cpp files, does a bit of dependency optimization, and then simply compiles each .cpp file by taking all the content from all the headers that are included (including all the headers included in the headers) and pasting it before the actual code from the .cpp file, and compiling. This process is repeated separately for every single code file, and no order inconsistencies are allowed anywhere in the code, the headers, or even the order that the headers are included in. The compiler literally takes every single #include statement as it is and simply replaces it with the code of the header it points to, wherever this happens to be in the code. This can (and this has happened to me) result in certain configurations of header files working even though one header file is actually missing a dependency. For example:

//Rainbow.h

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
}; // DO NOT FORGET THE SEMICOLON
//Unicorn.h

class Unicorn
{
  int magic;
};
//main.cpp

#include "Unicorn.h"
#include "Rainbow.h"

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}

Compiling main.cpp will succeed in this case, even though Rainbow.h is referencing the Unicorn class without it ever being declared. The reason behind this is what happens when the compiler expands all the includes. Right before compiling main.cpp (after the preprocessor has run), main.cpp looks like this:

//main.cpp

//Unicorn.h

class Unicorn
{
  int magic;
};
//Rainbow.h

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
}; // DO NOT FORGET THE SEMICOLON

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}

It is now obvious that because Rainbow.h was included after Unicorn.h, the Unicorn reference was resolved since it was declared before Rainbow. However, had we reversed the order of the include files, we would have had an anachronism: an inconsistency in our chronological arrangement. It is very bad practice to construct headers that are dependent on the order in which they are included, so we usually resolve something like this by having Rainbow.h simply include Unicorn.h, and then it won't matter what order they are included in.

//Rainbow.h

#include "Unicorn.h"

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
Left as is, however, and we run into a problem. Lets try compiling main.cpp:
//main.cpp

#include "Rainbow.h"
#include "Unicorn.h"

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

#include "Unicorn.h"

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h

class Unicorn
{
  int magic;
};

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

//Unicorn.h

class Unicorn
{
  int magic;
};

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h

class Unicorn
{
  int magic;
};

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}

We've just declared Unicorn twice! Obviously one way to solve this in our very, very simplistic example is to just remove the spurious #include statement, but this violates the unwritten rule of header files - any header file should be able to be included anywhere in any order regardless of what other header files have been included. This means that, first, any header file should include all the header files that it needs to resolve its dependencies. However, as we see here, that simply makes it extremely likely that a header file will get included 2 or 3 or maybe hundreds of times. What we need is an include guard.

//Unicorn.h

#ifndef __UNICORN_H__
#define __UNICORN_H__

class Unicorn
{
  int magic;
};

#endif

Understanding this requires knowledge of the C Preprocessor, which is what goes through and processes your code before its compiled. It is very powerful, but right now we only need to know the basics. Any statement starting with # is a preprocessor command. You will notice that #include is itself a preprocessor command, which makes sense, since the preprocessor was replacing those #include's with the code they contained. #define lets you define a constant (or if you want to be technical, an object-like macro). It can be equal to a number or a word or just not be equal to anything and simply be in a defined state. #ifdef and #endif are just an if statement that allows the code inside of it to exist if the given constant is defined. #ifndef simply does the opposite - the code inside only exists if the given constant doesn't exist.

So, what we do is pick a constant name that probably will never be used in anything else, like __UNICORN_H__, and put in a check to see if it is defined. The first time the header is reached, it won't be defined, so the code inside #ifndef will exist. The next line tells the preprocessor to define __UNICORN_H__, the constant we just checked for. That means that the next time this header is included, __UNICORN_H__ will have been defined, and so the code will be skipped over. Observe:

//main.cpp

#include "Rainbow.h"
#include "Unicorn.h"

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

#include "Unicorn.h"

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h

#ifndef __UNICORN_H__
#define __UNICORN_H__

class Unicorn
{
  int magic;
};

#endif

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

//Unicorn.h

#ifndef __UNICORN_H__
#define __UNICORN_H__

class Unicorn
{
  int magic;
};

#endif

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h

#ifndef __UNICORN_H__
#define __UNICORN_H__

class Unicorn
{
  int magic;
};

#endif

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

//Unicorn.h

#ifndef __UNICORN_H__
#define __UNICORN_H__

class Unicorn
{
  int magic;
};

#endif

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h

#ifndef __UNICORN_H__
#endif

int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}
//main.cpp

//Rainbow.h

//Unicorn.h


class Unicorn
{
  int magic;
};


class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};
//Unicorn.h


int main(int argc, char *argv[]) 
{
  Rainbow rainbow;
}

Our problem is solved! However, note that //Unicorn.h was left in, because it was outside the include guard. It is absolutely critical that you put everything inside your include guard (ignoring comments), or it will either not work properly or be extremely inefficient.

//Rainbow.h

#include "Unicorn.h"

#ifndef __RAINBOW_H__ //WRONG WRONG WRONG WRONG WRONG
#define __RAINBOW_H__

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};

#endif

In this case, the code still compiles, because the include guards prevent duplicate definitions, but its very taxing on the preprocessor that will repeatedly attempt to include Unicorn.h only to discover that it must be skipped over anyway. The preprocessor may be powerful, but it is also very dumb and is easily crippled. The thing is slow enough as it is, so try to keep its workload to a minimum by putting your #include's inside the include guard. Also, don't put semicolons on preprocessor directives. Even though almost everything else in the entire language wants semicolons, semicolons in preprocessor directives will either be redundant or considered a syntax error.

//Rainbow.h

#ifndef __RAINBOW_H__
#define __RAINBOW_H__

#include "Unicorn.h" // SMILES EVERYWHERE!

class Rainbow
{
  Unicorn _unicorns[5]; // 5 unicorns dancing on rainbows
};

#endif
Ok, so now we know how to properly use header files, but not how they are used to declare classes. Let's take a class declared in C#, and then transform it into an equivalent prototype in C++.
public class Pegasus : IComparable<Pegasus>
{
  private Rainbow rainbow;
  protected int magic;
  protected bool flying;

  const int ID=10;
  static int total=0;
  const string NAME="Pegasus";

  public Pegasus()
  {
    flying=false;
    magic=1;
    IncrementTotal();
  }
  ~Pegasus()
  {
    magic=0;
  }
  public void Fly()
  {
    flying=true;
  }
  private void Land()
  {
    flying=false;
  }
  public static string GetName()
  {
    return NAME;
  }
  private static void IncrementTotal()
  {
    ++total;
  }
  public int CompareTo(Pegasus other)
  {
    return 0;
  }
}
class Pegasus : public IComparable<Pegasus>
{
public:
  Pegasus();
  ~Pegasus();
  void Fly();
  virtual int CompareTo(Pegasus& other);
  
  static const int ID=10;
  static int total;
  static const char* NAME;

  inline static void IncrementTotal() { ++total; }

protected:
  int magic;
  bool flying;

private:
  void Land();
  
  Rainbow rainbow;
};

Immediately, we are introduced to C++'s method of dealing with public, protected and private. Instead of specifying it for each item, they are done in groups. The inheritance syntax is identical, and we've kept the static variables, but now only one of them is being initialized in the class. In C++, you cannot initialize a static variable inside a class unless it is a static const int (or any other integral type). Instead, we will have to initialize total and NAME when we get around to implementing the code for this class. In addition, while most of the functions do not have code, as expected, IncrementTotal does. As an aside, C# does not have static const because it considers it redundant - all constant values are static. C++, however, allows you to declare a const variable that isn't static. While this would be useless in C#, there are certain situations where it is useful in C++.

If a given function's code doesn't have any dependencies unavailable in the header file the class is declared in, you can define that method in the class prototype itself. However, as I mentioned before, code in header files runs the danger of being compiled twice. While the compiler is usually good about properly instancing the class, it is usually a good idea to inline any functions defined in the header. Functions that are inline'd are embedded inside code that calls them instead of being explicitly called. That means instead of pushing arguments on to the stack and returning, the compiler simply embeds the function inside of the code that called it, like so:

#include "Pegasus.h"

// Before compilation
int main(int argc, char *argv[]) 
{
  Pegasus::IncrementTotal()
}

// After compilation
int main(int argc, char *argv[]) 
{
  ++Pegasus::total;
}

The consequence of this means that the function itself is never actually instantiated. In fact the function might as well not exist - you won't be able to call it from a DLL because the function was simply embedded everywhere that it was used, kind of like a fancy macro. This neatly solves our issue with code in header files, and will be important later on. This also demonstrates how one accesses static variables and functions in a class. Just like before, the C# method of using . no longer works, you must use the Scope Resolution Operator (::) to access static members and functions of a class. This same operator is what allows us to declare the code elsewhere without confusing the compiler.

//Pegasus.cpp

#include "Pegasus.h"

int Pegasus::total = 0;
const char* Pegasus::NAME = "Pegasus";

Pegasus::Pegasus() : IComparable<Pegasus>(), magic(1), flying(false)
{
  IncrementTotal();
}

Pegasus::~Pegasus()
{
  magic=0;
}

void Pegasus::Fly()
{
  flying=true;
}

void Pegasus::Land()
{
  flying=false;
}

string Pegasus::GetName()
{
  return NAME;
}

int Pegasus::CompareTo(Pegasus other)
{
  return 0;
}

This looks similar to what our C# class looked like, except the functions aren't in the class anymore. Pegasus:: tells the compiler what class the function you are defining belongs in, which allows it to assign the class function prototype to the correct implementation, just like it did with normal functions before. Notice that static is not used when defining GetName() - All function decorations (inline, static, virtual, explicit, etc.) are only allowed on the function prototype. Note that all these rules apply to static variable initialization as well; both total and NAME are resolved using Pegasus:: and don't have the static decorator, only their type. Even though we're using const char* instead of string, you can still initialize a constant value using = "string".

The biggest difference here is in the constructor. In C#, the only things you bother with in the constructor after the colon is either initializing a subclass or calling another constructor. In C++, you can initialize any subclasses you have along with any variables you have, including passing arguments to whatever constructors your variables might have. Most notably is the ability to initialize constant values, which means you can have a constant integer that is set to a value passed through the constructor, or based off a function call from somewhere else. Unfortunately, C++ traditionally does not allow initializing any variables in any sub-classes, nor does it allow calling any of your own constructors. C++0x partially resolves this problem, but it is not fully implemented in VC++ or other modern compilers. This blow, however, is mitigated by default arguments in functions (and by extension, constructors), which allows you to do more with fewer functions.

The order in which variables are constructed is occasionally important if there is an inter-dependency between them. While having such inter-dependencies are generally considered a bad idea, they are sometimes unavoidable, and you can take advantage of a compiler's default behavior of initializing the values in a left to right order. While this behavior isn't technically guaranteed, it is sufficiently reliable for you to take use of it in the occasional exceptional case, but always double-check that the compiler hasn't done crazy optimization in the release version (usually, though, this will just blow the entire program up, so it's pretty obvious).

Now, C# has another datatype, the struct. This is a limited datatype that cannot have a constructor and is restricted to value-types. It is also passed by-value through functions by default, unlike classes. This is very similar to how structs behaved in C, but have no relation to C++'s struct type. In C++, a struct is completely identical to a class in every way, save for one minor detail: all members of a class are private by default, while all members of a struct are public by default. That's it. You can take any class and replace it with struct and the only thing that will change is the default access modifier.

Even though there is no direct analogue to C#'s struct, there is an implicit equivalent. If a class or struct (C++ really doesn't care) meets the requirements of a traditional C struct (no constructor, only basic data types), then it's treated as Plain Old Data, and you are then allowed to skip the constructor and initialize its contents using the special bracket initialization that was touched on before. Yes, you can initialize constant variables using that syntax too.

One thing I've skipped over is the virtual code decorator in the C++ prototype of Pegasus, which is not actually necessary, because the function is already attempting to override another virtual function declared in IComparable, which implicitly makes it virtual. However, in C#, IComparable is implemented as an interface, which is not present in C++. Of course, if you really think about it, an interface is kind of like a normal class, just with all abstract methods (ignore the inheritance issues with this for now). So, we could rewrite the C# implementation of IComparable as a class with abstract methods:

public class IComparable<T>
{
  public abstract int CompareTo(T other);
}
As it turns out, this has a direct C++ analogue:
template<class T>
class IComparable
{
public:
  virtual int CompareTo(T other)=0;
}

This virtual function, instead of being implemented, has an =0 on the end of it. That makes the function pure virtual, which is just another way of saying abstract. So the C++ version of abstract is a pure virtual function, and a C++ version of interfaces is just a class made entirely out of pure virtual functions. Just as C# prevents you from instantiating an abstract class or interface, C++ considers any class that either declares or inherits pure virtual functions without giving them code as an abstract class that cannot be instantiated. Unfortunately C++ does not have anything like sealed, override, etc., so you are on your own there. Keep in mind that public IComparable<T> could easily be replaced with protected or private for more control.

The reason C# has interfaces at all is because C# only allows you to inherit a single class, regardless of whether or not its abstract. If its got code, you can only inherit it once. Interfaces, however, have no code, and so C# lets you pile them on like candy. This isn't done in C++, because C++ supports multiple inheritance. In C++ you can have any class inherit any other class, no matter what, but you can only instantiate a class if it provides implementations for all pure virtual functions somewhere along its inheritance line. Unfortunately, there are a lot of caveats about multiple inheritance, the most notorious being the Diamond Problem.

Let's say you have a graphics engine that has an Image class, and that image class inherits from an abstract class that holds its position. Obviously, any image on the screen is going to have a position. Then, let's take a physics engine, with a basic object that also inherits from an abstract class that holds its position. Obviously any physics object must have a position. So, what happens when you have a game object that is both an image and a physics object? Since the image and the physics object are in fact the same thing, both of them must have the same position at all times, but both inherit the abstract class storing position separately, resulting in two positions. Which one is the right position? When you call SetPosition, which position are you talking about?

Virtual inheritance was introduced as an attempt to solve this problem. It works by creating a single instance of a derived class for the entire inheritance change, such that both the physics object and the image share the same position, as they are supposed to. Unfortunately, it can't resolve all the ambiguities, and it introduces a whole truckload of new problems. It has a nasty habit of being unable to resolve its own virtual functions properly and introducing all sorts of horrible weirdness. Most incredibly bizarre is a virtually inherited class's constructor - it must be initialized in the last class in the inheritance chain, and is one of the first classes to get its constructor called, regardless of where it might be in the hierarchy. It's destructor order is equally as bizarre. Virtual inheritance is sometimes useful for certain small utility classes that must be shared through a wide variety of situations, like a flag class. As a rule of thumb, you should only use virtual inheritance in a class that either relies on the default constructor or only offers a constructor that takes no arguments, and has no superclasses. This allows you to just slap the virtual keyword on and forget about all the wonky constructor details.

class Pegasus : virtual IComparable<Pegasus>

If you ever think you need to use virtual inheritance on something more complicated, your code is broken and you need to rethink your program's architecture (and the compiler probably won't be able to do it properly anyway). On a side-note, the constructors for any given object are called from the top down. That is, when your object's constructor is called, it immediately calls all the constructors for all it's superclasses, usually before even doing any variable initialization, and then those object constructors immediately call all their superclass constructors, and so on until the first line of code executed in your program is whatever the topmost class was. This then filters down until control is finally returned to your original constructor, such that any constructor code is only executed after all of its base classes have been constructed. The exact reverse happens for destructors, with the lowest class destructor being executed first, and after its finished, the destructors for all its base classes are called, such that a class destructor is always called while all of its base classes still exist.

Hopefully you are familiar with C#'s enum keyword. While it used to be far more limited, it has now been extended to such a degree it is identical to C++, even the syntax is the same. The only difference between the two is that the C++ version can't be declared public, protected or private and needs to have a semicolon on the end (like everything else). Like in C#, enums, classes and structs can be embedded in classes, except in C++ they can also be embedded in structs (because structs are basically classes with a different name). Also, C++ allows you to declare an enum/class/etc. and a variable inside the class at the same time using the following syntax:

class Pegasus
{
  enum Count { Uno=2, Dos, Tres, Quatro, Cinco } variable;
  enum { Uno=2, Dos, Tres, Quatro, Cinco } var2; //When used to immediately declare a variable, enums can be anonymous
}

//Same as above
class Pegasus
{
  enum Count { Uno=2, Dos, Tres, Quatro, Cinco }; //cannot be anonymous

  Count variable;
  Count var2;
}

Unions are exclusive to C++, and are a special kind of data structure where each element occupies the same address. To understand what that means, let's look at an example:

union //Unions are usually anonymous, but can be named
{
  struct { // The anonymity of this struct exposes its internal members.
    __int32 low;
    __int32 high;
  }
  __int64 full;
}
union

__int32 and __int64 are simply explicitly declaring 32-bit and 64-bit integers. This union allows us to either set an entire 64-bit integer, or to only set its low or high portion. This happens because the data structure is laid out as follows:

Integer Union Layout

Both low and full are mapped to the exact same place in memory. The only difference is that low is a 32-bit integer, so when you set that to 0, only the first four bytes are set to zero. high is pointing to a location in memory that is exactly 4 bytes in front of low and full. So, if low and full were located at 0x000F810, high would be located at 0x000F814. Setting high to zero sets the last four bytes to zero, but doesn't touch the first four. Consequently, if you set high to 0, reading full would always return the same value as low, since it would essentially be constrained to a 32-bit integer. Unions, however, do not have to have matching memory layouts:

union //Unions are usually anonymous, but can be named
{
  char pink[5]
  __int32 fluffy;
  __int64 unicorns;
}

The layout of this union is:

Jagged Union Layout

Any unused space is simply ignored. This same rule would apply for any structs being used to group data. The size of the union is simply the size of its largest member. Setting all 5 elements of pink here would result in fluffy being equal to zero, and only the last 24-bits (or last 3 bytes) of unicorns be untouched. Likewise, setting fluffy to zero would zero out the first 4 elements in pink (indexes 0-3), leaving the 5th untouched. These unions are often used in performance critical areas where a single function must be able to recieve many kinds of data, but will only ever recieve a single group of data at a time, and so it would be more efficient to map all the possible memory configurations to a single data structure that is large enough to hold the largest group. Here is a real world example:

struct __declspec(dllexport) cGUIEvent
{
  cGUIEvent() { memset(this,0,sizeof(cGUIEvent)); }
  cGUIEvent(unsigned char _evt, const cVecT<int>* mousecoords, unsigned char _button, bool _pressed) : evt(_evt), subevt(0), mousecoords(mousecoords), button(_button), pressed(_pressed) {}
  cGUIEvent(unsigned char _evt, const cVecT<int>* mousecoords, unsigned short _scrolldelta) : evt(_evt), subevt(0), mousecoords(mousecoords), scrolldelta(_scrolldelta) {}
  union
 {
  struct
  {
   unsigned char evt;
   unsigned char subevt;
  };
  unsigned short realevt;
 };

  union
  {
    struct { const cVecT<int>* mousecoords; unsigned char button; bool pressed; };
    struct { const cVecT<int>* mousecoords; short scrolldelta; };
    struct { //the three ctrl/shift/alt bools (plus a held bool) here are compressed into a single byte
      bool down;
      unsigned char keycode; //only used by KEYDOWN/KEYUP
      char ascii; //Only used by KEYCHAR
      wchar_t unicode; //Only used by KEYCHAR
      char sigkeys;
    };
    struct { float value; short joyaxis; }; //JOYAXIS
    struct { bool down; short joybutton; }; //JOYBUTTON*
  };
};

Here, the GUI event is mapped to memory according to the needs of the event that it is representing, without the need for complex inheritance or wasteful memory usage. Unions are indispensable in such scenarios, and as a result are very common in any sort of message handling system.

One strange decorator that has gone unexplained in the above example is the __declspec(dllexport) class decorator. When creating a windows DLL, if you want anything to be usable by something inheriting the DLL, you have to export it. In VC++, this can be done with a module definition file (.def), which is useful if you'll be using GetProcAddress manually, but if you are explicitly linking to a DLL, __declspec(dllexport) automatically exports the function for you when placed on a function. When placed on a class, it automatically exports the entire class. However, for anyone to utilize it, they have to have the header file. This arises to DLLs being distributed as DLLs, linker libraries (.lib), and sets of header files, usually in an "include" directory. In certain cases, only some portions of your DLL will be accessible to the outside, and so you'll want two collections of header files - outside header files and internal ones that no one needs to know about. Consequently, utilizing a large number of C++ DLLs usually involves substantial organization of a whole lot of header files.

Due to the compiler-specific nature of DLL management, they will be covered in Part 6. For now, its on to operator overloading, copy semantics and move semantics!

Part 4: Operator Overload

September 5, 2011

The Problem of Vsync

If you were to write directly to the screen when drawing a bouncing circle, you would run into some problems. Because you don't do any buffering, your user might end up with a quarter circle drawn for a frame. This can be solved through Double Buffering, which means you draw the circle on to a backbuffer, then "flip" (or copy) the completed image on to the screen. This means you will only ever send a completely drawn scene to the monitor, but you will still have tearing issues. These are caused by trying to update the monitor outside of its refresh rate, meaning you will have only finished drawing half of your new scene over the old scene in the monitor's video buffer when it updates itself, resulting in half the scanlines on the screen having the new scene and half still having the old scene, which gives the impression of tearing.

This can be solved with Vsync, which only flips the backbuffer right before the screen refreshes, effectively locking your frames per second to the refresh rate (usually 60 Hz or 60 FPS). Unfortunately, Vsync with double buffering is implemented by simply locking up the entire program until the next refresh cycle. In DirectX, this problem is made even worse because the API locks up the program with a 100% CPU polling thread, sucking up an entire CPU core just waiting for the screen to enter a refresh cycle, often for almost 13 milliseconds. So your program sucks up an entire CPU core when 90% of the CPU isn't actually doing anything but waiting around for the monitor.

This waiting introduces another issue - Input lag. By definition any input given during the current frame can only come up when the next frame is displayed. However, if you are using vsync and double buffering, the current frame on the screen was the LAST frame, and the CPU is now twiddling its thumbs until the monitor is ready to display the frame that you have already finished rendering. Because you already rendered the frame, the input now has to wait until the end of the frame being displayed on the screen, at which point the frame that was already rendered is flipped on to the screen and your program finally realizes that the mouse moved. It now renders yet another frame taking into account this movement, but because of Vsync that frame is blocked until the next refresh cycle. This means, if you were to press a key just as a frame was put up on the monitor, you would have two full frames of input lag, which at 60 FPS is 33 ms. I can ping a server 20 miles away with a ping of 21 ms. You might as well be in the next city with that much latency.

There is a solution to this - Triple Buffering. The idea is a standard flip mechanism commonly used in dual-thread lockless synchronization scenarios. With two backbuffers, the application can write to one and once its finished, tell the API and it will mark it for flipping to the front-buffer. Then the application starts drawing on the second, after waiting for any flipping operation to finish, and once its done, marks that for flipping to the front-buffer and starts drawing on the first again. This way, the application can draw 2000 frames a second, but only 60 of those frames actually get flipped on to the monitor using what is essentially a lockless flipping mechanism. Because the application is now effectively rendering 2000 frames per second, there is no more input lag. Problem Solved.

Except not, because DirectX implements Triple Buffering in the most useless manner possible. DirectX just treats the extra buffer as a chain, and rotates through the buffers as necessary. The only advantage this has is that it avoids waiting for the backbuffer copy operation to finish before writing again, which is completely useless in an era where said copy operation would have to be measured in microseconds. Instead, it simply ensures that vsync blocks the program, which doesn't solve the input issue at all.

However, there is a flag, D3DPRESENT_DONOTWAIT, that forces vsync to simply return an error if the refresh cycle isn't available. This would allow us to implement a hack resembling what triple buffering should be like by simply rolling our own polling loop and re-rendering things in the background on the second backbuffer. Problem solved!

Except not. It turns out the Nvidia and Intel don't bother implementing this flag, forcing Vsync to block no matter what you do, and to make matters worse, this feature doesn't have an entry in D3DCAPS9, meaning the DirectX9 API just assumes that it exists, and there is no way to check if it is supported. Of course, don't complain about this to anyone, because of the 50% of people who asked about this who weren't simply ignored, almost all of them were immediately accused of bad profiling, and that the Present() function couldn't possibly be blocking with the flag on. I question the wisdom of people who ignore the fact that the code executed its main loop 2000 times with vsync off and 60 times with it on and somehow come to the conclusion that Present() isn't blocking the code.

Either way, we're kind of screwed now. Absolutely no feature in DirectX actually does what its supposed to do, so there doesn't seem to be a way past this input lag.

There is, however, another option. Clever developers would note that to get around vsync's tendency to eat up CPU cycles like a pig, one could introduce a Sleep() call. So long as you left enough time to render the frame, you could recover a large portion of the wasted CPU. A reliable way of doing this is figuring out how long the last frame took to render, then subtracting that from the FPS you want to enforce and sleep in the remaining time. By enforcing an FPS of something like 80, you give yourself a bit of breathing room, but end up finishing rendering the frame around the same time it would have been presented anyway.

By timing your updates very carefully, you can execute a Sleep() call, then update all the inputs, then render the scene. This allows you to cut down the additional lag time by nearly 50% in ideal conditions, almost completely eliminating excess input lag. Unfortunately, if your game is already rendering at or below 100 FPS, it takes you 10 milliseconds to render a frame, allowing you only 2.5 milliseconds of extra time to look for input, which is of limited usefulness. This illustrates why Intel and Nvidia are unlikely to care about D3DPRESENT_DONOTWAIT - modern games will never render fast enough for substantial input lag reduction.

Remember when implementing the Yield that the amount of time it takes to render the frame should be the time difference between the two render calls, minus the amount of time spent sleeping, minus the amount of time Present() was blocking.

September 1, 2011

Musical Genres

As a composer, I have been on the receiving end of a lot of musical criticism - some useful, most ridiculous. I have given out quite a bit of criticism myself, but after discovering that most people aren't interested in brutally honest opinions, I have since avoided it. However, one thing that continues to come up over and over again is someone complaining about Genres.

"This is too fast to be trance."

"This isn't real Drum'n'Bass."

Sometimes people will even slam entire swathes of subgenres, like Ishkur's rant on Epic Trance (and by extension almost anything related to it), literally accusing it as betraying the entire idea of trance: "There must be a word to describe the pain one feels when witnessing (or hearing, rather) something once pure and brilliant completely sold down the river. Sometime in the mid-90s trance decided to drop the technique of slowly introducing complicated layers and building adequate tension over long stretches, replacing them with cutesy little insta-melodies ... The average attention span, way too ritalin-freaked to pay attention to the slow, brooding trance in its original form, liked the anthemic singalong tone of the NEW McTrance, and that's why all you trance crackers are reading this right now. Not because you grew a taste for this super awesome underground music ... But because trance reformed its sound and delivery to suit [YOU]."

This is repeated for something like half the listed subgenres of trance, and in fact the entire trance genre in his "Guide" is just one giant extended rant about how Trance sucks now and it used to be awesome and now we've all ruined it forever.

This kind of stuck-up, bigoted, brain-melting stupidity has grated against my nerves for years, half because I just don't like stupid stuck-up dipshits, but mostly because it is simply wrong.

Genres do not define music. Genres were invented so people could find music similar to songs that they liked. That's all. There are no rules for any genre other than "it should sound kind of like other songs in said genre", and even then it's commonplace to have songs associated with multiple genres. Genres are a categorization system, and nothing else. Many people try and justify their opinions by saying that they're criticizing the classification of the song instead of the song itself, and suggesting that it should be put in some kind of subgenre instead. When the inevitable subgenre usually fails to exist because the composer is being creative like their supposed to, they'll suggest something asinine, like "put it in Miscellaneous, that's what its there for."

Really? Put this obviously heavily drum'n'bass influenced song in Miscellaneous with a bunch of off-the-wall experimental stuff instead of songs that, you know, actually sound like it, just because it doesn't conform to a bunch of imaginary rules you pulled out of your ass to "qualify" the song for the genre? Well why don't we just invent another subgenre? We've only got like a couple hundred of them now, 80% of which are basically the same damn thing. People try to defend their perceived sanctity of genres, but the problem is that its all bullshit. Let's remind ourselves, why do genres exist?

Genres exist so people can find songs that sound similar to music they like. If you have a bajillion subgenres, no one's going to be able to accurately classify every single song into its own little niche, and what's more infuriating is that this misses the point completely. The vast majority of people do not have laser-guided musical tastes. They just listen to whatever the heck music they like. If they're looking for a song, they don't want to have to filter through hundreds of meaningless subgenres, because all they're really looking for is something like, Trance, or maybe Melodic Trance, and that's about as qualifying as you can get while still being useful. Consequently if your song is weird, you are better off picking the closest well-known genre of music that it sounds like and slapping it in there.

And yet, it still doesn't stop. People start throwing on ridiculous prescriptive rules like, a trance song has to be mixable, and to be club friendly you have to have 1 minute of intro with no bass, or it has to be between 116-148 BPM, or you have to use these types of instruments, or you have to do X, or X, or X. Music is art, god damn it, what matters is what a song feels like. If it feels like trance even though its flying along at 166 BPM, and a lot of people who like trance also like that song, then it belongs in trance no matter how much you complain about it. Maybe stick it in "Energy Trance", it kinda gets the idea across, but its still Trance, so who cares, and even then this point is usually moot, because these arguments always come up on websites with either a set list of genres, or one that operates on keywords. In the former case, you can't qualify your genre with anything more than "trance" because the only thing they offer is "Trance" and "Techno". In the latter case, you'll have to tag it with Trance no matter what you do because otherwise no one will ever know your song exists.

Attacking a song because of its perceived genre is the dumbest, most useless criticism you can ever give, unless the artist explicitly states that they are trying for a very specific sound, and even then its rarely a genre and usually more of an abstract concept used across several subgenres, in which case you should be referring to the idea, not the genre. People need to understand that if I slap a "Trance" label on to my song, it doesn't automatically mean I am trying to make whatever anglicized version of "Trance" they have deluded themselves into thinking encapsulates the entire genre (which is completely different from everyone else's), it is simply there to help them find the damn song.

July 21, 2011

C# to C++ Tutorial - Part 2: Pointers Everywhere!

[ 1 · 2 · 3 · 4 · 5 · 6 · 7 ]

We still have a lot of ground to cover on pointers, but before we do, we need to address certain conceptual frameworks missing from C# that one must be intimately familiar with when moving to C++.

Specifically, in C# you mostly work with the Heap. The heap is not difficult to understand - its a giant lump of memory that you take chunks out of to allocate space for your classes. Anything using the new keyword is allocated on the heap, which ends up being almost everything in a C# program. However, the heap isn't the only source of memory - there is also the Stack. The Stack is best described as what your program lives inside of. I've said before that everything takes up memory, and yes, that includes your program. The thing is that the Heap is inherently dynamic, while the Stack is inherently fixed. Both can be re-purposed to do the opposite, but trying to get the Stack to do dynamic allocation is extremely dangerous and is almost guaranteed to open up a mile-wide security hole.

I'm going to assume that a C# programmer knows what a stack is. All you need to understand is that absolutely every single piece of data that isn't allocated on the heap is pushed or popped off your program's stack. That's why most debuggers have a "stack" of functions that you can go up and down. Understanding the stack in terms of how many functions you're inside of is ok, but in reality, there are also variables declared on the stack, including every single parameter passed to a function. It is important that you understand how variable scope works so you can take advantage of declaring things on the stack, and know when your stack variables will simply vanish into nothingness. This is where { and } come in.

int main(int argc, char *argv[])
{
  int bunny = 1;
  
  {
    int carrot=3;
    int lettuce=8;
    bunny = 2; // Legal
  }

  //carrot=2; //Compiler error: carrot does not exist
  int carrot = 3; //Legal, since the other carrot no longer exists
  
  {
    int lettuce = 0;

    { 
       //int carrot = 1; //Compiler error: carrot already defined
       int grass = 9;
       
       bunny = grass; //Still legal
       bunny = carrot; // Also legal
    }
    
    //bunny = grass; //Illegal
    bunny = lettuce; //Legal
  }
  
  //bunny = lettuce; //Illegal
}

{ and } define scope. Anything declared inside of them ceases to exist outside, but is still accessible to any additional layers of scope declared inside of them. This is a way to see your program's stack in action. When bunny is declared, its pushed on to the stack. Then we enter our first scope area, where we push carrot and lettuce on to the stack and set bunny to 2, which is legal because bunny is still on the stack. When the scope is then closed, however, anything declared inside the scope is popped from the stack in the exact opposite order it was pushed on. Unfortunately, compiler optimization might change that order behind the scenes, so don't rely on it, but it should be fairly consistent in debug builds. First lettuce is de-allocated (and its destructor called, if it has one), then carrot is de-allocated. Consequently, trying to set carrot to 2 outside of the scope will result in a compiler error, because it doesn't exist anymore. This means we can now declare an entirely new integer variable that is also called carrot, without causing an error. If we visualize this as a stack, that means carrot is now directly above bunny. As we enter a new scope area, lettuce is then put on top of carrot, and then grass is put on top of lettuce. We can still assign either lettuce or carrot to bunny, since they are all on the stack, but once we leave this inner scope, grass is popped off the stack and no longer exists, so any attempt to use it causes an error. lettuce, however, is still there, so we can assign lettuce to bunny before the scope closes, which pops lettuce off the stack.

Now the only things on the stack are bunny and carrot, in that order (if the compiler hasn't moved things around). We are about to leave the function, and the function is also surrounded by { and }. This is because a function is, itself, a scope, so that means all variables declared inside of that scope are also destroyed in the order they were declared in. First carrot is destroyed, then bunny is destroyed, and then the function's parameters argc and argv are destroyed (however the compiler can push those on to the stack in whatever order it wants, so we don't know the order they get popped off), until finally the function itself is popped off the stack, which returns program flow to whatever called it. In this case, the function was main, so program flow is returned to the parent operating system, which does cleanup and terminates the process.

You can declare anything that has a size determined at compile time on the stack. This means if you have an array that has a constant size, you can declare it on the stack:

int array[5]; //Array elements are not initialized and therefore are undefined!
int array[5] = {0,0,0,0,0}; //Elements all initialized to 0
//int array[5] = {0}; // Compiler error - your initialization must match the array size

You can also let the compiler infer the size of the array:

int array[] = {1,2,3,4}; //Declares an array of 4 ints on the stack initialized to 1,2,3,4

Not only that, but you can declare class instances and other objects on the stack.

Class instance(arg1, arg2); //Calls a constructor with 2 arguments
Class instance; //Used if there are no arguments for the constructor
//Class instance(); //Causes a compiler error! The compiler will think its a function.

In fact, if you have a very simple data structure that uses only default constructors, you can use a shortcut for initializing its members. I haven't gone over classes and structs in C++ yet (See Part 3), but here is the syntax anyway:

struct Simple
{
  int a;
  int b;
  const char* str;
};

Simple instance = { 4, 5, "Sparkles" };
//instance.a is now 4
//instance.b is now 5
//instance.str is now "Sparkles"

All of these declare variables on the stack. C# actually does this with trivial datatypes like int and double that don't require a new statement to allocate, but otherwise forces you to use the Heap so its garbage collector can do the work.

Wait a minute, stack variables automatically destroy themselves when they go out-of-scope, but how do you delete variables allocated from the Heap? In C#, you didn't need to worry about this because of Garbage Collection, which everyone likes because it reduces memory leaks (but even I have still managed to cause a memory leak in C#). In C++, you must explicitly delete all your variables declared with the new keyword, and you must keep in mind which variables were declared as arrays and which ones weren't. In both C# and C++, there are two uses of the new keyword - instantiating a single object, and instantiating an array. In C++, there are also two uses of the delete keyword - deleting a single object and deleting an array. You cannot mix up delete statements!

int* Fluffershy = new int();
int* ponies = new int[10];

delete Fluffershy; // Correct
//delete ponies; // WRONG, we should be using delete [] for ponies
delete [] ponies; // Just like this
//delete [] Fluffershy; // WRONG, we can't use delete [] on Fluffershy because we didn't
                        // allocate it as an array.

int* one = new int[1];

//delete one; // WRONG, just because an array only has one element doesn't mean you can
              // use the normal delete!
delete [] one; // You still must use delete [] because you used new [] to allocate it.

As you can see, it is much easier to deal with stack allocations, because they are automatically deallocated, even when the function terminates unexpectedly. std::auto_ptr takes advantage of this by taking ownership of a pointer and automatically deleting it when it is destroyed, so you can allocate the auto_ptr on the stack and benefit from the automatic destruction. However, in C++0x, this has been superseded by std::unique_ptr, which operates in a similar manner but uses some complex move semantics introduced in the new standard. I won't go into detail about how to use these here as its out of the scope of this tutorial. Har har har.

For those of you who like throwing exceptions, I should point out common causes of memory leaks. The most common is obviously just flat out forgetting to delete something, which is usually easily fixed. However, consider the following scenario:

void Kenny()
{
  int* kenny = new int();
  throw "BLARG";
  delete kenny; // Even if the above exception is caught, this line of code is never reached.
}

int main(int argc, char* argv[])
{
  try {
  Kenny();
  } catch(char * str) { 
    //Gotta catch'em all.
  }
  return 0; //We're leaking Kenny! o.O
}

Even this is fairly common:

int main(int argc, char* argv[])
{
  int* kitty = new int();

  *kitty=rand();
  if(*kitty==0)
    return 0; //LEAK
  
  delete kitty;
  return 0;
}

These situations seem obvious, but they will happen to you once the code becomes enormous. This is one reason you have to be careful when inside functions that are very large, because losing track of if statements may result in you forgetting what to delete. A good rule of thumb is to make sure you delete everything whenever you have a return statement. However, the opposite can also happen. If you are too vigilant about deleting everything, you might delete something you never allocated, which is just as bad:

int main(int argc, char* argv[])
{
  int* rarity = new int();
  int* spike;

  if(rarity==NULL)
  {
    spike=new int();
  }
  else
  {
    delete rarity;
    delete spike; // Suddenly, in an alternate dimension, earth ceased to exist
    return 0;
  }
  
  delete rarity; // Since this only happens if the allocation failed and returned a NULL
                 // pointer, this will also blow up.
  delete spike;
  return 0;
}

Clearly, one must be careful when dealing with allocating and destroying memory in C++. Its usually best to encapsulate as much as possible in classes that automate such things. But wait, what about that NULL pointer up there? Now that we're familiar with memory management, we're going to dig into pointers again, starting with the NULL pointer.

Since a pointer points to a piece of memory that's somewhere between 0 and 4294967295, what happens if its pointing at 0? Any pointer to memory location 0 is always invalid. All you need to know is that the operating system does some magic voodoo to ensure that any attempted access of memory location 0 will always throw an error, no matter what. 1, 2, 3, and any other double or single digit low numbers are also always invalid. 0xfdfdfdfd is what the VC++ debugger sets uninitialized memory to, so that pointer location is also always invalid. A pointer set to 0 is called a Null Pointer, and is usually used to signify that a pointer is empty. Consequently if an allocation function fails, it tends to return a null pointer. Null pointers are returned when the operation failed and a valid pointer cannot be returned. Consequently, you may see this:

int main(int argc, char* argv[])
{
  int* blink = new int();
  if(blink!=0) delete blink;
  blink=0;
  return 0;
}

This is known as a safe deletion. It ensures that you only delete a pointer if it is valid, and once you delete the pointer you set the pointer to 0 to signify that it is invalid. Note that NULL is defined as 0 in the standard library, so you could also say blink = NULL.

Since pointers are just integers, we can do pointer arithmetic. What happens if you add 1 to a pointer? If you think of pointers as just integers, one would assume it would simply move the pointer forward a single byte.

Moving a Pointer 1 byte

This isn't what happens. Adding 1 to a pointer of type integer results in the pointer moving forward 4 bytes.

Moving a Pointer 4 bytes

Adding or subtracting an integer $i$ from a pointer moves that pointer $i\cdot n$ bytes, where $n$ is the size, in bytes, of the pointer's type. This results in an interesting parallel - adding or subtracting from a pointer is the same as treating the pointer as an array and accessing it via an index.

int main(int argc, char* argv[])
{
  int* kitties = new int[14];
  int* a = &kitties[7];
  int* b = kitties+7; //b is now the same as a
  int* c = &a[4];
  int* d = b+4; //d is now the same as c
  int* e = &kitties[11];
  int* f = kitties+11; 
  //c,d,e, and f now all point to the same location
}

So pointer arithmetic is identical to accessing a given index and taking the address. But what happens when you try to add two pointers together? Adding two pointers together is undefined because it tends to produce total nonsense. Subtracting two pointers, however, is defined, provided you subtract a smaller pointer from a larger one. The reason this is allowed is so you can do this:

int main(int argc, char* argv[])
{
  int* eggplants = new int[14];
  int* a = &eggplants[7];
  int* b = eggplants+10;
  int diff = b-a; // Diff is now equal to 3
  a += (diff*2); // adds 6 to a, making it point to eggplants[13]
  diff = a-b; // diff is again equal to 3
  diff = a-eggplants; //diff is now 13
  ++a; //The increment operator is valid on pointers, and operates the same way a += 1 would
  // So now a points to eggplants[14], which is not a valid location, but this is still
  // where the "end" of the array technically is.
  diff = a-eggplants; // Diff now equals 14, the size of the array
  --b; // Decrement works too
  diff = a-b; // a is pointing to index 14, b is pointing to 9, so 14-9 = 5. Diff is now 5.
  return 0;
}

There is a mistake in the code above, can you spot it? I used a signed integer to store the difference between the two pointers. What if one pointer was above 2147483647 and the other was at 0? The difference would overflow! Had I used an unsigned integer to store the difference, I'd have to be really damn sure that the left pointer was larger than the right pointer, or the negative value would also overflow. This complexity is why you have to goad windows into letting your program deal with pointer sizes over 2147483647.

In addition to arithmetic, one can compare two pointers. We already know we can use == and !=, but we can also use < > <= and >=. While you can get away with comparing two completely unrelated pointers, these comparison operators are usually used in a context like the following:

int main(int argc, char* argv[])
{
  int* teapots = new int[15];
  int* end = teapots+15;
  for(int* s = teapots; s<end; ++s)
    *s = 0;
  return 0;
}

Here the for loop increments the pointer itself rather than an index, until the pointer reaches the end, at which point it terminates. But, what if you had a pointer that didn't have any type at all? void* is a legal pointer type, that any pointer type can be implicitly converted to. You can also explicitly cast void* to any pointer type you want, which is why you are allowed to explicitly cast any pointer type to another pointer type (int* p; short* q = (short*)p; is entirely legal). Doing so, however, is obviously dangerous. void* has its own problems, namely, how big is it? The answer is, you don't know. Any attempt to use any kind of pointer arithmetic with a void* pointer will cause a compiler error. It is most often used when copying generic chunks of memory that only care about size in bytes, and not what is actually contained in the memory, like memcpy().

int main(int argc, char* argv[])
{
  int* teapots = new int[15];
  void* p = (void*)teapots;
  p++; // compiler error
  unsigned short* d = (unsigned short*)p;
  d++; // No compiler error, but you end up pointing to half an integer
  d = (unsigned short*)teapots; // Still valid
  return 0;
}

Now that we know all about pointer manipulation, we need to look at pointers to pointers, and to anchor this in a context that actually makes sense, we need to look at how C++ does multidimensional arrays. In C#, multidimensional arrays look like this:

int[,] table = new int[4,5];

C++ has a different, but fairly reasonable stack-based syntax. When you want to declare a multidimensional array on the heap, however, things start getting weird:

int unicorns[5][3]; // Well this seems perfectly reasonable, I wonder what-
  int (*cthulu)[50] = new int[10][50]; // OH GOD GET IT AWAY GET IT AWAAAAAY...!
  int c=5;
  int (*cthulu)[50] = new int[c][50]; // legal
  //int (*cthulu)[] = new int[10][c]; // Not legal. Only the leftmost parameter
                                      // can be variable
  //int (*cthulu)[] = new int[10][50]; // This is also illegal, the compiler is not allowed
                                       // to infer the constant length of the array.

Why isn't the multidimensional array here just an int**? Clearly if int* x is equivalent to int x[], shouldn't int** x be equivalent to int x[][]? Well, it is - just look at the main() function, its got a multidimensional array in there that can be declared as just char** argv. The problem is that there are two kinds of multidimensional arrays - square and jagged. While both are accessed in identical ways, how they work is fundamentally different.

Let's look at how one would go about allocating a 3x5 square array. We can't allocate a 3x5 chunk out of our computer's memory, because memory isn't 2-dimensional, its 1-dimensional. Its just freaking huge line of bytes. Here is how you squeeze a 2-dimensional array into a 1-dimensional line:

Allocating a 3x5 array

As you can see, we just allocate each row right after the other to create a 15-element array ($5\cdot 3 = 15$). But then, how do we access it? Well, if it has a width of 5, to access another "row" we'd just skip forward by 5. In general, if we have an $n$ by $m$ multidimensional array being represented as a one-dimensional array, the proper index for a coordinate $(x,y)$ is given by: array[x + (y*n)]. This can be extended to 3D and beyond but it gets a little messy. This is all the compiler is really doing with multidimensional array syntax - just automating this for you.

Now, if this is a square array (as evidenced by it being a square in 2D or a cube in 3D), a jagged array is one where each array is a different size, resulting in a "jagged" appearance:

Jagged array visualization

We can't possibly allocate this in a single block of memory unless we did a lot of crazy ridiculous stuff that is totally unnecessary. However, given that arrays in C++ are just pointers to a block of memory, what if you had a pointer to a block of memory that was an array of pointers to more blocks of memory?

Jagged array pointers

Suddenly we have our jagged array that can be accessed just like our previous arrays. It should be pointed out that with this format, each inner-array can be in a totally random chunk of memory, so the last element could be at position 200 and the first at position 5 billion. Consequently, pointer arithmetic only makes sense within each column. Because this is an array of arrays, we declare it by creating an array of pointers. This, however, does not initialize the entire array; all we have now is an array of illegal pointers. Since each array could be a different size than the other arrays (this being the entire point of having a jagged array in the first place), the only possible way of initializing these arrays is individually, often by using a for loop. Luckily, the syntax for accessing jagged arrays is the exact same as with square arrays.

int main(int argc, char* argv[])
{
  int** jagged = new int*[5]; //Creates an array of 5 pointers to integers.
  for(int i = 0; i < 5; ++i)
  {
    jagged[i] = new int[3+i]; //Assigns each pointer to a new array of a unique size
  }
  jagged[4][1]=0; //Now we can assign values directly, or...
  int* second = jagged[2]; //Pull out one column, and
  second[0]=0; //manipulate it as a single array

  // The double-access works because of the order of operations. Since [] is just an
  // operator, it is evaluated from left to right, like any other operator. Here it is
  // again, but with the respective types that each operator resolves to in parenthesis.
  ( (int&) ( (int*&) jagged[4] ) [1] ) = 0;
}
As you can see above, just like we can have pointers to pointers, we can also have references to pointers, since pointers are just another data type. This allows you to re-assign pointer values inside jagged arrays, like so: jagged[2] = (int*)kitty. However, until C++0x, those references didn't have any meaningful data type, so even though the compiler was using int*&, using that in your code will throw a compiler error in older compilers. If you need to make your code work in non-C++0x compilers, you can simply avoid using references to pointers and instead use a pointer to a pointer.
int* bunny;
int* value = new int[5];

int*& bunnyref = bunny; // Throws an error in old compilers
int** pbunny = &bunny; // Will always work
bunnyref = value; // This does the same exact thing as below.
*pbunny = value;

// bunny is now equal to value
This also demonstrates the other use of a pointer-to-pointer data type, allowing you to remotely manipulate a pointer just like a pointer allows you to remotely manipulate an integer or other value type. So obviously you can do pointers to pointers to pointers to pointers to an absurd degree of lunacy, but this is exceedingly rare so you shouldn't need to worry about it. Now you should be strong in the art of pointer-fu, so our next tutorial will finally get into object-oriented techniques in C++ in comparison to C#. Part 3: Classes and Structs and Inheritance OH MY!