I am attempting to create a program that can find human figures in gameplay video from Call of Duty. I have compiled a list of ~2200 separate images from this video that eithe
Better features win over better learning algorithms. The basic principle in feature selection is that the best features maximize interclass variance and minimize intraclass variance. In your case, the features should emphasize the difference between images that contain a human figure and images that don't, and deemphasize the differences between images of the same class.
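One rough way to put a number on that principle is a Fisher-style score per feature: the squared distance between the class means divided by the sum of the class variances. The sketch below is only an illustration; the feature names and values are made up, and `fisher_score` is a hypothetical helper, not an OpenCV function.

```python
from statistics import mean, pvariance


def fisher_score(pos, neg):
    """Between-class separation divided by within-class spread for one feature."""
    between = (mean(pos) - mean(neg)) ** 2
    within = pvariance(pos) + pvariance(neg)
    if within == 0:
        # No spread inside either class: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within


# Hypothetical feature values measured on "human" and "no human" images.
features = {
    "aspect_ratio": ([2.1, 2.4, 2.2, 2.3], [0.9, 1.1, 1.0, 1.2]),
    "mean_brightness": ([0.40, 0.70, 0.55, 0.30], [0.45, 0.65, 0.50, 0.35]),
}

scores = {name: fisher_score(pos, neg) for name, (pos, neg) in features.items()}

# Rank features from most to least discriminative.
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.2f}")
```

With these invented numbers the aspect ratio scores far higher than the mean brightness, whose class means coincide. That is the kind of feature the principle tells you to keep.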
For instance, you could try to find the contour of the human figure and calculate features based on that contour. OpenCV already has functions for calculating contour features: Moments, GetCentralMoment, GetNormalizedCentralMoment etc. The question then becomes: how do you segment human figures from the background so that their contour can be found? There are several ways to approach this problem, such as texture segmentation.
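To show what moment-based shape features look like once a figure has been segmented, here is a minimal sketch in plain Python. It does not call OpenCV. It computes raw, central and normalized central moments directly from a binary mask, and the mask and the helper names (`raw_moment`, `shape_features`) are assumptions made for the illustration.

```python
import math


def raw_moment(mask, p, q):
    """Sum of x**p * y**q over all foreground pixels of a binary mask."""
    return sum(
        (x ** p) * (y ** q)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    )


def shape_features(mask):
    """Area, centroid and second-order normalized central moments."""
    area = raw_moment(mask, 0, 0)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    cx = raw_moment(mask, 1, 0) / area
    cy = raw_moment(mask, 0, 1) / area

    def central(p, q):
        # Moments taken about the centroid, so they ignore translation.
        return sum(
            ((x - cx) ** p) * ((y - cy) ** q)
            for y, row in enumerate(mask)
            for x, value in enumerate(row)
            if value
        )

    def normalized(p, q):
        # Dividing by a power of the area also removes the effect of scale.
        return central(p, q) / area ** (1 + (p + q) / 2)

    return {
        "area": area,
        "centroid": (cx, cy),
        "nu20": normalized(2, 0),
        "nu02": normalized(0, 2),
        "nu11": normalized(1, 1),
    }


# Hypothetical segmented blob: a tall, narrow "standing figure".
tall = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
# The same blob moved two pixels to the right.
moved = [[0, 0] + row[:-2] for row in tall]

f_tall = shape_features(tall)
f_moved = shape_features(moved)
print(f_tall)
print(math.isclose(f_tall["nu02"], f_moved["nu02"]))
```

For the tall blob `nu02` (vertical spread) is larger than `nu20` (horizontal spread), and both values stay the same when the blob moves. That translation invariance is what makes such features attractive as classifier input.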
Once you can solve the segmentation problem and calculate reasonable features, the choice of learning algorithm is not really that important. But why not try several and see what works best? Take a look at the Machine Learning section in the OpenCV docs.
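"Try several and see" can be as simple as running each candidate through the same cross-validation loop. The sketch below compares two deliberately tiny classifiers (nearest centroid and 1-nearest-neighbour) with leave-one-out validation. The feature vectors are invented and the classifiers are stand-ins written in plain Python, not the OpenCV ones.

```python
import math


def nearest_centroid(train, query):
    """Predict the label whose mean feature vector is closest to the query."""
    sums = {}
    for features, label in train:
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([t + f for t, f in zip(total, features)], count + 1)
    centroids = {
        label: [t / count for t in total] for label, (total, count) in sums.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


def one_nearest_neighbour(train, query):
    """Predict the label of the single closest training sample."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]


def leave_one_out_accuracy(classify, data):
    """Hold out each sample in turn, train on the rest, count correct guesses."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += classify(rest, features) == label
    return hits / len(data)


# Hypothetical two-dimensional shape features with their labels.
data = [
    ((0.03, 0.16), "human"), ((0.04, 0.15), "human"),
    ((0.03, 0.14), "human"), ((0.05, 0.17), "human"),
    ((0.09, 0.08), "other"), ((0.10, 0.09), "other"),
    ((0.08, 0.10), "other"), ((0.11, 0.07), "other"),
]

results = {
    "nearest centroid": leave_one_out_accuracy(nearest_centroid, data),
    "1-NN": leave_one_out_accuracy(one_nearest_neighbour, data),
}
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.2f}")
```

On well-separated toy data like this both classifiers get every sample right, which is the point made above: with good features the choice of learner matters much less.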
This problem is too hard for a normal ANN.
ANNs aren't really well suited to images with lots of spatial transformations (e.g. human figures in different positions). They effectively need to learn each possible position independently, since they can't generalise well over translations, rotations, scaling and so on. Even if you managed to make it work, you'd probably need billions of training images and years of training time.
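A toy illustration of the translation problem, assuming a plain fully connected unit that sees the image as one flat vector of pixels: a unit whose weights match a figure in one position gives no response at all once the figure moves a few pixels. The 4x6 "figure" below is made up for the example.

```python
def flatten(image):
    """Turn a 2-D image into the flat input vector a dense layer would see."""
    return [value for row in image for value in row]


def shift_right(image, k):
    """Move the image content k pixels to the right, padding with zeros."""
    return [[0] * k + row[: len(row) - k] for row in image]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical tiny figure occupying the left half of the frame.
figure = [
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
]
shifted = shift_right(figure, 3)

# A unit whose weights are a perfect template for the original position.
weights = flatten(figure)

response_original = dot(weights, flatten(figure))
response_shifted = dot(weights, flatten(shifted))
print(response_original, response_shifted)
```

The response drops from 7 to 0 even though the figure itself is unchanged, so the network has to learn a separate set of weights for every position it is expected to recognise.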
Your best bet is probably to go with either:
Footnotes:
1 For the nitpickers: Without a highly complex classifier.
2 You can also employ a cascade of boosted classifiers to gain speed without giving away too much in detection rate.
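The speed gain mentioned in footnote 2 comes from early rejection: cheap stages run first, and most windows are discarded before the expensive stages are evaluated. Here is a minimal sketch of that control flow. Each stage is a small weighted vote of threshold tests standing in for a boosted classifier, and the feature names, cut-offs and weights are all invented for illustration.

```python
def stage_score(window, stumps):
    """Weighted vote of threshold tests (a stand-in for one boosted stage)."""
    return sum(alpha for feature, cut, alpha in stumps if window[feature] > cut)


def run_cascade(window, stages):
    """Return (accepted, number_of_stages_evaluated) for one window."""
    for index, (stumps, threshold) in enumerate(stages, start=1):
        if stage_score(window, stumps) < threshold:
            return False, index  # rejected early; later stages never run
    return True, len(stages)


# Hypothetical stages, ordered from cheapest to most thorough.
STAGES = [
    ([("edge_density", 0.2, 1.0)], 1.0),
    ([("aspect_ratio", 1.5, 0.6), ("symmetry", 0.5, 0.4)], 0.6),
    ([("aspect_ratio", 1.8, 0.5), ("symmetry", 0.7, 0.5),
      ("edge_density", 0.3, 0.5)], 1.0),
]

# Hypothetical precomputed features for three candidate windows.
windows = {
    "sky": {"edge_density": 0.05, "aspect_ratio": 1.0, "symmetry": 0.9},
    "crate": {"edge_density": 0.50, "aspect_ratio": 1.0, "symmetry": 0.9},
    "soldier": {"edge_density": 0.45, "aspect_ratio": 2.2, "symmetry": 0.8},
}

outcomes = {name: run_cascade(w, STAGES) for name, w in windows.items()}
for name, (accepted, used) in outcomes.items():
    print(f"{name}: accepted={accepted}, stages evaluated={used}")
```

In this toy run the empty sky window is thrown out after one stage and the crate after two; only the soldier window pays for all three stages. Since most windows in a frame contain no figure, the average cost per window stays low.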
It's not crystal clear to me what you are trying to accomplish, but it seems that you are trying to do real-time player tracking (or something similar) using the wrong approach. Human tracking is something that one would expect to be done through digital image/video processing of pictures of real human beings.
Depending on your purpose, player tracking is something that should not be done through image processing, because it can be very demanding on the CPU. Tracking player models inside a game is a practice usually associated with cheating applications, and it requires one to either inject code into the game process or act as the middleman between the game engine and the graphics driver. Since the game client always knows where the other players are (even if you cannot see them), one could search the process memory for the X,Y,Z coordinates of the players, or intercept graphics rendering calls to find the location where a player model will be rendered on the screen (which can be a little tricky, since it requires a basic understanding of OpenGL/DirectX and debugging skills).
I'm not sure if it's OK to detail such techniques on StackOverflow, but I will say that this topic has been widely discussed on several reverse engineering/cheating forums like GameDeception.