I am working on motion detection with non-static camera using opencv. I am using a pretty basic background subtraction and thresholding approach to get a broad sense of all that
I guess you can improve your original attempt by using morphological transformations. Take a look at http://docs.opencv.org/master/d9/d61/tutorial_py_morphological_ops.html#gsc.tab=0. Probably you can deal with a closed set for each entity after that, specially with separate players as you got in your original image.