Bayesian filters are designed for exactly this purpose. It's the technique you'll find in most tools that identify spam.
For example, to detect a language (from http://sebsauvage.net/python/snyppets/#bayesian):
from reverend.thomas import Bayes
guesser = Bayes()
guesser.train('french','La souris est rentrée dans son trou.')
guesser.train('english','my tailor is rich.')
guesser.train('french','Je ne sais pas si je viendrai demain.')
guesser.train('english','I do not plan to update my website soon.')
>>> print guesser.guess('Jumping out of cliffs it not a good idea.')
[('english', 0.99990000000000001), ('french', 9.9999999999988987e-005)]
>>> print guesser.guess('Demain il fera très probablement chaud.')
[('french', 0.99990000000000001), ('english', 9.9999999999988987e-005)]
But it works for any category you train it on: technical text, songs, jokes, etc., as long as you can provide enough material to let the tool learn what your documents look like.
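To give a rough idea of what such a filter does under the hood, here is a minimal sketch of a naive Bayes text classifier in plain Python 3. It is not Reverend's actual implementation: the class name `TinyBayes`, the word tokenizer and the add-one (Laplace) smoothing are my own assumptions, chosen only to mirror the `train`/`guess` interface used above.

```python
import math
import re
from collections import Counter, defaultdict


class TinyBayes:
    """Hypothetical minimal naive Bayes classifier (illustration only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()                       # every word seen during training

    @staticmethod
    def tokenize(text):
        # Lowercase and split into word tokens (an assumed, simplistic tokenizer).
        return re.findall(r"\w+", text.lower())

    def train(self, label, text):
        tokens = self.tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def guess(self, text):
        if not self.doc_counts:
            return []  # nothing learned yet
        tokens = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.doc_counts:
            counts = self.word_counts[label]
            # Add-one smoothing so unseen words do not zero out the score.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)  # prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)       # likelihood
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {lbl: math.exp(s - top) for lbl, s in log_scores.items()}
        norm = sum(weights.values())
        ranked = [(lbl, w / norm) for lbl, w in weights.items()]
        return sorted(ranked, key=lambda pair: pair[1], reverse=True)


guesser = TinyBayes()
guesser.train('french', 'La souris est rentrée dans son trou.')
guesser.train('english', 'my tailor is rich.')
guesser.train('french', 'Je ne sais pas si je viendrai demain.')
guesser.train('english', 'I do not plan to update my website soon.')
print(guesser.guess('I do not know my website'))
```

With only four training sentences the scores are not very meaningful, but the most likely label comes first in the returned list, and any other set of categories can be trained the same way.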