We were set an algorithm problem in class today, as a \"if you figure out a solution you don\'t have to do this subject\". SO of course, we all thought we will give it a go.
Create a huge text file like this (read the article for inspiration)
For every word, scan the text and whenever you match that word, count the 'categories' that appear in N (maximum, aka radio) positions left and right of it.
The word is likely to belong in the category with the greatest counter.