I have lots of strings containing text in lots of different spellings. I am tokenizing these strings by searching for keywords, and if a keyword is found I use an associated text.
I suggest two approaches:
1) Tokenise the string using string.Split and match each token against a Dictionary of your keywords.
2) Implement the tokeniser yourself: a reader with a ReadToken() method that adds characters to a buffer until it finds a split character, then outputs the buffer as a token (Split is probably doing much the same thing). You then check each token against your dictionary.
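
Here is a rough sketch of both approaches, in case it helps. Everything named here is my own assumption and not something from your code: the KeywordTokenizer class, the example keyword table, the separator set, and the choice of a case-insensitive comparer. ReadToken is written against a TextReader and returns null once the input is exhausted, which is just one possible convention.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class KeywordTokenizer
{
    // Hypothetical keyword table: each spelling variant maps to its associated text.
    // The case-insensitive comparer is an assumption about your data.
    public static readonly Dictionary<string, string> Keywords =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "colour", "color" },
            { "color", "color" },
            { "grey", "gray" },
        };

    // Assumed split characters; adjust to whatever delimits tokens in your strings.
    private static readonly char[] Separators = { ' ', ',', '.', ';', '\t', '\n', '\r' };

    // Approach 1: split the whole string, then look each token up in the dictionary.
    public static List<string> MatchWithSplit(string input)
    {
        var result = new List<string>();
        foreach (string token in input.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            if (Keywords.TryGetValue(token, out string text))
                result.Add(text);
        }
        return result;
    }

    // Approach 2: buffer characters until a split character is found,
    // then return the buffer as one token. Returns null at end of input.
    public static string ReadToken(TextReader reader)
    {
        var buffer = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (Array.IndexOf(Separators, (char)c) >= 0)
            {
                if (buffer.Length > 0)
                    break;      // a split character ends the current token
                continue;       // skip split characters before a token starts
            }
            buffer.Append((char)c);
        }
        return buffer.Length > 0 ? buffer.ToString() : null;
    }

    // Drives ReadToken over the input and checks each token against the dictionary.
    public static List<string> MatchWithReader(string input)
    {
        var result = new List<string>();
        using (var reader = new StringReader(input))
        {
            string token;
            while ((token = ReadToken(reader)) != null)
            {
                if (Keywords.TryGetValue(token, out string text))
                    result.Add(text);
            }
        }
        return result;
    }
}
```

Both methods should give the same matches for the same input. The Split version is shorter, while the reader version processes the text one character at a time, so it can work on a stream without loading everything first and leaves room for custom rules about where a token ends.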