I\'m comparing song titles, using Latin script (although not always), my aim is an algorithm that gives a high score if the two song titles seem to be the same same title and a
You are likely need to solve a string-to-string correction problem. Levenshtein distance algorithm is implemented in many languages. Before running it I'd remove all spaces from string, because they don't contain any sensitive information, but may influence two strings difference. For string search prefix trees are also useful, you can have a look in this direction as well. For example here or here. Was already discussed on SO. If spaces are so much significant in your case, just assign a greater weight to them.