I\'ve been scratching my head trying to find a way to solve this problem without having to get into NLP and start training models. I have 2 rather large data sets that should be