I am looking at an algorithm that can map between characters with diacritics (tilde, circumflex, caret, umlaut, caron) and their \"simple\" character.
For example:>
There is a draft report on character folding on the unicode website which has a lot of relevant material. See specifically Section 4.1. "Folding algorithm".
Here's a discussion and implementation of diacritic marker removal using Perl.
These existing SO questions are related: