I'm looking for pseudocode or sample code to convert high-bit (extended ASCII) characters, such as Ü (extended ASCII 154), into their plain ASCII equivalents, such as U (ASCII 85).
My initial gues
It really depends on the nature of your source strings. If you know the string's encoding, and you know that it's an 8-bit encoding — for example, ISO Latin 1 or similar — then a simple static array is sufficient:
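static const char xlate[256] = { ..., [(unsigned char)'é'] = 'e', ..., [(unsigned char)'Ü'] = 'U', ... };
new_c = xlate[(unsigned char)old_c];  /* cast so a signed char never yields a negative index */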
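To make the table idea concrete, here is a minimal sketch that assumes the input is ISO 8859-1 (Latin-1). The names `fold_table`, `init_fold_table`, and `fold_latin1` are made up for illustration, and only a handful of characters are filled in; everything else passes through unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Fold table for ISO 8859-1 bytes; filled in by init_fold_table().
   These names are illustrative and do not come from any library. */
static unsigned char fold_table[256];

static void init_fold_table(void)
{
    /* Start from the identity mapping so unlisted bytes pass through. */
    for (int i = 0; i < 256; i++)
        fold_table[i] = (unsigned char)i;

    /* A few Latin-1 code points; extend the list as needed. */
    fold_table[0xC0] = 'A';  /* À */
    fold_table[0xC9] = 'E';  /* É */
    fold_table[0xDC] = 'U';  /* Ü */
    fold_table[0xE9] = 'e';  /* é */
    fold_table[0xFC] = 'u';  /* ü */
}

/* Replace each byte of a NUL-terminated Latin-1 string in place. */
static void fold_latin1(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)fold_table[(unsigned char)*s];
}
```

Call `init_fold_table()` once at startup, then `fold_latin1(buffer)` on each string. Filling the table at run time avoids relying on how the compiler encodes non-ASCII character constants in the source file.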
On the other hand, if you have a different encoding, or if you're using UTF-8 encoded strings, you will probably find the functions in the ICU library very helpful.
I think you just can't.
I usually do something like this:
AccentString = 'ÀÂÄÉÈÊ[and all the other]'
ConvertString = 'AAAEEE[and all the other]'
Then look up each character in AccentString and, if it is found, replace it with the character at the same index in ConvertString.
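In C, the two-string approach might look roughly like this. It is only a sketch: it assumes Latin-1 input, lists just the six characters shown above, and the names `ACCENTED`, `PLAIN`, and `strip_accents` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Parallel strings: the byte at index i in ACCENTED maps to index i in PLAIN.
   Bytes are ISO 8859-1, written as escapes so the encoding of the source
   file itself does not matter. Only six characters are listed here. */
static const char ACCENTED[] = "\xC0\xC2\xC4\xC9\xC8\xCA";  /* À Â Ä É È Ê */
static const char PLAIN[]    = "AAAEEE";

/* Replace accented bytes of a NUL-terminated Latin-1 string in place. */
static void strip_accents(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Find the current byte in the accented list, if present. */
        const char *hit = strchr(ACCENTED, *s);
        if (hit != NULL)
            *s = PLAIN[hit - ACCENTED];
    }
}
```

This does a linear search per character, so it is slower than a 256-entry table, but the mapping is easy to read and extend.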
Most languages have a standard way to replace accented characters with plain ASCII, but the rules depend on the language, and they often replace a single accented character with two ASCII ones. For example, in German ü becomes ue. So if you want to handle natural languages properly, it's a lot more complicated than you think it is.
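Because one input character can become two output characters, an in-place replacement no longer works and you need a separate output buffer. A rough sketch for German, again assuming Latin-1 input: the function name `german_to_ascii` is invented for this example, and the ß to ss rule is my addition to the umlaut rules mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Transliterate German umlauts and sharp s in a Latin-1 string.
   Returns the number of bytes written (excluding the NUL), or
   (size_t)-1 if dst is too small to hold the result. */
static size_t german_to_ascii(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;

    for (; *src != '\0'; src++) {
        char single[2] = { *src, '\0' };  /* default: copy the byte as-is */
        const char *rep;

        switch ((unsigned char)*src) {
        case 0xC4: rep = "Ae"; break;  /* Ä */
        case 0xD6: rep = "Oe"; break;  /* Ö */
        case 0xDC: rep = "Ue"; break;  /* Ü */
        case 0xE4: rep = "ae"; break;  /* ä */
        case 0xF6: rep = "oe"; break;  /* ö */
        case 0xFC: rep = "ue"; break;  /* ü */
        case 0xDF: rep = "ss"; break;  /* ß */
        default:   rep = single; break;
        }

        /* Keep one byte free for the terminating NUL. */
        size_t n = strlen(rep);
        if (out + n + 1 > dst_size)
            return (size_t)-1;
        memcpy(dst + out, rep, n);
        out += n;
    }

    if (out + 1 > dst_size)
        return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

A destination buffer of twice the source length plus one byte is always large enough for this mapping. Other languages would need their own rule tables, which is where a library starts to pay off.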
A lookup array is probably the simplest and fastest way to accomplish this. The same technique is commonly used to convert, say, ASCII to EBCDIC.
Is converting Ü to U really what you want to do? I don't know about other languages, but in German Ü would become Ue, ö would become oe, and so on.
Try the uni2ascii program.