How do I translate 8-bit characters into 7-bit characters? (e.g. Ü to U)

I'm looking for pseudocode or sample code to convert higher-bit ASCII characters (such as Ü, which is extended ASCII 154) into U (which is ASCII 85).

My initial gues

15 answers

北恋

2020-12-05 11:08
It really depends on the nature of your source strings. If you know the string's encoding, and you know that it's an 8-bit encoding — for example, ISO Latin 1 or similar — then a simple static array is sufficient:
```
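/* Indexed by byte value; assumes input in ISO Latin 1 (é = 0xE9, Ü = 0xDC). */
static const unsigned char xlate[256] = { ..., [0xE9] = 'e', ..., [0xDC] = 'U', ... };
...
new_c = xlate[(unsigned char)old_c];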
```
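For comparison, here is a minimal sketch of the same table idea in Python, assuming the input really is ISO Latin 1. The names `XLATE` and `to_7bit` and the handful of mapped characters are made up for illustration; a real table would need an entry for every byte you care about.

```python
# Build a 256-entry byte-to-byte table for ISO Latin 1 input.
# Start from the identity mapping so unmapped bytes pass through unchanged.
table = bytearray(range(256))

# Hypothetical, deliberately partial mapping: extend it as needed.
for accented, plain in {"é": "e", "Ü": "U", "ü": "u", "ñ": "n"}.items():
    # For Latin 1 the byte value equals the Unicode code point,
    # so ord("Ü") == 0xDC is also the table index.
    table[ord(accented)] = ord(plain)

XLATE = bytes(table)


def to_7bit(data: bytes) -> bytes:
    """Translate Latin 1 bytes through the lookup table."""
    return data.translate(XLATE)


print(to_7bit("Über café".encode("latin-1")))  # b'Uber cafe'
```

The table has to match the actual encoding of the data: Ü is byte 154 in code page 437 (the value quoted in the question) but byte 220 in ISO Latin 1, so a table built for one will not work for the other.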
On the other hand, if you have a different encoding, or if you're using UTF-8 encoded strings, you will probably find the functions in the ICU library very helpful.
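This is not ICU, but as a rough stand-in for Unicode strings, Python's standard `unicodedata` module can decompose each character and drop the combining marks. The function name `strip_accents` is just an illustration.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Über café"))  # Uber cafe
```

Characters that have no decomposition, such as ß or ø, come through unchanged, so this alone does not guarantee 7-bit output.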
一向

2020-12-05 11:11

I think you just can't.

I usually do something like this:

AccentString = 'ÀÂÄÉÈÊ[and all the other]'
ConvertString = 'AAAEEE[and all the other]'

Then I look up each character in AccentString and replace it with the character at the same index in ConvertString.

HTH
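A minimal sketch of the two-string approach in Python, using only the six characters spelled out above. The constant and function names are hypothetical, and the two strings must stay the same length.

```python
# Hypothetical, partial alphabets: position i in one maps to position i in the other.
ACCENT_STRING = "ÀÂÄÉÈÊ"
CONVERT_STRING = "AAAEEE"


def replace_accents(text: str) -> str:
    """Replace each character found in ACCENT_STRING by its counterpart."""
    out = []
    for ch in text:
        index = ACCENT_STRING.find(ch)  # -1 when ch is not in the accent list
        out.append(CONVERT_STRING[index] if index >= 0 else ch)
    return "".join(out)


print(replace_accents("ÉTÉ À LA FORÊT"))  # ETE A LA FORET
```

In Python the same effect is available without the loop via `text.translate(str.maketrans(ACCENT_STRING, CONVERT_STRING))`.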

小蘑菇

2020-12-05 11:12

Most natural languages have a conventional way to replace accented characters with plain ASCII, but it depends on the language, and it often involves replacing a single accented character with two ASCII ones; in German, for example, ü becomes ue. So if you want to handle natural languages properly, it's a lot more complicated than you might think.
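To illustrate the one-to-many case, here is a small sketch for German only. The table name and function name are made up, and the ß to ss entry is my own addition following the usual German convention; other languages would need their own tables.

```python
# Hypothetical table for German only; each accented letter maps to two ASCII letters.
GERMAN = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def transliterate_german(text: str) -> str:
    """Apply the German replacements; everything else is left alone."""
    return text.translate(GERMAN)


print(transliterate_german("Übergrößenträger"))  # Uebergroessentraeger
```

Note that the output can be longer than the input, which a fixed one-byte-per-character lookup table cannot express.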

轻奢々

2020-12-05 11:14

A lookup array is probably the simplest and fastest way to accomplish this. The same technique is one way to convert between encodings, say ASCII to EBCDIC.

我在风中等你

2020-12-05 11:15

Is converting Ü to U really what you want to do? I don't know about other languages, but in German Ü would become Ue, ö would become oe, etc.

感动是毒

2020-12-05 11:15

Try the uni2ascii program.
