Arabic Character Encoding Issue: UTF-8 versus Windows-1256

野趣味 2021-01-05 11:10

Quick background: I inherited a large SQL dump file containing a combination of English and Arabic text, and (I think) it was originally exported using 'lat

4 Answers
  •  栀梦 (original poster)
    2021-01-05 11:44

    If the document looks right when declared as windows-1256 encoded, then it most probably is windows-1256 encoded. So it was apparently not exported using latin1, which would have been impossible anyway, since latin1 has no Arabic letters.

    If this is just about a single file, then the simplest way is to convert it from windows-1256 encoding to UTF-8 encoding, using e.g. Notepad++. Open the file in it and set the encoding to Arabic, windows-1256 via the File format menu (named Encoding in recent Notepad++ versions). Then select Convert to UTF-8 in the same menu and do File → Save.
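
For a dump too large to open comfortably in an editor, or for several files, the same conversion can be scripted. The following is only a minimal sketch, assuming the file really is windows-1256 throughout; the function names and any file paths are hypothetical, and `cp1256` is Python's codec name for windows-1256.

```python
from pathlib import Path


def cp1256_bytes_to_utf8(data: bytes) -> bytes:
    """Decode bytes as windows-1256 and re-encode the text as UTF-8."""
    # Assumption: every byte in `data` is windows-1256 text.
    text = data.decode("cp1256")
    return text.encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Write a UTF-8 copy of the windows-1256 file `src` to `dst`."""
    # Working on raw bytes avoids any newline translation by the platform.
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(cp1256_bytes_to_utf8(raw))
```

Writing to a separate output path keeps the original dump untouched, so the result can be checked before anything is overwritten. This reads the whole file into memory, which may not suit very large dumps.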

    Windows-1256 and UTF-8 are completely different encodings, so data gets all messed up if you declare windows-1256 data as UTF-8 or vice versa. Only ASCII characters, such as English letters, have the same representation in both encodings.
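
The difference can be checked directly. This short sketch (the sample strings and function names are illustrative, not from the question) compares the two encodings' byte sequences and shows what happens when one is read as the other.

```python
ARABIC = "سلام"
ENGLISH = "hello"


def same_bytes_in_both(text: str) -> bool:
    """True if the text has identical bytes in windows-1256 and UTF-8."""
    return text.encode("cp1256") == text.encode("utf-8")


def utf8_misread_as_cp1256(text: str) -> str:
    """Encode as UTF-8, then wrongly decode those bytes as windows-1256."""
    return text.encode("utf-8").decode("cp1256")


def cp1256_is_valid_utf8(text: str) -> bool:
    """Encode as windows-1256, then test whether the bytes parse as UTF-8."""
    try:
        text.encode("cp1256").decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(same_bytes_in_both(ENGLISH))      # ASCII-only text
    print(same_bytes_in_both(ARABIC))       # Arabic text
    print(utf8_misread_as_cp1256(ARABIC))   # garbled characters
    print(cp1256_is_valid_utf8(ARABIC))     # whether the misread even decodes
```

Windows-1256 stores each of these Arabic letters in one byte while UTF-8 uses two, which is why a wrong declaration produces either garbled characters or a decoding error rather than readable text.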
