Question
I have a problem detecting CP1250 with mb_detect_encoding(). In my case I want to detect three encodings:
mb_detect_encoding($string, 'UTF-8,ISO-8859-2,Windows-1250')
But Windows-1250 isn't among the supported encodings. Is there any solution?
Answer 1:
mb_detect_encoding always "detects" single-byte encodings. You can read about this in the documentation for mb_detect_order:
mbstring currently implements the following encoding detection filters. If there is an invalid byte sequence for the following encodings, encoding detection will fail.
UTF-8, UTF-7, ASCII, EUC-JP, SJIS, eucJP-win, SJIS-win, JIS, ISO-2022-JP
For ISO-8859-X, mbstring always detects as ISO-8859-X.
For UTF-16, UTF-32, UCS2 and UCS4, encoding detection will fail always.
Conclusions:
- It's meaningless to ask for detection of ISO-8859-2; it will always tell you "yes, that's it" (unless of course it detects UTF-8 first).
- Windows-1250 is not supported, but even if it were it would work exactly like ISO-8859-2.
In general, it is impossible to detect single-byte encodings with accuracy. If you find yourself needing to do that in PHP you will need to do it manually; don't expect very good results.
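As a rough illustration of what "doing it manually" might look like, here is a minimal sketch. The function name guessEncoding is hypothetical, and the heuristic is an assumption rather than anything mbstring provides: bytes 0x80-0x9F are C1 control codes in ISO-8859-2, which rarely occur in real text, while Windows-1250 assigns printable characters such as Š, š, Ž and ž to most of that range. Text that uses none of those bytes stays ambiguous, so the final fallback is only a guess.

```php
<?php
// Hypothetical helper: guesses between UTF-8, Windows-1250 and ISO-8859-2.
// This is a heuristic sketch, not a reliable detector.
function guessEncoding(string $bytes): string
{
    // Valid UTF-8 (which includes plain ASCII) is checked first, because
    // random single-byte text is unlikely to form valid UTF-8 sequences.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }

    // Bytes 0x80-0x9F are control codes in ISO-8859-2 but mostly printable
    // characters in Windows-1250, so their presence hints at Windows-1250.
    // Without the /u modifier, preg_match works on raw bytes.
    if (preg_match('/[\x80-\x9F]/', $bytes) === 1) {
        return 'Windows-1250';
    }

    // Nothing distinguishes the two encodings here; this is a pure guess.
    return 'ISO-8859-2';
}
```

For example, guessEncoding("\x9E") yields 'Windows-1250' (ž in that code page), while guessEncoding("\xBE") falls through to 'ISO-8859-2' even though the byte is also a valid Windows-1250 character (ľ). That ambiguity is exactly why the results should not be trusted too far.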
Answer 2:
It is not feasible to distinguish ISO-8859-2 from Windows-1250, or any other single-byte encoding from any other encoding for that matter. mb_detect_encoding simply gives you the first encoding which is valid for the given string, and both are equally valid. "Detecting" encodings is by definition not possible with any amount of accuracy.
Source: https://stackoverflow.com/questions/17104340/mb-detect-encoding-doesnt-properly-working-with-windows-1250-cp1250