I have written a simplified version to demonstrate the problem. I am encoding special characters as UTF-8 and UTF-16.
With utf-8 encoding there is no problem, whe
The answer to the problem was given by @tripleee.
Specifying utf-16le or utf-16be instead of utf-16 resolved the problem.
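The description of the original problem is cut off above, so the following is only a sketch of one well-known difference between these codecs, not necessarily the exact cause here: Python's plain utf-16 codec prepends a byte order mark (BOM) when encoding, while utf-16le and utf-16be write the code units directly without one. The sample text and variable names are illustrative assumptions, and the code runs on both Python 2 and Python 3.

```python
import codecs

# Illustrative sample text (an assumption): euro sign, space, A/O/U umlauts.
text = u"\u20ac \xc4\xd6\xdc"

# Plain "utf-16" uses the platform's native byte order and prepends a BOM.
with_bom = text.encode("utf-16")

# The explicit-endianness codecs emit no BOM.
little = text.encode("utf-16le")
big = text.encode("utf-16be")

# True when the first two bytes are one of the two UTF-16 BOMs.
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# Number of extra bytes that the BOM adds to the output.
bom_overhead = len(with_bom) - len(little)
```

Here `has_bom` tells whether the plain utf-16 output starts with a BOM, and `bom_overhead` shows how many extra bytes that adds compared with the explicit-endianness codecs.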
Sample of solution:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
import chardet


def myEncode(s, pattern):
    try:
        # str.strip() returns a new string, so keep the result
        s = s.strip()
        # Decode the byte string with the given codec, then encode it again
        u = unicode(s, pattern)
        encoded = u.encode(pattern, 'strict')
        # Show which encoding chardet guesses for the resulting bytes
        print chardet.detect(encoded)
        return encoded
    except UnicodeDecodeError as err:
        return "UnicodeDecodeError: ", err
    except Exception as err:
        return "ExceptionError: ", err


print myEncode(r"""Test !"#$%&'()*+-,./:;<=>?@[\]?_{@}~& € ÄÖÜ äöüß £¥§""",
               'utf-8')
print myEncode(r"""Test !"#$%&'()*+-,./:;<=>?@[\]?_{@}~& € ÄÖÜ äöüß £¥§""",
               'utf-16be')
Sample of output:
{'confidence': 0.99, 'language': '', 'encoding': 'utf-8'}
Test !"#$%&'()*+-,./:;<=>?@[\]?_{@}~& € ÄÖÜ äöüß £¥§
{'confidence': 0.99, 'language': '', 'encoding': 'utf-8'}
Test !"#$%&'()*+-,./:;<=>?@[\]?_{@}~& € ÄÖÜ äöüß £¥§
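
Note that chardet reports utf-8 for both calls. That is consistent with the sample decoding and re-encoding with the same codec, which hands back the original UTF-8 bytes of the source file. If the intent is to convert those bytes to a different encoding, a minimal sketch could look like the following. It assumes the input bytes are UTF-8; the helper name `transcode` and the sample text are hypothetical and not part of the original code. It runs on both Python 2 and Python 3.

```python
def transcode(raw, source_encoding, target_encoding):
    """Decode raw bytes with source_encoding, then encode with target_encoding."""
    return raw.decode(source_encoding).encode(target_encoding, "strict")


# Illustrative input (an assumption): UTF-8 bytes for a euro sign, a space,
# the lowercase a/o/u umlauts and a sharp s.
sample = u"\u20ac \xe4\xf6\xfc\xdf"
raw_utf8 = sample.encode("utf-8")

# Convert UTF-8 bytes to big-endian UTF-16 without a byte order mark.
as_utf16be = transcode(raw_utf8, "utf-8", "utf-16be")

# Using the same codec on both sides returns the input bytes unchanged.
round_trip = transcode(raw_utf8, "utf-8", "utf-8")
```

Decoding `as_utf16be` with utf-16be gives back the original text, while `round_trip` is byte-for-byte identical to `raw_utf8`.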