Question
I am importing data from MS Excel into PostgreSQL in Python (2.6) using pyodbc.
The problem: the Excel source contains characters such as the left single quotation mark (ANSI hex code 0x91). When this data is imported into PostgreSQL using pyodbc, the import terminates with the error DatabaseError: invalid byte sequence for encoding "UTF8": 0x91.
What I tried: I used decode('unicode_escape') for the time being. But this is not acceptable, as it simply removes/escapes the character concerned.
Alternative: decode on input, use Unicode everywhere, and encode again only when needed on the way out of the database. This is also not feasible, given the size of the project at hand.
Please suggest a method, procedure, or built-in function to accomplish the task.
Answer 1:
Find out the real encoding of the source document. It might be WIN1251. Either transcode it (for instance with iconv) or set the client_encoding of PostgreSQL accordingly.
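The transcoding option can also be done in Python itself rather than with iconv. The following is only a minimal sketch: the helper name transcode_to_utf8 and the sample bytes are made up for illustration, and cp1251 (Python's name for WIN1251) is an assumption that should be replaced with whatever encoding the Excel data really uses. In both cp1251 and cp1252, byte 0x91 is the left single quotation mark.

```python
# -*- coding: utf-8 -*-

def transcode_to_utf8(raw_bytes, source_encoding="cp1251"):
    """Decode bytes from the assumed source encoding and re-encode as UTF-8.

    source_encoding is an assumption; use the real encoding of the data.
    """
    return raw_bytes.decode(source_encoding).encode("utf-8")


# Hypothetical sample: a word wrapped in 0x91 / 0x92 "smart" quotes.
raw = b"\x91quoted\x92"
utf8_bytes = transcode_to_utf8(raw)
```

The resulting bytes are valid UTF-8, so they can be sent to a server whose client_encoding is UTF8 without triggering the "invalid byte sequence" error.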
If pyodbc has no setting for this (I don't know whether it does), you can always issue a plain SQL command:
SET CLIENT_ENCODING TO 'WIN1251';
More in the chapter "Automatic Character Set Conversion Between Server and Client" of the manual.
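Issuing that command from Python could look roughly like the sketch below. The helper set_client_encoding is hypothetical, not part of pyodbc; it relies only on the standard cursor(), execute() and close() calls of a DB-API connection. Whether the ODBC driver in use honours the setting is an assumption to verify against your own setup.

```python
def set_client_encoding(connection, encoding="WIN1251"):
    """Tell PostgreSQL which encoding the client sends (hypothetical helper).

    connection is any DB-API style connection, e.g. one from pyodbc.connect().
    """
    # SET does not accept bound parameters, so validate the name before
    # interpolating it into the statement.
    if not encoding.replace("_", "").isalnum():
        raise ValueError("unexpected encoding name: %r" % (encoding,))
    cursor = connection.cursor()
    try:
        cursor.execute("SET CLIENT_ENCODING TO '%s'" % encoding)
    finally:
        cursor.close()


# Usage, assuming a DSN named "my_postgres_dsn" exists (made-up name):
#   import pyodbc
#   conn = pyodbc.connect("DSN=my_postgres_dsn")
#   set_client_encoding(conn, "WIN1251")
```

Once the setting is in effect, the server converts incoming WIN1251 bytes to its own encoding, so the existing import code would not need to decode anything itself.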
Source: https://stackoverflow.com/questions/8238617/import-data-from-excel-to-postgres-in-python-using-pyodbc