Pandas - Writing an excel file containing unicode - IllegalCharacterError

离开以前 2021-02-07 17:05

I have the following code:

import pandas as pd

x = [u'string with some unicode: \x16']
df = pd.DataFrame(x)

If I try to write this datafram

7 answers
  •  独厮守ぢ
    2021-02-07 18:07

    I answered a similar question in this post: https://stackoverflow.com/a/63950544/1851492. The same content is reproduced below.


    If you don't want to install another Excel writer engine (e.g. xlsxwriter), you can remove the illegal characters yourself, using the same pattern that openpyxl checks before it raises IllegalCharacterError.

    Open cell.py under /path/to/your/python/site-packages/openpyxl/cell/ and look for the check_string function. It uses a compiled regular expression, ILLEGAL_CHARACTERS_RE, to find illegal characters. Its definition is this line:

    ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')

    This pattern is all you need to remove those characters. Copy the line into your program (together with import re) and run the code below before writing your dataframe to Excel:

    dataframe = dataframe.applymap(lambda x: ILLEGAL_CHARACTERS_RE.sub(r'', x) if isinstance(x, str) else x)

    The line above removes those characters from every string cell and leaves non-string cells unchanged.
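
For reference, here is a minimal standalone sketch of the same idea using only the standard library. The pattern is the one quoted above. The helper name strip_illegal is hypothetical and not part of openpyxl or pandas. The pattern covers the ASCII control characters 0x00-0x08, 0x0B-0x0C and 0x0E-0x1F, so tab, newline and carriage return are left alone.

```python
import re

# Pattern as quoted above from openpyxl's cell module. The octal ranges cover
# ASCII control characters except tab (\x09), newline (\x0a) and carriage
# return (\x0d).
ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')


def strip_illegal(value):
    """Hypothetical helper: drop illegal characters from strings only."""
    if isinstance(value, str):
        return ILLEGAL_CHARACTERS_RE.sub('', value)
    # Numbers, None, dates, etc. pass through untouched.
    return value


# The string from the question contains \x16, which falls in the last range.
sample = 'string with some unicode: \x16'
print(repr(strip_illegal(sample)))
```

With pandas, the helper can be passed to applymap in place of the lambda, e.g. dataframe = dataframe.applymap(strip_illegal). On recent pandas versions applymap is deprecated in favour of DataFrame.map, so the method name may need adjusting depending on the installed version.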
