Pandas - Writing an excel file containing unicode - IllegalCharacterError

离开以前 2021-02-07 17:05

I have the following code:

import pandas as pd

x = [u'string with some unicode: \x16']
df = pd.DataFrame(x)

If I try to write this datafram

7 answers

独厮守ぢ (original poster)

2021-02-07 18:07

I answered a similar question in this post: https://stackoverflow.com/a/63950544/1851492. The same content follows.

If you don't want to install another Excel writer engine (e.g. xlsxwriter), you can remove the illegal characters yourself, using the same pattern that openpyxl checks before it raises IllegalCharacterError.

Open cell.py under the path /path/to/your/python/site-packages/openpyxl/cell/ and look for the check_string function. It uses a compiled regular expression, ILLEGAL_CHARACTERS_RE, to find the illegal characters. Its definition is this line:

ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')

This pattern is all you need to remove those characters. Copy the line into your program (together with import re) and run the code below before writing your dataframe to Excel:

dataframe = dataframe.applymap(lambda x: ILLEGAL_CHARACTERS_RE.sub(r'', x) if isinstance(x, str) else x)

The line above removes those characters from every string cell and leaves non-string cells unchanged.
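
As a minimal sketch of the same idea without pandas, the snippet below wraps the pattern in a small helper and runs it on a few sample values. The helper name strip_illegal_chars and the sample list are assumptions made for illustration; only the pattern itself comes from openpyxl.

```python
import re

# Pattern copied from the answer above: the control characters openpyxl rejects.
ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')


def strip_illegal_chars(value):
    """Remove illegal control characters from strings; pass other types through."""
    if isinstance(value, str):
        return ILLEGAL_CHARACTERS_RE.sub('', value)
    return value


# Hypothetical sample cells: the string from the question, a number, and a
# string whose tab and newline are legal and should be kept.
cells = ['string with some unicode: \x16', 42, 'tab\tand\nnewline']
cleaned = [strip_illegal_chars(c) for c in cells]
print(cleaned)
```

With a dataframe, the helper can be passed in place of the lambda: dataframe = dataframe.applymap(strip_illegal_chars). Newer pandas releases (2.1 and later, as far as I know) deprecate applymap in favor of DataFrame.map, so the method name may need adjusting for your version.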
