I have the following code:
import pandas as pd
x = [u'string with some unicode: \x16']
df = pd.DataFrame(x)
If I try to write this datafram
I answered a similar question in this post: https://stackoverflow.com/a/63950544/1851492. The same content is reproduced below.
If you don't want to install another Excel writer engine (e.g. xlsxwriter), you can remove the illegal characters yourself by finding the pattern that causes IllegalCharacterError to be raised.
Open cell.py, which is under the path /path/to/your/python/site-packages/openpyxl/cell/, and look for the check_string function. You'll see that it uses a compiled regular expression pattern, ILLEGAL_CHARACTERS_RE, to find those illegal characters. If you locate its definition, you'll see this line:
ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')
This line is all you need to remove those characters. Copy it into your program (together with import re) and run the code below before your dataframe is written to Excel:
dataframe = dataframe.applymap(lambda x: ILLEGAL_CHARACTERS_RE.sub(r'', x) if isinstance(x, str) else x)
The line above removes those characters from every cell that holds a string and leaves all other cells unchanged.
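To see what the pattern does and does not remove, here is a minimal standalone sketch using only the re module. The helper name strip_illegal is hypothetical (it is not part of openpyxl or pandas). It wraps the same substitution as the lambda above so it can be tried on plain values first.

```python
import re

# Same pattern as the line quoted above from openpyxl's cell.py.
ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')


def strip_illegal(value):
    """Hypothetical helper: drop the matched control characters from strings.

    Non-string values (numbers, None, dates, ...) are returned unchanged.
    """
    if isinstance(value, str):
        return ILLEGAL_CHARACTERS_RE.sub('', value)
    return value


# The string from the question: \x16 is octal 026, inside the \016-\037 range.
sample = 'string with some unicode: \x16'
cleaned = strip_illegal(sample)
print(repr(cleaned))  # 'string with some unicode: '

# Tab (\011), newline (\012) and carriage return (\015) fall outside the
# pattern's ranges, so ordinary whitespace is kept.
print(repr(strip_illegal('a\tb\nc\rd')))

# Non-string values pass through untouched.
print(strip_illegal(42))
```

Passing the helper to applymap, as in dataframe.applymap(strip_illegal), should behave the same as the one-line lambda. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which takes the same function.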