Solving error “delimiter must be a 1-character string” while writing a dataframe to a csv file

孤街醉人 submitted on 2019-12-11 06:26:09

Question


Using this question: Pandas writing dataframe to CSV file as a model, I wrote the following code to make a csv file:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)

But it returns the following error:

TypeError: "delimiter" must be an 1-character string

I have looked up the documentation for this here http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.


Answer 1:


Note that although the solution to this error was to use a single-character string instead of a regex, pandas also raises this error when from __future__ import unicode_literals is in effect, even with valid unicode characters. As of 2015-11-16, release 0.16.2, this error is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035

For example, where df is a pandas DataFrame:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')

TypeError: "delimiter" must be an 1-character string

Using a byte literal (b'\t') in the encoding declared by -*- coding: utf-8 -*- (utf-8 is the default with Python 3) will resolve this in pandas 0.16.2. I haven't tested with previous versions or 0.17.0.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')

(Note that with versions 0.13.0 - ???, it was necessary to use from pandas.compat import u; but by 0.16.2 the byte literal is the way to go.)




Answer 2:


As mentioned in the issue discussion, this is not considered a pandas issue but rather a compatibility issue between Python's csv module and Python 2.x.
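The one-character check itself can be seen in the csv module without pandas. The following is a minimal sketch, assuming Python 3; try_delimiter is a hypothetical helper written for this illustration, not part of any library. It shows that a literal single character is accepted while a regex-style string such as '\s+' is rejected with a TypeError.

```python
import csv
import io


def try_delimiter(delimiter):
    """Return None if csv.writer accepts the delimiter, else the error message.

    Hypothetical helper for illustration only.
    """
    try:
        # csv.writer validates the dialect options when it is constructed.
        csv.writer(io.StringIO(), delimiter=delimiter)
    except TypeError as exc:
        return str(exc)
    return None


# Single literal characters are accepted.
print(try_delimiter(','))
print(try_delimiter('\t'))

# A multi-character string (here a regex pattern) is rejected.
print(try_delimiter(r'\s+'))
```

On Python 2 with unicode_literals, the same check rejects a unicode separator even when it is one character long, which is the case discussed in this answer.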

The workaround to solve it is to enclose the separator with str(..). For example, here is how you can reproduce the problem, and then solve it:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')

This will raise the following error:

TypeError ....              
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string

The following, however, produces the expected result:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))

Output:

',0,1\n0,a,A\n1,b,B\n'

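In your case, note that '\s+' is a regular expression three characters long, so wrapping it in str(..) alone will not help; to_csv needs a literal one-character delimiter, such as a single space or a tab:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str(' '), header=True)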
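One likely source of the confusion is that pandas.read_csv accepts a regular expression such as '\s+' as sep, whereas to_csv requires a literal single character. The sketch below assumes Python 3 and a recent pandas; the column names and data are made up for illustration. It writes with a single space and reads the text back with the regex separator.

```python
import io

import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({'name': ['a', 'b'], 'value': [1, 2]})

# Writing: the separator must be one literal character.
buf = io.StringIO()
df.to_csv(buf, sep=' ', header=True, index=False)
text = buf.getvalue()

# Reading: a regex separator is allowed here, so r'\s+' matches
# any run of whitespace between fields.
restored = pd.read_csv(io.StringIO(text), sep=r'\s+')

print(text)
print(restored)
```

A single-space delimiter only round-trips cleanly when the values themselves contain no whitespace; a tab ('\t') is often a safer choice for free-text columns.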


Source: https://stackoverflow.com/questions/21005059/solving-error-delimiter-must-be-a-1-character-string-while-writing-a-dataframe
