Retrieve delimiter inferred by read_csv in pandas

When using the configuration for automatic separator detection to read csv files (pd.read_csv(file_path, sep=None)), pandas tries to infer the delimiter (or separat

3 Answers

清酒与你

2021-02-13 04:55
I think you can do this without having to import csv:
```
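import pandas as pd
# sep=None needs the Python engine; _engine is private API and may change between versions
reader = pd.read_csv(file_path, sep=None, engine="python", iterator=True)
inferred_sep = reader._engine.data.dialect.delimiter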
```
EDIT:

Forgot the iterator = True argument.
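
For context, here is a minimal sketch of how the iterator approach might be used end to end: read the inferred delimiter, then still load the data from the same reader. The sample data is made up for illustration, and `_engine.data.dialect` is private pandas internals rather than public API, so this may break between versions.

```python
import io
import pandas as pd

# Hypothetical sample data; a file path should work the same way.
sample = io.StringIO("a;b;c\n1;2;3\n4;5;6\n")

# sep=None triggers delimiter sniffing, which only the Python engine supports.
reader = pd.read_csv(sample, sep=None, engine="python", iterator=True)

# Private attributes: not part of the public pandas API.
inferred_sep = reader._engine.data.dialect.delimiter

# The same reader can still produce the full DataFrame afterwards.
df = reader.read()
reader.close()

print(repr(inferred_sep))
print(df)
```

This way the file is only opened once, at the cost of relying on internals.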
[愿得一人]

2021-02-13 05:02
If all you want to do is detect the dialect of a CSV file (without loading your data), you can use csv.Sniffer from the standard library:

The Sniffer class is used to deduce the format of a CSV file.

In particular, the sniff method:
```
sniff(sample, delimiters=None)
```
Analyze the given sample and return a Dialect subclass reflecting the parameters found. If the optional delimiters parameter is given, it is interpreted as a string containing possible valid delimiter characters.
Here's an example of its usage:
```
import csv

# Sniff the dialect from the first line only, without loading the data
with open('example.csv', 'r', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.readline())
    print(dialect.delimiter)
```
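
A single line can be a thin sample for the sniffer. As a sketch, a small helper could sniff a larger chunk, restrict the candidates through the `delimiters` parameter, and fall back to a default when `Sniffer` raises `csv.Error`. The name `detect_delimiter`, the candidate list, and the 4096-character sample size are my own assumptions, not anything from the csv docs.

```python
import csv
import io


def detect_delimiter(stream, candidates=",;\t|", sample_size=4096, default=","):
    """Guess the delimiter of a seekable CSV text stream (hypothetical helper)."""
    sample = stream.read(sample_size)
    stream.seek(0)  # rewind so the caller can parse from the start
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer raises csv.Error when it cannot determine a delimiter
        return default


# Made-up sample data for illustration
print(detect_delimiter(io.StringIO("name|age\nAda|36\nLinus|54\n")))
```

Because the stream is rewound, it can be handed to `pd.read_csv(stream, sep=...)` right after detection.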
生来不讨喜

2021-02-13 05:10
csv.Sniffer

The Sniffer class is used to deduce the format of a CSV file.

sniff(sample, delimiters=None)

Analyze the given sample and return a Dialect subclass reflecting the parameters found. If the optional delimiters parameter is given, it is interpreted as a string containing possible valid delimiter characters.

Dialect.delimiter

A one-character string used to separate fields. It defaults to ','.
```
import csv

sniffer = csv.Sniffer()
dialect = sniffer.sniff('first, second, third, fourth')
print(dialect.delimiter)  # Python 3 print call
```
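
The returned dialect carries more than the delimiter, and it can be passed straight to `csv.reader`. A small sketch using the same sample string (the variable names are mine):

```python
import csv
import io

text = 'first, second, third, fourth'
dialect = csv.Sniffer().sniff(text)

# The sniffed dialect can be handed to csv.reader as-is.
rows = list(csv.reader(io.StringIO(text), dialect))

print(repr(dialect.delimiter), dialect.skipinitialspace)
print(rows)
```

Here the sniffer also notices the space after each comma, so the parsed fields should come back without leading whitespace.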