When using the configuration for automatic separator detection to read CSV files (pd.read_csv(file_path, sep=None)), pandas tries to infer the delimiter (or separat
I think you can do this without having to import csv:
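import pandas as pd
# sep=None is only supported by the Python engine; _engine is a private attribute and may change between pandas versions
reader = pd.read_csv(file_path, sep=None, engine="python", iterator=True)
inferred_sep = reader._engine.data.dialect.delimiter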
EDIT: Forgot the iterator=True argument.
If all you want to do is detect the dialect of a CSV file (without loading your data), you can use csv.Sniffer from the standard library:
The Sniffer class is used to deduce the format of a CSV file.
In particular, the sniff method:
sniff(sample, delimiters=None)
Analyze the given sample and return a Dialect subclass reflecting the parameters found. If the optional delimiters parameter is given, it is interpreted as a string containing possible valid delimiter characters.
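To illustrate the delimiters parameter described above, here is a minimal sketch. The sample string and the candidate set ";\t|" are made up for the example; the idea is only that commas inside the values are excluded from consideration because they are not in the candidate list.

```python
import csv

# Hypothetical sample: semicolon-separated fields whose values contain commas.
sample = "name;amount\nalice;1,5\nbob;2,75\n"

# Only the characters in `delimiters` are considered as possible separators.
dialect = csv.Sniffer().sniff(sample, delimiters=";\t|")
print(dialect.delimiter)  # ;
```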
Here's an example of its usage:
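import csv

with open('example.csv', 'r', newline='') as csvfile:
    # Only the first line is sniffed here; a larger sample such as csvfile.read(1024) is usually more reliable
    dialect = csv.Sniffer().sniff(csvfile.readline())
    print(dialect.delimiter)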
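If a single line is too little for the Sniffer to work with, one option is to read a larger chunk and fall back to a default when detection fails. The following is only a sketch: detect_delimiter, sample_size and default are names invented for this example, not part of the csv module.

```python
import csv


def detect_delimiter(path, sample_size=4096, default=","):
    """Return the delimiter csv.Sniffer infers for the file at `path`.

    Hypothetical helper: the name and its parameters are illustrative only.
    """
    # The csv docs recommend newline="" when opening CSV files.
    with open(path, newline="") as csvfile:
        sample = csvfile.read(sample_size)
    try:
        return csv.Sniffer().sniff(sample).delimiter
    except csv.Error:
        # Sniffer raises csv.Error when it cannot determine a delimiter.
        return default
```

The result can then be passed on as the separator, for example as the sep argument of pd.read_csv.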
csv.Sniffer
The Sniffer class is used to deduce the format of a CSV file.
sniff(sample, delimiters=None)
Analyze the given sample and return a Dialect subclass reflecting the parameters found. If the optional delimiters parameter is given, it is interpreted as a string containing possible valid delimiter characters.
Dialect.delimiter
A one-character string used to separate fields. It defaults to ','
import csv
sniffer = csv.Sniffer()
dialect = sniffer.sniff('first, second, third, fourth')
print(dialect.delimiter)
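The Dialect returned by sniff can also be handed straight to csv.reader, so you do not have to pull the delimiter out yourself. A small sketch with made-up, pipe-separated data:

```python
import csv
import io

# Hypothetical in-memory data using "|" as the separator.
text = "first|second|third\n1|2|3\n4|5|6\n"

# Detect the dialect, then parse the same text with it.
dialect = csv.Sniffer().sniff(text)
rows = list(csv.reader(io.StringIO(text), dialect))
print(rows[0])  # ['first', 'second', 'third']
```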