How to check if a CSV has a header using Python?

再見小時候 2021-01-17 22:54

I have a CSV file and I want to check if the first row has only strings in it (i.e. a header). I'm trying to avoid using any extras like pandas etc. I'm thinking I'll use a

5 answers
  •  伪装坚强ぢ
    2021-01-17 23:28

    Here is a function I use with pandas to decide whether header should be set to 'infer' or None:

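    import pandas as pd

    def identify_header(path, n=5, th=0.9):
        # Read the same small sample twice: first row as header, then as data.
        df1 = pd.read_csv(path, header='infer', nrows=n)
        df2 = pd.read_csv(path, header=None, nrows=n)
        # Fraction of columns whose dtype is identical in both reads.
        sim = (df1.dtypes.values == df2.dtypes.values).mean()
        return 'infer' if sim < th else None
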

    Based on a small sample of n rows, the function compares the column dtypes pandas infers with and without a header row. If the dtypes match for at least a fraction th of the columns, it assumes there is no header and returns None; otherwise it returns 'infer'. I found a threshold of 0.9 to work well for my use cases. The function is also fairly fast, as it only reads a small sample of the CSV file.
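
    Since the question asks to avoid pandas, the same idea can be roughly sketched with only the standard library. This is not the function above, just an approximation of it: the names has_header and is_number are hypothetical, "dtype" is reduced to "every cell parses as a float" versus "text", and the input is assumed to be CSV text already read into a string. Like the pandas version, it cannot detect a header when every column is text, because nothing changes between the two readings.

```python
import csv
import io
from itertools import islice


def is_number(cell):
    """Return True if the cell text parses as a float."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def has_header(text, n=5, th=0.9):
    """Guess whether the first row of CSV text is a header (heuristic)."""
    # Sample the first row plus up to n rows of data.
    rows = list(islice(csv.reader(io.StringIO(text)), n + 1))
    if len(rows) < 2 or not rows[0]:
        return False  # not enough data to compare
    first = rows[0]
    # Ignore ragged rows so that column indexing stays safe.
    body = [r for r in rows[1:] if len(r) == len(first)]
    if not body:
        return False
    matches = 0
    for i, cell in enumerate(first):
        body_numeric = all(is_number(r[i]) for r in body)
        # A column's type changes only when a text cell sits on top of
        # otherwise numeric data, which is what a header looks like.
        if not (body_numeric and not is_number(cell)):
            matches += 1
    sim = matches / len(first)
    return sim < th
```

    To use it on a file, read the text first, for example with open(path, newline='') and f.read(). The n and th parameters mirror those of identify_header, so the 0.9 threshold carries over as a starting point only.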
