How to ignore the first line of data when processing CSV data?

庸人自扰 2020-11-22 10:05

I am asking Python to print the minimum number from a column of CSV data, but the top row is the column number, and I don't want Python to take the top row into account. Ho

17 Answers
  • 2020-11-22 10:36

    The newer 'pandas' package might be more relevant than 'csv'. The code below reads a CSV file, by default interpreting the first line as the column header, and finds the minimum of each column.

    import pandas as pd
    
    data = pd.read_csv('all16.csv')  # The first line becomes the column header.
    print(data.min())  # Minimum of each column.
    
  • 2020-11-22 10:37

    Use csv.DictReader instead of csv.reader. If the fieldnames parameter is omitted, the values in the first row of the CSV file are used as field names. You can then access field values with row["1"], and so on.
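    A minimal sketch of this approach, assuming the header row holds the column numbers as the question describes. The column_min helper and the in-memory sample data are hypothetical stand-ins for illustration; with a real file you would pass the object returned by open('all16.csv', newline='') instead.

```python
import csv
import io


def column_min(csv_file, fieldname):
    """Return the smallest value in one column, converted to float."""
    # DictReader consumes the first row as field names, so the header
    # never reaches the float() conversion below.
    reader = csv.DictReader(csv_file)
    return min(float(row[fieldname]) for row in reader)


# Hypothetical sample data standing in for all16.csv.
sample = io.StringIO("0,1,2\n5.0,3.5,9.1\n2.0,7.25,4.0\n8.0,1.5,6.3\n")
print(column_min(sample, "1"))
```

    Field names are always strings here, so the column is looked up as "1" rather than 1.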

  • 2020-11-22 10:43

    You could use an instance of the csv module's Sniffer class to deduce the format of a CSV file and detect whether a header row is present, then use the built-in next() function to skip the first row only when necessary:

    import csv
    
    with open('all16.csv', 'r', newline='') as file:
        has_header = csv.Sniffer().has_header(file.read(1024))
        file.seek(0)  # Rewind.
        reader = csv.reader(file)
        if has_header:
            next(reader)  # Skip header row.
        column = 1
        datatype = float
        data = (datatype(row[column]) for row in reader)
        least_value = min(data)
    
    print(least_value)
    

    Since datatype and column are hardcoded in your example, it would be slightly faster to process the row like this:

        data = (float(row[1]) for row in reader)
    

    Note: the code above is for Python 3.x. For Python 2.x use the following line to open the file instead of what is shown:

    with open('all16.csv', 'rb') as file:
    
  • 2020-11-22 10:43

    My small wrapper library, pyexcel, would do the job as well.

    >>> import pyexcel as pe
    >>> data = pe.load('all16.csv', name_columns_by_row=0)
    >>> min(data.column[1])
    

    Alternatively, if you know the header name of the column at index 1, for example "Column 1", you can use the name instead:

    >>> min(data.column["Column 1"])
    
  • 2020-11-22 10:46

    For me, the easiest way is to read all the rows into a list and use range to start from the second one.

    import csv
    
    with open('files/filename.csv', newline='') as csv_file:
        reader = csv.reader(csv_file)
        fulllist = list(reader)  # Read every row, header included.
    
    # Start at index 1 to skip the header row at index 0.
    for item in range(1, len(fulllist)):
        # Print each row using "item" as the index value.
        print(fulllist[item])
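
    If the file is too large to hold in a list, a possible variation is to skip the first row lazily with itertools.islice. This is a sketch of an alternative, not the approach shown above; the data_rows helper and the in-memory sample are hypothetical.

```python
import csv
import io
from itertools import islice


def data_rows(csv_file):
    """Return an iterator over every row after the first."""
    reader = csv.reader(csv_file)
    # islice(reader, 1, None) drops the item at index 0 (the header)
    # and passes the remaining rows through one at a time.
    return islice(reader, 1, None)


# Hypothetical sample data standing in for files/filename.csv.
sample = io.StringIO("0,1\n5.0,3.5\n2.0,7.25\n")
for row in data_rows(sample):
    print(row)
```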
    