Sort CSV using a key computed from two columns, grab first n largest values

被撕碎了的回忆 2021-01-17 03:11

Python amateur here... Let's say I have a snippet of an example CSV file:

Country, Year, GDP, Population
Country1

3 answers
  •  孤街浪徒
    2021-01-17 03:45

    The relevant modules and functions would be:

    • csv for parsing the input
    • collections.namedtuple to name the fields
    • the filter() function to extract the specified year range
    • heapq.nlargest() to find the largest values
    • pprint.pprint() for nice output

    Here's a little bit to get you started (I would do it all, but where is the fun in having someone write your whole program and deprive you of the joy of finishing it?):

    from __future__ import division  # true division on Python 2; no effect on Python 3
    import csv, collections, heapq, pprint

    filecontents = '''\
    Country, Year, GDP, Population
    Country1,2002,44545,24352
    Country2,2004,14325,75677
    Country3,2004,23132412,1345234
    Country4,2004,2312421,12412
    '''

    CountryStats = collections.namedtuple('CountryStats', ['country', 'year', 'gdp', 'population'])
    # Let the sniffer work out the delimiter used in the sample
    dialect = csv.Sniffer().sniff(filecontents)

    data = []
    # Skip the header line, then convert the numeric fields of each row
    for country, year, gdp, pop in csv.reader(filecontents.splitlines()[1:], dialect):
        row = CountryStats(country, int(year), int(gdp), int(pop))
        if row.year == 2004:
            data.append(row)

    # Sort in place by GDP per capita, smallest first
    data.sort(key=lambda s: s.gdp / s.population)
    pprint.pprint(data)
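
For the remaining step, one way to combine filter() and heapq.nlargest() from the list above might look like the sketch below. It is not the answerer's code: it assumes Python 3, reuses the sample rows from the snippet, and the function name `top_gdp_per_capita` and its parameters (`n`, `first_year`, `last_year`) are hypothetical names chosen for illustration.

```python
import csv
import heapq
from collections import namedtuple

# Sample rows copied from the answer's snippet; real data would come from a file.
SAMPLE = """\
Country, Year, GDP, Population
Country1,2002,44545,24352
Country2,2004,14325,75677
Country3,2004,23132412,1345234
Country4,2004,2312421,12412
"""

CountryStats = namedtuple("CountryStats", ["country", "year", "gdp", "population"])


def top_gdp_per_capita(text, n, first_year, last_year):
    """Return the n rows with the largest GDP / population in the year range.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # skipinitialspace=True tolerates the spaces after commas in the header
    reader = csv.reader(text.splitlines(), skipinitialspace=True)
    next(reader)  # skip the header row
    rows = (
        CountryStats(country, int(year), int(gdp), int(pop))
        for country, year, gdp, pop in reader
    )
    # Keep only the rows whose year falls inside the requested range
    in_range = filter(lambda s: first_year <= s.year <= last_year, rows)
    # nlargest returns the n biggest items by key, largest first
    return heapq.nlargest(n, in_range, key=lambda s: s.gdp / s.population)


top = top_gdp_per_capita(SAMPLE, n=2, first_year=2004, last_year=2004)
for row in top:
    print(row.country, round(row.gdp / row.population, 2))
```

heapq.nlargest() avoids sorting the whole list when only a few rows are wanted, and it returns them largest first, so no reverse step is needed. If n is close to the number of rows, a plain sorted(..., reverse=True)[:n] is likely just as good.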
    
