Python: partition list of names into equally sized sublists

后端 未结 2 925
南笙
南笙 2021-01-12 12:36

I have a list of names, e.g. [\'Agrajag\', \'Colin\', \'Deep Thought\', ... , \'Zaphod Beeblebrox\', \'Zarquon\']. Now I want to partition this list into approx

相关标签:
2条回答
  • 2021-01-12 13:25

    Here's something that might work. I feel certain there's a simpler way though... probably involving itertools. Note that num_pages only roughly determines how many pages you'll actually get.

    EDIT: Whoops! There was a bug -- it was cutting off the last group! The below should be fixed, but note that the length of the last page will be slightly unpredictable. Also, I added .upper() to account for possible lowercase names.

    EDIT2: The previous method of defining letter_groups was inefficient; the below dict-based code is more scalable:

    names = ['Agrajag', 'Colin', 'Deep Thought', 'Ford Prefect' , 'Zaphod Beeblebrox', 'Zarquon']
    num_pages = 3
    
    def group_names(names, num_pages):
        letter_groups = defaultdict(list)
        for name in names: letter_groups[name[0].upper()].append(name)
        letter_groups = [letter_groups[key] for key in sorted(letter_groups.keys())]
        current_group = []
        page_groups = []
        group_size = len(names) / num_pages
        for group in letter_groups:
            current_group.extend(group)
            if len(current_group) > group_size:
                page_groups.append(current_group)
                current_group = []
        if current_group: page_groups.append(current_group)
    
        return page_groups
    
    print group_names(names, num_pages)
    
    0 讨论(0)
  • 2021-01-12 13:33

    Since your name_list has to be sorted for groupby to work, can't you just check every Nth value and build your divisions that way?

    right_endpoints = name_list[N-1::N]
    

    And using "A" as your leftmost endpoint and "Z" as your rightmost endpoint, you can construct the N divisions accordingly and they should all have the same size.

    1. So, the first left endpoint would be "A", the first right endpoint would be right_endpoints[0].
    2. The next left endpoint would be the character after right_endpoints[0], the next right endpoint would be right_endpoints[1].
    3. Etc., until you hit the Nth range and that has a set endpoint of "Z".

    The issue you may run into is what if two of these right_endpoints are the same...

    edit: example

    >>> names = ['Aaron', 'Abel', 'Cain', 'Daniel', 'Darius', 'David', 'Ellen', 'Gary', 'James', 'Jared', 'John', 'Joseph', 'Lawrence', 'Michael', 'Nicholas', 'Terry', 'Victor', 'Zulu']
    >>> right_ends, left_ends = names[2::3], names[3::3]
    >>> left_ends = ['A'] + left_ends
    >>> left_ends, right_ends
    >>> ["%s - %s" % (left, right) for left, right in zip(left_ends, right_ends)]
    ['A - Cain', 'Daniel - David', 'Ellen - James', 'Jared - Joseph', 'Lawrence - Nicholas', 'Terry - Zulu']
    
    0 讨论(0)
提交回复
热议问题