How to print the progress of a list comprehension in Python?

忘掉有多难 2021-01-02 23:17

In my method I have to return a list within a list. I would like to use a list comprehension for performance reasons, since the list takes about 5 minutes to create. How can I print the progress while it is being built?

5 Answers
  • 2021-01-02 23:33

    I wanted to make @ted's answer more readable (imo) and add some explanations.

    Tidied up solution:

    # Function to print the index if it is evenly divisible by 1000:
    def report(index):
        if index % 1000 == 0:
            print(index)
    
    # The function the user wants to apply on the list elements
    def process(x, index, report):
        report(index)  # Call of the reporting function
        return 'something ' + x  # ! Just an example, replace with your desired application
    
    # !Just an example, replace with your list to iterate over
    mylist = ['number ' + str(k) for k in range(5000)]
    
    # Running a list comprehension
    [process(x, index, report) for index, x in enumerate(mylist)]
    

    Explanation of enumerate(mylist): the enumerate function yields the index of each element along with the element itself for any iterable object (cf. this question and its answers). For example

    [(index, x) for index, x in enumerate(["a", "b", "c"])] #returns
    [(0, 'a'), (1, 'b'), (2, 'c')]
    

    Note: index and x are not reserved names, just names I found convenient - [(foo, bar) for foo, bar in enumerate(["a", "b", "c"])] yields the same result.
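
    A related detail that can help with progress output: enumerate also accepts a start argument, so the counter can begin at 1 and read as "elements seen so far" rather than as a zero-based index. The snippet below is only a small sketch; the names letters and numbered are placeholders chosen for illustration.

```python
letters = ["a", "b", "c"]

# start=1 makes the counter begin at 1 instead of 0, which reads more
# naturally as "number of elements seen so far" in a progress message.
numbered = [(count, x) for count, x in enumerate(letters, start=1)]

print(numbered)
```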

  • 2021-01-02 23:41

    1: Use a side function

    def report(index):
        if index % 1000 == 0:
            print(index)
    
    def process(token, index, report=None):
        if report:
            report(index) 
        return token['text']
    
    l1 = [{'text': k} for k in range(5000)]
    
    l2 = [process(token, i, report) for i, token in enumerate(l1)]
    

    2: Use the and and or operators

    def process(token):
        return token['text']
    
    l1 = [{'text': k} for k in range(5000)]
    l2 = [(i % 1000 == 0 and print(i)) or process(token) for i, token in enumerate(l1)]
    

    3: Use both

    def process(token):
        return token['text']
    
    def report(i):
        i % 1000 == 0 and print(i)
    
    l1 = [{'text': k} for k in range(5000)]
    l2 = [report(i) or process(token) for i, token in enumerate(l1)]
    

    All 3 methods print:

    0
    1000
    2000
    3000
    4000
    

    How 2 works

    • i % 1000 == 0 and print(i): and evaluates its second operand only if the first one is truthy, so print(i) runs only when i % 1000 == 0.
    • or process(token): or evaluates its second operand only if the first one is falsy, and in that case returns the second operand.
      • If i % 1000 != 0, the and expression is False, so or returns process(token), which is added to the list.
      • Otherwise print(i) runs and the and expression is None (because print returns None), so or again returns process(token).
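
    The short-circuit behaviour of and and or can be checked directly. The sketch below is not part of the answer's code: noisy is a hypothetical helper that records which operands were actually evaluated, and the other names are placeholders as well.

```python
calls = []

def noisy(value, label):
    """Record that this operand was evaluated, then return the value unchanged."""
    calls.append(label)
    return value

# `and` stops at the first falsy operand, so the right operand is never evaluated.
result_and = noisy(False, "left") and noisy("unused", "right")
calls_and = list(calls)
calls.clear()

# `or` stops at the first truthy operand, so the right operand is skipped.
result_or_truthy = noisy("first", "left") or noisy("second", "right")
calls_or_truthy = list(calls)
calls.clear()

# With a falsy left operand (None, like the return value of print),
# `or` evaluates the right operand and returns it.
result_or_none = noisy(None, "left") or noisy("second", "right")
calls_or_none = list(calls)
```

    This is why the and part must come first in method 2: its result is always falsy (False or None), so or always falls through to process(token).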

    How 3 works

    Similar to 2: because report(i) has no return statement, it evaluates to None, and or adds process(token) to the list.

  • 2021-01-02 23:46
    doc_collection = [[1, 2],
                      [3, 4],
                      [5, 6]]
    
    result = [print(progress) or
              [str(token) for token in document]
              for progress, document in enumerate(doc_collection)]
    
    print(result)  # [['1', '2'], ['3', '4'], ['5', '6']]
    

    I don't consider this good or readable code, but the idea is fun.

    It works because print always returns None, so print(progress) or x always evaluates to x (by the definition of or).

  • 2021-01-02 23:46
    def show_progress(it, milestones=1):
        for i, x in enumerate(it):
            yield x
            processed = i + 1
            if processed % milestones == 0:
                print('Processed %s elements' % processed)
    

    Simply apply this function to anything you're iterating over. It doesn't matter if you use a loop or list comprehension and it's easy to use anywhere with almost no code changes. For example:

    doc_collection = [[1, 2],
                      [3, 4],
                      [5, 6]]
    
    result = [[str(token) for token in document]
              for document in show_progress(doc_collection)]
    
    print(result)  # [['1', '2'], ['3', '4'], ['5', '6']]
    

    If you only wanted to show progress for every 100 documents, write:

    show_progress(doc_collection, 100) 
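
    When the input has a known length, the same generator idea can report a percentage instead of a raw count. This is only a sketch under that assumption (it calls len, so it will not work on arbitrary iterators); show_progress_percent, step and report are hypothetical names introduced here, not part of the answer above.

```python
def show_progress_percent(items, step=10, report=print):
    """Yield items from a sized sequence, reporting about every `step` percent."""
    total = len(items)  # assumption: items supports len()
    next_mark = step
    for done, item in enumerate(items, start=1):
        yield item
        percent = done * 100 // total
        if percent >= next_mark:
            report('Processed %d of %d elements (%d%%)' % (done, total, percent))
            # Move the threshold to the next multiple of `step` above `percent`.
            next_mark = (percent // step + 1) * step


# Usage: collect the messages in a list instead of printing them.
messages = []
doc_collection = [[1, 2], [3, 4], [5, 6], [7, 8]]

result = [[str(token) for token in document]
          for document in show_progress_percent(doc_collection, step=50,
                                                report=messages.append)]
```

    Passing report=print (the default) prints the messages as the comprehension runs; a list's append method is used here only so the output can be inspected afterwards.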
    
  • 2021-01-02 23:49

    Here is my implementation. First install the progressbar2 package:

    pip install progressbar2

    Then wrap the iterable in its progressbar function:

    from progressbar import progressbar
    new_list = [your_function(list_item) for list_item in progressbar(old_list)]
    

    You will see a progress bar while running the code block above.
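
    If installing a third-party package is not an option, a rough text bar can be sketched with the standard library alone. This is not how progressbar2 works internally, only a minimal illustration of the same wrapping idea; text_progress, width and stream are hypothetical names, and the input is assumed to support len.

```python
import io
import sys


def text_progress(items, width=30, stream=sys.stderr):
    """Yield items from a sized sequence while redrawing a one-line text bar."""
    total = len(items)  # assumption: items supports len()
    for done, item in enumerate(items, start=1):
        yield item
        filled = width * done // total
        bar = '#' * filled + '-' * (width - filled)
        # '\r' moves back to the start of the line so the bar is redrawn in place.
        stream.write('\r[%s] %d/%d' % (bar, done, total))
        stream.flush()
    stream.write('\n')


# Usage: write the bar into an in-memory buffer so the output can be inspected.
buffer = io.StringIO()
old_list = [1, 2, 3, 4]
new_list = [x * 2 for x in text_progress(old_list, width=4, stream=buffer)]
```

    With the default stream the bar is drawn on stderr, which keeps it separate from anything the program prints to stdout.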
