Python random lines from subfolders

太阳男子 2020-12-09 22:30

I have many tasks in .txt files in multiple subfolders. I am trying to pick a total of 10 tasks at random from these folders, their contained files and finally a text line w

3 answers
  •  醉梦人生
    2020-12-09 23:17

    EDIT: On closer scrutiny this answer does not fit the bill. Reworking it led me to the reservoir sampling algorithm, which @Blckknght used in his answer. So ignore this answer.

    There are a few ways of doing it. Here's one...

    1. Get a list of all task files
    2. Select one at random
    3. Select a single line from that file at random
    4. Repeat until we have the desired number of lines

    The code...

    import os
    import random
    
    def file_iterator(top_dir):
        """Gather all task files"""
        files = []
        for dirpath, dirnames, filenames in os.walk(top_dir):
            for filename in filenames:
                if not filename.endswith('.txt'):
                    continue
                path = os.path.join(dirpath, filename)
                files.append(path)
        return files
    
    
    def random_lines(files, number=10):
        """Pick a random file, then a random line from it, until we have enough."""
        selected_tasks = []
    
        while len(selected_tasks) < number:
            f = random.choice(files)
            with open(f) as tasks:
                lines = tasks.readlines()
                # Note: the same line can be picked more than once, and an
                # empty file makes random.choice raise IndexError.
                line = random.choice(lines)
                selected_tasks.append(line)
        return selected_tasks
    
    
    # Usage (a raw string needs only single backslashes)
    files = file_iterator(r'C:\Tasks')
    random_tasks = random_lines(files)
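
    For comparison, here is a minimal sketch of the reservoir sampling idea mentioned in the EDIT at the top. It is not @Blckknght's code; the names iter_task_lines and reservoir_sample are made up for illustration, and it assumes that blank lines are not tasks. It streams every line of every .txt file once and keeps k of them, so each line has the same chance of being chosen and no line is returned twice.

```python
import os
import random


def iter_task_lines(top_dir):
    """Yield every non-blank line from every .txt file under top_dir."""
    for dirpath, _dirnames, filenames in os.walk(top_dir):
        for filename in filenames:
            if not filename.endswith('.txt'):
                continue
            with open(os.path.join(dirpath, filename)) as tasks:
                for line in tasks:
                    if line.strip():  # assumption: blank lines are not tasks
                        yield line.rstrip('\n')


def reservoir_sample(items, k=10):
    """Return up to k items drawn uniformly at random from an iterable."""
    reservoir = []
    for index, item in enumerate(items):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1),
            # overwriting a uniformly chosen slot.
            j = random.randint(0, index)
            if j < k:
                reservoir[j] = item
    return reservoir
```

    Usage would look like random_tasks = reservoir_sample(iter_task_lines(r'C:\Tasks'), k=10). If there are fewer than k lines in total, all of them are returned.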
    
