I have many tasks in .txt files in multiple subfolders. I am trying to pick a total of 10 tasks at random from these folders, their contained files and finally a text line w
EDIT: On closer scrutiny this answer does not fit the bill. Reworking it led me to the reservoir sampling algorithm, which @Blckknght used in his answer. So ignore this answer.
There are a few ways of doing it. Here's one...
The code...
import os
import random

def file_iterator(top_dir):
    """Gather all task files"""
    files = []
    for dirpath, dirnames, filenames in os.walk(top_dir):
        for filename in filenames:
            if not filename.endswith('.txt'):
                continue
            path = os.path.join(dirpath, filename)
            files.append(path)
    return files
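def random_lines(files, number=10):
    """Select a random file, select a random line until we have enough.

    Note: the same task can be picked more than once.
    """
    selected_tasks = []
    while len(selected_tasks) < number:
        f = random.choice(files)
        with open(f) as tasks:
            # Ignore blank lines so they are never returned as tasks
            lines = [line.strip() for line in tasks if line.strip()]
        if not lines:
            # random.choice() raises IndexError on an empty list
            continue
        selected_tasks.append(random.choice(lines))
    return selected_tasks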
## Usage
files = file_iterator(r'C:\Tasks')
random_tasks = random_lines(files)
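
For comparison, here is a minimal sketch of the reservoir sampling idea mentioned in the edit note above. It is not @Blckknght's code, just one way the technique could look under a few assumptions: every non-blank line of every .txt file counts as one task, and the helper names `iter_task_lines` and `reservoir_sample` are made up for this illustration. Unlike picking a random file first, this gives every task line the same chance of being selected and never returns the same line position twice.

```python
import os
import random


def iter_task_lines(top_dir):
    """Yield every non-blank line from every .txt file under top_dir.

    Hypothetical helper: assumes one task per non-blank line.
    """
    for dirpath, _dirnames, filenames in os.walk(top_dir):
        for filename in filenames:
            if not filename.endswith('.txt'):
                continue
            with open(os.path.join(dirpath, filename)) as tasks:
                for line in tasks:
                    line = line.strip()
                    if line:
                        yield line


def reservoir_sample(items, k=10, rng=random):
    """Return up to k items chosen uniformly at random from an iterable.

    Only k items are held in memory at any time, so the total number
    of items does not need to be known in advance.
    """
    reservoir = []
    for index, item in enumerate(items):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1),
            # overwriting a uniformly chosen existing entry.
            slot = rng.randint(0, index)
            if slot < k:
                reservoir[slot] = item
    return reservoir
```

Usage would then be something like `random_tasks = reservoir_sample(iter_task_lines(r'C:\Tasks'))`. If there are fewer than 10 task lines in total, the result simply contains all of them.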