Select files in directory and move them based on text list of filenames

拈花ヽ惹草 submitted on 2019-12-08 11:08:34

Question


So I have a folder of a few thousand pdf files in /path, and I have a list of hundreds of names called names.csv (only one column, it could just as easily be .txt).

I'm trying to select (and ideally, move) the pdfs, where any name from names.csv is found in any filename.

From my research so far, it seems like listdir and regex is one approach to at least get a list of the files I want:

import os, sys  
import re 


for files in os.listdir('path'):
    with open('names.csv') as names: 
        for name in names:
            match  = re.search(name, files)

        print match  

But currently this is just returning 'None' 'None' etc, all the way down.

I'm probably doing a bunch of things wrong here. And I'm not even near the part where I need to move the files. But I'm just hoping to get over this first hump.

Any advice is much appreciated!


Answer 1:


You say that your names.csv is one column. That must mean that each name is followed by a newline char, which will also be included when matching. You could try this:

match  = re.search(name.rstrip(), files)

Hope it helps.
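
To see why the trailing newline matters, here is a minimal sketch. The sample name and filename are made up for illustration and are not from the question.

```python
import re

# Hypothetical sample data: a line as read from names.csv, and a filename.
line = "smith\n"
filename = "report_smith_2016.pdf"

# The raw line still carries "\n", which never occurs in the filename.
raw_match = re.search(line, filename)

# Stripping trailing whitespace leaves just the name.
clean_match = re.search(line.rstrip(), filename)

print(raw_match)            # None
print(clean_match.group())  # smith
```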




Answer 2:


The problem is that your name variable always ends with a newline character \n. The newline character isn't present in the file names, so regex doesn't find any matches.

There are also a few other small issues with your code:

  • You're opening the names.csv file in each iteration of the loop. It would be more efficient to open the file once, then loop through all files in the directory.
  • Regex isn't necessary here, and in fact can cause problems. If, for example, a line in your csv file looked like "(this isn't a valid regex", then re.search would raise an exception, because an unclosed parenthesis is not a valid pattern. This could be fixed by escaping the name first with re.escape, but regex still isn't necessary.
  • Your print match is in the wrong place. Since match is overwritten in each iteration of the loop, and you're printing its value after the loop, you only get to see its last value.
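
If you would rather keep regex anyway, the escaping mentioned in the list above could be sketched roughly like this. re.escape and re.compile are standard library calls; the helper name build_pattern and the sample names are hypothetical.

```python
import re

def build_pattern(names):
    """Compile one pattern that matches any of the given literal names."""
    # re.escape makes characters such as "(" match literally.
    escaped = [re.escape(name) for name in names if name]
    return re.compile("|".join(escaped))

# Hypothetical names, one of which is not a valid regex on its own.
pattern = build_pattern(["smith", "(draft"])

print(bool(pattern.search("smith_invoice.pdf")))   # True
print(bool(pattern.search("notes_(draft.pdf")))    # True
print(bool(pattern.search("jones_invoice.pdf")))   # False
```

Note that an empty list of names would produce an empty pattern, which matches every filename, so it is worth checking that the list is non-empty before using this.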

The fixed code could look like this:

import os

src_dir = 'path'
dest_dir = '/path/to/somewhere/else'

# open the file, make a list of all names, close the file
with open('names.csv') as names_file:
    # use .strip() to remove surrounding whitespace and line breaks;
    # skip blank lines, since an empty string is "in" every filename
    names = [line.strip() for line in names_file if line.strip()]

for filename in os.listdir(src_dir):
    for name in names:
        # no need for re.search, just use the "in" operator
        if name in filename:
            # move the file into the destination directory, keeping its name
            os.rename(os.path.join(src_dir, filename),
                      os.path.join(dest_dir, filename))
            break
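
As a variation, the same idea can be wrapped in a function that uses shutil.move, which also works when the destination is on a different filesystem (os.rename can fail in that case). This is only a sketch: the function name move_matching and its parameters are hypothetical, and it assumes the list of names has already been stripped of blank entries.

```python
import os
import shutil

def move_matching(src_dir, dest_dir, names):
    """Move files from src_dir to dest_dir if any name occurs in the filename."""
    # Create the destination directory if it does not exist yet.
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for filename in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, filename)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        # Plain substring test, as in the loop above.
        if any(name in filename for name in names):
            shutil.move(src, os.path.join(dest_dir, filename))
            moved.append(filename)
    return moved
```

Returning the list of moved filenames makes it easy to print or log what happened before trusting the script with a few thousand files.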


Source: https://stackoverflow.com/questions/37297527/select-files-in-directory-and-move-them-based-on-text-list-of-filenames
