How to process files from one subfolder to another in each directory using Python?

Posted by 此生再无相见时 on 2020-01-03 17:38:03

Question


I have a basic file/folder structure on the Desktop where the "Test" folder contains "Folder 1", which in turn contains 2 subfolders:

  • An "Original files" subfolder which contains shapefiles (.shp).
  • A "Processed files" subfolder which is empty.

I am attempting to write a script which looks into each parent folder (Folder 1, Folder 2 etc) and if it finds an Original Files subfolder, it will run a function and output the results into the Processed files subfolder.

I made a simple diagram to showcase this where if Folder 1 contains the relevant subfolders then the function will run; if Folder 2 does not contain the subfolders then it's simply ignored:

I looked into the following posts but am still having some trouble:

  • python glob issues with directory with [] in name

  • Getting a list of all subdirectories in the current directory

  • How to list all files of a directory?

The following script seems to run happily. The annoying thing is that it doesn't produce an error, so as a real noob I can't see where the problem is:

import os, sys

from os.path import expanduser
home = expanduser("~")

for subFolders, files in os.walk(home + "\Test\\" + "\*Original\\"):
 if filename.endswith('.shp'):

    output = home + "\Test\\" + "\*Processed\\" + filename

    # do_some_function, output  

Answer 1:


I guess you mixed something up in your os.walk()-loop.

I just created a simple structure as shown in your question and used this code to get what you're looking for:

import os

root_dir = '/path/to/your/test_dir'
original_dir = 'Original files'
processed_dir = 'Processed files'

for path, subdirs, files in os.walk(root_dir):
    # Only handle folders whose path contains the "Original files" name
    if original_dir in path:
        for file in files:
            if file.endswith('.shp'):
                # The processed folder is a sibling of the original folder
                parent = os.path.sep.join(path.split(os.path.sep)[:-1])
                print('original dir: \t' + path)
                print('original file: \t' + path + os.path.sep + file)
                print('processed dir: \t' + parent + os.path.sep + processed_dir)
                print('processed file: ' + parent + os.path.sep + processed_dir + os.path.sep + file)
                print('')

I'd suggest using wildcards in a directory-crawling script only if you are REALLY sure what your directory tree looks like. I'd rather search for the full names of the folders, as in my script.
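As a rough sketch of how the loop above could feed real processing instead of printing: the helper name find_shapefile_pairs is hypothetical, the folder names are the ones from the question, and it assumes that "Processed files" sits next to "Original files" in the same parent folder. Parent folders without that sibling are skipped.

```python
import os

def find_shapefile_pairs(root_dir, original_dir="Original files",
                         processed_dir="Processed files"):
    """Yield (source, destination) paths for every .shp file in a folder
    named `original_dir` whose parent also contains `processed_dir`."""
    for path, subdirs, files in os.walk(root_dir):
        # Compare the folder name exactly rather than searching the whole path.
        if os.path.basename(path) != original_dir:
            continue
        target_dir = os.path.join(os.path.dirname(path), processed_dir)
        if not os.path.isdir(target_dir):
            continue  # no sibling output folder here, so ignore this parent
        for name in sorted(files):
            if name.lower().endswith(".shp"):
                yield os.path.join(path, name), os.path.join(target_dir, name)

# Possible usage, with do_some_function standing in for the question's
# placeholder:
# for src, dst in find_shapefile_pairs('/path/to/your/test_dir'):
#     do_some_function(src, dst)
```

Yielding pairs keeps the directory crawling separate from whatever the processing function does, so either part can be changed on its own.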

Update: Paths

Whenever you use paths, take care of your path separators - the slashes.

On Windows systems, the backslash is used for that:

C:\any\path\you\name

Most other systems use a normal, forward slash:

/the/path/you/want

In Python, a forward slash can be used directly, without any problem:

path_var = '/the/path/you/want'

...as opposed to backslashes. A backslash is a special character in Python strings. For example, it introduces the newline escape sequence: \n

To make clear that you don't want to use it as a special character, but as a backslash itself, you have to "escape" it using another backslash: '\\'. That makes a Windows path look like this:

path_var = 'C:\\any\\path\\you\\name'

...or you could mark the string as a "raw" string with a preceding r. Note that by doing that, you can't use escape sequences such as \n in that string anymore.

path_var = r'C:\any\path\you\name'
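A quick way to convince yourself that both spellings produce the same string is to compare them directly. This is only an illustrative check; the variable names escaped and raw are made up for the example.

```python
escaped = 'C:\\any\\path\\you\\name'   # each \\ in the source is one backslash
raw = r'C:\any\path\you\name'          # raw string: backslashes kept as typed

assert escaped == raw

# Escape sequences are processed in normal strings but not in raw strings.
assert len('\n') == 1    # a single newline character
assert len(r'\n') == 2   # a backslash followed by the letter n

print(raw)
```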

In your comment, you used the example root_dir = home + "\Test\\". The first backslash in this string is treated as the start of an escape sequence, so Python tries to make sense of the backslash and the following character: \T. That happens not to be a recognized escape sequence, so Python keeps the backslash as typed (newer versions warn about it), whereas \t would be converted to a tab. Either way, relying on that is fragile, so escape the backslash or use a raw string.

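As for why your other example works: in "C:\Users\me\Test\\", \m is not a recognized escape sequence, and in a Python 2 byte string neither is \U, so both backslashes survive by luck. In Python 3, \U starts a Unicode escape and this string would be a syntax error. You also mixed single and double backslashes.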

That said...

Once you have sorted out your OS path separators and start experimenting with new paths, also note that Python handles a lot of path-related details for you. For example, if your script reads a directory, as os.walk() does, on my Windows system the separators in the returned paths are already handled correctly (they show up as double backslashes). There's no need for me to check that; it's usually only hardcoded strings where you have to take care.

Also: the Python os.path module provides a lot of functions to handle paths, separators and so on. For example, os.path.sep (and os.sep, too) is set to the correct separator for the system Python is running on. You can also build paths using os.path.join().
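To illustrate, here is a small sketch that builds the question's folder paths with os.path.join() instead of hardcoded separators. The ntpath and posixpath modules are the Windows and POSIX flavours of os.path from the standard library; they are used here only to show both results on any machine, and the relative folder names are just the ones from the question.

```python
import os
import ntpath      # Windows flavour of os.path, importable on any system
import posixpath   # POSIX flavour of os.path

# os.path picks the right separator for the system the script runs on.
original = os.path.join("Test", "Folder 1", "Original files")
processed = os.path.join(os.path.dirname(original), "Processed files")

# The same join, forced to each flavour for comparison.
windows_style = ntpath.join("Test", "Folder 1", "Processed files")
posix_style = posixpath.join("Test", "Folder 1", "Processed files")

print(windows_style)  # Test\Folder 1\Processed files
print(posix_style)    # Test/Folder 1/Processed files
```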

And finally: The home-directory

You use expanduser("~") to get the home path of the current user. That should work fine, but if you're using an old Python version, there could be a bug - see: expanduser("~") on Windows looks for HOME first

So check that the home path is resolved correctly, and then build your paths using the power of the os module :-)
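One possible way to do that check, as a sketch only: the helper name resolve_test_root and its parameters are hypothetical, and "Test" is the folder name taken from the question's script.

```python
import os

def resolve_test_root(folder_name="Test", home=None):
    """Return the path of `folder_name` inside the user's home directory."""
    if home is None:
        home = os.path.expanduser("~")
    # expanduser() returns "~" unchanged when it cannot work out the home
    # directory, so confirm that the result is a real directory.
    if not os.path.isdir(home):
        raise FileNotFoundError("home directory not found: " + repr(home))
    return os.path.join(home, folder_name)

# Possible usage:
# root_dir = resolve_test_root()
```

Printing the returned value once before crawling it is a cheap way to see whether the path is the one you expect.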

Hope that helps!



Source: https://stackoverflow.com/questions/29283466/how-to-process-files-from-one-subfolder-to-another-in-each-directory-using-pytho
