How to select files with numbered extensions from a folder?

Submitted by 允我心安 on 2019-12-11 11:28:36

Question


I am trying to build my own dataset for a project. Therefore I need to select files that have been exported from another program and come with numbered extensions:

exported_file_1_aaa.001
exported_file_2_aaa.002
exported_file_3_aaa.003
...
exported_file_5_zzz.925
...and so on.

I know how to select files with a specific extension, e.g. '.txt', from a folder and append them to a list or dict. Is there any way to do the same for '.nnn', where each n is a digit?

ext = '.nnn'
all_files = [i for i in os.listdir(dir) if os.path.splitext(i)[1] == ext]
for f in all_files:
    ...

Answer 1:


You can mix the capabilities of shell globbing (glob) and regex (re).

With glob you can get the files whose names end in a digit, which leaves only a limited number of files for re to do the final check:

glob.iglob('exported_file_*.*[0-9]')

Then we can match the files precisely with the regex pattern:

\.\d+$

This matches file names that end in one or more digits after the last dot (.).
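To see what the pattern accepts and rejects, here is a minimal sketch that applies it to a few names. The helper has_numbered_extension and the sample names other than the exported_file_* ones are made up for illustration. Note that a name like backup.v2 passes the glob above (it ends in a digit) but is rejected by the regex, which is why the second check is useful.

```python
import re

# A dot followed by one or more digits at the very end of the name.
NUMBERED_EXT = re.compile(r'\.\d+$')


def has_numbered_extension(name):
    """Return True if name ends in '.<digits>' (hypothetical helper)."""
    return NUMBERED_EXT.search(name) is not None


# Sample names; only the exported_file_* ones come from the question.
names = [
    'exported_file_1_aaa.001',
    'exported_file_5_zzz.925',
    'notes.txt',
    'archive.7z',
    'backup.v2',
]

print([n for n in names if has_numbered_extension(n)])
# ['exported_file_1_aaa.001', 'exported_file_5_zzz.925']
```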

Putting it together:

import glob
import re
[file for file in glob.iglob('exported_file_*.*[0-9]') if re.search(r'\.\d+$', file)]

Shell globbing is not as flexible as re (it has no way to say "one or more digits"), otherwise glob alone would have been enough.

Also, if you're sure that all files end in a fixed number of digits, then glob alone works, e.g. for files ending in 3 digits after the last dot:

glob.iglob('exported_file_*.[0-9][0-9][0-9]')



Answer 2:


If you don't care about the length of the extension, you can use the isdigit method:

# splitext keeps the leading dot ('.001'), so drop it before calling isdigit()
all_files = [i for i in os.listdir(dir) if os.path.splitext(i)[1][1:].isdigit()]
for f in all_files:
    ...
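One detail worth checking when using isdigit here: os.path.splitext returns the extension together with its leading dot, and a string containing a dot is not all digits. The sketch below shows the difference and wraps the check in a small helper; the name numbered_files is hypothetical, and it filters a plain list of names so it can be tried without touching the file system.

```python
import os


def numbered_files(names):
    """Keep names whose extension, minus the leading dot, is all digits."""
    return [n for n in names if os.path.splitext(n)[1][1:].isdigit()]


print(os.path.splitext('exported_file_1_aaa.001'))  # ('exported_file_1_aaa', '.001')
print('.001'.isdigit())  # False, because of the dot
print('001'.isdigit())   # True

print(numbered_files(['exported_file_1_aaa.001', 'notes.txt', 'no_extension']))
# ['exported_file_1_aaa.001']
```

In real use the list would come from os.listdir on the folder in question.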



Answer 3:


You can use the glob module.

import glob

my_dir = "mydir"

all_files = glob.glob(f"{my_dir}/*.[0-9][0-9][0-9]")
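As a rough, self-contained way to try the fixed-width pattern, the sketch below creates a few empty files in a temporary directory and globs them. The helper three_digit_files, the numeric sort, and the file names other than the exported_file_* ones are assumptions added for illustration; glob.escape is used so that special characters in the folder name are not treated as wildcards.

```python
import glob
import os
import tempfile


def three_digit_files(folder):
    """Return paths in folder with a three-digit extension, sorted by that number."""
    pattern = os.path.join(glob.escape(folder), '*.[0-9][0-9][0-9]')
    # Every match ends in '.ddd', so the text after the last dot is an integer.
    return sorted(glob.glob(pattern), key=lambda p: int(p.rsplit('.', 1)[1]))


with tempfile.TemporaryDirectory() as folder:
    # Create empty sample files; 'short.12' has only two digits and is skipped.
    for name in ['exported_file_2_aaa.002', 'exported_file_1_aaa.001',
                 'readme.txt', 'short.12']:
        open(os.path.join(folder, name), 'w').close()

    found = [os.path.basename(p) for p in three_digit_files(folder)]
    print(found)
    # ['exported_file_1_aaa.001', 'exported_file_2_aaa.002']
```

glob.glob returns results in arbitrary order, which is why the sketch sorts them explicitly.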


Source: https://stackoverflow.com/questions/54381624/how-to-select-files-with-numbered-extensions-from-a-folder
