Question
I have an image file and I'd like to check whether it's part of an image sequence using Python.
For example, I start with this file:
/projects/image_0001.jpg
and I want to check whether the file is part of a sequence, i.e.
/projects/image_0001.jpg
/projects/image_0002.jpg
/projects/image_0003.jpg
...
Checking whether there is a sequence of images seems simple if I can determine whether the file name could be part of a sequence, i.e. whether there is a run of digits in the file name.
My first thought was to ask the user to add #### to the file path where the numbers should be, and to input a start and end frame number to replace the hashes with, but this is obviously not very user-friendly. Is there a way to check for a sequence of numbers in a string with regular expressions or something similar?
Answer 1:
It's relatively easy to use Python's re module to see if a string contains a sequence of digits. You could do something like this:
mo = re.findall(r'\d+', filename)
This will return a list of all digit sequences in filename. If:
- There is a single result (that is, the filename contains only a single sequence of digits), AND
- A subsequent filename has a single digit sequence of the same length, AND
- The second digit sequence is 1 greater than the previous
...then maybe they're part of a sequence.
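A minimal sketch of those three checks, assuming you already have a second candidate filename to compare against. The helper name `looks_like_sequence` is made up for illustration, and the comparison of the text around the digits is an extra check that is not in the list above:

```python
import os
import re

DIGITS = re.compile(r'\d+')

def looks_like_sequence(filename, other):
    """Return True if `other` could be the frame right after `filename`."""
    name_a = os.path.basename(filename)
    name_b = os.path.basename(other)
    a = DIGITS.findall(name_a)
    b = DIGITS.findall(name_b)
    # Each name must contain exactly one run of digits.
    if len(a) != 1 or len(b) != 1:
        return False
    # Extra check (an assumption): the text around the digits must match.
    if DIGITS.sub('#', name_a) != DIGITS.sub('#', name_b):
        return False
    # Same padding width, and the second number is exactly one greater.
    return len(a[0]) == len(b[0]) and int(b[0]) == int(a[0]) + 1

print(looks_like_sequence('/projects/image_0001.jpg', '/projects/image_0002.jpg'))  # True
print(looks_like_sequence('/projects/image_0001.jpg', '/projects/image_0003.jpg'))  # False
```

Note that the equal-length rule means unpadded names fail at the step from 9 to 10, so it only suits zero-padded sequences.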
Answer 2:
I'm assuming the problem is more about being able to differentiate between sequenced files on disk than about knowing any particular information about the filenames themselves.
If that's the case, and what you're looking for is something that is smart enough to take a list like:
- /path/to/file_1.png
- /path/to/file_2.png
- /path/to/file_3.png
- ...
- /path/to/file_10.png
- /path/to/image_1.png
- /path/to/image_2.png
- ...
- /path/to/image_10.png
and get back a result saying "I have 2 sequences of files: /path/to/file_#.png and /path/to/image_#.png", you are going to need 2 passes: a first pass to determine valid expressions for the files, and a second pass to figure out which other files match those expressions.
You'll also need to know whether you're going to support gaps (is the numbering required to be sequential?):
- /path/to/file_1.png
- /path/to/file_2.png
- /path/to/file_3.png
- /path/to/file_5.png
- /path/to/file_6.png
- /path/to/file_7.png
Is this 1 sequence (/path/to/file_#.png) or 2 sequences (/path/to/file_1-3.png, /path/to/file_5-7.png)?
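If gaps should split a sequence, one way to decide is to group the frame numbers into contiguous runs. A rough sketch, assuming the frame numbers have already been extracted as integers; `contiguous_ranges` is a hypothetical helper name, not part of any library:

```python
def contiguous_ranges(frames):
    """Group integers into (start, end) runs of consecutive values."""
    ranges = []
    for frame in sorted(set(frames)):
        if ranges and frame == ranges[-1][1] + 1:
            # Extends the current run by one frame.
            ranges[-1] = (ranges[-1][0], frame)
        else:
            # First frame, or a gap: start a new run.
            ranges.append((frame, frame))
    return ranges

print(contiguous_ranges([1, 2, 3, 5, 6, 7]))  # [(1, 3), (5, 7)]
```

A single run means there are no gaps; whether several runs count as one sequence or many is then a policy decision.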
Also, how do you want to handle filenames that contain other numbers besides the sequence number?
- /path/to/file2_1.png
- /path/to/file2_2.png
- /path/to/file2_3.png
etc.
With that in mind, this is how I would accomplish it:
import functools
import os.path
import re

import projex.sorting

def find_sequences(filenames):
    """
    Parse a list of filenames into a dictionary of sequences.  Filenames not
    part of a sequence are returned in the None key

    :param      filenames | [<str>, ..]

    :return     {<str> sequence: [<str> filename, ..], ..}
    """
    local_filenames = filenames[:]
    sequence_patterns = {}
    sequences = {None: []}

    # sort the files (by natural order) so we always generate a pattern
    # based on the first potential file in a sequence; projex.sorting.natural
    # is a comparison function, so it is wrapped with cmp_to_key, which
    # list.sort() requires on Python 3
    local_filenames.sort(key=functools.cmp_to_key(projex.sorting.natural))

    # create the expression to determine if a sequence is possible
    # we are going to assume that it's always going to be the
    # last set of digits that makes a sequence, i.e.
    #
    #    test2_1.png
    #    test2_2.png
    #
    # test2 will be treated as part of the name
    #
    #    test1.png
    #    test2.png
    #
    # whereas here the 1 and 2 are part of the sequence
    #
    # more advanced expressions would be needed to support
    #
    #    test_01_2.png
    #    test_02_2.png
    #    test_03_2.png
    #
    # the first group is non-greedy so that \d+ captures the whole trailing
    # number (the 10 in test10.png) rather than only its last digit
    pattern_expr = re.compile(r'^(.*?)(\d+)(\D*)$')

    # process the input files for sequences, in sorted order
    for filename in local_filenames:
        # patterns are built from the basename, so match against it as well
        basename = os.path.basename(filename)

        # first, check to see if this filename matches a sequence
        found = False
        for key, pattern in sequence_patterns.items():
            if not pattern.match(basename):
                continue
            sequences[key].append(filename)
            found = True
            break

        # if we've already been matched, then continue on
        if found:
            continue

        # next, see if this filename should start a new sequence
        pattern_match = pattern_expr.match(basename)
        if pattern_match:
            opts = (pattern_match.group(1), pattern_match.group(3))
            key = '%s#%s' % opts

            # create a new pattern based on the filename, escaping the
            # literal parts so characters such as "." are matched literally
            escaped = tuple(re.escape(part) for part in opts)
            sequence_pattern = re.compile(r'^%s\d+%s$' % escaped)
            sequence_patterns[key] = sequence_pattern
            sequences[key] = [filename]
            continue

        # otherwise, add it to the list of non-sequences
        sequences[None].append(filename)

    # now that we have grouped everything, we'll merge back filenames
    # that were potential sequences, but only contain a single file to the
    # non-sequential list (iterate over a copy, since we pop while looping)
    for key, members in list(sequences.items()):
        if key is None or len(members) > 1:
            continue
        sequences.pop(key)
        sequences[None] += members

    return sequences
And an example usage:
>>> test = ['test1.png','test2.png','test3.png','test4.png','test2_1.png','test2_2.png','test2_3.png','test2_4.png']
>>> results = find_sequences(test)
>>> list(results.keys())
[None, 'test#.png', 'test2_#.png']
There is a method in there that refers to natural sorting, which is a separate topic. I just used the natural sort method from my projex library. It is open-source, so if you want to use or see it, it's here: http://dev.projexsoftware.com/projects/projex
That topic has been covered elsewhere on the forums, so I just used the method from the library.
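For readers without the projex library, a natural sort can be approximated with a key function. This is a common idiom, not the projex implementation, and `natural_key` is a name chosen here for illustration:

```python
import re

def natural_key(text):
    """Split text into digit and non-digit chunks so numbers compare by value."""
    # re.split with a capturing group alternates: text, digits, text, ...
    # so the odd positions are always the digit runs.
    parts = re.split(r'(\d+)', text)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

names = ['file_10.png', 'file_2.png', 'file_1.png']
print(sorted(names, key=natural_key))
# ['file_1.png', 'file_2.png', 'file_10.png']
```

Being a key function rather than a comparison function, it can be passed straight to `sort(key=...)` without any adapter.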
Source: https://stackoverflow.com/questions/11855801/whats-the-best-way-of-determining-if-an-image-is-part-of-a-sequence