Question
I'm trying to build a flexible regular expression to pick out the artist name and song title of a media file. I'd like it to be flexible and support all of the following:
01 Example Artist - Example Song.mp3
01 Example Song.mp3 (In this example, there's no artist so that group should be null)
Example Artist - Example Song.mp3
Example Song.mp3 (Again, no artist)
I've come up with the following (in .NET syntax, particularly for named capture groups):
\d{0,2}\s*(?<artist>[^-]*)?[\s-]*(?<songname>.*)(\.mp3|\.m4a)
This works well, but fails for this input: 01 Example Song.mp3
It swallows the song name as the artist, I believe because of greedy matching. So, I tried modifying the expression so that the artist part would be lazy matching:
\d{0,2}\s*(?<artist>[^-]*)*?[\s-]*(?<songname>.*)(\.mp3|\.m4a)
The change is:
(?<artist>[^-]*)?
became
(?<artist>[^-]*)*?
This does indeed fix the above problem. But now, it fails for this input:
01 Example Artist - Example Song.mp3
Now, it's too lazy in that it captures "Example Artist - Example Song" as the songname and captures nothing for the artist name.
Does anyone have a suggestion regarding this?
Answer 1:
You can't solve this with greediness alone; you need to describe the structure of the filename more explicitly, using groups (optional or not). For example:
(?x) # switch on comment mode
^ # start of the string
(?: (?<track>\d{1,3}) \s*[\s-]\s* )? # the track is optional (including separators)
(?: (?<artist>.+?) \s*-\s* )? # the same with the artist name
(?<title> .+ )
(?<ext> \.m(?:p3|4a) )
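To see how the optional groups behave on the four filenames from the question, here is a minimal sketch in Python rather than .NET. It assumes the pattern ports directly: the only syntax change is `(?P<name>...)` for named groups, and the helper name `parse_filename` is made up for illustration.

```python
import re

# Python port of the pattern above: .NET's (?<name>...) becomes (?P<name>...).
TRACK_PATTERN = re.compile(
    r"""
    ^                                        # start of the string
    (?: (?P<track>\d{1,3}) \s*[\s-]\s* )?    # optional track number plus separator
    (?: (?P<artist>.+?) \s*-\s* )?           # optional artist, lazily up to a hyphen
    (?P<title> .+ )                          # song title
    (?P<ext> \.m(?:p3|4a) )                  # file extension
    """,
    re.VERBOSE,
)


def parse_filename(filename):
    """Hypothetical helper: return (artist, title), or None if nothing matches."""
    match = TRACK_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("artist"), match.group("title")


samples = [
    "01 Example Artist - Example Song.mp3",
    "01 Example Song.mp3",
    "Example Artist - Example Song.mp3",
    "Example Song.mp3",
]
for name in samples:
    print(name, "->", parse_filename(name))
```

When there is no hyphen, the whole artist group fails and is skipped, so `artist` comes back as `None` instead of swallowing the title.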
As an aside, audio filenames can be very weird; even with the best pattern in the world, I doubt you can handle every case.
You can be a little more flexible and more efficient if you replace .+ with something more explicit:
^(?x)
(?: (?<track>\d{1,3}) \s*[\s-]\s* )?
(?: (?<artist> \S+ (?>[ .-][^\s.-]*)*? ) \s*-\s*)?
(?<title> [^.\n]+ (?>\.[^.\n]*)*? )
(?<ext> \.m(?:p3|4a) )
(The \n in the character classes is only there for testing against several filenames at once; you can remove it when you apply the pattern to one filename at a time.)
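A rough Python sketch of this more explicit version, under a few assumptions of mine: named groups are written `(?P<name>...)`, the `\n` is dropped because each filename is matched on its own, and the atomic groups `(?>...)` are swapped for plain non-capturing groups `(?:...)` since Python's `re` only accepts atomic groups from 3.11 on. That swap gives up the backtracking savings, so treat it as an illustration of what gets matched, not of the efficiency gain. `parse_explicit` is a made-up helper name.

```python
import re

# Hypothetical port: (?:...) stands in for the atomic (?>...) groups of the original.
EXPLICIT_PATTERN = re.compile(
    r"""
    ^
    (?: (?P<track>\d{1,3}) \s*[\s-]\s* )?                    # optional track number
    (?: (?P<artist> \S+ (?:[ .-][^\s.-]*)*? ) \s*-\s* )?     # optional artist
    (?P<title> [^.]+ (?:\.[^.]*)*? )                         # title, may contain dots
    (?P<ext> \.m(?:p3|4a) )                                  # file extension
    """,
    re.VERBOSE,
)


def parse_explicit(filename):
    """Hypothetical helper: return a dict of the named groups, or None."""
    match = EXPLICIT_PATTERN.match(filename)
    return match.groupdict() if match else None


for name in [
    "Example-Artist - Example Song.mp3",
    "01 Example Artist - Song feat. Someone.m4a",
    "01 Example Song.mp3",
]:
    print(name, "->", parse_explicit(name))
```

Because the artist starts with `\S+`, a hyphen inside the first word (as in "Example-Artist") stays part of the artist, whereas the `.+?` version stops at the first hyphen it finds.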
Source: https://stackoverflow.com/questions/32288423/regex-to-pick-out-artist-name-and-song-title-issue-with-lazy-matching