Regular expression to match only the first file in a RAR file set

半阙折子戏 2021-01-16 00:39

To see what file to invoke the unrar command on, one needs to determine which file is the first in the file set.

Here are some sample file names, of which - naturall

4 answers
  • 2021-01-16 00:56

    Don't rely on the names of the files to determine which one is first. You're going to end up finding an edge case where you get the wrong file.

    RAR's headers will tell you which file is the first one in the volume set, assuming the archive was created with a reasonably recent version of RAR.

    HEAD_FLAGS (2 bytes), bit flags:

    0x0100 - First volume (set only by RAR 3.0 and later)

    So open each file and examine the RAR headers, looking specifically for the flag that marks the first volume. This works reliably as long as the archive isn't corrupt and was created by RAR 3.0 or later. I have run my own tests with spanning RAR archives, and their headers are correct according to the link above.

    This is a much, much safer way of determining which file is first in a set like this.
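
    The header check described above could be sketched in Ruby roughly as follows. This is only a sketch under stated assumptions: it assumes the older RAR 1.5-4.x file layout (a 7-byte marker block followed directly by the main archive header, whose type byte is 0x73), it does not handle the newer RAR5 format or self-extracting archives where the marker is not at offset 0, and the method names `first_volume_header?` and `first_volume_file?` are hypothetical.

```ruby
# Sketch for the RAR 1.5-4.x layout only (an assumption; RAR5 differs).
# Assumed layout: 7-byte marker, then HEAD_CRC (2 bytes), HEAD_TYPE (1 byte),
# HEAD_FLAGS (2 bytes, little-endian).
def first_volume_header?(bytes)
  marker = "Rar!\x1A\x07\x00".b           # signature of the older RAR format
  bytes  = bytes.to_s.b                   # tolerate nil from a short read
  return false unless bytes.bytesize >= 12 && bytes.byteslice(0, 7) == marker

  # v = 16-bit little-endian, C = 8-bit unsigned
  _crc, head_type, flags = bytes.byteslice(7, 5).unpack("vCv")
  return false unless head_type == 0x73   # 0x73 marks the main archive header

  (flags & 0x0100) != 0                   # 0x0100 = first volume (RAR 3.0+)
end

# Hypothetical convenience wrapper: reads only the bytes the check needs.
def first_volume_file?(path)
  first_volume_header?(File.binread(path, 12))
end
```

    With something like this, `Dir.glob("*.rar").select { |f| first_volume_file?(f) }` would list the candidate first volumes. Note that an archive that is not split into volumes may not carry the flag at all, so a real script would probably want a fallback for that case.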

  • 2021-01-16 00:57

    The short answer is that it's not possible to construct a single regex that solves your problem in Ruby 1.8. Its regex engine does not support lookbehind assertions (the (?<! construct in your example regex), which is why your regex doesn't work. This leaves you with two options.

    1) Use more than one regex to do it.

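    def is_first_rar(filename)
        # New-style volume names (name.partN.rar): first only if N is 1.
        if filename =~ /part(\d+)\.rar$/
            $1.to_i == 1
        else
            # Otherwise any plain .rar file is treated as the first volume.
            !(filename =~ /\.rar$/).nil?
        end
    end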
    

    2) Use Oniguruma, the regex engine of Ruby 1.9. It supports lookbehind assertions, and you can install it as a gem for Ruby 1.8. After that, you can do something like this:

    def is_first_rar(filename)
        reg = Oniguruma::ORegexp.new('.*(?:(?<!part\d\d\d|part\d\d|\d)\.rar|\.part0*1\.rar)')
        match = reg.match(filename)
        return match != nil
    end
    
  • 2021-01-16 00:59

    I am no regex expert, but here is my attempt:

    ^(yes|no)\.(rar|part0*1\.rar)$
    

    Replace "yes|no" with the actual file name. I tested it against your examples to check that it matches only the first file of each set, hence the "yes|no" in the regex.

    UPDATE: fixed as per the comment. I'm not sure why the user would not know the file name, so I did not change that part.
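
    If the base name is known at run time, the placeholder substitution could be done programmatically. A minimal sketch, assuming the base name is passed in without any ".rar" or ".partN.rar" suffix; `first_rar_regex` and `base_name` are hypothetical names, and `Regexp.escape` is used so that dots or other special characters in the name are matched literally:

```ruby
# base_name is a hypothetical parameter: the archive name without any
# ".rar" / ".partN.rar" suffix.
def first_rar_regex(base_name)
  # \A and \z anchor the whole string; part0*1 accepts part1, part01, part001, ...
  /\A#{Regexp.escape(base_name)}\.(?:rar|part0*1\.rar)\z/
end
```

    For example, `first_rar_regex("my.archive") =~ "my.archive.part01.rar"` matches, while "my.archive.part02.rar" does not.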

  • 2021-01-16 01:04

    Personally I wouldn't use (extended) regular expressions in this case (or at least not just one to do it all). What's wrong with coding this in, for example, a few ifs?
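
    A version with a few ifs might look like the sketch below. It assumes the usual "name.rar" and "name.partN.rar" naming schemes, since the sample file names are not shown here; the method name `first_rar_by_name?` is hypothetical, and other naming schemes would need extra branches.

```ruby
def first_rar_by_name?(filename)
  name = filename.downcase
  return false unless name.end_with?(".rar")

  stem  = name[0...-4]                    # drop the ".rar" suffix
  index = stem.rindex(".part")
  return true if index.nil?               # plain "name.rar"

  digits   = stem[(index + 5)..-1]        # whatever follows ".part"
  numbered = !digits.empty? && digits.delete("0-9").empty?
  return true unless numbered             # e.g. "my.partial.rar" is not a volume name

  digits.to_i == 1                        # "part1", "part01", "part001", ...
end
```

    Each rule is a separate line here, so an edge case found later can be handled by adding one more branch rather than by reworking a single large pattern.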
