Python: How to use RegEx in an if statement?

鱼传尺愫 2020-12-25 10:06

I have the following code which looks through the files in one directory and copies files that contain a certain string into another directory, but I am trying to use Regula

5 Answers
  • 2020-12-25 10:15

    The REPL makes it easy to learn APIs. Just run python, create an object and then ask for help:

    $ python
    >>> import re
    >>> help(re.compile(r''))
    

    at the command line shows, among other things:

    search(...)

    search(string[, pos[, endpos]]) --> match object or None. Scan through string looking for a match, and return a corresponding MatchObject instance. Return None if no position in the string matches.

    so you can do

    regex = re.compile(regex_txt, re.IGNORECASE)

    match = regex.search(content)  # content comes from your file-reading code
    if match is not None:
        ...  # use match, e.g. match.group(0)
    

    Incidentally,

    regex_txt = "facebook.com"
    

    has an unescaped . that matches any character, so re.compile("facebook.com").search("facebookkcom") is not None evaluates to True. You probably want this instead:

    regex_txt = r"(?i)facebook\.com"
    

    The \. matches a literal "." character instead of treating . as a special regular expression operator.

    The r"..." bit makes it a raw string, so the backslash in \. reaches the regular expression compiler unchanged instead of being interpreted as an escape by the Python parser.

    The (?i) makes the regex case-insensitive like re.IGNORECASE but self-contained.
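
As a quick check of the points above, the following minimal sketch compares the unescaped and escaped patterns. The sample strings are made up for illustration, and re.escape is shown only as one alternative way to build the literal pattern; it is not part of the original answer.

```python
import re

loose = re.compile("facebook.com")           # "." matches any character
strict = re.compile(r"(?i)facebook\.com")    # "\." matches only a literal dot

# Hypothetical sample inputs, chosen only to show the difference.
print(loose.search("facebookkcom") is not None)         # True: "." matched the extra "k"
print(strict.search("facebookkcom") is not None)        # False: no literal dot present
print(strict.search("Visit FACEBOOK.COM") is not None)  # True: (?i) ignores case

# re.escape builds the escaped form from a plain string.
escaped = re.escape("facebook.com")
print(escaped)  # facebook\.com
```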

  • 2020-12-25 10:21
    if re.match(regex, content) is not None:
        ...  # handle the match here
    

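    You could also use re.search, depending on how you want it to match: re.match only matches at the beginning of the string, while re.search finds a match anywhere in it.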
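
A small sketch of that difference, assuming a made-up content string and the escaped pattern from the earlier answer:

```python
import re

content = "please visit facebook.com today"  # hypothetical sample text
regex = r"facebook\.com"

# re.match only succeeds if the pattern matches at position 0.
at_start = re.match(regex, content)
# re.search scans the whole string for the first match.
anywhere = re.search(regex, content)

if at_start is not None:
    print("match at start")
if anywhere is not None:
    print("found at index", anywhere.start())
```

Here only the second branch runs, because the text does not begin with the pattern.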

  • 2020-12-25 10:23

    Regexes shouldn't really be used for this unless you need something more complicated than what you're trying to do. For instance, you could just normalise the case of your content string and compare it against a lowercase string:

    if 'facebook.com' in content.lower():
        shutil.copy(x, "C:/Users/David/Desktop/Test/MyFiles2")
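
For comparison, here is a minimal sketch that runs the plain substring test next to a regex test on the same text. The content string is a made-up stand-in for the file contents.

```python
import re

content = "Login via FaceBook.com or by email"  # hypothetical file contents

# Plain substring test: normalise the case once, then use "in".
found_plain = 'facebook.com' in content.lower()

# Regex test on the same text, for comparison.
found_regex = re.search(r'facebook\.com', content, re.IGNORECASE) is not None

print(found_plain, found_regex)  # True True
```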
    
  • 2020-12-25 10:28

    First you compile the regex, then you have to call match, search, or some other method on the compiled pattern to actually run it against some input.

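    import os
    import re
    import shutil

    regex_txt = r"facebook\.com"  # pattern to look for; "\." matches a literal dot

    def test():
        os.chdir("C:/Users/David/Desktop/Test/MyFiles")
        files = os.listdir(".")
        # exist_ok avoids an error when the target directory already exists
        os.makedirs("C:/Users/David/Desktop/Test/MyFiles2", exist_ok=True)
        pattern = re.compile(regex_txt, re.IGNORECASE)
        for x in files:
            with open(x, 'r') as input_file:
                for line in input_file:
                    if pattern.search(line):
                        shutil.copy(x, "C:/Users/David/Desktop/Test/MyFiles2")
                        break  # one matching line is enough; go to the next file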
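
The same loop can be sketched as a reusable function. This is only one possible arrangement: the function name copy_matching_files and its parameters are hypothetical, and skipping subdirectories and ignoring undecodable bytes are assumptions that the original code does not make.

```python
import os
import re
import shutil

def copy_matching_files(src_dir, dst_dir, regex_txt):
    """Copy every regular file in src_dir that has a line matching regex_txt."""
    pattern = re.compile(regex_txt, re.IGNORECASE)
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # assumption: subdirectories are skipped
        # assumption: undecodable bytes are ignored rather than aborting the scan
        with open(path, "r", errors="ignore") as input_file:
            matched = any(pattern.search(line) for line in input_file)
        if matched:
            shutil.copy(path, dst_dir)
            copied.append(name)
    return copied
```

With the paths from the question, a call might look like copy_matching_files("C:/Users/David/Desktop/Test/MyFiles", "C:/Users/David/Desktop/Test/MyFiles2", r"facebook\.com"); the return value lists the names of the copied files.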
    
  • 2020-12-25 10:35

    if re.search(r'pattern', string):

    Simple if-test:

    if re.search(r'ing\b', "seeking a great perhaps"):     # any words end with ing?
        print("yes")
    

    Pattern check, extract a substring, case insensitive:

    match_object = re.search(r'^OUGHT (.*) BE$', "ought to be", flags=re.IGNORECASE)
    if match_object:
        assert "to" == match_object.group(1)     # what's between ought and be?
    

    Notes:

    • Use re.search() not re.match. Match restricts to the start of strings, a confusing convention if you ask me. If you do want a string-starting match, use caret or \A instead, re.search(r'^...', ...)

    • Use raw string syntax r'pattern' for the first parameter. Otherwise you would need to double up backslashes, as in re.search('ing\\b', ...)

    • In this example, \b is a special sequence meaning word-boundary in regex. Not to be confused with backspace.

    • re.search() returns None if it doesn't find anything, which is always falsy.

    • re.search() returns a Match object if it finds anything, which is always truthy.

    • A group is what matched inside parentheses.

    • Group numbering starts at 1; group(0) is the entire match.

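
The notes above can be checked with a short sketch. The extra sample string "singer" is made up for illustration; everything else reuses the examples from this answer.

```python
import re

text = "seeking a great perhaps"

# A Match object is truthy and None is falsy, so the result works directly in "if".
print(bool(re.search(r'ing\b', text)))      # True: "seeking" ends with "ing"
print(bool(re.search(r'ing\b', "singer")))  # False: "ing" is not at a word end

# Raw and non-raw spellings of the same pattern.
print(r'ing\b' == 'ing\\b')                 # True
# Without the r prefix, "\b" is a backspace character, not a word boundary.
print('\b' == '\x08')                       # True

m = re.search(r'^OUGHT (.*) BE$', "ought to be", flags=re.IGNORECASE)
if m:
    print(m.group(0))  # ought to be  (group 0 is the entire match)
    print(m.group(1))  # to           (group 1 is the first parenthesised group)
```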
