Python + Regex: AttributeError: 'NoneType' object has no attribute 'groups'

傲寒 2020-11-30 06:57

I have a string from which I want to extract a substring. This is part of a larger Python script.

This is the string:

import re

htmlString = \'

        
2 Answers
  • 2020-11-30 07:18
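    # -*- coding: utf-8 -*-
    # Python 2: the source contains non-ASCII characters, so declare the file encoding.
    import re
    
    htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'
    
    SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'
    
    # Decode both byte strings to unicode so that \w (with re.U) also matches é and à.
    Result = re.search(SearchStr.decode('utf-8'), htmlString.decode('utf-8'), re.I | re.U)
    
    print Result.groups()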
    

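    It works this way. The string contains non-ASCII characters (é, à), so matching against the raw byte strings fails: without the Unicode flag, \w only matches ASCII letters, digits and the underscore. You have to decode both the pattern and the string into Unicode and pass the re.U (Unicode) flag.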

    I'm a beginner too and I faced that issue a couple of times myself.
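
For readers on Python 3, the same effect can be seen without any decoding, because str patterns are Unicode-aware by default. The following is a minimal sketch under that assumption; the sample text is taken from the string above, and the variable names are illustrative only. The re.ASCII flag is used here just to imitate the ASCII-only matching that the answer works around.

```python
import re

text = "Molt bé, gràcies."

# Python 3 str patterns: \w matches accented letters such as é and à by default.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so the accented letters split the words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

With ASCII-only matching, a character class such as [\w\s,]+ stops at the first accented letter, which is enough to make the whole pattern fail and re.search return None.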

  • 2020-11-30 07:26

    You are getting the AttributeError because you are calling groups() on None, which has no such method.

    re.search returning None means the regex found nothing matching the pattern in the supplied string.
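
The failure described above can be reproduced in a few lines. This is a small sketch assuming Python 3; the pattern and the input string are made up for illustration and are not from the question.

```python
import re

# The pattern asks for digits, but the text has none, so search() returns None.
result = re.search(r"\d+", "no digits here")
print(result)

try:
    result.groups()  # calling a method on None raises AttributeError
except AttributeError as exc:
    message = str(exc)
    print(message)
```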

    When using regular expressions, it is good practice to check whether a match was found:

    Result = re.search(SearchStr, htmlString)
    
    if Result:
        print Result.groups()
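
The check above could also be wrapped in a small helper so that callers always get either a tuple of groups or None. The sketch below assumes Python 3; extract_groups is a hypothetical name, and the two patterns are simplified examples rather than the pattern from the question.

```python
import re

def extract_groups(pattern, text, flags=0):
    """Return the captured groups, or None when the pattern does not match."""
    match = re.search(pattern, text, flags)
    if match is None:
        return None
    return match.groups()

html_string = "</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)"

# Matches: captures the text after <dt> up to the first '&' or '<'.
found = extract_groups(r"<dt>\s*([^&<]+)", html_string)

# Does not match: there is no <table> element in the string.
missing = extract_groups(r"<table>(.*)</table>", html_string)

print(found)
print(missing)
```

Returning None instead of raising keeps the decision about what to do with a non-match in the calling code.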
    