Extract text between two tags using regex in Ruby

长发绾君心 2021-01-27 04:22

Let's say I have this string, which contains an HTML a tag:

<a href="abgeordnete-1128-0----w8397.html" class="small_link">Berlin-Treptow-Köpenick</a>

3 answers
  • 2021-01-27 04:33

    I have made the assumption that the string to be extracted is comprised of alphanumeric characters--including accented letters--and hyphens, and that the string immediately follows the first instance of the character '>'.

    string =
    '<a href="abgeordnete-1128-0----w8397.html" class="small_link">Berlin-Treptow-Köpenick</a>'
    
    r = /
        (?<=\>)       # match '>' in a positive lookbehind
        [\p{Alnum}-]+ # match >= 1 alphanumeric characters or hyphens
        /x            # extended or free-spacing mode
    
    string[r] #=> "Berlin-Treptow-Köpenick"
    

    Note that /[A-Za-z0-9]/ does not match accented characters such as 'ö'.

    Alternatively, one can use the POSIX syntax:

    r = /(?<=\>)[[[:alnum:]]-]+/
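
    To make the point about accented characters concrete, here is a small sketch that runs an ASCII-only class next to the two Unicode-aware ones. The constant names are made up for this illustration, and the POSIX variant is written as [[:alnum:]-], a slightly simpler spelling than the one above.

```ruby
# Illustrative constants only; these names do not come from the answer above.
SAMPLE = '<a href="abgeordnete-1128-0----w8397.html" class="small_link">Berlin-Treptow-Köpenick</a>'

ASCII_ONLY = /(?<=>)[A-Za-z0-9-]+/  # ASCII letters, digits and hyphens
UNICODE    = /(?<=>)[\p{Alnum}-]+/  # Unicode property class
POSIX      = /(?<=>)[[:alnum:]-]+/  # POSIX bracket expression

p SAMPLE[ASCII_ONLY] #=> "Berlin-Treptow-K" (the match stops at 'ö')
p SAMPLE[UNICODE]    #=> "Berlin-Treptow-Köpenick"
p SAMPLE[POSIX]      #=> "Berlin-Treptow-Köpenick"
```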
    
  • 2021-01-27 04:35

    You could use:

    html = '<a href="abgeordnete-1128-0----w8397.html" class="small_link">Berlin-Treptow-Köpenick</a>'
    html.match(/>(.*)</)[1]
    #=> "Berlin-Treptow-Köpenick"
    

    When your HTML fragment gets more complex, I would recommend looking at a library such as Nokogiri.
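
    One way a fragment can become "more complex" is by containing a second tag. The sketch below uses a hypothetical two-link string (not from the question) to show that the greedy .* then captures too much, and that excluding '<' from the capture is one possible workaround for simple cases.

```ruby
# Hypothetical fragment with two links, used only to show the pitfall.
TWO_LINKS = '<a href="a.html">Berlin-Mitte</a> <a href="b.html">Berlin-Pankow</a>'

# Greedy: the capture runs from the first '>' to the last '<'.
p TWO_LINKS.match(/>(.*)</)[1]
#=> "Berlin-Mitte</a> <a href=\"b.html\">Berlin-Pankow"

# Excluding '<' from the capture keeps the match inside the first element.
p TWO_LINKS.match(/>([^<]*)</)[1]
#=> "Berlin-Mitte"
```

    This only covers flat, well-formed markup; nested tags or attributes containing '>' are where a real parser becomes the safer choice.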

  • 2021-01-27 04:45
    string = '<a href="abgeordnete-1128-0----w8397.html" class="small_link">Berlin-Treptow-Köpenick</a>'
    
    string.scan(/<a[^>]*>(.+?)<\/a>/).flatten
    #=> ["Berlin-Treptow-Köpenick"]
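
    Because scan returns every match, the same approach collects the text of several links at once. A minimal sketch, assuming flat markup without nested tags: link_texts is a hypothetical helper name, the sample markup is invented, and the \b is an added variation so that tags such as <abbr> are not mistaken for <a>.

```ruby
# Hypothetical helper for illustration; not part of the answer above.
# Returns the inner text of every <a>...</a> element found in html.
def link_texts(html)
  # \b requires a word boundary after "a", so "<abbr>" does not match.
  html.scan(/<a\b[^>]*>(.+?)<\/a>/).flatten
end

p link_texts('<p><a href="x.html">Berlin-Mitte</a>, <a href="y.html">Berlin-Pankow</a></p>')
#=> ["Berlin-Mitte", "Berlin-Pankow"]
```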
    