How to simulate negative lookbehind in Go

心在旅途 2020-12-20 15:23

I'm trying to write a regex that can extract a command. Here's what I've got so far using a negative lookbehind assertion:



        
2 Answers
  • 2020-12-20 15:45

    You can actually match the preceding character (or the beginning of line) and use a group to get the desired text in a subexpression.

    Regex

    (?:^|[^@#/])\b(\w+)
    
    • (?:^|[^@#/]) Matches either ^ (the beginning of the line) or [^@#/] (any single character except @, # or /)
    • \b A word boundary, asserting the beginning of a word
    • (\w+) A capturing subexpression
      • that matches \w+, one or more word characters

    Code

    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        cmds := []string{
            `/msg @nickname #channel foo bar baz`,
            `#channel @nickname foo bar baz /foo`,
            `foo bar baz @nickname #channel`,
            `foo bar baz#channel`,
        }

        // Compile the pattern once; MustCompile panics if it is invalid
        regex := regexp.MustCompile(`(?:^|[^@#/])\b(\w+)`)

        // Loop all cmds
        for _, cmd := range cmds {
            // Find all matches and subexpressions
            matches := regex.FindAllStringSubmatch(cmd, -1)

            fmt.Printf("`%v` \t==>\n", cmd)

            // Loop all matches
            for n, match := range matches {
                // match[1] holds the text matched by the first subexpression (1st set of parentheses)
                fmt.Printf("\t%v. `%v`\n", n, match[1])
            }
        }
    }
    

    Output

    `/msg @nickname #channel foo bar baz`   ==>
        0. `foo`
        1. `bar`
        2. `baz`
    `#channel @nickname foo bar baz /foo`   ==>
        0. `foo`
        1. `bar`
        2. `baz`
    `foo bar baz @nickname #channel`    ==>
        0. `foo`
        1. `bar`
        2. `baz`
    `foo bar baz#channel`   ==>
        0. `foo`
        1. `bar`
        2. `baz`
    

    Playground
    http://play.golang.org/p/AaX9Cg-7Vx
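
One detail worth seeing in isolation: because the leading group consumes the preceding character instead of merely asserting it, the full match (index 0) includes that character, and only the subexpression (index 1) is the bare word. The sketch below illustrates this with the same pattern; `wordRe` and `plainWords` are hypothetical names introduced here, not part of the answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordRe is the pattern from the answer above, unchanged.
var wordRe = regexp.MustCompile(`(?:^|[^@#/])\b(\w+)`)

// plainWords is a hypothetical helper: it returns capture group 1 of every
// match, i.e. the words without the character consumed in front of them.
func plainWords(cmd string) []string {
	var out []string
	for _, m := range wordRe.FindAllStringSubmatch(cmd, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	for _, m := range wordRe.FindAllStringSubmatch("/msg foo bar", -1) {
		// m[0] is the whole match, m[1] is the first subexpression
		fmt.Printf("full=%q word=%q\n", m[0], m[1])
	}
	// prints:
	// full=" foo" word="foo"
	// full=" bar" word="bar"

	fmt.Println(plainWords("/msg @nickname #channel foo bar baz"))
}
```

With a real lookbehind the two would be identical; here you should always read the subexpression rather than the full match.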

  • 2020-12-20 15:49

    Since your negative lookbehind only uses a simple character set, you can replace it with a negated character set:

    \b[^@#/]\w.*
    

    If a match is also allowed at the start of the string, add the ^ anchor as an alternative:

    (?:^|[^@#\/])\b\w.*
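
As a rough sketch of how that second pattern could be used from Go: the version below wraps the `\w.*` part in a capture group, which is an assumption of this example and not part of the pattern above, so that the consumed leading character can be left out of the result. `tailRe` and `commandTail` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// tailRe is the pattern above with one change that is an assumption of this
// sketch: the trailing part is wrapped in a capture group.
var tailRe = regexp.MustCompile(`(?:^|[^@#\/])\b(\w.*)`)

// commandTail is a hypothetical helper. It returns everything from the first
// word that is not prefixed by @, # or / up to the end of the line.
func commandTail(line string) (string, bool) {
	m := tailRe.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	tail, ok := commandTail("/msg @nickname #channel foo bar baz")
	fmt.Printf("%q %v\n", tail, ok) // prints "foo bar baz" true
}
```

Note that `.*` is greedy and runs to the end of the line, so any later prefixed words (for example a trailing `/foo`) stay in the result.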
    

    Based on the samples in the Go Playground link in your question, I think you want to filter out all words that begin with one of the characters #, @ or /. You can use a filter function:

    // Filter returns the elements of vs for which f reports true.
    func Filter(vs []string, f func(string) bool) []string {
        vsf := make([]string, 0)
        for _, v := range vs {
            if f(v) {
                vsf = append(vsf, v)
            }
        }
        return vsf
    }
    

    and a Process function, which makes use of the filter above:

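    // Process drops every space-separated token that starts with #, @ or /.
    func Process(inp string) string {
        t := strings.Split(inp, " ")
        t = Filter(t, func(x string) bool {
            // Keep x only if it does not begin with one of the marker characters
            return !strings.HasPrefix(x, "#") &&
                !strings.HasPrefix(x, "@") &&
                !strings.HasPrefix(x, "/")
        })
        return strings.Join(t, " ")
    }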
    

    It can be seen in action on playground at http://play.golang.org/p/ntJRNxJTxo
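
For a self-contained program to run locally, here is a condensed variant of the Filter/Process pair above. It is a sketch and not the answer's code: `stripPrefixed` is a hypothetical name, and it swaps `strings.Split` for `strings.Fields`, so runs of whitespace are collapsed instead of preserved.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPrefixed is a hypothetical, condensed variant of Process: it keeps
// only the whitespace-separated tokens that do not start with #, @ or /.
func stripPrefixed(inp string) string {
	var kept []string
	for _, tok := range strings.Fields(inp) {
		// IndexAny == 0 means the very first character is one of "#@/"
		if strings.IndexAny(tok, "#@/") != 0 {
			kept = append(kept, tok)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(stripPrefixed("/msg @nickname #channel foo bar baz")) // foo bar baz
	fmt.Println(stripPrefixed("foo bar baz#channel"))                 // foo bar baz#channel
}
```

The second line of output shows where this approach differs from the regex one: a token such as `baz#channel` is kept whole because only its first character is checked, whereas the word-based regex yields `baz` and skips `channel`.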
