Regex to find last occurrence of pattern in a string

一个人的身影 2021-01-14 00:02

My string being of the form:

"as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

I only want to match against the last segment of whitespace…

4 Answers
  • 2021-01-14 00:31

    You can try like so:

    (\s+)(?=\.[^.]+$)
    

    (?=\.[^.]+$) is a positive lookahead that requires a dot followed by one or more non-dot characters up to the end of the line.

    Demo:

    https://regex101.com/r/k9VwC6/3
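
    As a rough illustration, the pattern can be tried in Python with the built-in re module. This is only a sketch on the sample string from the question; the variable names are made up for the example.

```python
import re

# Sample string from the question.
text = "as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

# Whitespace run that is directly followed by the final ".xxx" segment.
pattern = re.compile(r"(\s+)(?=\.[^.]+$)")

match = pattern.search(text)
if match:
    # The lookahead is zero-width, so the match holds only the whitespace.
    print(repr(match.group(1)))       # '   '
    print(repr(text[match.end():]))   # '.COM'
```

    Because the lookahead consumes nothing, the whole match and group 1 are the same three spaces.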

  • 2021-01-14 00:45

    In a general case, you can match the last occurrence of any pattern using the following scheme:

    pattern(?![\s\S]*pattern)
    (?s)pattern(?!.*pattern)
    pattern(?!(?s:.*)pattern)
    

    where [\s\S]* matches zero or more of any character, as many as possible. In regex engines that support these constructs, (?s) and (?s:.) can be used instead so that . matches any character, including line breaks.
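
    A small Python sketch of the three variants, assuming Python's re module (which supports both (?s) and the scoped (?s:...) form). The helper name last_match_variants and the extra (?:...) wrapper around the sub-pattern are assumptions added for the example, not part of the scheme itself.

```python
import re

def last_match_variants(pattern):
    """Build the three 'last occurrence' regexes for a given sub-pattern."""
    p = f"(?:{pattern})"  # group the sub-pattern so alternations stay intact
    return [
        rf"{p}(?![\s\S]*{p})",   # character-class form, no flag needed
        rf"(?s){p}(?!.*{p})",    # DOTALL enabled for the whole regex
        rf"{p}(?!(?s:.*){p})",   # DOTALL scoped to the lookahead only
    ]

# Multi-line input, so the lookahead has to cross line breaks.
text = "id=1\nid=22\nid=333 end"
results = [re.search(rx, text).group(0) for rx in last_match_variants(r"\d+")]
print(results)  # ['333', '333', '333']
```

    All three forms should agree; which one to use mostly depends on what the target regex engine supports.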

    In this case, rather than \s+(?![\s\S]*\s), you may use

    \s+(?!\S*\s)
    

    See the regex demo. Note that \s and \S are complementary classes, so [\s\S]* is unnecessary here; \S* is enough.

    Details:

    • \s+ - one or more whitespace chars
    • (?!\S*\s) - that are not followed by zero or more non-whitespace chars and then a whitespace char, i.e. no whitespace occurs anywhere later in the string.
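
    To check that the shorter lookahead behaves like the general one, here is a minimal Python sketch on the question's sample string (variable names are illustrative):

```python
import re

text = "as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

# Shortened form: after the match, no further whitespace may follow.
short = re.search(r"\s+(?!\S*\s)", text)
# General form from the scheme above, for comparison.
general = re.search(r"\s+(?![\s\S]*\s)", text)

print(repr(short.group(0)))            # '   '
print(short.span() == general.span())  # True
```

    Note that this finds the last whitespace run anywhere in the string, whether or not a dot follows it.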
  • 2021-01-14 00:53
    "as.asd.sd ffindMyLastOccurrencedsfs. dfindMyLastOccurrencefsd  d.sdfsd. sdfsdf sd   ..COM"
    
    .*(?=((?<=\S)\s+)).*
    
    replaced by `>\1<`
    
    >   <
    

    A more general example:

    "as.asd.sd ffindMyLastOccurrencedsfs. dfindMyLastOccurrencefsd  d.sdfsd. sdfsdf sd   ..COM"
    
    .*(?=(findMyLastOccurrence|(?<=\S)\s+|(?<=[^\.])\.+)).*
    
    replaced by `>\1<`
    
    >..<
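
    Both replacements can be reproduced with Python's re.sub. This is only a sketch: the answer does not say which tool it used, so Python is an assumption here, and the variable names are invented for the example.

```python
import re

text = ("as.asd.sd ffindMyLastOccurrencedsfs. "
        "dfindMyLastOccurrencefsd  d.sdfsd. sdfsdf sd   ..COM")

# Last whitespace run only.
whitespace_only = r".*(?=((?<=\S)\s+)).*"
# Last of: the literal needle, a whitespace run, or a run of dots.
generalized = r".*(?=(findMyLastOccurrence|(?<=\S)\s+|(?<=[^\.])\.+)).*"

print(re.sub(whitespace_only, r">\1<", text))  # >   <
print(re.sub(generalized, r">\1<", text))      # >..<
```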
    

    Explanation:

    Part 1 .*

    • is greedy: it consumes as much as possible while a needle can still be found right after it. It therefore skips past every earlier needle occurrence and stops just before the very last one.

    edit to add:

    • if we are interested in the first hit instead, we can make the quantifier lazy by writing .*?

    Part 2 (?=(findMyLastOccurrence|(?<=\S)\s+|(?<=[^\.])\.+|(?<=NotNeedlePart)NeedlePart+))

    • defines the 'break' condition for the greedy 'find-all'. It consists of several parts:
      (?=(needles))
      • positive lookahead: ensures that everything matched so far is followed by one of the needles findMyLastOccurrence|(?<=\S)\s+|(?<=[^\.])\.+|(?<=NotNeedlePart)NeedlePart+
      • the needles are the alternatives we are looking for; each needle is itself a pattern.
      • If we look for a run of whitespace, dots, or other needle parts, the pattern is actually: something that is not a needle part, followed by one or more needle parts (hence the + on the needle part). See the examples: whitespace \s negated with \S, and a literal dot \. negated with [^.]

    Part 3 .*

    • as we aren't interested in the remainder, we match it but don't use it any further. We could capture it with parentheses and use it as another group, but that's out of scope here
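
    The three parts can be tried out in Python as follows. This is a hedged sketch using only the whitespace needle; the names needle, last, first and remainder are illustrative and not part of the answer.

```python
import re

text = ("as.asd.sd ffindMyLastOccurrencedsfs. "
        "dfindMyLastOccurrencefsd  d.sdfsd. sdfsdf sd   ..COM")

needle = r"(?<=\S)\s+"  # a whitespace run preceded by a non-whitespace char

# Part 1 greedy (.*): the lookahead settles on the LAST needle.
last = re.sub(rf".*(?=({needle})).*", r">\1<", text)
# Part 1 lazy (.*?): the lookahead settles on the FIRST needle.
first = re.sub(rf".*?(?=({needle})).*", r">\1<", text)

# Part 3 captured as group 2: the lookahead consumes nothing,
# so the remainder starts with the needle itself.
remainder = re.search(rf".*(?=({needle}))(.*)", text).group(2)

print(last)             # >   <
print(first)            # > <
print(repr(remainder))  # '   ..COM'
```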
  • 2021-01-14 00:54

    You can try this. It captures the last whitespace segment in the first capture group.

    (\s+)\.[^\.]*$
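
    A quick Python check of this pattern on the question's sample string (a sketch only; variable names are illustrative). Unlike a lookahead-based pattern, the overall match here also consumes the trailing dot segment, so the whitespace has to be read from group 1.

```python
import re

text = "as.asd.sd fdsfs. dfsd  d.sdfsd. sdfsdf sd   .COM"

match = re.search(r"(\s+)\.[^\.]*$", text)
if match:
    print(repr(match.group(1)))  # '   '      (the whitespace segment)
    print(repr(match.group(0)))  # '   .COM'  (the whole match)
```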
    