Remove any non-digit only in first N characters

攒了一身酷 2021-02-14 17:16

I'm looking for a regular expression to catch all non-digit characters in the first 7 characters of a string.

This string has 12 characters:

A12B345CD678


        
3 Answers
  • 2021-02-14 17:43

    The regex solution is cool, but I'd use something easier to read for maintainability. E.g.

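    library(stringr)
    
    s <- "A12B345CD678"
    # Strip uppercase letters from the first 7 characters only, then assign the result back in place
    str_sub(s, 1, 7) <- gsub('[A-Z]', '', str_sub(s, 1, 7))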
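
    For comparison, the same idea can be written in base R without stringr. This is a minimal sketch rather than part of the answer above: strip_prefix_nondigits is a hypothetical helper name, n is an assumed parameter for the prefix length, and \\D (any non-digit) is swapped in for [A-Z], so it also drops lowercase letters and punctuation.

    ```r
    # Hypothetical helper: remove non-digits from the first n characters only
    strip_prefix_nondigits <- function(s, n = 7) {
      head <- substr(s, 1, n)             # the first n characters
      tail <- substr(s, n + 1, nchar(s))  # everything after them (may be empty)
      paste0(gsub("\\D", "", head), tail) # clean the head, keep the tail untouched
    }

    strip_prefix_nondigits("A12B345CD678")
    ## => [1] "12345CD678"
    ```

    Splitting the string first keeps the pattern trivial, at the cost of two substr calls. Since substr, nchar and paste0 are vectorized, the helper should also work on a character vector.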
    
  • 2021-02-14 17:52

    You can also use a simple negative lookbehind:

    s <- "A12B345CD678"
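    # Remove a non-digit only when it is NOT preceded by 7 characters,
    # i.e. only when it sits within the first 7 positions
    gsub("(?<!.{7})\\D", "", s, perl = TRUE)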
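
    To make the prefix length configurable, the pattern can be built with sprintf. The following is a sketch under assumptions: remove_nondigits_in_prefix and its n argument are hypothetical names introduced here, not something from the answer.

    ```r
    # Hypothetical wrapper: build the negative-lookbehind pattern for any prefix length n
    remove_nondigits_in_prefix <- function(s, n = 7) {
      pattern <- sprintf("(?<!.{%d})\\D", n)  # n = 7 gives "(?<!.{7})\\D"
      gsub(pattern, "", s, perl = TRUE)       # lookbehind needs the PCRE engine
    }

    remove_nondigits_in_prefix(c("A12B345CD678", "ABCDEFGHIJ"))
    ## => [1] "12345CD678" "HIJ"
    ```

    The lookbehind is evaluated against the original string, so deleting earlier characters does not shift which positions count as the first n.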
    
  • 2021-02-14 17:53

    You can use the well-known SKIP-FAIL regex trick: match and discard the rest of the string starting from the 8th character (located with a lookbehind), so that non-digit characters are only matched within the first 7:

    s <- "A12B345CD678"
    gsub("(?<=.{7}).*$(*SKIP)(*F)|\\D", "", s, perl = TRUE)
    ## => [1] "12345CD678"
    


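    The perl argument must be set to TRUE for this regex to work, because lookbehinds and the (*SKIP)(*F) verbs are PCRE features that R's default regex engine does not support. The regex breakdown: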

    • (?<=.{7}).*$(*SKIP)(*F) - matches zero or more characters other than a newline, as many as possible (.*), up to the end of the string ($), but only from a position that is preceded by 7 characters (this is set by the lookbehind (?<=.{7})). Add (?s) at the beginning if the input contains newline symbols; \\z might also be needed instead of $ if the input ends with a newline. The (*SKIP)(*F) verbs make the engine discard the whole matched text and resume searching at the position where that text ends.
    • | - or...
    • \\D - a non-digit character.
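
    The newline caveat from the breakdown can be checked with a small experiment. This is a sketch on an assumed input, s_nl, chosen so that a newline falls inside the first 7 characters; the variable names are made up for illustration.

    ```r
    # Assumed test input: the newline is the 3rd of the first 7 characters
    s_nl <- "A1\nB2C3D4E5"

    # Pattern as given: "." cannot match "\n", so the lookbehind does not
    # count 7 characters at the 8th position and the skip branch starts late
    without_s <- gsub("(?<=.{7}).*$(*SKIP)(*F)|\\D", "", s_nl, perl = TRUE)

    # With (?s), "." also matches "\n", so the first 7 characters are counted as intended
    with_s <- gsub("(?s)(?<=.{7}).*$(*SKIP)(*F)|\\D", "", s_nl, perl = TRUE)

    with_s
    ## => [1] "123D4E5"
    ```

    Without (?s), letters beyond the 7th character are also removed for this input, which is why the flag matters for multi-line strings.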

