remove (non-breaking) space character in string

心在旅途 2021-01-21 22:19

This question seems to make it easy to remove space characters in a string in R. However, when I load the following table, I'm not able to remove a space between two numbers (e.g.

1 answer
  • 2021-01-21 23:05

    You can shorten the creation of test to just 2 steps, using a single PCRE regex (note the perl=TRUE argument):

    test = sub(",", ".", gsub("(*UCP)[\\s\\p{L}]+|\\W+$", "", area_cult10$V5, perl=TRUE), fixed=TRUE)
    

    Result:

     [1] "11846.4" "6529.2"  "3282.7"  "616.0"   "1621.8"  "125.7"   "14.2"   
     [8] "401.6"   "455.5"   "11.7"    "160.4"   "79.1"    "37.6"    "29.6"   
    [15] ""        "13.9"    "554.1"   "236.7"   "312.8"   "4.6"     "136.9"  
    [22] "1374.4"  "1332.3"  "1281.8"  "3.7"     "5.0"     "18.4"    "23.4"   
    [29] "42.0"    "2746.2"  "106.6"   "2100.4"  "267.8"   "258.4"   "13.1"   
    [36] "23.5"    "11.6"    "310.2"  
    

    The gsub regex is worth special attention:

    • (*UCP) - a PCRE verb that makes the pattern Unicode-aware, so that \\s also matches a non-breaking space
    • [\\s\\p{L}]+ - matches 1+ whitespace or letter characters
    • | - or (the alternation operator)
    • \\W+$ - matches 1+ non-word characters at the end of the string.
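
To see each part of the pattern at work, here is a minimal sketch on made-up input. The vector x is hypothetical and is not the original area_cult10$V5 column; it only assumes values shaped like those in the question (a non-breaking space as a thousands separator, a trailing unit, trailing punctuation).

```r
# Hypothetical sample values (NOT the original area_cult10$V5 data):
# a non-breaking space (U+00A0), a regular space plus a unit, and trailing junk.
x <- c("11\u00a0846,4", "6 529,2 ha", "125,7 *")

# (*UCP) makes \s and \p{L} Unicode-aware, so U+00A0 counts as whitespace.
# The first alternative strips whitespace and letters anywhere in the string;
# the second strips non-word characters only at the end of the string.
cleaned <- gsub("(*UCP)[\\s\\p{L}]+|\\W+$", "", x, perl = TRUE)
print(cleaned)
# [1] "11846,4" "6529,2"  "125,7"
```

The decimal comma survives because it is neither whitespace nor a letter, and it is not at the end of the string.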

    Then, sub(",", ".", x, fixed=TRUE) replaces the first , with a . and treats both as literal strings. fixed=TRUE improves performance because no regex has to be compiled.
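
The sub() step can be sketched on its own as follows. The input vector is hypothetical, and the final as.numeric() call is an assumption about the end goal; the original answer stops at the character result.

```r
# Hypothetical cleaned strings that still use a decimal comma.
cleaned <- c("11846,4", "6529,2", "1,2,3")

# fixed = TRUE treats "," as a literal string rather than a regex;
# sub() replaces only the first match in each element.
converted <- sub(",", ".", cleaned, fixed = TRUE)
print(converted)
# [1] "11846.4" "6529.2"  "1.2,3"

# Assumption: the values are eventually needed as numbers.
# "1.2,3" is not a valid number, so it becomes NA (warning suppressed here).
values <- suppressWarnings(as.numeric(converted))
```

The third element shows why sub() rather than gsub() matters here: only the first comma is replaced, so a string with several commas does not silently become a different number.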
