This question seems to make it easy to remove space characters in a string in R. However, when I load the following table I'm not able to remove a space between two numbers (eg.
You may shorten the test creation to just 2 steps, using just 1 PCRE regex (note the perl=TRUE parameter):
test = sub(",", ".", gsub("(*UCP)[\\s\\p{L}]+|\\W+$", "", area_cult10$V5, perl=TRUE), fixed=TRUE)
Result:
[1] "11846.4" "6529.2" "3282.7" "616.0" "1621.8" "125.7" "14.2"
[8] "401.6" "455.5" "11.7" "160.4" "79.1" "37.6" "29.6"
[15] "" "13.9" "554.1" "236.7" "312.8" "4.6" "136.9"
[22] "1374.4" "1332.3" "1281.8" "3.7" "5.0" "18.4" "23.4"
[29] "42.0" "2746.2" "106.6" "2100.4" "267.8" "258.4" "13.1"
[36] "23.5" "11.6" "310.2"
The gsub regex is worth special attention:
(*UCP) - the PCRE verb that makes the pattern Unicode-aware
[\\s\\p{L}]+ - matches 1+ whitespace or letter characters
| - or (an alternation operator)
\\W+$ - 1+ non-word chars at the end of the string

Then, sub(",", ".", x, fixed=TRUE) replaces the first comma (,) with a period (.), treating both as literal strings. fixed=TRUE saves performance since no regex has to be compiled.
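
To see the two steps in isolation, here is a minimal sketch on a small made-up vector. The values in x are hypothetical and only stand in for area_cult10$V5. The first element assumes that the stubborn "space" is a non-breaking space (U+00A0), which is a common cause of this problem but is not confirmed by the question.

```r
# Hypothetical sample values (not taken from the original table):
#  - a non-breaking space (U+00A0) used as a thousands separator
#  - a regular space plus a trailing unit
#  - trailing non-word characters
x <- c("11\u00a0846,4", "6 529,2 ha", "616,0 *")

# Step 1: drop whitespace and letters anywhere, plus trailing non-word chars.
# (*UCP) makes \s and \p{L} follow Unicode properties, so U+00A0 counts as whitespace.
cleaned <- gsub("(*UCP)[\\s\\p{L}]+|\\W+$", "", x, perl = TRUE)

# Step 2: swap the first decimal comma for a period as a literal replacement.
result <- sub(",", ".", cleaned, fixed = TRUE)

print(result)
# [1] "11846.4" "6529.2"  "616.0"
```

If numeric values are needed afterwards, as.numeric(result) can be applied to the cleaned strings.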