What is the most efficient way to apply gsub to various columns?
The following does not work:
x1=c("10%","20%","30%")
x2=c("60%","50%",
We can unlist the per_col columns, remove the "%" symbol, and convert the result to numeric.
x[per_col] <- as.numeric(gsub("%","", unlist(x[per_col])))
# sub would also be enough here, since each value contains only one % to replace
#x[per_col] <- as.numeric(sub("%","", unlist(x[per_col])))
x
# x1 x2 x3
#1 10 60 1
#2 20 50 2
#3 30 40 3
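For a self-contained run of the unlist approach, here is a minimal sketch. The definition of per_col is not shown above, so the value c("x1", "x2") is an assumption, as is stringsAsFactors = FALSE (used so the columns are plain character vectors).

```r
# Sample data matching the output shown above
x <- data.frame(
  x1 = c("10%", "20%", "30%"),
  x2 = c("60%", "50%", "40%"),
  x3 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Assumption: per_col names the columns that hold percentages
per_col <- c("x1", "x2")

# unlist() flattens the selected columns into one character vector,
# column by column; the assignment fills the columns back in the same order
x[per_col] <- as.numeric(gsub("%", "", unlist(x[per_col]), fixed = TRUE))

str(x)
```

fixed = TRUE treats "%" as a literal string rather than a regular expression, which is all that is needed here.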
To add to docendo discimus' answer, here is an extension that handles non-adjacent columns and returns a data.frame:
x1 <- c("10%", "20%", "30%")
x2 <- c("60%", "50%", "40%")
x3 <- c(1, 2, 3)
x4 <- c("60%", "50%", "40%")
x <- data.frame(x1, x2, x3, x4)
x[, c(1:2, 4)] <- as.data.frame(apply(x[,c(1:2, 4)], 2,
function(x) {
as.numeric(gsub("%", "", x))}
))
> x
x1 x2 x3 x4
1 10 60 1 60
2 20 50 2 50
3 30 40 3 40
> class(x)
[1] "data.frame"
You can use apply to apply it to the whole data.frame:
apply(x, 2, function(y) as.numeric(gsub("%", "", y)))
x1 x2 x3
[1,] 10 60 1
[2,] 20 50 2
[3,] 30 40 3
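Note that apply coerces the data.frame to a matrix and returns a matrix, which is why the rows above print as [1,], [2,], [3,]. A small sketch to confirm this on the same sample data (stringsAsFactors = FALSE is an assumption, and res and res_df are illustrative names):

```r
x <- data.frame(
  x1 = c("10%", "20%", "30%"),
  x2 = c("60%", "50%", "40%"),
  x3 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

res <- apply(x, 2, function(y) as.numeric(gsub("%", "", y)))

is.matrix(res)       # the result is a numeric matrix
is.data.frame(res)   # and not a data.frame

# Wrap the result in as.data.frame() if a data.frame is needed downstream
res_df <- as.data.frame(res)
```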
The first answer works, but be careful if your data.frame contains string columns: @docendo discimus's answer will return NAs for them.
If you want to keep the content of your columns as strings, just remove the as.numeric and convert the result into a data frame afterwards:
as.data.frame(apply(x, 2, function(y) gsub("%", "", y)))
x1 x2 x3
1 10 60 1
2 20 50 2
3 30 40 3
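To make the NA caveat concrete, here is a minimal sketch with a hypothetical data.frame df that has one genuinely textual column called name. Both the data and the variable names are assumptions for illustration only.

```r
# Hypothetical data: one percentage column and one plain string column
df <- data.frame(
  x1 = c("10%", "20%"),
  name = c("a", "b"),
  stringsAsFactors = FALSE
)

# With as.numeric, the string column cannot be parsed and becomes NA
# (suppressWarnings hides the "NAs introduced by coercion" warning)
num <- suppressWarnings(
  apply(df, 2, function(y) as.numeric(gsub("%", "", y)))
)

# Without as.numeric, every column stays character and nothing is lost
chr <- as.data.frame(
  apply(df, 2, function(y) gsub("%", "", y)),
  stringsAsFactors = FALSE
)
```

The trade-off is that in chr the x1 column is still character, so it has to be converted separately if numbers are needed.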
Or, you could try the lapply solution:
as.data.frame(lapply(x, function(y) gsub("%", "", y)))
x1 x2 x3
1 10 60 1
2 20 50 2
3 30 40 3
To clean the % out, you can do:
x[per_col] <- lapply(x[per_col], function(y) as.numeric(gsub("%", "", y)))
x
x1 x2 x3
1 10 60 1
2 20 50 2
3 30 40 3
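The same lapply idea can be wrapped in a small function for reuse. This is only a sketch: strip_percent is a hypothetical helper name, and per_col is assumed to be c("x1", "x2") since its definition is not shown above.

```r
# Hypothetical helper: strip "%" from the given columns and convert to numeric
strip_percent <- function(df, cols) {
  df[cols] <- lapply(df[cols], function(y) {
    as.numeric(gsub("%", "", y, fixed = TRUE))
  })
  df
}

x <- data.frame(
  x1 = c("10%", "20%", "30%"),
  x2 = c("60%", "50%", "40%"),
  x3 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Assumption: per_col names the columns that hold percentages
per_col <- c("x1", "x2")

x <- strip_percent(x, per_col)
sapply(x, class)
```

Because lapply works column by column, the data.frame is never coerced to a matrix, and columns outside per_col are left untouched.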