Question
I have a variable, and for some reason R has added an extra "X" at the beginning of each value. Is this a common occurrence that I could have avoided?
Anyhow, below is my data (currently the variable is stored in a list):
X1
X5
X33
X37
...
> str(rc1_output)
chr [1:63, 1:3] "X1" "X5" "X33" "X37" "X52" "X645" "X646" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:63] "X1" "X5" "X33" "X37" ...
..$ : chr [1:3] "" "Entropy" "Subseq."
> dput(head(rc1_output))
structure(c("X1", "X5", "X33", "X37", "X52", "X645", "0", "0",
"0", "0", "0", "0", "0.256010845762264", "0.071412419435563",
"0.071412419435563", "0.071412419435563", "0.071412419435563",
"0.071412419435563"), .Dim = c(6L, 3L), .Dimnames = list(c("X1",
"X5", "X33", "X37", "X52", "X645"), c("", "Entropy", "Subseq."
)))
How can I loop through all rows of the variable and remove the X?
Answer 1:
Try substr or gsub:
x <- c("X1", "X354", "X234", "X2134")
substr(x, 2, nchar(x))
# [1] "1" "354" "234" "2134"
gsub("^X", "", x)
# [1] "1" "354" "234" "2134"
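The two calls above agree only because every element starts with "X". As a small sketch of the difference, assuming a hypothetical vector in which one element has no prefix: substr removes the first character whatever it is, while an anchored pattern removes only a leading "X". Since the pattern can match at most once per string, sub is sufficient here and behaves the same as gsub.

```r
# Hypothetical input: the middle element has no "X" prefix
x_mixed <- c("X1", "5", "X33")

# substr() drops the first character regardless of what it is,
# so "5" is reduced to an empty string
by_position <- substr(x_mixed, 2, nchar(x_mixed))
print(by_position)
# [1] "1"  ""   "33"

# sub() with the anchored pattern "^X" only touches a leading "X"
by_pattern <- sub("^X", "", x_mixed)
print(by_pattern)
# [1] "1"  "5"  "33"
```

If there is any chance that some values lack the prefix, the pattern-based version is the safer choice.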
Update
It looks like just the first column (which is unnamed) and the rownames
are affected. The same general approach applies:
> rc1_output[, 1] <- gsub("^X", "", rc1_output[, 1])
> rc1_output
           Entropy Subseq.
X1   "1"   "0"     "0.256010845762264"
X5   "5"   "0"     "0.071412419435563"
X33  "33"  "0"     "0.071412419435563"
X37  "37"  "0"     "0.071412419435563"
X52  "52"  "0"     "0.071412419435563"
X645 "645" "0"     "0.071412419435563"
Repeat the process for rownames(rc1_output) if required, like this:
rownames(rc1_output) <- gsub("^X", "", rownames(rc1_output))
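Both steps can be run end to end on the six-row sample from the dput output in the question. This is only a sketch: strip_x is a hypothetical helper name introduced here for illustration, not a base R function.

```r
# Rebuild the six-row sample from the dput() output in the question
rc1_output <- structure(
  c("X1", "X5", "X33", "X37", "X52", "X645",
    "0", "0", "0", "0", "0", "0",
    "0.256010845762264", "0.071412419435563", "0.071412419435563",
    "0.071412419435563", "0.071412419435563", "0.071412419435563"),
  .Dim = c(6L, 3L),
  .Dimnames = list(c("X1", "X5", "X33", "X37", "X52", "X645"),
                   c("", "Entropy", "Subseq."))
)

# Hypothetical helper: remove a single leading "X" from each string
strip_x <- function(v) sub("^X", "", v)

# Clean the unnamed first column, then the row names
rc1_output[, 1] <- strip_x(rc1_output[, 1])
rownames(rc1_output) <- strip_x(rownames(rc1_output))

print(rc1_output)
```

No explicit loop is needed, because sub is vectorised over its input.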
My guess, however, is that you can solve this problem more effectively at an earlier stage in your code somewhere. If we knew how this data came to be in this form in the first place, that would make it much easier to diagnose.
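One common source of such a prefix, offered here only as a possibility since the question does not show how rc1_output was built, is name checking: data.frame and read.csv both default to check.names = TRUE, which passes names through make.names, and make.names prepends "X" to names that start with a digit. The data frames below are hypothetical illustrations, not the asker's data.

```r
# make.names() turns arbitrary strings into syntactically valid names;
# names that start with a digit get an "X" prefix
fixed_names <- make.names(c("1", "5", "33"))
print(fixed_names)
# [1] "X1"  "X5"  "X33"

# data.frame() applies the same check by default (check.names = TRUE)
with_check <- data.frame(`1` = 1:2, `5` = 3:4)
print(names(with_check))
# [1] "X1" "X5"

# Turning the check off keeps the names as given
without_check <- data.frame(`1` = 1:2, `5` = 3:4, check.names = FALSE)
print(names(without_check))
# [1] "1" "5"
```

If the values did originate as column names of an imported table, passing check.names = FALSE to read.csv would be one way to avoid the prefix at the source rather than stripping it afterwards.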
Source: https://stackoverflow.com/questions/22177756/deleting-an-extra-character-in-each-row