Question
In the R data frame built below, I would like to replace every occurrence of B with b.
junk <- data.frame(x <- rep(LETTERS[1:4], 3), y <- letters[1:12])
colnames(junk) <- c("nm", "val")
This produces:
nm val
1 A a
2 B b
3 C c
4 D d
5 A e
6 B f
7 C g
8 D h
9 A i
10 B j
11 C k
12 D l
My initial attempt was to use for and if statements like so:
for(i in junk$nm) if(i %in% "B") junk$nm <- "b"
but as I am sure you can see, this replaces ALL of the values of junk$nm with b. I can see why it does this, but I can't seem to get it to replace only those cases of junk$nm where the original value was B.
NOTE: I managed to solve the problem with gsub, but in the interest of learning R I would still like to know how to get my original approach to work (if that is possible).
Answer 1:
It is easier to convert nm to character and then make the change:
junk$nm <- as.character(junk$nm)
junk$nm[junk$nm == "B"] <- "b"
EDIT: And if you do need to keep nm as a factor, add this at the end:
junk$nm <- as.factor(junk$nm)
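As for getting the loop from the question to work: the assignment junk$nm <- "b" has no index, so it overwrites the whole column. A minimal sketch, assuming nm is a character column (the column names are set directly here rather than with colnames()), is to loop over positions instead of values and assign to one element at a time:

```r
# Build the data with nm as a character column (an assumption of this sketch)
junk <- data.frame(nm = rep(LETTERS[1:4], 3), val = letters[1:12],
                   stringsAsFactors = FALSE)

# Loop over row positions and replace only the element that matches
for (i in seq_along(junk$nm)) {
  if (junk$nm[i] == "B") junk$nm[i] <- "b"
}

junk$nm
```

The vectorised form junk$nm[junk$nm == "B"] <- "b" does the same thing without an explicit loop, which is why it is usually preferred.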
Answer 2:
Another useful way to replace values is revalue() from the plyr package:
library(plyr)
junk$nm <- revalue(junk$nm, c("B"="b"))
Answer 3:
The short answer is:
junk$nm[junk$nm %in% "B"] <- "b"
Take a look at the "Index vectors" section of An Introduction to R (if you haven't read it yet).
EDIT: As noted in the comments, this solution works for character vectors, so it fails on your data, where nm is a factor.
For a factor, the best way is to change the level:
levels(junk$nm)[levels(junk$nm)=="B"] <- "b"
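To see what the logical index is doing, here is a small sketch on a factor created explicitly, so that it behaves the same whatever the stringsAsFactors default is (it changed to FALSE in R 4.0.0). The names f and idx are only for illustration:

```r
# A factor with the same values as junk$nm
f <- factor(rep(LETTERS[1:4], 3))

# Logical index over the levels: TRUE only where the level is "B"
idx <- levels(f) == "B"

# Rename the matching level; every element that had this level changes with it
levels(f)[idx] <- "b"

levels(f)
as.character(f)
```

Because only the level label is changed, no element of the factor is touched individually and no new level has to be added first.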
Answer 4:
As the data you show are factors, it complicates things a little bit. @diliop's answer approaches the problem by converting nm to a character variable. To get back to the original factor, a further step is required.
An alternative is to manipulate the levels of the factor in place.
> lev <- with(junk, levels(nm))
> lev[lev == "B"] <- "b"
> junk2 <- within(junk, levels(nm) <- lev)
> junk2
nm val
1 A a
2 b b
3 C c
4 D d
5 A e
6 b f
7 C g
8 D h
9 A i
10 b j
11 C k
12 D l
That is quite simple, and I often forget that there is a replacement function for levels().
Edit: As noted by @Seth in the comments, this can be done in a one-liner, without loss of clarity:
within(junk, levels(nm)[levels(nm) == "B"] <- "b")
Answer 5:
The easiest way to do this in one command is to use which(). There is no need to convert the factor to character first, provided "b" is already one of its levels:
junk$nm[which(junk$nm=="B")]<-"b"
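How this assignment behaves depends on the type of nm. The sketch below uses a small stand-alone factor (f, g and h are illustrative names, not part of the question's data) to show that assigning a value that is not an existing level produces NA with a warning, and that adding the level first avoids this:

```r
f <- factor(c("A", "B", "C"))

# "b" is not a level of g, so R warns ("invalid factor level, NA generated")
g <- f
g[which(g == "B")] <- "b"
is.na(g[2])

# Adding the level first makes the same assignment work
h <- f
levels(h) <- c(levels(h), "b")
h[which(h == "B")] <- "b"
as.character(h)
```

On a character column the which() assignment works as written, with no extra step.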
Answer 6:
You have created a factor variable in nm, so you either need to avoid doing so or add an additional level to the factor attributes. You should also avoid using <- in the arguments to data.frame().
Option 1:
junk <- data.frame(nm = rep(LETTERS[1:4], 3), val = letters[1:12], stringsAsFactors = FALSE)
junk$nm[junk$nm == "B"] <- "b"
Option 2:
levels(junk$nm) <- c(levels(junk$nm), "b")
junk$nm[junk$nm == "B"] <- "b"
junk
Answer 7:
If you are working with character variables (note that stringsAsFactors is FALSE here) you can use replace():
junk <- data.frame(x <- rep(LETTERS[1:4], 3), y <- letters[1:12], stringsAsFactors = FALSE)
colnames(junk) <- c("nm", "val")
junk$nm <- replace(junk$nm, junk$nm == "B", "b")
junk
# nm val
# 1 A a
# 2 b b
# 3 C c
# 4 D d
# ...
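Answer 8:
A Stata-style replace function: it sets replacevar to replacevalue in every row where the condition string ifs is true.
stata.replace <- function(data, replacevar, replacevalue, ifs) {
  # Parse the condition string and evaluate it using the columns of data
  ifs <- parse(text = ifs)
  yy <- as.numeric(eval(ifs, data, parent.frame()))
  x <- sum(yy)  # number of rows that satisfy the condition
  # Replace the value in the matching rows only
  data[yy == 1, replacevar] <- replacevalue
  print(noquote(paste0(x, " replacements made")))
  return(data)
}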
Call the function like this:
d <- stata.replace(d, "under20", 1, "age<20")
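For comparison, the same conditional replacement can be written with plain logical indexing, as in the earlier answers. This is only a sketch: the data frame d and its columns age and under20 are made up for illustration.

```r
# Hypothetical data: under20 starts at 0 for every row
d <- data.frame(age = c(15, 25, 19, 40), under20 = 0)

# Set under20 to 1 only in the rows where age is below 20
d$under20[d$age < 20] <- 1

d
```

The helper function is mainly useful when the condition arrives as a string; when it is known in advance, the indexed assignment is shorter.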
Source: https://stackoverflow.com/questions/5824173/replace-a-value-in-a-data-frame-based-on-a-conditional-if-statement