Sorry guys if this is a noob question. I need help with how to loop over my dataframe. Here is some sample data:
a <- c(10:29);
b <- c(40:59);
e <- rep(1,
I would use cut() for this:
test$e = cut(test$a,
             breaks = c(0, 15, 20, 25, 30),
             labels = c(1, 2, 3, 4))
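One thing to watch for: cut() returns a factor, not a numeric column. Here is a minimal sketch, assuming the sample data is stored in a data frame called test (the e column from the question is left out, since cut() creates it; e_num is just an illustrative name):

```r
# Sample data from the question; `test` is assumed to be the data frame name.
a <- 10:29
b <- 40:59
test <- data.frame(a, b)

# Intervals are right-closed by default: (0,15], (15,20], (20,25], (25,30]
test$e <- cut(test$a,
              breaks = c(0, 15, 20, 25, 30),
              labels = c(1, 2, 3, 4))

is.factor(test$e)   # TRUE

# Convert the factor labels to plain numbers if later code needs arithmetic.
# `e_num` is a hypothetical column name used only for this illustration.
test$e_num <- as.numeric(as.character(test$e))
table(test$e_num)
```

Going through as.character() first converts the labels themselves, not the internal level codes; here the two happen to coincide, but that is not true in general.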
If you want to "generalize" the cut, that is, handle the case where you don't know in advance how many sets of 5 (levels) you need, you can take a two-step approach using c() and seq():
test$e = cut(test$a,
             breaks = c(0, seq(from = 15, to = max(test$a)+5, by = 5)))
levels(test$e) = 1:length(levels(test$e))
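To see the generalized version adapt to the data, here is a small sketch with a wider range than the sample (the vector a2 and the names breaks and e2 are made up for illustration). It uses as.integer() on the factor, which returns the level codes directly and is an alternative to relabelling the levels:

```r
# Hypothetical data with a wider range than the sample in the question
a2 <- 10:47

# Break points: 0, then 15, 20, 25, ... up to just past max(a2)
breaks <- c(0, seq(from = 15, to = max(a2) + 5, by = 5))

# as.integer() on a factor gives the level codes 1, 2, 3, ...
e2 <- as.integer(cut(a2, breaks = breaks))

range(e2)
```

Depending on where max(a2) falls, the last interval may end up empty, which is harmless for this purpose.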
Since Backlin beat me to the cut() solution, here's another option (which I don't prefer in this case, but am posting just to demonstrate the many options available in R).
Use recode() from the car package.
require(car)
test$e = recode(test$a, "0:15 = 1; 15:20 = 2; 20:25 = 3; 25:30 = 4")
data.frame(a, b, e=(1:4)[cut(a, c(-Inf, 15, 20, 25, 30))])
Update:
Greg's comment provides a more direct solution, without the need to subset an integer vector with the factor returned from cut.
data.frame(a, b, e=findInterval(a, c(-Inf, 15, 20, 25, 30)))
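One caveat, as far as I can tell: findInterval() treats intervals as closed on the left by default, [15, 20), while cut() closes them on the right, (15, 20], so values sitting exactly on a break (15, 20, 25) land in different groups. A small sketch comparing the two; the variable names are illustrative, and the left.open argument is available in R 3.3.0 and later:

```r
a <- 10:29
brk <- c(-Inf, 15, 20, 25, 30)

e_cut  <- as.integer(cut(a, brk))                  # right-closed: (15, 20]
e_fi   <- findInterval(a, brk)                     # left-closed:  [15, 20)
e_fi_l <- findInterval(a, brk, left.open = TRUE)   # right-closed: (15, 20]

# Values where the default findInterval() disagrees with cut()
a[e_cut != e_fi]

# With left.open = TRUE the two approaches agree on this data
all(e_cut == e_fi_l)
```

Which convention is right depends on which group the boundary values are supposed to belong to.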
You don't need a loop. You have nearly all you need:
test[test$a > 15 & test$a < 20, "e"] <- 2
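Spelled out for every group, the same vectorised assignment might look like this. It is only a sketch: it assumes e starts at 1 for every row and that each group is closed on the right, (15, 20] and so on. The condition above uses a strict < 20, so adjust the comparisons to whichever boundary rule you actually want:

```r
a <- 10:29
b <- 40:59
test <- data.frame(a, b, e = 1)  # assumption: every row starts in group 1

# Each line overwrites e only for the rows matching the condition
test[test$a > 15 & test$a <= 20, "e"] <- 2
test[test$a > 20 & test$a <= 25, "e"] <- 3
test[test$a > 25 & test$a <= 30, "e"] <- 4

table(test$e)
```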