I'm having trouble applying this rle function to a data.frame. The function works great on another set:
fgroup <- aggregate(fevents2
The problem is that a factor is *not* a plain atomic vector (it carries levels and class attributes), as the error clearly says. Either convert all the factors to characters first (and not by coercing them to numeric!) or do the conversion inside the anonymous function you are applying.
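To see the error in isolation, here is a minimal sketch on a made-up factor `f` (an assumption for illustration, not the original fevents2 data). It also shows why coercing to numeric is not what you want:

```r
# Made-up factor standing in for one column of the data
f <- factor(c("C RR", "C RR", "nil", "nil", "G RR"))

# rle() on the factor itself fails; capture the message instead of stopping
err <- tryCatch(rle(f), error = function(e) conditionMessage(e))
err

# Converting to character first works and keeps the labels
vals <- rle(as.character(f))$values
vals

# Coercing straight to numeric also runs, but it returns the internal
# integer codes of the factor rather than the labels
codes <- rle(as.numeric(f))$values
codes
```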
So this, which implements the second idea, works:
aggregate(fevents2[, 3:14], list(weeks = fevents2[, 1]),
          function(x) rle(as.character(x))$values)
after a fashion:
> aggregate(fevents2[,3:14], list(weeks = fevents2[, 1]),
+ function(x) rle(as.character(x))$values)
weeks vv.1 vv.2 vv.3 vv.4 vv.5 vv.6 rv.1 rv.2 rv.3 rv.4 rv.5 rv.6 rv.7 ja.1
1 1 C RR G RR nil C AA G AA nil nil C VB G VB nil C VB G VB nil C VV
ja.2 ja.3 ja.4 aa.1 aa.2 bv.1 bv.2 bv.3 aj.1 aj.2 aj.3 aj.4 aj.5 vb.1 vb.2
1 nil C VV G VV C AJ nil nil C VR G VR C RJ nil C RV G RV nil C AA nil
vb.3 vb.4 vb.5 rj.1 rj.2 rr vr.1 vr.2 vr.3 vr.4 vr.5 bb jr.1 jr.2 jr.3
1 C AJ nil C AJ C JR G JR C BB C JA nil C RJ nil C RJ C BV nil C VB G VB
jr.4 jr.5
1 nil C JA
though I am not sure what you expected to get - there is only one week here, and aggregate() and rle() have stuck all the values together. Did you want separate $values for each of the variables in fevents2 that you are aggregating over?
Another thing:
as.numeric(as.character(fevents2))
can't possibly work, as the data are not numeric! You also can't apply those functions to a whole data frame and get anything like what you intended, if they work at all.
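As a rough illustration of why that call goes wrong, here is a sketch on a made-up two-column data frame `toy` (hypothetical data, not fevents2):

```r
# Made-up data frame with two factor columns
toy <- data.frame(
  vv = factor(c("C RR", "nil", "C RR")),
  rv = factor(c("nil", "C VB", "G VB"))
)

# as.character() on a whole data frame treats it as a list and deparses each
# column into a single string, so you get one element per column,
# not one per value
chr <- as.character(toy)
length(chr)

# None of those strings parses as a number, so every element becomes NA
# (with a coercion warning, suppressed here)
num <- suppressWarnings(as.numeric(chr))
num
```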
The sapply() thing should work. Here is a version that checks whether each variable is a factor and coerces it if it is:
fevents3 <- sapply(fevents2,
                   function(x) if (is.factor(x)) as.character(x) else x)
But note that sapply() simplifies to a matrix, which will change the aggregate() method dispatched:
> class(fevents3)
[1] "matrix"
Instead, perhaps use lapply() and rebuild the data frame:
fevents3 <- lapply(fevents2,
                   function(x) if (is.factor(x)) as.character(x) else x)
fevents3 <- data.frame(fevents3, stringsAsFactors = FALSE)
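The difference between the two approaches can be checked on a small made-up data frame; `toy` and the helper `to_chr` below are hypothetical names used only for this sketch:

```r
# Made-up data: one numeric column and one factor column
toy <- data.frame(
  weeks = c(1, 1, 2),
  vv = factor(c("C RR", "nil", "C AA"))
)

# Helper: convert factors to character, leave everything else alone
to_chr <- function(x) if (is.factor(x)) as.character(x) else x

# sapply() simplifies to a matrix, so every column is forced to one type
m <- sapply(toy, to_chr)
is.matrix(m)
typeof(m)

# lapply() keeps a list of columns, which can be rebuilt as a data frame
# with each column keeping its own type
d <- data.frame(lapply(toy, to_chr), stringsAsFactors = FALSE)
sapply(d, class)
```

With the matrix version even the numeric weeks column ends up as character, which is another reason to prefer the lapply() route here.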
Now, if you wanted to apply rle() to each column of the split-up data and keep the results separate, how about:
spl <- split(fevents3, list(weeks = fevents3[, 1]))
res <- lapply(spl, function(x) lapply(x[, 3:14], function(y) rle(y)$values))
which gives
> res
$`1`
$`1`$vv
[1] "C RR" "G RR" "nil" "C AA" "G AA" "nil"
$`1`$rv
[1] "nil" "C VB" "G VB" "nil" "C VB" "G VB" "nil"
$`1`$ja
[1] "C VV" "nil" "C VV" "G VV"
$`1`$aa
[1] "C AJ" "nil"
$`1`$bv
[1] "nil" "C VR" "G VR"
$`1`$aj
[1] "C RJ" "nil" "C RV" "G RV" "nil"
$`1`$vb
[1] "C AA" "nil" "C AJ" "nil" "C AJ"
$`1`$rj
[1] "C JR" "G JR"
$`1`$rr
[1] "C BB"
$`1`$vr
[1] "C JA" "nil" "C RJ" "nil" "C RJ"
$`1`$bb
[1] "C BV"
$`1`$jr
[1] "nil" "C VB" "G VB" "nil" "C JA"
Which is the same answer as that for aggregate() above, but with each rle() output kept separate:
> unlist(res)
1.vv1 1.vv2 1.vv3 1.vv4 1.vv5 1.vv6 1.rv1 1.rv2 1.rv3 1.rv4 1.rv5
"C RR" "G RR" "nil" "C AA" "G AA" "nil" "nil" "C VB" "G VB" "nil" "C VB"
1.rv6 1.rv7 1.ja1 1.ja2 1.ja3 1.ja4 1.aa1 1.aa2 1.bv1 1.bv2 1.bv3
"G VB" "nil" "C VV" "nil" "C VV" "G VV" "C AJ" "nil" "nil" "C VR" "G VR"
1.aj1 1.aj2 1.aj3 1.aj4 1.aj5 1.vb1 1.vb2 1.vb3 1.vb4 1.vb5 1.rj1
"C RJ" "nil" "C RV" "G RV" "nil" "C AA" "nil" "C AJ" "nil" "C AJ" "C JR"
1.rj2 1.rr 1.vr1 1.vr2 1.vr3 1.vr4 1.vr5 1.bb 1.jr1 1.jr2 1.jr3
"G JR" "C BB" "C JA" "nil" "C RJ" "nil" "C RJ" "C BV" "nil" "C VB" "G VB"
1.jr4 1.jr5
"nil" "C JA"
> aggregate(fevents2[,3:14], list(weeks = fevents2[, 1]),
+ function(x) rle(as.character(x))$values)
weeks vv.1 vv.2 vv.3 vv.4 vv.5 vv.6 rv.1 rv.2 rv.3 rv.4 rv.5 rv.6 rv.7 ja.1
1 1 C RR G RR nil C AA G AA nil nil C VB G VB nil C VB G VB nil C VV
ja.2 ja.3 ja.4 aa.1 aa.2 bv.1 bv.2 bv.3 aj.1 aj.2 aj.3 aj.4 aj.5 vb.1 vb.2
1 nil C VV G VV C AJ nil nil C VR G VR C RJ nil C RV G RV nil C AA nil
vb.3 vb.4 vb.5 rj.1 rj.2 rr vr.1 vr.2 vr.3 vr.4 vr.5 bb jr.1 jr.2 jr.3
1 C AJ nil C AJ C JR G JR C BB C JA nil C RJ nil C RJ C BV nil C VB G VB
jr.4 jr.5
1 nil C JA
[Note: This is only true here because the data snippet you show has just one week. I can't recall how unlist(res) will look if there is more than one week.]
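For what it's worth, the multi-week case can be tried on made-up data. This is only a sketch: `toy` and its values are invented, and the columns are selected by name rather than by the 3:14 positions used above:

```r
# Made-up data with two weeks and two character columns
toy <- data.frame(
  weeks = c(1, 1, 1, 2, 2, 2),
  vv = c("a", "a", "b", "b", "c", "c"),
  rv = c("x", "y", "y", "z", "z", "z"),
  stringsAsFactors = FALSE
)

# Same split/lapply pattern as above: one list element per week,
# and within each week one rle() result per column
spl <- split(toy, list(weeks = toy$weeks))
res <- lapply(spl, function(x) lapply(x[c("vv", "rv")], function(y) rle(y)$values))

# Flattening prefixes every name with its week, so the weeks stay distinguishable
flat <- unlist(res)
names(flat)
```

In this toy case the names come out as week, then column, then a running index within that column (the index is dropped when a column has only one run), so the flattened vector still tells you which week each value came from.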