Here I make a new column to indicate whether myData is above or below its median
### MedianSplits based on Whole Data
#create some test data
myDataFrame=data.fra
Here is a hack-ish way. Hadley may come up with something more elegant:
To start, we simply concatenate the by() output:
R> do.call(c,byOutput)
A1 A2 A3 A4 A5 B1 B2 B3 B4 B5 C1 C2 C3 C4 C5
1 2 2 1 1 1 1 2 1 2 1 2 1 1 2
and what matters is that we get the factor levels 1 and 2 here, which we can use to index a vector of labels and build a new factor from it:
R> c("Below","Above")[do.call(c,byOutput)]
[1] "Below" "Above" "Above" "Below" "Below" "Below" "Below" "Above"
[9] "Below" "Above" "Below" "Above" "Below" "Below" "Above"
R> as.factor(c("Below","Above")[do.call(c,byOutput)])
[1] Below Above Above Below Below Below Below Above Below Above
[11] Below Above Below Below Above
Levels: Above Below
which we can then assign into the data.frame you wanted to modify:
R> myDataFrame$FactorLevelMedianSplit <-
as.factor(c("Below","Above")[do.call(c,byOutput)])
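A side note on level order: as.factor() sorts the levels alphabetically, which is why the output above reports "Levels: Above Below". If you would rather have "Below" come first, factor() with an explicit levels argument should do it. A minimal sketch, where the codes vector is made-up stand-in data rather than the real by() output:

```r
# Stand-in for do.call(c, byOutput): integer codes 1 (below) and 2 (above).
# These values are illustrative only.
codes <- c(1, 2, 2, 1, 1)

# Index the label vector with the codes, then fix the level order explicitly
splitLabels <- c("Below", "Above")[codes]
splitFactor <- factor(splitLabels, levels = c("Below", "Above"))

levels(splitFactor)
```

With the levels spelled out, the underlying integer codes of the new factor line up with the original 1/2 codes again.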
Update: Never mind, this is not quite right. The concatenated by() output is grouped by factor level (A ... A, B ... B, C ... C), so we'd need to reindex myDataFrame to be sorted the same way before we add the new column. Left as an exercise...
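One possible way around the reordering problem is ave(), which applies a function within each group but hands the results back in the original row order. This is only a sketch: the test data is made up, and the column names myData and myFactor are assumptions, so adjust them to your own data.frame.

```r
# Hypothetical test data; myData and myFactor are assumed column names.
set.seed(42)
myDataFrame <- data.frame(
  myData   = rnorm(15),
  myFactor = rep(c("A", "B", "C"), times = 5)  # deliberately interleaved, not sorted
)

# ave() evaluates FUN per group and returns a vector aligned with the original rows.
# The logical result is coerced to numeric: 1 = above the group median, 0 = not above.
aboveMedian <- ave(myDataFrame$myData, myDataFrame$myFactor,
                   FUN = function(x) x > median(x))

# Values equal to the group median are labelled "Below" here (a design choice).
myDataFrame$FactorLevelMedianSplit <-
  factor(ifelse(aboveMedian == 1, "Above", "Below"),
         levels = c("Below", "Above"))

table(myDataFrame$myFactor, myDataFrame$FactorLevelMedianSplit)
```

Because the result is already aligned with the rows, no sorting of myDataFrame is needed before the assignment.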