I have some mixed-type data that I would like to store in an R data structure of some sort. Each data point has a set of fixed attributes which may be 1-d numeric, factors, or
I would just keep the data in the "long" format, with one row per token and the per-phrase attributes repeated on each row.
For example:
> d1 <- data.frame(id=1:3, num_words=c(2,1,5), phrase=c("hello world", "greetings", "take me to your leader"), stringsAsFactors=FALSE)
> d2 <- data.frame(id=c(rep(1,2), rep(2,1), rep(3,5)), token_length=c(5,5,9,4,2,2,4,6))
> d2$tokenid <- with(d2, ave(token_length, id, FUN=seq_along))
> d <- merge(d1,d2)
> subset(d, nchar(phrase) > 10)
  id num_words                 phrase token_length tokenid
1  1         2            hello world            5       1
2  1         2            hello world            5       2
4  3         5 take me to your leader            4       1
5  3         5 take me to your leader            2       2
6  3         5 take me to your leader            2       3
7  3         5 take me to your leader            4       4
8  3         5 take me to your leader            6       5
> with(d, tapply(token_length, id, mean))
  1   2   3
5.0 9.0 3.6
Once the data is in the long format, you can use packages such as sqldf or plyr to extract what you want from it.
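As a rough sketch of what that could look like with plyr, the snippet below rebuilds a trimmed-down copy of the example data (without the num_words column) and computes a few per-phrase summaries. The summary column names (n_tokens, mean_length, max_length) and the object name per_phrase are made up for illustration, and it assumes the plyr package is installed.

```r
library(plyr)

# Trimmed-down copy of the example data: one row per phrase ...
d1 <- data.frame(id = 1:3,
                 phrase = c("hello world", "greetings", "take me to your leader"),
                 stringsAsFactors = FALSE)

# ... and one row per token, keyed by the phrase id
d2 <- data.frame(id = c(1, 1, 2, 3, 3, 3, 3, 3),
                 token_length = c(5, 5, 9, 4, 2, 2, 4, 6))

# Long format: phrase-level attributes repeated on every token row
d <- merge(d1, d2)

# Split by phrase, summarise each piece, and combine into one data frame.
# The summary column names are illustrative, not part of the original data.
per_phrase <- ddply(d, .(id, phrase), summarise,
                    n_tokens    = length(token_length),
                    mean_length = mean(token_length),
                    max_length  = max(token_length))

print(per_phrase)
```

This gives one row per phrase, with mean token lengths of 5, 9 and 3.6, matching the tapply() call above. An sqldf version would presumably express the same thing as a GROUP BY query over d.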