Vectorize my thinking: Vector Operations in R

So earlier I answered my own question on thinking in vectors in R. But now I have another problem which I can't 'vectorize.' I know vectors are faster and loops slower, b

3 Answers

悲哀的现实

2020-12-24 09:55
Here's what seems like another very R-type way to generate the sums. Generate a vector that is as long as your input vector and contains nothing but the sum of all n elements, repeated. Then subtract your original vector from the sums vector. The result: a vector (isums) where each entry is the sum of your original vector less the ith element.
```
> (my.data$item[my.data$fixed==0])
[1] 1 1 3 5 7
> sums <- rep(sum(my.data$item[my.data$fixed==0]),length(my.data$item[my.data$fixed==0]))
> sums
[1] 17 17 17 17 17
> isums <- sums - (my.data$item[my.data$fixed==0])
> isums
[1] 16 16 14 12 10
```
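The same idea can be sketched as a standalone script. The data frame below is hypothetical, built only to reproduce the values printed above (the extra row with fixed equal to 1 is an assumption), and the loop at the end is there purely as a cross-check. Because R recycles a length-one value across a vector, the rep() call is likely optional.

```r
# Hypothetical data mirroring the values printed above; the last row
# (fixed == 1) is an assumption added to show that it gets filtered out.
my.data <- data.frame(
  item  = c(1, 1, 3, 5, 7, 9),
  fixed = c(0, 0, 0, 0, 0, 1)
)

free.items <- my.data$item[my.data$fixed == 0]

# R recycles the scalar total across the vector, so rep() is not needed.
isums <- sum(free.items) - free.items
print(isums)

# Cross-check against an explicit leave-one-out loop.
loop.sums <- numeric(length(free.items))
for (i in seq_along(free.items)) {
  loop.sums[i] <- sum(free.items[-i])
}
stopifnot(all.equal(isums, loop.sums))
```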
臣服心动

2020-12-24 10:03
Strangely enough, learning to vectorize in R is what helped me get used to basic functional programming. A basic technique would be to define your operations inside the loop as a function:
```
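data = ...   # your data
items = ...  # the items to leave out, one at a time

# Compute the statistic with item i left out.
leave_one_out = function(i) {
   data1 = data[items != i]
   delta = ...  # some operation on data1
   return(delta)
}

# Start from an empty result and add one column per item.
delta.list = NULL
for (j in items) {
   delta.list = cbind(delta.list, leave_one_out(j))
}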
```
To vectorize, all you do is replace the for loop with the sapply mapping function:
```
delta.list = sapply(items, leave_one_out);
```
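For a concrete version of the loop-to-sapply swap, here is a minimal sketch. The inputs and the choice of mean() as the operation are assumptions made for illustration; the original leaves both elided. vapply() is shown as a stricter alternative that checks the type and length of each result.

```r
# Hypothetical inputs: one measurement per item label.
items <- c("a", "b", "c", "d")
data  <- c(2, 4, 6, 8)

# Leave item i out and summarise the rest; mean() is an illustrative choice.
leave_one_out <- function(i) {
  data1 <- data[items != i]
  mean(data1)
}

# Explicit loop, preallocating the result instead of growing it with cbind().
delta.loop <- numeric(length(items))
for (k in seq_along(items)) {
  delta.loop[k] <- leave_one_out(items[k])
}

# Same computation with sapply(); result names come from the character vector.
delta.list <- sapply(items, leave_one_out)

# vapply() additionally checks that every result is a single number.
delta.checked <- vapply(items, leave_one_out, numeric(1))

stopifnot(all.equal(unname(delta.list), delta.loop))
```

Note that sapply() still iterates internally, so the gain here is mostly in clarity rather than in raw speed.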
终归单人心

2020-12-24 10:20
This is not an answer, but I wonder whether any insight lies in this direction:
```
> tapply((my.data$item[my.data$fixed==0])[-1], my.data$year[my.data$fixed==0][-1], sum)
```
tapply produces a table of statistics (sums, in this case, as given by the third argument) grouped by the vector given as the second argument. For example:
```
2001 2003 2005 2007
   1    3    5    7
```
The [-1] notation drops observation (row) one from the selected rows. So you could loop and use [-i] on each iteration:
```
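free <- my.data$fixed == 0              # rows that are not fixed
results <- vector("list", sum(free))    # one table per dropped row
for (i in seq_len(sum(free))) {         # index the selected rows only
  results[[i]] <- tapply(my.data$item[free][-i], my.data$year[free][-i], sum)
}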
```
Keep in mind that if you have any years with only one observation, the tables returned by the successive tapply calls won't all have the same number of columns. For example, if you drop the only observation for 2001, then 2003, 2005, and 2007 would be the only columns returned.
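One possible way around the uneven tables, sketched under assumptions: if the years are converted to a factor, dropped rows keep their level, so tapply returns NA for an emptied year instead of omitting it. The data frame below is hypothetical (the year values are chosen to match the example output above), and using a factor is a suggestion layered on top of this answer, not part of it.

```r
# Hypothetical data: 2001 has two observations, the other years one each.
my.data <- data.frame(
  item  = c(1, 1, 3, 5, 7),
  year  = c(2001, 2001, 2003, 2005, 2007),
  fixed = c(0, 0, 0, 0, 0)
)

free  <- my.data$fixed == 0
items <- my.data$item[free]
# A factor keeps all of its levels even after elements are dropped.
years <- factor(my.data$year[free])

# One column per dropped observation, one row per year.
loo.sums <- sapply(seq_along(items), function(i) {
  tapply(items[-i], years[-i], sum)
})
print(loo.sums)
```

Each column has the same four rows, so the results line up in a matrix; a year whose only observation was dropped shows up as NA.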