R self reference

终归单人心

In R I find myself doing something like this a lot:

```
adataframe[adataframe$col==something] <- adataframe[adataframe$col==something] + 1
```

4 answers

南旧

2020-11-28 06:33

You should be paying more attention to Gabor Grothendieck (and not just in this instance). The cited inc function on Matt Asher's blog does all of what you are asking:

(And the obvious extension works as well.)

```
add <- function(x, inc=1) {
   eval.parent(substitute(x <- x + inc))
}
# Testing the `inc` function behavior
```
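For what it's worth, here is a minimal sketch of calling add with an explicit increment on part of a column. The data frame `scores` and its columns are made up for illustration; only the add function comes from the answer.

```r
# `add` as defined in the answer: rewrites the call as `x <- x + inc`
# and evaluates that assignment in the caller's frame.
add <- function(x, inc = 1) {
  eval.parent(substitute(x <- x + inc))
}

# Hypothetical data, for illustration only.
scores <- data.frame(id = 1:5, pts = c(10, 20, 30, 40, 50))

# Add 5 to pts, but only in rows where id > 3.
add(scores$pts[scores$id > 3], 5)

print(scores$pts)
```

Because the argument is pasted back into an assignment, it has to be something that can legally appear on the left of `<-`.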

EDIT: After my temporary annoyance at the lack of approval in the first comment, I took up the challenge of adding a further function argument. Supplied with a single argument that is a portion of a dataframe, it still increments that range of values by one. So far it has only been very lightly tested on infix dyadic operators, but I see no reason it wouldn't work with any function that accepts only two arguments:

```
transfn <- function(x, func="+", inc=1) {
   eval.parent(substitute(x <- do.call(func, list(x, inc))))
}
```
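As a rough illustration of passing a different operator, the sketch below doubles a subset of a column instead of incrementing it. The data frame `d` is hypothetical; transfn is copied from the answer.

```r
# `transfn` as defined in the answer: rewrites the call as
# `x <- do.call(func, list(x, inc))` and evaluates it in the caller's frame.
transfn <- function(x, func = "+", inc = 1) {
  eval.parent(substitute(x <- do.call(func, list(x, inc))))
}

# Hypothetical data, for illustration only.
d <- data.frame(a1 = 1:10)

# Multiply a1 by 2 in the rows where a1 > 5.
transfn(d$a1[d$a1 > 5], "*", 2)

# With the defaults it behaves like inc(): add 1 to the whole column.
transfn(d$a1)

print(d$a1)
```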

(Guilty admission: This somehow "feels wrong" from the traditional R perspective of returning values for assignment.) The earlier testing on the inc function is below:

```
df <- data.frame(a1 =1:10, a2=21:30, b=1:2)
inc <- function(x) {
   eval.parent(substitute(x <- x + 1))
}

#---- examples===============>
```

```
> inc(df$a1)  # works on whole columns
> df
   a1 a2 b
1   2 21 1
2   3 22 2
3   4 23 1
4   5 24 2
5   6 25 1
6   7 26 2
7   8 27 1
8   9 28 2
9  10 29 1
10 11 30 2
> inc(df$a1[df$a1>5]) # testing on a restricted range of one column
> df
   a1 a2 b
1   2 21 1
2   3 22 2
3   4 23 1
4   5 24 2
5   7 25 1
6   8 26 2
7   9 27 1
8  10 28 2
9  11 29 1
10 12 30 2

> inc(df[ df$a1>5, ])  # testing on a range of rows for all columns being transformed
> df
   a1 a2 b
1   2 21 1
2   3 22 2
3   4 23 1
4   5 24 2
5   8 26 2
6   9 27 3
7  10 28 2
8  11 29 3
9  12 30 2
10 13 31 3
# and even in selected rows and grepped names of columns meeting a criterion
> inc(df[ df$a1 <= 3, grep("a", names(df)) ])
> df
   a1 a2 b
1   3 22 1
2   4 23 2
3   4 23 1
4   5 24 2
5   8 26 2
6   9 27 3
7  10 28 2
8  11 29 3
9  12 30 2
10 13 31 3
```

小鲜肉

2020-11-28 06:34
Try package data.table and its := operator. It's very fast and very short.
```
DT[col1==something, col2:=col3+1]
```
The first part, col1==something, is the subset. You can put anything here and use the column names as if they were variables; i.e., there is no need to use $. The second part, col2:=col3+1, assigns the RHS to the LHS within that subset, and the column names can again be used as if they were variables. := is assignment by reference. No copies of any object are taken, so it is faster than <-, =, within and transform.
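To make that concrete, here is a small sketch of the self-referencing case from the question, where the column being assigned is the same one being read. It assumes the data.table package is installed; the table and its column names (`grp`, `val`) are made up.

```r
library(data.table)  # assumption: the data.table package is installed

# Hypothetical toy table, for illustration only.
DT <- data.table(grp = c("a", "b", "a", "b"), val = c(1, 2, 3, 4))

# Add 1 to val only in the rows where grp == "a".
# := modifies DT in place, so there is no need to reassign the result.
DT[grp == "a", val := val + 1]

print(DT)
```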

Also, one end goal of allowing := in j like this is to combine it with by, which is soon to be implemented in v1.8.1; see the question: when should I use the := operator in data.table.

UPDATE: That was indeed released (:= by group) in July 2012.
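A minimal sketch of := combined with by, again assuming data.table is installed and using a made-up table. The new column name `grp_total` is only a placeholder.

```r
library(data.table)  # assumption: the data.table package is installed

# Hypothetical toy table, for illustration only.
DT <- data.table(grp = c("a", "b", "a", "b"), val = c(1, 2, 3, 4))

# Add a column holding the per-group sum of val; the value is computed
# once per group and assigned by reference to every row in that group.
DT[, grp_total := sum(val), by = grp]

print(DT)
```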
后悔当初

2020-11-28 06:48
Here is what you can do. Let us say you have a dataframe
```
df = data.frame(x = 1:10, y = rnorm(10))
```
And you want to increment all the y by 1. You can do this easily by using transform
```
df = transform(df, y = y + 1)
```
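If only some rows should change, as in the question, one option is to put the condition inside transform using ifelse. This is just a sketch: the data are made up (fixed values instead of rnorm so the result is reproducible) and the x > 5 condition is arbitrary.

```r
# Hypothetical data, for illustration only.
df <- data.frame(x = 1:10, y = (1:10) / 10)

# Add 1 to y where x > 5 and leave the other rows unchanged.
# transform() evaluates the expression with the columns visible as variables.
df <- transform(df, y = ifelse(x > 5, y + 1, y))

print(df)
```

Note that transform returns a modified copy, so the result has to be assigned back to df.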
囚心锁ツ

2020-11-28 06:49
I'd be partial to (presumably the subset is on rows)
```
ridx <- adataframe$col==something
adataframe[ridx,] <- adataframe[ridx,] + 1
```
which doesn't rely on any fancy / fragile parsing, is reasonably expressive about the operation being performed, and is not too verbose. Also tends to break lines into nicely human-parse-able units, and there is something appealing about using standard idioms -- R's vocabulary and idiosyncrasies are already large enough for my taste.
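When the data frame also holds non-numeric columns, adding 1 to whole rows will not work cleanly, so a variant of the same idiom is to name the column to change. A sketch with a made-up data frame (the column `val` and the value "x" are placeholders):

```r
# Hypothetical data, for illustration only.
adataframe <- data.frame(col = c("x", "y", "x"), val = c(10, 20, 30))

# Compute the row index once and reuse it on both sides of the assignment.
ridx <- adataframe$col == "x"
adataframe[ridx, "val"] <- adataframe[ridx, "val"] + 1

print(adataframe)
```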