Using do loops in R to create new variables


I'm a long-time SAS programmer looking to make the jump to R. I know R isn't all that great for variable recoding, but is there a way to do this with do loops?

6 Answers

遇见更好的自我

2020-12-20 01:46
This is really late, but you can actually do this without loops or *apply. I'm assuming that the variables are columns in a data frame (which makes sense if the OP is familiar with SAS datasets and macros).
```
df[paste("c", 1:100, sep="_")] <- df[paste("a", 1:100, sep="_")] +
                                  df[paste("b", 1:100, sep="_")]
```

旧时难觅i

2020-12-20 01:52
SAS uses a rudimentary macro language that depends on text replacement rather than on evaluating expressions, as a proper programming language does. Your SAS files are essentially two things: SAS commands and macro expressions (things starting with '%'). Macro languages are highly problematic and hard to debug (for example, do expressions within expressions get expanded? Why do you have to write "&&x" or even "&&&x"? Why do you need two semicolons here?). They are clunky and inelegant compared to a well-designed programming language that is based on a single syntax.

If your a_i variables are single numbers, then you should have stored them in a vector, e.g.:
```
> a = 1:100
> b = runif(100)
```
Now I can get elements easily:
```
> a[1]
```
and add up in parallel:
```
> c = a + b
```
You could do it with a loop, initialising c first:
```
> c = rep(0,100)
> for(i in 1:100){
   c[i]=a[i]+b[i]
   }
```
But that would be sloooooow.

Nearly every R beginner asks 'how do I create a variable a_i for some values of i', and then shortly afterwards they ask how to access variable a_i for some values of i. The answer is always to make a as either a vector or a list.
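As a small illustration of the vector-or-list advice, the sketch below stores what would otherwise be separate variables a_1 ... a_5 and b_1 ... b_5 as elements of two named lists. The values and the name `cc` are made up for the example (`cc` is used only to avoid masking R's `c()` function).

```r
# Hypothetical values standing in for a_1 ... a_5 and b_1 ... b_5
a <- setNames(as.list(1:5), paste0("a_", 1:5))
b <- setNames(as.list(c(10, 20, 30, 40, 50)), paste0("b_", 1:5))

# Access one element by a fixed name, or by a name built from an index
i <- 3
a[["a_3"]]
a[[paste0("a_", i)]]

# Add matching elements pairwise; Map() returns a list
cc <- Map(`+`, a, b)
names(cc) <- paste0("c_", 1:5)
cc[["c_3"]]
```

Because the index is just data here, "variable a_i for some value of i" becomes an ordinary lookup rather than something that needs generated code.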

醉梦人生

2020-12-20 01:54
I suspect that if you have one hundred variables a_1, a_2, ..., a_100, all of your variables are related. In fact, if you want to do
```
c_1 = a_1 + b_1
```
then a, b, c are related. Therefore, I recommend that you combine all of your variables into a single data frame, where one column is a and another is b.
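A minimal sketch of that layout, assuming each a_i and b_i is a single number: the hundred values become one hundred rows of two columns. The data frame name `dat` and the values are invented for illustration.

```r
# Hypothetical values; in practice these come from wherever a_1 ... a_100 are created
dat <- data.frame(a = 1:100, b = seq(0.5, 50, by = 0.5))

# One vectorised statement replaces the hundred c_i = a_i + b_i assignments
dat$c <- dat$a + dat$b

head(dat, 3)
```

Row i of `dat` then plays the role of the triple (a_i, b_i, c_i).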

The question is how to combine your variables in a sensible way. To give a useful answer, though, can you tell us how these variables are created?

Perhaps this isn't suitable for your case. If not, a bit more information would be useful.

后悔当初

2020-12-20 01:56
This is actually a pretty interesting question. From my reading and recent (forced) use of SAS, the question seems to be trying to recode variables in a SAS dataset within a data step using a bit of macro code. Otherwise if they were free variables being created they would start with a & character. I think the example code would actually be better represented like:
```
%macro recodevars;
data test;
  set test;

  %do i=1 %to 100;
  c_&i = a_&i + b_&i;
  %end;

run;
%mend recodevars;
%recodevars;
```
You could do something similar in R like this example:
```
test <- data.frame(vara1=1:10,varb1=2:11,vara2=3:12,varb2=4:13)

test[paste0("varc",1:2)] <- test[paste0("vara",1:2)] + test[paste0("varb",1:2)]
```
I'd be curious to know what insight others have to answer the question if it is applied to a dataframe and not free variables.
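For readers who want something that reads like the SAS `%do` loop, here is one possible explicit-loop version on the same example data frame. It is only a sketch: it relies on `[[` accepting a column name built at run time, and the column names are the ones used in the example above.

```r
test <- data.frame(vara1 = 1:10, varb1 = 2:11, vara2 = 3:12, varb2 = 4:13)

# Loop over the suffixes, building each column name with paste0()
for (i in 1:2) {
  test[[paste0("varc", i)]] <- test[[paste0("vara", i)]] + test[[paste0("varb", i)]]
}

names(test)
```

The result matches the one-line `paste0()` assignment; the loop form is mainly useful when each step needs logic that differs by `i`.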

清酒与你

2020-12-20 02:00
This stuff is trivial. To me, it looks like you want to find a way to create commands automatically and execute them. Easy peasy.

For instance, this assigns to C_i the value in A_i:
```
for(i in 1:100){
    tmpCmd = paste("C_",i,"= A_",i, sep = "")
    eval(parse(text = tmpCmd))
}
rm(i, tmpCmd)
```
Just remember eval(parse(text = ...)) and paste(), and you're off to the races in creating loops of commands to execute.

You can then add in the operation you'd like to do, i.e. the summation with B_i, by swapping in this line:
```
    tmpCmd = paste("C_",i,"= A_",i," + B_",i, sep = "")
```
However, others are right that using good data structures is a way to avoid having to do a lot of tedious things like this. Yet, when you need to, such repetitive code isn't hard to devise.
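If free variables really are required, the same loop can also be sketched with base R's `assign()` and `get()`, which look variables up by name without building and parsing command strings. The A_i and B_i values below are made up so the example runs on its own.

```r
# Hypothetical free variables A_1..A_3 and B_1..B_3
A_1 <- 1; A_2 <- 2; A_3 <- 3
B_1 <- 10; B_2 <- 20; B_3 <- 30

for (i in 1:3) {
  # get() fetches a variable by name; assign() creates one by name
  assign(paste0("C_", i), get(paste0("A_", i)) + get(paste0("B_", i)))
}

C_2
```

This is a swapped-in alternative to `eval(parse(text = ...))`, not a change to the approach above: it still creates one variable per index.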


青春惊慌失措

2020-12-20 02:00

The R way would be to use lists.

```
> a_1 = 1
> a_2 = 2
> a_3 = 3
> a_4 = 4
> a_5 = 5

> b_1 = 1
> b_2 = 2
> b_3 = 3
> b_4 = 4
> b_5 = 5

> a.list <- ls(pattern = '^a_')
> a.list
[1] "a_1" "a_2" "a_3" "a_4" "a_5"
```

Define b.list the same way, then add the pairs:

```
b.list <- ls(pattern = '^b_')

if(length(a.list)==length(b.list)){
   # evaluate each pair of names and add the values
   c.list <- lapply(seq_along(a.list), function(x) eval(parse(text=a.list[x])) + eval(parse(text=b.list[x])))

   c.list.names <- paste('c', seq_along(a.list), sep='_')

   # [[x]] extracts the value itself; [x] would assign a one-element list
   invisible(lapply(seq_along(c.list), function(x) assign(c.list.names[x], c.list[[x]], envir=.GlobalEnv)))
}
```

I can't think of a way to do this without the eval(parse(yuk)) and assign unless you follow csgillespie's advice (which is the right way!)
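One possible way to collect existing a_i and b_i variables into lists without `eval(parse())` is base R's `mget()`, sketched below under the assumption that the variables live in the current environment. The three-element setup is invented for the example, and the result is kept in a list rather than written back as free variables.

```r
# Hypothetical free variables
a_1 <- 1; a_2 <- 2; a_3 <- 3
b_1 <- 1; b_2 <- 2; b_3 <- 3

a.names <- paste("a", 1:3, sep = "_")
b.names <- paste("b", 1:3, sep = "_")

# mget() returns a named list holding the values of the named variables
a.vals <- mget(a.names, envir = environment())
b.vals <- mget(b.names, envir = environment())

# Add the lists element by element and label the results c_1 ... c_3
c.list <- Map(`+`, a.vals, b.vals)
names(c.list) <- paste("c", 1:3, sep = "_")

c.list$c_2
```

Building the names with `paste()` rather than `ls()` also keeps the pairing in numeric order, since `ls()` sorts names alphabetically.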
