How to rewrite this Stata code in R?

挽巷 2021-01-02 07:34

One of the things Stata does well is the way it constructs new variables (see the example below). How can I do this in R?

foreach i in A B C D {  
    forval n=1990         


        
4 Answers
  • 2021-01-02 08:08

    Both Spacedman and Joshua have very valid points. As Stata has only one dataset in memory at any given time, I'd suggest adding the variables to a data frame (which is also a kind of list) instead of to the global environment (see below).

    But honestly, the more R-ish way is to store the population and the year as variables in the data, instead of encoding them in variable names.

    First I create some data in the shape I believe yours has in R (at least, I hope so...):

    Data <- data.frame(
        popA1989 = 1:10,
        popB1989 = 10:1,
        popC1989 = 11:20,
        popD1989 = 20:11
    )
    
    Trend <- replicate(11,runif(10,-0.1,0.1))
    

    You can then use the stack() function to get a long data frame, from which you derive a pop identifier and a numeric year variable:

    newData <- stack(Data)
    newData$pop <- substr(newData$ind,4,4)
    newData$year <- as.numeric(substr(newData$ind,5,8))
    newData$ind <- NULL
    

    Filling in the data frame is then quite easy:

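    for(i in 1:11){
      # rows for the most recent year filled in so far
      tmp <- newData[newData$year==(1988+i),]
      # Trend[,i] has 10 values and is recycled across the 4 populations
      newData <- rbind(newData,
          data.frame( values = tmp$values*(1+Trend[,i]),  # grow by the trend, as in old*(1+trend)
                      pop = tmp$pop,
                      year = tmp$year+1
          )
      )
    }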
    

    In this format, you'll find most R commands (selections of some years, of a single population, modelling effects of either or both, ...) a whole lot easier to perform later on.
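
    As a rough illustration of that point, the sketch below runs a few such operations on a tiny made-up long data frame. The name long and its values are invented for this example; they only mimic the values/pop/year layout built above.

    ```r
    # Hypothetical long-format data: one row per population and year.
    long <- data.frame(
      values = c(10, 11, 12, 20, 22, 24),
      pop    = rep(c("A", "B"), each = 3),
      year   = rep(1989:1991, times = 2)
    )

    # Select a single population.
    popA <- long[long$pop == "A", ]

    # Select some years across all populations.
    recent <- subset(long, year >= 1990)

    # Mean value per population.
    means <- aggregate(values ~ pop, data = long, FUN = mean)

    # Linear model with a year effect and a population effect.
    fit <- lm(values ~ year + pop, data = long)
    ```

    Each step is a single call because the population and the year are ordinary columns rather than parts of a variable name.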

    And if you insist, you can still create a wide format with unstack():

    unstack(newData,values~paste("pop",pop,year,sep=""))
    

    An adaptation of Joshua's answer that adds the columns to the data frame instead:

    for(L in LETTERS[1:4]) {
      for(i in 1990:2000) {
        new <- paste("pop",L,i,sep="")    # name of the new column
        old <- paste("pop",L,i-1,sep="")  # name of the previous year's column
        trend <- Trend[,i-1989]           # trend column for year i
        Data[[new]] <- Data[[old]]*(1+trend)
      }
    }
    
  • 2021-01-02 08:15

    DON'T do it in R. The reason it's messy is that it's UGLY code. Constructing lots of variables with programmatic names is a BAD THING. Names are names. They have no structure, so do not try to impose one on them. Decent programming languages have structures for this - rubbishy programming languages have tacked-on 'Macro' features and end up with this awful pattern of constructing variable names by pasting strings together. This is a practice from the 1970s that should have died out by now. Don't be a programming dinosaur.

    For example, how do you know how many popXXXX variables you have? How do you know if you have a complete sequence of pop1990 to pop2000? What if you want to save the variables to a file to give to someone? Yuck, yuck, yuck.

    Use a data structure that the language gives you. In this case probably a list.
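
    A minimal sketch of what that could look like, assuming one vector of 10 observations per population and one column of trend values per year. The names pop1989, years, trend and pop, and all the numbers, are made up for this illustration and are not taken from the question.

    ```r
    set.seed(1)  # only so the made-up numbers are reproducible

    # Hypothetical starting data: one vector of 10 observations per population.
    pop1989 <- list(A = 1:10, B = 10:1, C = 11:20, D = 20:11)
    years   <- 1990:2000

    # Made-up trend values: 10 rows (observations) by 11 columns (years).
    trend <- replicate(length(years), runif(10, -0.1, 0.1))

    # pop[[L]] is a 10 x 12 matrix: rows are observations, columns are years.
    pop <- lapply(pop1989, function(start) {
      m <- matrix(NA_real_, nrow = length(start), ncol = length(years) + 1,
                  dimnames = list(NULL, as.character(c(1989, years))))
      m[, 1] <- start
      for (j in seq_along(years)) {
        m[, j + 1] <- m[, j] * (1 + trend[, j])  # previous year times (1 + trend)
      }
      m
    })

    pop$A[, "1990"]   # population A in 1990
    length(pop)       # number of populations
    colnames(pop$A)   # the years that are present
    ```

    With this layout the questions above have direct answers: length(pop) counts the populations, colnames() lists the years, and the whole list can be written to a file with a single saveRDS() call.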

  • 2021-01-02 08:25

    Assuming the 1989 populations are in a vector pop1989 (one element per population, A to D) and the yearly trends are in a vector trend (one element per year, 1990 to 2000):

    library(stringr)  # str_c() defaults to sep = "", unlike paste()
    dta <- kronecker(pop1989,cumprod(1+trend))
    names(dta) <- kronecker(str_c("pop",LETTERS[1:4]),1990:2000,str_c)
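
    To make the shape of the result concrete, here is a small sketch with made-up inputs (two populations, three years) that uses only base R. The values are invented, and outer() with paste0() stands in for the kronecker(..., str_c) call, so this is an equivalent under those assumptions rather than the code above.

    ```r
    # Made-up inputs: two populations and three years of trend.
    pop1989 <- c(A = 100, B = 200)
    trend   <- c(0.10, 0.00, -0.50)

    # Cumulative growth factors for 1990, 1991 and 1992.
    growth <- cumprod(1 + trend)

    # Every population times every growth factor, grouped by population.
    dta <- as.vector(kronecker(pop1989, growth))

    # Matching names; t() puts them in the same population-major order.
    labels <- outer(paste0("pop", names(pop1989)), 1990:1992, paste0)
    names(dta) <- as.vector(t(labels))

    dta
    ```

    The result is a single named vector in the order popA1990, popA1991, popA1992, popB1990, and so on.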
    
  • 2021-01-02 08:34

    Assuming popA1989, popB1989, popC1989 and popD1989 already exist in your global environment, along with trend1990 through trend2000, the code below should work. There are certainly more "R-like" ways to do this, but I wanted to give you something similar to your Stata code.

    for(L in LETTERS[1:4]) {
      for(i in 1990:2000) {
        new <- paste("pop",L,i,sep="")  # create name for new variable
        old <- get(paste("pop",L,i-1,sep=""))  # get old variable
        trend <- get(paste("trend",i,sep=""))  # get trend variable
        assign(new, old*(1+trend))
      }
    }
    