Mixed Merge in R - Subscript solution?

无人及你 2021-02-06 16:09

Note: I changed the example from when I first posted. My first example was too simplified to capture the real problem.

I have two data frames

2 Answers
  •  囚心锁ツ
    2021-02-06 16:27

    There are several ways to do this (it is R, after all), but I think the clearest is to create an index. We need a function that generates a sequential index, starting at one and ending at the number of observations.

    seq_len(3) 
    > [1] 1 2 3
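
    As a small aside, base R has two closely related helpers for this: seq_len counts up to a given length, while seq_along counts along an existing vector. The sketch below contrasts them; the vector `values` is made up for illustration.

    ```r
    # seq_len() takes a single length and counts up to it.
    n_index <- seq_len(3)
    print(n_index)

    # seq_along() takes a vector and counts along its elements,
    # regardless of what the elements are.
    values <- c(10, 20, 30)  # hypothetical data
    v_index <- seq_along(values)
    print(v_index)
    ```

    Both calls produce the integers 1 to 3. The difference matters when a function is handed a whole vector rather than a length.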
    

    But we need to calculate this index within each level of the grouping variable (state). For this we can use R's ave function. It takes a numeric vector as the first argument, then the grouping factors, and finally the function to apply within each group.

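    # ave() passes each group's whole vector to FUN, so use seq_along;
    # seq_len expects a single length, not a vector.
    s1$index <- with(s1, ave(value1, state, FUN = seq_along))
    s2$index <- with(s2, ave(value2, state, FUN = seq_along))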
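
    To see what the per-group index looks like, here is a minimal sketch on a made-up data frame (`df`, `value`, and the interleaved row order are assumptions for illustration, not the data from the question). It uses seq_along as FUN, since ave hands each group's values to FUN as a vector.

    ```r
    # Hypothetical data frame with the two states interleaved.
    df <- data.frame(
      state = c("IA", "IL", "IA", "IL", "IA"),
      value = c(10, 20, 30, 40, 50)
    )

    # Number the rows within each state; ave() returns the result
    # in the original row order.
    df$index <- with(df, ave(value, state, FUN = seq_along))
    print(df)
    ```

    The IA rows get index 1, 2, 3 and the IL rows get 1, 2, each in order of appearance, so the rows do not need to be sorted by state first.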
    

    (Note the use of with, which evaluates the expression with the data frame's columns available as variables. I find this cleaner than writing s1$value1, s2$value2, etc.)

    Now we can simply merge (join) the two data frames. By default, merge joins on the variables present in both data frames, here state and index.

    merge(s1,s2)
    

    which gives

       state index value1 value2
    1    IA     1      1      6
    2    IA     2      2      7
    3    IA     3      3      8
    4    IL     1      4      3
    5    IL     2      5      4
    6    IL     3      6      5
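
    For readers who want to run the whole thing, here is an end-to-end sketch. The question's data frames are not shown here, so `s1` and `s2` below are reconstructed from the merged output above and should be treated as an assumption. The join keys are spelled out with `by` to make the intent explicit.

    ```r
    # Assumed inputs, reconstructed from the merged result shown above.
    s1 <- data.frame(state = rep(c("IA", "IL"), each = 3), value1 = 1:6)
    s2 <- data.frame(state = rep(c("IL", "IA"), each = 3), value2 = 3:8)

    # Sequential index within each state.
    s1$index <- with(s1, ave(value1, state, FUN = seq_along))
    s2$index <- with(s2, ave(value2, state, FUN = seq_along))

    # Join on both keys; this matches the default when state and index
    # are the only shared column names.
    merged <- merge(s1, s2, by = c("state", "index"))
    print(merged)
    ```

    Naming the keys in `by` guards against an accidental join on some other column that happens to share a name in both data frames.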
    

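    For this to work, each state must have the same number of observations in both data frames. Rows are paired by their order within each state, and merge by default keeps only rows matched in both data frames, so any extra rows are silently dropped.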
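
    One way to guard against this is to compare the per-state row counts before merging. The sketch below uses made-up data in which `s2` is missing one IA row; the names `counts_match`, `inner`, and `outer` are illustrative only.

    ```r
    # Hypothetical inputs: s2 has only two IA rows, s1 has three.
    s1 <- data.frame(state = c("IA", "IA", "IA", "IL", "IL", "IL"), value1 = 1:6)
    s2 <- data.frame(state = c("IL", "IL", "IL", "IA", "IA"), value2 = 3:7)

    s1$index <- with(s1, ave(value1, state, FUN = seq_along))
    s2$index <- with(s2, ave(value2, state, FUN = seq_along))

    # Compare the number of rows per state in the two data frames.
    counts_match <- identical(table(s1$state), table(s2$state))
    print(counts_match)

    # The default merge keeps only matched rows; all = TRUE keeps
    # unmatched rows and fills the missing values with NA.
    inner <- merge(s1, s2)
    outer <- merge(s1, s2, all = TRUE)
    print(nrow(inner))
    print(outer)
    ```

    Here the counts differ, the default merge returns five rows instead of six, and the all = TRUE version keeps the unmatched IA row with NA in value2.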

    [Edit: commented the code for clarity.] [Edit: Used seq_len instead of creating a new function as suggested by hadley.]
