R: populating and/or duplicating rows based upon other columns

99封情书 提交于 2019-12-06 03:04:37

问题


My question is based upon this question.

I have a data as below. I want to fill cells by first looking down and then by looking up as long as the bom is same. In case of bom=A, I want to fill up rows as shown. But in case of bom=B, as the type_p column is different, I want to duplicate rows and feel the blanks

bom=c(rep("A",4),rep("B",3))
Part=c("","lambda","beta","","tim","tom","")
type_p=c("","sub","sub","","sub","pan","")
ww=c(1,2,3,4,1,2,3)
df=data.frame(bom,Part,type_p,ww)

> df
  bom   Part type_p ww
1   A                1
2   A lambda    sub  2
3   A   beta    sub  3
4   A                4
5   B    tim    sub  1
6   B    tom    pan  2
7   B                3

The final data that I want is as below

    bom Part    type_p  ww
1   A   lambda  sub      1
2   A   lambda  sub      2
3   A   beta    sub      3
4   A   beta    sub      4
5   B   tim     sub      1
6   B   tim     sub      2
7   B   tim     sub      3
5   B   tom     pan      1
6   B   tom     pan      2
7   B   tom     pan      3

________________________________________Update 1

The logic that I want is as below. Please remember that my data is very huge and I have thousands of values in each column.

bom and ww columns are always populated/filled in incoming data

  1. Check if an entry in column bom has more than 1 value in the column type_p
  2. If there is only 1 value then fill blanks in type_p and ww columns by first looking down and then looking up. In this case bom=A has only one value in type_p (sub)
  3. If an entry in column bom has more than 1 unique value in type_p column then create additional sets of the same rows of that bom such that total sets will be equal to distinct values in type_p column for that bom. In this case bom=B has two values in type_p (sub and pan)
  4. Fill blanks in type_p and ww columns by first looking down and then looking up (look at the source row to fill up values)

===========================================================Update 2

After step 3, the data frame would look like below

> df
   bom   Part type_p ww
1    A lambda    sub  1
2    A lambda    sub  2
3    A   beta    sub  3
4    A   beta    sub  4
5    B    tim    sub  1
6    B                2
7    B                3
8    B                1
9    B    tom    pan  2
10   B                3

回答1:


With tidyr and dplyr, you could manage to do something near what you aim

library(tidyr)
library(dplyr)
# replacing empty string with NA
df <- df %>% mutate_each(funs(sub("^$", NA, .)), Part, type_p)
# filling missing values 
df <- df %>% fill(Part, type_p,.direction = "down") %>% fill(Part, type_p,.direction = "up")

df
#>   bom   Part type_p ww
#> 1   A lambda    sub  1
#> 2   A lambda    sub  2
#> 3   A   beta    sub  3
#> 4   A   beta    sub  4
#> 5   B    tim    sub  1
#> 6   B    tom    pan  2
#> 7   B    tom    pan  3

To obtain what you described (in question and comment), you could treat BOM A & B separately:

bind_rows(
  df %>% filter(bom == "A"), 
  df %>% filter(bom == "B") %>%
    complete(nesting(bom, Part, type_p), ww)
)
#> Source: local data frame [10 x 4]
#> 
#>       bom   Part type_p    ww
#>    (fctr)  (chr)  (chr) (dbl)
#> 1       A lambda    sub     1
#> 2       A lambda    sub     2
#> 3       A   beta    sub     3
#> 4       A   beta    sub     4
#> 5       B    tim    sub     1
#> 6       B    tim    sub     2
#> 7       B    tim    sub     3
#> 8       B    tom    pan     1
#> 9       B    tom    pan     2
#> 10      B    tom    pan     3


来源:https://stackoverflow.com/questions/35711655/r-populating-and-or-duplicating-rows-based-upon-other-columns

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!