Merging two data frames with different sizes and missing values

Posted by 独自空忆成欢 on 2019-12-24 11:09:56

Question


I'm having a problem merging two data frames in R.

The first one consists of 103731 obs of 6 variables. The variable I have to merge on has 77111 unique values; the remaining entries are NAs, coded as 0. The second one contains the frequency of each of those values plus the frequency of the NAs, so it is a frame of 77112 obs of 2 variables.

The result I need is the first frame joined with the frequency of the merging variable: a df of 103731 obs where every row carries the frequency of its merging value. Values with freq > 1 therefore appear on several rows, and every NA (or 0) row gets the NA frequency.

Can anybody help me?

The result I'm getting now is a data frame of 1,894,919 obs. I used:

tot <- merge(df1, df2, by = "mergingVar", all = FALSE, sort = FALSE)

I also tried several settings of 'all=', and none of them gave the right df.


Answer 1:


Why don't you just take the frequency table of your first data frame?

a <- data.frame(a = c(NA, NA, 2,2,3,3,3))
data.frame(table(a, useNA = 'ifany'))

     a Freq
1    2    2
2    3    3
3 <NA>    2
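
To put those counts on every row of the original frame without calling merge(), the table can serve as a lookup. This is a minimal base R sketch, assuming the key column is named mergingVar as in the question; df1 and its column other are made-up toy data, not the asker's frame.

```r
# Toy stand-in for df1; the values are invented for illustration.
df1 <- data.frame(mergingVar = c(NA, NA, 2, 2, 3, 3, 3),
                  other = letters[1:7])

# Count every distinct key, keeping NA as its own category.
counts <- table(df1$mergingVar, useNA = "ifany")

# match() pairs NA with NA, so rows with a missing key find their count too.
idx <- match(as.character(df1$mergingVar), names(counts))

# One frequency per original row; the number of rows does not change.
df1$freq <- as.vector(counts[idx])
```

The same lookup should work if the missing values are coded as 0 instead of NA, since 0 is then counted like any other key.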

Or use mutate with ddply from plyr:

library(plyr)
ddply(a, .(a), mutate, freq = length(a))

   a freq
1  2    2
2  2    2
3  3    3
4  3    3
5  3    3
6 NA    2
7 NA    2
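
As for the original merge() call: merge() returns one row for every matching pair of rows, so a result larger than df1 points to key values that occur more than once in df2. Whether that is what happened in the question is an assumption. The frames below are hypothetical toy data that only illustrate how to check for it.

```r
# Hypothetical toy data; the key 0 stands in for the missing values.
df1 <- data.frame(mergingVar = c(0, 0, 2, 2, 3), x = 1:5)
# df2 should hold one row per key, but here key 0 appears twice.
df2 <- data.frame(mergingVar = c(0, 0, 2, 3), Freq = c(2, 2, 2, 1))

# Position of the first repeated key in df2, or 0 if every key is unique.
dup_pos <- anyDuplicated(df2$mergingVar)

# Each of the two 0 rows in df1 pairs with both 0 rows in df2,
# so the merged result has more rows than df1.
tot <- merge(df1, df2, by = "mergingVar", all = FALSE, sort = FALSE)
n_rows <- nrow(tot)
```

If anyDuplicated() returns 0 for the real df2, the extra rows must come from somewhere else.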


Source: https://stackoverflow.com/questions/22225409/merging-two-data-frames-with-different-sizes-and-missing-values
