Sort data by number of NA's in each line

Submitted by 你。 on 2019-12-23 12:57:30

Question


I want to sort a data frame that has some missing values.

   name dist1   dist2     dist3  prop1 prop2      prop3  month2  month5 month10 month25 month50 issue
 1   A1 232.0 1462.91  232.0000 728.00 0.370 0.05633453  1188.1  1188.1  1188.1  1188.1  1188.1   Yes
 2   A2 142.0   58.26 2847.7690  17.10 0.080 0.07667063 14581.6 15382.0 19510.9 25504.0      NA   Yes
 3   A3 102.0 1160.94  102.0000  53.40 0.090 0.07667063   144.8   144.8   144.8   291.8   761.4   Yes
 4   A4 126.0 1377.23  126.0000  64.30 2.120 0.11040091   366.5   496.8   665.3      NA      NA   Yes
 5   A5 118.0  654.94  118.0000  16.50 0.030 0.05841914     0.0    10.2   198.4   733.7  1717.0   Yes
 6   A6 110.0 1084.63  110.0000 340.00 0.390 0.07405169  4635.0  4863.0  7725.0  8028.0      NA   Yes
 7   A7 123.0    0.00 1801.1811  83.40 0.030 0.06420000  4686.9  4803.6  5052.0  5418.5  7237.5   Yes
 8   A8 125.0    0.00 5557.7428   1.14 0.050 0.06604286  4932.0  8607.0 10827.0 13679.0      NA   Yes
 9   A9 108.0    0.00 6207.3491  92.30 0.070 0.08710000  3360.0  7440.0 10508.0 12571.0 16925.0   Yes
10  A10  60.0    0.00 2500.0000   0.73 0.020 0.06819053    15.1    19.9    19.9    19.9    19.9   Yes
11  A11 210.0  700.78  210.0000   7.78 0.290 0.07866589   182.4   182.4   182.4   298.0  1864.1    No
12  A12 155.0  530.48  155.0000   1.33 0.170 0.07578345     1.0     2.0     3.0     4.0     5.0    No
13  A13  21.0  840.00   21.0000 308.00 0.030 0.05508490  1008.7  1450.8  2439.8  4947.2  6818.9    No
14  A14 114.0 1083.24  114.0000 171.00 0.040 0.04670335   564.7   722.8   760.6   879.8   944.4    No
15  A15 109.0 1051.03  109.0000  20.30 0.070 0.05274389  5503.1  9127.9 11167.4 18226.1 20243.4    No
16  A16 107.0  922.80  107.0000   0.03 0.020 0.04403927   232.6  1016.5  2203.8  3844.9  4000.6    No
17  A17 100.0  278.10  100.0000   0.82 0.100 0.07270705  2754.0  4701.7  5311.9  9579.3 14651.3    No
18  A18 138.0  798.42  138.0000   1.04 0.100 0.07148773  3657.2  4014.0  4525.9  4674.7  4838.5    No
19  A19 105.0  695.02  105.0000   1.41 0.120 0.06716963  3530.2  4076.1 11517.0 18899.5 21073.0    No
20  A20  81.0   12.00  879.2651  16.70 0.120 0.08087098  6477.1  6788.8  7320.0  7947.7  8726.6    No
21  A21 102.0 1052.96  102.0000  66.40 0.010 0.02926897   181.7   294.0   355.5  1431.6      NA    No

Only month2, month5, month10, month25 and month50 contain NAs, and if one of the earlier columns is NA, then all the later ones are NA as well.

I.e., if month2 is NA, then month5, month10, month25 and month50 are all NA.

I want to sort the data based on the number of missing values in each line.

The sorted data frame should have all complete data first, followed by lines with 1 missing value, then with 2, and so on.

Can anyone help me?


Answer 1:


You can use

dat[order(rowSums(is.na(dat))), ]

where dat is the name of your data frame.
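
To see what each piece of that one-liner does, here is a minimal sketch on a small made-up data frame. The names toy, na_count and sorted are hypothetical and are not from the question; only is.na(), rowSums() and order() come from the answer.

```r
# Hypothetical toy data, not the asker's data frame.
toy <- data.frame(
  name   = c("a", "b", "c", "d"),
  month2 = c(1.0, 2.0, NA, 4.0),
  month5 = c(NA, 2.5, NA, 4.5)
)

# is.na(toy) is a logical matrix; rowSums() counts the TRUE values per row.
na_count <- rowSums(is.na(toy))   # 1, 0, 2, 0

# order() gives the row positions from fewest to most NAs.
# Rows with the same count stay in their original relative order.
sorted <- toy[order(na_count), ]

print(sorted$name)   # b, d, a, c
```

For most NAs first, order(na_count, decreasing = TRUE) should work the same way.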




Answer 2:


Is this what you want? Assume dat is your given sample data.

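# count the NAs in each row, then sort the counts in increasing order
s <- sort(apply(is.na(dat), 1, sum))
# select rows by name; this assumes s carries the row names of dat
dat[names(s), ]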
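
The lookup by names(s) depends on the counts carrying row names. If names(s) turns out to be NULL, the name lookup would select no rows. A variant sketch that selects rows by position instead, again on a hypothetical toy data frame (toy, counts and by_position are made-up names):

```r
# Hypothetical toy data, not the asker's data frame.
toy <- data.frame(
  month2 = c(NA, 2, 3),
  month5 = c(NA, NA, 3)
)

# Same per-row NA count as in the answer above.
counts <- apply(is.na(toy), 1, sum)   # 2, 1, 0

# order() returns positions, so no row names are needed.
by_position <- toy[order(counts), ]

print(rownames(by_position))   # "3" "2" "1"
```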


Source: https://stackoverflow.com/questions/24093446/sort-data-by-number-of-nas-in-each-line
