Question
I want to sort a data frame that has some missing values.
name dist1 dist2 dist3 prop1 prop2 prop3 month2 month5 month10 month25 month50 issue
1 A1 232.0 1462.91 232.0000 728.00 0.370 0.05633453 1188.1 1188.1 1188.1 1188.1 1188.1 Yes
2 A2 142.0 58.26 2847.7690 17.10 0.080 0.07667063 14581.6 15382.0 19510.9 25504.0 NA Yes
3 A3 102.0 1160.94 102.0000 53.40 0.090 0.07667063 144.8 144.8 144.8 291.8 761.4 Yes
4 A4 126.0 1377.23 126.0000 64.30 2.120 0.11040091 366.5 496.8 665.3 NA NA Yes
5 A5 118.0 654.94 118.0000 16.50 0.030 0.05841914 0.0 10.2 198.4 733.7 1717.0 Yes
6 A6 110.0 1084.63 110.0000 340.00 0.390 0.07405169 4635.0 4863.0 7725.0 8028.0 NA Yes
7 A7 123.0 0.00 1801.1811 83.40 0.030 0.06420000 4686.9 4803.6 5052.0 5418.5 7237.5 Yes
8 A8 125.0 0.00 5557.7428 1.14 0.050 0.06604286 4932.0 8607.0 10827.0 13679.0 NA Yes
9 A9 108.0 0.00 6207.3491 92.30 0.070 0.08710000 3360.0 7440.0 10508.0 12571.0 16925.0 Yes
10 A10 60.0 0.00 2500.0000 0.73 0.020 0.06819053 15.1 19.9 19.9 19.9 19.9 Yes
11 A11 210.0 700.78 210.0000 7.78 0.290 0.07866589 182.4 182.4 182.4 298.0 1864.1 No
12 A12 155.0 530.48 155.0000 1.33 0.170 0.07578345 1.0 2.0 3.0 4.0 5.0 No
13 A13 21.0 840.00 21.0000 308.00 0.030 0.05508490 1008.7 1450.8 2439.8 4947.2 6818.9 No
14 A14 114.0 1083.24 114.0000 171.00 0.040 0.04670335 564.7 722.8 760.6 879.8 944.4 No
15 A15 109.0 1051.03 109.0000 20.30 0.070 0.05274389 5503.1 9127.9 11167.4 18226.1 20243.4 No
16 A16 107.0 922.80 107.0000 0.03 0.020 0.04403927 232.6 1016.5 2203.8 3844.9 4000.6 No
17 A17 100.0 278.10 100.0000 0.82 0.100 0.07270705 2754.0 4701.7 5311.9 9579.3 14651.3 No
18 A18 138.0 798.42 138.0000 1.04 0.100 0.07148773 3657.2 4014.0 4525.9 4674.7 4838.5 No
19 A19 105.0 695.02 105.0000 1.41 0.120 0.06716963 3530.2 4076.1 11517.0 18899.5 21073.0 No
20 A20 81.0 12.00 879.2651 16.70 0.120 0.08087098 6477.1 6788.8 7320.0 7947.7 8726.6 No
21 A21 102.0 1052.96 102.0000 66.40 0.010 0.02926897 181.7 294.0 355.5 1431.6 NA No
Only month2, month5, month10, month25 and month50 contain NAs, and if an earlier month is NA, then all the later months are NA as well.
I.e. if month2 is NA, then month5, month10, month25 and month50 are all NA.
I want to sort the data by the number of missing values in each row.
The sorted data frame should have all complete rows first, followed by rows with 1 missing value, then rows with 2, and so on.
Can anyone help me?
Answer 1:
You can use
dat[order(rowSums(is.na(dat))), ]
where dat is the name of your data frame.
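To see why this works: is.na(dat) gives a logical matrix, rowSums() counts the TRUE values in each row, and order() returns the row positions from fewest to most NAs, leaving ties in their original order. Here is a small sketch on a made-up data frame (the values and the names na_count and sorted are only for illustration, not taken from the question):

```r
# Hypothetical toy data, not the asker's data frame
dat <- data.frame(
  name    = c("A1", "A2", "A3", "A4"),
  month2  = c(10, 20, 30, 40),
  month5  = c(11, NA, 31, 41),
  month10 = c(12, NA, NA, 42)
)

# Number of missing values in each row
na_count <- rowSums(is.na(dat))

# Reorder the rows: complete rows first, then 1 NA, then 2 NAs, ...
sorted <- dat[order(na_count), ]
print(sorted)
```

Rows A1 and A4 (no NAs) should come first, then A3 (one NA), then A2 (two NAs).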
Answer 2:
Is this what you want? Assume dat is your given sample data.
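# Count the NAs in each row, then sort the counts in increasing order
s <- sort(apply(is.na(dat), 1, sum))
# Index dat by the row names attached to the sorted counts
dat[names(s), ]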
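Note that indexing with names(s) only works when the counts in s carry the row names of dat. A minimal sketch of a variant that keeps the apply() count but indexes by position, so it does not depend on row names at all (the toy data and the names n_missing and sorted are made up for illustration):

```r
# Hypothetical toy data, not the asker's data frame
dat <- data.frame(
  name   = c("A1", "A2", "A3"),
  month2 = c(1, NA, 3),
  month5 = c(NA, NA, 4)
)

# Count the NAs per row with apply(), as in the answer above
n_missing <- apply(is.na(dat), 1, sum)

# order() returns row positions, so no row names are needed
sorted <- dat[order(n_missing), ]
print(sorted)
```

This should give the same row order as the rowSums() version in Answer 1.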
Source: https://stackoverflow.com/questions/24093446/sort-data-by-number-of-nas-in-each-line