How to avoid NA columns in dcast() output?

名媛妹妹

How can I avoid NA columns in dcast() output from the reshape2 package?

In this dummy example the dcast() o

4 Answers

独厮守ぢ

2021-01-21 06:26

You could rename the NA column of the output and then make it NULL. (This works for me).

require(reshape2)
data(iris)
iris[ , "Species2"] <- iris[ , "Species"]
iris[ 2:7, "Species2"] <- NA

(x <- dcast(iris, Species ~ Species2, value.var = "Sepal.Width", 
            fun.aggregate = length)) 

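# setnames() comes from data.table, not reshape2, so rename with base R instead.
# x has five columns: Species, the three species levels, and the trailing NA column.
names(x) <- c("Species", "setosa", "versicolor", "virginica", "newname")

x$newname <- NULL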


说谎

2021-01-21 06:28
Here is how I was able to get around it:
```
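# Species2 is a factor in the question's example; convert it to character first,
# otherwise assigning 'None' is an invalid factor level and the values stay NA.
iris$Species2 <- as.character(iris$Species2)
iris[is.na(iris)] <- 'None'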

x <- dcast(iris, Species ~ Species2, value.var="Sepal.Width", fun.aggregate = length)

x$None <- NULL
```
The idea is that you replace all the NAs with 'None', so that dcast creates a column called 'None' rather than 'NA'. Then, you can just delete that column in the next step if you don't need it.

庸人自扰

2021-01-21 06:37

One solution that I'm reasonably happy with builds on the approach of dropping NA values suggested in the comments. It uses the subset argument of dcast() together with .() from plyr:

require(plyr)
(x <- dcast(iris, Species ~ Species2, value.var = "Sepal.Width",
            fun.aggregate = length, subset = .(!is.na(Species2))))
##     Species setosa versicolor virginica
##1     setosa     44          0         0
##2 versicolor      0         50         0
##3  virginica      0          0        50

For my particular purpose (within a custom function) the following works better:

(x <- dcast(iris, Species ~ Species2, value.var = "Sepal.Width", 
            fun.aggregate = length, subset = .(!is.na(get("Species2")))))
##     Species setosa versicolor virginica
##1     setosa     44          0         0
##2 versicolor      0         50         0
##3  virginica      0          0        50
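
If you would rather not depend on plyr, the same drop-the-NA-rows idea can probably be done by filtering the data frame with base R before calling dcast(). The following is only a sketch: `iris2` and `complete_rows` are names made up for illustration, and the data are rebuilt the same way as in the other answers.

```r
library(reshape2)

# Rebuild the example data: copy Species and set six rows to NA
iris2 <- iris
iris2$Species2 <- iris2$Species
iris2[2:7, "Species2"] <- NA

# Drop the rows whose column variable is NA before casting,
# so dcast() never sees an NA level and creates no NA column
complete_rows <- iris2[!is.na(iris2$Species2), ]

x <- dcast(complete_rows, Species ~ Species2, value.var = "Sepal.Width",
           fun.aggregate = length)
names(x)
```

Filtering up front gives the same counts as the subset argument, at the cost of an intermediate data frame.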


挽巷

2021-01-21 06:41

library(dplyr)
library(tidyr)
iris %>%
  filter(!is.na(Species2)) %>%
  group_by(Species, Species2) %>%
  summarize(freq = n()) %>%
  spread(Species2, freq)
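
Since spread() is superseded in current tidyr, a roughly equivalent pipeline can be sketched with count() and pivot_wider(). This is an illustration rather than a definitive replacement: `iris2` and `wide` are made-up names, and `values_fill = 0L` is added on the assumption that you want zeros (as dcast() with `length` produces) instead of NA for combinations that do not occur.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data: copy Species and set six rows to NA
iris2 <- iris
iris2$Species2 <- iris2$Species
iris2[2:7, "Species2"] <- NA

wide <- iris2 %>%
  filter(!is.na(Species2)) %>%                 # drop rows with a missing column variable
  count(Species, Species2, name = "freq") %>%  # one row per observed combination
  pivot_wider(names_from = Species2, values_from = freq,
              values_fill = 0L)                # fill absent combinations with zero
wide
```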
