na.rm

data.table 1.8.x mean() function auto removing NA?

限于喜欢 submitted on 2019-12-10 13:24:54
Question: Today I found a bug in my program caused by data.table automatically removing NA values in mean(). For example:

    > a <- data.table(a=c(NA,NA,FALSE,FALSE), b=c(1,1,2,2))
    > a
    > a[,list(mean(a), sum(a)),by=b]
       b V1 V2
    1: 1  0 NA   # Why V1 = 0 here? I had expected NA
    2: 2  0  0
    > mean(c(NA,NA,FALSE,FALSE))
    [1] NA
    > mean(c(NA,NA))
    [1] NA
    > mean(c(FALSE,FALSE))
    [1] 0

Is this the intended behaviour?

Answer 1: This isn't intended. Looks like a problem with optimization ...

    > a[,list(mean(a), sum(a)),by=b]
       b V1 V2
    1: 1  0 NA
    2: 2  0
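For comparison, here is a minimal base R sketch of the same grouped mean, without data.table. It only shows what plain `mean()` returns per group; it says nothing about how any particular data.table version behaves. The data frame name `d` is made up for this example.

```r
# Same toy data as in the question, held in a plain data.frame.
d <- data.frame(a = c(NA, NA, FALSE, FALSE), b = c(1, 1, 2, 2))

# Default behaviour: NA is kept, so a group containing NA has an NA mean.
base_means <- tapply(d$a, d$b, mean)

# Explicit removal: group 1 has nothing left after dropping NA,
# so its mean is NaN (0/0), not 0.
narm_means <- tapply(d$a, d$b, mean, na.rm = TRUE)

print(base_means)
print(narm_means)
```

If a grouped result differs from `base_means`, that is a hint that NA handling changed somewhere along the way.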

Issue with NA values in R

会有一股神秘感。 submitted on 2019-12-02 00:41:21
I feel this should be something easy. I have looked across the internet, but I keep getting error messages. I have done plenty of analytics in the past but am new to R and programming. I have a pretty basic function to calculate means across columns of data:

    columnmean <- function(y){
      nc <- ncol(y)
      means <- numeric(nc)
      for(i in 1:nc) {
        means[i] <- mean(y[,i])
      }
      means
    }

I'm in RStudio and testing it using the included 'airquality' dataset. When I load the AQ dataset and run my function:

    data("airquality")
    columnmean(airquality)

I get back:

    NA NA 9.957516 77.882353 6.993464 15.803922

Because the first two
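One way to get numeric means for every column is to pass `na.rm` through to `mean()`. The sketch below assumes the NA results come from missing values in the first two columns of `airquality` (Ozone and Solar.R); the `na.rm` argument name and its `TRUE` default are choices made for this example, not part of the original function.

```r
# Same loop as in the question, plus a na.rm argument forwarded to mean().
columnmean <- function(y, na.rm = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], na.rm = na.rm)
  }
  means
}

data("airquality")
with_na    <- columnmean(airquality, na.rm = FALSE)  # NA for columns with missing values
without_na <- columnmean(airquality)                 # missing values dropped per column

# Base R's colMeans() does the same job in one call.
builtin <- colMeans(airquality, na.rm = TRUE)

print(with_na)
print(without_na)
```

Keeping `na.rm` as an argument, rather than hard-coding it, lets the caller still ask for the strict behaviour when missing values should not be ignored silently.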

How to pass na.rm as argument to tapply?

空扰寡人 submitted on 2019-11-30 23:14:56
Question: I'd like to calculate mean and sd from a dataframe with one column for the parameter and one column for a group identifier. How can I calculate them when using tapply? I could use sd(v1, group, na.rm=TRUE), but can't fit the na.rm=TRUE into the statement when using tapply. omit.na is no option. I have a whole bunch of parameters and have to go through them step by step without losing half of the dataframe when excluding all lines with one missing value.

    data("weightgain", package = "HSAUR"
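`tapply()` forwards any extra arguments to the function it applies, so `na.rm = TRUE` can simply be appended to the call. The sketch below uses a small made-up data frame (`df`, `v1`, `group` are illustrative names) rather than the dataset from the question.

```r
# Hypothetical data: one parameter column with missing values, one grouping column.
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  v1    = c(1, 3, NA, 4, 8, 6)
)

# Arguments after the function are passed on to it for every group.
group_mean <- tapply(df$v1, df$group, mean, na.rm = TRUE)
group_sd   <- tapply(df$v1, df$group, sd,   na.rm = TRUE)

print(group_mean)
print(group_sd)
```

Because the missing values are dropped per column and per group, rows with an NA in some other parameter are still used, which avoids the loss of data that removing whole rows would cause.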

Is it possible to set na.rm to TRUE globally?

血红的双手。 submitted on 2019-11-28 09:02:19
For commands like max the option na.rm is set by default to FALSE. I understand why this is a good idea in general, but I'd like to turn it off reversibly for a while, i.e. during a session. How can I require R to set na.rm = TRUE whenever it is an option? I found options(na.action = na.omit) but this doesn't work.

I know that I can set an na.rm=TRUE option for each and every function I write.

    my.max <- function(x) {max(x, na.rm=TRUE)}

But that's not what I am looking for. I'm wondering if there's something I could do more globally/universally instead of doing it for each function. One
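As far as I know, base R has no single switch for this, so the following is only a sketch of one workaround: a small factory that wraps a function with `na.rm = TRUE` as the default, assigned over the original name for the session and removed again to undo it. The helper name `with_na_rm` is made up for this example.

```r
# Build a wrapper around f whose na.rm defaults to TRUE.
with_na_rm <- function(f) {
  function(..., na.rm = TRUE) f(..., na.rm = na.rm)
}

# Mask max() in the workspace; base::max itself is untouched.
max <- with_na_rm(base::max)
masked <- max(c(1, NA, 3))      # uses the wrapper

# Removing the mask restores the usual default (na.rm = FALSE).
rm(max)
unmasked <- max(c(1, NA, 3))    # back to base::max

print(masked)
print(unmasked)
```

The same factory could be applied to `min`, `sum`, `mean` and so on. It only affects calls that resolve to the masked name in your workspace, so code inside packages keeps its own defaults.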

NaN is removed when using na.rm=TRUE

守給你的承諾、 submitted on 2019-11-28 00:05:12
Question: This reproducible example is a very simplified version of my code:

    x <- c(NaN, 2, 3)
    # This is fine, as expected
    max(x)
    > NaN
    # Why does na.rm remove NaN?
    max(x, na.rm=TRUE)
    > 3

To me, NA (missing value) and NaN (not a number) are two completely different entities. Why does na.rm remove NaN? How can I ignore NA and not NaN?

PS: I am using 64-bit R version 3.0.0 on Windows 7.

Edit: Upon some more study I found that is.na returns true for NaN too! This is the cause of confusion for me.

    is.na(NaN)
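Since `is.na()` is TRUE for both NA and NaN while `is.nan()` is TRUE only for NaN, the two can be combined to drop NA but keep NaN. The sketch below filters the vector by hand instead of relying on `na.rm`; the variable names are illustrative.

```r
x <- c(NaN, 2, 3, NA)

# is.na() flags both kinds of value; is.nan() flags only NaN.
na_flags  <- is.na(x)
nan_flags <- is.nan(x)

# Keep everything that is not a "true" NA, i.e. keep NaN and ordinary numbers.
keep <- !na_flags | nan_flags
only_na_removed <- x[keep]

# NaN is still present, so it propagates into the result.
result <- max(only_na_removed)

print(only_na_removed)
print(result)
```

This keeps the NaN visible in downstream results, which may be what you want if NaN signals a failed computation rather than a missing observation.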
