I have a problem using data from a tab-delimited data file imported with read.delim.
Most of the columns contain numerical data which I need to do a
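You say "Most of the columns contain numerical data". That's the problem. The function apply can be used without changing the data type only when all columns contain numerical data. If any other column holds non-numerical data, apply converts the whole data frame to a character matrix, so every row reaches your function as character strings. In that case you should convert the values back to numeric inside the function you pass to apply: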
pvalue <- apply(x, 1, function(tmp) {
  # each row arrives as a character vector, so convert the two groups first
  a <- as.numeric(tmp[c(5, 7, 9)])
  b <- as.numeric(tmp[c(11, 13, 15)])
  # only run the test when neither group is constant
  if (length(unique(a)) != 1 && length(unique(b)) != 1)
    t.test(a, b)$p.value
  else NA
})
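To see the coercion for yourself, here is a minimal sketch. The data frame df and its column names are made up for illustration; they are not from your file.

```r
# Hypothetical toy data: one text column next to two numeric columns
df <- data.frame(id = c("a", "b"), v1 = c(1.5, 2.5), v2 = c(3, 4),
                 stringsAsFactors = FALSE)

# apply() works on as.matrix(df); with a text column present,
# the whole matrix is character
m <- as.matrix(df)
print(is.character(m))

# so each row is handed to the function as a character vector
row_classes <- apply(df, 1, function(tmp) class(tmp))
print(row_classes)

# restricting apply() to the numeric columns keeps the rows numeric
row_sums <- apply(df[, c("v1", "v2")], 1, sum)
print(row_sums)
```

If you only need the numeric columns, subsetting before apply, as in the last step, avoids the as.numeric calls altogether.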
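It is possible that some of your data is not in numeric format after loading. Check the structure of the data with str(your.data). If your desired variables are not numeric, you can convert them with data$var1 <- as.numeric(data$var1). If a column was read in as a factor, use as.numeric(as.character(data$var1)) instead, because as.numeric alone returns the factor's integer level codes rather than the original values.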
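A small sketch of that check, assuming a made-up two-column input (txt, dat and the columns a and b are placeholders, not your data):

```r
# Hypothetical tab-delimited input with a stray text entry in column b
txt <- "a\tb\n1\t2\n3\tx\n5\t6"
dat <- read.delim(text = txt, stringsAsFactors = FALSE)

# inspect how each column was read
str(dat)

# list the columns that did not come in as numeric
non_numeric <- names(dat)[!sapply(dat, is.numeric)]
print(non_numeric)

# convert the character column; entries that are not numbers become NA
# (as.numeric warns about this, silenced here)
dat$b <- suppressWarnings(as.numeric(dat$b))
print(dat$b)
```

A single non-numeric entry is enough to make the whole column non-numeric, so it is worth looking at which values turn into NA after the conversion.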
I have been able to reproduce your error message with the following small example:
x = as.factor(1:5)
y = as.factor(1:5)
t.test(x, y)
yields
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA
The problem is you are trying to perform a t-test on non-numeric vectors. Addition likewise is not defined for factors:
x + y
yields
[1] NA NA NA NA NA
Warning message:
In Ops.factor(x, y) : + not meaningful for factors
The warning gives keen insight as to what is amiss and also explains why your t-test is not working.
To fix the problem, you need to do as ilya suggests: convert your vectors to numeric with as.numeric(as.character(x)).
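The inner as.character matters because as.numeric on a factor returns the level codes, not the values. A short sketch, using x from the example above and a hypothetical second factor y chosen so that the codes and the values differ:

```r
x <- as.factor(1:5)
y <- as.factor(6:10)   # hypothetical values, not from the question

# as.numeric() on a factor gives the integer level codes
codes <- as.numeric(y)

# going through as.character() recovers the original values
values <- as.numeric(as.character(y))
print(codes)
print(values)

# with both vectors numeric, t.test runs without the error
p <- t.test(as.numeric(as.character(x)), as.numeric(as.character(y)))$p.value
print(p)
```

With factor(1:5) the codes happen to equal the values, which is why the mistake is easy to miss on small test cases.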