Question
I am new to R and I am trying to draw a histogram using hist() of a list of 100,000 numbers like this:
-0.764
-0.662
-0.764
-0.019
0.464
0.668
0.464
but I cannot do it because R complains that the content is not numeric. This is what I've tried:
I read the file using
t <- read.table(file = "file.txt", sep = "\n", dec = ".", header = TRUE)
The data loads and looks fine (I get the same values). I then tried to make it numeric using
as.numeric(c(t[,1]))
sapply(t, as.numeric)
but I get completely different numbers, like
53 428 791 428 582 428 979 428 456 533 550
I think there might be a problem with the decimal point "." or the negative signs "-" or both. Any ideas?
Many thanks!
Answer 1:
R seems to have read the first column of your data as a factor. This should not happen if all the values in this column were numeric in your file, so there must be an element that is not recognized as a number.
You can try the following (which is a bit dirty) in R to identify where the problem is. Starting with the following factor:
R> v <- factor(c("0.51", "-0.12", "0.345", "0.45b", "-0.8"))
You can identify which value causes the problem with:
R> v[is.na(as.numeric(as.character(v)))]
[1] 0.45b
And you can find the position of this value in your vector with:
R> which(is.na(as.numeric(as.character(v))))
[1] 4
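Applied to a data frame like the one in the question, the same check might look like the sketch below. The sample values, the column name value, and the malformed entry "0.45b" are made up for illustration and stand in for file.txt. stringsAsFactors = TRUE is passed explicitly because, since R 4.0.0, read.table no longer creates factors by default (the column would come back as character instead).

```r
# Hypothetical sample standing in for file.txt; "0.45b" is the malformed entry.
lines <- c("value", "-0.764", "-0.662", "0.45b", "0.464", "0.668")

# stringsAsFactors = TRUE reproduces the factor column described in the question.
t <- read.table(text = lines, header = TRUE, stringsAsFactors = TRUE)

# Go through character so the labels, not the internal codes, are parsed.
# suppressWarnings() hides the expected "NAs introduced by coercion" warning.
x <- suppressWarnings(as.numeric(as.character(t[, 1])))

bad_rows   <- which(is.na(x))               # positions that failed to parse
bad_values <- as.character(t[bad_rows, 1])  # the offending text itself

clean <- x[!is.na(x)]                       # keep only the parseable numbers
hist(clean)
```

Once bad_values shows what is wrong in the file, fixing the file itself is probably cleaner than dropping the rows in R.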
Answer 2:
If you want to convert a factor to a numeric type, you have to understand how factors work.
Internally, each distinct value (each "level") in a column of class factor is stored as an integer code. These codes are the numbers that you're seeing when you run as.numeric. They are just indexes into the levels of the factor, so if you type levels(t[,1]) you should see a list of all the distinct values in the first column of your data frame.
So, with this knowledge, we can use a trick to extract the actual numbers:
as.numeric(levels(t[,1])[t[,1]])
Of course, if R interpreted this column of numbers as a factor when read.table was reading it, you'll also have to remove or fix the row that contains the non-numeric value; otherwise that entry becomes NA when converted.
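To make the difference between codes and levels concrete, here is a small sketch using a hypothetical factor f built from the sample values in the question. It contrasts the internal codes with the recovered numbers and also shows the as.character route from the first answer, which should give the same result.

```r
# Hypothetical factor with the same shape as the column in the question.
f <- factor(c("-0.764", "-0.662", "-0.764", "-0.019", "0.464", "0.668", "0.464"))

levels(f)       # the distinct labels, one per level
as.integer(f)   # the internal codes: positions in levels(f), not the values

# The trick above: look up each element's label by its code, then parse it.
x <- as.numeric(levels(f)[f])

# Equivalent route through character, as used in the first answer.
y <- as.numeric(as.character(f))

hist(x)
```

Calling as.numeric(f) directly returns the codes, which is why the numbers in the question looked unrelated to the data.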
Source: https://stackoverflow.com/questions/15421172/r-cannot-use-hist-because-content-not-numeric-due-to-negative-decimal-number