R cannot use hist() because “content not numeric” due to negative decimal numbers?

不羁的心 submitted on 2021-02-04 19:49:26

Question


I am new to R and I am trying to draw a histogram using hist() of a list of 100,000 numbers like this:

-0.764
-0.662
-0.764
-0.019
0.464
0.668
0.464

but I cannot do it because R complains that the content is not numeric. This is what I've tried:

  • I read the file using t <- read.table(file = "file.txt", sep = "\n", dec = ".", header = TRUE); the data loads and looks fine (I get the same values)

  • I tried to make it numeric using as.numeric(c(t[,1])), sapply(t, as.numeric), but I get completely different numbers, like

    53 428 791 428 582 428 979 428 456 533 550

I think there might be a problem with the decimal point ".", the negative sign "-", or both. Any ideas?

Many thanks!


Answer 1:


R seems to have read the first column of your data as a factor. This should not happen if all the values in this column were numeric in your file, so there must be at least one element that is not recognized as a number.

You can try the following (somewhat crude) check in R to identify where the problem is. Start with the following factor:

R> v <- factor(c("0.51", "-0.12", "0.345", "0.45b", "-0.8"))

You can identify which value causes the problem with:

R> v[is.na(as.numeric(as.character(v)))]
[1] 0.45b

And you can find the position of this value in your vector with:

R> which(is.na(as.numeric(as.character(v))))
[1] 4
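
The same check can be applied to a column read with read.table. The following is only a sketch: the sample text stands in for file.txt, the bad entry "0.45b" is invented for illustration, and the names raw, dat, bad_rows and bad_values are hypothetical. Note that since R 4.0.0 read.table no longer converts strings to factors by default, so stringsAsFactors = TRUE is set here to reproduce the behaviour described in the question.

```r
# Hypothetical sample standing in for file.txt; the "0.45b" entry is invented.
raw <- "value\n-0.764\n-0.662\n0.45b\n0.464"
dat <- read.table(text = raw, header = TRUE, stringsAsFactors = TRUE)

# Convert via character so the printed values are parsed, not the level codes.
# Unparseable entries become NA; the coercion warning is expected here.
x <- suppressWarnings(as.numeric(as.character(dat[, 1])))

bad_rows   <- which(is.na(x))                  # row positions that failed to parse
bad_values <- as.character(dat[bad_rows, 1])   # the offending strings themselves
```

Printing bad_rows and bad_values shows which lines of the input need to be fixed before the column can be treated as numeric.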



Answer 2:


If you want to convert a factor to a numeric type, you have to understand how factors work.

Internally, each distinct value (each "level") in a column of class factor is stored as an integer code. These codes are what you see when you run as.numeric. They are just indexes into the levels of the factor, so if you type levels(t[,1]) you should see all the distinct values in the first column of your data frame, as character strings.
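
A minimal sketch of this behaviour, using a small made-up factor f (the values are borrowed from the question, the name is hypothetical). The order of the levels, and therefore the exact codes, depends on how your locale sorts strings, so no specific output is shown.

```r
# A small factor built from strings that look like numbers.
f <- factor(c("-0.764", "0.464", "-0.764", "0.668"))

lv    <- levels(f)      # the distinct values, stored once as character strings
codes <- as.integer(f)  # one integer code per element: its position in lv

# as.numeric() on the factor returns these same codes, not the original numbers.
codes_as_numeric <- as.numeric(f)

# Looking the codes up in the levels recovers the original strings.
recovered <- lv[codes]
```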

So, with this knowledge, we can use a trick to extract the actual numbers:

as.numeric(levels(t[,1])[t[,1]])

Of course, if R read this column as a factor, at least one entry is not a valid number. The trick above turns such entries into NA (with a warning), so you should fix or remove the offending rows to get clean numeric data.
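
Putting the pieces together, one possible sequence is sketched below. It assumes a made-up factor f with a single invented bad entry ("0.45b"); the names x, x_clean and h are hypothetical, and dropping the unparseable entries is just one option (correcting them in the file is another).

```r
# Made-up data: four valid numbers and one invented bad entry.
f <- factor(c("-0.764", "-0.662", "0.45b", "0.464", "0.668"))

# The trick from above: index the levels by the factor's integer codes,
# then parse the resulting strings. "0.45b" becomes NA.
x <- suppressWarnings(as.numeric(levels(f)[f]))

# Keep only the entries that parsed successfully.
x_clean <- x[!is.na(x)]

# Compute the histogram bins without drawing anything.
h <- hist(x_clean, plot = FALSE)
```

Calling hist(x_clean) without plot = FALSE draws the histogram on the active graphics device.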



Source: https://stackoverflow.com/questions/15421172/r-cannot-use-hist-because-content-not-numeric-due-to-negative-decimal-number
