Question
Datalink: Data
Code:
ccfsisims <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/GTAP-CGE/GTAP_NewAggDatabase/NewFiles/GTAP_ConsIndex.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
ccfsirsts <- as.data.frame(ccfsisims)
ccfsirsts[7:25] <- sapply(ccfsirsts[7:25],as.numeric)
ccfsirsts <- droplevels(ccfsirsts)
ccfsirsts <- transform(ccfsirsts,sres=factor(sres,levels=unique(sres)))
ccfsirsts[1:5,]
Issue:
If you check the column "pSVIPM", the values displayed in the data frame "ccfsirsts" differ from what is actually saved in the .csv file. This problem occurred when uploading a different set of data.
In the initial upload, i.e. "ccfsisims", everything seems to check out. It is afterward that the problem occurs.
Any thoughts on why this happens?
Answer 1:
When you load ccfsisims, run str(ccfsisims) (get in the habit of doing this).
You will see that pSVIPM is a factor, so as.numeric does not parse the text: it returns each value's integer level code, i.e. its position in the factor's levels.
The column was read as a factor because your CSV contains #DIV/0! strings in it.
Try it yourself:
> length(ccfsisims$pSVIPM[ccfsisims$pSVIPM == "#DIV/0!"])
[1] 350
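The behaviour described above can be reproduced on a small scale. The following is only a sketch: the three-row CSV text and the names csv_text, bad, codes, fixed and good are made up for illustration and are not taken from the original file. It assumes the column was read as a factor (forced here with stringsAsFactors = TRUE) and shows two common ways around the problem: converting through character first, or declaring "#DIV/0!" as a missing-value string when reading.

```r
# Toy data standing in for the real file (values invented for illustration).
csv_text <- "id,pSVIPM\n1,0.25\n2,#DIV/0!\n3,10.5\n"

# Reproduce the problem: the "#DIV/0!" entry makes the column non-numeric,
# and stringsAsFactors = TRUE turns it into a factor.
bad <- read.csv(text = csv_text, stringsAsFactors = TRUE)
str(bad)

# as.numeric() on a factor returns the integer level codes, not the values.
codes <- as.numeric(bad$pSVIPM)

# Option 1: convert to character first, then to numeric.
# "#DIV/0!" cannot be parsed, so it becomes NA (with a coercion warning,
# suppressed here because it is expected).
fixed <- suppressWarnings(as.numeric(as.character(bad$pSVIPM)))

# Option 2: tell read.csv() to treat the spreadsheet error string as missing,
# so the column is read as numeric from the start.
good <- read.csv(text = csv_text, na.strings = c("NA", "#DIV/0!"))
str(good)
```

Applied to the code in the question, the second option would mean passing na.strings = c("NA", "#DIV/0!") in the read.csv call, after which the as.numeric step on those columns should no longer be needed.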
Source: https://stackoverflow.com/questions/14880753/loading-data-issues