Imported a csv-dataset to R but the values becomes factors

痞子三分冷 提交于 2019-11-26 09:29:05

问题


I am very new to R and I am having trouble accessing a dataset I\'ve imported. I\'m using RStudio and used the Import Dataset function when importing my csv-file and pasted the line from the console-window to the source-window. The code looks as follows:

setwd(\"c:/kalle/R\")
stuckey <- read.csv(\"C:/kalle/R/stuckey.csv\")
point <- stuckey$PTS
time <- stuckey$MP

However, the data isn\'t integer or numeric as I am used to but factors so when I try to plot the variables I only get histograms, not the usual plot. When checking the data it seems to be in order, just that I\'m unable to use it since it\'s in factor form.


回答1:


Both the data import function (here: read.csv()) as well as a global option offer you to say stringsAsFactors=FALSE which should fix this.




回答2:


By default, read.csv checks the first few rows of your data to see whether to treat each variable as numeric. If it finds non-numeric values, it assumes the variable is character data, and character variables are converted to factors.

It looks like the PTS and MP variables in your dataset contain non-numerics, which is why you're getting unexpected results. You can force these variables to numeric with

point <- as.numeric(as.character(point))
time <- as.numeric(as.character(time))

But any values that can't be converted will become missing. (The R FAQ gives a slightly different method for factor -> numeric conversion but I can never remember what it is.)




回答3:


You can set this globally for all read.csv/read.* commands with options(stringsAsFactors=F)

Then read the file as follows: my.tab <- read.table( "filename.csv", as.is=T )




回答4:


When importing csv data files the import command should reflect both the data seperation between each column (;) and the float-number seperator for your numeric values (for numerical variable = 2,5 this would be ",").

The command for importing a csv, therefore, has to be a bit more comprehensive with more commands:

    stuckey <- read.csv2("C:/kalle/R/stuckey.csv", header=TRUE, sep=";", dec=",")

This should import all variables as either integers or numeric.




回答5:


I'm new to R as well and faced the exact same problem. But then I looked at my data and noticed that it is being caused due to the fact that my csv file was using a comma separator (,) in all numeric columns (Ex: 1,233,444.56 instead of 1233444.56).

I removed the comma separator in my csv file and then reloaded into R. My data frame now recognises all columns as numbers.

I'm sure there's a way to handle this within the read.csv function itself.




回答6:


This only worked right for me when including strip.white = TRUE in the read.csv command.

(I found the solution here.)




回答7:


for me the solution was to include skip = 0 (number of rows to skip at the top of the file. Can be set >0)

mydata <- read.csv(file = "file.csv", header = TRUE, sep = ",", skip = 22)



来源:https://stackoverflow.com/questions/5187745/imported-a-csv-dataset-to-r-but-the-values-becomes-factors

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!