I am trying to read a csv file with R. I can read the file but I have levels when I call a variable. What are these levels and how can I remove them? The file can be downloa
The presence of levels for your variable HomeTeam indicates that it is a factor (with 20 levels). You can pass the stringsAsFactors=FALSE argument to the read.csv function to keep such columns as plain character strings.
When you use read.csv to read a file, the stringsAsFactors argument defaults to TRUE in R versions before 4.0.0 (from 4.0.0 onward the default is FALSE). You just need to set it to FALSE to avoid this result. This should work:
data = read.csv("Documents/bet/I1.csv", sep=",", stringsAsFactors=FALSE)
Under the default, columns (variables) in the file that contain strings are assumed to be factors. A factor is a categorical variable that can take only one of a fixed, finite set of possibilities. Those possible categories are the levels. You can read about factors in the R Intro manual here, and this is another tutorial.
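To make the difference concrete, here is a minimal sketch that reads the same data both ways. The three-row sample and the FTHG column are hypothetical stand-ins for the real file, and the names as_factor, as_string and recovered are only for illustration.

```r
# Hypothetical three-row sample standing in for the real CSV file
csv_text <- "HomeTeam,FTHG\nJuventus,2\nRoma,1\nJuventus,0"

# Read the same data twice, once with each setting
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_string <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$HomeTeam)   # "factor"
levels(as_factor$HomeTeam)  # "Juventus" "Roma" (the distinct values, sorted)
class(as_string$HomeTeam)   # "character"

# A factor column that is already loaded can be converted back to strings
recovered <- as.character(as_factor$HomeTeam)
```

The last line is useful when the data frame has already been read and re-reading the file is inconvenient.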
In addition, since you are using read.csv, adding the sep="," argument is redundant. It doesn't harm anything, but the comma is taken as the separator by default.
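A quick way to check this, sketched on a hypothetical two-row sample rather than the real file, is to compare the results of both calls:

```r
# Hypothetical sample data; any comma-separated text would do
csv_text <- "HomeTeam,AwayTeam\nJuventus,Roma\nMilan,Napoli"

with_sep    <- read.csv(text = csv_text, sep = ",", stringsAsFactors = FALSE)
without_sep <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# TRUE when both calls produce the same data frame
same <- identical(with_sep, without_sep)
```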