I am trying to read a .tsv (tab-separated values) file into R using a specific encoding. It's supposedly windows-1252, and it has a header.
Any suggestions for the code to put it into a data frame?
Something like this perhaps?
mydf <- read.table('thefile.txt', header=TRUE, sep="\t", fileEncoding="windows-1252")
str(mydf)
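To try this without the original file, here is a minimal sketch that writes a tiny windows-1252 file to a temporary path and reads it back. The sample contents and column names are made up for illustration, and the bytes are written raw so the file on disk does not depend on your locale. How the non-ASCII characters display afterwards does depend on your session's encoding.

```r
# Hypothetical sample data: a two-column TSV with one data row.
tmp <- tempfile(fileext = ".tsv")

# In windows-1252, byte 0xEB is "e with diaeresis" and 0xFC is "u with diaeresis".
bytes <- c(charToRaw("name\tcity\nZo"), as.raw(0xEB),
           charToRaw("\tM"), as.raw(0xFC), charToRaw("nchen\n"))
writeBin(bytes, tmp)

# Tell R how the bytes on disk are encoded; they are converted on read.
mydf <- read.table(tmp, header = TRUE, sep = "\t",
                   fileEncoding = "windows-1252")
str(mydf)
```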
You can also use:
read.delim('thefile.txt', header = TRUE, fileEncoding = "windows-1252")
Simply entering the function name at your R console:
> read.delim
function (file, header = TRUE, sep = "\t", quote = "\"", dec = ".",
fill = TRUE, comment.char = "", ...)
read.table(file = file, header = header, sep = sep, quote = quote,
dec = dec, fill = fill, comment.char = comment.char, ...)
reveals that read.delim is a wrapper around read.table that already specifies tabs as your data's separator. read.delim might be more convenient if you're working with a lot of TSV files.
The difference between the two commands is discussed in more detail in this Stack Overflow question.
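As a quick check of that relationship, the sketch below reads the same temporary file both ways, passing read.table exactly the arguments that read.delim sets. The file contents are invented for the example. Note that a bare read.table call with only sep = "\t" is not quite the same thing: read.table defaults to quote = "\"'" and comment.char = "#", so apostrophes and # characters in the data are treated differently.

```r
# Hypothetical sample file; one field contains an apostrophe on purpose.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("id\tlabel", "1\tit's", "2\tplain"), tmp)

# read.delim with its defaults ...
a <- read.delim(tmp, header = TRUE)

# ... versus read.table with the same arguments spelled out.
b <- read.table(tmp, header = TRUE, sep = "\t", quote = "\"",
                dec = ".", fill = TRUE, comment.char = "")

# TRUE when both calls produce the same data frame.
identical(a, b)
```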
df <- read.delim("~/file_directory/file_name.tsv", header = TRUE)
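works fine for a single .tsv file: read.delim already uses tab as its default separator, so there is no need for sep = "\t". fileEncoding = "windows-1252" can be added, but it only matters when the file contains non-ASCII characters and its encoding differs from your session's native encoding.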
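One rough way to tell whether the encoding argument matters for a given file is to look for bytes above 0x7F. The helper below, has_non_ascii, is a hypothetical name and not part of base R, and the demo file is made up. The check only says that some encoding is in play; it cannot tell you which one, so "windows-1252" remains an assumption carried over from the question.

```r
# Hypothetical helper: TRUE if the file contains any byte above 0x7F.
has_non_ascii <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  any(as.integer(bytes) > 127L)
}

# Small demonstration file with one windows-1252 byte (0xE9, "e acute").
demo <- tempfile(fileext = ".tsv")
writeBin(c(charToRaw("name\nCaf"), as.raw(0xE9), charToRaw("\n")), demo)

# Pass fileEncoding only when the file is not plain ASCII.
if (has_non_ascii(demo)) {
  df <- read.delim(demo, header = TRUE, fileEncoding = "windows-1252")
} else {
  df <- read.delim(demo, header = TRUE)
}
```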
Source: https://stackoverflow.com/questions/9764470/r-reading-a-tsv-file-using-specific-encoding