Question
I have a space-delimited file in which some columns are blank, so some rows contain consecutive spaces. fread fails with an error on such a file, while read.table reads it fine. See the example:
library(data.table)
# R version 3.4.2 (2017-09-28)
# data.table_1.10.4-3
fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
Error in fread("A B C D\n1 2  3\n4 5 6 7") : Expected sep (' ') but new line, EOF (or other non printing character) ends field 2 when detecting types from point 0: 1 2  3
read.table(text = "A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
#   A B  C D
# 1 1 2 NA 3
# 2 4 5  6 7
How can I read this with fread? I tried setting sep = " " and na.strings = "", but neither helped.
Answer 1:
In fread, strip.white is set to TRUE by default, meaning leading and trailing spaces are removed from each field. That is useful for reading fixed-width files or files that use an irregular number of spaces as the separator. In read.table, by contrast, strip.white defaults to FALSE.
fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE, strip.white = FALSE)
#    A B  C D
# 1: 1 2 NA 3
# 2: 4 5  6 7
Note: I am providing a self-answer because I couldn't find a relevant post, and this has tripped me up more than once.
Edit: This no longer works with data.table_1.12.2; see the related GitHub issue.
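Since strip.white = FALSE reportedly no longer helps on newer data.table versions, one possible fallback is to let read.table do the parsing (the question shows it handles this input) and then convert the result to a data.table. This is only a sketch of a workaround, not a fix for fread itself; the variable names txt, df and dt are made up for illustration.

```r
library(data.table)

# Same sample as in the question: the first data row has an empty
# third field, so there are two consecutive spaces between 2 and 3.
txt <- "A B C D\n1 2  3\n4 5 6 7"

# read.table treats every single space as a separator, so the empty
# field is read as NA.
df <- read.table(text = txt, sep = " ", header = TRUE)

# Convert the data.frame to a data.table by reference (no copy).
dt <- setDT(df)
print(dt)
```

For a file on disk, the same call should work with file = "path/to/file" in place of text = txt. read.table is generally slower than fread on large inputs, so this trades speed for the whitespace handling.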
Source: https://stackoverflow.com/questions/48215177/how-to-read-when-delimiter-is-space-and-missing-values-are-blank