Why are write.csv and read.csv not consistent? [closed]

Posted by 陌路散爱 on 2019-12-10 02:25:13

Question


The problem is simple. Consider the following example:

m <- head(iris)
write.csv(m, file = 'm.csv')
m1 <- read.csv('m.csv')

The result is that m1 differs from the original object m: it has a new first column named "X". If I want the two to be equal, I have to use additional arguments, as in these two examples:

write.csv(m, file = 'm.csv', row.names = FALSE)
# and then
m1 <- read.csv('m.csv')

or

write.csv(m, file = 'm.csv')
m1 <- read.csv('m.csv', row.names = 1)

The question is: what is the reason for this difference? In particular, if write.csv and read.csv are supposedly intended to follow the Excel convention, why don't they import the same object that was exported in the first place? To me this is very counterintuitive and highly undesirable behavior.

(The result is exactly the same if I use the csv2 variants of these functions.)

Thanks in advance!


Here are the data frames m and m1, in case you prefer not to run the example in R:

> m
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

> m1
  X Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 1          5.1         3.5          1.4         0.2  setosa
2 2          4.9         3.0          1.4         0.2  setosa
3 3          4.7         3.2          1.3         0.2  setosa
4 4          4.6         3.1          1.5         0.2  setosa
5 5          5.0         3.6          1.4         0.2  setosa
6 6          5.4         3.9          1.7         0.4  setosa

Answer 1:


Here's my guess...

write.table writes a data.frame to a file and data.frames always have row names, so not writing row names by default would be throwing away information. (Yes, write.table will also write a matrix and matrices don't have to have row names, but data.frames are probably used much more often than matrices.)
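A small sketch of the row-name point, assuming nothing beyond base R. The objects df and mat are made up for illustration: a data.frame created without row names still reports them, while a plain matrix reports none.

```r
# Illustrative objects; the names df and mat are not from the original post.
df  <- data.frame(x = 1:3)
mat <- matrix(1:6, nrow = 3)

# A data.frame always has row names, even if none were supplied.
print(rownames(df))

# A matrix created without dimnames has no row names at all.
print(rownames(mat))
```

So when a data.frame is written out, there is always something that could go in a row-name column, whereas for a matrix there may be nothing.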

read.table returns a data.frame but CSV files don't have any concept of row names, so someone may argue that it's counter-intuitive to assume, by default, that the first column of a CSV is a row name.
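To see where the "X" column comes from, it may help to look at the header line that write.csv produces. This is only a sketch: it writes to a temporary file (the variable path is illustrative) rather than to m.csv.

```r
m <- head(iris)
path <- tempfile(fileext = ".csv")  # illustrative temp file instead of 'm.csv'
write.csv(m, file = path)

# The first header field is an empty string: the row-name column has no name.
header <- readLines(path, n = 1)
print(header)

# read.csv treats it as an ordinary unnamed column and names it "X".
m1 <- read.csv(path)
print(names(m1)[1])

# Saying which column holds the row names removes the extra column.
m2 <- read.csv(path, row.names = 1)
print(names(m2))
```

Nothing in the file itself marks the first column as row names, so read.csv has no way to tell it apart from data unless row.names is given.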

Now there may be a way to make these two functions consistent, but I would argue that writing to a text file isn't the best way to output/input data from one R session to another. It's much safer/faster to use save, load, saveRDS, readRDS, etc.
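As a rough illustration of that suggestion, here is a round trip through saveRDS and readRDS, next to the CSV round trip for comparison. The temporary file paths are illustrative, and the class that the CSV version reports for Species depends on the R version (the stringsAsFactors default changed in R 4.0.0), so no output is claimed for it here.

```r
m <- head(iris)

# Round trip through R's native serialization format.
rds_path <- tempfile(fileext = ".rds")  # illustrative temp file
saveRDS(m, file = rds_path)
m_rds <- readRDS(rds_path)
print(all.equal(m, m_rds))

# Round trip through CSV without row names, for comparison.
csv_path <- tempfile(fileext = ".csv")  # illustrative temp file
write.csv(m, file = csv_path, row.names = FALSE)
m_csv <- read.csv(csv_path)

# Column types are not stored in a CSV, so they are re-guessed on import.
print(class(m$Species))
print(class(m_csv$Species))
```

The RDS file keeps the object's attributes (row names, factor levels, column classes), while the CSV keeps only the text of the values.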



Source: https://stackoverflow.com/questions/12512062/why-write-csv-and-read-csv-are-not-consistent
