Prevent variable name getting mangled by read.csv/read.table?


My data set testdata has two variables, named PWGTP and AGEP.

The data are in a .csv file.

When I do:

2 Answers

走了就别回头了

2021-01-25 14:19
This is a UTF-8 BOM (byte order mark) issue: the file starts with a BOM, and R reads those bytes as part of the first column name.

To prevent this, there are two options:
1. Save the file as UTF-8 without a BOM / signature, or
2. Pass fileEncoding = "UTF-8-BOM" to read.table or read.csv.

Example:

```
mydata <- read.table(file = "myfile.txt", fileEncoding = "UTF-8-BOM")
```
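
To try the second option without touching a real file, here is a minimal sketch that writes a small CSV with a UTF-8 BOM to a temporary file and reads it back. The data values and the names `tmp`, `bom`, and `dat` are made up for illustration; only `PWGTP` and `AGEP` come from the question.

```r
# Build a small CSV that starts with the UTF-8 BOM bytes (EF BB BF).
# The data values are invented for illustration.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("PWGTP,AGEP\n10,25\n20,31\n")), tmp)

# With fileEncoding = "UTF-8-BOM", the BOM is dropped before the header is parsed.
dat <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
names(dat)

# Clean up the temporary file.
unlink(tmp)
```

Without the `fileEncoding` argument, the exact form of the mangled first name depends on the platform and locale, so it is not shown here.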
独厮守ぢ

2021-01-25 14:19
It is possible that a column name in the file is something like `1 PWGTP`, i.e. with a space (or some other character) between a number and the name. With the default settings, R replaces characters that are not valid in a name with `.` when reading the file, which can produce the `..` you see. One way to prevent this is to use check.names = FALSE in read.csv/read.table:
```
d1 <- read.csv("yourfile.csv", header=TRUE, stringsAsFactors=FALSE, check.names=FALSE)
```
However, it is better to avoid column names that start with a number or contain spaces.
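
As a rough illustration of what check.names does, the sketch below reads the same inline CSV text twice. The header `1 PWGTP` is the hypothetical name from this answer, and the data values and the variable names `csv_text`, `checked`, and `unchecked` are made up.

```r
# Inline CSV with a header that is not a syntactically valid R name.
# The values are invented for illustration.
csv_text <- "1 PWGTP,AGEP\n10,25\n20,31"

# Default (check.names = TRUE): names are passed through make.names().
checked <- read.csv(text = csv_text)

# check.names = FALSE: names are kept exactly as they appear in the file.
unchecked <- read.csv(text = csv_text, check.names = FALSE)

names(checked)    # "X1.PWGTP" "AGEP"
names(unchecked)  # "1 PWGTP"  "AGEP"

# Non-syntactic names need [[ ]] or backticks to access.
unchecked[["1 PWGTP"]]
```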

If the OP has already read the data with the default options, i.e. with check.names = TRUE, sub can be used to clean up the column names afterwards:
```
names(d1) <- sub(".*\\.+", "", names(d1))
```
For example:
```
sub(".*\\.+", "", "ï..PWGTP")
#[1] "PWGTP"
```