Override column types when importing data using readr::read_csv() when there are many columns


I am trying to read a csv file using readr::read_csv in R. The csv file that I am importing has about 150 columns; I am including only the first few columns for the example. I a


2 Answers

时光取名叫无心

2021-01-30 11:13
Here is a more generic answer, in case someone stumbles upon this question in the future. Using "skip" to jump over columns is less advisable, because it stops working if the structure of the imported data source changes.

It could be easier in your example to simply set a default column type, and then define any columns that differ from the default.

E.g., if all columns typically are "d", but the date column should be "D", load the data as follows:
```
  read_csv(df, col_types = cols(.default = "d", date = "D"))
```
or if, e.g., column date should be "D" and column "xxx" be "i", do so as follows:
```
  read_csv(df, col_types = cols(.default = "d", date = "D", xxx = "i"))
```
The use of "default" above is powerful if you have multiple columns and only specific exceptions (such as "date" and "xxx").
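To see the default-plus-exceptions pattern end to end, here is a minimal sketch that runs on literal data instead of a file. The column names (`date`, `x1`, `x2`, `xxx`) and the values are made up for illustration; the long-form `col_*()` functions are used, which are equivalent to the "d", "D" and "i" abbreviations above.

```r
library(readr)

# Hypothetical literal CSV data; a string containing a newline is read as data,
# not as a file path.
csv_text <- "date,x1,x2,xxx\n2021-01-30,1.5,2.5,3\n2021-01-31,4.5,5.5,6\n"

dat <- read_csv(
  csv_text,
  col_types = cols(
    .default = col_double(),  # every column not named below is parsed as double
    date     = col_date(),    # exception 1: ISO 8601 dates
    xxx      = col_integer()  # exception 2: integers
  )
)

# Show the class of each column to confirm the overrides were applied.
sapply(dat, function(col) class(col)[1])
```

With 150 columns, only the exceptions need to be spelled out; everything else falls under `.default`.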

耶瑟儿～

2021-01-30 11:28

Yes. For example, to force numeric data to be treated as characters:

```
examplecsv = "a,b,c\n1,2,a\n3,4,d"
read_csv(examplecsv)
# A tibble: 2 x 3
#      a     b     c
#  <int> <int> <chr>
#1     1     2     a
#2     3     4     d
read_csv(examplecsv, col_types = cols(b = col_character()))
# A tibble: 2 x 3
#      a     b     c
#  <int> <chr> <chr>
#1     1     2     a
#2     3     4     d
```

Choices are:

```
col_character()
col_date()
col_time()
col_datetime()
col_double()
col_factor() # to enforce, will never be guessed
col_integer()
col_logical()
col_number()
col_skip() # to force skip column
```
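Each of these functions also has a one-character abbreviation that `cols()` accepts, such as "c" for `col_character()`, "i" for `col_integer()` and "_" for `col_skip()`. The sketch below reuses the small example data from above to show that the two spellings give the same result; the variable names `long_form` and `short_form` are only illustrative.

```r
library(readr)

csv_text <- "a,b,c\n1,2,a\n3,4,d"

# Long form: explicit col_*() functions.
long_form <- read_csv(
  csv_text,
  col_types = cols(a = col_integer(), b = col_character(), c = col_skip())
)

# Short form: the same specification written with abbreviations.
short_form <- read_csv(
  csv_text,
  col_types = cols(a = "i", b = "c", c = "_")
)

# Column c is skipped in both, so only a and b remain.
names(short_form)
identical(long_form$a, short_form$a)
identical(long_form$b, short_form$b)
```

The abbreviations are shorter to type when many columns need overriding, while the long form is easier to read and allows extra arguments such as a date format.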

More: http://readr.tidyverse.org/articles/readr.html
