Remove attributes from data read in readr::read_csv

Posted by 南楼画角 on 2021-01-27 15:42:07

Question


readr::read_csv adds attributes that don't get updated when the data is edited. For example,

library('tidyverse')
df <- read_csv("A,B,C\na,1,x\nb,1,y\nc,1,z")

# Remove columns with only one distinct entry
no_info <- df %>% sapply(n_distinct)
no_info <- names(no_info[no_info==1]) 

# all_of() makes it explicit that no_info is an external character vector
df2 <- df %>%
  select(-all_of(no_info))

Inspecting the structure, we see that column B is still present in the attributes of df2:

> str(df)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    3 obs. of  3 variables:
 $ A: chr  "a" "b" "c"
 $ B: num  1 1 1
 $ C: chr  "x" "y" "z"
 - attr(*, "spec")=
  .. cols(
  ..   A = col_character(),
  ..   B = col_double(),
  ..   C = col_character()
  .. )
> str(df2)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    3 obs. of  2 variables:
 $ A: chr  "a" "b" "c"
 $ C: chr  "x" "y" "z"
 - attr(*, "spec")=
  .. cols(
  ..   A = col_character(),
  ..   B = col_double(),
  ..   C = col_character()
  .. )
> attributes(df2)
$class
[1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame" 

$row.names
[1] 1 2 3

$spec
cols(
  A = col_character(),
  B = col_double(),
  C = col_character()
)

$names
[1] "A" "C"


How can I remove columns (or make any other change to the data) and have the change accurately reflected in the structure and attributes of the new object?


Answer 1:


You can remove the column specification by setting the spec attribute to NULL:

> attr(df, 'spec') <- NULL
> str(df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   3 obs. of  3 variables:
 $ A: chr  "a" "b" "c"
 $ B: int  1 1 1
 $ C: chr  "x" "y" "z"
> df
# A tibble: 3 x 3
  A         B C    
  <chr> <int> <chr>
1 a         1 x    
2 b         1 y    
3 c         1 z    
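As a minimal sketch of the same idea, the snippet below compares two ways of dropping the stale specification. It assumes a readr version that returns the spec_tbl_df subclass seen in the question's output; the names df_a and df_b are only illustrative. Setting the attribute to NULL removes spec but leaves the class vector as it was, whereas empty subsetting with df[] is expected to drop both the spec attribute and the spec_tbl_df class.

```r
library(readr)

# Same literal data as in the question.
df <- read_csv("A,B,C\na,1,x\nb,1,y\nc,1,z")

# Option 1: remove only the "spec" attribute, as in the answer above.
# The class vector is not modified by this assignment.
df_a <- df
attr(df_a, "spec") <- NULL

# Option 2: empty subsetting. Assumption: the installed readr provides the
# spec_tbl_df subclass, whose `[` method drops both the attribute and the class.
df_b <- df[]

# Inspect what is left on each object.
names(attributes(df_a))
names(attributes(df_b))
class(df_b)
```

Either option can be applied right after reading the file, so that later steps such as select() start from a plain tibble with no specification to go out of date.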


Source: https://stackoverflow.com/questions/54000753/remove-attributes-from-data-read-in-readrread-csv
