Question
readr::read_csv adds attributes that don't get updated when the data is edited. For example,
library('tidyverse')
df <- read_csv("A,B,C\na,1,x\nb,1,y\nc,1,z")
# Remove columns with only one distinct entry
no_info <- df %>% sapply(n_distinct)
no_info <- names(no_info[no_info==1])
df2 <- df %>%
  select(-no_info)
Inspecting the structure, we see that column B is still present in the attributes of df2:
> str(df)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 3 obs. of 3 variables:
$ A: chr "a" "b" "c"
$ B: num 1 1 1
$ C: chr "x" "y" "z"
- attr(*, "spec")=
.. cols(
.. A = col_character(),
.. B = col_double(),
.. C = col_character()
.. )
> str(df2)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 3 obs. of 2 variables:
$ A: chr "a" "b" "c"
$ C: chr "x" "y" "z"
- attr(*, "spec")=
.. cols(
.. A = col_character(),
.. B = col_double(),
.. C = col_character()
.. )
> attributes(df2)
$class
[1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
$row.names
[1] 1 2 3
$spec
cols(
A = col_character(),
B = col_double(),
C = col_character()
)
$names
[1] "A" "C"
>
How can I remove columns (or make any other change to the data) and have the change accurately reflected in the new data structure and its attributes?
Answer 1:
You can remove the column specification by setting the spec attribute to NULL:
> attr(df, 'spec') <- NULL
> str(df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 3 obs. of 3 variables:
$ A: chr "a" "b" "c"
$ B: int 1 1 1
$ C: chr "x" "y" "z"
> df
# A tibble: 3 x 3
A B C
<chr> <int> <chr>
1 a 1 x
2 b 1 y
3 c 1 z
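The same idea can be applied to the filtered data frame from the question. The following is only a sketch: strip_spec is a hypothetical helper name, not a readr function, and it assumes that the "spec" attribute and the "spec_tbl_df" class are the only readr-specific leftovers you want to drop. Base subsetting is used here in place of select() so that the example depends on readr alone.

```r
library(readr)

# Same toy data as in the question; I() marks the string as literal data
df <- suppressMessages(read_csv(I("A,B,C\na,1,x\nb,1,y\nc,1,z")))

# Hypothetical helper (not part of readr): drop the stale column
# specification and the marker class that goes with it
strip_spec <- function(x) {
  attr(x, "spec") <- NULL
  class(x) <- setdiff(class(x), "spec_tbl_df")
  x
}

# Names of columns with only one distinct value
no_info <- names(df)[vapply(df, function(col) length(unique(col)) == 1, logical(1))]

# Keep the remaining columns, then strip the readr attributes
df2 <- strip_spec(df[setdiff(names(df), no_info)])

str(df2)
```

After this, attributes(df2) should list only the names, row names, and the plain tibble classes, so the attributes no longer mention the removed column B.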
Source: https://stackoverflow.com/questions/54000753/remove-attributes-from-data-read-in-readrread-csv