tibble | 易学教程

Converting data frame to tibble with word count

Question: I'm attempting to perform sentiment analysis based on http://tidytextmining.com/sentiment.html#the-sentiments-dataset. Before performing sentiment analysis I need to convert my dataset into a tidy format. My dataset has this form:

    x <- c("test1", "test2")
    y <- c("this is test text1", "this is test text2")
    res <- data.frame("url" = x, "text" = y)
    res
        url               text
    1 test1 this is test text1
    2 test2 this is test text2

In order to convert to one observation per row require to process text
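The excerpt stops before any solution is shown. As a minimal sketch of the tidy-text approach described in the linked book, assuming the dplyr and tidytext packages are installed, the text column can be split into one word per row and then counted. The object name `word_counts` is a placeholder chosen for this illustration.

```r
library(dplyr)
library(tidytext)

# Same toy data as in the question
res <- data.frame(
  url  = c("test1", "test2"),
  text = c("this is test text1", "this is test text2"),
  stringsAsFactors = FALSE
)

word_counts <- res %>%
  as_tibble() %>%
  unnest_tokens(word, text) %>%   # one lower-cased word per row
  count(url, word, sort = TRUE)   # word count per url

print(word_counts)
```

Each url contributes four distinct words here, so every count is 1.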

custom function in mutate/tibble

Question: I am following a tutorial and trying to apply this part to my own data:

    kclusts <- tibble(k = 1:9) %>%
      mutate(
        kclust = map(k, ~kmeans(points, .x)),
        tidied = map(kclust, tidy),
        glanced = map(kclust, glance),
        augmented = map(kclust, augment, points)
      )

However, my data is slightly different from the tutorial's. I am trying to apply the final line, augmented = map(kclust, augment, points). Code that works (without the final line):

    kclust <- results %>% as_tibble() %>% select(-id_row
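The asker's own data is cut off in the excerpt, so the following is only a sketch of the tutorial pattern on made-up data, assuming the dplyr, purrr and broom packages. The point it illustrates is that the extra argument passed to `augment` should be the same data that `kmeans` was fitted on. The `points` table below is hypothetical.

```r
library(dplyr)
library(purrr)
library(broom)

set.seed(1)
# Hypothetical numeric data standing in for the asker's `results`
points <- tibble(x1 = rnorm(30), x2 = rnorm(30))

kclusts <- tibble(k = 1:3) %>%
  mutate(
    kclust    = map(k, ~ kmeans(points, centers = .x)),
    # augment() attaches a .cluster column to the data used for fitting
    augmented = map(kclust, augment, points)
  )

head(kclusts$augmented[[2]])
```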

Coerce list of lists to data_frame, but maintain some elements in list columns

Question: I am looking for a way to reliably coerce a list structure to a data.frame or tibble while maintaining one or more columns as list columns. Consider the following list structure:

    d = data.frame(x = 1:10, y = 1.5*(1:10) + rnorm(10))
    ex = list(label = "A", number = 1L, model = lm(y ~ x, data = d))

This does not work as intended:

    lapply(ex, as_data_frame) %>% bind_rows()

because the lm object in the "model" column gets vectorized in the conversion. However, wrapping the model in list gets the
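The wrapping idea mentioned at the end of the excerpt can be sketched as follows, assuming the tibble package. The rule used here (wrap every element that is not a length-1 atomic vector in `list()`) is an assumption made for this illustration, not necessarily what the original thread settled on.

```r
library(tibble)

d  <- data.frame(x = 1:10, y = 1.5 * (1:10) + rnorm(10))
ex <- list(label = "A", number = 1L, model = lm(y ~ x, data = d))

# Wrap non-scalar elements so they become list columns instead of being
# flattened during conversion
wrapped <- lapply(ex, function(el) {
  if (is.atomic(el) && length(el) == 1) el else list(el)
})

row <- as_tibble(wrapped)
print(row)
```

Several such one-row tibbles can then be stacked with `dplyr::bind_rows()`.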

dplyr unquoting does not work with filter function

Maybe I am missing something, but I can't seem to make dplyr's unquoting operator work with the filter function. It works with select, but not with filter. Example:

    set.seed(1234)
    A = matrix(rnorm(100), nrow = 10, ncol = 10)
    colnames(A) <- paste("var", seq(1:10), sep = "")
    varname_test <- "var2"
    A <- as_tibble(A)
    select(A, !!varname_test) # this works as expected
    # this does NOT give me only the rows where var2 is positive
    (result1 <- filter(A, !!varname_test > 0))
    # this is how result1 should look
    (result2 <- filter(A, var2 > 0))
    # result1 is not equal to result2

I would
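A likely explanation is that `!!varname_test` injects the string "var2" itself, so the condition compares a string with 0 and no longer refers to the column. A minimal sketch of two common fixes, assuming dplyr and rlang are installed: convert the string to a symbol with `sym()` before unquoting, or index the `.data` pronoun. The names `by_sym`, `by_pronoun` and `expected` are placeholders for this illustration.

```r
library(dplyr)
library(rlang)

set.seed(1234)
A <- matrix(rnorm(100), nrow = 10, ncol = 10)
colnames(A) <- paste0("var", 1:10)
A <- as_tibble(A)
varname_test <- "var2"

# Option 1: turn the string into a symbol, then unquote it
by_sym <- filter(A, (!!sym(varname_test)) > 0)

# Option 2: look the column up by name through the .data pronoun
by_pronoun <- filter(A, .data[[varname_test]] > 0)

# Reference result with the column name written out
expected <- filter(A, var2 > 0)
```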

tbl_df is transformed as list in S4 class

When I tried to use tbl_df in S4 classes, tbl_df slots seem to be transformed into a list.

    library('tibble')
    setOldClass(c('tbl_df', 'tbl', 'data.frame'))
    setClass(Class = 'TestClass', slots = c(name = 'character'), contains = 'tbl_df')
    tmp1 <- new('TestClass', tibble(x = 1:5, y = 1, z = x ^ 2 + y), name = 'firsttest')
    tmp1@.Data
    [[1]]
    [1] 1 2 3 4 5

    [[2]]
    [1] 1 1 1 1 1

    [[3]]
    [1]  2  5 10 17 26

Can I access tmp1@.Data just like a tbl_df object? For example:

    tmp1@.Data
    # A tibble: 5 x 3
          x     y     z
    * <int> <dbl> <dbl>
    1     1     1     2
    2     2     1     5
    3     3     1    10
    4     4     1    17
    5     5     1    26

S3 objects, just for simplification, are lists
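The rest of the answer is cut off in the excerpt. One possible workaround, which is an assumption of this note and not taken from the original thread, is to store the tibble in a named slot instead of inheriting from tbl_df, so that the slot keeps its class and prints as a tibble. The class name `TestClass2` and the slot name `data` are hypothetical.

```r
library(tibble)

# Register the S3 class chain so it can be used as a slot type
setOldClass(c("tbl_df", "tbl", "data.frame"))

# Hypothetical alternative design: hold the tibble in a slot
setClass("TestClass2", slots = c(name = "character", data = "tbl_df"))

tmp2 <- new(
  "TestClass2",
  name = "firsttest",
  data = tibble(x = 1:5, y = 1, z = x^2 + y)
)

# The slot is still a tibble, so it prints and behaves like one
print(tmp2@data)
```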

Get data frame into right format from web-scraping work [closed]

Question (closed as needing more focus): I have code that I use to repeatedly web scrape past air atmosphere data by wrapping the httr calls in a function. The original code works well in a loop. You can find the original code here: https://stackoverflow.com/a/52545775/7356308. I modified it a bit to web-scrap

add_column in tibble with variable column name

Question: This code doesn't work to add a column to a tibble:

    library(tidyverse)
    df <- data.frame("Oranges" = 5)
    mycols <- c("Apples", "Bananas", "Oranges")
    add_column(df, mycols[[2]] = 7)

I get the error message:

    Error: unexpected '=' in "add_column(df, mycols[[2]] ="

But this code works:

    add_column(df, "Bananas" = 7)

Why? I don't know the values of 'mycols' ahead of time. That's why I wrote my code for it to be a variable. Is this not possible in dplyr?

Answer 1: Well, add_column seems to come from tibble rather than dplyr, but it does use the new tidy eval syntax. You can use

    add_column(df, !!(mycols[2]) := 7)
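The answer's suggestion can be tried end to end as below, assuming only the tibble package is loaded. The object name `out` is a placeholder for this illustration.

```r
library(tibble)

df <- data.frame(Oranges = 5)
mycols <- c("Apples", "Bananas", "Oranges")

# `:=` allows the left-hand side to be computed, and `!!` unquotes the
# string so that it becomes the new column's name
out <- add_column(df, !!(mycols[2]) := 7)

print(names(out))
```

The plain `=` form fails at parse time because R only accepts a literal name or string to the left of `=` in a function call.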

Replace NA with grouped means in R? [duplicate]

Question (marked as a duplicate of "How to replace NA with mean by group / subset?"): I am stuck trying to replace NAs with means and would be very grateful for help. I want to replace NAs in multiple columns of a data frame with the mean of a group within the column. In the example below I would want to replace the NA in x1 with 14.5, since 13 and 16 are in month 1. The NA in x2 should be replaced with 4.5. This is the way I tried it: library
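The asker's data and attempt are cut off, so the following is only a sketch on made-up data chosen to match the numbers in the description (13 and 16 in month 1 for x1, and a month-1 mean of 4.5 for x2). It assumes dplyr 1.0 or later for `across()`. The names `df` and `filled` are placeholders.

```r
library(dplyr)

# Hypothetical data consistent with the description in the question
df <- tibble(
  month = c(1, 1, 1, 2, 2),
  x1    = c(13, NA, 16, 20, 22),
  x2    = c(4, 5, NA, 8, 9)
)

filled <- df %>%
  group_by(month) %>%
  # Within each month, replace NA by that month's mean of the column
  mutate(across(c(x1, x2), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()

print(filled)
```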