broom | 易学教程

Custom R function around plot_ly() with fitted(lm(y~x)) using add_lines()

Question: I want to write a custom function around plot_ly() in R. That way, I can make a series of scatterplots with the same formatting and style without duplicating code. I used this page as a guide. This code reproduces the error: library(plotly) my_plot <- function(x, y, ...) { require(broom) plot_ly(data = mtcars, y = y, x = x, showlegend = FALSE, ...) %>% add_markers(y = y) %>% add_lines(y = ~fitted(lm(y ~ x))) %>% add_ribbons(data = augment(lm(y ~ x, data = mtcars)), ymin = ~.fitted - 1.96 * .se

R - tidy augment confidence interval

Question: I am wondering how I can compute a confidence interval using the broom package. What I am trying to do is simple and standard: set.seed(1) x <- runif(50) y <- 2.5 + (3 * x) + rnorm(50, mean = 2.5, sd = 2) dat <- data.frame(x = x, y = y) mod <- lm(y ~ x, data = dat) Using visreg, I can plot regression models with CIs very simply with: library(visreg) visreg(mod, 'x', overlay=TRUE) I am interested in reproducing this using broom and ggplot2; so far I have only achieved this: library(broom) dt = lm
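
For the confidence-interval question above, one possible approach is sketched below. It is only a sketch: it uses base R's predict() with interval = "confidence" instead of broom, so it needs nothing beyond the question's own setup. The names grid, ci, and band are illustrative and do not come from the original post, and the ggplot2 plotting step is left out.

```r
# Reproduce the model from the question
set.seed(1)
x <- runif(50)
y <- 2.5 + (3 * x) + rnorm(50, mean = 2.5, sd = 2)
dat <- data.frame(x = x, y = y)
mod <- lm(y ~ x, data = dat)

# Hypothetical evenly spaced grid over the observed range of x
grid <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100))

# predict() returns a matrix with columns fit, lwr, and upr
ci <- predict(mod, newdata = grid, interval = "confidence", level = 0.95)

# Combine into one data frame that could be handed to a plotting function
band <- cbind(grid, as.data.frame(ci))
head(band)
```

The resulting band data frame holds the fitted line (fit) and the lower and upper limits of the 95% interval (lwr, upr) for each grid value of x.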

Find predictions for linear model that is grouped_by

Question: I would like to get predicted values based on a model I fit to a training set of data. I have done this before, but now I have a grouping factor and it is throwing me off. I want to predict biomass based on population for each environment. library(tidyverse) fit_mods<-df %>% group_by(environ) %>% do(model = lm(biomass ~ poly(population, 2), data = .)) Ultimately, I will want to find at which population biomass is the greatest. Usually I would do this by creating a grid and running the model
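
The per-group "grid and predict" idea described above could look roughly like the following sketch. The question's df is not shown, so mtcars stands in for it as an assumption: cyl plays the role of environ, wt of population, and mpg of biomass. The sketch uses base R (split() and lapply()) in place of the question's group_by() and do() pipeline, and the names groups, fits, and best are illustrative.

```r
# Stand-in data: mtcars replaces the question's `df`
# (cyl ~ environ, wt ~ population, mpg ~ biomass)
groups <- split(mtcars, mtcars$cyl)

# Fit one quadratic model per group
fits <- lapply(groups, function(d) lm(mpg ~ poly(wt, 2), data = d))

# For each group, predict over a grid spanning that group's own
# predictor range and keep the grid point with the largest prediction
best <- do.call(rbind, lapply(names(fits), function(g) {
  d <- groups[[g]]
  grid <- data.frame(wt = seq(min(d$wt), max(d$wt), length.out = 200))
  grid$pred <- predict(fits[[g]], newdata = grid)
  data.frame(group = g, grid[which.max(grid$pred), ])
}))
best
```

Restricting each grid to the group's observed range avoids extrapolating the polynomial beyond the data it was fitted on.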

Unnest all columns of nested tibble to a list of tibbles

Question: I am fitting a model to each group in a dataset. I am nesting the data by the grouping variable and then using map to fit a model to each group. Then I store the tidied model information as columns in a nested tibble. I'd like to save each of these columns as its own file; this example saves them as sheets in an Excel workbook. Is there a way to avoid unnesting each column individually as a new tibble? Can all columns be unnested at once to a new list of tibbles? One that can be used in other

working with lists of models using the pipe syntax

Question: I often like to fit and examine multiple models that relate two variables in an R dataframe. I can do that using syntax like this: require(tidyverse) require(broom) models <- list(hp ~ exp(cyl), hp ~ cyl) map_df(models, ~tidy(lm(data=mtcars, formula=.x))) But I'm used to the pipe syntax and was hoping to be able to do something like this: mtcars %>% map_df(models, ~tidy(lm(data=., formula=.x))) That makes it clear that I'm "starting" with mtcars and then doing stuff to it to generate my output.
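
One point that may help with the piped attempt above: the pipe passes its left-hand side as the first argument, so in mtcars %>% map_df(models, ...) the data frame becomes the object being iterated over and models lands in the function slot. A minimal sketch of one alternative, piping the list of formulas instead of the data, is shown below. It assumes the purrr, broom, and magrittr packages (plus dplyr, which map_df() needs for row binding) are installed; the name results and the .id column name "model" are illustrative.

```r
library(magrittr)  # provides %>%
library(purrr)
library(broom)

models <- list(hp ~ exp(cyl), hp ~ cyl)

# Pipe the list of formulas (not the data) into map_df();
# .id adds a column recording which list element each row came from
results <- models %>%
  map_df(~ tidy(lm(formula = .x, data = mtcars)), .id = "model")
results
```

This does not start the chain with mtcars as the question hoped, but it keeps the pipe style while iterating over the formulas.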

Correlation matrix with dplyr, tidyverse and broom - P-value matrix

Question: I want to obtain the p-values for a correlation matrix using the dplyr and/or broom packages, testing multiple variables at the same time. I'm aware of other methods, but dplyr seems easier and more intuitive to me. In addition, dplyr will need to correlate each variable to obtain the specific p-value, which makes the process easier and faster. I checked other links, but they did not work for this question (example 1, example 2, example 3). When I use this code, the correlation

bootstrapping by multiple groups in dplyr

I'm trying to bootstrap a bivariate correlation grouped by multiple variables in a tidy fashion. So far I've got: paks <- c('dplyr','tidyr','broom') lapply(paks, require, character.only=TRUE) set.seed(123) df <- data.frame( rep(c('group1','group2','group3','group4'),25), rep(c('subgroup1','subgroup2','subgroup3','subgroup4'),25), rnorm(25), rnorm(25) ) colnames(df) <- c('group','subgroup','v1','v2') cors_boot <- df %>% group_by(., group,subgroup) %>% bootstrap(., 10) %>% do(tidy(cor.test(.$v1,.$v2))) cors_boot This will successfully run 10 replications, but will not maintain the group_by

Conditional nls fitting with dplyr+broom

Question: I am using the dplyr and broom combination and trying to fit regression models depending on a condition inside the data groups. Finally, I want to extract the regression coefficients for each group. So far, I'm getting the same fitting results for all groups (the groups are distinguished by the letters a:f). That is the main problem. library(dplyr) library(minpack.lm) library(broom) direc <- rep(rep(c("North","South"),each=20),times=6) V <- rep(c(seq(2,40,length.out=20),seq(-2,-40,length.out=20))

Error when using broom (augment) and dplyr with loess fit

Question: I am trying to use augment on a loess fit, but I receive the following error: Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 32, 11 In the error message, 11 happens to equal the number of observations in one segment and 32 is the total number of observations. The code is below. require(broom) require(dplyr) # This example uses the lm method and it works regressions <- mtcars %>% group_by(cyl) %>% do(fit = lm(wt ~ mpg, .)) regressions %>% augment(fit)

rolling regression with confidence interval (tidyverse)

Question: This is related to "rolling regression by group in the tidyverse?". Consider again this simple example: library(dplyr) library(purrr) library(broom) library(zoo) library(lubridate) mydata = data_frame('group' = c('a','a', 'a','a','b', 'b', 'b', 'b'), 'y' = c(1,2,3,4,2,3,4,5), 'x' = c(2,4,6,8,6,9,12,15), 'date' = c(ymd('2016-06-01', '2016-06-02', '2016-06-03', '2016-06-04', '2016-06-03', '2016-06-04', '2016-06-05','2016-06-06'))) group y x date <chr> <dbl> <dbl> <date> 1 a 1.00 2.00 2016-06-01 2 a