broom | 易学教程

keep region names when tidying a map using broom package

Question: I am using the getData function from the raster package to retrieve the map of Argentina. I would like to plot the resulting map using ggplot2, so I am converting it to a data frame using the tidy function from the broom package. This works fine, but I can't figure out how to preserve the names of the federal districts so that I can use them on the map. Here is my original code that does not preserve the district names: # Original code: ################################## # get the map data from

Plotting predicted survival curves for continuous covariates in ggplot

Question: How can I plot survival curves for representative values of a continuous covariate in a Cox proportional hazards model? Specifically, I would like to do this in ggplot using a "survfit.cox" "survfit" object. This may seem like a question that has already been answered, but I have searched through everything on SO with the terms 'survfit' and 'newdata' (plus many other search terms). This is the thread that comes closest to answering my question so far: Plot Kaplan-Meier for Cox regression In

R - dplyr bootstrap issue

Question: I have trouble understanding how to use the dplyr bootstrap function properly. What I want is to generate a bootstrap distribution from two randomly assigned groups and compute the difference in means, like this, for example: library(dplyr) library(broom) data(mtcars) mtcars %>% mutate(treat = sample(c(0, 1), 32, replace = T)) %>% group_by(treat) %>% summarise(m = mean(disp)) %>% summarise(m = m[treat == 1] - m[treat == 0]) The issue is that I need to repeat this operation 100, 1000, or

bootstrapping by multiple groups in dplyr

Question: I'm trying to bootstrap a bivariate correlation grouped by multiple variables in a tidy fashion. So far I've got: paks <- c('dplyr','tidyr','broom') lapply(paks, require, character.only=TRUE) set.seed(123) df <- data.frame( rep(c('group1','group2','group3','group4'),25), rep(c('subgroup1','subgroup2','subgroup3','subgroup4'),25), rnorm(25), rnorm(25) ) colnames(df) <- c('group','subgroup','v1','v2') cors_boot <- df %>% group_by(., group,subgroup) %>% bootstrap(., 10) %>% do(tidy(cor.test(.$v1

Comparing models with dplyr and broom::glance: How to continue if error is produced?

Question: I would like to run each variable in a dataset as a univariate glmer model using the lme4 package in R. I would like to prepare the data with the dplyr/tidyr packages, and organize the results from each model with the broom package (i.e. do(glance(glmer...). I would most appreciate help that stays within that framework. I'm not that great in R, but was able to produce a dataset that throws an error and has the same structure as the data I'm using: library(lme4) library(dplyr) library(tidyr)

R: How to apply a function that outputs a dataframe for multiple columns (using dplyr)?

Question: I want to find correlations, p-values and 95% CIs between one specific column and all other columns in a dataframe. The 'broom' package provides an example of how to do that between two columns using cor.test with dplyr and pipes. For mtcars and, say, the mpg column, we can run a correlation with another column: library(dplyr) library(broom) mtcars %>% do(tidy(cor.test(.$mpg, .$cyl))) estimate statistic p.value parameter conf.low conf.high 1 -0.852162 -8.919699 6.112687e-10 30 -0.9257694 -0.7163171

How to convert fitdistrplus::fitdist summary into tidy format?

Question: I have the following code: x <- c( 0.367141764080875, 0.250037975705769, 0.167204185003365, 0.299794433447383, 0.366885973041269, 0.300453205296379, 0.333686861081341, 0.33301168850398, 0.400142004893329, 0.399433677388411, 0.366077304765104, 0.166402979455671, 0.466624230750293, 0.433499934139897, 0.300017278751768, 0.333673696762895, 0.29973685692478 ) fn <- fitdistrplus::fitdist(x,"norm") summary(fn) #> Fitting of the distribution ' norm ' by maximum likelihood #> Parameters : #> estimate

Comparison between dplyr::do / purrr::map, what advantages? [closed]

Question: When using broom, I used to combine dplyr::group_by and dplyr::do to perform actions on grouped data, thanks to @drob. For example, fitting a linear model to cars depending on their gear system: library("dplyr") library("tidyr") library("broom") # using do() mtcars %>%

Fit model using each predictor column individually, store results in dataframe

Question: I have a dataframe with one column of a response variable, and several columns of predictor variables. I want to fit models for the response variable using each of the predictor variables separately, finally creating a dataframe that contains the coefficients of the model. Previously, I would have done this: data(iris) iris_vars <- c("Sepal.Width", "Petal.Length", "Petal.Width") fits.iris <- lapply(iris_vars, function(x) {lm(substitute(Sepal.Length ~ i, list(i = as.name(x))), data = iris)}) #