plyr | 易学教程

Convert a list of numeric vectors with different lengths to data.frame

Question: I have a df: dput(head(data)) structure(list(company_code = c(1L, 1L, 1L, 1L, 1L, 11L, 11L, 11L, 12L, 13L, 13L), company_name = c("AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Iggesunds B", "AB Iggesunds B", "AB Iggesunds B", "AB Industripapp", "AB Klippans FinpB", "AB Klippans FinpB" ), year_cg_code = c(11920L, 11920L, 11920L, 11920L, 11920L, 111929L, 111929L, 111929L, 121929L, 131929L, 131929L), plant

How to subtract a median only from integer value

Question: I have this dataset: df=structure(list(Dt = structure(1:39, .Label = c("2018-02-20 00:00:00.000", "2018-02-21 00:00:00.000", "2018-02-22 00:00:00.000", "2018-02-23 00:00:00.000", "2018-02-24 00:00:00.000", "2018-02-25 00:00:00.000", "2018-02-26 00:00:00.000", "2018-02-27 00:00:00.000", "2018-02-28 00:00:00.000", "2018-03-01 00:00:00.000", "2018-03-02 00:00:00.000", "2018-03-03 00:00:00.000", "2018-03-04 00:00:00.000", "2018-03-05 00:00:00.000", "2018-03-06 00:00:00.000", "2018-03-07 00:00:00

incorrect Rscript work when replacing medians

Question: I have a dataset: mydat=structure(list(code = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "52382МСК", class = "factor"), item = c(11709L, 11709L, 11709L, 11709L, 1170L, 1170L, 1170L, 1170L), sales = c(30L, 10L, 20L, 15L, 8L, 10L, 2L, 15L), action = c(0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L)), .Names = c("code", "item", "sales", "action" ), class = "data.frame", row.names = c(NA, -8L)) It has two groups by code and item: code item 52382МСК 11709 52382МСК 1170 I also have an action column. It can

Lagged differences

Question: Sample data: Date <- as.Date(c('1-01-2008','2-01-2008', '3-01-2008','4-01-2008', '5-01-2008', '1-01-2008','2-01-2008', '3-01-2008','4-01-2008', '5-01-2008'), format = "%m-%d-%Y") Country <- c('US', 'US','US','US', 'US', 'JP', 'JP', 'JP', 'JP', 'JP') Category <- c('Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple') Value <- c(runif(10, -0.5, 10)) df <- data.frame(Date, Country, Category, Value) I am using the following piece to calculate the lagged growth

Combining two dataframes keeping all columns [duplicate]

Question: This question already has answers here: How to join (merge) data frames (inner, outer, left, right) (13 answers). Closed 4 years ago. What I would like to do is combine 2 dataframes, keeping all columns (which is not done in the example below), and insert zeros where there are gaps in the dataframe from uncommon variables. This seems like a plyr or dplyr theme. However, a full join in plyr does not keep all of the columns, whilst a left or a right join does not keep all the rows I desire.

Spline on multiple factors in data frame

Question: This question is in the context where I have a lot of Model types, each of the same class, but the amount of data for each Model is small and I want to spline to get a fuller dataset. I'm hoping to find a way to do this without having to spline every Model individually, one at a time. So I have the following df: mydf<- data.frame(c("a","a","b","b","c","c"),c("e","e","e","e","e","e") ,as.numeric(c(1,2,3,10,20,30)), as.numeric(c(5,10,20,20,15,10))) Give some names: colnames(mydf)<-c("Model",

3 layer Stacked histogram from already summarized counts using ggplot2

Question: I would like some help coloring a ggplot2 histogram generated from summarized data in a data.frame. The dataset I'm using is R's built-in USArrests dataset. I'm trying to adapt the solution that was given to this question by arun. The desired result is to make a histogram of "Crime" and color each bar according to the relative contribution of c("Assault", "Rape", "Murder"). The code: attach(USArrests) #Create vector SUM arrests per state Crime <- with(USArrests, Murder+ Rape+ Assault)

Spread and merge row records in R for the same customer

Question: I have the data frame below, where I am trying to merge multiple transactions of one customer into a single record. Input: ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU 7/27/16 7/27/16 265 O 15 1 INTEREST 855 7/27/16 7/27/16 265 O 14 1 INSTALLMENT 855 Expected Output: ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_INTEREST FEATURE_INSTALLMENT L_NU 7/27/16 7/27/16 265 O 29 1 1 1 855 Tried: install1 <- install %>% group_by(LO_NO,AB_NO,L_NU) %>% slice(which.min(as

Rolling sum on an unbalanced time series

Question: I have a series of annual incident counts per category, with no rows for years in which the category did not see an incident. I would like to add a column that shows, for each year, how many incidents occurred in the previous three years. One way to handle this is to add empty rows for all years with zero incidents, then use rollapply() with a left-aligned four-year window, but that would expand my data set more than I want to. Surely there's a way to use ddply() and transform for this? The

Merging rows of binary data based on columns using ddply [duplicate]

Question: This question already has answers here: Aggregate / summarize multiple variables per group (e.g. sum, mean) (6 answers). Closed 3 years ago. I have the following dataframe, for which I want to merge together binary values from a number of rows. df =data.frame(ID=c(rep("A",5),rep("B",5)), nr=c(rep("2",5),rep("3",5)), replicate(10,sample(0:1,10,rep=TRUE))) e.g.: # ID nr X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 # A 2 0 0 1 1 1 1 1 1 1 0 # A 2 1 0 0 0 0 0 0 1 0 1 # A 2 0 0 1 1 1 0 0 0 0 1 # A 2 0 0 0 0 0 1 1 1