plyr

Convert a list of numeric vectors with different lengths to data.frame

自作多情 submitted on 2019-12-25 01:55:27
Question: I have a df: dput(head(data)) structure(list(company_code = c(1L, 1L, 1L, 1L, 1L, 11L, 11L, 11L, 12L, 13L, 13L), company_name = c("AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Iggesunds B", "AB Iggesunds B", "AB Iggesunds B", "AB Industripapp", "AB Klippans FinpB", "AB Klippans FinpB" ), year_cg_code = c(11920L, 11920L, 11920L, 11920L, 11920L, 111929L, 111929L, 111929L, 121929L, 131929L, 131929L), plant

How to subtract a median only from integer value

我们两清 submitted on 2019-12-25 00:07:32
Question: I have this dataset: df=structure(list(Dt = structure(1:39, .Label = c("2018-02-20 00:00:00.000", "2018-02-21 00:00:00.000", "2018-02-22 00:00:00.000", "2018-02-23 00:00:00.000", "2018-02-24 00:00:00.000", "2018-02-25 00:00:00.000", "2018-02-26 00:00:00.000", "2018-02-27 00:00:00.000", "2018-02-28 00:00:00.000", "2018-03-01 00:00:00.000", "2018-03-02 00:00:00.000", "2018-03-03 00:00:00.000", "2018-03-04 00:00:00.000", "2018-03-05 00:00:00.000", "2018-03-06 00:00:00.000", "2018-03-07 00:00:00

incorrect Rscript work when replacing medians

心已入冬 submitted on 2019-12-24 20:49:50
Question: I have a dataset: mydat=structure(list(code = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "52382МСК", class = "factor"), item = c(11709L, 11709L, 11709L, 11709L, 1170L, 1170L, 1170L, 1170L), sales = c(30L, 10L, 20L, 15L, 8L, 10L, 2L, 15L), action = c(0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L)), .Names = c("code", "item", "sales", "action" ), class = "data.frame", row.names = c(NA, -8L))
It has two groups by code and item:
code item
52382МСК 11709
52382МСК 1170
I also have an action column. It can

Lagged differences

与世无争的帅哥 submitted on 2019-12-24 16:52:28
Question: Sample data:
Date <- as.Date(c('1-01-2008','2-01-2008', '3-01-2008','4-01-2008', '5-01-2008', '1-01-2008','2-01-2008', '3-01-2008','4-01-2008', '5-01-2008'), format = "%m-%d-%Y")
Country <- c('US', 'US','US','US', 'US', 'JP', 'JP', 'JP', 'JP', 'JP')
Category <- c('Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple', 'Apple')
Value <- c(runif(10, -0.5, 10))
df <- data.frame(Date, Country, Category, Value)
I am using the following piece to calculate the lagged growth

Combining two dataframes keeping all columns [duplicate]

坚强是说给别人听的谎言 submitted on 2019-12-24 16:41:26
Question: This question already has answers here: How to join (merge) data frames (inner, outer, left, right) (13 answers). Closed 4 years ago. What I would like to do is combine 2 dataframes, keeping all columns (which is not done in the example below), and insert zeros where there are gaps in the dataframe from uncommon variables. This seems like a plyr or dplyr theme. However, a full join in plyr does not keep all of the columns, whilst a left or a right join does not keep all the rows I desire.
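One possible reading of this request is a full outer join followed by zero-filling. The sketch below uses base R `merge()` rather than plyr or dplyr; the data frames `df1` and `df2` and their column names are hypothetical, since the asker's example is not shown here, and both are assumed to share a key column `id`.

```r
# Hypothetical example data; the names id, a and b are assumptions for illustration.
df1 <- data.frame(id = c(1, 2, 3), a = c(10, 20, 30))
df2 <- data.frame(id = c(2, 3, 4), b = c(5, 6, 7))

# Full outer join on the shared key: every row and every column is kept.
combined <- merge(df1, df2, by = "id", all = TRUE)

# Rows without a match get NA in the other frame's columns; replace those with 0.
combined[is.na(combined)] <- 0

print(combined)
```

If the two frames share no key and should simply be stacked, `plyr::rbind.fill()` may be closer to what is wanted; it fills missing columns with NA, which can be zero-filled the same way.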

Spline on multiple factors in data frame

…衆ロ難τιáo~ submitted on 2019-12-24 15:33:32
Question: This question is in the context where I have a lot of Model types, each of the same class, but the amount of data for each Model is small and I want to spline to get a fuller dataset. I'm hoping to find a way to do this without having to spline every Model individually, one at a time. So I have the following df: mydf<- data.frame(c("a","a","b","b","c","c"),c("e","e","e","e","e","e") ,as.numeric(c(1,2,3,10,20,30)), as.numeric(c(5,10,20,20,15,10))) Give some names: colnames(mydf)<-c("Model",

3 layer Stacked histogram from already summarized counts using ggplot2

喜你入骨 submitted on 2019-12-24 11:09:09
Question: I would like some help coloring a ggplot2 histogram generated from summarized data in a data.frame. The dataset I'm using is R's built-in USArrests dataset. I'm trying to adapt the solution that was given to this question by arun. The desired result is to make a histogram of "Crime" and color each bar according to the relative contribution of c("Assault", "Rape", "Murder"). The code: attach(USArrests) #Create vector SUM arrests per state Crime <- with(USArrests, Murder+ Rape+ Assault)

Spread and merge row records in R for the same customer

点点圈 submitted on 2019-12-24 10:49:38
Question: I have the below data frame where I am trying to merge multiple transactions of one customer into a single record.
Input:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_CODE L_NU
7/27/16 7/27/16 265 O 15 1 INTEREST 855
7/27/16 7/27/16 265 O 14 1 INSTALLMENT 855
Expected Output:
ST_DATE ND_DATE LO_NO ACTV_CODE ACTV_AMT AB_NO FEATURE_INTEREST FEATURE_INSTALLMENT L_NU
7/27/16 7/27/16 265 O 29 1 1 1 855
Tried: install1 <- install %>% group_by(LO_NO,AB_NO,L_NU) %>% slice(which.min(as
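The expected output above sums ACTV_AMT per customer and turns each FEATURE_CODE value into a 0/1 indicator column. A minimal base R sketch of that transformation follows; it rebuilds the two input rows shown in the question, and the choice of grouping columns (`keys`) is an assumption, since the attempted code is cut off.

```r
# The two input rows from the question, rebuilt as a data frame.
install <- data.frame(
  ST_DATE = c("7/27/16", "7/27/16"),
  ND_DATE = c("7/27/16", "7/27/16"),
  LO_NO = c(265, 265),
  ACTV_CODE = c("O", "O"),
  ACTV_AMT = c(15, 14),
  AB_NO = c(1, 1),
  FEATURE_CODE = c("INTEREST", "INSTALLMENT"),
  L_NU = c(855, 855)
)

# Assumption: these columns together identify one customer record.
keys <- c("ST_DATE", "ND_DATE", "LO_NO", "ACTV_CODE", "AB_NO", "L_NU")

# One 0/1 indicator column per feature code seen in the example.
install$FEATURE_INTEREST <- as.integer(install$FEATURE_CODE == "INTEREST")
install$FEATURE_INSTALLMENT <- as.integer(install$FEATURE_CODE == "INSTALLMENT")

# Sum the amounts per record, and take the max of each indicator per record.
amt <- aggregate(install["ACTV_AMT"], by = install[keys], FUN = sum)
flags <- aggregate(install[c("FEATURE_INTEREST", "FEATURE_INSTALLMENT")],
                   by = install[keys], FUN = max)

# Join the two summaries back into a single row per record.
out <- merge(amt, flags, by = keys)
print(out)
```

The column order differs from the expected output (the key columns come first), but the values match: one row with ACTV_AMT 29 and both feature indicators set to 1.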

Rolling sum on an unbalanced time series

橙三吉。 submitted on 2019-12-24 09:58:50
Question: I have a series of annual incident counts per category, with no rows for years in which the category did not see an incident. I would like to add a column that shows, for each year, how many incidents occurred in the previous three years. One way to handle this is to add empty rows for all years with zero incidents, then use rollapply() with a left-aligned four-year window, but that would expand my data set more than I want to. Surely there's a way to use ddply() and transform for this? The
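The idea of summing over a year window without padding the data can be sketched in base R by comparing years directly within each category. The data frame `incidents` and its column names below are hypothetical, since the asker's data is cut off, and "previous three years" is read as the three calendar years strictly before the current one.

```r
# Hypothetical data: the names category, year and incidents are assumptions.
incidents <- data.frame(
  category  = c("A", "A", "A", "B", "B"),
  year      = c(2001, 2002, 2005, 2001, 2004),
  incidents = c(3, 1, 2, 4, 5)
)

# For each row of one category, sum the incidents recorded in the
# three calendar years before that row's year (missing years count as zero).
add_prev3 <- function(d) {
  d$prev3 <- vapply(d$year, function(y) {
    sum(d$incidents[d$year >= y - 3 & d$year < y])
  }, numeric(1))
  d
}

# Split by category, apply the helper, and bind the pieces back together.
result <- do.call(rbind, lapply(split(incidents, incidents$category), add_prev3))
rownames(result) <- NULL
print(result)
```

Because `add_prev3` takes a data frame and returns a data frame, the same helper should also work as the function argument to `plyr::ddply(incidents, "category", add_prev3)`, which performs the split, apply, and combine steps in one call.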

Merging rows of binary data based on columns using ddply [duplicate]

你。 submitted on 2019-12-24 07:09:37
Question: This question already has answers here: Aggregate / summarize multiple variables per group (e.g. sum, mean) (6 answers). Closed 3 years ago. I have the following dataframe for which I want to merge together the binary values from a number of rows. df =data.frame(ID=c(rep("A",5),rep("B",5)), nr=c(rep("2",5),rep("3",5)), replicate(10,sample(0:1,10,rep=TRUE))) eg:
# ID nr X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# A 2 0 0 1 1 1 1 1 1 1 0
# A 2 1 0 0 0 0 0 0 1 0 1
# A 2 0 0 1 1 1 0 0 0 0 1
# A 2 0 0 0 0 0 1 1 1
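The expected result is cut off above, so the following is only a sketch under one assumption: that "merging" binary rows means a per-group logical OR, i.e. a column is 1 in the merged row if it is 1 in any row of the group. With 0/1 values that is the per-group maximum, which base R `aggregate()` can compute. The small data frame here is a hypothetical stand-in for the randomly generated one in the question.

```r
# Hypothetical stand-in for the question's data, with fixed values and two X columns.
df <- data.frame(
  ID = c("A", "A", "B", "B"),
  nr = c("2", "2", "3", "3"),
  X1 = c(0, 1, 0, 0),
  X2 = c(0, 0, 1, 1)
)

# Assumption: merge rows by taking the maximum of each binary column per ID/nr group,
# which is equivalent to a logical OR across the group's rows.
merged <- aggregate(. ~ ID + nr, data = df, FUN = max)
print(merged)
```

If the intent is instead to count how often each flag occurs per group, swapping `FUN = max` for `FUN = sum` gives that.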