melt

Single row per id to multiple row per id

最后都变了- submitted on 2019-12-01 11:58:56
Question: I'd like to expand observations from a single row per id to multiple rows per id based on a given time interval:

    > dput(df)
    structure(list(id = c(123, 456, 789), gender = c(0, 1, 1), yr.start = c(2005, 2010, 2000), yr.last = c(2007, 2012, 2000)), .Names = c("id", "gender", "yr.start", "yr.last"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
    > df
    # A tibble: 3 x 4
         id gender yr.start yr.last
      <dbl>  <dbl>    <dbl>   <dbl>
    1   123      0     2005    2007
    2   456      1     2010    2012
    3   789      1     2000    2000

I want

Melting an R data.table with a factor column

别来无恙 submitted on 2019-12-01 11:47:53
I have the following R data.table (though this should work with a data.frame too). The goal is to reshape this data.table so that it can be plotted as a scatterplot in ggplot2. I therefore need to reshape it to have one "factor" column to color the points:

    > library(data.table)
    > dt
         ID  x_A  y_A  x_B  y_B
    1: 05AC 0.81    3 0.92 2.05
    2: 01BA 0.41    5 0.63  1.8
    3: Z1AC 0.41    5 0.58  1.8
    4: B2BA 0.21  6.5 1.00  1.8
    ....

I believe the correct output needs to be of the form:

      ID type    x    y
    05AC    A 0.81    3
    05AC    B 0.92 2.05
    01BA    A 0.41    5
    01BA    B 0.63  1.8
    Z1AC    A 0.41    5
    Z1AC    B 0.58  1.8
    B2BA    A 0.21  6.5
    B2BA    B 1.00  1.8

Is

Melting an R data.table with a factor column

杀马特。学长 韩版系。学妹 submitted on 2019-12-01 08:49:55
Question: I have the following R data.table (though this should work with a data.frame too). The goal is to reshape this data.table so that it can be plotted as a scatterplot in ggplot2. I therefore need to reshape it to have one "factor" column to color the points:

    > library(data.table)
    > dt
         ID  x_A  y_A  x_B  y_B
    1: 05AC 0.81    3 0.92 2.05
    2: 01BA 0.41    5 0.63  1.8
    3: Z1AC 0.41    5 0.58  1.8
    4: B2BA 0.21  6.5 1.00  1.8
    ....

I believe the correct output needs to be of the form:

      ID type    x    y
    05AC    A 0.81    3
    05AC    B 0.92 2

Pandas Melt with Multiple Value Vars

冷暖自知 submitted on 2019-11-30 10:24:58
I have a data set in wide format like this:

    Index  Country    Variable  2000  2001  2002  2003  2004  2005
    0      Argentina  var1        12    15    18    17    23    29
    1      Argentina  var2         1     3     2     5     7     5
    2      Brazil     var1        20    23    25    29    31    32
    3      Brazil     var2         0     1     2     2     3     3

I want to reshape my data to long format so that year, var1, and var2 become new columns:

    Index  Country    year  var1  var2
    0      Argentina  2000    12     1
    1      Argentina  2001    15     3
    2      Argentina  2002    18     2
    ....
    6      Brazil     2000    20     0
    7      Brazil     2001    23     1

I got my code to work when I only had one variable by writing

    df=(pd.melt(df,id_vars='Country',value_name='Var1', var_name='year'))

I can't figure out how
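One common way to get this shape is to melt the year columns first and then spread the `Variable` column back out. The following is a minimal sketch, not the answer from the original thread: it assumes the year columns are stored as strings, and the names `long`, `tidy`, and `value` are illustrative only.

```python
import pandas as pd

# Data copied from the question; year labels are assumed to be strings.
years = ["2000", "2001", "2002", "2003", "2004", "2005"]
df = pd.DataFrame(
    [
        ["Argentina", "var1", 12, 15, 18, 17, 23, 29],
        ["Argentina", "var2", 1, 3, 2, 5, 7, 5],
        ["Brazil", "var1", 20, 23, 25, 29, 31, 32],
        ["Brazil", "var2", 0, 1, 2, 2, 3, 3],
    ],
    columns=["Country", "Variable"] + years,
)

# Step 1: melt the year columns, keeping Country and Variable as identifiers.
long = df.melt(id_vars=["Country", "Variable"], var_name="year", value_name="value")

# Step 2: move the entries of Variable (var1, var2) into their own columns.
tidy = (
    long.set_index(["Country", "year", "Variable"])["value"]
    .unstack("Variable")
    .rename_axis(columns=None)
    .reset_index()
)
print(tidy)
```

Because `unstack` needs each (Country, year, Variable) combination to be unique, this sketch only applies when the wide table has one row per country and variable, as in the sample.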

Wide to long returns empty output - Python dataframe

陌路散爱 submitted on 2019-11-30 09:50:20
Question: I have a dataframe which can be generated from the code given below:

    df = pd.DataFrame({'person_id' :[1,2,3],
                       'date1': ['12/31/2007','11/25/2009','10/06/2005'],'val1': [2,4,6],
                       'date2': ['12/31/2017','11/25/2019','10/06/2015'],'val2':[1,3,5],
                       'date3': ['12/31/2027','11/25/2029','10/06/2025'],'val3':[7,9,11]})

I followed the solution below to convert it from wide to long:

    pd.wide_to_long(df, stubnames=['date', 'val'], i='person_id', j='grp').sort_index(level=0)

Though this works with sample data

Simultaneously melt multiple columns in Python Pandas

帅比萌擦擦* submitted on 2019-11-30 09:35:53
Question: I am wondering whether pd.melt supports melting multiple columns. In the example below I try to pass value_vars as a list of lists, but I get an error:

    ValueError: Location based indexing can only have [labels (MUST BE IN THE INDEX), slices of labels (BOTH endpoints included! Can be slices of integers if the index is integers), listlike of labels, boolean] types

Using pandas 0.23.1.

    df = pd.DataFrame({'City': ['Houston', 'Austin', 'Hoover'], 'State': ['Texas', 'Texas', 'Alabama'], 'Name':

Pandas Melt several groups of columns into multiple target columns by name

不想你离开。 submitted on 2019-11-30 07:03:59
I would like to melt several groups of columns of a dataframe into multiple target columns, similar to the questions "Python Pandas Melt Groups of Initial Columns Into Multiple Target Columns" and "pandas dataframe reshaping/stacking of multiple value variables into seperate columns". However, I need to do this explicitly by column name rather than by index location.

    import pandas as pd
    df = pd.DataFrame([('a','b','c',1,2,3,'aa','bb','cc'), ('d', 'e', 'f', 4, 5, 6, 'dd', 'ee', 'ff')],
                      columns=['a_1', 'a_2', 'a_3','b_1', 'b_2', 'b_3','c_1', 'c_2', 'c_3'])
    df

Original Dataframe: id a_1 a_2 a_3 b_1 b_2
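Since the groups here share a `name_number` pattern, one option that works purely by column name is `pd.wide_to_long`. This is a minimal sketch rather than the thread's accepted answer: the `id` column built from the row index and the `num` column name are assumptions added for illustration, and the target column names the asker wanted are not visible in the excerpt.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", "b", "c", 1, 2, 3, "aa", "bb", "cc"),
     ("d", "e", "f", 4, 5, 6, "dd", "ee", "ff")],
    columns=["a_1", "a_2", "a_3", "b_1", "b_2", "b_3", "c_1", "c_2", "c_3"],
)

# wide_to_long needs a column that identifies each original row;
# here the row index is used (an assumption, not part of the question).
df["id"] = df.index

# Columns are matched by name: stub + "_" + numeric suffix.
long = (
    pd.wide_to_long(df, stubnames=["a", "b", "c"], i="id", j="num", sep="_")
    .sort_index()
    .reset_index()
)
print(long)
```

Each stub (`a`, `b`, `c`) becomes one target column, and the numeric suffix ends up in `num`, so the result has one row per original row and suffix.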

Wide to long returns empty output - Python dataframe

可紊 submitted on 2019-11-29 17:18:57
I have a dataframe which can be generated from the code given below:

    df = pd.DataFrame({'person_id' :[1,2,3],
                       'date1': ['12/31/2007','11/25/2009','10/06/2005'],'val1': [2,4,6],
                       'date2': ['12/31/2017','11/25/2019','10/06/2015'],'val2':[1,3,5],
                       'date3': ['12/31/2027','11/25/2029','10/06/2025'],'val3':[7,9,11]})

I followed the solution below to convert it from wide to long:

    pd.wide_to_long(df, stubnames=['date', 'val'], i='person_id', j='grp').sort_index(level=0)

Though this works with the sample data as shown below, it doesn't work with my real data, which has more than 200 columns. Instead of person
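The excerpt does not show what the real column names look like, so the cause of the empty output cannot be confirmed here. One thing worth checking is whether the real columns follow the pattern `wide_to_long` expects: by default it matches a stub name followed directly by digits (`sep=""`, `suffix=r"\d+"`). The sketch below first runs the call on the sample data, then on a hypothetical variant whose columns use an underscore and letter suffixes, where `sep` and `suffix` have to be passed explicitly. The renamed columns are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "person_id": [1, 2, 3],
    "date1": ["12/31/2007", "11/25/2009", "10/06/2005"], "val1": [2, 4, 6],
    "date2": ["12/31/2017", "11/25/2019", "10/06/2015"], "val2": [1, 3, 5],
    "date3": ["12/31/2027", "11/25/2029", "10/06/2025"], "val3": [7, 9, 11],
})

# Sample data: stub + digits, so the defaults are enough.
long = pd.wide_to_long(
    df, stubnames=["date", "val"], i="person_id", j="grp"
).sort_index(level=0)

# Hypothetical variant: date_a, val_a, date_b, ... (names are assumptions).
rename_map = {
    f"{stub}{n}": f"{stub}_{tag}"
    for stub in ("date", "val")
    for n, tag in zip((1, 2, 3), ("a", "b", "c"))
}
df_named = df.rename(columns=rename_map)

# The separator and a non-numeric suffix pattern must be given explicitly.
long_named = pd.wide_to_long(
    df_named, stubnames=["date", "val"], i="person_id", j="grp",
    sep="_", suffix=r"[a-z]+",
).sort_index(level=0)

print(long)
print(long_named)
```

If the real columns mix several naming schemes, the stub names, `sep`, and `suffix` would need to be chosen so that every column to be reshaped matches the pattern.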

Python Pandas Sum Values in Columns If date between 2 dates

我怕爱的太早我们不能终老 submitted on 2019-11-29 09:56:52
Question: I have a dataframe df which can be created with this:

    data={'id':[1,1,1,1,2,2,2,2],
          'date1':[datetime.date(2016,1,1),datetime.date(2016,1,2),datetime.date(2016,1,3),datetime.date(2016,1,4),
                   datetime.date(2016,1,2),datetime.date(2016,1,4),datetime.date(2016,1,3),datetime.date(2016,1,1)],
          'date2':[datetime.date(2016,1,5),datetime.date(2016,1,3),datetime.date(2016,1,5),datetime.date(2016,1,5),
                   datetime.date(2016,1,4),datetime.date(2016,1,5),datetime.date(2016,1,4),datetime.date(2016,1,1)],

Using melt with matrix or data.frame gives different output

こ雲淡風輕ζ submitted on 2019-11-29 07:04:57
Consider the following code:

    set.seed(1)
    M = matrix(rnorm(9), ncol = 3)
    dimnames(M) = list(LETTERS[1:3], LETTERS[1:3])
    print(M)
               A          B         C
    A -0.6264538  1.5952808 0.4874291
    B  0.1836433  0.3295078 0.7383247
    C -0.8356286 -0.8204684 0.5757814

    melt(M)
      Var1 Var2      value
    1    A    A -0.6264538
    2    B    A  0.1836433
    3    C    A -0.8356286
    4    A    B  1.5952808
    5    B    B  0.3295078
    6    C    B -0.8204684
    7    A    C  0.4874291
    8    B    C  0.7383247
    9    C    C  0.5757814

If I call melt using a data.frame, I get a different result:

    DF = data.frame(M)
    melt(DF)
      variable      value
    1        A -0.6264538
    2        A  0.1836433
    3        A -0.8356286
    4        B  1.5952808
    5        B  0.3295078
    6        B -0.8204684
    7        C