melt | 易学教程

Single row per id to multiple row per id

Question: I'd like to expand observations from a single row per id to multiple rows per id based on a given time interval:

    > dput(df)
    structure(list(id = c(123, 456, 789), gender = c(0, 1, 1), yr.start = c(2005, 2010, 2000), yr.last = c(2007, 2012, 2000)), .Names = c("id", "gender", "yr.start", "yr.last"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
    > df
    # A tibble: 3 x 4
         id gender yr.start yr.last
      <dbl>  <dbl>    <dbl>   <dbl>
    1   123      0     2005    2007
    2   456      1     2010    2012
    3   789      1     2000    2000

I want

Melting an R data.table with a factor column

I have the following R data.table (though this should scale with a data.frame too). The goal is to reshape this data.table to plot as a scatterplot in ggplot2. I therefore need to reshape this data.table to have one "factor" column to color the points:

    > library(data.table)
    > dt
         ID  x_A y_A  x_B  y_B
    1: 05AC 0.81   3 0.92 2.05
    2: 01BA 0.41   5 0.63  1.8
    3: Z1AC 0.41   5 0.58  1.8
    4: B2BA 0.21 6.5 1.00  1.8
    ....

I believe the correct output needs to be of the form:

    ID   type x    y
    05AC A    0.81 3
    05AC B    0.92 2.05
    01BA A    0.41 5
    01BA B    0.63 1.8
    Z1AC A    0.41 5
    Z1AC B    0.58 1.8
    B2BA A    0.21 6.5
    B2BA B    1.00 1.8

Is

Pandas Melt with Multiple Value Vars

I have a data set in wide format like this:

    Index  Country    Variable  2000  2001  2002  2003  2004  2005
    0      Argentina  var1      12    15    18    17    23    29
    1      Argentina  var2      1     3     2     5     7     5
    2      Brazil     var1      20    23    25    29    31    32
    3      Brazil     var2      0     1     2     2     3     3

I want to reshape my data to long format so that year, var1, and var2 become new columns:

    Index  Country    year  var1  var2
    0      Argentina  2000  12    1
    1      Argentina  2001  15    3
    2      Argentina  2002  18    2
    ....
    6      Brazil     2000  20    0
    7      Brazil     2001  23    1

I got my code to work when I only had one variable by writing

    df = pd.melt(df, id_vars='Country', value_name='Var1', var_name='year')

I can't figure out how
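One way the reshape described in this question could be approached is to melt the year columns first and then spread the `Variable` labels back into columns. The following is only a sketch built on the sample rows shown in the excerpt; it assumes the year column labels are strings, and the names `long_df`, `result`, and `value` are illustrative rather than taken from the original post.

```python
import pandas as pd

# Sample data copied from the question; year labels are assumed to be strings.
df = pd.DataFrame({
    "Country": ["Argentina", "Argentina", "Brazil", "Brazil"],
    "Variable": ["var1", "var2", "var1", "var2"],
    "2000": [12, 1, 20, 0],
    "2001": [15, 3, 23, 1],
    "2002": [18, 2, 25, 2],
    "2003": [17, 5, 29, 2],
    "2004": [23, 7, 31, 3],
    "2005": [29, 5, 32, 3],
})

# Step 1: melt the year columns, keeping Country and Variable as identifiers.
long_df = df.melt(id_vars=["Country", "Variable"], var_name="year", value_name="value")
long_df["year"] = long_df["year"].astype(int)

# Step 2: move the Variable labels back out into one column per variable.
result = (
    long_df.set_index(["Country", "year", "Variable"])["value"]
    .unstack("Variable")
    .reset_index()
    .sort_values(["Country", "year"])
    .reset_index(drop=True)
)
result.columns.name = None  # drop the leftover "Variable" axis label

print(result)
```

The `set_index(...).unstack(...)` step requires each (Country, year, Variable) combination to be unique; if the real data has repeats, an aggregating `pivot_table` would be needed instead.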

Simultaneously melt multiple columns in Python Pandas

Question: I am wondering whether pd.melt supports melting multiple columns. In the examples below I try to pass value_vars as a list of lists, but I get an error:

    ValueError: Location based indexing can only have [labels (MUST BE IN THE INDEX), slices of labels (BOTH endpoints included! Can be slices of integers if the index is integers), listlike of labels, boolean] types

Using pandas 0.23.1.

    df = pd.DataFrame({'City': ['Houston', 'Austin', 'Hoover'], 'State': ['Texas', 'Texas', 'Alabama'], 'Name':

Pandas Melt several groups of columns into multiple target columns by name

I would like to melt several groups of columns of a dataframe into multiple target columns. This is similar to the questions "Python Pandas Melt Groups of Initial Columns Into Multiple Target Columns" and "pandas dataframe reshaping/stacking of multiple value variables into seperate columns". However, I need to do this explicitly by column name, rather than by index location.

    import pandas as pd
    df = pd.DataFrame([('a','b','c',1,2,3,'aa','bb','cc'), ('d', 'e', 'f', 4, 5, 6, 'dd', 'ee', 'ff')], columns=['a_1', 'a_2', 'a_3','b_1', 'b_2', 'b_3','c_1', 'c_2', 'c_3'])
    df

Original Dataframe:

    id a_1 a_2 a_3 b_1 b_2
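Since the excerpt cuts off before the desired output is shown, the following is only a sketch of one way to melt groups of columns by name. It assumes the goal is one target column per group (`a`, `b`, `c`) with an `id` column taken from the original row index; the `groups` mapping and the names `width` and `long_df` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", "b", "c", 1, 2, 3, "aa", "bb", "cc"),
     ("d", "e", "f", 4, 5, 6, "dd", "ee", "ff")],
    columns=["a_1", "a_2", "a_3", "b_1", "b_2", "b_3", "c_1", "c_2", "c_3"],
)

# Hypothetical mapping: target column name -> source columns, listed by name.
groups = {
    "a": ["a_1", "a_2", "a_3"],
    "b": ["b_1", "b_2", "b_3"],
    "c": ["c_1", "c_2", "c_3"],
}
width = 3  # every group is assumed to contain the same number of columns

# Each original row contributes `width` rows to the long frame.
long_df = pd.DataFrame({"id": df.index.repeat(width).to_numpy()})
for target, cols in groups.items():
    # Row-major flattening keeps values from the same original row together,
    # so the positions line up with the repeated id values.
    long_df[target] = df[cols].to_numpy().ravel()

print(long_df)
```

Because the groups are selected by label, reordering the columns of `df` does not change the result, which is the property the question asks for.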

Wide to long returns empty output - Python dataframe

I have a dataframe which can be generated from the code given below:

    df = pd.DataFrame({'person_id' :[1,2,3],'date1': ['12/31/2007','11/25/2009','10/06/2005'],'val1': [2,4,6],'date2': ['12/31/2017','11/25/2019','10/06/2015'],'val2':[1,3,5],'date3': ['12/31/2027','11/25/2029','10/06/2025'],'val3':[7,9,11]})

I followed the solution below to convert it from wide to long:

    pd.wide_to_long(df, stubnames=['date', 'val'], i='person_id', j='grp').sort_index(level=0)

Though this works with the sample data as shown below, it doesn't work with my real data, which has more than 200 columns. Instead of person
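For reference, the call quoted in the question can be run on the sample frame as sketched below. The second half is an assumption, since the real column names are not visible in the excerpt: `pd.wide_to_long` only picks up columns of the form stub + `sep` + `suffix`, with `sep=''` and `suffix=r'\d+'` as defaults, so differently named columns need those arguments set explicitly. The frame `df2` and its `_a`/`_b` suffixes are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "person_id": [1, 2, 3],
    "date1": ["12/31/2007", "11/25/2009", "10/06/2005"], "val1": [2, 4, 6],
    "date2": ["12/31/2017", "11/25/2019", "10/06/2015"], "val2": [1, 3, 5],
    "date3": ["12/31/2027", "11/25/2029", "10/06/2025"], "val3": [7, 9, 11],
})

# The call from the question: stubs followed directly by a numeric suffix.
long_df = pd.wide_to_long(
    df, stubnames=["date", "val"], i="person_id", j="grp"
).sort_index()

# Hypothetical variant: an underscore separator and letter suffixes.
df2 = pd.DataFrame({
    "person_id": [1, 2],
    "date_a": ["12/31/2007", "11/25/2009"], "val_a": [2, 4],
    "date_b": ["12/31/2017", "11/25/2019"], "val_b": [1, 3],
})
long_df2 = pd.wide_to_long(
    df2, stubnames=["date", "val"], i="person_id", j="grp",
    sep="_", suffix=r"\w+",
).sort_index()

print(long_df)
print(long_df2)
```

Comparing the real column names against the stub, separator, and suffix pattern is a reasonable first check when the result comes back empty, though the excerpt does not confirm that this is the cause here.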

Python Pandas Sum Values in Columns If date between 2 dates

Question: I have a dataframe df which can be created with this:

    data={'id':[1,1,1,1,2,2,2,2],
    'date1':[datetime.date(2016,1,1),datetime.date(2016,1,2),datetime.date(2016,1,3),datetime.date(2016,1,4), datetime.date(2016,1,2),datetime.date(2016,1,4),datetime.date(2016,1,3),datetime.date(2016,1,1)],
    'date2':[datetime.date(2016,1,5),datetime.date(2016,1,3),datetime.date(2016,1,5),datetime.date(2016,1,5), datetime.date(2016,1,4),datetime.date(2016,1,5),datetime.date(2016,1,4),datetime.date(2016,1,1)],

Using melt with matrix or data.frame gives different output

Consider the following code:

    set.seed(1)
    M = matrix(rnorm(9), ncol = 3)
    dimnames(M) = list(LETTERS[1:3], LETTERS[1:3])
    print(M)
               A          B         C
    A -0.6264538  1.5952808 0.4874291
    B  0.1836433  0.3295078 0.7383247
    C -0.8356286 -0.8204684 0.5757814
    melt(M)
      Var1 Var2      value
    1    A    A -0.6264538
    2    B    A  0.1836433
    3    C    A -0.8356286
    4    A    B  1.5952808
    5    B    B  0.3295078
    6    C    B -0.8204684
    7    A    C  0.4874291
    8    B    C  0.7383247
    9    C    C  0.5757814

If I call melt using a data.frame, I get a different result:

    DF = data.frame(M)
    melt(DF)
      variable      value
    1        A -0.6264538
    2        A  0.1836433
    3        A -0.8356286
    4        B  1.5952808
    5        B  0.3295078
    6        B -0.8204684
    7        C