melt | 易学教程

How to reshape a dataframe with “reoccurring” columns?

Question: I am new to data analysis with R. I recently got a pre-formatted environmental observation-model dataset, an example subset of which is shown below:
    date                 site    obs  mod    site         obs  mod
    2000-09-01 00:00:00  campus  NA   61.63  city centre  66   56.69
    2000-09-01 01:00:00  campus  52   62.55  city centre  NA   54.75
    2000-09-01 02:00:00  campus  52   63.52  city centre  56   54.65
Basically, the data include the time series of hourly observed and modelled concentrations of a pollutant at various sites in "reoccurring columns

Python pandas melting data to multiple columns and column names in another column

Question: I have a dataframe whose data I want to melt into multiple target columns. This is the code I used:
    grp2 = pd.lreshape(grp1, cols.groupby(cols.str.split('_').str[1])).sort_values('ACCT_NAME')
With the line above I lose the column names.
    grp2 = pd.melt(grp1, id_vars=['Client', 'Industry'], var_name="H Year", value_name='Count')
With the line above I don't get multiple target columns. From DF Client INDUSTRY 1H2016_6MO 2H2016_6MO 1H2017_6MO 2H2017_6MO 1H2016_12MO 2H2016_12MO 1H2017_12MO 2H2017
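One possible way to get both the period label and several target columns is to melt first, split the old column names on the underscore, and then pivot the suffix back out. The sketch below is only an illustration under stated assumptions: the sample values are invented, the id columns are taken to be `Client` and `Industry` as in the `pd.melt` call, and the helper column names `col` and `Window` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
grp1 = pd.DataFrame({
    "Client": ["A", "B"],
    "Industry": ["Retail", "Tech"],
    "1H2016_6MO": [1, 2],
    "2H2016_6MO": [3, 4],
    "1H2016_12MO": [5, 6],
    "2H2016_12MO": [7, 8],
})

# Step 1: melt everything except the id columns into long form.
long = grp1.melt(id_vars=["Client", "Industry"], var_name="col", value_name="Count")

# Step 2: split "1H2016_6MO" into the period ("1H2016") and the window ("6MO").
parts = long["col"].str.split("_", expand=True)
long["H Year"] = parts[0]
long["Window"] = parts[1]

# Step 3: pivot the window back out so each window becomes its own column.
grp2 = long.pivot_table(
    index=["Client", "Industry", "H Year"],
    columns="Window",
    values="Count",
    aggfunc="first",
).reset_index()
grp2.columns.name = None  # drop the leftover "Window" label on the column index

print(grp2)
```

The result has one row per client and half-year, with the `6MO` and `12MO` values side by side, so the original column names survive as data rather than being lost.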

R: How to split a string into values and map the resultant broken pieces as columns to the dataset? [duplicate]

Question: This question already has answers here: Split a column of concatenated comma-delimited data and recode output as factors (2 answers). Closed 2 years ago. As shown in the above pic, I have a column, genres, with a list of genres the corresponding movie belongs to. There are in total 19 unique genres. I'd like to know if I can manipulate this data into appending 19 columns to the data set each corresponding to each of the genres identifiers and label the corresponding cells as 0 or 1 indicating

Creating Stacked Bar Chart With one Variable for each Bar, using melt, and ggplot

Question: This question raises different points from the one I posted yesterday, with a better description, so I hope for your understanding. I have the following data:
    Data <- data.frame(LMX = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64),
                       Thriving = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55),
                       Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64))
    rownames(Data) <- 1:8
Now, my aim is to generate a flipped over bar chart that is showing one bar for each variable with all bars

How to zero-normalize a molten dataframe?

Question: Let's say I have this molten data.frame:
    molten <- data.frame(
      gene = c("a1", "b1", "a1", "b1", "a1", "b1"),
      count = c(3, 4, 5, 2, 6, 7),
      condition = c("A", "A", "B", "B", "C", "C")
    )
    #   gene count condition
    # 1   a1     3         A
    # 2   b1     4         A
    # 3   a1     5         B
    # 4   b1     2         B
    # 5   a1     6         C
    # 6   b1     7         C
Which looks like this unmolten:
    molten %>% dcast(gene ~ condition, value.var = "count")
    #   gene A B C
    # 1   a1 3 5 6
    # 2   b1 4 2 7
How can I subtract column A from all the other numeric columns (B and C in this example)? I want
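The question is about R, but the same idea can be sketched in pandas for comparison: look up each gene's count under condition A and subtract it from every row of that gene, without leaving long form. This is a hedged analogue, not the asker's solution; the column name `normalized` is an assumption, and the sketch also produces zeros for the A rows themselves.

```python
import pandas as pd

# The molten data from the question, rebuilt as a pandas DataFrame.
molten = pd.DataFrame({
    "gene": ["a1", "b1", "a1", "b1", "a1", "b1"],
    "count": [3, 4, 5, 2, 6, 7],
    "condition": ["A", "A", "B", "B", "C", "C"],
})

# Per-gene baseline: the count observed under condition "A".
baseline = molten.loc[molten["condition"] == "A"].set_index("gene")["count"]

# Subtract each gene's baseline from all of its rows (hypothetical column name).
molten["normalized"] = molten["count"] - molten["gene"].map(baseline)

print(molten)
```

This assumes every gene has exactly one row for condition A; duplicated baseline rows would make the `map` lookup fail, and a missing one would give NaN.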

Melt a pandas DataFrame

Question: I have a pandas DataFrame like this:
    df = pd.DataFrame({'custid':[1,2,3,4],
                       'prod1':['jeans','tshirt','jacket','tshirt'],
                       'prod1_hnode1':[1,2,3,2],
                       'prod1_hnode2':[6,7,8,7],
                       'prod2':['tshirt','jeans','jacket','shirt'],
                       'prod2_hnode1':[2,1,3,4],
                       'prod2_hnode2':[7,6,8,7]})
    In [54]: df
    Out[54]:
       custid   prod1  prod1_hnode1  prod1_hnode2   prod2  prod2_hnode1  \
    0       1   jeans             1             6  tshirt             2
    1       2  tshirt             2             7   jeans             1
    2       3  jacket             3             8  jacket             3
    3       4  tshirt             2             7   shirt             4
       prod2_hnode2
    0             7
    1             6
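The excerpt is cut off before the desired output, so the following is only a sketch under an assumption: that the goal is one row per customer and product, with the matching `hnode1` and `hnode2` values alongside. Because `prod1` itself has no suffix, the sketch slices each product block by name and stacks the blocks with `pd.concat` instead of a single `pd.melt` call. The output column names `prod`, `hnode1` and `hnode2` are hypothetical.

```python
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame({'custid': [1, 2, 3, 4],
                   'prod1': ['jeans', 'tshirt', 'jacket', 'tshirt'],
                   'prod1_hnode1': [1, 2, 3, 2],
                   'prod1_hnode2': [6, 7, 8, 7],
                   'prod2': ['tshirt', 'jeans', 'jacket', 'shirt'],
                   'prod2_hnode1': [2, 1, 3, 4],
                   'prod2_hnode2': [7, 6, 8, 7]})

frames = []
for n in (1, 2):
    prefix = f"prod{n}"
    # Take the columns that belong to this product slot.
    block = df[["custid", prefix, f"{prefix}_hnode1", f"{prefix}_hnode2"]].copy()
    # Give every block the same (assumed) column names so they can be stacked.
    block.columns = ["custid", "prod", "hnode1", "hnode2"]
    frames.append(block)

# Stack the blocks, then order by customer; mergesort is stable, so prod1 stays before prod2.
long_df = (
    pd.concat(frames, ignore_index=True)
    .sort_values("custid", kind="mergesort")
    .reset_index(drop=True)
)

print(long_df)
```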

Using melt / cast with variables of uneven length in R

Question: I'm working with a large data frame that I want to pivot, so that variables in a column become rows across the top. I've found the reshape package very useful in such cases, except that the cast function defaults to fun.aggregate=length. Presumably this is because I'm performing these operations by "case" and the number of variables measured varies among cases. I would like to pivot so that missing variables are denoted as "NA"s in the pivoted data frame. So, in other words, I want to go from

In R transpose and combine multiple dataframes with missing data and blank column names / rename melted columns prior to dcast

Question: I have searched and found many solutions that came close, but never quite worked in the end. This is probably something very simple, for those with experience... Here is a snippet of my data. This was created automatically from a JSON import by the package jsonlite. The data is very nicely structured, but I am nevertheless helpless. Update2: I have added the relevant data below structure(list(rightsize = c(42L, 50L, 52L, 49L, 41L, 41L, 41L, 41L, 41L, 45L, 47L, 42L, 45L, 46L, 42L, 44L, 44L,

Converting specific cells of data frame to table in R

Question: I have a data frame (read from RDS file) with 140 variables. I have subsetted 3 of them. But the subset has only one row with three column variables. I have to present it as a table and make a bar chart too. The subset data frame looks like this. HomeCondn_Good HomeCondn_Livabl HomeCondn_Dilapdtd (dbl) (dbl) (dbl) 1 65.9 29.7 4.3 The reproducible example is as follows: structure(list(HomeCondn_Good = 65.9, HomeCondn_Livabl = 29.7, HomeCondn_Dilapdtd = 4.3), .Names = c("HomeCondn_Good",

Melt and Merge on Substring - Python & Pandas

Question: I have data like
    id  name  model_#  ms  bp1  cd1  sf1  sa1  rq1  bp2  cd2  sf2  sa2  rq2  ...
    1   John  23984    1   23   234  124  25   252  252  62   194  234  234  ...
    2   John  23984    2   234  234  242  62   262  622  262  622  26   262  ...
for hundreds of models with up to 10 ms and variables counting up to 21. I have usually used pd.melt for doing my analysis where I look at bp1:bp21 or whatever. I currently have a need to create a melt where I look at bp1 values along with rq1 values. I am looking to effectively create
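For column names that share a stem and differ only by a numeric suffix, `pd.wide_to_long` can melt several stems at once while keeping values with the same suffix on the same row. The sketch below is a minimal illustration, assuming the aim is to see each `bp` value next to the `rq` value with the same number. It uses only the `bp` and `rq` columns of the two sample rows shown in the question, and the name `var_num` for the suffix column is hypothetical.

```python
import pandas as pd

# The two sample rows from the question, restricted to the bp and rq columns.
wide = pd.DataFrame({
    "id": [1, 2],
    "name": ["John", "John"],
    "model_#": [23984, 23984],
    "ms": [1, 2],
    "bp1": [23, 234],
    "rq1": [252, 262],
    "bp2": [252, 622],
    "rq2": [234, 262],
})

# Melt both stems at once; the numeric suffix lands in the (assumed) "var_num" column.
long_df = pd.wide_to_long(
    wide,
    stubnames=["bp", "rq"],
    i=["id", "name", "model_#", "ms"],  # these columns together must identify each row
    j="var_num",
).reset_index()

print(long_df)
```

Each output row then carries `bp` and `rq` for one model, `ms`, and suffix number, which can be filtered on `var_num` the same way a plain melt would be filtered on its variable column.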