melt

Reshaping multiple groups of columns in a data frame from wide to long

无人久伴 submitted on 2019-12-11 06:59:36
Question: I am working with air-quality data. I tried to reshape the data frame from wide to long using the melt function. Here is the data: Elev stands for Elevation, Obs for observation, and US3, DK1, DE1 are models, where lm and ul represent the first and third quantiles.

          Elev      Obs       lm       ul      US3       lm       ul      DK1       lm       ul
    1        0 37.74289 34.33422 41.27840 38.82037 35.35241 42.30042 49.31111 45.00134 53.90968
    2      100 38.14076 34.71842 41.36560 39.82727 36.49086 43.22209 50.46545 45.79068 55.44664
    3      250 39.31056 35.98180 42
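
The post is truncated, so the exact long layout being asked for is unknown. Below is a minimal, hedged sketch of one common pattern for this kind of table; the renamed columns (Obs_lm, US3_lm, ...) and the model/stat split are assumptions, not part of the original question.

    # Hedged sketch, not the thread's accepted answer. It assumes the wide table
    # `df` has first been given unique column names, e.g.
    #   Elev, Obs, Obs_lm, Obs_ul, US3, US3_lm, US3_ul, DK1, DK1_lm, DK1_ul
    # (the repeated lm/ul names in the post would need make.unique() or a manual
    # rename before this will work).
    library(reshape2)
    library(tidyr)

    long <- melt(df, id.vars = "Elev")            # one row per Elev x column
    long$variable <- as.character(long$variable)
    long <- separate(long, variable,
                     into = c("model", "stat"),
                     sep = "_", fill = "right")   # "US3_lm" -> model "US3", stat "lm"
    long$stat[is.na(long$stat)] <- "estimate"     # bare columns (Obs, US3, ...) hold the point values
    head(long)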

Melt an array and make numeric values character

…衆ロ難τιáo~ submitted on 2019-12-11 05:47:56
Question: I have an array and I want to melt it based on the dimnames. The problem is that the dimension names are large numeric values, so converting them to character turns them into the wrong IDs. See the example:

    test <- array(1:18, dim = c(3,3,2),
                  dimnames = list(c(00901291282245454545454, 329293929929292, 2929992929922929),
                                  c("a", "b", "c"),
                                  c("d", "e")))
    library(reshape2)
    library(data.table)
    test2 <- data.table(melt(test))
    test2[, Var1 := as.character(Var1)]
    > test2
       Var1 Var2 Var3 value
    1:    9
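
A hedged sketch of one workaround (the thread's accepted answer may differ): the IDs are already damaged before melt() runs, because numbers like 00901291282245454545454 cannot be represented exactly as doubles. Supplying the dimnames as character strings, and telling melt() not to re-convert them, keeps them intact; the as.is argument assumes a reasonably recent reshape2.

    library(reshape2)
    library(data.table)

    test <- array(1:18, dim = c(3, 3, 2),
                  dimnames = list(c("901291282245454545454",   # quoted, so stored verbatim
                                    "329293929929292",
                                    "2929992929922929"),
                                  c("a", "b", "c"),
                                  c("d", "e")))

    # as.is = TRUE stops melt() from running type.convert() on the dimnames,
    # so Var1 comes back as the exact character IDs
    test2 <- data.table(melt(test, as.is = TRUE))
    head(test2)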

Explode a row to multiple rows in pandas dataframe

不打扰是莪最后的温柔 submitted on 2019-12-11 01:54:14
Question: I have a dataframe with the following header: id, type1, ..., type10, location1, ..., location10, and I want to convert it as follows: id, type, location. I managed to do this using nested for loops, but it's very slow:

    new_format_columns = ['ID', 'type', 'location']
    new_format_dataframe = pd.DataFrame(columns=new_format_columns)
    print(data.head())
    new_index = 0
    for index, row in data.iterrows():
        ID = row["ID"]
        for i in range(1, 11):
            if row["type" + str(i)] == np.nan:
                continue
            else:
                new_row = pd

Pandas 'partial melt' or 'group melt'

隐身守侯 submitted on 2019-12-10 19:23:18
Question: I have a DataFrame like this

    >>> df = pd.DataFrame([[1,1,2,3,4,5,6],[2,7,8,9,10,11,12]],
    ...                   columns=['id', 'ax','ay','az','bx','by','bz'])
    >>> df
       id  ax  ay  az  bx  by  bz
    0   1   1   2   3   4   5   6
    1   2   7   8   9  10  11  12

and I want to transform it into something like this

       id name   x   y   z
    0   1    a   1   2   3
    1   2    a   7   8   9
    2   1    b   4   5   6
    3   2    b  10  11  12

This is an unpivot / melt problem, but I don't know of any way to melt while keeping these groups intact. I know I can create projections across the original dataframe and then concat

How to use gather_ in tidyr with variables

旧巷老猫 submitted on 2019-12-10 16:33:00
Question: I'm using tidyr together with shiny and hence need to use dynamic values in tidyr operations. However, I have trouble using gather_(), which I think was designed for exactly this case. Minimal example below:

    library(tidyr)
    df <- data.frame(name=letters[1:5], v1=1:5, v2=10:14, v3=7:11, stringsAsFactors=FALSE)

    # works fine
    df %>% gather(Measure, Qty, v1:v3)

    dyn_1 <- 'Measure'
    dyn_2 <- 'Qty'
    dyn_err <- 'v1:v3'
    dyn_err_1 <- 'v1'
    dyn_err_2 <- 'v2'

    # error
    df %>% gather_(dyn_1, dyn_2, dyn_err)
    # error
    df %
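
A minimal sketch of the usual fix (hedged; gather_() has since been deprecated in favour of gather()/pivot_longer() with tidyselect, but it matches the 2019 context of the question): the third argument must be a character vector of column names, whereas the single string 'v1:v3' is taken literally as one non-existent column.

    library(tidyr)
    library(dplyr)

    df <- data.frame(name = letters[1:5], v1 = 1:5, v2 = 10:14, v3 = 7:11,
                     stringsAsFactors = FALSE)

    dyn_1    <- "Measure"
    dyn_2    <- "Qty"
    dyn_cols <- paste0("v", 1:3)          # c("v1", "v2", "v3"), built dynamically

    df %>% gather_(dyn_1, dyn_2, dyn_cols)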

Reshape data.frame with two columns into multiple columns with data (R)

拥有回忆 submitted on 2019-12-10 15:21:32
Question: A trivial question, but I can't find the answer as of yet. I want to split the data frame column 'Year' into a set of new columns, with each year as the column name and its data below it:

    Year      FQ
    1975   3.156
    1975   8.980
    1977  10.304
    1977   7.861
    1979   4.729
    1979   7.216
    1981   4.856
    1981   3.438
    1983   9.887
    1983   3.850

desired output:

     1975   1977  1979  1981  1983
    3.156 10.304 4.729 4.856 9.887
    8.980  7.861 7.216 3.438 3.850

sample data:

    d <- structure(list(Year = structure(1:10, .Label = c("1975", "1975", "1977",
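
A minimal base R sketch (hedged; the structure() call in the post is cut off, so the sample data is rebuilt by hand here). unstack() works because every year has the same number of rows; note the resulting column names gain an X prefix because bare numbers are not syntactic names.

    d <- data.frame(Year = rep(c("1975", "1977", "1979", "1981", "1983"), each = 2),
                    FQ   = c(3.156, 8.980, 10.304, 7.861, 4.729,
                             7.216, 4.856, 3.438, 9.887, 3.850))

    # spread FQ into one column per Year
    wide <- unstack(d, FQ ~ Year)
    wide
    #   X1975  X1977 X1979 X1981 X1983
    # 1 3.156 10.304 4.729 4.856 9.887
    # 2 8.980  7.861 7.216 3.438 3.850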

R: Pivot the rows into columns and use N/A's for missing values

时间秒杀一切 submitted on 2019-12-10 13:47:10
Question: I have a dataframe that looks something like this

    NUM <- c("45", "45", "45", "45", "48", "50", "66", "66", "66", "68")
    Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
    Points <- c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
    df1 <- data.frame(NUM, Type, Points)

df1:

    +-----+------+--------+
    | NUM | TYPE | Points |
    +-----+------+--------+
    |  45 |  A   |    9.2 |
    |  45 |  F   |   60.8 |
    |  45 |  C   |   22.9 |
    |  45 |  B   | 1012.7 |
    |  48 |  D   |   18.7 |
    |  50 |  A   |   11.1 |
    |  66 |  E   |   67.2 |
    |  66 |  C   |   63.1
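
A minimal sketch of one way to get NA for the missing combinations (not necessarily the thread's accepted answer): with reshape2, casting NUM against Type leaves NA wherever a NUM/Type pair is absent.

    library(reshape2)

    NUM    <- c("45", "45", "45", "45", "48", "50", "66", "66", "66", "68")
    Type   <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
    Points <- c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
    df1 <- data.frame(NUM, Type, Points)

    # each NUM/Type pair occurs at most once, so no aggregation is needed;
    # missing pairs simply come back as NA
    dcast(df1, NUM ~ Type, value.var = "Points")
    #   NUM    A      B    C    D    E    F
    # 1  45  9.2 1012.7 22.9   NA   NA 60.8
    # 2  48   NA     NA   NA 18.7   NA   NA
    # 3  50 11.1     NA   NA   NA   NA   NA
    # 4  66   NA     NA 63.1   NA 67.2 16.7
    # 5  68   NA     NA   NA 58.4   NA   NA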

dcast with custom fun.aggregate

牧云@^-^@ submitted on 2019-12-10 10:45:43
Question: I have data that looks like this:

    sample start end gene coverage
    X       1    10   A     5
    X      11    20   A    10
    Y       1    10   A     5
    Y      11    20   A    10
    X       1    10   B     5
    X      11    20   B    10
    Y       1    10   B     5
    Y      11    20   B    10

I added additional columns:

    data$length <- (data$end - data$start + 1)
    data$ct_lt <- (data$length * data$coverage)

I reformatted my data using dcast:

    casted <- dcast(data, gene ~ sample, value.var = "coverage", fun.aggregate = mean)

So my new data looks like this:

    gene        X        Y
    A    10.00000 10.00000
    B    38.33333 38.33333

This is the correct
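
The post is cut off, so the intended aggregate is an assumption; the ct_lt (= length * coverage) column suggests a length-weighted mean coverage per gene and sample. Since dcast()'s fun.aggregate only sees the single value.var column, one hedged workaround is to cast the weighted sum and the total length separately and divide.

    library(reshape2)

    data <- data.frame(sample   = rep(c("X", "X", "Y", "Y"), 2),
                       start    = rep(c(1, 11), 4),
                       end      = rep(c(10, 20), 4),
                       gene     = rep(c("A", "B"), each = 4),
                       coverage = rep(c(5, 10), 4))
    data$length <- data$end - data$start + 1
    data$ct_lt  <- data$length * data$coverage

    num <- dcast(data, gene ~ sample, value.var = "ct_lt",  fun.aggregate = sum)
    den <- dcast(data, gene ~ sample, value.var = "length", fun.aggregate = sum)

    # length-weighted mean coverage per gene and sample
    weighted <- cbind(num[1], num[-1] / den[-1])
    weighted
    #   gene   X   Y
    # 1    A 7.5 7.5
    # 2    B 7.5 7.5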

How can I add rows for all dates between two columns?

耗尽温柔 submitted on 2019-12-10 03:59:39
Question:

    import pandas as pd
    mydata = [{'ID' : '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016'},
              {'ID' : '20', 'Entry Date': '10/10/2016', 'Exit Date': '18/10/2016'}]
    mydata2 = [{'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '10/10/2016'},
               {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '11/10/2016'},
               {'ID': '10', 'Entry Date': '10/10/2016', 'Exit Date': '15/10/2016', 'Date': '12/10/2016'},
               {'ID': '10', 'Entry Date': '10/10/2016',

Combine multiple rows with the same field in R

主宰稳场 submitted on 2019-12-08 15:01:31
Question: I have a dataset as follows

    V1 <- c(5, 5, 5, 45, 45, 77)
    V2 <- c("low", "low", "medium", "low", "low", "high")
    V3 <- c(10, 3, 6, 10, 3, 1)
    df <- cbind.data.frame(V1, V3, V2)

    V1 V3     V2
     5 10    low
     5  3    low
     5  6 medium
    45 10    low
    45  3    low
    77  1   high

I want it to be

    V1 low medium high
     5  13      6    0
    45  13      0    0
    77   0      0    1

I have tried with cast/melt with little success.

Answer 1: Using reshape2, as Frank answered in the comments:

    library(reshape2)
    dcast(df, V1 ~ V2, value.var = "V3", fun = sum, fill = 0)

Output:

    V1 high low
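
As a small aside not taken from the thread, base R's xtabs() produces the same table without reshape2: with a left-hand side in the formula it sums that variable, and absent combinations are filled with 0.

    V1 <- c(5, 5, 5, 45, 45, 77)
    V2 <- c("low", "low", "medium", "low", "low", "high")
    V3 <- c(10, 3, 6, 10, 3, 1)
    df <- cbind.data.frame(V1, V3, V2)

    # sum V3 within each V1 x V2 cell; empty cells become 0
    xtabs(V3 ~ V1 + V2, data = df)
    #     V2
    # V1   high low medium
    #   5     0  13      6
    #   45    0  13      0
    #   77    1   0      0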