reshape | 易学教程

Aggregating factor level counts - by factor

Question: I have been trying to make a table displaying the counts of one factor's levels broken down by another factor. I looked through dozens of pages and questions and tried functions from several packages (dplyr, reshape), but could not get any of them to work correctly. This is what I have so far:

# my data:
var1 <- c("red","blue","red","blue","red","red","red","red","red","red","red","red","blue","red","blue")
var2 <- c("0","1","0","0","0","0","0","0","0","0","1","0","0","0","0")
var3 <- c("2","2","1",
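
The question is about R, but the underlying operation (counting the levels of one categorical variable within each level of another) is a cross-tabulation. As a rough illustration only, here is a minimal pandas sketch using the `var1` and `var2` vectors from the excerpt; `var3` is cut off in the source and is therefore left out. The variable names `df` and `counts` are chosen for this example.

```python
import pandas as pd

# var1 and var2 as given in the question; var3 is truncated in the excerpt
var1 = ["red", "blue", "red", "blue", "red", "red", "red", "red",
        "red", "red", "red", "red", "blue", "red", "blue"]
var2 = ["0", "1", "0", "0", "0", "0", "0", "0",
        "0", "0", "1", "0", "0", "0", "0"]

df = pd.DataFrame({"var1": var1, "var2": var2})

# Rows are the levels of var1, columns are the levels of var2,
# and each cell is the number of observations with that combination
counts = pd.crosstab(df["var1"], df["var2"])
print(counts)
```

In R itself, base `table()` applied to the two vectors produces the same kind of contingency table.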

Reshaping database using reshape package

Question: I would like to reshape some rows of my database. In particular, some rows are repeated for the same value of the Id column, and I would like to convert those rows into columns. The code below gives an example of my database. I have tried t() and reshape, but they do not do what I want. Can anyone give me any suggestions?

test<-data.frame(Id=c(1,1,2,3), St=c(20,80,80,20), gap=seq(0.02,0.08,by=0.02), gip=c(0.23,0.60,0.86,2.09), gat=c(0.0107,0.989,0.337,0.663))

Answer 1: setNames(data.frame(t(test))

How can I go from wide to long while coupling numbered columns with one another?

Question: I have a dataset that looks like this:

phrase   wo1sp  wo2sp  wo3sp  wo1sc  wo2sc  wo3sc
hello    dan    mark   todd   10     5      4
hello    mark   dan    chris  8      9      4
goodbye  mark   dan    kev    2      4      10
what     kev    dan    mark   4      5      5

And I'd like to change it to something like this:

phrase   sp     sc
hello    dan    10
hello    mark   5
hello    todd   4
hello    mark   8
hello    dan    9
hello    chris  4
goodbye  mark   2
goodbye  dan    4
goodbye  kev    10
what     kev    4
what     dan    5
what     mark   5

So, I know the first thing to do here is group_by(phrase). What I'm not sure about is how
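
The question is asked in R (dplyr), and the excerpt does not show an answer. To make the target transformation concrete, here is a hedged pandas sketch that pairs each `woNsp` column with its `woNsc` partner and stacks the pairs. The helper columns `row` and `pair` are introduced only for this example, to restore the row-by-row order shown in the desired output.

```python
import pandas as pd

df = pd.DataFrame({
    "phrase": ["hello", "hello", "goodbye", "what"],
    "wo1sp": ["dan", "mark", "mark", "kev"],
    "wo2sp": ["mark", "dan", "dan", "dan"],
    "wo3sp": ["todd", "chris", "kev", "mark"],
    "wo1sc": [10, 8, 2, 4],
    "wo2sc": [5, 9, 4, 5],
    "wo3sc": [4, 4, 10, 5],
})

parts = []
for i in (1, 2, 3):
    # Take the i-th (sp, sc) pair and give it the common column names
    part = df[["phrase", f"wo{i}sp", f"wo{i}sc"]].copy()
    part.columns = ["phrase", "sp", "sc"]
    # Remember the source row and the pair number so order can be restored
    parts.append(part.assign(row=df.index, pair=i))

long_df = (
    pd.concat(parts)
    .sort_values(["row", "pair"])
    .drop(columns=["row", "pair"])
    .reset_index(drop=True)
)
print(long_df)
```

Sorting by the original row and then by pair number keeps the three pairs of each input row together, which matches the layout the question asks for.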

MATLAB reshape matrix converting indices to row index

Question: Is it possible to reshape matrices such that

x1 =
    1  5
    3  4
    4  3
    7  1

becomes

x2 =
    5
    NaN
    4
    3
    NaN
    NaN
    1

or vice versa, where the first column in x1 is an index that corresponds to a row number in x2?

Answer 1: Create an array of NaNs and fill it with the values:

x2 = NaN(max(x1(:,1)),1);
x2(x1(:,1)) = x1(:,2);

Now, if zero padding is acceptable, you can simply use the second line directly without first creating x2. Alternatively, for your specific example (no overlapping indices) the same result is
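
For readers working outside MATLAB, the same index-to-row idea can be sketched in NumPy. This is an illustrative translation, not part of the original answer; note that NumPy indexes from 0, so the 1-based indices in the first column are shifted by one. The names `rows`, `idx` and `x1_back` are made up for this example.

```python
import numpy as np

x1 = np.array([[1, 5],
               [3, 4],
               [4, 3],
               [7, 1]])

# Forward direction: allocate NaNs, then scatter the values by index
rows = x1[:, 0] - 1                      # convert 1-based to 0-based
x2 = np.full(x1[:, 0].max(), np.nan)
x2[rows] = x1[:, 1]
print(x2)

# Reverse direction: recover (index, value) pairs from the non-NaN entries
idx = np.flatnonzero(~np.isnan(x2))
x1_back = np.column_stack([idx + 1, x2[idx].astype(int)])
print(x1_back)
```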

Reshape potentially very large 1D-array into multidimensional matrix with variable dimensions

Question: I have to postprocess data from a parametric analysis whose output is a 1D array of results. I would like to reshape this 1D array into a multidimensional matrix whose dimensions correspond to my investigated parameters (in the right order), and the number of those dimensions may vary. I came up with a function based on for-loops, but the problem is that with very large arrays I run out of RAM. I am perfectly aware that this is not the smartest way to do this. I was wondering
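
The excerpt does not say which language the asker uses, so the following is only a NumPy sketch of the general idea: a reshape to a tuple of dimensions needs no loops and, for a contiguous array, returns a view rather than a copy. The parameter counts in `dims` and the ordering assumption (the last parameter varies fastest in the flat results) are hypothetical; if the first parameter varies fastest, `order="F"` would be the counterpart.

```python
import numpy as np

# Hypothetical study: 3 parameters with 2, 3 and 4 values respectively.
# The tuple can have any length, so the number of dimensions may vary.
dims = (2, 3, 4)

# Stand-in for the flat results vector produced by the analysis
results = np.arange(int(np.prod(dims)), dtype=float)

# No loops: reinterpret the same buffer with the new shape
grid = results.reshape(dims)

# The reshaped array is a view on the original data, not a copy
print(np.shares_memory(results, grid))

# Result for parameter indices (1, 2, 3)
print(grid[1, 2, 3])
```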

Making more sense of the reshape() function

Question: Consider the following dataset, which is taken from this question: Going from wide to long w/ coupled-columns: Is there a more R way to do this (i.e. - without using a for loop)? (dput at the end of the question)

  phrase   wo1sp  wo2sp  wo3sp  wo1sc  wo2sc  wo3sc
1 hello    dan    mark   todd   10     5      4
2 hello    mark   dan    chris  8      9      4
3 goodbye  mark   dan    kev    2      4      10
4 what     kev    dan    mark   4      5      5

The goal is to reshape the data from wide to long, taking into account that there is some pattern in the column names. The

Getting a tuple in a DataFrame into multiple rows

Question: I have a DataFrame with two columns (Customer, Transactions). The Transactions column holds a tuple of all the transaction IDs of that customer.

Customer  Transactions
1         (a,b,c)
2         (d,e)

I want to convert this into a DataFrame with one row per customer and transaction ID, like this:

Customer  Transactions
1         a
1         b
1         c
2         d
2         e

We can do it using loops, but is there a straightforward one- or two-line way of doing that?

Answer 1: You can use DataFrame constructor: df = pd.DataFrame({'Customer':[1,2], 'Transactions
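
The answer in the excerpt is cut off, so its constructor-based approach cannot be reproduced here. As a separate, hedged alternative, pandas 0.25 and later provide `DataFrame.explode`, which turns each element of a list-like cell into its own row. The variable name `out` is chosen for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": [1, 2],
    "Transactions": [("a", "b", "c"), ("d", "e")],
})

# One row per (Customer, transaction id); the Customer value is repeated
out = df.explode("Transactions").reset_index(drop=True)
print(out)
```

`explode` keeps the original index on the repeated rows, which is why the sketch resets it afterwards.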

R DataTable Join and constrain rows

Question: I'd like to summarize a set of observations in a datatable and could use some help with the syntax. I think this is as simple as a join, but I'm trying to identify that specific values were seen on a specific observation DAY, even if it's across multiple measurements or sensors on that day. Observations are summarized by date. Observation dates have varied counts of measurements (rows per date). 'M'easurement columns indicate that a specific value was observed in ANY sensor for the day. I've

Parsing Data From Long to Wide Format in Python

Question: I'm wondering what the best way to parse long-form data into wide form is in Python. I've previously been doing this sort of task in R, but it really is taking too long, as my files can be upwards of 1 GB. Here is some dummy data:

Sequence  Position  Strand  Score
Gene1     0         +       1
Gene1     1         +       0.25
Gene1     0         -       1
Gene1     1         -       0.5
Gene2     0         +       0
Gene2     1         +       0.1
Gene2     0         -       0
Gene2     1         -       0.5

But I'd like to have it in the wide form, where I've summed the scores over the strands at each position. Here is output I hope for:
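
The desired output is cut off in the excerpt, so the exact target layout is unknown. Assuming the goal is one row per Sequence and one column per Position, with the Score summed across strands, a pivot table is one way to sketch it in pandas. The names `long_df` and `wide_df` are chosen for this example.

```python
import pandas as pd

long_df = pd.DataFrame({
    "Sequence": ["Gene1"] * 4 + ["Gene2"] * 4,
    "Position": [0, 1, 0, 1, 0, 1, 0, 1],
    "Strand": ["+", "+", "-", "-", "+", "+", "-", "-"],
    "Score": [1, 0.25, 1, 0.5, 0, 0.1, 0, 0.5],
})

# Rows: Sequence, columns: Position, cells: Score summed over both strands
wide_df = long_df.pivot_table(
    index="Sequence", columns="Position", values="Score", aggfunc="sum"
)
print(wide_df)
```

For files around 1 GB, reading only the needed columns (for example with the `usecols` argument of `pd.read_csv`) may help keep memory use down before pivoting.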

Explode a row to multiple rows in pandas dataframe

Question: I have a dataframe with the following header:

id, type1, ..., type10, location1, ..., location10

and I want to convert it as follows:

id, type, location

I managed to do this using nested for loops, but it's very slow:

new_format_columns = ['ID', 'type', 'location']
new_format_dataframe = pd.DataFrame(columns=new_format_columns)
print(data.head())
new_index = 0
for index, row in data.iterrows():
    ID = row["ID"]
    for i in range(1,11):
        if row["type"+str(i)] == np.nan:
            continue
        else:
            new_row = pd
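
Two remarks on the loop above. First, a comparison such as `x == np.nan` is always False, because NaN never compares equal to anything, so missing values need `pd.isna` instead. Second, the row-by-row loop can usually be replaced by a vectorized reshape. The following is a hedged sketch using `pd.wide_to_long`, shortened to two type/location pairs with made-up values; it assumes `ID` uniquely identifies each input row, and the name `n` for the suffix column is chosen only for this example.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real data: two pairs instead of ten
data = pd.DataFrame({
    "ID": [1, 2],
    "type1": ["a", "c"],
    "type2": ["b", np.nan],
    "location1": ["x", "z"],
    "location2": ["y", np.nan],
})

long_df = (
    # Columns named <stub><number> are stacked into one column per stub
    pd.wide_to_long(data, stubnames=["type", "location"], i="ID", j="n")
    .dropna(subset=["type"])      # drop pairs with no type, as the loop intends
    .sort_index()                 # order by ID, then by pair number
    .reset_index()
    [["ID", "type", "location"]]
)
print(long_df)
```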