reshape | 易学教程

Compute mean and standard deviation by group for multiple variables in a data.frame

Question: Edit: this question was originally titled "Long to wide data reshaping in R".

I'm just learning R and trying to find ways to apply it to help out others in my life. As a test case, I'm working on reshaping some data, and I'm having trouble following the examples I've found online. What I'm starting with looks like this:

    ID  Obs 1  Obs 2  Obs 3
    1   43     48     37
    1   27     29     22
    1   36     32     40
    2   33     38     36
    2   29     32     27
    2   32     31     35
    2   25     28     24
    3   45     47     42
    3   38     40     36

And what I want to end up with will look like this
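The target layout is cut off in the excerpt, so the following is only a sketch of what the title asks for: mean and standard deviation per ID for each Obs column. It is written in plain Python rather than R, using the rows shown in the question. The names `rows`, `columns`, `groups` and `summary` are illustrative and not from the original post. `statistics.stdev` is the sample standard deviation (n - 1 denominator), which is also what R's `sd()` computes.

```python
from collections import defaultdict
from statistics import mean, stdev

# Rows copied from the question: (ID, Obs 1, Obs 2, Obs 3)
rows = [
    (1, 43, 48, 37), (1, 27, 29, 22), (1, 36, 32, 40),
    (2, 33, 38, 36), (2, 29, 32, 27), (2, 32, 31, 35), (2, 25, 28, 24),
    (3, 45, 47, 42), (3, 38, 40, 36),
]
columns = ("Obs 1", "Obs 2", "Obs 3")

# Collect the observation values of every row under its ID
groups = defaultdict(list)
for group_id, *values in rows:
    groups[group_id].append(values)

# For each ID, transpose its rows into columns and summarise each column
summary = {}
for group_id, members in groups.items():
    summary[group_id] = {
        name: (mean(col), stdev(col))
        for name, col in zip(columns, zip(*members))
    }

for group_id, stats in summary.items():
    for name, (m, s) in stats.items():
        print(f"ID {group_id} {name}: mean={m:.2f} sd={s:.2f}")
```

Each ID ends up with one (mean, sd) pair per variable, which is the wide, one-row-per-group shape the original title ("long to wide") hints at.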

Subsetting R data frame results in mysterious NA rows

Question: I've been encountering what I think is a bug. It's not a big deal, but I'm curious if anyone else has seen this. Unfortunately, my data is confidential, so I have to make up an example, and it's not going to be very helpful. When subsetting my data, I occasionally get mysterious NA rows that aren't in my original data frame. Even the row names are NA. For example:

    example <- data.frame("var1"=c("A", "B", "A"), "var2"=c("X", "Y", "Z"))
    example
      var1 var2
    1    A    X
    2    B    Y
    3    A    Z

Then I run: example[example

What does -1 mean in numpy reshape?

Question: A numpy matrix can be reshaped into a vector using the reshape function with the parameter -1, but I don't know what -1 means here. For example:

    a = numpy.matrix([[1, 2, 3, 4], [5, 6, 7, 8]])
    b = numpy.reshape(a, -1)

The result of b is:

    matrix([[1, 2, 3, 4, 5, 6, 7, 8]])

Does anyone know what -1 means here? It also seems that Python assigns -1 several meanings, such as array[-1], which means the last element. Can you give an explanation?

Answer 1: The criterion to satisfy for providing the new shape is that 'The new
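The answer is cut off in the excerpt, so here is a small sketch of the behaviour in question. In a reshape, -1 marks the one dimension whose length NumPy should work out from the total element count and the other dimensions. It has nothing to do with negative indexing such as array[-1]. The example uses a plain `np.array` instead of the question's `numpy.matrix`; a matrix always stays two-dimensional, which is why the question's result has shape (1, 8) rather than (8,). The names `flat`, `tall` and `wide` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # 8 elements, shape (2, 4)

flat = a.reshape(-1)     # -1 alone: a single axis holding all 8 elements
tall = a.reshape(4, -1)  # 4 rows given, so 8 / 4 = 2 columns are inferred
wide = a.reshape(-1, 4)  # 4 columns given, so 8 / 4 = 2 rows are inferred

print(flat.shape, tall.shape, wide.shape)

# The known dimensions must divide the element count exactly;
# 8 elements cannot be split into 3 equal rows, so this raises ValueError.
try:
    a.reshape(3, -1)
except ValueError as err:
    print("cannot infer a dimension:", err)
```

Only one dimension per call can be left as -1, since two unknowns would leave the shape ambiguous.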

Easy way to convert long to wide format with counts [duplicate]

Question: This question already has answers here: Faster ways to calculate frequencies and cast from long to wide (4 answers). Closed last year.

I have the following data set:

    sample.data <- data.frame(
      Step = c(1,2,3,4,1,2,1,2,3,1,1),
      Case = c(1,1,1,1,2,2,3,3,3,4,5),
      Decision = c("Referred","Referred","Referred","Approved","Referred","Declined","Referred","Referred","Declined","Approved","Declined"))
    sample.data
      Step Case Decision
    1    1    1 Referred
    2    2    1 Referred
    3    3    1 Referred
    4    4    1 Approved
    5    1    2

Intuition and idea behind reshaping 4D array to 2D array in NumPy

Question: While implementing a Kronecker product for pedagogical reasons (without using the obvious and readily available np.kron()), I obtained a 4-dimensional array as an intermediate result, which I have to reshape to get the final result. But I still can't wrap my head around reshaping these high-dimensional arrays. I have this 4D array:

    array([[[[ 0,  0],
             [ 0,  0]],

            [[ 5, 10],
             [15, 20]]],


           [[[ 6, 12],
             [18, 24]],

            [[ 7, 14],
             [21, 28]]]])

This is of shape (2, 2, 2, 2) and I'd like to reshape it to (4,4)
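The excerpt stops before any answer, so what follows is one possible sketch, not the accepted solution. It assumes the desired (4, 4) result is the usual Kronecker product layout. The factors `A` and `B` are not given in the excerpt; they are inferred from the values of the 4D array (each block `arr[i, j]` equals `A[i, j] * B`) and are used only to check the result against `np.kron`.

```python
import numpy as np

# 4D array copied from the question, shape (2, 2, 2, 2)
arr = np.array([[[[0, 0], [0, 0]], [[5, 10], [15, 20]]],
                [[[6, 12], [18, 24]], [[7, 14], [21, 28]]]])

# Assumed factors, inferred from the values above: arr[i, j] == A[i, j] * B
A = np.array([[0, 5], [6, 7]])
B = np.array([[1, 2], [3, 4]])

# The axes of arr are (i, j, k, l). In the Kronecker layout, rows are indexed
# by the pair (i, k) and columns by (j, l), so the two middle axes must be
# swapped before each pair of axes is merged by reshape.
result = arr.transpose(0, 2, 1, 3).reshape(4, 4)

print(result)
print(np.array_equal(result, np.kron(A, B)))
```

Calling `arr.reshape(4, 4)` directly would merge (i, j) into rows and (k, l) into columns, which lays each 2x2 block out as a single row instead of keeping it as a block. Reshape never reorders elements in memory order, so any reordering has to be done with a transpose first.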

dcast error: ‘Aggregation function missing: defaulting to length’

Question: My df looks like this:

    Id  Task  Type  Freq
    3   1     A     2
    3   1     B     3
    3   2     A     3
    3   2     B     0
    4   1     A     3
    4   1     B     3
    4   2     A     1
    4   2     B     3

I want to restructure by Id and get:

    Id  A  B  …  Z
    3   5  3
    4   4  6

I tried:

    df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")

and got the error: Aggregation function missing: defaulting to length

I can't figure out what to put in fun.aggregate. What's the problem?

Answer 1: The reason why you are getting this warning is in the description of fun.aggregate (see ?dcast): aggregation
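The answer is truncated here, so this is a plain-Python sketch of the underlying idea, not R's `dcast` itself. When several long-format rows fall into the same wide-format cell, some function has to reduce them to one value. Counting them (`len`) is the fallback named in the message, while the desired output in the question matches summing Freq across Task. The helper `cast_wide` and the variable names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Rows copied from the question: (Id, Task, Type, Freq)
rows = [
    (3, 1, "A", 2), (3, 1, "B", 3), (3, 2, "A", 3), (3, 2, "B", 0),
    (4, 1, "A", 3), (4, 1, "B", 3), (4, 2, "A", 1), (4, 2, "B", 3),
]

def cast_wide(rows, aggregate):
    """Collect Freq values per (Id, Type) cell, then reduce each cell."""
    cells = defaultdict(list)
    for id_, _task, type_, freq in rows:
        cells[(id_, type_)].append(freq)

    wide = defaultdict(dict)
    for (id_, type_), values in sorted(cells.items()):
        wide[id_][type_] = aggregate(values)
    return dict(wide)

counts = cast_wide(rows, len)  # what "defaulting to length" amounts to
sums = cast_wide(rows, sum)    # the totals shown in the desired output

print(counts)
print(sums)
```

With `len`, every cell just reports how many rows landed in it. With `sum`, Id 3 gets A = 5 and B = 3, and Id 4 gets A = 4 and B = 6, matching the table the asker wants.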

How to reshape data table after applying multiple functions to multiple variables?

Question: I have the following sample data:

        Hostname           Date-Time hdisk86 hdisk88 hdisk90 hdisk89 hdisk91 hdisk92 hdisk93 hdisk94 hdisk96 hdisk95
    1: hostname1 2015-01-26 00:15:22       0       0       0       0       0       0       0       0       0       0
    2: hostname1 2015-01-26 00:30:24       0       0       0       0       0       0       0       0       0       0
    3: hostname1 2015-01-26 00:45:25       0       0       0       0       0       0       0       0       0       0
    4: hostname1 2015-01-26 01:00:25       0       0       0       0       0       0       0       0       0       0
    5: hostname1 2015-01-26 01:15:28       0       0       0       0       0       0       0       0       0       0
    6: hostname1 2015-01-26 01:30:29       0       0       0       0       0       0       0       0       0       0
       hdisk98 hdisk97 hdisk99 hdisk100 hdisk101

formatting multi-row data into single row in R

Question: I have a strangely formatted Excel or CSV file which I want to import into R as a data frame. The problem is that some columns have multiple rows per record. For example, the data below has three columns and two records, but the Tools column has multiple values. Is there a way I can format the data so that I have only one record with multiple tools (say tool1, tool2, etc.)?

    Task Location Tools Raising ticket Alabama sharepoint word oracle Changing ticket Seattle word oracle Final

Intradataframe Analysis--creating a derivative data frame from another data frame

Question: This may be a somewhat obtuse question title, since I'm still getting up to speed with R. I'm doing some data frame manipulation to extract percentages for classification groups: one column, a factor, defines the groups, and another column holds the values I want percentages of. I'll use the built-in mtcars to demonstrate what I'm trying to achieve, where gear plays the role of the classification variable and cyl is the data I'm trying to get percentages from.