aggregate

combine two data frames and aggregate

☆樱花仙子☆ submitted on 2021-02-05 06:35:07
Question: I have two data frames in the format below:

dt1
id   col1  col2  col3  col4
___  ____  ____  ____  ____
1    2     3     1     2
2    3     4     1     1
3    1     1     1     1
4    1     2     1     2
5    1     1     1     1
6    1     2     1     2

dt2
id   col1  col2  col3  col4
___  ____  ____  ____  ____
1    1     3     1     2
2    3     4     1     0
4    1     1     1     1
6    1     2     1     2
9    2     1     1     1
12   1     2     1     2

and I want to combine these two data frames and aggregate them by id, so that the resulting data frame looks like:

dt3
id   col1  col2  col3  col4
___  ____  ____  ____  ____
1    3     6     2     4
2    6     8     2     1
3    1     1     1     1
4    2     3     2     3
5    1     1     1     1
6    2     4     2     4
9    2     1     1     1
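The combine-then-sum step described in this question can be sketched in plain Python. This is a minimal illustration, not the asker's solution: the function name `combine_and_sum` and the tuple layout `(id, col1, col2, col3, col4)` are assumptions made for the example, and the rows are copied from dt1 and dt2 above.

```python
from collections import defaultdict

COLUMNS = ("col1", "col2", "col3", "col4")

# Rows copied from dt1 and dt2 in the question: (id, col1, col2, col3, col4).
dt1 = [(1, 2, 3, 1, 2), (2, 3, 4, 1, 1), (3, 1, 1, 1, 1),
       (4, 1, 2, 1, 2), (5, 1, 1, 1, 1), (6, 1, 2, 1, 2)]
dt2 = [(1, 1, 3, 1, 2), (2, 3, 4, 1, 0), (4, 1, 1, 1, 1),
       (6, 1, 2, 1, 2), (9, 2, 1, 1, 1), (12, 1, 2, 1, 2)]


def combine_and_sum(*tables):
    """Stack the tables row-wise, then sum every value column per id."""
    totals = defaultdict(lambda: [0] * len(COLUMNS))
    for table in tables:
        for row_id, *values in table:
            for i, value in enumerate(values):
                totals[row_id][i] += value
    # Return rows sorted by id, in the same tuple layout as the inputs.
    return [(row_id, *totals[row_id]) for row_id in sorted(totals)]


dt3 = combine_and_sum(dt1, dt2)
for row in dt3:
    print(*row)
```

Ids that occur in only one input (3, 5, 9, 12) pass through unchanged, because summing a single row returns that row.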

Aggregate and sum by one key and rest

谁说胖子不能爱 submitted on 2021-02-04 08:27:36
Question: It's hard for me to explain what I want to get, so I will show an example. I have these objects:

{name: 'steve', received: 100}
{name: 'carolina', received: 70}
{name: 'steve', received: 30}
{name: 'andrew', received: 10}

I can do:

{ $group: { _id: '$name', sum: { $sum: '$received' } } }

And I will get:

Steve received 130 (100 + 30)
Carolina received 70
Andrew received 10

But I need something like this:

Steve received 130 (100 + 30)
Everyone else received 80 (70 + 10)

How can I get this effect
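One way to get "one key versus the rest" is to group on a computed key instead of on the raw name. The sketch below emulates that in plain Python on the four objects from the question. The `pipeline` value shows an assumed MongoDB equivalent using `$cond` inside the `$group` `_id`; it is included for illustration only and is not run here. The function name `sum_for_key_and_rest` and the label "everyone else" are hypothetical.

```python
from collections import defaultdict

# The four objects from the question.
docs = [
    {"name": "steve", "received": 100},
    {"name": "carolina", "received": 70},
    {"name": "steve", "received": 30},
    {"name": "andrew", "received": 10},
]

# Assumed MongoDB equivalent (not executed here): group on a computed _id
# that keeps the name for 'steve' and collapses all other names into one label.
pipeline = [
    {"$group": {
        "_id": {"$cond": [{"$eq": ["$name", "steve"]}, "$name", "everyone else"]},
        "sum": {"$sum": "$received"},
    }}
]


def sum_for_key_and_rest(documents, key_name, rest_label="everyone else"):
    """Sum 'received' for key_name and, separately, for all other names."""
    totals = defaultdict(int)
    for doc in documents:
        bucket = doc["name"] if doc["name"] == key_name else rest_label
        totals[bucket] += doc["received"]
    return dict(totals)


result = sum_for_key_and_rest(docs, "steve")
print(result)
```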

How to use “Named aggregation” [duplicate]

拥有回忆 submitted on 2021-02-02 09:51:07
Question: This question already has answers here: Multiple aggregations of the same column using pandas GroupBy.agg() (3 answers). Closed 1 year ago.

I want to apply two different aggregates to the same column in a pandas DataFrameGroupBy and have the new columns be named. I've tried using what is shown in the documentation here: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#named-aggregation

In [82]: animals.groupby("kind").agg(
   ....:     min_height=('height', 'min'),
   ....:     max

Grouping rows aggregate and function in r

[亡魂溺海] submitted on 2021-01-29 19:01:31
Question: I am new to R and I wanted to aggregate the following matrix:

   k  n   m     s
1  g  10  11.8  2.4
2  g  20  15.3  3.2
3  g  15  8.4   4.1
4  r  14  3.0   5.0
5  r  16  6.0   7.0
6  r  5   8.0   15.0

Results:

   k  n         s         m
1  g  15        3.233333  7.31667
2  r  11.66667  9         4.16667

This was my attempt:

k <- c("g", "g", "g", "r", "r", "r")
n <- c(10, 20, 15, 14, 16, 5)
m <- c(11.8, 15.3, 8.4, 3, 6, 8)
s <- c(2.4, 3.2, 4.1, 5, 7, 15)
data1 <- data.frame(k, n, m, s)
data2 <- aggregate(m ~ k, FUN = function(t) ********* , data=data1)

I am more interested in m here is
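The grouping step itself can be sketched in plain Python as "split by k, then apply a function to each group". In the results shown, the n and s columns are consistent with per-group means of the data, so the sketch reproduces only those two. The function applied to m is elided in the question and is deliberately not guessed here. The helper name `aggregate_by` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Data copied from the question's vectors k, n, m and s.
rows = [
    {"k": "g", "n": 10, "m": 11.8, "s": 2.4},
    {"k": "g", "n": 20, "m": 15.3, "s": 3.2},
    {"k": "g", "n": 15, "m": 8.4, "s": 4.1},
    {"k": "r", "n": 14, "m": 3.0, "s": 5.0},
    {"k": "r", "n": 16, "m": 6.0, "s": 7.0},
    {"k": "r", "n": 5, "m": 8.0, "s": 15.0},
]


def aggregate_by(records, key, column, fn):
    """Group records by `key` and apply `fn` to the values of `column`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[column])
    return {group: fn(values) for group, values in groups.items()}


mean_n = aggregate_by(rows, "k", "n", mean)
mean_s = aggregate_by(rows, "k", "s", mean)
print(mean_n)
print(mean_s)
```

Any other per-group function (for example a custom one for m) can be passed as `fn` in place of `mean`.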

Aggregate count of timeseries values which exceed threshold, by year-month

别说谁变了你拦得住时间么 submitted on 2021-01-29 18:10:41
Question: I am now learning R and using the SEAS package to help me with some calculations, and my data is in the format the SEAS package expects. It is a time series:

require(seas)
data(mscdata)
dat.int <- (mksub(mscdata, id=1108447))

The data covers 20 years and has the following headings:

year yday date t_max t_min t_mean rain snow precip

However, I now need to calculate the number of days in each month on which rainfall is >= 1.0 mm. So at the end of it, I would have two columns (each month in each year and total
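The count described here, days per year-month on which rainfall is at least 1.0 mm, can be sketched in plain Python. The sample records below are hypothetical and are not taken from mscdata; only the `date` and `rain` fields from the question's headings are modelled, and the function name `wet_days_per_month` is an assumption for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records (date, rain in mm); not taken from mscdata.
records = [
    (date(2000, 1, 1), 0.0),
    (date(2000, 1, 2), 1.0),
    (date(2000, 1, 3), 5.2),
    (date(2000, 2, 1), 0.8),
    (date(2000, 2, 2), 3.4),
    (date(2000, 3, 1), 0.2),
    (date(2001, 1, 1), 1.5),
]


def wet_days_per_month(rows, threshold=1.0):
    """Count, per (year, month), the days with rain >= threshold."""
    counts = Counter()
    for day, rain in rows:
        # Adding a bool adds 1 or 0, so months with no wet days still appear.
        counts[(day.year, day.month)] += rain >= threshold
    return dict(sorted(counts.items()))


monthly = wet_days_per_month(records)
for (year, month), total in monthly.items():
    print(year, month, total)
```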

Dataframe aggregation of n-gram, their frequency and associate the entries of other columns with it using R

人盡茶涼 submitted on 2021-01-29 15:48:18
Question: I am trying to aggregate a dataframe based on 1-gram frequency (this can be extended to n-grams by changing n in the code below) and associate other columns with it. The way I did it is shown below. Are there any shortcuts or alternatives that produce the table shown at the very end of this question for the dataframe given below? The code and the results are shown below. The chunk below sets up the environment, loads the libraries and reads the dataframe:

# Clear variables in the working environment

Get apps with the highest review count since a dynamic series of days

三世轮回 submitted on 2021-01-28 21:41:12
Question: I have two tables, apps and reviews (simplified for the sake of discussion):

apps table
id int

reviews table
id int
review_date date
app_id int (foreign key that points to apps)

2 questions:

1. How can I write a query / function to answer the following question? Given a series of dates from the earliest reviews.review_date to the latest reviews.review_date (incrementing by a day), for each date, D, which apps had the most reviews if the app's earliest review was on or later than D? I think

'dict' object has no attribute 'order_by' django

点点圈 submitted on 2021-01-28 12:14:20
Question: I want to return the data of a ManyToMany field, and I've also used aggregate for some calculations; now I need to return the products as well. This is my models.py:

class CustomerInvoice(models.Model):
    customer = models.CharField(max_length=50)
    items = models.ManyToManyField(Product, through='ProductSelecte')
    date = models.DateTimeField(auto_now_add=True)

class ProductSelecte(models.Model):
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
    products = models.ForeignKey(CustomerInvoice,on

Match Two different fields in Mongoose, Aggregate?

女生的网名这么多〃 submitted on 2021-01-28 06:40:19
Question: I'm trying to match two different fields in the same document, but I didn't get the output I expected. Let me show an example. I want to match weighted.phaseId with phases._id in the same document, and entries that do not match should be removed from the phases field. Does anyone have an idea?

// Document after processing some aggregate query over a database.
{
  "_id" : ObjectId("5a680c803096130f93d11c7a"),
  "weighted" : [
    {
      "phaseId" : ObjectId("5a6734c32414e15d0c2920f0"),
      "_id" : ObjectId(
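The filtering rule stated in this question (keep only the phases whose `_id` appears as some `weighted.phaseId` in the same document) can be sketched in plain Python. The document in the question is truncated, so the sample below is hypothetical: the short string ids stand in for ObjectId values, and the shape of `phases` (a list of objects with an `_id` field) is an assumption. The function name `keep_matching_phases` is also made up for the example.

```python
def keep_matching_phases(document):
    """Return a copy of the document whose phases match a weighted.phaseId."""
    wanted = {entry["phaseId"] for entry in document["weighted"]}
    kept = [phase for phase in document["phases"] if phase["_id"] in wanted]
    # Build a new dict so the input document is left unchanged.
    return {**document, "phases": kept}


# Hypothetical document; ids are placeholders, not values from the question.
doc = {
    "_id": "doc-1",
    "weighted": [{"phaseId": "p1"}, {"phaseId": "p3"}],
    "phases": [{"_id": "p1"}, {"_id": "p2"}, {"_id": "p3"}],
}

filtered = keep_matching_phases(doc)
print([phase["_id"] for phase in filtered["phases"]])
```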

R: applying a function over a group

末鹿安然 submitted on 2021-01-27 21:00:48
Question: I am looking to apply a function to a data frame and then store the results of that function in a new column in the data frame. Here is a sample of my data frame, tradeData:

Login  AL  Diff
a      1   0
a      1   0
a      1   0
a      0   1
a      0   0
a      0   0
a      0   0
a      1   -1
a      1   0
a      0   1
a      1   -1
a      1   0
a      0   1
b      1   0
b      0   1
b      0   0
b      0   0
b      1   -1
c      1   0
c      1   0
c      0   1
c      0   0
c      1   -1

Where the "Diff" column is the column I am trying to add. It is just the difference between the values of row(x-1) and row(x) of tradeData, grouped by Login. Here are
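The Diff column in the sample is consistent with "previous AL minus current AL within each Login, with 0 for the first row of a group". A minimal plain-Python sketch of that rule follows; the function name `grouped_diff` is hypothetical, and only the b and c rows from the sample are used to keep it short.

```python
# (Login, AL) pairs for the b and c groups, copied from the sample above.
rows = [
    ("b", 1), ("b", 0), ("b", 0), ("b", 0), ("b", 1),
    ("c", 1), ("c", 1), ("c", 0), ("c", 0), ("c", 1),
]


def grouped_diff(records):
    """Append Diff = previous AL - current AL, restarting for each Login."""
    previous = {}  # last AL value seen per Login
    out = []
    for login, al in records:
        # The first row of each group has no predecessor, so its Diff is 0.
        diff = previous[login] - al if login in previous else 0
        out.append((login, al, diff))
        previous[login] = al
    return out


with_diff = grouped_diff(rows)
for login, al, diff in with_diff:
    print(login, al, diff)
```

Tracking the last value per Login in a dict means the rows do not have to be sorted by group for the rule to hold.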