group-by | 易学教程

Python Pandas Groupby/Append columns

Question: This is my example dataframe:

    Index  Param1  Param2
    A      1       2
    A      3       4
    B      1       3
    B      4       NaN
    C      2       4

What I would like to get is:

    Index  Param1  Param2  Param3  Param4
    A      1       2       3       4
    B      1       3       4
    C      2       4

What would be the best way to achieve it using pandas? Thanks in advance for your help.

Answer 1: You can use groupby with unstack:

    import numpy as np
    import pandas as pd

    def f(x):
        # Flatten the group's values and sort them; NaN sorts to the end.
        return pd.DataFrame(np.sort(x.values.ravel()))

    # Select the columns with a list; a bare tuple of keys fails on recent pandas.
    df = df.groupby('Index')[['Param1', 'Param2']].apply(f).unstack()
    df.columns = df.columns.droplevel(0)
    print(df)

           0  1  2  3
    Index
    A      1  2  3  4
    B      1  3  4

Django queryset - column IN() GROUP BY HAVING COUNT DISTINCT

Question: With the following models:

    class Post(models.Model):
        class Meta:
            db_table = "posts"

    class Tag(models.Model):
        tag = models.CharField(max_length=50)

        class Meta:
            db_table = "tags"

    class PostTag(models.Model):
        postid = models.PositiveIntegerField()
        tagid = models.PositiveIntegerField()

        class Meta:
            unique_together = ("postid", "tagid")
            db_table = "posttags"

To get the postids of posts which contain all the tagids given in TAGLIST, where TAGLEN is the number of tagids in TAGLIST:

    SELECT postid FROM
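The question's title names the query shape: `column IN (...)`, `GROUP BY`, and `HAVING COUNT(DISTINCT ...)`. The sketch below is one way to write that pattern, run through Python's `sqlite3` so it is self-contained. The sample rows and the helper name `posts_with_all_tags` are assumptions for illustration, and the SQL is a guess at the intent, not a completion of the truncated statement.

```python
import sqlite3

# Hypothetical data; table and column names follow the PostTag model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posttags (postid INTEGER, tagid INTEGER, UNIQUE (postid, tagid))"
)
conn.executemany(
    "INSERT INTO posttags (postid, tagid) VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 30), (2, 10), (3, 10), (3, 20)],
)


def posts_with_all_tags(conn, taglist):
    """Return the postids that carry every tagid in taglist."""
    tags = sorted(set(taglist))
    if not tags:
        return []
    placeholders = ", ".join("?" for _ in tags)
    sql = (
        "SELECT postid FROM posttags "
        f"WHERE tagid IN ({placeholders}) "
        "GROUP BY postid "
        # A post qualifies only if it matched as many distinct tags as were asked for.
        "HAVING COUNT(DISTINCT tagid) = ? "
        "ORDER BY postid"
    )
    return [row[0] for row in conn.execute(sql, (*tags, len(tags)))]


matching = posts_with_all_tags(conn, [10, 20])
```

In the Django ORM the same shape is commonly expressed as `PostTag.objects.filter(tagid__in=TAGLIST).values("postid").annotate(n=Count("tagid", distinct=True)).filter(n=TAGLEN)`; that line is not exercised here.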

python - Pandas: groupby ffill for multiple columns

Question: I have the following DataFrame with some missing values. I want to use ffill() to fill missing values in both var1 and var2, grouped by date and building. I can do that for one variable at a time, but when I try to do it for both, it crashes. How can I do this for both variables at once, while leaving var3 and var4 in place and unmodified?

    df = pd.DataFrame({
        'date': ['2019-01-01','2019-01-01','2019-01-01','2019-01-01','2019-02-01','2019-02-01','2019-02-01','2019-02-01'],
        'building': ['a',
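A minimal sketch of one way to forward-fill two columns per group in a single step. The original DataFrame is truncated, so every value below is made up; only the column names (`date`, `building`, `var1` to `var4`) come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's DataFrame is cut off, so these values are invented.
df = pd.DataFrame({
    "date": ["2019-01-01"] * 4 + ["2019-02-01"] * 4,
    "building": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "var1": [1.5, np.nan, 2.1, 2.2, 1.2, 1.3, 2.4, np.nan],
    "var2": [100, 110, 105, np.nan, 102, np.nan, 103, 107],
    "var3": ["x", np.nan, "y", "z", "w", "v", np.nan, "u"],
    "var4": [5, 6, 7, 8, 9, 10, 11, 12],
})

cols = ["var1", "var2"]
filled = df.copy()
# Select both columns with a list, then forward-fill within each (date, building) group.
# The result keeps the original row index, so it can be assigned straight back.
filled[cols] = df.groupby(["date", "building"])[cols].ffill()
```

Because only `var1` and `var2` are assigned, the gaps in `var3` stay as they were and `var4` is untouched.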

Can't fix this: “Cannot group by an aggregate”

Question: Sorry for the silly question. I have read a lot of threads about the same issue, but I still can't fix this.

    SELECT company_name, SUM(clicks) FROM table1 WHERE code = 'ES' GROUP BY 1 ORDER BY clicks DESC LIMIT 100;

This results in: Expression 'clicks' is not present in the GROUP BY list

And if I try this:

    SELECT company_name, SUM(clicks) FROM table1 WHERE code = 'ES' GROUP BY 1,2 ORDER BY clicks DESC LIMIT 100;

This is what I get: Cannot group by an aggregate.

If I try with no aggregation on
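Both error messages point at the same issue: `ORDER BY clicks` refers to the raw column, while the second select item is an aggregate and cannot be a grouping key. A common fix is to alias the aggregate, group only by the plain column, and sort by the alias. The sketch below shows that query shape with Python's `sqlite3`. The sample rows and the alias `total_clicks` are assumptions, and SQLite is more permissive than the engine in the question, so it illustrates the corrected query without reproducing the errors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (company_name TEXT, code TEXT, clicks INTEGER)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    ("acme", "ES", 10),
    ("acme", "ES", 5),
    ("globex", "ES", 40),
    ("initech", "FR", 99),
])

query = """
    SELECT company_name, SUM(clicks) AS total_clicks
    FROM table1
    WHERE code = 'ES'
    GROUP BY company_name        -- group by the plain column only
    ORDER BY total_clicks DESC   -- sort by the aggregate's alias
    LIMIT 100
"""
top_companies = conn.execute(query).fetchall()
```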

Sqlite GROUP BY priority

Question:

    +----+------------+------+
    | id | title      | lang |
    +----+------------+------+
    | 1  | title 1 EN | en   |
    | 1  | title 1 FR | fr   |
    | 1  | title 1 ZH | zh   |
    | 2  | title 2 EN | en   |
    | 3  | title 3 ZH | zh   |
    +----+------------+------+

This is my table. I want to group by id, but sometimes I need language "en" to have priority and sometimes I need language "zh" to have priority.

    SELECT * FROM table GROUP BY id

gives me a list of all unique ids but favors zh for id 1. Is it possible that I
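One way to get one row per id with a switchable language preference is a correlated subquery that ranks each id's languages and keeps the best one. The sketch below uses Python's `sqlite3`. The table name `titles` and the helper `one_title_per_id` are assumptions (the question calls the table `table`, which is a reserved word), and it assumes each (id, lang) pair occurs at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (id INTEGER, title TEXT, lang TEXT)")
conn.executemany("INSERT INTO titles VALUES (?, ?, ?)", [
    (1, "title 1 EN", "en"),
    (1, "title 1 FR", "fr"),
    (1, "title 1 ZH", "zh"),
    (2, "title 2 EN", "en"),
    (3, "title 3 ZH", "zh"),
])


def one_title_per_id(conn, preferred_lang):
    """Return one row per id, preferring preferred_lang when that id has it."""
    sql = """
        SELECT t.id, t.title, t.lang
        FROM titles AS t
        WHERE t.lang = (
            SELECT s.lang
            FROM titles AS s
            WHERE s.id = t.id
            -- The comparison yields 1 for the preferred language, 0 otherwise;
            -- ties fall back to alphabetical order.
            ORDER BY (s.lang = ?) DESC, s.lang
            LIMIT 1
        )
        ORDER BY t.id
    """
    return conn.execute(sql, (preferred_lang,)).fetchall()


english_first = one_title_per_id(conn, "en")
chinese_first = one_title_per_id(conn, "zh")
```

Ids that lack the preferred language still appear, with whichever of their languages sorts first.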

SQL Server group by absorb null and empty values

Question: I have this data:

    Id  Name  amount  Comments
    -------------------------------
    1   n1    421762  Hello
    2   n2    421     Bye
    3   n2    262     null
    4   n2    5127    ''

Each name may or may not have extra rows with null or empty comments. How can I group by name and sum(amount) such that it ignores/absorbs the null or empty comments in the grouping and shows me only 2 groups?

Output I want:

    Id  Name  sum(amount)  Comments
    ------------------------------------
    1   n1    421762       Hello
    2   n2    5810         Bye

I can't figure this out. I hoped that
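A common approach is to group by name alone and collapse the comments with an aggregate that skips NULLs, after turning empty strings into NULL with `NULLIF`. The sketch below runs that query shape through Python's `sqlite3` as a stand-in for SQL Server. The table name `payments` is an assumption, and the query assumes at most one non-empty comment per name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (Id INTEGER, Name TEXT, amount INTEGER, Comments TEXT)"
)
conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", [
    (1, "n1", 421762, "Hello"),
    (2, "n2", 421, "Bye"),
    (3, "n2", 262, None),
    (4, "n2", 5127, ""),
])

query = """
    SELECT MIN(Id) AS Id,
           Name,
           SUM(amount) AS total_amount,
           -- NULLIF maps '' to NULL; MAX then ignores NULLs and keeps the real comment.
           MAX(NULLIF(Comments, '')) AS Comments
    FROM payments
    GROUP BY Name
    ORDER BY Name
"""
grouped = conn.execute(query).fetchall()
```

With the sample rows, the n2 group sums 421 + 262 + 5127 = 5810 and keeps the comment `Bye`.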

What indexes to improve performance of JOIN and GROUP BY

Question: I have set up some tables and run a query. The EXPLAIN output suggests that the query generates a temporary table (I assume because of the GROUP BY). I have added some indexes to speed up the query, but is there a way to avoid the temporary table, and is there any other way to speed the query up using indexes?

CartData

    CREATE TABLE `cartdata` (
      `IDCartData` INT(11) NOT NULL AUTO_INCREMENT,
      `CartOrderref` VARCHAR(25) NOT NULL DEFAULT '',

what is the difference between WHERE and HAVING [duplicate]

Question: Possible duplicate: SQL: What’s the difference between HAVING and WHERE?

I am learning SQL syntax and I can't understand this. The second half of the question is much more technical: what actually happens behind the scenes in the database with WHERE versus HAVING? Which one uses more resources? Are they the same algorithm, just applied to different data sets? Thanks!

Answer 1: WHERE is in most queries and limits the records that
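The practical difference shows up when the same threshold is applied in each clause: WHERE filters individual rows before grouping, while HAVING filters whole groups after the aggregate is computed. A small sketch with Python's `sqlite3`; the `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("ann", 5), ("ann", 50), ("bob", 30), ("bob", 40), ("cat", 8),
])

# WHERE drops rows with amount < 10 before they reach SUM().
where_rows = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    WHERE amount >= 10
    GROUP BY customer
    ORDER BY customer
""").fetchall()

# HAVING sums every row first, then drops groups whose total is < 10.
having_rows = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) >= 10
    ORDER BY customer
""").fetchall()
```

Here `ann` totals 50 under WHERE (her 5 is discarded before summing) but 55 under HAVING, and `cat` disappears in both cases for different reasons: her only row fails the row filter, and her group total fails the group filter.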

How to sum the values of a numeric variable based on a string variable [duplicate]

Question: Duplicate of: Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?

Consider the following dataframe:

    df <- data.frame(numeric=c(1,2,3,4,5,6,7,8,9,10), string=c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f"))
    print(df)

       numeric string
    1        1      a
    2        2      a
    3        3      b
    4        4      b
    5        5      c
    6        6      d
    7        7      d
    8        8      e
    9        9      d
    10      10      f

It has a numeric variable and a string variable. Now, I would like to create another dataframe in which the
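The question is about R, but the grouped sum that its title describes can be sketched in pandas, the library used elsewhere on this page. This is an analogue under that assumption, not the dplyr answer; the name `totals` is introduced here for illustration.

```python
import pandas as pd

# Same values as the R data frame in the question.
df = pd.DataFrame({
    "numeric": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "string": ["a", "a", "b", "b", "c", "d", "d", "e", "d", "f"],
})

# One row per distinct string, with the sum of its numeric values.
totals = df.groupby("string", as_index=False)["numeric"].sum()
```

`totals` has one row per distinct value of `string`; for example, the three `d` rows (6, 7, 9) collapse into a single row with 22.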