Using group by on multiple columns

北海茫月 2020-11-22 02:02

I understand the point of GROUP BY x.

But how does GROUP BY x, y work, and what does it mean?

2 Answers
  • 2020-11-22 02:06

    GROUP BY X means: put all rows with the same value for X into one group.

    GROUP BY X, Y means: put all rows with the same values for both X and Y into one group.

    To illustrate with an example, let's say we have the following table, which records who is attending which subject at a university:

    Table: Subject_Selection
    
    +---------+----------+----------+
    | Subject | Semester | Attendee |
    +---------+----------+----------+
    | ITB001  |        1 | John     |
    | ITB001  |        1 | Bob      |
    | ITB001  |        1 | Mickey   |
    | ITB001  |        2 | Jenny    |
    | ITB001  |        2 | James    |
    | MKB114  |        1 | John     |
    | MKB114  |        1 | Erica    |
    +---------+----------+----------+
    

    When you group by the Subject column only, say:

    select Subject, Count(*)
    from Subject_Selection
    group by Subject
    

    You will get something like:

    +---------+-------+
    | Subject | Count |
    +---------+-------+
    | ITB001  |     5 |
    | MKB114  |     2 |
    +---------+-------+
    

    ...because there are 5 entries for ITB001 and 2 for MKB114.

    If we were to group by two columns:

    select Subject, Semester, Count(*)
    from Subject_Selection
    group by Subject, Semester
    

    we would get this:

    +---------+----------+-------+
    | Subject | Semester | Count |
    +---------+----------+-------+
    | ITB001  |        1 |     3 |
    | ITB001  |        2 |     2 |
    | MKB114  |        1 |     2 |
    +---------+----------+-------+
    

    This is because grouping by two columns says: "Group the rows so that all of those with the same Subject and Semester are in the same group, and then calculate all the aggregate functions (COUNT, SUM, AVG, etc.) for each of those groups". In this example, counting shows three people doing ITB001 in semester 1 and two doing it in semester 2. Both of the people doing MKB114 are in semester 1, so there is no row for semester 2 (no data fits into the group "MKB114, Semester 2").

    Hopefully that makes sense.
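
To check the two result tables above, here is a minimal sketch that loads the same Subject_Selection rows into an in-memory SQLite database through Python's standard sqlite3 module. The choice of SQLite and the added ORDER BY (only there to make the output order predictable) are assumptions for illustration, not part of the original answer.

```python
import sqlite3

# In-memory database holding the Subject_Selection table from the answer above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Subject_Selection (Subject TEXT, Semester INTEGER, Attendee TEXT)"
)
conn.executemany(
    "INSERT INTO Subject_Selection VALUES (?, ?, ?)",
    [
        ("ITB001", 1, "John"),
        ("ITB001", 1, "Bob"),
        ("ITB001", 1, "Mickey"),
        ("ITB001", 2, "Jenny"),
        ("ITB001", 2, "James"),
        ("MKB114", 1, "John"),
        ("MKB114", 1, "Erica"),
    ],
)

# One group per distinct Subject.
by_subject = conn.execute(
    "SELECT Subject, COUNT(*) FROM Subject_Selection "
    "GROUP BY Subject ORDER BY Subject"
).fetchall()

# One group per distinct (Subject, Semester) pair.
by_subject_semester = conn.execute(
    "SELECT Subject, Semester, COUNT(*) FROM Subject_Selection "
    "GROUP BY Subject, Semester ORDER BY Subject, Semester"
).fetchall()

print(by_subject)           # [('ITB001', 5), ('MKB114', 2)]
print(by_subject_semester)  # [('ITB001', 1, 3), ('ITB001', 2, 2), ('MKB114', 1, 2)]
```

Adding a column to GROUP BY can only split existing groups further, which is why the five ITB001 rows become a group of three and a group of two.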

  • 2020-11-22 02:12

    The GROUP BY clause is used together with aggregate functions to group the result set by one or more columns, for example:

    SELECT column_name, aggregate_function(column_name)
    FROM table_name
    WHERE column_name operator value
    GROUP BY column_name;
    

    Remember the order in which the clauses are written:

    1. SELECT: selects data from a database

    2. FROM: lists the tables

    3. WHERE: filters individual records

    4. GROUP BY: collects data across multiple records and groups the results by one or more columns

    5. HAVING: used together with GROUP BY to keep only the groups for which the condition is TRUE

    6. ORDER BY: sorts the result set

    You can use all of these in a query with aggregate functions, but they must be written in this order; otherwise the query fails with a syntax error.
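
The clause order above can be sketched as one query that uses all six clauses. This is an illustration only: the Orders table, its columns (including status and amount) and its rows are hypothetical, and SQLite via Python's sqlite3 module is an assumed engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Orders table, invented for this illustration.
conn.execute(
    "CREATE TABLE Orders (orderId INTEGER, customerId TEXT, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "A", "paid", 100.0),
        (2, "A", "paid", 250.0),
        (3, "A", "cancelled", 900.0),
        (4, "B", "paid", 50.0),
        (5, "C", "paid", 300.0),
        (6, "C", "paid", 300.0),
    ],
)

query = """
SELECT customerId, SUM(amount) AS total   -- 1. SELECT
FROM Orders                               -- 2. FROM
WHERE status = 'paid'                     -- 3. WHERE filters rows before grouping
GROUP BY customerId                       -- 4. GROUP BY forms the groups
HAVING SUM(amount) > 100                  -- 5. HAVING filters whole groups
ORDER BY total DESC                       -- 6. ORDER BY sorts the final rows
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('C', 600.0), ('A', 350.0)]

# Writing WHERE after GROUP BY breaks the required order and is rejected.
try:
    conn.execute(
        "SELECT customerId, COUNT(*) FROM Orders "
        "GROUP BY customerId WHERE status = 'paid'"
    )
    misordered_accepted = True
except sqlite3.OperationalError:
    misordered_accepted = False
print(misordered_accepted)  # False
```

Note that the cancelled order is removed by WHERE before any group is formed, while customer B is removed by HAVING only after its total has been computed.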

    Common aggregate functions include:

    MIN returns the smallest value in a given column

    MAX returns the largest value in a given column

    SUM returns the sum of the numeric values in a given column

    AVG returns the average value of a given column

    COUNT(column) returns the number of non-NULL values in a given column

    COUNT(*) returns the number of rows (in each group, or in the whole result when there is no GROUP BY)
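
A small sketch of how these functions treat NULL, run against SQLite through Python's sqlite3 module. The Products rows below are made up for illustration, and one price is deliberately NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Products table; the last unitPrice is NULL on purpose.
conn.execute("CREATE TABLE Products (productName TEXT, unitPrice REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Tea", 4.0), ("Coffee", 8.0), ("Cocoa", 6.0), ("Sample", None)],
)

# Without GROUP BY, the whole table is treated as a single group.
row = conn.execute(
    "SELECT MIN(unitPrice), SUM(unitPrice), AVG(unitPrice), "
    "COUNT(unitPrice), COUNT(*) FROM Products"
).fetchone()
print(row)  # (4.0, 18.0, 6.0, 3, 4)
```

MIN, SUM, AVG and COUNT(unitPrice) all skip the NULL price, so the average is 18 / 3 rather than 18 / 4, while COUNT(*) still counts all four rows.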

    SQL examples using aggregate functions:

    Let's say we need to find the orders whose total is greater than $950. We combine the HAVING clause and the GROUP BY clause to accomplish this. The aggregate is repeated in HAVING because not every database accepts a column alias there:

    SELECT
        orderId, SUM(unitPrice * qty) AS Total
    FROM
        OrderDetails
    GROUP BY orderId
    HAVING SUM(unitPrice * qty) > 950;
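
A runnable sketch of this query, assuming SQLite via Python's sqlite3 module and a handful of invented OrderDetails rows (the productId column and all values are hypothetical). The aggregate expression is repeated in HAVING rather than referenced by its alias: MySQL and SQLite accept an alias there, but PostgreSQL and SQL Server do not, so repeating it is the portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical OrderDetails rows, invented for this illustration.
conn.execute(
    "CREATE TABLE OrderDetails "
    "(orderId INTEGER, productId INTEGER, unitPrice REAL, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?, ?)",
    [
        (1, 11, 100.0, 5),   # order 1: 500 + 500 = 1000
        (1, 12, 50.0, 10),
        (2, 11, 200.0, 2),   # order 2: 400 + 100 = 500
        (2, 13, 25.0, 4),
        (3, 14, 950.0, 1),   # order 3: exactly 950, so not "greater than"
        (4, 15, 300.0, 4),   # order 4: 1200
    ],
)

big_orders = conn.execute(
    "SELECT orderId, SUM(unitPrice * qty) AS Total "
    "FROM OrderDetails "
    "GROUP BY orderId "
    "HAVING SUM(unitPrice * qty) > 950 "
    "ORDER BY orderId"
).fetchall()
print(big_orders)  # [(1, 1000.0), (4, 1200.0)]
```

The condition cannot go in WHERE, because WHERE is evaluated per row before the per-order totals exist.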
    

    Count all orders, grouped by customerId, and sort the result in ascending order. We combine the COUNT function with the GROUP BY and ORDER BY clauses and the ASC keyword:

    SELECT 
        customerId, COUNT(*)
    FROM
        Orders
    GROUP BY customerId
    ORDER BY COUNT(*) ASC;
    

    Retrieve the categories that have an average unit price greater than $10, using the AVG function combined with the GROUP BY and HAVING clauses:

    SELECT 
        categoryName, AVG(unitPrice)
    FROM
        Products p
    INNER JOIN
        Categories c ON c.categoryId = p.categoryId
    GROUP BY categoryName
    HAVING AVG(unitPrice) > 10;
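
The join-then-group query above can be tried with a sketch like the following. The Categories and Products rows are invented for illustration, SQLite via Python's sqlite3 module is an assumed engine, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for the two tables used in the query above.
conn.execute("CREATE TABLE Categories (categoryId INTEGER, categoryName TEXT)")
conn.execute(
    "CREATE TABLE Products "
    "(productId INTEGER, productName TEXT, categoryId INTEGER, unitPrice REAL)"
)
conn.executemany(
    "INSERT INTO Categories VALUES (?, ?)",
    [(1, "Beverages"), (2, "Condiments"), (3, "Produce")],
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Chai", 1, 18.0),   # Beverages average: (18 + 6) / 2 = 12
        (2, "Chang", 1, 6.0),
        (3, "Syrup", 2, 10.0),  # Condiments average: (10 + 4) / 2 = 7
        (4, "Salt", 2, 4.0),
        (5, "Tofu", 3, 23.0),   # Produce average: 23
    ],
)

# The join runs first; the joined rows are then grouped by category name.
pricey_categories = conn.execute(
    "SELECT categoryName, AVG(unitPrice) "
    "FROM Products p "
    "INNER JOIN Categories c ON c.categoryId = p.categoryId "
    "GROUP BY categoryName "
    "HAVING AVG(unitPrice) > 10 "
    "ORDER BY categoryName"
).fetchall()
print(pricey_categories)  # [('Beverages', 12.0), ('Produce', 23.0)]
```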
    

    Get the least expensive product in each category, using the MIN function in a correlated subquery:

    SELECT categoryId,
           productId,
           productName,
           unitPrice
    FROM Products p1
    WHERE unitPrice = (
                    SELECT MIN(unitPrice)
                    FROM Products p2
                    WHERE p2.categoryId = p1.categoryId);
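
As a rough check of the subquery approach, the sketch below runs it in SQLite through Python's sqlite3 module on invented Products rows (all names and prices are hypothetical). Two products in category 2 share the lowest price on purpose, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Products rows; category 2 has a tie for the lowest price.
conn.execute(
    "CREATE TABLE Products "
    "(productId INTEGER, productName TEXT, categoryId INTEGER, unitPrice REAL)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Chai", 1, 18.0),
        (2, "Chang", 1, 19.0),
        (3, "Syrup", 2, 10.0),
        (4, "Seasoning", 2, 22.0),
        (5, "Mix", 2, 10.0),
    ],
)

# For each outer row p1, the subquery finds the minimum price in p1's category.
cheapest = conn.execute(
    "SELECT categoryId, productId, productName, unitPrice "
    "FROM Products p1 "
    "WHERE unitPrice = ("
    "    SELECT MIN(unitPrice) FROM Products p2 "
    "    WHERE p2.categoryId = p1.categoryId) "
    "ORDER BY categoryId, productId"
).fetchall()
print(cheapest)
# [(1, 1, 'Chai', 18.0), (2, 3, 'Syrup', 10.0), (2, 5, 'Mix', 10.0)]
```

Unlike a plain GROUP BY with MIN, this form returns whole product rows, and it returns every product that ties for the lowest price in its category.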
    