MySQL 5.7.5+ get first row for the groups

本秂侑毒 submitted on 2020-03-21 05:38:15

Question


I have a legacy application in which GROUP BY is used together with non-aggregated columns to fetch the first row of each group. The query is as follows:

SELECT
    columnPrimaryKey,
    column1,
    column2,
    column3
FROM
    (SELECT
        columnPrimaryKey,
        column1,
        column2,
        column3
    FROM testTable
    ORDER BY column2
) AS tbl
GROUP BY column3

Recently, the server was upgraded to MySQL 5.7.22, and the query above no longer returns the expected results, even after disabling the ONLY_FULL_GROUP_BY mode.

Yes, I could rewrite the query to work with the new behavior, as follows:

SELECT
    x.columnPrimaryKey,
    x.column1,
    x.column2,
    x.column3
FROM testTable AS x INNER JOIN (
    SELECT
        MIN( column2 ) AS column2,
        column3
    FROM testTable
    GROUP BY column3
) AS y ON x.column2 = y.column2 AND x.column3 = y.column3;

Unfortunately, that's not an option for now. The only other option I see is to downgrade to a version earlier than 5.7.5.

Fiddle on 5.7 with ONLY_FULL_GROUP_BY disabled, showing the unexpected results:

https://www.db-fiddle.com/f/8VjB7XpkobWVyXpPvUaGt2/0

Fiddle on 5.6 with default modes, showing the expected results:

https://www.db-fiddle.com/f/8VjB7XpkobWVyXpPvUaGt2/1

My question is: is there any way to disable this random selection so that the legacy code works without rewriting it or downgrading?

Any suggestions are greatly appreciated!


Answer 1:


Your ORDER BY in the derived table subquery is ignored in MySQL 5.7.

See https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html

The optimizer propagates an ORDER BY clause in a derived table or view reference to the outer query block if these conditions are all true:

  • The outer query is not grouped or aggregated.

  • The outer query does not specify DISTINCT, HAVING, or ORDER BY.

  • The outer query has this derived table or view reference as the only source in the FROM clause.

Otherwise, the optimizer ignores the ORDER BY clause.

Your outer query has a GROUP BY, so it fails the first condition. The ORDER BY in the derived table is therefore not propagated, and the optimizer ignores it.
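As a rough illustration of those conditions, here are two queries against the testTable columns from the question. This is a sketch based on the documentation quoted above, not output captured from a server, and the second query assumes ONLY_FULL_GROUP_BY is disabled as in the question.

```sql
-- Meets all three conditions: the outer query is not grouped, it has no
-- DISTINCT, HAVING, or ORDER BY, and the derived table is the only source
-- in FROM. The inner ORDER BY is propagated to the outer query block.
SELECT columnPrimaryKey, column1, column2, column3
FROM (
    SELECT columnPrimaryKey, column1, column2, column3
    FROM testTable
    ORDER BY column2
) AS tbl;

-- Fails the first condition: the outer query is grouped.
-- When the derived table is merged, the inner ORDER BY is ignored, so the
-- row picked for each column3 group no longer follows the column2 order.
SELECT columnPrimaryKey, column1, column2, column3
FROM (
    SELECT columnPrimaryKey, column1, column2, column3
    FROM testTable
    ORDER BY column2
) AS tbl
GROUP BY column3;
```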

This behavior is controlled by the derived_merge flag of the optimizer_switch system variable, which you can turn off.

Demo:

mysql [localhost] {msandbox} (test) > select @@version;
+-----------+
| @@version |
+-----------+
| 5.7.21    |
+-----------+

mysql [localhost] {msandbox} (test) > SELECT     columnPrimaryKey,     column1,     column2,     column3 FROM     (SELECT         columnPrimaryKey,         column1,         column2,         column3     FROM testTable     ORDER BY column2 ) AS tbl GROUP BY column3;
+------------------+----------------+---------+---------+
| columnPrimaryKey | column1        | column2 | column3 |
+------------------+----------------+---------+---------+
|                1 | Some Name 8-4  |       4 |       8 |
|                6 | Some Name 9-1  |       1 |       9 |
|                8 | Some Name 10-2 |       2 |      10 |
+------------------+----------------+---------+---------+

mysql [localhost] {msandbox} (test) > set optimizer_switch = 'derived_merge=off';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (test) > SELECT     columnPrimaryKey,     column1,     column2,     column3 FROM     (SELECT         columnPrimaryKey,         column1,         column2,         column3     FROM testTable     ORDER BY column2 ) AS tbl GROUP BY column3;
+------------------+----------------+---------+---------+
| columnPrimaryKey | column1        | column2 | column3 |
+------------------+----------------+---------+---------+
|                5 | Some Name 8-1  |       1 |       8 |
|                6 | Some Name 9-1  |       1 |       9 |
|                8 | Some Name 10-2 |       2 |      10 |
+------------------+----------------+---------+---------+
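The demo changes the switch for the current session only. Below is a minimal sketch of the other scopes, plus one way to check that the derived table is no longer merged. The EXPLAIN output depends on your data and server version, so treat the comments as expectations rather than guarantees.

```sql
-- Current session only (equivalent to the statement used in the demo).
SET SESSION optimizer_switch = 'derived_merge=off';

-- Server-wide default for connections opened after this statement;
-- needs the SUPER privilege in 5.7 and is lost when the server restarts.
SET GLOBAL optimizer_switch = 'derived_merge=off';

-- Check the setting for the current session: returns 1 when the flag is off.
SELECT @@optimizer_switch LIKE '%derived_merge=off%' AS derived_merge_is_off;

-- With the flag off, the plan should contain a row with select_type DERIVED
-- for the subquery, which means it is materialized instead of merged.
EXPLAIN
SELECT columnPrimaryKey, column1, column2, column3
FROM (
    SELECT columnPrimaryKey, column1, column2, column3
    FROM testTable
    ORDER BY column2
) AS tbl
GROUP BY column3;
```

To keep the setting across restarts, the same value can be given as optimizer_switch under [mysqld] in the server option file. Note that even with derived_merge off, the MySQL manual states that the server is free to choose any value from each group for non-aggregated columns, so the legacy query still relies on behavior that is not guaranteed.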


Source: https://stackoverflow.com/questions/51969174/mysql-5-7-5-get-first-row-for-the-groups
