MySQL - SELECT WHERE field IN (subquery) - Extremely slow why?

情书的邮戳 2020-11-27 10:06

I've got a couple of duplicates in a database that I want to inspect. To see which rows are duplicates, I ran this:

SELECT relevant_field
FROM some         


        
10 Answers
  • 2020-11-27 10:16

    The subquery is being run for each row because MySQL treats it as a correlated (dependent) subquery. You can turn it into a non-correlated query by wrapping the subquery in a derived table and selecting everything from it, like so:

    SELECT * FROM
    (
        SELECT relevant_field
        FROM some_table
        GROUP BY relevant_field
        HAVING COUNT(*) > 1
    ) AS subquery
    

    The final query would look like this:

    SELECT *
    FROM some_table
    WHERE relevant_field IN
    (
        SELECT * FROM
        (
            SELECT relevant_field
            FROM some_table
            GROUP BY relevant_field
            HAVING COUNT(*) > 1
        ) AS subquery
    )
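
    To check whether the subquery really is evaluated per row on your server, you can prefix the original query with EXPLAIN. This is a minimal sketch, assuming the table and column names used in the question; the exact plan depends on your MySQL version and data.

    ```sql
    -- Inspect the plan of the original (slow) form of the query.
    -- A select_type of DEPENDENT SUBQUERY on the inner SELECT means
    -- it is re-evaluated for each row of the outer query.
    EXPLAIN
    SELECT *
    FROM some_table
    WHERE relevant_field IN
    (
        SELECT relevant_field
        FROM some_table
        GROUP BY relevant_field
        HAVING COUNT(*) > 1
    );
    ```

    Running EXPLAIN on the rewritten query as well lets you compare the two plans side by side.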
    
  • 2020-11-27 10:18

    I have reformatted your slow sql query with www.prettysql.net

    SELECT *
    FROM some_table
    WHERE
     relevant_field in
     (
      SELECT relevant_field
      FROM some_table
      GROUP BY relevant_field
      HAVING COUNT(*) > 1
     );
    

    When using a table in both the query and the subquery, you should always alias both, like this:

    SELECT *
    FROM some_table as t1
    WHERE
     t1.relevant_field in
     (
      SELECT t2.relevant_field
      FROM some_table as t2
      GROUP BY t2.relevant_field
      HAVING COUNT(t2.relevant_field) > 1
     );
    

    Does that help?

  • 2020-11-27 10:18

    I find this to be the most efficient way to check whether a value exists. The logic can easily be inverted to find values that don't exist (i.e. IS NULL):

    SELECT * FROM primary_table st1
    LEFT JOIN comparision_table st2 ON (st1.relevant_field = st2.relevant_field)
    WHERE st2.primaryKey IS NOT NULL
    

    *Replace relevant_field with the name of the value that you want to check exists in your table

    *Replace primaryKey with the name of the primary key column on the comparison table.
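
    The inverted form mentioned above might look like the following. It is a sketch using the same placeholder table and column names as the query above, not names from a real schema.

    ```sql
    -- Anti-join: keep only the rows of primary_table that have NO match
    -- in comparision_table. Unmatched rows get NULL in every st2 column,
    -- so testing the (non-nullable) primary key for NULL isolates them.
    SELECT st1.*
    FROM primary_table st1
    LEFT JOIN comparision_table st2 ON (st1.relevant_field = st2.relevant_field)
    WHERE st2.primaryKey IS NULL;
    ```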

  • 2020-11-27 10:19

    Try this

    SELECT t1.*
    FROM 
     some_table t1,
      (SELECT relevant_field
      FROM some_table
      GROUP BY relevant_field
      HAVING COUNT(*) > 1) t2
    WHERE
     t1.relevant_field = t2.relevant_field;
    
  • 2020-11-27 10:25
    SELECT st1.*
    FROM some_table st1
    INNER JOIN
    (
        SELECT relevant_field
        FROM some_table
        GROUP BY relevant_field
        HAVING COUNT(*) > 1
    ) st2 ON st2.relevant_field = st1.relevant_field;
    

    I've tried your query on one of my databases, and also tried it rewritten as a join to a sub-query.

    This worked a lot faster, try it!

  • 2020-11-27 10:29

    Rewrite the query like this:

    SELECT st1.*, st2.relevant_field FROM sometable st1
    INNER JOIN sometable st2 ON (st1.relevant_field = st2.relevant_field)
    GROUP BY st1.id  /* list a unique sometable field here*/
    HAVING COUNT(*) > 1
    

    I think st2.relevant_field must be in the select list, because otherwise the HAVING clause will give an error, but I'm not 100% sure.

    Avoid IN with a subquery where you can; in MySQL, especially in versions before 5.6, it is notoriously slow.
    Prefer IN with a fixed list of values.

    More tips

    1. If you want to make queries faster, don't use SELECT *; select only the fields that you really need.
    2. Make sure you have an index on relevant_field to speed up the equi-join.
    3. Make sure to group by on the primary key.
    4. If you are on InnoDB and you only select indexed fields (and things are not too complex), then MySQL will resolve your query using only the indexes, speeding things way up.
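
    As a rough illustration of tips 2 and 4, the statements below add an index and then check whether the duplicate-finding subquery can be answered from that index alone. The index name idx_relevant_field and the alias occurrences are made up for this example.

    ```sql
    -- Tip 2: index the column used in the join / GROUP BY.
    -- (idx_relevant_field is a hypothetical index name.)
    CREATE INDEX idx_relevant_field ON sometable (relevant_field);

    -- Tip 4: this query only touches the indexed column, so MySQL may
    -- resolve it from the index alone. If it does, the Extra column of
    -- the EXPLAIN output contains "Using index".
    EXPLAIN
    SELECT relevant_field, COUNT(*) AS occurrences
    FROM sometable
    GROUP BY relevant_field
    HAVING COUNT(*) > 1;
    ```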

    General solution for 90% of your IN (SELECT ...) queries

    Use this code:

    SELECT * FROM sometable a WHERE EXISTS (
      SELECT 1 FROM sometable b
      WHERE a.relevant_field = b.relevant_field
      GROUP BY b.relevant_field
      HAVING count(*) > 1) 
    