MySql - Self Join - Full Table Scan (Cannot Scan Index)

烂漫一生 提交于 2019-12-13 08:01:54

问题


I have the following self-join query:

SELECT A.id
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt)

The query is quite slow, and after looking at the execution plan the cause appears to be a full table scan in the JOIN. The table has only 500 rows, and suspecting this to be the issue I increased it to 100,000 rows in order to see if it made a difference to the optimizer's selection. It did not, with 100k rows it was still doing a full table scan.

My next step was to try and force indexes with the following query, but the same situation arises, a full table scan:

SELECT A.id
FROM categories_nested_set      AS A
LEFT JOIN categories_nested_set AS B 
FORCE INDEX (idx_lft, idx_rgt)
ON (A.lft BETWEEN B.lft AND B.rgt)

All columns (id, lft, rgt) are integers, all are indexed.

Why is MySql doing a full table scan here?

How can I change my query to use indexes instead of a full table scan?

CREATE TABLE mytbl ( lft int(11) NOT NULL DEFAULT '0', 
 rgt int(11) DEFAULT NULL, 
 id int(11) DEFAULT NULL,
 category varchar(128) DEFAULT NULL,
  PRIMARY KEY (lft), 
  UNIQUE KEY id (id), 
  UNIQUE KEY rgt (rgt), 
  KEY idx_lft (lft), 
  KEY idx_rgt (rgt) ) ENGINE=InnoDB DEFAULT CHARSET=utf8

Thanks


回答1:


You have lot's of indexes, some of them are redundant. Let's begin by clearing up some of them. Too many indexes slows down inserts and updates.

PRIMARY KEY (lft),
KEY idx_lft (lft), 

Since you already have a primary key defined on lft, there is no need what so ever for another index on lft. Similarly with a unique index on rgt there is not need for the second index listed below.

UNIQUE KEY rgt (rgt), 
KEY idx_rgt (rgt)

Now let's look at your query.

SELECT A.id
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt)

This is very unlikely to be a query that will be encountered in the wild. With 500 rows, this query may produce even 5000 rows? Do you really need the entire key created in one go? The reason that this query is slow is because mysql can only optimize range comparisions for constants. It's more likely that your actually query will look something like this:

SELECT B.*
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt) 
WHERE a.id = N;

Where you create the node for a particular id. This will use indexes and will be really fast. What's the point of optimizing for a query that you will not use much if at all?




回答2:


The following SO question is critical to the solution, as there is very little information on the combination of adjacency lists and indices:

MySQL & nested set: slow JOIN (not using index)

It appears that adding a basic comparison condition triggers the use of an index, like so:

SELECT A.id
FROM mytbl      AS A
LEFT JOIN mytbl AS B ON (A.lft BETWEEN B.lft AND B.rgt)
-- THE FOLLOWING DUMMY CONDITIONS TRIGGER INDEX
WHERE A.lft > 0
AND B.lft > 0
AND B.rgt > 0

And no more table scans.

EDIT: Comparison of EXPLAIN function between fixed and unfixed version of the query:



来源:https://stackoverflow.com/questions/38494989/mysql-self-join-full-table-scan-cannot-scan-index

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!