Why does the cardinality of an index in MySQL remain unchanged when I add a new index?

前端 未结 2 1341
别那么骄傲
别那么骄傲 2020-12-31 13:23

I have added a FULLTEXT index to one of my MySQL database tables as follows:

ALTER TABLE members ADD FULLTEXT(about,fname,lname,job_title);

相关标签:
2条回答
  • 2020-12-31 13:32

    I cannot answer definitely why MySQL does not compute the cardinality, but I can guess. The MySQL manual states:

    Cardinality: An estimate of the number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the greater the chance that MySQL uses the index when doing joins.

    FULLTEXT indices are only used in MATCH ... AGAINST (...) queries, which forces the index to be used. The MATCH ... AGAINST syntax does not work if there is not a FULLTEXT index on those fields.

    My guess is that the cardinality isn't computed because it really isn't necessary.

    Note that searches against the index work even though the cardinality isn't set.

    For the record, an ANALYZE TABLE foobar statement seems to set the cardinality correctly.

    0 讨论(0)
  • 2020-12-31 13:44

    If you only have 1 row in the table, the cardinality for the index should be 1, of course. It's just counting the number of unique values.

    If you think of an index as a lookup-table based on buckets (like a hash), then the cardinality is the number of buckets.

    Here's how it works: When you build an index over a set of columns (a,b,c,d), then the database goes over all the rows in the table, looking at the ordered quadruplets of those 4 columns, for each row. Let's say your table looks like this:

    a  b  c  d  e   
    -- -- -- -- --  
    1  1  1  1  200 
    1  1  1  1  300
    1  2  1  1  200
    1  3  1  1  200
    

    So what the database looks at is just the 4 columns (a,b,c,d):

    a  b  c  d  
    -- -- -- --
    1  1  1  1 
    1  2  1  1 
    1  3  1  1 
    

    See that there are only 3 unique rows left? Those will become our buckets, but we'll get back to that. In reality, there's also a record id, or row identifier for each row in the table. So our original table looks like this:

    (row id) a  b  c  d  e   
    -------- -- -- -- -- --  
    00000001 1  1  1  1  200 
    00000002 1  1  1  1  300
    00000003 1  2  1  1  200
    00000004 1  3  1  1  200
    

    So when we look at only the 4 columns of (a,b,c,d), we're really looking also at the row id:

    (row id) a  b  c  d 
    -------- -- -- -- --
    00000001 1  1  1  1
    00000002 1  1  1  1
    00000003 1  2  1  1
    00000004 1  3  1  1
    

    But we want to do lookup by (a,b,c,d) and not by row id, so we produce something like this:

    (a,b,c,d) (row id)
    --------- --------
    1,1,1,1   00000001
    1,1,1,1   00000002
    1,2,1,1   00000003
    1,3,1,1   00000004
    

    And finally, we group all the row ids of rows that have identicle (a,b,c,d) values together:

    (a,b,c,d) (row id)
    --------- ---------------------
    1,1,1,1   00000001 and 00000002
    1,2,1,1   00000003
    1,3,1,1   00000004
    

    See that? The values of (a,b,c,d), which are (1,1,1,1) (1,2,1,1) and (1,3,1,1) have become keys for our lookup table into the rows of the original table.

    Actually, none of this really happens, but it should give you a good idea on how a "naive" (i.e. straight-forward) implementation of an index might be done.

    But the bottom line is this: cardinality just measures how many unique rows there are in an index. And in our example that was the number of keys in our lookup table, which was 3.

    Hope that helps!

    0 讨论(0)
提交回复
热议问题