MySQL - Fastest way to check if data in InnoDB table has changed

不知归路 2021-02-10 11:11

My application is very database intensive. Currently, I'm running MySQL 5.5.19 and using MyISAM, but I'm in the process of migrating to InnoDB. The only problem left is checks

2 Answers
  •  轻奢々 (original poster)
    2021-02-10 11:43

    I think I've found the solution. For some time I had been looking at Percona Server as a replacement for my MySQL servers, and now I think there is a good reason to switch.

    Percona Server introduces many new INFORMATION_SCHEMA tables, such as INNODB_TABLE_STATS, which aren't available in the standard MySQL server. When you run:

    SELECT rows, modified FROM information_schema.innodb_table_stats WHERE table_schema='db' AND table_name='table'
    

    You get the actual row count and a modification counter. The official documentation says the following about the modified field:

    If the value of modified column exceeds “rows / 16” or 2000000000, the statistics recalculation is done when innodb_stats_auto_update == 1. We can estimate the oldness of the statistics by this value.
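
    To make the quoted rule concrete, here is a minimal PHP sketch of the threshold check. The function name stats_recalc_due is hypothetical, and this only restates the documentation sentence above; it is not Percona's actual implementation.

    ```php
    <?php
    // Hypothetical helper: restates the quoted rule, assuming "exceeds" means
    // strictly greater than. Not taken from Percona Server's source code.
    function stats_recalc_due($rows, $modified) {
        // Recalculation is due once the counter passes rows / 16 or 2000000000.
        return $modified > $rows / 16 || $modified > 2000000000;
    }

    var_dump(stats_recalc_due(1600, 100)); // 100 does not exceed 1600 / 16
    var_dump(stats_recalc_due(1600, 101)); // 101 exceeds 1600 / 16
    ```

    Under this reading, small tables hit the threshold after only a few modifications, which is why the counter alone is not a stable change marker.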

    So this counter wraps every once in a while, but you can build a checksum from the row count and the counter, so that the checksum changes with every modification of the table. E.g.:

    SELECT MD5(CONCAT(rows,'_',modified)) AS checksum FROM information_schema.innodb_table_stats WHERE table_schema='db' AND table_name='table';
    

    I was going to upgrade my servers to Percona Server anyway, so being tied to it is not an issue for me. Managing hundreds of triggers and adding fields to tables would be a major pain for this application, because it's very late in development.

    This is the PHP function I've come up with to make sure that tables can be checksummed whatever engine and server are used:

    function checksum_table($input_tables){
        if(!$input_tables) return false; // Sanity check
        $tables = (is_array($input_tables)) ? $input_tables : array($input_tables); // Make $tables always an array
        $where = "";
        $checksum = "";
        $found_tables = array();
        $tables_indexed = array();
        foreach($tables as $table_name){
            $tables_indexed[$table_name] = true; // Indexed array for faster searching
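            if(strstr($table_name,".")){ // If we are passing db.table_name
                $table_name_split = explode(".",$table_name,2);
                // Escape the names, since they are concatenated into the SQL string
                $where .= "(table_schema='".mysql_real_escape_string($table_name_split[0])."' AND table_name='".mysql_real_escape_string($table_name_split[1])."') OR ";
            }else{
                $where .= "(table_schema=DATABASE() AND table_name='".mysql_real_escape_string($table_name)."') OR ";
            }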
        }
        if($where != ""){ // Sanity check
            $where = substr($where,0,-4); // Remove the last "OR"
            $get_chksum = mysql_query("SELECT table_schema, table_name, rows, modified FROM information_schema.innodb_table_stats WHERE ".$where);
            while($row = mysql_fetch_assoc($get_chksum)){
                if(isset($tables_indexed[$row['table_name']])){ // Not entirely foolproof, but saves some queries like "SELECT DATABASE()" to find out the current database
                    $found_tables[$row['table_name']] = true;
                }elseif(isset($tables_indexed[$row['table_schema'].".".$row['table_name']])){
                    $found_tables[$row['table_schema'].".".$row['table_name']] = true;
                }
                $checksum .= "_".$row['rows']."_".$row['modified']."_";
            }
        }
    
        foreach($tables as $table_name){
            if(!isset($found_tables[$table_name])){ // Table was not found in information_schema.innodb_table_stats (probably not an InnoDB table, or not using Percona Server)
                $get_chksum = mysql_query("CHECKSUM TABLE ".$table_name); // Checksumming the old-fashioned way
                $chksum = mysql_fetch_assoc($get_chksum);
                $checksum .= "_".$chksum['Checksum']."_";
            }
        }
    
        $checksum = sprintf("%u",crc32($checksum)); // Using crc32 because it's faster than md5(). Formatted with %u as an unsigned string to avoid PHP's signed integer problems.
    
        return $checksum;
    }
    

    You can use it like this:

    // checksum a single table in the current db
    $checksum = checksum_table("test_table");
    
    // checksum a single table in a db other than the current one
    $checksum = checksum_table("other_db.test_table");
    
    // checksum multiple tables at once. It's faster when using Percona Server, because all tables are checksummed via one select.
    $checksum = checksum_table(array("test_table", "other_db.test_table"));
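
    To detect a change, the returned checksum still has to be compared with the one from the previous check. A minimal sketch of that comparison follows; the helper checksum_changed and the $last_seen array are hypothetical and not part of the function above, and in a real application the previous value would probably live in a cache or session rather than a local array.

    ```php
    <?php
    // Hypothetical helper: remembers the last checksum seen for each key
    // and reports whether the new checksum differs from it.
    function checksum_changed(array &$last_seen, $key, $checksum) {
        $changed = !isset($last_seen[$key]) || $last_seen[$key] !== $checksum;
        $last_seen[$key] = $checksum; // Remember it for the next comparison
        return $changed;
    }

    $last_seen = array();
    // In real use the third argument would come from checksum_table("test_table").
    var_dump(checksum_changed($last_seen, "test_table", "123")); // first sighting
    var_dump(checksum_changed($last_seen, "test_table", "123")); // same checksum
    var_dump(checksum_changed($last_seen, "test_table", "456")); // different checksum
    ```

    In this sketch a table seen for the first time counts as changed, so dependent data gets loaded at least once.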
    

    I hope this saves some trouble for other people with the same problem.
