Natural Sort in MySQL

南旧 2020-11-22 02:25

Is there an elegant way to have performant, natural sorting in a MySQL database?

For example, if I have this data set:

  • Final Fantasy
  • Final Fant
21 Answers
  • 2020-11-22 03:10

    The same function as posted by @plalx, but rewritten for MySQL:

    DROP FUNCTION IF EXISTS `udf_FirstNumberPos`;
    DELIMITER ;;
    CREATE FUNCTION `udf_FirstNumberPos` (`instring` varchar(4000)) 
    RETURNS int
    LANGUAGE SQL
    DETERMINISTIC
    NO SQL
    SQL SECURITY INVOKER
    BEGIN
        DECLARE position int;
        DECLARE tmp_position int;
        -- 5000 is a sentinel larger than the 4000-character input limit.
        SET position = 5000;
        SET tmp_position = LOCATE('0', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF; 
        SET tmp_position = LOCATE('1', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('2', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('3', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('4', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('5', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('6', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('7', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('8', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
        SET tmp_position = LOCATE('9', instring); IF (tmp_position > 0 AND tmp_position < position) THEN SET position = tmp_position; END IF;
    
        IF (position = 5000) THEN RETURN 0; END IF;
        RETURN position;
    END
    ;;
    DELIMITER ;
    
    DROP FUNCTION IF EXISTS `udf_NaturalSortFormat`;
    DELIMITER ;;
    CREATE FUNCTION `udf_NaturalSortFormat` (`instring` varchar(4000), `numberLength` int, `sameOrderChars` char(50)) 
    RETURNS varchar(4000)
    LANGUAGE SQL
    DETERMINISTIC
    NO SQL
    SQL SECURITY INVOKER
    BEGIN
        DECLARE sortString varchar(4000);
        DECLARE numStartIndex int;
        DECLARE numEndIndex int;
        DECLARE padLength int;
        DECLARE totalPadLength int;
        DECLARE i int;
        DECLARE sameOrderCharsLen int;
    
        SET totalPadLength = 0;
        SET instring = TRIM(instring);
        SET sortString = instring;
        SET numStartIndex = udf_FirstNumberPos(instring);
        SET numEndIndex = 0;
        SET i = 1;
        SET sameOrderCharsLen = CHAR_LENGTH(sameOrderChars);
    
        WHILE (i <= sameOrderCharsLen) DO
            SET sortString = REPLACE(sortString, SUBSTRING(sameOrderChars, i, 1), ' ');
            SET i = i + 1;
        END WHILE;
    
        WHILE (numStartIndex <> 0) DO
            SET numStartIndex = numStartIndex + numEndIndex;
            SET numEndIndex = numStartIndex;
    
            WHILE (udf_FirstNumberPos(SUBSTRING(instring, numEndIndex, 1)) = 1) DO
                SET numEndIndex = numEndIndex + 1;
            END WHILE;
    
            SET numEndIndex = numEndIndex - 1;
    
            SET padLength = numberLength - (numEndIndex + 1 - numStartIndex);
    
            IF padLength < 0 THEN
                SET padLength = 0;
            END IF;
    
            SET sortString = INSERT(sortString, numStartIndex + totalPadLength, 0, REPEAT('0', padLength));
    
            SET totalPadLength = totalPadLength + padLength;
            SET numStartIndex = udf_FirstNumberPos(RIGHT(instring, CHAR_LENGTH(instring) - numEndIndex));
        END WHILE;
    
        RETURN sortString;
    END
    ;;
    DELIMITER ;
    

    Usage:

    SELECT name FROM products ORDER BY udf_NaturalSortFormat(name, 10, '.');
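
    To see what the sort key looks like, here is a minimal sketch. It assumes both functions above have already been created; the table `natsort_demo` and its rows are hypothetical and only serve as an illustration.

    ```sql
    -- Assumes udf_FirstNumberPos and udf_NaturalSortFormat already exist.
    -- natsort_demo and its rows are hypothetical, for illustration only.
    CREATE TEMPORARY TABLE natsort_demo (name VARCHAR(100));

    INSERT INTO natsort_demo (name) VALUES
        ('Final Fantasy 10'),
        ('Final Fantasy 2'),
        ('Final Fantasy 1');

    -- Position of the first digit; 0 when the string contains no digit.
    SELECT udf_FirstNumberPos('Final Fantasy 10');  -- 15
    SELECT udf_FirstNumberPos('Final Fantasy');     -- 0

    -- Each digit run is left-padded with zeros to 10 digits, so a plain
    -- string comparison orders 1, 2, 10 instead of 1, 10, 2.
    SELECT name, udf_NaturalSortFormat(name, 10, '.') AS sort_key
    FROM natsort_demo
    ORDER BY sort_key;
    ```

    Because the key is computed per row at query time, an index on `name` probably cannot be used for this ORDER BY; storing the key in its own column may be worth considering for large tables.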
    
  • 2020-11-22 03:11

    If you do not want to reinvent the wheel or fight with a lot of code that does not work, just use Drupal Natural Sort ... Just run the SQL that comes zipped (MySQL or PostgreSQL), and that's it. When making a query, simply order using:

    ... ORDER BY natsort_canon(column_name, 'natural')
    
  • 2020-11-22 03:14

    MySQL doesn't support this sort of "natural sorting" natively, so it looks like the best way to get what you're after is to split your data set up as you've described above (separate id field, etc.) or, failing that, to sort on an indexed, non-title column in your db (date, inserted id, etc.).

    Having the db do the sorting for you is almost always going to be quicker than reading large data sets into your programming language of choice and sorting them there. If you have any control at all over the db schema, look at adding easily sorted fields as described above; it will save you a lot of hassle and maintenance in the long run.
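
    As a rough sketch of the "easily sorted fields" idea: the table `games_demo` and the columns `series` and `installment` below are hypothetical names chosen for illustration, not something from the question.

    ```sql
    -- Hypothetical schema: the numeric part of the title is stored
    -- separately as a real number so it sorts numerically.
    CREATE TABLE games_demo (
        id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name        VARCHAR(100) NOT NULL,   -- full display title
        series      VARCHAR(100) NOT NULL,   -- textual part of the title
        installment INT UNSIGNED NOT NULL,   -- numeric part of the title
        KEY idx_series_installment (series, installment)
    );

    INSERT INTO games_demo (name, series, installment) VALUES
        ('Final Fantasy 10', 'Final Fantasy', 10),
        ('Final Fantasy 2',  'Final Fantasy', 2),
        ('Final Fantasy 1',  'Final Fantasy', 1);

    -- Orders the rows 1, 2, 10; the composite index gives the optimizer
    -- a chance to avoid sorting the rows at query time.
    SELECT name FROM games_demo ORDER BY series, installment;
    ```

    The cost is that the application has to fill in `series` and `installment` correctly whenever a row is written.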

    Requests to add a "natural sort" come up from time to time on the MySQL bugs and discussion forums, and many solutions revolve around stripping out specific parts of your data and casting them for the ORDER BY part of the query, e.g.

    SELECT * FROM `table` ORDER BY CAST(MID(name, 6, LENGTH(name) - 5) AS UNSIGNED);
    

    This sort of solution could just about be made to work on your Final Fantasy example above, but it isn't particularly flexible and is unlikely to extend cleanly to a dataset including, say, "Warhammer 40,000" and "James Bond 007", I'm afraid.
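
    A minimal sketch of the casting approach, assuming every name starts with the same fixed-length prefix. The table `cast_demo` and its rows are hypothetical:

    ```sql
    -- Hypothetical table; every name starts with the 5-character prefix 'Item '.
    CREATE TEMPORARY TABLE cast_demo (name VARCHAR(100));

    INSERT INTO cast_demo (name) VALUES ('Item 10'), ('Item 2'), ('Item 1');

    -- MID(name, 6) returns everything from the 6th character onward,
    -- i.e. the numeric suffix, which CAST turns into a number for sorting.
    SELECT name
    FROM cast_demo
    ORDER BY CAST(MID(name, 6) AS UNSIGNED);
    ```

    The hard-coded offset is what makes this brittle: it only works while the prefix length is the same for every row.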
