regex differentiating between ISBN-10 and ISBN-13

后悔当初 2020-12-11 00:52

I have an if-else statement that checks whether a string contains an ISBN-10 or an ISBN-13 (book ID).

The problem I am facing is with the ISBN-10 check which o

5 answers
  • 2020-12-11 01:07
    ISBN10_REGEX = /^(?:\d[ -]?){9}[\dX]$/i
    ISBN13_REGEX = /^(?:\d[ -]?){12}\d$/
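
    As a rough illustration of how patterns like these could be used from PHP, here is a minimal sketch. The function name classifyIsbn is hypothetical, and the character classes are written without the literal `|` (inside brackets, `|` is matched as a plain character rather than acting as alternation). It checks the format only, not the checksum.

    ```php
    <?php
    // Hypothetical helper, not part of the answer: classify a candidate string
    // as ISBN-10, ISBN-13, or neither, based on its format only.
    function classifyIsbn(string $candidate): ?int
    {
        // Nine digits, each optionally followed by a space or hyphen, then a digit or X.
        $isbn10 = '/^(?:\d[ -]?){9}[\dX]$/i';
        // Twelve digits with optional separators, then a final digit.
        $isbn13 = '/^(?:\d[ -]?){12}\d$/';

        if (preg_match($isbn10, $candidate)) {
            return 10;
        }
        if (preg_match($isbn13, $candidate)) {
            return 13;
        }
        return null; // Neither format matched
    }

    var_dump(classifyIsbn('0-306-40615-2'));     // int(10)
    var_dump(classifyIsbn('978-0-306-40615-7')); // int(13)
    var_dump(classifyIsbn('0306-4061'));         // NULL
    ```

    Because both patterns are anchored with ^ and $, the order of the two checks does not affect the result.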
    
  • 2020-12-11 01:22

    You really only need one regex for this. Then do a more efficient strlen() check to see which one was matched. The following will match ISBN-10 and ISBN-13 values within a string with or without hyphens, and optionally preceded by the string ISBN:, ISBN:(space) or ISBN(space).

    Finding ISBNs :

    function findIsbn($str)
    {
        $regex = '/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i';
    
        if (preg_match($regex, str_replace('-', '', $str), $matches)) {
            return (10 === strlen($matches[1]))
                ? 1   // ISBN-10
                : 2;  // ISBN-13
        }
        return false; // No valid ISBN found
    }
    
    var_dump(findIsbn('ISBN:0-306-40615-2'));     // return 1
    var_dump(findIsbn('0-306-40615-2'));          // return 1
    var_dump(findIsbn('ISBN:0306406152'));        // return 1
    var_dump(findIsbn('0306406152'));             // return 1
    var_dump(findIsbn('ISBN:979-1-090-63607-1')); // return 2
    var_dump(findIsbn('979-1-090-63607-1'));      // return 2
    var_dump(findIsbn('ISBN:9791090636071'));     // return 2
    var_dump(findIsbn('9791090636071'));          // return 2
    var_dump(findIsbn('ISBN:97811'));             // return false
    

    This will search a provided string to see if it contains a possible ISBN-10 value (returns 1) or an ISBN-13 value (returns 2). If it does not it will return false.



    Validating ISBNs :

    For strict validation the Wikipedia article for ISBN has some PHP validation functions for ISBN-10 and ISBN-13. Below are those examples copied, tidied up and modified to be used against a slightly modified version of the above function.

    Change the return block to this:

        return (10 === strlen($matches[1]))
            ? isValidIsbn10($matches[1])  // ISBN-10
            : isValidIsbn13($matches[1]); // ISBN-13
    

    Validate ISBN-10:

    function isValidIsbn10($isbn)
    {
        $check = 0;
    
        for ($i = 0; $i < 10; $i++) {
            if (9 === $i && 'x' === strtolower($isbn[$i])) {
                // 'X' stands for 10 and is only valid as the final check character
                $check += 10;
            } elseif (ctype_digit($isbn[$i])) {
                // Weights run from 10 (first digit) down to 1 (check digit)
                $check += (int)$isbn[$i] * (10 - $i);
            } else {
                return false;
            }
        }
    
        return (0 === ($check % 11)) ? 1 : false;
    }
    

    Validate ISBN-13:

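    function isValidIsbn13($isbn)
    {
        // An ISBN-13 consists of digits only; 'X' is valid only in ISBN-10
        if (!ctype_digit($isbn)) {
            return false;
        }
    
        $check = 0;
    
        // Digits at even indexes (including the check digit) have weight 1
        for ($i = 0; $i < 13; $i += 2) {
            $check += (int)$isbn[$i];
        }
    
        // Digits at odd indexes have weight 3
        for ($i = 1; $i < 12; $i += 2) {
            $check += 3 * (int)$isbn[$i];
        }
    
        return (0 === ($check % 10)) ? 2 : false;
    }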
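
    To make the arithmetic behind the two checks concrete, here is a minimal sketch that prints the weighted sums for two of the sample values used earlier in this answer. The helper names isbn10WeightedSum and isbn13WeightedSum are made up for illustration, and both assume the input has already been stripped of hyphens and matched against the format regex.

    ```php
    <?php
    // Weighted sum behind the ISBN-10 check: weights run from 10 down to 1,
    // and an X counts as 10. Assumes a well-formed 10-character string.
    function isbn10WeightedSum(string $isbn): int
    {
        $sum = 0;
        for ($i = 0; $i < 10; $i++) {
            $value = ('X' === strtoupper($isbn[$i])) ? 10 : (int)$isbn[$i];
            $sum += $value * (10 - $i);
        }
        return $sum;
    }

    // Weighted sum behind the ISBN-13 check: weights alternate 1, 3, 1, 3, ...
    // Assumes a well-formed string of 13 digits.
    function isbn13WeightedSum(string $isbn): int
    {
        $sum = 0;
        for ($i = 0; $i < 13; $i++) {
            $sum += (int)$isbn[$i] * ((0 === $i % 2) ? 1 : 3);
        }
        return $sum;
    }

    echo isbn10WeightedSum('0306406152'), "\n";    // 132, and 132 % 11 === 0
    echo isbn13WeightedSum('9791090636071'), "\n"; // 130, and 130 % 10 === 0
    ```

    An ISBN-10 passes when its sum is divisible by 11, and an ISBN-13 passes when its sum is divisible by 10.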
    


  • 2020-12-11 01:23

    Use ^ and $ to match the beginning and end of the string. With both anchors in place, the order in which you test for 10-digit or 13-digit codes no longer matters.

    10 digits

    /^ISBN:(\d{9}(?:\d|X))$/
    

    13 digits

    /^ISBN:(\d{13})$/
    

    Note: According to http://en.wikipedia.org/wiki/International_Standard_Book_Number, ISBNs can also contain hyphens. Based on the $str you're using, though, it looks like you've removed the hyphens before checking for 10 or 13 digits.

    Additional note: The last character of an ISBN is a check digit computed from the preceding digits (X stands for 10 and occurs only in ISBN-10), so regular expressions alone cannot confirm that an ISBN is valid. They can only check for the 10- or 13-digit format.


    $isbns = array(
      'ISBN:1234567890',       // 10-digit
      'ISBN:123456789X',       // 10-digit ending in X
      'ISBN:1234567890123',    // 13-digit
      'ISBN:123456789012X',    // invalid: X is not allowed in a 13-digit ISBN
      'ISBN:1234'              // invalid
    );
    
    function get_isbn($str) {
       // ISBN-10: nine digits followed by a digit or X
       if (preg_match('/^ISBN:(\d{9}(?:\d|X))$/', $str, $matches)) {
          echo "found 10-digit ISBN\n";
          return $matches[1];
       }
       // ISBN-13: exactly thirteen digits
       elseif (preg_match('/^ISBN:(\d{13})$/', $str, $matches)) {
          echo "found 13-digit ISBN\n";
          return $matches[1];
       }
       else {
          echo "invalid ISBN\n";
          return null;
       }
    }
    
    foreach ($isbns as $str) {
       $isbn = get_isbn($str);
       echo $isbn."\n\n";
    }
    

    Output

    found 10-digit ISBN
    1234567890
    
    found 10-digit ISBN
    123456789X
    
    found 13-digit ISBN
    1234567890123
    
    invalid ISBN
    
    
    invalid ISBN
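
    The point about the check digit can be illustrated with a short sketch: two strings that both satisfy the 10-digit format, only one of which passes the ISBN-10 mod-11 check. The function name passesIsbn10Checksum is hypothetical and is not part of the code above.

    ```php
    <?php
    // Hypothetical helper: true when a candidate has the ISBN-10 format
    // AND its weighted sum (weights 10 down to 1, X = 10) is divisible by 11.
    function passesIsbn10Checksum(string $isbn): bool
    {
        if (!preg_match('/^\d{9}[\dX]$/', $isbn)) {
            return false; // Wrong format, no need to compute the sum
        }

        $sum = 0;
        for ($i = 0; $i < 10; $i++) {
            $value = ('X' === $isbn[$i]) ? 10 : (int)$isbn[$i];
            $sum += $value * (10 - $i);
        }

        return 0 === $sum % 11;
    }

    var_dump(passesIsbn10Checksum('1234567890')); // bool(false): well-formed, but the sum is 210
    var_dump(passesIsbn10Checksum('123456789X')); // bool(true): the sum is 220
    ```

    Both inputs match the 10-digit regex, so only the arithmetic step tells them apart.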
    
  • 2020-12-11 01:24

    Put the ISBN-13 check before the ISBN-10 check? This assumes that you want to match them anywhere within a string (your example has an extra "ISBN:" at the start, so matching anywhere in the string seems to be a requirement of some sort).
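
    A small sketch of why the order matters when the patterns are not anchored. The two patterns and the sample string below are assumptions for illustration, since the question's own patterns are not shown here.

    ```php
    <?php
    // Unanchored patterns, assumed for illustration only.
    $isbn10 = '/\d{9}[\dX]/';
    $isbn13 = '/\d{13}/';
    $str = 'ISBN:9780306406157';

    // Testing ISBN-10 first: the pattern happily matches the first ten
    // digits of the 13-digit number, so the string is misclassified.
    preg_match($isbn10, $str, $m);
    echo $m[0], "\n"; // 9780306406

    // Testing ISBN-13 first finds the full number.
    preg_match($isbn13, $str, $m);
    echo $m[0], "\n"; // 9780306406157
    ```

    Checking the longer format first avoids the partial match. Anchoring both patterns makes the order irrelevant.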

  • 2020-12-11 01:26

    Switch the order of the if-else block, and strip all whitespace, colons, and hyphens from your ISBN:

    // Replace all the fluff that some companies add to ISBNs
    $str = preg_replace('/(\s+|:|-)/', '', $str);
    
    if (preg_match('/^ISBN(\d{13})$/', $str, $matches)) {
       echo "ISBN-13 FOUND\n";
       // $matches[1] holds the 13 digits, e.g. 9780113411436
       return 1;
    }
    
    else if (preg_match('/^ISBN(\d{9}[\dX])$/', $str, $matches)) {
       echo "ISBN-10 FOUND\n";
       // $matches[1] holds the 10 characters (the last one may be X)
       return 0;
    }
    