Regular expression to remove comments from SQL statement

时光说笑 2020-12-15 12:51

I'm trying to come up with a regular expression to remove comments from an SQL statement.

This regex almost works:

(/\\*([^*]|[\\r\\n]|(\\*+([^*/]|         


        
8 Answers
  • 2020-12-15 12:57

    Originally, I used @Adrien Gibrat's solution. However, I ran into a case where it didn't parse quoted strings properly when they contained a '--'. I ended up writing this instead:

    '[^']*(?!\\)'(*SKIP)(*F)       # Make sure we're not matching inside of quotes
    |(?m-s:\s*(?:\-{2}|\#)[^\n]*$) # Single line comment
    |(?:
      \/\*.*?\*\/                  # Multi-line comment
      (?(?=(?m-s:\h+$))         # Get trailing whitespace if any exists and only if it's the rest of the line
        \h+
      )
    )
    
    # Modifiers used: 'xs' (PHP has no 'g' modifier; preg_replace() already replaces every match)
    

    Note that this requires PCRE, since it relies on the (*SKIP)(*F) backtracking verbs and on \h. In my case, I'm using a variation of it in my PHP library.


  • 2020-12-15 12:58

    This code works for me:

    function strip_sqlcomment ($string = '') {
        $RXSQLComments = '@(--[^\r\n]*)|(\#[^\r\n]*)|(/\*[\w\W]*?(?=\*/)\*/)@ms';
        return (($string == '') ?  '' : preg_replace( $RXSQLComments, '', $string ));
    }
    

    With a little tweaking of the regex, it could be used to strip comments in other languages as well.

  • 2020-12-15 12:58

    This removes /* */ and -- comments:

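    function unComment($sql) {
        // Pre-pass: drop everything from "--" to the end of the line.
        // Caveat: this also hits "--" inside quoted strings.
        $sql = preg_replace('/--[^\n]*/', '', $sql);

        // Keep quoted strings ($1); drop #, -- and (nested) /* */ comments.
        $sqlComments = '@(([\'"]).*?[^\\\]\2)|((?:\#|--).*?$|/\*(?:[^/*]|/(?!\*)|\*(?!/)|(?R))*\*\/)\s*|(?<=;)\s+@ms';
        $uncommentedSQL = trim(preg_replace($sqlComments, '$1', $sql));

        // Strip control characters and all bytes >= 0x80.
        // Caveat: this also removes newlines, tabs, and non-ASCII text.
        return preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $uncommentedSQL);
    }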
    
  • 2020-12-15 13:06

    Since you said the rest of your regex is fine, I focused on the last part. All you need to do is check that the -- is at the beginning of the line, and then make sure all the dashes are removed if there are more than two. The resulting regex is below:

    (^-{2,})
    

    The above only removes the comment dashes, not the whole line. If you also want to remove everything after them up to the end of the line, use the pattern below instead. In both cases, enable multiline mode so that ^ matches at the start of every line.

    (^--.*)
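
To illustrate why multiline mode matters for an anchored pattern like `^--.*`, here is a minimal sketch. It uses Python's `re` module rather than the PHP used elsewhere in this thread, and the sample statement is made up for the example. It also shows a limit of the anchored approach: a comment that trails a statement on the same line is left alone.

```python
import re

# Made-up sample input: two full-line comments and one trailing comment.
sql = "-- header\nSELECT 1; -- trailing\n-- footer"

# Without re.MULTILINE, ^ matches only at the very start of the string,
# so only the first comment line is removed.
first_only = re.sub(r"^--.*", "", sql)

# With re.MULTILINE, ^ matches at the start of every line.
# The trailing comment after "SELECT 1;" is not at a line start and stays.
every_line = re.sub(r"^--.*", "", sql, flags=re.MULTILINE)

print(repr(first_only))
print(repr(every_line))
```

If trailing comments also need to go, dropping the `^` anchor handles them, at the cost of also matching `--` inside quoted strings.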
    
  • 2020-12-15 13:06

    For all PHP folks: please use this library - https://github.com/jdorn/sql-formatter. I have been dealing with stripping comments from SQL for a couple of years now, and the only valid solution is a tokenizer/state machine, which I was too lazy to write. A couple of days ago I found this library, ran 120k queries through it, and found only one bug (https://github.com/jdorn/sql-formatter/issues/93), which we fixed immediately in our fork https://github.com/keboola/sql-formatter.

    Usage is simple:

    $query = <<<EOF
    /* 
      my comments 
    */
    SELECT 1;
    EOF;
    
    $bareQuery = \SqlFormatter::removeComments($query);
    // prints "SELECT 1;"
    print $bareQuery;
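
For readers curious what the tokenizer/state machine approach mentioned above looks like, here is a minimal sketch in Python. It is not the library's implementation. The function name `strip_sql_comments` is hypothetical, and the sketch assumes MySQL-style backslash escapes inside quotes, non-nested block comments, and no `#` comments.

```python
def strip_sql_comments(sql: str) -> str:
    """Remove -- and /* */ comments, leaving quoted text untouched."""
    out = []
    i, n = 0, len(sql)
    while i < n:
        ch = sql[i]
        if ch in ("'", '"', "`"):
            # Quoted string or identifier: copy through to the closing quote.
            j = i + 1
            while j < n:
                if sql[j] == "\\" and j + 1 < n:
                    j += 2  # backslash escape (assumption: MySQL-style)
                    continue
                if sql[j] == ch:
                    if j + 1 < n and sql[j + 1] == ch:
                        j += 2  # doubled quote is an escaped quote
                        continue
                    break
                j += 1
            out.append(sql[i:j + 1])
            i = j + 1
        elif sql.startswith("--", i):
            # Line comment: skip to the end of the line, keep the newline.
            j = sql.find("\n", i)
            i = n if j == -1 else j
        elif sql.startswith("/*", i):
            # Block comment: skip past the closing marker (no nesting).
            j = sql.find("*/", i + 2)
            i = n if j == -1 else j + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


query = "/*\n  my comments\n*/\nSELECT 1;"
print(strip_sql_comments(query).strip())  # SELECT 1;
```

Because the scanner always knows whether it is inside a quote or a comment, text such as `'--not a comment'` passes through unchanged, which is the case the pure-regex answers struggle with.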
    
  • 2020-12-15 13:16

    In PHP, I'm using this code to strip comments from SQL:

    $sqlComments = '@(([\'"]).*?[^\\\]\2)|((?:\#|--).*?$|/\*(?:[^/*]|/(?!\*)|\*(?!/)|(?R))*\*\/)\s*|(?<=;)\s+@ms';
    /* Commented version
    $sqlComments = '@
        (([\'"]).*?[^\\\]\2) # $1 : Skip single & double quoted expressions
        |(                   # $3 : Match comments
            (?:\#|--).*?$    # - Single line comments
            |                # - Multi line (nested) comments
             /\*             #   . comment open marker
                (?: [^/*]    #   . non comment-marker characters
                    |/(?!\*) #   . ! not a comment open
                    |\*(?!/) #   . ! not a comment close
                    |(?R)    #   . recursive case
                )*           #   . repeat eventually
            \*\/             #   . comment close marker
        )\s*                 # Trim after comments
        |(?<=;)\s+           # Trim after semi-colon
        @msx';
    */
    $uncommentedSQL = trim( preg_replace( $sqlComments, '$1', $sql ) );
    preg_match_all( $sqlComments, $sql, $comments );
    $extractedComments = array_filter( $comments[ 3 ] );
    var_dump( $uncommentedSQL, $extractedComments );
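
The core idea in the pattern above is to match quoted strings first and put them back, so that comment markers inside them are never seen as comments. A rough Python sketch of the same idea follows. It is not a port of the PHP code: Python's `re` has no `(?R)`, so nested block comments are not handled, whitespace after comments is not trimmed, and the names `SQL_COMMENTS` and `strip_comments` are made up for this example.

```python
import re
from typing import List, Tuple

SQL_COMMENTS = re.compile(
    r"""
    ( '(?:[^'\\]|\\.|'')*'      # group 1: single-quoted string, kept as-is
    | "(?:[^"\\]|\\.|"")*"      #          double-quoted string, kept as-is
    )
    | ( (?:\#|--)[^\n]*         # group 2: single-line comment
      | /\*.*?\*/               #          block comment (non-nested)
      )
    """,
    re.VERBOSE | re.DOTALL,
)


def strip_comments(sql: str) -> Tuple[str, List[str]]:
    """Return the SQL without comments, plus the comments that were removed."""
    comments: List[str] = []

    def keep_or_drop(match):
        if match.group(1) is not None:
            return match.group(1)  # quoted text goes back unchanged
        comments.append(match.group(2))
        return ""  # comments are dropped

    return SQL_COMMENTS.sub(keep_or_drop, sql).strip(), comments


cleaned, extracted = strip_comments("SELECT '--x' AS a -- c1\nFROM t /* c2 */;")
print(repr(cleaned))
print(extracted)
```

The replacement callback plays the role of `'$1'` in the `preg_replace()` call: group 1 is re-emitted, everything else is discarded.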
    