Matching math expressions with a regular expression?

旧时难觅i 2020-11-27 18:36

For example, these are valid math expressions:

a * b + c
-a * (b / 1.50)
(apple + (-0.5)) * (boy - 1)

And these are invalid math expressions:

--a *+ b @ 1.5.0
-a * b + 1)
a) * (b + c) / (d

8 answers
  • 2020-11-27 19:00

    You can't use regex to do things like balance parentheses.
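A classic regular expression has no counter, which is why balancing is out of reach for it, but a few lines of ordinary code can do the job. A minimal Python sketch (the function name `parens_balanced` is made up for illustration, and it checks only the parentheses, not the rest of the syntax):

```python
def parens_balanced(text):
    """Return True if every '(' in text has a matching ')' after it."""
    depth = 0  # number of currently open parentheses
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared before any '(' that could match it.
                return False
    # Anything still open at the end is unbalanced too.
    return depth == 0
```

This rejects inputs such as `a) * (b + c) / (d`, where the counts are equal but the order is wrong.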

  • 2020-11-27 19:01

    OK, here is my version of parenthesis finding in ActionScript 3. This approach gives you a lot of traction for analysing the part before the parentheses, the part inside them, and the part after them. If any parentheses remain at the end, you can raise a warning or refuse to pass the string to a final eval function.

    package {
    import flash.display.Sprite;
    import mx.utils.StringUtil;
    public class Stackoverflow_As3RegexpExample extends Sprite
    {
        private var tokenChain:String = "2+(3-4*(4/6))-9(82+-21)"
        //Constructor
        public function Stackoverflow_As3RegexpExample() {
            // The doubled "\\" below is only string escaping; use a single "\" if you test the pattern outside the Flash compiler.
            var getGroup:RegExp = new RegExp("((?:[^\\(\\)]+)?)   (?:\\()       (  (?:[^\\(\\)]+)? )    (?:\\))        ((?:[^\\(\\)]+)?)", "ix")   //removed g flag
            while (true) {
                tokenChain = replace(tokenChain,getGroup)
                if (tokenChain.search(getGroup) == -1) break; 
            }
            trace("cummulativeEvaluable="+cummulativeEvaluable)
        }
        private var cummulativeEvaluable:Array = new Array()
        protected function analyseGrammar(matchedSubstring:String, capturedMatch1:String, capturedMatch2:String,  capturedMatch3:String, index:int, str:String):String {
            trace("\nanalyseGrammar str:\t\t\t\t'"+str+"'")
            trace("analyseGrammar matchedSubstring:'"+matchedSubstring+"'")
            trace("analyseGrammar capturedMatchs:\t'"+capturedMatch1+"'  '("+capturedMatch2+")'   '"+capturedMatch3+"'")
            trace("analyseGrammar index:\t\t\t'"+index+"'") 
            var blank:String = buildBlank(matchedSubstring.length)
            cummulativeEvaluable.push(StringUtil.trim(matchedSubstring))
            // I could do so much right here!
            return str.substr(0,index)+blank+str.substr(index+matchedSubstring.length,str.length-1)
        }
        private function replace(str:String,regExp:RegExp):String {
            var result:Object = regExp.exec(str)
            if (result)
                return analyseGrammar.apply(null,objectToArray(result)) 
            return str
        }
        private function objectToArray(value:Object):Array {
            var array:Array = new Array()
            var i:int = 0
            while (true) {
                if (value.hasOwnProperty(i.toString())) {
                    array.push(value[i])
                } else {
                    break;
                }
                i++
            }
            array.push(value.index)
            array.push(value.input)
            return array
        }
        protected function buildBlank(length:uint):String {
            var blank:String = ""
            while (blank.length != length)
                blank = blank+" "
            return blank
        }
    }
    }

    It should trace this:

    analyseGrammar str:             '2+(3-4*(4/6))-9(82+-21)'
    analyseGrammar matchedSubstring:'3-4*(4/6)'
    analyseGrammar capturedMatchs:  '3-4*'  '(4/6)'   ''
    analyseGrammar index:           '3'
    
    analyseGrammar str:             '2+(         )-9(82+-21)'
    analyseGrammar matchedSubstring:'2+(         )-9'
    analyseGrammar capturedMatchs:  '2+'  '(         )'   '-9'
    analyseGrammar index:           '0'
    
    analyseGrammar str:             '               (82+-21)'
    analyseGrammar matchedSubstring:'               (82+-21)'
    analyseGrammar capturedMatchs:  '               '  '(82+-21)'   ''
    analyseGrammar index:           '0'
    cummulativeEvaluable=3-4*(4/6),2+(         )-9,(82+-21)
    
  • 2020-11-27 19:03

    This is tricky with a single regular expression, but quite easy with a mixed regexp/procedural approach. The idea is to construct a regexp for a simple expression (one without parentheses) and then repeatedly replace each ( simple-expression ) with some atomic string (e.g. an identifier). If the final reduced expression matches the same "simple" pattern, the original expression is considered valid.

    Illustration (in PHP):

    function check_syntax($str) {
    
        // define the grammar
        $number = "\d+(\.\d+)?";
        $ident  = "[a-z]\w*";
        $atom   = "[+-]?($number|$ident)";
        $op     = "[+*/-]";
        $sexpr  = "$atom($op$atom)*"; // simple expression
    
        // step1. remove whitespace
        $str = preg_replace('~\s+~', '', $str);
    
        // step2. repeatedly replace parenthetic expressions with 'x'
        $par = "~\($sexpr\)~";
        while(preg_match($par, $str))
            $str = preg_replace($par, 'x', $str);
    
        // step3. no more parens, the string must be simple expression
        return preg_match("~^$sexpr$~", $str);
    }
    
    
    $tests = array(
        "a * b + c",
        "-a * (b / 1.50)",
        "(apple + (-0.5)) * (boy - 1)",
        "--a *+ b @ 1.5.0",
        "-a * b + 1)",
        "a) * (b + c) / (d",
    );
    
    foreach($tests as $t)
        echo $t, "=", check_syntax($t) ? "ok" : "nope", "\n";
    

    The above only validates the syntax, but the same technique can also be used to construct a real parser.
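For readers without PHP at hand, here is a rough Python port of the same reduce-and-match idea. It is a sketch rather than the answer author's code: the constant and function names are my own, and it uses `fullmatch` where the PHP version uses `^...$` anchors.

```python
import re

# Grammar pieces, mirroring the PHP variables in the answer.
NUMBER = r"\d+(?:\.\d+)?"
IDENT = r"[a-z]\w*"
ATOM = rf"[+-]?(?:{NUMBER}|{IDENT})"
OP = r"[+*/-]"
SEXPR = rf"{ATOM}(?:{OP}{ATOM})*"  # simple expression, no parentheses

PAREN = re.compile(rf"\({SEXPR}\)")
SIMPLE = re.compile(SEXPR)


def check_syntax(text):
    """Return True if text reduces to a simple expression."""
    # Step 1: remove whitespace.
    text = re.sub(r"\s+", "", text)
    # Step 2: repeatedly replace innermost "(simple expression)" with 'x'.
    while True:
        text, count = PAREN.subn("x", text)
        if count == 0:
            break
    # Step 3: no reducible parens are left; the rest must be simple.
    return SIMPLE.fullmatch(text) is not None
```

Run against the six test strings from the PHP snippet, the first three come out valid and the last three invalid.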

  • 2020-11-27 19:10

    Matching parens with a regex is quite possible.

    Here is a Perl script that parses arbitrarily deep matching parens. It throws out non-matching parens outside the balanced groups, but I did not design it specifically to validate parens. It will parse arbitrarily deep parens so long as they are balanced, which should be enough to get you started.

    The key is recursion, both in the regex and in how it is used. Play with it, and I am sure you can get it to flag non-matching parens as well. I think that if you capture what this regex throws away and count the parens in it (i.e. test for odd parens in the non-matching text), you have found invalid, unbalanced parens.

    #!/usr/bin/perl
    $re = qr  /
         (                      # start capture buffer 1
            \(                  #   match an opening paren
            (                   # capture buffer 2
            (?:                 #   match one of:
                (?>             #     don't backtrack over the inside of this group
                    [^()]+      #       one or more non-paren characters
                )               #     end non backtracking group
            |                   #     ... or ...
                (?1)            #     recurse to opening 1 and try it again
            )*                  #   0 or more times.
            )                   # end of buffer 2
            \)                  #   match a closing paren
         )                      # end capture buffer one
        /x;
    
    
    sub strip {
        my ($str) = @_;
        while ($str=~/$re/g) {
            $match=$1; $striped=$2;
            print "$match\n";
            strip($striped) if $striped=~/\(/;
            return $striped;
        }
    }
    
    while(<DATA>) {
        print "start pattern: $_";
        while (/$re/g) { 
            strip($1) ;
        }
    }   
    
    __DATA__
    "(apple + (-0.5)) * (boy - 1)"
    "((((one)two)three)four)x(one(two(three(four))))"
    "a) * (b + c) / (d"
    "-a * (b / 1.50)"
    

    Output:

    start pattern: "(apple + (-0.5)) * (boy - 1)"
    (apple + (-0.5))
    (-0.5)
    (boy - 1)
    start pattern: "((((one)two)three)four)x(one(two(three(four))))"
    ((((one)two)three)four)
    (((one)two)three)
    ((one)two)
    (one)
    (one(two(three(four))))
    (two(three(four)))
    (three(four))
    (four)
    start pattern: "a) * (b + c) / (d"
    (b + c)
    start pattern: "-a * (b / 1.50)"
    (b / 1.50)
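Python's built-in `re` module has no recursion construct like `(?1)`, so the sketch below swaps in a different technique: an explicit stack of open-paren positions. It is meant to produce the same balanced groups, in the same outer-first order, as the output above. The name `balanced_groups` is hypothetical.

```python
def balanced_groups(text):
    """Return every balanced parenthesised group in text, outermost first."""
    open_positions = []  # stack: indexes of '(' that are not closed yet
    spans = []           # (start, end) slice bounds of each balanced group
    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            # Pair this ')' with the most recent unclosed '('.
            spans.append((open_positions.pop(), i + 1))
        # A ')' with nothing open is a stray paren and is skipped.
    # Sorting by start index lists each outer group before its inner ones.
    spans.sort()
    return [text[start:end] for start, end in spans]
```

As with the Perl script, unmatched parens are silently dropped, so `a) * (b + c) / (d` yields only `(b + c)`. Anything left in `open_positions` at the end, or any skipped `)`, would be the signal for unbalanced input.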
    
  • 2020-11-27 19:10

    Regular expressions can only be used to recognize regular languages. The language of mathematical expressions is not regular; you'll need to implement an actual parser (e.g. LR) in order to do this.

  • 2020-11-27 19:10

    I believe you will be better off implementing a real parser to accomplish what you're after.

    A parser for simple mathematical expressions is "Parsing 101", and there are several examples to be found online.

    Some examples include:

    • ANTLR: Expression Evaluator Sample (ANTLR grammars can target several languages)
    • pyparsing: http://pyparsing.wikispaces.com/file/view/fourFn.py (pyparsing is a Python library)
    • Lex & Yacc: http://epaperpress.com/lexandyacc/ (contains a PDF tutorial and sample code for a calculator)

    Note that the grammar you will need for validating expressions is simpler than the examples above, since the examples also implement evaluation of the expression.
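To give an idea of how small a validation-only parser can be, here is a hand-written recursive-descent sketch in Python. The grammar is my own assumption based on the examples in the question (numbers, identifiers, an optional leading sign, the four operators `+ - * /`, and parentheses), and the function name is made up. Since nothing is evaluated, operator precedence can be ignored.

```python
import re

# A token is a number, an identifier, or any other single non-space character.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|\S")
OPERAND_RE = re.compile(r"\d+(?:\.\d+)?|[A-Za-z_]\w*")


def is_valid_expression(text):
    """Validate text against: expr := operand (op operand)*
    operand := [+-]? (number | identifier | '(' expr ')')"""
    tokens = TOKEN_RE.findall(text)
    pos = 0  # index of the next unread token

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_operand():
        nonlocal pos
        if peek() in ("+", "-"):  # optional sign
            pos += 1
        tok = peek()
        if tok == "(":
            pos += 1
            if not parse_expr() or peek() != ")":
                return False
            pos += 1  # consume the closing paren
            return True
        if tok is not None and OPERAND_RE.fullmatch(tok):
            pos += 1
            return True
        return False

    def parse_expr():
        nonlocal pos
        if not parse_operand():
            return False
        while peek() in ("+", "-", "*", "/"):
            pos += 1
            if not parse_operand():
                return False
        return True

    # Valid only if one expression consumes every token.
    return parse_expr() and pos == len(tokens)
```

A stray `)` or a character such as `@` simply ends up as a leftover token, so the final `pos == len(tokens)` check rejects the input.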
