What happens when I set the same variable to the same regex value in multiple statements?

青春惊慌失措 2020-12-20 05:29

Let's say I do this:

re = /cat/;
re = /cat/;

From reading Zakas' book about JavaScript, it seems that when executing the second line, no

3 answers
  • 2020-12-20 06:21

    I'm not familiar with the book, but this is how it works as far as I understand it.

    The var statement declares new variables in the enclosing function (or global) scope. They have no fixed type and hold undefined until something is assigned.

    var re;
    var i;
    

    or

    var re,i
    

    The null literal evaluates to the special value null, which exists independently of any variable.

    null
    

    Initializing a variable in a var statement just makes it point to that value; the variable does not become the value. They are separate things that share a relationship.

    var re=null,i;  
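
    As a small sketch of that relationship (the names alias, sharedState and stillRegExp are made up for illustration), two variables can point at the same RegExp object, and reassigning one of them leaves the object itself untouched:

    ```javascript
    var re = /cat/g;      // re points to a RegExp object
    var alias = re;       // alias points to the same object, not a copy

    alias.lastIndex = 2;  // change state through one variable...
    var sharedState = re.lastIndex;  // ...and it is visible through the other

    re = null;            // re now points to null; the object still exists
    var stillRegExp = alias instanceof RegExp;

    console.log(sharedState, stillRegExp);
    ```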
    

    A regex literal creates a new RegExp object, which we may or may not assign to a variable.

    /cat/g  
    

    or

    re=/cat/g
    

    When I reproduce your example in Firefox 52, it returns true only once and never returns false. But if I assign the return value of test() to another variable and log it, I get true ten times.

    var re = null, i;

    for (i = 0; i < 10; i++) {
        re = /cat/g;
        var x = re.test('catastrophe');
        console.log(x);
    }
    // logs true ten times
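
    For contrast, here is a rough sketch of what happens when a single RegExp object with the g flag is reused across iterations instead of being recreated. The names shared and results are only illustrative. Because test() resumes from lastIndex, the results alternate:

    ```javascript
    // One RegExp object, created once and reused on every iteration.
    var shared = /cat/g;
    var results = [];

    for (var i = 0; i < 4; i++) {
        // With the g flag, test() starts searching at shared.lastIndex
        // and resets lastIndex to 0 when it fails to find a match.
        results.push(shared.test('catastrophe'));
    }

    console.log(results); // [true, false, true, false]
    ```

    Seeing false on every other call is therefore a sign that the same object is being shared between calls.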
    

    I think that Zakas is describing an eccentricity found in some browsers due to their implementation of JavaScript. A regex literal, like any other expression, should create a new object every time, but there are many implementations called JavaScript, and a lot of them reuse objects as often as possible, which occasionally leads to strange behaviour that is eventually fixed.

    I hope that helps

  • 2020-12-20 06:25

    Either the author is mistaken or JavaScript has changed significantly since the book was written, because that's not how it works now. See "How often does JavaScript recompile regex literals in functions?" for a number of answers that go into detail about this.

    I suspect the author may have confused regexp compilation with RegExp objects. When the compiler sees a regexp literal, it can compile it once. Then it generates code that runs each time through the loop to create a new object that uses that compiled regexp to perform the matching. But each RegExp object has its own state.
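
    A minimal sketch of that last point, assuming an ES5+ engine (makeRe, a and b are hypothetical names): the same literal evaluated twice yields two objects whose lastIndex values move independently, whatever the engine does about compilation internally.

    ```javascript
    function makeRe() {
        return /a/g; // the same literal, evaluated on every call
    }

    var a = makeRe();
    var b = makeRe();

    a.test('aaa'); // matches at index 0 and advances a.lastIndex to 1

    console.log(a === b);                  // false: two separate objects
    console.log(a.lastIndex, b.lastIndex); // 1 0
    ```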

    Notice that he says he's describing ECMAScript 3. That's a very old edition of ECMAScript, originally published in 1999. ECMAScript 5 is from 2009 (ES4 was abandoned during development), and that's what most browsers have implemented for several years, with ES6 adoption being phased in during the past couple of years. Maybe ES3 behaved the way he describes, but more recent editions don't.

  • 2020-12-20 06:31

    In modern JavaScript (ES5+), a RegExp literal is specified to return a new instance each time it is evaluated. In ES3, a distinct RegExp object is created at parse time for each literal in the source (including literals with the same content), and each “physical” literal always evaluates to the same instance.

    So, in both ES5 and ES3, the following code will assign distinct RegExp instances to re:

    re = /cat/;
    re = /cat/;
    

    However, if these lines are executed multiple times, ES3 will assign the same RegExp object on each line. In ES3, there will be exactly two instances of RegExp. The latter instance will always be assigned to re after executing those two lines. If you copied re to another variable in the meantime, you will see that re === savedCopy.

    In ES5, each execution will produce new instances. So each time those lines run, a new RegExp object will be created for the first line and then another new RegExp object will be created and saved to the re variable for the second line. If you copied re to another variable in the meantime, you will see that re !== savedCopy.
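
    A quick way to check this in a current engine is sketched below. The function name assignTwice is made up for illustration, and the result described in the comment assumes ES5+ semantics:

    ```javascript
    var re;

    // Runs the two assignments from the question each time it is called.
    function assignTwice() {
        re = /cat/;
        re = /cat/;
        return re;
    }

    var savedCopy = assignTwice(); // keep the object from the first run
    assignTwice();                 // run the same two lines again

    // Under ES5+ semantics the second run created fresh objects,
    // so re no longer refers to the saved one.
    console.log(re !== savedCopy);
    ```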

    Specs

    ECMAScript 3rd Edition (ECMA-262) § 7.8.5 (p. 20) states the following (emphasis added on pertinent text):

    7.8.5 Regular Expression Literals

    A regular expression literal is an input element that is converted to a RegExp object (section 15.10) when it is scanned. The object is created before evaluation of the containing program or function begins. Evaluation of the literal produces a reference to that object; it does not create a new object. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp (section 15.10.4) or calling the RegExp constructor as a function (section 15.10.3).

    ECMAScript 5.1 (ECMA-262) § 7.8.5 states the following (emphasis added on pertinent text):

    7.8.5 Regular Expression Literals

    A regular expression literal is an input element that is converted to a RegExp object (see 15.10) each time the literal is evaluated. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp (see 15.10.4) or calling the RegExp constructor as a function (15.10.3).

    This means that the behavior is specified differently between ES3 and ES5.1. Consider this code:

    function getRegExp() {
        return /a/;
    }
    console.log(getRegExp() === getRegExp());
    

    In ES3, that particular /a/ will always refer to the same RegExp instance and the log will output true, because the RegExp is instantiated once “when it is scanned”. In ES5.1, every evaluation of /a/ results in a new RegExp instance, so the log will output false, because the spec says that the literal is “converted to a RegExp object (see 15.10) each time the literal is evaluated”.

    Now consider this expression: /a/ !== /a/. In both ES3 and ES5, this expression will always evaluate to true because each distinct literal gets a distinct RegExp object. In ES5 this happens because each evaluation of a literal always results in a new object instance. In ES3 this happens because the spec says “Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical.”

    This change in behavior is documented as an incompatibility with ECMAScript 3rd Edition in ECMAScript 5.1 (ECMA-262) Annex E:

    Regular expression literals now return a unique object each time the literal is evaluated. This change is detectable by any programs that test the object identity of such literal values or that are sensitive to the shared side effects.

    Old code may have been written to rely on the ES3 behavior. With it, a function could be called multiple times to incrementally walk through the matches in a string when the expression carried the g flag. This is similar to how the non-reentrant strtok() function works in C. To get the same effect in ES5, you must store the RegExp instance in a variable yourself and ensure that the variable has a long enough lifetime, since ES5 effectively gives you behavior like the reentrant strtok_r() function instead.
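
    One possible way to do that in ES5+ is sketched below. The helper makeMatcher and the sample input are hypothetical, not something taken from the spec or the book. A closure keeps one RegExp instance alive so that successive calls continue from its lastIndex:

    ```javascript
    // Hypothetical helper: returns a function that yields the next match
    // on each call, or null once the matches are exhausted.
    function makeMatcher(pattern, text) {
        var re = new RegExp(pattern, 'g'); // one instance, kept alive by the closure
        return function next() {
            var m = re.exec(text);         // resumes from re.lastIndex
            return m === null ? null : m[0];
        };
    }

    var nextWord = makeMatcher('\\w+', 'one two three');
    var first = nextWord();
    var second = nextWord();
    var third = nextWord();
    var done = nextWord();

    console.log(first, second, third, done); // one two three null
    ```

    Each call to makeMatcher gets its own RegExp, so two walkers over different strings do not interfere with each other, which is the strtok_r()-like property described above.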

    Optimization Bugs

    Some JavaScript implementations reportedly have bugs in which cached RegExp objects cause observable side effects that should be impossible. The observed behavior does not necessarily adhere to either the ES3 or the ES5 specification. An example for Mozilla is given at the end of this post with the spoiler text and an explanation that the bug is not observable when debugging but is observable when the JavaScript runs in non-debug, optimized mode. The blog author wrote a comment saying the bug was still reproducible in stable Firefox as of 2017-03-08.
