--- 
author: 
  email: ash@firemirror.com
  keyid: bfc7465ebdca5337
  name: Ash Berlin
categories: []

comments: []

date: 2007-04-21T15:29:45Z
guid: 5D71B4D0-F01A-11DB-ACBE-0620203E369B
modified: 2007-04-21T15:29:45Z
raw: "-----BEGIN PGP SIGNED MESSAGE-----\nHash: SHA1\n\nSo in my (perhaps insane) quest to create an implementation of Andy Wardley's L<Template Toolkit|http://www.template-toolkit.org> in JavaScript, I discovered something rather irratating about Spidermonkey's implementation of regular expressions. (Spidermonkey is the name of the Mozilla JavaScript engine.)\n\nPut simply, if you use the C</foobar/g> construct to createa RegExp object, that object will be a singleton. Not with me? Well then consider the following example\n\n lang:JavaScript\n var reB, reA;\n \n function testa(str) {\n   // Make sure we are cally ourselves twice at most.\n   if (this.inA == 2)\n     return;\n   else if (this.inA)\n     this.inA++;\n   else\n     this.inA = 1;\n\n   var re = /(abc)/g;\n \n   if (reA) {\n     print(reA == re);\n   } else {\n     reA = re;\n   }\n   print('last index of re is ' + re.lastIndex);\n   if (re.lastIndex)\n     throw new Error('re has already been executed!');\n \n   var m = re.exec(str);\n \n   print(m);\n \n   testa(m[1]);\n }\n\nSo what the above block does is create a RegExp using the aforementioned C<//g> construct, matches and cpatures a string, then calls the same function again with that match.\n\nThe output from that? Not quite what you'd expect....\n\n lang:0\n js> testa('abcabc');\n last index of re is 0\n abc,abc\n true\n last index of re is 3\n re-test.js:21: re has already been executed!\n\nWhat's this telling us? RegExp instances have a property of lastIndex which says where the last global match finished.  If you examine the output, you can see that the second time the function is called, C<re.lastIndex> is 3. How bizzare!\n \n lang:JavaScript\n function testb(str) {\n   // Make sure we only call ourselves twice at most. 
\n   if (this.inB == 2)\n     return;\n   else if (this.inB)\n     this.inB++;\n   else\n     this.inB = 1;\n   var re = new RegExp('(abc)', 'g');\n   \n   if (reB) {\n     print(reB === re);\n   } else {\n     reB = re;\n   }\n \n   print('last index of re is ' + re.lastIndex);\n   if (re.lastIndex)\n     throw new Error('re has already been executed!');\n \n   var m = re.exec(str);\n \n   print(m);\n\n   testb(m[1]);\n }\n\nAnd the output from C<testb>?\n\n lang:0\n last index of re is 0\n abc,abc\n false\n last index of re is 0\n abc,abc\n\nSo that works. Let this be a lesson to you. If you need to a regexp with the 'g' (global) flag on a regular expression in JavaScript, make sure you create it with the C<new RegExp(pattern, 'g')>, not C</patter/g>, or else it will cause you problems.\n-----BEGIN PGP SIGNATURE-----\nVersion: GnuPG v1.4.5 (Darwin)\n\niD8DBQFGKiy0v8dGXr3KUzcRAqiuAJ0dk/V+B7vHBSn0KTCYsDkSiHiz7QCfVJZu\nfW7vGBwscV0Y5GU6HTJkOeU=\n=NhZg\n-----END PGP SIGNATURE-----\n"
signed: 1
summary: " So in my (perhaps insane) quest to create an …"
tags: 
  - 
    javascript: 0
text: "    So in my (perhaps insane) quest to create an implementation of\n    Andy Wardley's Template Toolkit in JavaScript, I discovered some-\n    thing rather irritating about Spidermonkey's implementation of\n    regular expressions. (Spidermonkey is the name of the Mozilla\n    JavaScript engine.)\n\n    Put simply, if you use the /foobar/g construct to create a RegExp\n    object, that object will be a singleton. Not with me? Well then\n    consider the following example:\n\n     lang:JavaScript\n     var reB, reA;\n\n     function testa(str) {\n       // Make sure we are calling ourselves twice at most.\n       if (this.inA == 2)\n         return;\n       else if (this.inA)\n         this.inA++;\n       else\n         this.inA = 1;\n\n       var re = /(abc)/g;\n\n       if (reA) {\n         print(reA == re);\n       } else {\n         reA = re;\n       }\n       print('last index of re is ' + re.lastIndex);\n       if (re.lastIndex)\n         throw new Error('re has already been executed!');\n\n       var m = re.exec(str);\n\n       print(m);\n\n       testa(m[1]);\n     }\n\n    So what the above block does is create a RegExp using the\n    aforementioned //g construct, match and capture a string, then call\n    the same function again with that match.\n\n    The output from that? Not quite what you'd expect...\n\n     lang:0\n     js> testa('abcabc');\n     last index of re is 0\n     abc,abc\n     true\n     last index of re is 3\n     re-test.js:21: re has already been executed!\n\n    What's this telling us? RegExp instances have a lastIndex property,\n    which records where the last global match finished. If you examine\n    the output, you can see that the second time the function is called,\n    re.lastIndex is already 3, because it is the very same object that\n    ran the first match. How bizarre!\n\n     lang:JavaScript\n     function testb(str) {\n       // Make sure we only call ourselves twice at most.\n       if (this.inB == 2)\n         return;\n       else if (this.inB)\n         this.inB++;\n       else\n         this.inB = 1;\n       var re = new RegExp('(abc)', 'g');\n\n       if (reB) {\n         print(reB === re);\n       } else {\n         reB = re;\n       }\n\n       print('last index of re is ' + re.lastIndex);\n       if (re.lastIndex)\n         throw new Error('re has already been executed!');\n\n       var m = re.exec(str);\n\n       print(m);\n\n       testb(m[1]);\n     }\n\n    And the output from testb?\n\n     lang:0\n     last index of re is 0\n     abc,abc\n     false\n     last index of re is 0\n     abc,abc\n\n    So that works. Let this be a lesson to you. If you need a regular\n    expression with the 'g' (global) flag in JavaScript, create it with\n    new RegExp(pattern, 'g') rather than /pattern/g; otherwise the shared\n    object's lastIndex carries over between calls and will cause you\n    problems.\n"
title: Spidermonkey RegExps are Singletons
type: pod
uri: http://perlitist.com/articles/spidermonkey-regexps-are-singletons
xhtml: "<div class=\"pod\">\n<p>So in my (perhaps insane) quest to create an implementation of Andy Wardley's <a href=\"http://www.template-toolkit.org\">Template Toolkit</a> in JavaScript, I discovered something rather irritating about Spidermonkey's implementation of regular expressions. (Spidermonkey is the name of the Mozilla JavaScript engine.)</p>\n<p>Put simply, if you use the <code>/foobar/g</code> construct to create a RegExp object, that object will be a singleton. Not with me? Well then consider the following example:</p>\n<pre><span class=\"Keyword\">var</span><span class=\"Normal\"> reB, reA;</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Keyword\">function</span><span class=\"Normal\"> testa(str) {</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Comment\">// Make sure we are calling ourselves twice at most.</span><span class=\"Comment\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (this.inA == </span><span class=\"Float\">2</span><span class=\"Normal\">)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    </span><span class=\"Keyword\">return</span><span class=\"Normal\">;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">else</span><span class=\"Normal\"> </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (this.inA)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    this.inA++;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">else</span><span class=\"Normal\">\n</span><span class=\"Normal\">    this.inA = </span><span class=\"Float\">1</span><span class=\"Normal\">;</span>\n\n<span class=\"Normal\">  </span><span class=\"Keyword\">var</span><span class=\"Normal\"> re = /(abc)/g;</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span 
class=\"Keyword\">if</span><span class=\"Normal\"> (reA) {</span><span class=\"Normal\">\n</span><span class=\"Normal\">    </span><span class=\"DataType\">print</span><span class=\"Normal\">(reA == re);</span><span class=\"Normal\">\n</span><span class=\"Normal\">  } </span><span class=\"Keyword\">else</span><span class=\"Normal\"> {</span><span class=\"Normal\">\n</span><span class=\"Normal\">    reA = re;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  }</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"DataType\">print</span><span class=\"Normal\">(</span><span class=\"String\">&apos;</span><span class=\"Char\">last index of re is &apos;</span><span class=\"Normal\"> + re.</span><span class=\"DataType\">lastIndex</span><span class=\"Normal\">);</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (re.</span><span class=\"DataType\">lastIndex</span><span class=\"Normal\">)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    throw </span><span class=\"Keyword\">new</span><span class=\"Normal\"> Error(</span><span class=\"String\">&apos;</span><span class=\"Char\">re has already been executed!&apos;</span><span class=\"Normal\">);</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span class=\"Keyword\">var</span><span class=\"Normal\"> m = re.</span><span class=\"DataType\">exec</span><span class=\"Normal\">(str);</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span class=\"DataType\">print</span><span class=\"Normal\">(m);</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  testa(m[</span><span class=\"Float\">1</span><span class=\"Normal\">]);</span><span class=\"Normal\">\n</span><span class=\"Normal\">}</span>\n</pre>\n<p>So what the above block does is create a RegExp 
using the aforementioned <code>//g</code> construct, match and capture a string, then call the same function again with that match.</p>\n<p>The output from that? Not quite what you'd expect...</p>\n<pre>js&gt; testa('abcabc');\nlast index of re is 0\nabc,abc\ntrue\nlast index of re is 3\nre-test.js:21: re has already been executed!\n</pre>\n<p>What's this telling us? RegExp instances have a <code>lastIndex</code> property, which records where the last global match finished. If you examine the output, you can see that the second time the function is called, <code>re.lastIndex</code> is already 3, because it is the very same object that ran the first match. How bizarre!</p>\n<pre><span class=\"Keyword\">function</span><span class=\"Normal\"> testb(str) {</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Comment\">// Make sure we only call ourselves twice at most. </span><span class=\"Comment\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (this.inB == </span><span class=\"Float\">2</span><span class=\"Normal\">)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    </span><span class=\"Keyword\">return</span><span class=\"Normal\">;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">else</span><span class=\"Normal\"> </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (this.inB)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    this.inB++;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">else</span><span class=\"Normal\">\n</span><span class=\"Normal\">    this.inB = </span><span class=\"Float\">1</span><span class=\"Normal\">;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">var</span><span class=\"Normal\"> re = </span><span class=\"Keyword\">new</span><span class=\"Normal\"> </span><span class=\"Reserved\">RegExp</span><span class=\"Normal\">(</span><span 
class=\"String\">&apos;</span><span class=\"Char\">(abc)&apos;</span><span class=\"Normal\">, </span><span class=\"String\">&apos;</span><span class=\"Char\">g&apos;</span><span class=\"Normal\">);</span><span class=\"Normal\">\n</span><span class=\"Normal\">   </span>\n\n<span class=\"Normal\">  </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (reB) {</span><span class=\"Normal\">\n</span><span class=\"Normal\">    </span><span class=\"DataType\">print</span><span class=\"Normal\">(reB === re);</span><span class=\"Normal\">\n</span><span class=\"Normal\">  } </span><span class=\"Keyword\">else</span><span class=\"Normal\"> {</span><span class=\"Normal\">\n</span><span class=\"Normal\">    reB = re;</span><span class=\"Normal\">\n</span><span class=\"Normal\">  }</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span class=\"DataType\">print</span><span class=\"Normal\">(</span><span class=\"String\">&apos;</span><span class=\"Char\">last index of re is &apos;</span><span class=\"Normal\"> + re.</span><span class=\"DataType\">lastIndex</span><span class=\"Normal\">);</span><span class=\"Normal\">\n</span><span class=\"Normal\">  </span><span class=\"Keyword\">if</span><span class=\"Normal\"> (re.</span><span class=\"DataType\">lastIndex</span><span class=\"Normal\">)</span><span class=\"Normal\">\n</span><span class=\"Normal\">    throw </span><span class=\"Keyword\">new</span><span class=\"Normal\"> Error(</span><span class=\"String\">&apos;</span><span class=\"Char\">re has already been executed!&apos;</span><span class=\"Normal\">);</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span class=\"Keyword\">var</span><span class=\"Normal\"> m = re.</span><span class=\"DataType\">exec</span><span class=\"Normal\">(str);</span><span class=\"Normal\">\n</span><span class=\"Normal\"> </span>\n\n<span class=\"Normal\">  </span><span 
class=\"DataType\">print</span><span class=\"Normal\">(m);</span>\n\n<span class=\"Normal\">  testb(m[</span><span class=\"Float\">1</span><span class=\"Normal\">]);</span><span class=\"Normal\">\n</span><span class=\"Normal\">}</span>\n</pre>\n<p>And the output from <code>testb</code>?</p>\n<pre>last index of re is 0\nabc,abc\nfalse\nlast index of re is 0\nabc,abc\n</pre>\n<p>So that works. Let this be a lesson to you. If you need a regular expression with the 'g' (global) flag in JavaScript, create it with <code>new RegExp(pattern, 'g')</code> rather than <code>/pattern/g</code>; otherwise the shared object's <code>lastIndex</code> carries over between calls and will cause you problems.</p>\n\n\n</div>"
