Spidermonkey RegExps are Singletons
So in my (perhaps insane) quest to create an implementation of Andy Wardley's Template Toolkit in JavaScript, I discovered something rather irritating about Spidermonkey's implementation of regular expressions. (Spidermonkey is the name of the Mozilla JavaScript engine.)
Put simply, if you use the /foobar/g construct to create a RegExp object, that object will be a singleton. Not with me? Well then, consider the following example:
    var reB, reA;

    function testa(str) {
        // Make sure we only call ourselves twice at most.
        if (this.inA == 2) return;
        else if (this.inA) this.inA++;
        else this.inA = 1;

        // Regexp literal with the global flag.
        var re = /(abc)/g;
        if (reA) {
            print(reA == re);
        } else {
            reA = re;
        }

        print('last index of re is ' + re.lastIndex);
        if (re.lastIndex)
            throw new Error('re has already been executed!');

        var m = re.exec(str);
        print(m);
        testa(m[1]);
    }
So what the above block does is create a RegExp using the aforementioned //g construct, match and capture part of a string, then call the same function again with that match.
The output from that? Not quite what you'd expect...
js> testa('abcabc');
last index of re is 0
abc,abc
true
last index of re is 3
re-test.js:21: re has already been executed!
What's this telling us? RegExp instances have a lastIndex property, which records where the last global match finished. If you examine the output, you can see that the second time the function is called, re is the very same object as before (the comparison prints true) and re.lastIndex is already 3. How bizarre!
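To see lastIndex at work in isolation, here is a minimal sketch that deliberately reuses one global RegExp object, which is what the literal ends up doing in testa. The variable names are assumptions for illustration only; they are not from the original test script.

```javascript
// Hypothetical illustration: a single global RegExp object reused for two
// separate exec() calls, mimicking what the shared literal does in testa.
var shared = new RegExp('(abc)', 'g');

// First call: matches 'abc' at index 0 and records where the match ended.
var first = shared.exec('abcabc');
var afterFirst = shared.lastIndex;   // 3

// Second call on a different string: the search starts at index 3,
// which is already the end of 'abc', so nothing is found.
var second = shared.exec('abc');     // null
var afterSecond = shared.lastIndex;  // 0, a failed match resets it
```

So even without the explicit lastIndex check, the second call in testa would have come back with no match, and m[1] would have thrown instead.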
Now compare a version of the function that builds the regexp with the RegExp constructor instead:

    function testb(str) {
        // Make sure we only call ourselves twice at most.
        if (this.inB == 2) return;
        else if (this.inB) this.inB++;
        else this.inB = 1;

        // Same pattern, but built with the constructor.
        var re = new RegExp('(abc)', 'g');
        if (reB) {
            print(reB === re);
        } else {
            reB = re;
        }

        print('last index of re is ' + re.lastIndex);
        if (re.lastIndex)
            throw new Error('re has already been executed!');

        var m = re.exec(str);
        print(m);
        testb(m[1]);
    }
And the output from testb?
last index of re is 0
abc,abc
false
last index of re is 0
abc,abc
So that works. Let this be a lesson to you. If you need a regular expression with the 'g' (global) flag in JavaScript, make sure you create it with new RegExp(pattern, 'g'), not /pattern/g, or else the shared lastIndex will cause you problems.
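One way to follow that advice is to hide the constructor call behind a small helper. This is only a sketch: freshGlobal and firstCapture are hypothetical names, not part of the original test script or of any library.

```javascript
// Hypothetical helper: returns a brand new global RegExp on every call,
// so no lastIndex state can leak between callers.
function freshGlobal(pattern) {
    return new RegExp(pattern, 'g');
}

// Hypothetical caller: returns the first captured group, or null when the
// pattern does not match.
function firstCapture(str) {
    var re = freshGlobal('(abc)');
    var m = re.exec(str);
    return m ? m[1] : null;
}

var outer = firstCapture('abcabc');  // 'abc'
var inner = firstCapture(outer);     // 'abc' again, from a fresh object
```

Because each call gets its own object, a recursive or repeated call always starts matching from index 0.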