[MacRuby-devel] StringScanner Performance

Sylvester Keil sylvester.keil at gmail.com
Thu Apr 19 06:19:05 PDT 2012


Dear all,

While debugging performance issues in a gem (bibtex-ruby), I noticed that MacRuby's StringScanner implementation creates a new regular expression object every time #scan is called. The gem's lexical analyzer is based on StringScanner, so #scan is crucial to it, and the current implementation is slow enough to make the gem basically unusable on MacRuby.

This is the problematic method in MacRuby:

https://github.com/MacRuby/MacRuby/blob/master/lib/strscan.rb#L638

Both MRI and Rubinius work around this by using a feature of Oniguruma patterns to match the pattern at the beginning of a string only. Here are the corresponding sections:

https://github.com/ruby/ruby/blob/trunk/ext/strscan/strscan.c#L437
https://github.com/rubinius/rubinius/blob/master/lib/strscan.rb#L264
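
To illustrate the kind of approach I have in mind, here is a minimal pure-Ruby sketch. It assumes that Regexp#match(string, pos) and the \G anchor behave on MacRuby as they do on MRI 1.9, which I have not verified. The class AnchoredPatternCache and its methods are hypothetical names made up for this example; they are not taken from MacRuby, MRI or Rubinius. The idea is to build the anchored variant of each pattern once and reuse it, rather than constructing a new Regexp on every call:

```ruby
# Hypothetical helper for illustration only; not actual MacRuby, MRI or
# Rubinius code.
class AnchoredPatternCache
  def initialize
    # Maps each original Regexp to its \G-anchored counterpart.
    @anchored = {}
  end

  # Returns a \G-anchored copy of +pattern+, building it only on first use.
  def anchored(pattern)
    @anchored[pattern] ||= Regexp.new("\\G(?:#{pattern.source})", pattern.options)
  end

  # Tries to match +pattern+ exactly at offset +pos+ of +string+.
  # Returns a MatchData on success, nil otherwise.
  def match_at(string, pattern, pos)
    anchored(pattern).match(string, pos)
  end
end

cache = AnchoredPatternCache.new
word  = /\w+/

m = cache.match_at("foo bar", word, 4)     # matches "bar" starting at offset 4
miss = cache.match_at("foo bar", word, 3)  # nil: offset 3 is a space
```

A scanner's #scan could then call match_at with its current position and advance by the length of the match, so no Regexp is allocated after the first call for a given pattern.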

Do regular expressions in MacRuby expose similar functionality to either Ruby or C extensions? I'd be happy to help resolve this issue, but I have no experience with MacRuby, so any pointers on how best to approach this would be much appreciated.

Thanks,

Sylvester





