[MacRuby-devel] Regular expression related performance

Yasu Imao yimao.ml at gmail.com
Wed Dec 1 09:46:07 PST 2010


Hello,

I'm rewriting a text analysis app in MacRuby that I originally wrote in RubyCocoa, and I've run into a serious performance problem in MacRuby when processing text with regular expressions.

I'm wondering if this will be taken care of in the near future (or already done in 0.8?).

Below are my simple tests.  The first two do essentially the same thing with slightly different approaches: both simply count the frequency of each word.  I want to use the first approach (passing a block to String#scan) not for counting word frequencies but in other processing.  The third tests the speed of String#gsub with a regular expression.  String#gsub felt slow in my app, so I wanted to see how slow it is compared to RubyCocoa.

 
Test 1 - scan-block

freq = Hash.new(0)
text.scan(/\w+/) do |word|
  freq[word] += 1
end


Test 2 - scan array.each

freq = Hash.new(0)
text.scan(/\w+/).each do |word|
  freq[word] += 1
end
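
As a quick sanity check that Test 1 and Test 2 really do the same work, the two approaches can be run side by side and their counts compared. This is only a sketch: the sample string is made up for illustration and is not the text used in the measurements.

```ruby
# Made-up sample input (an assumption, not the original 8154-word text).
sample = "the cat and the hat and the bat"

# Test 1 style: the block is passed directly to String#scan.
freq_block = Hash.new(0)
sample.scan(/\w+/) { |word| freq_block[word] += 1 }

# Test 2 style: scan builds an array first, then each iterates over it.
freq_each = Hash.new(0)
sample.scan(/\w+/).each { |word| freq_each[word] += 1 }

# Both hashes should hold the same word counts.
puts freq_block == freq_each
```

So any timing difference between the two tests comes from how the block is invoked, not from the result being computed.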


Test 3 - gsub upcase

text.gsub!(/\w+/) { |x| x.upcase }


The results are in seconds.  The original text is in English, with 8154 words.  Each process was repeated 10 times to measure the processing time, and each test was run 3 times.

                                          Run 1    Run 2    Run 3
Ruby 1.8.7     Test 1 - scan-block        0.542    0.502    0.518
Ruby 1.8.7     Test 2 - scan array.each   0.399    0.392    0.399
Ruby 1.8.7     Test 3 - gsub upcase       0.384    0.349    0.390

MacRuby 0.7.1  Test 1 - scan-block       27.612   27.707   27.453
MacRuby 0.7.1  Test 2 - scan array.each   3.556    3.616    3.554
MacRuby 0.7.1  Test 3 - gsub upcase      27.613   26.826   27.327
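
For anyone who wants to reproduce this kind of measurement, a minimal timing harness might look like the sketch below. It is not the exact script behind the numbers above. It assumes each figure is the total wall-clock time for 10 repetitions; the helper name time_runs and the generated sample text are placeholders.

```ruby
# Hypothetical helper: runs the block `repeats` times and returns
# the total elapsed wall-clock time in seconds.
def time_runs(repeats = 10)
  start = Time.now
  repeats.times { yield }
  Time.now - start
end

# Placeholder input of roughly 8100 words (an assumption; the original
# test used a real English text of 8154 words).
text = "the quick brown fox jumps over the lazy dog " * 900

# Test 1: block passed directly to scan.
scan_block = time_runs do
  freq = Hash.new(0)
  text.scan(/\w+/) { |word| freq[word] += 1 }
end

# Test 2: scan builds an array, then each iterates over it.
scan_each = time_runs do
  freq = Hash.new(0)
  text.scan(/\w+/).each { |word| freq[word] += 1 }
end

# Test 3: gsub! with a block; dup keeps the input unchanged between runs.
gsub_upcase = time_runs do
  text.dup.gsub!(/\w+/) { |x| x.upcase }
end

printf("Test1 - scan-block:      %.3f\n", scan_block)
printf("Test2 - scan array.each: %.3f\n", scan_each)
printf("Test3 - gsub upcase:     %.3f\n", gsub_upcase)
```

The sketch uses only Time.now and core String methods, so the same file should run unchanged under both interpreters. Running it three times gives three figures per test, as in the table above.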


Thanks,
Yasu

