Macruby 10x slower than built-in Ruby for my script
Hi all,

I'm using Ruby for some simple data-processing scripts, and I'm interested in MacRuby since it's a Mac-friendly framework and very snappy too. But in this very simple (I think) script, MacRuby is 10x slower than the built-in Ruby on OS X 10.5.7. The script reads atom positions, calculates distances, makes a summary, and finds the nearest atoms. You can get the code here: http://gist.github.com/138760

----------------------------------------------------------------------
require 'scanf'
arr=IO.readlines(ARGV[0])
pos_data = arr[4..-1]
pos = pos_data.collect{|a| a.scanf('%d %d %f %f %f')[2..4]}
dist_matrix = pos.collect { |i| pos.collect{|j| (((i[0]-j[0])**2+(i[1]-j[1])**2+(i[2]-j[2])**2)**0.5*0.529*100).round/100.0}}
dist_uniq = dist_matrix.inject([]) { |sum,i| sum|i } - [0.0]
dist_min = dist_uniq.min
dist_dup = dist_matrix.inject([]) { |sum,i| sum+i }
dist_sum = dist_dup.inject(Hash.new(0)) { |h,i| h[i] += 1; h}
dist_sum_sort = dist_sum.sort.find_all {|i| i[0]<3.0}
dist_sum_sort.each { |i| puts "%.2f %d" %i }
dist_matrix.each_with_index{|item,i|
  a=item.index(dist_min)
  if a then puts "%.2f: %d and %d"%[dist_min, a, i] end
}
----------------------------------------------------------------------
$ time macruby dist.rb 309.xv
real 0m17.084s
user 0m26.893s
sys  0m1.330s

$ time ruby dist.rb 309.xv
real 0m1.052s
user 0m0.981s
sys  0m0.069s

$ time macruby-exp dist.rb 309.xv
real 0m9.934s
user 0m17.090s
sys  0m0.643s
----------------------------------------------------------------------

The 309.xv file is too long for email, so you can download it from the link. I also tested on several Macs and the result is always the same.

Any ideas? I admit the code is not well optimized, but I don't think MacRuby should be so slow. Even 309**2 is not too big a number for current machines.

Thanks,
Sincerely
Cheng
Hi all,

On Wed, Jul 1, 2009 at 9:26 PM, zhida cheng <vorbei@gmail.com> wrote:
> But in this very simple (I think) script, Macruby is 10x slower than built-in Ruby in OSX 10.5.7 It's for reading atom positions, calculating distances, make a summary and find out nearest atoms. You can get the codes here: http://gist.github.com/138760
I'd like to confirm this report with two Shark profiling results:

http://omploader.org/vMXducQ/macruby.mshark (Running the ruby program dist.rb given by Zhida with macruby, trunk version)
http://omploader.org/vMXdudQ/ruby.mshark (Running the ruby program dist.rb given by Zhida with ruby, the version shipping with Mac OS X Leopard)
From the first result, we can see most of the time is spent on libauto.dylib, the garbage collector library, especially spinlocks in that library. I suppose that may be a direction for optimization?
- Jiang
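A coarser cross-check of the profile is to time the array-heavy stages of dist.rb separately and run the same file under both interpreters. The sketch below is not from the thread: `time_stages` is a hypothetical helper, the stage boundaries are an assumption, and the random positions merely stand in for the atoms in 309.xv.

```ruby
require 'benchmark'

# Hypothetical helper: times three stages of a dist.rb-like pipeline.
# pos is an array of [x, y, z] float triples.
def time_stages(pos)
  timings = {}

  # Stage 1: full pairwise distance matrix, same formula as dist.rb.
  dist_matrix = nil
  timings[:matrix] = Benchmark.realtime do
    dist_matrix = pos.collect do |i|
      pos.collect do |j|
        (((i[0]-j[0])**2 + (i[1]-j[1])**2 + (i[2]-j[2])**2)**0.5 * 0.529 * 100).round / 100.0
      end
    end
  end

  # Stage 2: flatten the matrix the way dist.rb does (a new array per step).
  dist_dup = nil
  timings[:flatten] = Benchmark.realtime do
    dist_dup = dist_matrix.inject([]) { |sum, i| sum + i }
  end

  # Stage 3: count how often each distance occurs.
  timings[:histogram] = Benchmark.realtime do
    dist_dup.inject(Hash.new(0)) { |h, i| h[i] += 1; h }
  end

  timings
end

# Synthetic input (assumption): 60 random points instead of the real file.
srand(1)
pos = Array.new(60) { [rand * 10, rand * 10, rand * 10] }
time_stages(pos).each { |stage, secs| puts "%-10s %.4f s" % [stage, secs] }
```

If one stage dominates under macruby but not under ruby, that narrows down where the allocator and GC time in the Shark trace comes from.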
Hi,

On Jul 1, 2009, at 8:23 AM, Jjgod Jiang wrote:
> Hi all,
>
> On Wed, Jul 1, 2009 at 9:26 PM, zhida cheng <vorbei@gmail.com> wrote:
>> But in this very simple (I think) script, Macruby is 10x slower than built-in Ruby in OSX 10.5.7 It's for reading atom positions, calculating distances, make a summary and find out nearest atoms. You can get the codes here: http://gist.github.com/138760
>
> I'd like to confirm this report with two Shark profiling results:
>
> http://omploader.org/vMXducQ/macruby.mshark (Running the ruby program dist.rb given by Zhida with macruby, trunk version)
> http://omploader.org/vMXdudQ/ruby.mshark (Running the ruby program dist.rb given by Zhida with ruby, the version shipping with Mac OS X Leopard)
>
> From the first result, we can see most of the time is spent on libauto.dylib, the garbage collector library, especially spinlocks in that library. I suppose that may be a direction for optimization?
I didn't look at the Shark profiling results, but I am pretty sure the problem here is that we need to box the fixnums in order to insert them into collections (NSArray), because NSArray expects true Objective-C objects and fixnums are immediate types. This means a call to the object allocator and a call to the GC to set the write barrier for every object insertion.

We have a plan to fix this in experimental; it's on our TODO list. The idea is to make a specialized subclass of NSArray / NSDictionary only for pure Ruby use, where the elements can be immediate. We should then have much better performance results (close to 1.9 maybe).

Laurent
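If insertions carry a per-element cost, it helps to count how many the script performs. The sketch below is a back-of-envelope count, assuming 309 atoms (suggested by the file name and the "309**2" remark) and assuming `Array#+` writes every element of its result into a new array, as it does in standard Ruby. The thread does not establish whether each of those writes pays the full boxing and write-barrier cost in MacRuby. The helper names are hypothetical.

```ruby
# Element writes made by `dist_matrix.inject([]) { |sum,i| sum+i }`
# for an n x n matrix: step k builds a fresh array of k*n elements.
def inject_plus_writes(n)
  (1..n).inject(0) { |total, k| total + k * n }
end

# Element writes if each row is appended in place instead.
def concat_writes(n)
  n * n
end

# In-place flattening: same result as the inject/+ version,
# but each element is written once.
def flatten_rows(rows)
  rows.inject([]) { |sum, row| sum.concat(row) }
end

n = 309 # assumption: number of atoms in 309.xv
puts inject_plus_writes(n) # 14799555
puts concat_writes(n)      # 95481
```

Under these assumptions the flattening step alone writes about 155 times more elements than the 309**2 entries in the matrix, so replacing `sum+i` with `sum.concat(i)` is one way to reduce the script's exposure to per-insertion overhead in any interpreter.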
participants (3)
- Jjgod Jiang
- Laurent Sansonetti
- zhida cheng