[MacRuby] #424: performance regression from ruby 1.8 and 1.9
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- I have some code that previously ran incorrectly on MacRuby 0.4, and until recently crashed MacRuby trunk. Now that it works I would like to report that the performance of it is not as fast as 1.8 or 1.9. I know that MacRuby working is a primary consideration and speed only comes after that, but as people start looking at speed more and more I wanted to get this on the radar. This also supersedes #348 which was a smaller also based on this code, but was a reduced file since the main code would not run. Performance of Ruby 1.8, Ruby 1.9, and MacRuby compared using a faster and a slower version of the same basic algorithm: {{{ 35 jordan@thetourist ~/priority_queue_bug > for LOOP in ruby ruby1.9 macruby; do for LOOP2 in ./search.pqueue ./search_slow.pqueue; do eval time "${LOOP}" "${LOOP2}" -m -c a-star input_files/input13_7.txt
/dev/null; done; done ruby ./search.pqueue -m -c a-star input_files/input13_7.txt 4.51s user 0.10s system 98% cpu 4.657 total ruby ./search_slow.pqueue -m -c a-star input_files/input13_7.txt 9.84s user 0.15s system 98% cpu 10.087 total ruby1.9 ./search.pqueue -m -c a-star input_files/input13_7.txt 2.74s user 0.08s system 98% cpu 2.851 total ruby1.9 ./search_slow.pqueue -m -c a-star input_files/input13_7.txt 5.80s user 0.11s system 99% cpu 5.960 total macruby ./search.pqueue -m -c a-star input_files/input13_7.txt 14.30s user 0.69s system 171% cpu 8.722 total macruby ./search_slow.pqueue -m -c a-star input_files/input13_7.txt 27.30s user 0.88s system 150% cpu 18.745 total }}}
performance of various versions of the script specific to MacRuby: {{{ 36 jordan@thetourist ~/priority_queue_bug > for LOOP in *.cocoa; do eval time macruby "${LOOP}" -c -m a-star input_files/input13_7.txt >/dev/null; done macruby search.cocoa -c -m a-star input_files/input13_7.txt 14.38s user 0.79s system 161% cpu 9.392 total macruby search_group_add.cocoa -c -m a-star input_files/input13_7.txt 13.61s user 0.71s system 171% cpu 8.345 total macruby search_group_add_slow.cocoa -c -m a-star input_files/input13_7.txt 26.97s user 0.92s system 146% cpu 19.027 total macruby search_slow.cocoa -c -m a-star input_files/input13_7.txt 27.34s user 0.94s system 150% cpu 18.730 total }}} MacRuby information: {{{ 37 jordan@thetourist ~/priority_queue_bug > macruby -e 'p MACRUBY_VERSION' "0.5" 38 jordan@thetourist ~/priority_queue_bug > macruby -e 'p MACRUBY_REVISION' "git commit 19b2d7811662bf636cbcc2fe1ae99afef3243225" }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Another example (I think this file is the worst case for my sample program): {{{ 12 jordan@thetourist ...tificial Intelligence/Assignment No. 1/priority_queue_bug > for LOOP in ruby ruby1.9 macruby; do eval time "${LOOP}" ./search.pqueue -m -c a-star input_files/input13_1.txt
/dev/null; done ruby ./search.pqueue -m -c a-star input_files/input13_1.txt 12.22s user 0.19s system 99% cpu 12.513 total ruby1.9 ./search.pqueue -m -c a-star input_files/input13_1.txt 7.10s user 0.13s system 99% cpu 7.251 total macruby ./search.pqueue -m -c a-star input_files/input13_1.txt 34.86s user 1.52s system 180% cpu 20.195 total }}}
-- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:1> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by mattaimonetti@…): Jordan, could you please find out what exactly is slower in your script? If you can extract it out it will be easier to work on the isolated piece of code that's slower. Thanks, - Matt -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:2> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Well, unless I am losing my mind it must be something to do with iterating over things in MacRuby. I say this because even in the example below the amount of nodes being pushed onto the queue should be dominated by all of the other work going on, almost eliminating performance differences of the queue itself. {{{ 57 jordan@thetourist ~/priority_queue_bug > for LOOP in search.*; do eval time macruby "${LOOP}" -c -m a-star input_files/input13_1.txt >/dev/null; done macruby search.cocoa -c -m a-star input_files/input13_1.txt 33.69s user 1.58s system 182% cpu 19.363 total macruby search.pqueue -c -m a-star input_files/input13_1.txt 35.01s user 1.49s system 179% cpu 20.278 total }}} However, if I use two new scripts that are strictly adding a large amount of nodes the the queue the difference is greater, in this case the .pqueue file would still be taking the hit of iterating, etc. strictly on the MacRuby side (where MacRuby appears to be slower than 1.8 or 1.9), the .cocoa example ends up doing most of the heaving lifting in this case through CFBinaryHeap. Here are ruby, ruby1.9, and macruby using the .pqueue file: {{{ 62 jordan@thetourist ~/priority_queue_bug > for LOOP in ruby ruby1.9 macruby; do eval time "${LOOP}" ./speed_test.pqueue; done ruby ./speed_test.pqueue 82.38s user 3.10s system 96% cpu 1:28.39 total ruby1.9 ./speed_test.pqueue 55.25s user 3.02s system 99% cpu 58.845 total macruby ./speed_test.pqueue 69.08s user 11.54s system 47% cpu 2:49.53 total }}} Here is macruby using the .pqueue file and the .cocoa file: {{{ 63 jordan@thetourist ~/priority_queue_bug > for LOOP in speed_test.*; do eval time macruby "${LOOP}"; done macruby speed_test.cocoa 6.73s user 0.34s system 117% cpu 6.005 total macruby speed_test.pqueue 65.32s user 10.76s system 50% cpu 2:31.36 total }}} Here is the .pqueue file: {{{ #!/usr/bin/env ruby1.9 # put the directory this file is in into the search path so that we # can get ruby_priority_queue below $:.unshift(File.dirname($0)) require 'lib/revision' require 'lib/pqueue' if (Revision.new(RUBY_VERSION) < Revision.new("1.9")) class String def chars return self.split("") end end end class TestNode include Comparable attr_reader :priority def initialize(newPriority) $counter = $counter + 1 @priority = newPriority end def <=>(rhs) @priority <=> rhs.priority end def getChildren (@priority * 5).upto((@priority * 5) + 2).map { |childPriority| TestNode.new(childPriority) } end end $counter = 0 $searchQueue = PQueue.new $searchQueue.push(TestNode.new(0)) while ((currentNode = $searchQueue.pop) && ($counter < 250000)) do currentNode.getChildren.each { |child| $searchQueue.push(child) } end exit 0 }}} Here is the .cocoa file: {{{ #!/usr/bin/env macruby # put the directory this file is in into the search path so that we # can get ruby_priority_queue below $:.unshift(File.dirname($0)) require 'lib/revision' framework 'lib/JBBPriorityQueue.framework' if (Revision.new(RUBY_VERSION) < Revision.new("1.9")) class String def chars return self.split("") end end end class TestNode include Comparable attr_reader :priority def initialize(newPriority) $counter = $counter + 1 @priority = newPriority end def <=>(rhs) @priority <=> rhs.priority end def compareToNode(rhs) @priority <=> rhs.priority end def getChildren (@priority * 5).upto((@priority * 5) + 2).map { |childPriority| TestNode.new(childPriority) } end end $counter = 0 $searchQueue = JBBPriorityQueue.alloc.init $searchQueue.push(TestNode.new(0)) while ((currentNode = $searchQueue.pop) && ($counter < 250000)) do currentNode.getChildren.each { |child| $searchQueue.push(child) } end exit 0 }}} The library files are in the attached zip file already, these files were run from that directory. I will keep looking for other places in my main script that might have large differences between ruby, ruby1.9, and macruby. Please let me know if there is anything else I can specifically look for. -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:3> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by mattaimonetti@…):
Please let me know if there is anything else I can specifically look for.
Can you break it down to a smaller example really showing where the problem is without requiring your script? Thanks, - Matt -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:4> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): I can look into it more later, the ones posted above actually only require two smallish ruby files out of the lib directory from my archive, I could just paste them into the top of the other file and upload it as a new attachment if that would help in any way. In the mean time I used DTrace to get some timings for functions/methods from ruby and MacRuby ruby: {{{ 13 jordan@thetourist ~/priority_queue_bug > sudo dtrace -s ruby_method_analysis.d -c "ruby ./search.pqueue -m -c a-star input_files/input13_4.txt" Password: Target pid: 352 Step 0: bbwxbwbwbwwbw (c=0) Step 1: move 6 bbwbbwxwbwwbw (c=3) Step 2: move 5 bbwbbxwwbwwbw (c=1) Step 3: move 8 bbwbbbwwxwwbw (c=3) Step 4: move 2 bbxbbbwwwwwbw (c=6) Step 5: move 11 bbbbbbwwwwwxw (c=9) Step 6: move 6 bbbbbbxwwwwww (c=5) CLASS METHOD COUNT AVG(us) SUM(us) ----------------------------------------------------------------------------------------------------------------------- String split 3298 36 121861 IO write 16 19 307 String % 6 16 98 Exception initialize 1 16 16 JBBNodePriority initialize 3308 16 53831 Exception set_backtrace 1 15 15 Range each 3299 15 51517 String initialize_copy 42886 14 619271 Exception backtrace 1 14 14 Fixnum to_s 15 14 213 JBBNode priority 38826 14 544505 JBBNode to_s 7 13 97 Hash []= 381 13 5291 Hash include? 8629 13 117828 String index 3298 13 44610 String length 3300 13 43452 String []= 79176 13 1034814 Array push 381 13 4971 Array to_a 3299 12 42753 Array []= 18440 12 237676 Fixnum div 7236 12 93047 Fixnum <=> 5068 12 64921 String == 53172 12 679790 Array [] 87152 12 1109695 String [] 79176 12 1007480 Fixnum == 42887 12 544730 Fixnum abs 20086 12 255014 Fixnum / 9894 12 125471 Hash [] 16788 12 211988 Fixnum > 31024 12 390794 Fixnum * 21067 12 264933 Fixnum < 28374 12 355145 Fixnum - 24145 12 302109 Fixnum + 81377 12 1017731 }}} MacRuby: {{{ 14 jordan@thetourist ~/priority_queue_bug > sudo dtrace -s macruby_method_analysis.d -c "macruby ./search.pqueue -m -c a-star input_files/input13_4.txt" Target pid: 360 Step 0: bbwxbwbwbwwbw (c=0) Step 1: move 6 bbwbbwxwbwwbw (c=3) Step 2: move 5 bbwbbxwwbwwbw (c=1) Step 3: move 8 bbwbbbwwxwwbw (c=3) Step 4: move 2 bbxbbbwwwwwbw (c=6) Step 5: move 11 bbbbbbwwwwwxw (c=9) Step 6: move 6 bbbbbbxwwwwww (c=5) CLASS METHOD COUNT AVG(us) SUM(us) ----------------------------------------------------------------------------------------------------------------------- PQueue initialize 1 9068 9068 Class protected: 1 6113 6113 Revision initialize: 2 1456 2912 Class private: 11 1398 15381 OptionParser::List each_option 6 861 5167 Class attr_reader: 11 685 7541 Class gm: 1 663 663 Class include: 5 646 3231 NSMutableArray extend: 1 613 613 NSMutableArray <=>: 1 571 571 NSMutableString %: 13 550 7151 Class attr_accessor: 4 545 2180 OptionParser::OptionMap invert 14 524 7346 Enumerator to_a 3634 517 1880180 JBBNode <=>: 22168 429 9528017 Class extend_object: 3 422 1267 NSMutableArray each 3654 364 1333418 FalseClass ===: 20 345 6901 NSMutableString scan: 1 344 344 NilClass ===: 20 343 6878 TrueClass ===: 20 342 6852 Range to_a 3633 323 1176485 NSMutableDictionary update: 7 319 2239 Class attr: 1 314 314 Class attr_writer: 5 297 1488 Class new 7 289 2024 OptionParser::OptionMap []: 2 283 566 OptionParser::List __send__: 86 281 24238 NSMutableArray each_with_index 1 275 275 Class new: 3639 270 985681 NSMutableArray delete_if 7 254 1782 Proc call: 4 253 1015 Class append_features: 5 247 1236 ByteString swap!: 43596 241 10536678 Set initialize 1 199 199 OptionParser::CompletingHash []=: 7 182 1279 OptionParser::List initialize 3 148 445 Class readlines: 1 139 139 OptionParser::Switch::NoArgument initialize: 6 135 810 OptionParser::Switch::RequiredArgument initialize: 1 132 132 Class alloc 3636 124 452700 NSMutableString *: 4 121 487 Method to_proc 1 119 119 NSArray <<: 2 119 238 NSMutableString +: 2 117 234 JBBNodePriority initialize: 3632 116 423058 OptionParser::OptionMap default: 2 115 231 Class extended: 2 114 229 Proc initialize 4 114 458 OptionParser::List atype 81 114 9275 Range each 3634 114 415928 NilClass ==: 20 113 2277 Class included: 5 113 565 FalseClass ==: 20 112 2259 Class >=: 2 112 224 TrueClass ==: 20 112 2240 NilClass singleton_method_added: 1 110 110 Class method_added: 232 110 25593 Class ==: 4 110 440 NSObject singleton_method_added: 1 109 109 Proc to_proc 119 108 12946 Class inherited: 23 107 2462 OptionParser::Switch::OptionalArgument ==: 1 106 106 OptionParser::Switch::RequiredArgument ==: 12 106 1274 JBBNode to_s 7 105 738 OptionParser::Switch::NoArgument ==: 33 104 3458 Regexp ==: 6 102 617 OptionParser::List short 6 100 605 NSMutableString chars 2 94 189 Class exists?: 1 92 92 ByteString chars 7266 90 659583 ByteString strip 1 89 89 Class dirname: 1 87 87 ByteString downcase 1 81 81 NSMutableString =~: 1 70 70 JBBNode puts: 7 63 445 NSMutableArray reject 3633 59 217170 NSMutableArray method: 1 58 58 ByteString count: 3 56 169 NSMutableArray join: 1 55 55 Class lambda 5 55 275 NSMutableArray instance_eval 1 53 53 JBBSearch start_node 1 53 53 Class file?: 1 48 48 JBBNode puts 2 47 95 NSMutableArray length 2 46 93 Fixnum to_s 1 41 41 NSMutableString split: 3 40 121 PQueue lambda 1 40 40 OptionParser version=: 1 38 38 NSMutableArray at: 6 37 225 OptionParser::CompletingHash respond_to?: 2 36 72 NSMutableArray unshift: 1 35 35 Class basename: 1 34 34 NSObject include?: 1 33 33 NSMutableString downcase 8 33 267 PQueue loop 1 33 33 OptionParser banner=: 1 32 32 OptionParser::OptionMap []=: 17 32 546 NSMutableDictionary fetch: 81 31 2548 JBBNode save_path: 1 30 30 ByteString []: 87192 30 2689254 ByteString =~: 1 30 30 NSMutableArray []=: 1 29 29 Revision contents 1 29 29 NSMutableDictionary keys 20 29 587 Class incompatible_argument_styles: 1 29 29 ByteString dup 47229 29 1378244 OptionParser::OptionMap fetch: 5 29 145 NSMutableArray []: 1 28 28 NSMutableArray push: 426 28 12089 OptionParser::Switch::NoArgument block 2 28 56 Class const_set: 9 28 254 Set add: 413 27 11538 OptionParser::Switch::NoArgument conv 4 27 110 OptionParser::Switch::RequiredArgument match_nonswitch?: 2 27 55 OptionParser::Switch::NoArgument match_nonswitch?: 12 27 330 JBBNode child_for_goal 7 26 187 OptionParser::List long 3 26 79 Regexp respond_to?: 11 26 289 OptionParser top 5 26 130 OptionParser base 3 25 77 OptionParser::List list 6 25 154 NSMutableArray size 1 25 25 NSMutableDictionary freeze 1 25 25 JBBNode state 413 25 10589 OptionParser notwice: 2 25 51 NSMutableString freeze 12 25 300 OptionParser::List block_given? 9 25 225 NSMutableArray freeze 5 24 124 ByteString index: 3633 24 90215 Class pattern 6 24 148 Class proc 3 24 74 OptionParser block_given? 30 24 733 JBBNode node_cost 7 24 170 NSMutableArray empty? 8 24 194 JBBNode parent 832 24 20204 NSMutableArray shift 5 24 121 Class top 13 24 312 JBBNodePriority second_level 5258 23 126144 NSMutableString to_i 12 23 286 JBBNode expanded_children 412 23 9833 NSMutableDictionary include?: 9376 23 221179 JBBNodePriority first_level 38373 22 871790 JBBNode blank 3638 22 82600 NilClass nil? 3 22 67 ByteString []=: 87192 22 1945721 JBBNode level 4103 22 91287 JBBNode cost 7264 22 161085 Fixnum <=>: 5260 21 115443 JBBNode heuristic 3632 21 76630 JBBNode priority 40704 20 850370 Fixnum abs 22199 20 460471 Fixnum div: 7921 19 153603 }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:5> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Another interesting data point: The stripped down test that just generates and inserts X number of nodes and pops some along the way (to generate new nodes) always runs incredibly fast when using the CFBinaryHeap version of the script: {{{ 54 jordan@thetourist ~/priority_queue_bug > for LOOP in *_{125,187,218,250}.cocoa; do eval time macruby "${LOOP}"; done macruby speed_test_125.cocoa 3.45s user 0.17s system 145% cpu 2.495 total macruby speed_test_187.cocoa 4.97s user 0.22s system 150% cpu 3.457 total macruby speed_test_218.cocoa 5.92s user 0.26s system 152% cpu 4.060 total macruby speed_test_250.cocoa 6.85s user 0.29s system 153% cpu 4.657 total }}} When running the pure ruby version it is slower, and periodically can even be slower than ruby and ruby1.9, which are both fairly consistent, even with large data sets (250,000 items): {{{ ruby speed_test_250.pqueue 81.52s user 2.80s system 99% cpu 1:24.36 total ruby speed_test_250.pqueue 81.48s user 2.82s system 99% cpu 1:24.40 total ruby speed_test_250.pqueue 81.77s user 3.03s system 99% cpu 1:24.86 total ruby speed_test_250.pqueue 81.51s user 2.89s system 99% cpu 1:24.44 total }}} while the MacRuby version tends to fluctuate, especially with large data sets, the interesting part is that when it is slower with large data sets it is using a decent amount of memory (at least according to iStat Menus), and not as much CPU, possibly an issue with garbage collection? {{{ macruby speed_test_250.pqueue 76.62s user 9.61s system 160% cpu 53.618 total macruby speed_test_250.pqueue 78.58s user 9.93s system 159% cpu 55.512 total macruby speed_test_250.pqueue 82.74s user 13.04s system 98% cpu 1:36.89 total macruby speed_test_250.pqueue 78.36s user 11.76s system 58% cpu 2:33.71 total }}} The _125 variant generates 125000 nodes, _187 generates 187500 nodes, _218 generates 218750 nodes, 250 generates 250000 nodes. MacRuby tends to perform more slowly with large data sets with several applications open, but not doing anything (they are taking up memory, but not CPU). In the same scenario the timings for ruby and ruby1.9 are still consistent and ruby1.9 still performs quickly. -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:6> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Just tried a new version that generates 300000 nodes. ruby and ruby1.9 end up slowing down significantly, memory usage is way up, cpu is down, macruby doesn't even finish running: {{{ 60 jordan@thetourist ~/priority_queue_bug > for LOOP in ruby ruby1.9 macruby; do eval time "${LOOP}" ./speed_test_300.pqueue; done ruby ./speed_test_300.pqueue 124.68s user 5.30s system 56% cpu 3:50.52 total ruby1.9 ./speed_test_300.pqueue 80.61s user 7.13s system 47% cpu 3:05.52 total macruby(1524,0x7fff70940be0) malloc: *** auto malloc[1524]: agc error: Can not allocate new region 2009-11-10 22:30:32.742 macruby[1524:903] -[Bignum message]: unrecognized selector sent to instance 0x200088220 2009-11-10 22:30:34.888 macruby[1524:903] *** Terminating app due to uncaught exception 'NoMethodError', reason: 'undefined method `message' for #<Bignum:0x200088220>' *** Call stack at first throw: ( 0 CoreFoundation 0x00007fff80849444 __exceptionPreprocess + 180 1 libobjc.A.dylib 0x00007fff881a60f3 objc_exception_throw + 45 2 libmacruby.dylib 0x000000010017aef5 rb_vm_raise + 437 3 libmacruby.dylib 0x0000000100040f79 rb_exc_raise + 9 4 libmacruby.dylib 0x000000010017a350 rb_vm_method_missing + 560 5 libmacruby.dylib 0x000000010015fbcc rb_vm_call_with_cache2 + 5740 6 libmacruby.dylib 0x0000000100161a34 _ZL14method_missingmP13objc_selectorP11rb_vm_blockiPKm29rb_vm_method_missing_reason_t + 324 7 libmacruby.dylib 0x00000001001623cd rb_vm_call + 2125 8 libmacruby.dylib 0x00000001001796c3 rb_vm_print_current_exception + 99 9 macruby 0x0000000100000dec main + 172 10 macruby 0x0000000100000d34 start + 52 ) terminate called after throwing an instance of 'NSException' zsh: abort macruby ./speed_test_300.pqueue macruby ./speed_test_300.pqueue 69.00s user 15.48s system 27% cpu 5:03.12 total }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:7> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by lsansonetti@…): There is definitely something wrong going on but I have higher priority bugs to fix at this point and can't help you now. If you want to help you may want to use shark and identify the slow code paths. Try to reproduce the performance problems with a small snippet and let us know of what you find. -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:8> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Here is a shark profile, I was running 10.6.2 and MacRuby is git revision "git commit 554fe2e5930a8c5cb8d0f798cfb1d3e1cec8b378", it was 2 MB, so I am just linking to it. [http://files.me.com/jordan.breeding/zr9gjj Shark Profile] -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:9> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Well, I don't have any new deep insights, I need to run shark or dtrace on the new version later, but I am updating the archive attached to this bug, and upon creating a new completely ruby based version of a priority queue based on my Objective-C version I was able to improve the run times of everything except for MacRuby. The old pqueue.rb: {{{ /usr/bin/ruby ./search.pqueue.old.rb -c -m a-star input_files/input13_1.txt 19.01s user 0.21s system 99% cpu 19.338 total /opt/homebrew/bin/ruby ./search.pqueue.old.rb -c -m a-star 13.60s user 0.15s system 99% cpu 13.779 total macruby ./search.pqueue.old.rb -c -m a-star input_files/input13_1.txt 33.11s user 1.53s system 179% cpu 19.251 total }}} The new pqueue.rb: {{{ /usr/bin/ruby ./search.pqueue.rb -c -m a-star input_files/input13_1.txt 9.92s user 0.17s system 99% cpu 10.151 total /opt/homebrew/bin/ruby ./search.pqueue.rb -c -m a-star 6.38s user 0.12s system 99% cpu 6.552 total macruby ./search.pqueue.rb -c -m a-star input_files/input13_1.txt 32.25s user 1.11s system 142% cpu 23.348 total }}} For comparison, the Objective-C based version: {{{ macruby ./search.cocoa.rb -c -m a-star input_files/input13_1.txt > /dev/null 28.56s user 1.45s system 179% cpu 16.679 total }}} I don't think it is JIT overhead since a small sample still runs quickly in macruby, I was going to try AOT but the file from macrubyc gives me errors. Small sample: {{{ macruby ./search.cocoa.rb -c -m a-star input_files/input3.txt > /dev/null 1.00s user 0.08s system 107% cpu 1.002 total macruby ./search.cocoa.rb -c -m a-star input_files/input7.txt > /dev/null 1.03s user 0.08s system 107% cpu 1.027 total }}} error with compile: {{{ 286 jordan@thetourist ~/priority_queue_bug > macrubyc -o search.cocoa search.cocoa.rb 287 jordan@thetourist ~/priority_queue_bug > ./search.cocoa core:in `dirname:': can't convert false into String (TypeError) }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:10> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Alright, after taking another look I don't know that this is the only place where it is slower, but String is definitely causing some of the problem. I add a method to String to swap characters inside the String, this seems to be consistently slower in MacRuby. {{{ #!/usr/bin/eval ruby class String def swap(first, second) self.dup.swap!(first, second) end def swap!(first, second) temp_store = self[first] self[first] = self[second] self[second] = temp_store return self end end # first test puts puts("Test 001 -- String character swap, 2500000 times") $staticString = "asdfghjkl;123456789" $firstTime = 0.0 2500000.times { randOne = rand($staticString.length) randTwo = rand($staticString.length) innerTime = Time.new newString = $staticString.swap(randOne, randTwo) $firstTime += Time.new - innerTime } puts("Test 001: %.4f seconds" % [$firstTime]) puts }}} {{{ 63 jordan@thetourist ~/Tests/performance > ruby ./small_performance_test.rb Test 001 -- String character swap, 2500000 times Test 001: 7.8250 seconds 64 jordan@thetourist ~/Tests/performance > /opt/homebrew/bin/ruby ./small_performance_test.rb Test 001 -- String character swap, 2500000 times Test 001: 7.3780 seconds 65 jordan@thetourist ~/Tests/performance > macruby ./small_performance_test.rb Test 001 -- String character swap, 2500000 times Test 001: 23.7841 seconds }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:11> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): So, I wrote a self contained group of tests to test the performance of two variants of swapping characters in strings, and test the performance of push/pop and pushObjects/pop in combination with inserting into a Set. The attachment is [attachment:performance_test.rb performance_test.rb] Results: Ruby 1.8 {{{ Test 001 -- String character swap, 2500000 times Test 001: 13.5609 seconds Overall: 19.4610 seconds Test 002 -- String character swap (alternate), 2500000 times Test 002: 8.5599 seconds Overall: 14.6769 seconds Test 003 and 004 -- Insert 500000 nodes into a priority queue using pop/push (003), plus Set calls (004) Test 003: 71.3116 seconds Test 004: 1.1641 seconds Overall: 73.5490 seconds Test 005 and 006 -- Insert 500000 nodes into a priority queue using pop/pushObjects (005), plus Set calls (006) Test 005: 77.3168 seconds Test 006: 1.0839 seconds Overall: 74.4669 seconds }}} Ruby 1.9 {{{ Test 001 -- String character swap, 2500000 times Test 001: 9.7238 seconds Overall: 14.9907 seconds Test 002 -- String character swap (alternate), 2500000 times Test 002: 8.4750 seconds Overall: 13.9169 seconds Test 003 and 004 -- Insert 500000 nodes into a priority queue using pop/push (003), plus Set calls (004) Test 003: 23.5861 seconds Test 004: 1.2312 seconds Overall: 27.4714 seconds Test 005 and 006 -- Insert 500000 nodes into a priority queue using pop/pushObjects (005), plus Set calls (006) Test 005: 22.4412 seconds Test 006: 1.1627 seconds Overall: 23.1751 seconds }}} MacRuby {{{ Test 001 -- String character swap, 2500000 times Test 001: 27.9602 seconds Overall: 36.8786 seconds Test 002 -- String character swap (alternate), 2500000 times Test 002: 24.1841 seconds Overall: 32.5536 seconds Test 003 and 004 -- Insert 500000 nodes into a priority queue using pop/push (003), plus Set calls (004) Test 003: 22.8907 seconds Test 004: 2.4607 seconds Overall: 28.7985 seconds Test 005 and 006 -- Insert 500000 nodes into a priority queue using pop/pushObjects (005), plus Set calls (006) Test 005: 28.8811 seconds Test 006: 2.5541 seconds Overall: 27.3512 seconds }}} -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:12> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by vincent.isambart@…): MacRuby's strings will be rewritten at least partially to fix lots of problems with the current implementation so it's no use benchmarking them yet. -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:13> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by jordan.breeding@…): Okay, I will keep it around just in case so I can test again after the changes. String appears to be the biggest loss of performance for me right now, so even though push/pushObjects/pop seem like they could be slightly faster (compared to 1.9) they aren't as far behind to begin with (I am assuming it is really Array which is slightly slower). I will test again after general performance work has been done and String is reworked. Do you know if the String reworking is targeted for 0.5, or later? -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:14> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by lsansonetti@…): The new String is for later (0.6). I didn't shark your swap benchmark but it creates many temporary objects (one String#dup, 2 String#[]) and our object allocator is currently slow, which could explain why it's slower than 1.9. This should definitely be faster once our Strings are no longer pure CF objects, and can benefit of the thread local GC optimization. -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:15> MacRuby <http://macruby.org/>
#424: performance regression from ruby 1.8 and 1.9 -------------------------------------+-------------------------------------- Reporter: jordan.breeding@… | Owner: lsansonetti@… Type: defect | Status: new Priority: blocker | Milestone: Component: MacRuby | Keywords: -------------------------------------+-------------------------------------- Comment(by martinlagardette@…): With the new strings: ruby 1.9 {{{ Test 001 -- String character swap, 2500000 times Test 001: 4.7923 seconds Overall: 9.3636 seconds Test 002 -- String character swap (alternate), 2500000 times Test 002: 4.1403 seconds Overall: 8.8592 seconds Test 003 and 004 -- Insert 500000 nodes into a priority queue using pop/push (003), plus Set calls (004) Test 003: 10.0445 seconds Test 004: 0.4814 seconds Overall: 13.3296 seconds Test 005 and 006 -- Insert 500000 nodes into a priority queue using pop/pushObjects (005), plus Set calls (006) Test 005: 10.9869 seconds Test 006: 0.5723 seconds Overall: 11.6866 seconds }}} MacRuby {{{ Test 001 -- String character swap, 2500000 times Test 001: 13.6413 seconds Overall: 17.3031 seconds Test 002 -- String character swap (alternate), 2500000 times Test 002: 11.3426 seconds Overall: 15.7529 seconds Test 003 and 004 -- Insert 500000 nodes into a priority queue using pop/push (003), plus Set calls (004) Test 003: 12.6970 seconds Test 004: 1.1502 seconds Overall: 15.9674 seconds Test 005 and 006 -- Insert 500000 nodes into a priority queue using pop/pushObjects (005), plus Set calls (006) Test 005: 16.4765 seconds Test 006: 1.2515 seconds Overall: 16.1109 seconds }}} Maybe still not as fast as ruby19, but still much faster than before :-) -- Ticket URL: <http://www.macruby.org/trac/ticket/424#comment:16> MacRuby <http://macruby.org/>
participants (1)
-
MacRuby