#1048: Performance of Hash with an Array as a key
----------------------------------+-----------------------------------------
 Reporter:  yasuimao@…            |       Owner:  lsansonetti@…
     Type:  defect                |      Status:  new
 Priority:  blocker               |   Milestone:
Component:  MacRuby               |    Keywords:
----------------------------------+-----------------------------------------
I found another performance issue related to text processing with Hash. This little script counts n-grams (n-word sequences) in a text. The same script runs much faster on Ruby 1.8.7, and its running time there is not affected by the number of array elements.

'''Script'''
{{{
n = 1
hash = Hash.new(0)
words = File.open("test.txt").read.scan(/\w+/)
(words.length - n).times do |i|
  hash[words[i..n+i]] += 1
end
}}}

I used a text file with about 8000 English words. I ran the test 3 times for each of 1-grams to 4-grams (1 to 4 array elements) to check that the results were consistent. Only the processing times of the block part are shown.
[[BR]]

'''Results''': MacRuby - hash with array as key (in sec.)
{{{
                 run 1    run 2    run 3
word   (n=0)      3.95     4.00     3.96
2-gram (n=1)     12.35    13.02    13.16
3-gram (n=2)     17.97    17.90    17.92
4-gram (n=3)     21.26    21.22    20.78
}}}

'''Results''': Ruby 1.8.7 - hash with array as key (in sec.)
{{{
                 run 1    run 2    run 3
word   (n=0)     0.049    0.048    0.047
2-gram (n=1)     0.048    0.049    0.054
3-gram (n=2)     0.047    0.047    0.048
4-gram (n=3)     0.049    0.047    0.048
}}}
[[BR]]

To compare this with the performance of a String as a key, I joined the array and ran the script again.
{{{
hash[words[i..n+i].join(" ")] += 1
}}}

For the word count, I used this script.
{{{
words.length.times do |i|
  hash[words[i]] += 1
end
}}}

'''Results''': MacRuby - hash with string as key (array joined) (in sec.)
{{{
                 run 1    run 2    run 3
word (string)    0.030    0.029    0.027
2-gram           0.17     0.17     0.16
3-gram           0.18     0.18     0.19
4-gram           0.24     0.21     0.22
}}}

'''Results''': Ruby 1.8.7 - hash with string as key (array joined) (in sec.)
{{{
                 run 1    run 2    run 3
word (string)    0.0092   0.0091   0.0094
2-gram           0.045    0.041    0.039
3-gram           0.041    0.043    0.041
4-gram           0.048    0.048    0.049
}}}
[[BR]]

The second script ran much faster, but by the figures above MacRuby is still approximately 3 to 5 times slower than Ruby 1.8.7.

-- 
Ticket URL: <http://www.macruby.org/trac/ticket/1048>
MacRuby <http://macruby.org/>
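
For anyone who wants to reproduce the comparison without the original test.txt, the following is a minimal sketch and not the exact script used for the timings above. The helper name `count_ngrams`, its `join_keys` flag, and the synthetic word list are assumptions made for illustration. It uses `Benchmark.realtime` from the standard library and times the Array-key and joined-String-key variants side by side, so absolute numbers will differ from those reported in the ticket.

```ruby
require 'benchmark'

# Hypothetical helper (not from the ticket): counts n-grams of `size` words,
# using either the Array slice itself or its joined String as the Hash key.
def count_ngrams(words, size, join_keys = false)
  counts = Hash.new(0)
  return counts if words.length < size
  (words.length - size + 1).times do |i|
    gram = words[i, size]                    # slice of exactly `size` words
    key = join_keys ? gram.join(" ") : gram  # String key or Array key
    counts[key] += 1
  end
  counts
end

# Synthetic stand-in for test.txt (assumption): 8000 words from a small vocabulary.
vocab = %w[the quick brown fox jumps over a lazy dog and runs away]
words = Array.new(8000) { |i| vocab[(i * 7 + i / 3) % vocab.length] }

# Time both key types for 1-grams through 4-grams.
(1..4).each do |size|
  array_time  = Benchmark.realtime { count_ngrams(words, size) }
  string_time = Benchmark.realtime { count_ngrams(words, size, true) }
  printf("%d-gram  array key: %.4fs  joined string key: %.4fs\n",
         size, array_time, string_time)
end
```

Running the same file under both MacRuby and Ruby 1.8.7 should show whether the gap between the two columns grows with the n-gram size, as it does in the MacRuby results above. A real text file will have a much larger vocabulary than this synthetic list, so the number of distinct keys (and possibly the timings) will differ.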