[MacRuby-devel] Ruby sort algorithm

Morgan Schweers cyberfox at gmail.com
Mon Jan 31 12:45:14 PST 2011


Greetings,
Sorry; I neglected that aspect of your code.

If, instead of '@children = ...' you were to go:

def sort_children2
  new_children = @children.sort_by.with_index { |*args| args}
  set_modified if @children != new_children
  @children = new_children
end

It adds an extra 'n' in doing the comparison, but no more than the map at
the end of your function.

That said, I will acknowledge that I don't know how the quicksort used by
Ruby chooses its pivot.  If it uses the last value as it's pivot in each
partition, it will behave pessimally in your case.  If it uses the middle
value, it will perform well.

--  Morgan

On Mon, Jan 31, 2011 at 10:23 AM, Robert Rice <rice.audio at pobox.com> wrote:

> Hi Morgan:
>
> Thanks again.
>
> I understand the concept now of adding a secondary sort key but I was
> unfamiliar with the .with_index enumerator method.
>
> My app is usually sorting a small array of XML elements that will already
> be sorted but I need to know if the sequence has changed to update the file.
> Could .sort_by return a true or false indicating if the sequence was
> changed? Otherwise I need a first pass to check the sequence.
>
> Thanks,
> Bob Rice
>
> On Jan 31, 2011, at 4:57 AM, Morgan Schweers wrote:
>
> Greetings,
> The long example didn't work quite right.  You can see it here:
> https://gist.github.com/803842
>
> --  Morgan
>
> On Mon, Jan 31, 2011 at 1:55 AM, Morgan Schweers <cyberfox at gmail.com>wrote:
>
>> Greetings,
>> Ouch.  That's...probably painfully expensive for large @children arrays.
>>  Trying to understand, it looks like you're sorting on child.sequence,
>> and keeping each child with the same sequence in the same order as they
>> are initially in the @children array?
>>
>> You could try something like this:
>>
>>
>>
>>   def sort_children2
>>
>>     @children = @children.sort_by.with_index { |child, index|
>> [child.sequence, index]}
>>
>>   end
>>
>>
>>
>> One key to understanding this is that arrays elements ([child.sequence,
>> index] here) compare on each element when #<=> is used, so you're
>> essentially adding the elements index in the array as a secondary sort key.
>>  The call of #sort_by and then #with_index is chaining enumerators.
>>  (Hopefully I've gotten that basically right; it's not the simplest part of
>> Ruby, but it's fascinating...)  This all allows us to use an unstable (but
>> fast!) sort, while essentially adding additional sort keys that keep it
>> stable.
>>
>> A shorter version reads:
>>
>>    def sort_children2
>>     @children = @children.sort_by.with_index { |*args| args}
>>   end
>>
>>
>> Clever, perhaps, but a little obscure.  This works because |*args| stuffs
>> all the arguments into an array, which coincidentally is exactly where we
>> want them.
>>
>> Here's a fun little piece of code to demonstrate what I'm doing:
>>
>> # This example is flawed, but hopefully useful for demonstration purposes.
>>
>> def test_stable_sorting
>>
>>   ary = (1..100).to_a.shuffle + (1..100).to_a.shuffle
>>
>>
>>   # This associates an ordering with the randomized numbers
>>
>>   idx = 0
>>
>>   paired = ary.collect {|value| [value, idx += 1]}
>>
>>   puts "Now the numbers are paired; the first is the random number
>> 1-100,"
>>
>>   puts "the second is its sequence within the 200 entries."
>>
>>   puts paired.inspect
>>
>>   puts
>>
>>   puts "#sort is unstable; you'll see many entries with equal first
>> values"
>>
>>   puts "where the first of them has a higher second (sequence) number,
>> meaning"
>>
>>   puts "it's out of order now."
>>
>>   puts paired.sort {|x,y| x.first <=> y.first }.inspect
>>
>>   puts
>>
>>   puts "Now we sort exclusively on the value, while preserving ordering;"
>>
>>   puts "All entries with identical first values should have second
>> values"
>>
>>   puts "that are also in numerical order."
>>
>>   puts paired.sort_by.with_index {|x, i| [x.first, i]}.inspect
>>
>> end
>>
>> --  Morgan
>>
>> On Sun, Jan 30, 2011 at 5:22 PM, Robert Rice <rice.audio at pobox.com>wrote:
>>
>>> Hi Morgan:
>>>
>>> Thanks for the info although I have to admit that I don't understand how
>>> your solutions work.
>>>
>>> I also needed my sort to return a modified flag to update the file if
>>> changed so I wrote my own bubble sort.
>>> I haven't test this yet:
>>>
>>>  def sort_children # Don't trust Ruby sort to maintain sequence, also
>>> need set_modified
>>>  return if @children.empty?
>>>
>>>  arr = @children.map{ | child | [ child.sequence, child ]
>>>  modified = false
>>>
>>> while true # bubble sort
>>>  change = false
>>>  new_arr = []
>>>
>>>  arr.each_with_index do | new_child, index |
>>> if index.zero?
>>>  prior_child = new_child
>>>  new_arr[ 0 ] = new_child
>>>
>>>  elsif new_child.first < prior_child.first # OOO
>>>  change = true
>>>  new_arr.insert( index - 1, new_child )
>>>
>>>  else
>>>  new_arr[ index ] = new_child
>>> prior_child = new_child
>>>  end
>>>  end
>>> break unless change
>>>
>>>  modified = true
>>>  arr = new_arr
>>>  end
>>> return unless modified
>>>
>>>  @children = arr.map{ | child | child.last }
>>>  set_modified()
>>>  end
>>>
>>> Thanks,
>>> Bob Rice
>>>
>>> On Jan 30, 2011, at 7:19 PM, Morgan Schweers wrote:
>>>
>>> Greetings,
>>> Ruby's sort algorithm is quicksort, last I checked, and quicksort is not
>>> stable (which is the property you're looking for in a sort).  There are a
>>> bunch of ways around this, including writing your own, but one cute, quick,
>>> but possibly performance-impairing, approach I've seen (Matz's suggestion)
>>> is:
>>>
>>>   n = 0
>>>   ary.sort_by {|x| [x, n += 1]}
>>> Apparently it's also possible in 1.9.x (and thus MacRuby) to do:
>>>
>>>   ary.sort_by.with_index {|x, i| [x, i]}
>>>
>>> It's not much faster, though.  In the end, I'd probably suggest writing
>>> your own, if the performance of this is too poor.  (One person claimed this
>>> was on the order of 50 times slower; I haven't benchmarked it myself.)
>>>  Mergesort is stable, for example.
>>>
>>> This is a common problem; most systems don't need a stable sort, so they
>>> use Quicksort as a 'best general purpose' algorithm.
>>>
>>> --  Morgan
>>>
>>> On Sun, Jan 30, 2011 at 12:03 PM, Robert Rice <rice.audio at pobox.com>wrote:
>>>
>>>> Hi:
>>>>
>>>> Does the Ruby Array sort algorithm maintain the relative position for
>>>> children returning the same value for the comparison? I had an instance
>>>> where two children having the compare value were interchanged.
>>>>
>>>> Thanks,
>>>> Bob Rice
>>>>
>>>> _______________________________________________
>>>> MacRuby-devel mailing list
>>>> MacRuby-devel at lists.macosforge.org
>>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
>>>>
>>>
>>> _______________________________________________
>>> MacRuby-devel mailing list
>>> MacRuby-devel at lists.macosforge.org
>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
>>>
>>>
>>>
>>> _______________________________________________
>>> MacRuby-devel mailing list
>>> MacRuby-devel at lists.macosforge.org
>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
>>>
>>>
>>
> _______________________________________________
> MacRuby-devel mailing list
> MacRuby-devel at lists.macosforge.org
> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
>
>
>
> _______________________________________________
> MacRuby-devel mailing list
> MacRuby-devel at lists.macosforge.org
> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.macosforge.org/pipermail/macruby-devel/attachments/20110131/15c7067c/attachment-0001.html>


More information about the MacRuby-devel mailing list