[MacRuby] #1028: Strings generated by a directory listing of files with Unicode chars in name cannot be compared

MacRuby ruby-noreply at macosforge.org
Mon Dec 6 17:34:58 PST 2010


#1028: Strings generated by a directory listing of files with Unicode chars in
name cannot be compared
----------------------------+-----------------------------------------------
 Reporter:  jazzbox@…       |       Owner:  lsansonetti@…        
     Type:  defect          |      Status:  new                  
 Priority:  blocker         |   Milestone:  MacRuby 0.8          
Component:  MacRuby         |    Keywords:                       
----------------------------+-----------------------------------------------

New description:

 in an empty directory:

 {{{
 touch rübe.txt; touch tomate.txt
 }}}

 files with Unicode characters in the name give a false result:

 {{{
 macruby -e 'p Dir["r*.txt"].first == "rübe.txt"'
 false
 }}}

 without Unicode chars everything is fine:

 {{{
 macruby -e 'p Dir["t*.txt"].first == "tomate.txt"'
 true
 }}}

 All strings have the same encoding: #<Encoding:UTF-8>

--

Comment(by vincent.isambart@…):

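 I'll have to check on my Mac when I get back home, but I suspect a
 normalization problem (http://www.unicode.org/reports/tr15/). If I remember
 correctly, the Mac OS X file system is known to use form D (Canonical
 Decomposition), whereas most editors use form C (Canonical Decomposition
 followed by Canonical Composition). If so, the "ü" returned by Dir would be
 "u" followed by U+0308 (combining diaeresis), while the literal typed on the
 command line would be the single code point U+00FC, so the two strings
 would not compare equal.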

 Something like:
 {{{
 macruby -e 'p Dir["r*.txt"].first.precomposedStringWithCanonicalMapping == "rübe.txt".precomposedStringWithCanonicalMapping'
 }}}
 would probably work.

 Note that I'm pretty sure Laurent at some point made some path strings
 convert automatically to form C, but that seems to have disappeared when we
 rewrote String.
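
 The suspected mismatch can be reproduced without touching the file system.
 The following is only a sketch, and it assumes standard Ruby 2.2 or later,
 where String#unicode_normalize is available, rather than MacRuby itself.
 The two names are written with explicit escapes so the result does not
 depend on how an editor or terminal encodes "ü". same_filename? is a
 hypothetical helper introduced for illustration.

```ruby
# The same visible file name in both normalization forms.
nfc_name = "r\u00FCbe.txt"    # form C: precomposed U+00FC
nfd_name = "ru\u0308be.txt"   # form D: "u" followed by U+0308 (combining diaeresis)

# Hypothetical helper (not a MacRuby or Ruby API): compare two names
# after bringing both to form C.
def same_filename?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

raw_equal        = (nfc_name == nfd_name)            # plain comparison of code points
normalized_equal = same_filename?(nfc_name, nfd_name)

p nfc_name.codepoints.map { |c| format("U+%04X", c) }
p nfd_name.codepoints.map { |c| format("U+%04X", c) }
p raw_equal
p normalized_equal
```

 If the normalization hypothesis is right, the string returned by Dir would
 behave like nfd_name here, which would explain why the plain comparison
 fails while the normalized one succeeds.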

-- 
Ticket URL: <http://www.macruby.org/trac/ticket/1028#comment:2>
MacRuby <http://macruby.org/>


