[MacRuby-devel] [MacRuby] #741: All character encoding of the string become UTF-8 when use force_encoding.
MacRuby
ruby-noreply at macosforge.org
Tue Jun 8 07:07:38 PDT 2010
#741: All character encoding of the string become UTF-8 when use force_encoding.
----------------------------------+-----------------------------------------
 Reporter:  watson1978@…          |       Owner:  lsansonetti@…
     Type:  defect                |      Status:  new
 Priority:  blocker               |   Milestone:
Component:  MacRuby               |    Keywords:
----------------------------------+-----------------------------------------
Test code:
{{{
$ cat t.rb
def escape(string)
  # original : 'CGI::escape'
  p string.encoding
  p string
  string.gsub(/([^ a-zA-Z0-9_.-]+)/) do
    puts "Bytesize: #{$1.bytesize}"
    '%' + $1.unpack('H2' * $1.bytesize).join('%').upcase
  end.tr(' ', '+')
end
value = "\xe3\x82\x86\xe3\x81\x8d\xe3\x81\xb2\xe3\x82\x8d"
value.force_encoding("utf-8")
p escape(value)
puts "----"
value = "\xa4\xe6\xa4\xad\xa4\xd2\xa4\xed"
value.force_encoding("EUC-JP")
p escape(value)
puts "----"
value = "\x82\xe4\x82\xab\x82\xd0\x82\xeb"
value.force_encoding("Shift_JIS")
p escape(value)
}}}
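For reference, the escaping step in the test code relies on `unpack('H2' * bytesize)`, which should yield one two-digit hex string per raw byte, independent of the string's encoding label. A minimal sketch of that step on its own, using the first EUC-JP character from the test (the variable names `sample`, `hex_pairs`, and `escaped` are only illustrative and not part of the ticket):

```ruby
# Illustrative sketch only: isolates the percent-escaping step from t.rb.
# The two bytes are the first EUC-JP character of the test value.
sample = [0xa4, 0xe6].pack('C*').force_encoding('EUC-JP')

# 'H2' consumes one byte and emits it as two hex digits (high nibble first),
# so repeating it bytesize times covers every byte of the string.
hex_pairs = sample.unpack('H2' * sample.bytesize)

# Join the pairs with '%' and upcase, as the escape method in t.rb does.
escaped = '%' + hex_pairs.join('%').upcase

p hex_pairs  # ["a4", "e6"]
puts escaped # %A4%E6
```

Because the output is derived from raw bytes, any difference in the escaped result points at the bytes themselves having changed.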
Result on Ruby 1.9.2 preview3:
{{{
$ ruby -v t.rb
ruby 1.9.2dev (2010-05-31 revision 28117) [x86_64-darwin10.3.0]
#<Encoding:UTF-8>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:EUC-JP>
"\x{A4E6}\x{A4AD}\x{A4D2}\x{A4ED}"
Bytesize: 8
"%A4%E6%A4%AD%A4%D2%A4%ED"
----
#<Encoding:Shift_JIS>
"\x{82E4}\x{82AB}\x{82D0}\x{82EB}"
Bytesize: 8
"%82%E4%82%AB%82%D0%82%EB"
}}}
Result on MacRuby SVN trunk HEAD:
{{{
$ macruby -v t.rb
MacRuby 0.7 (ruby 1.9.2) [universal-darwin10.0, x86_64]
#<Encoding:UTF-8>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:EUC-JP>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
----
#<Encoding:Shift_JIS>
"ゆきひろ"
Bytesize: 12
"%E3%82%86%E3%81%8D%E3%81%B2%E3%82%8D"
}}}
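The two results differ only in the EUC-JP and Shift_JIS cases: MacRuby reports the forced encoding, but the string is printed and escaped as the 12-byte UTF-8 form instead of the original 8 bytes. A smaller check along these lines might help narrow the problem down. It is only a sketch, assuming that `force_encoding` is expected to relabel the string without touching its bytes, as the Ruby 1.9.2 output suggests; the variable names are illustrative:

```ruby
# Illustrative sketch only, not part of the original report.
# Build the Shift_JIS test value from explicit byte values.
raw = [0x82, 0xe4, 0x82, 0xab, 0x82, 0xd0, 0x82, 0xeb].pack('C*')
bytes_before = raw.bytes.to_a

# Relabel a copy; the assumption is that only the encoding tag changes.
labeled = raw.dup.force_encoding('Shift_JIS')
bytes_after = labeled.bytes.to_a

# Same pattern as in t.rb, to see what the regexp capture contains.
captured = labeled[/[^ a-zA-Z0-9_.-]+/]

puts labeled.encoding.name        # Shift_JIS
puts labeled.bytesize             # 8
puts bytes_before == bytes_after  # true
puts captured.bytesize            # 8
```

The comments show what Ruby 1.9.2 would be expected to print, based on the result above. If MacRuby prints 12 for either byte size, the conversion is happening inside the string itself rather than in `unpack`.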
--
Ticket URL: <http://www.macruby.org/trac/ticket/741>
MacRuby <http://macruby.org/>