[MacRuby-devel] macruby produces strings with encodings that differ from MRI

Steve Clarke steve at sclarkes.me.uk
Sat Sep 17 13:48:14 PDT 2011


Code
========

ABC='ABC'
puts "ABC[0] encoding is #{ABC[0].encoding}"
puts "?\\xff encoding is #{?\xff.encoding}"


Output 
========


MRI output 

ABC[0] encoding is US-ASCII
?\xff encoding is ASCII-8BIT



macruby output

ABC[0] encoding is UTF-8
?\xff encoding is UTF-8


The encodings reported above did not seem to be affected by the encoding of the source file.  I tried both ASCII and UTF-8.

When the same code is run interactively, macirb gives the same results as macruby.
irb for MRI, however, produces UTF-8 strings in both cases! This seemed very odd, but I'm fairly sure it's because I have this environment variable set:
LANG=GB.UTF-8
When I changed it to LANG=GB.US_ASCII, irb for MRI reported US-ASCII encoding for 'abc'[0]. macirb still used UTF-8.
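
A quick way to check the LANG theory is to print the encodings Ruby derives from the environment next to the encoding of a literal. This is only a sketch: encoding_report is a name made up for this example, and I am assuming Encoding.find('locale') and Encoding.default_external behave under macruby as they do under MRI 1.9.

```ruby
# Hypothetical helper: collect the environment-derived encodings and the
# encoding of a one-character substring of a literal.
def encoding_report
  {
    'LANG'             => ENV['LANG'],
    'locale'           => Encoding.find('locale'),
    'default_external' => Encoding.default_external,
    'literal'          => 'abc'[0].encoding
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Running this once as a script and once pasted into irb, under each LANG setting, should show whether the literal's encoding follows the locale only in the interactive case.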

(I discovered this when trying to get ruby-mysql to work with macruby.  It doesn't work as-is, but it seems to work with a few mods that use force_encoding to make MRI and macruby produce the same output.)
I abandoned my earlier attempts to use postgres with macruby via the pg gem.  It failed regularly, but in unpredictable ways that, as far as I could tell, were related to memory management problems.
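
For anyone wanting to try the same kind of force_encoding workaround, here is a minimal sketch of the idea. It is not the actual change made to ruby-mysql; tagged is a hypothetical helper invented for this example. force_encoding only changes the encoding label on a string and leaves the bytes alone (unlike encode, which transcodes), so applying the same label on both interpreters should make the results agree.

```ruby
# Hypothetical helper (not from ruby-mysql): return a copy of str whose
# bytes are unchanged but whose encoding label is enc.
def tagged(str, enc = Encoding::BINARY)
  str.dup.force_encoding(enc)
end

ch = tagged('ABC'[0])
puts ch.encoding == Encoding::BINARY  # true
puts ch.bytes.to_a.inspect            # [65]
```

Working on a copy (dup) leaves the caller's string untouched, which matters if the same string is used elsewhere with its original encoding.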



