[MacRuby-devel] Strings, Encodings and IO
Manfred Stienstra
manfred at gmail.com
Mon Apr 6 23:15:05 PDT 2009
On Apr 7, 2009, at 7:47 AM, Vincent Isambart wrote:
I have two small comments and a general statement about your essay:
> A few functions of 1.9 may also be disabled (like force_encoding). Of
> course it would be possible to add the full functionality of Ruby 1.9
> strings on ByteString but it wouldn't be worth it.
The force_encoding method will be absolutely _vital_ to working with
encodings in Ruby. Most library authors don't know anything about
character encoding and _will_ do the wrong things. And I'm not even
talking about libraries written for 1.8 which are totally unaware of
the String changes. For example, in a fictional HTTP library that
totally doesn't exist today:
  response = HTTP.get('http://www.google.com')
  response.body.encoding #=> #<Encoding:US-ASCII>
Even though the headers clearly say: "Content-Type: text/html;
charset=UTF-8". So we need force_encoding to fix these problems. Even
the library author probably needs the force_encoding method, because
somewhere deep down in the library there might be C / Obj-C code that
returns a byte string to Ruby without specifying the encoding.
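To make the point concrete, here is a rough sketch of the repair in Ruby 1.9
style. The body string and the content_type value are made up for
illustration and stand in for whatever a real library would hand back;
only force_encoding, valid_encoding? and Encoding.find are actual String /
Encoding API.

```ruby
# Hypothetical response data: UTF-8 bytes that a library mislabeled as US-ASCII.
content_type = "text/html; charset=UTF-8"
body = "caf\u00E9".dup.force_encoding(Encoding::US_ASCII)

# The label is wrong: the bytes are not valid US-ASCII.
mislabeled_ok = body.valid_encoding?

# Take the charset from the header and relabel the same bytes.
# force_encoding changes only the encoding tag, never the bytes.
charset = content_type[/charset=([^;\s]+)/i, 1]
fixed = body.dup.force_encoding(Encoding.find(charset))

puts fixed.encoding        # the encoding named in the header
puts fixed.valid_encoding? # the bytes are valid under the new label
puts fixed.length          # characters, not bytes
```

Note that String#encode would not help here, since it transcodes from the
(wrong) source encoding instead of just relabeling the bytes.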
> Ruby 1.9 also has default code and default external encodings
> different depending on the environment, but I think always both of
> them set to UTF-8 would be the best. (we may even completely ignore
> the encoding pragmas in the code not to complicate the parser).
Also a no-go. ERB uses this pragma to signal the encoding of the
template; encodings will break if you ignore it.
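A small sketch of what I mean, assuming the ERB that ships with Ruby 1.9
or later. The template text is invented for the example; the magic comment
in the first tag is what tells ERB which encoding to compile the template
under.

```ruby
require 'erb'

# Invented template: the first tag is an ERB comment carrying the coding pragma.
source = "<%#-*- coding: ISO-8859-1 -*-%>\n<%= __ENCODING__ %>"

# __ENCODING__ inside the template reflects the pragma, not the caller's file.
rendered = ERB.new(source).result.strip
puts rendered
```

If the parser threw the pragma away, the template would be evaluated under
whatever the default is, and literals in it would be tagged accordingly.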
Finally, I don't think it's a good idea to discuss this at great length
without actual code, but in order to write a compatible implementation
most (if not all) of the String awkwardness will have to be implemented.
Manfred