A ruby string with ASCII-8BIT encoding is really an NSData and not an NSString
Hello,

I've been playing around with MacRuby a bunch and I hit a point of confusion. I basically did ["...."].pack("H*"), which returns a Ruby string. This string is an ASCII-8BIT encoded Ruby string; ASCII-8BIT is an alias for BINARY (in other words, it has no encoding). IMO, this should really be backed by an NSData rather than an NSString, because treating raw bytes as an NSString doesn't make much sense. In fact, it's pretty much impossible to get any useful information out of an ASCII-8BIT encoded Ruby string from the Objective-C side of things.

To get around this, I did the following:

    class String
      def to_data
        return NSData.data unless length > 0

        bytes = self.bytes.to_a
        p = Pointer.new_with_type("char *", bytes.length)

        bytes.each_with_index do |char, i|
          p[i] = char
        end

        NSData.dataWithBytes(p, length:bytes.length)
      end
    end

I'm not sure what the solution to this is, but I thought I'd bring up the issue on the mailing list.

Thanks,
-carl
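As a plain-Ruby illustration of the starting point (no MacRuby required), the sketch below packs a hex string and inspects the result. The value `deadbeef` is an arbitrary stand-in for the elided "...." in the message above, and the variable names are made up for this example. On Ruby 1.9 and later, `Array#pack` with the `H*` directive should return a string tagged with the binary (ASCII-8BIT) encoding.

```ruby
# Arbitrary sample input; the original message elides the real hex string.
hex = "deadbeef"

# Array#pack("H*") turns each pair of hex digits into one raw byte.
packed = [hex].pack("H*")

puts packed.encoding                            # the binary (ASCII-8BIT) encoding
puts packed.bytes.inspect                       # the raw byte values
puts Encoding::BINARY == Encoding::ASCII_8BIT   # BINARY is an alias

# unpack("H*") reverses the conversion and returns the hex digits.
puts packed.unpack("H*").first == hex
```

Nothing here touches Cocoa; it only shows what kind of Ruby string the discussion is about.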
On May 28, 2010, at 10:03 AM, Carl Lerche wrote:
Hello,
Hi!
I've been playing around with MacRuby a bunch and I hit a point of confusion. I basically did ["...."].pack("H*"), which returns a Ruby string. This string is an ASCII-8BIT encoded Ruby string; ASCII-8BIT is an alias for BINARY (in other words, it has no encoding). IMO, this should really be backed by an NSData rather than an NSString, because treating raw bytes as an NSString doesn't make much sense. In fact, it's pretty much impossible to get any useful information out of an ASCII-8BIT encoded Ruby string from the Objective-C side of things.
All true statements. The problem, however, is with Ruby. Ruby's use of Strings to hold arbitrary binary data goes back a long way, and is deeply ingrained in a lot of Ruby libraries. It has even led to some ugly habits, like using regular expressions to parse binary data. Unfortunately, this also means that it would be very difficult to replace a Ruby binary string with an NSData without also re-implementing all the string methods that various Ruby libraries use. My recommendation (something I've been contemplating myself) would be to petition the ruby-core list for the addition of an explicit binary data class in Ruby 2.0. In the meantime...
To get around this, I did the following:
    class String
      def to_data
        return NSData.data unless length > 0

        bytes = self.bytes.to_a
        p = Pointer.new_with_type("char *", bytes.length)

        bytes.each_with_index do |char, i|
          p[i] = char
        end

        NSData.dataWithBytes(p, length:bytes.length)
      end
    end
I actually really like this! I think there's a case to be made for a set of MacRuby convenience functions for working between Ruby and Obj-C. For example, RubyCocoa's #to_plist and OSX::load_plist still need to be implemented. I'd propose a 'convenience.rb' library for all of these... Laurent?

- Josh
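Josh's remark about Ruby strings doubling as binary buffers, and about regular expressions being used to parse them, can be illustrated in plain Ruby. The sketch below is only an illustration with made-up variable names: it checks the eight-byte PNG file signature once with a binary regular expression and once with `String#unpack`.

```ruby
# The eight-byte PNG signature, built with Array#pack as in the original post.
signature = ["89504e470d0a1a0a"].pack("H*")

# Regex approach: the /n flag makes the pattern binary, so \x89 is accepted.
looks_like_png = !!(signature =~ /\A\x89PNG\r\n\x1a\n\z/n)

# Field-oriented approach: unpack one unsigned byte and three raw characters.
first_byte, tag = signature.unpack("Ca3")

puts looks_like_png
puts first_byte
puts tag
```

Both approaches rely on String methods, which is why swapping binary strings for a different class would mean re-implementing those methods.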
Hi Carl,

Why not use -[NSString dataUsingEncoding:]? It would perform better than building a byte array the way you do.

As Josh mentioned, the fact that the String class can hold arbitrary data (not only characters) is a design issue of Ruby that we unfortunately need to follow for compatibility purposes. Fortunately, all Ruby strings in MacRuby are NSStrings and the Cocoa APIs can be easily used to retrieve NSData objects.

I thought about adding a convenience method on String to return NSData objects, but Strings already respond to dataUsingEncoding:, so I'm not sure another method is needed. Conversely, NSData objects respond to #to_str, which returns a Ruby string (with BINARY encoding).

Laurent
_______________________________________________
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
Laurent,

There is no NS* encoding that will correctly return the data from the ASCII-8BIT encoded string without mangling it (as far as I could figure out, at least; I spent a good deal of time trying). Basically, there was no way to get a data object using the NSString data* methods. My general understanding is that Cocoa stores NSStrings internally as Unicode. I have no idea how MacRuby maps Ruby's ASCII-8BIT encoding to Unicode, but there is no way to get the data back in its original form (round trip).

If there is something that I missed, please let me know.

Thanks,
-carl
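The kind of mangling described here can be reproduced in plain Ruby, without MacRuby or Cocoa. This is only an analogy: it does not show how MacRuby maps ASCII-8BIT strings onto NSString internally (the thread leaves that open), just why arbitrary bytes generally do not survive being treated as Unicode text. The sample bytes and variable names are arbitrary, and `String#scrub` assumes Ruby 2.1 or later.

```ruby
# Four arbitrary bytes; 0xff can never appear in well-formed UTF-8.
raw = ["ff00fe80"].pack("H*")

# Relabel the same bytes as UTF-8 text without changing them.
as_text = raw.dup.force_encoding(Encoding::UTF_8)
puts as_text.valid_encoding?      # the bytes are not valid UTF-8

# "Repairing" the text replaces the invalid bytes, so the original is lost.
repaired = as_text.scrub("?")
puts repaired.valid_encoding?
puts repaired.bytes == raw.bytes
```

Once the invalid bytes have been replaced there is nothing left to recover them from, which is the round-trip failure in miniature.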
I see... you're right. I guess adding a #to_data method makes sense then. Could you file a ticket on trac?

I believe the method should only work for Ruby strings and raise an exception for NSStrings.

Laurent
Alright,

Here is the ticket: https://www.macruby.org/trac/ticket/732

-carl
participants (3)
- Carl Lerche
- Josh Ballanco
- Laurent Sansonetti