My code receives XML data from a Web Service API call that is in UTF-8 encoding. This winds up in a string.

    return_data = NSURLConnection.sendSynchronousRequest(@request, returningResponse: response, error: error)
    str = NSString.alloc.initWithData(return_data, encoding: NSUTF8StringEncoding)
    puts "******* response encoding it #{str.encoding}"

The result of the puts above is 'MACINTOSH'.

I suspect the encoding of the string is not UTF-8, because when I try to parse the XML using REXML, I get:

    RegexpError: too short multibyte code

This occurs way down in REXML:

    /Library/Frameworks/MacRuby.framework/Versions/0.5/usr/lib/ruby/1.9.0/rexml/text.rb:132:in `check:'

In any case, my questions are:

1) If anyone has run across this what did you do?
2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the initWithData method call?
3) Suggestions?

Thanks,

Steve
Hi Steve, On Dec 5, 2009, at 1:45 PM, s.ross wrote:
In any case, my questions are:
1) If anyone has run across this what did you do?
I don't believe REXML works. In any case, I would recommend not using it. Since you're already using Cocoa, why not give NSXMLDocument a try?
2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the initWithData method call?
#encoding returns the fastest encoding available for the receiver. You may specify UTF-8 during the string creation, but if Cocoa can pick a smaller encoding at runtime (like ASCII) it will. This is different from the Ruby 1.9 semantics and we have a plan to fix that in 0.6.
3) Suggestions?
See my comment in 1) :) Laurent
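For comparison, the Ruby 1.9 semantics mentioned in the answer to question 2 can be sketched in plain Ruby (a recent MRI, not MacRuby 0.5). There the encoding is a label the caller attaches to the bytes, and validity is checked separately, so the label stays UTF-8 even when a narrower encoding could hold the text. The helper name `utf8_checked` and the sample payload are made up for illustration.

```ruby
# Plain Ruby (MRI) sketch; this is not how MacRuby 0.5 behaves.

# Label raw bytes as UTF-8 and fail early if they are not valid UTF-8.
# `utf8_checked` is a hypothetical helper name.
def utf8_checked(bytes)
  str = bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "payload is not valid UTF-8" unless str.valid_encoding?
  str
end

# Raw bytes as they might arrive from the network (binary-labelled).
raw = "<name>Ren\xC3\xA9</name>".b

str = utf8_checked(raw)
puts str.encoding      # the label we asked for
puts str.ascii_only?   # false here, because of the two-byte character
```

Checking `valid_encoding?` before handing the string to a parser turns a truncated or mislabelled payload into an explicit error instead of a failure deep inside the parser.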
Laurent-- Thanks for the quick reply. See comments below: On Dec 5, 2009, at 4:22 PM, Laurent Sansonetti wrote:
1) If anyone has run across this what did you do?
I don't believe REXML works. In any case, I would recommend not using it. Since you're already using Cocoa, why not give NSXMLDocument a try?
What I really want to use is Nokogiri. My main issue is that I'm having to reimplement XML-RPC because the Ruby Std. Lib version is broken over SSL. Even if it weren't, it's never been thread safe and thus can't operate asynchronously. As a result, what I have is an XML document inside an XML-RPC response envelope. That means I have to parse the document once to get the contents of the envelope (which is HTML-escaped), then parse those contents to get an XML document I can work with. I've been using XPath for that, and that's why I haven't moved over to NSXMLDocument.
Maybe I'm missing a bet here and should shift my strategy. I'll do some more reading...
2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the initWithData method call?
#encoding returns the fastest encoding available for the receiver. You may specify UTF-8 during the string creation, but if Cocoa can pick a smaller encoding at runtime (like ASCII) it will.
This is different from the Ruby 1.9 semantics and we have a plan to fix that in 0.6.
This is kind of surprising behavior. The 1.9 semantics are sufficiently different from 1.8.x that code that works correctly on 1.8.7 breaks awkwardly on 1.9. OK, but I fixed that in an MRI version and the gotcha above broke my MacRuby version. Now that I know this, I guess I can deal with it.
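The two-pass parse described above depends on the inner document being unescaped before the second parse. An XML parser's text accessor normally does this for you; if the payload is instead pulled out of the envelope as raw text, the five predefined XML entities have to be decoded by hand. A minimal plain-Ruby sketch, with a hypothetical helper name and sample payload, assuming only those five entities appear:

```ruby
# The five entities predefined by XML.
XML_ENTITIES = {
  "lt" => "<", "gt" => ">", "amp" => "&", "quot" => '"', "apos" => "'"
}.freeze

# Decode the predefined entities in a single pass, so that "&amp;lt;"
# becomes "&lt;" rather than "<". `unescape_xml` is a hypothetical name.
def unescape_xml(text)
  text.gsub(/&(lt|gt|amp|quot|apos);/) { XML_ENTITIES[$1] }
end

# Sample escaped payload, standing in for the envelope's contents.
escaped = "&lt;person&gt;&lt;name&gt;steve&lt;/name&gt;&lt;/person&gt;"
inner_xml = unescape_xml(escaped)
puts inner_xml
```

The single-pass substitution matters: replacing `&amp;` first and the others afterwards would decode doubly escaped text twice.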
_______________________________________________
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
Hi Steve, On Dec 5, 2009, at 5:10 PM, s.ross wrote:
What I really want to use is Nokogiri. My main issue is that I'm having to reimplement XML-RPC because the Ruby Std. Lib version is broken over SSL. Even if it weren't, it's never been thread safe and thus can't operate asynchronously. As a result, what I have is an XML document inside an XML-RPC response envelope. That means I have to parse the document once to get the contents of the envelope (which is HTML-escaped), then parse those contents to get an XML document I can work with. I've been using XPath for that, and that's why I haven't moved over to NSXMLDocument.
Maybe I'm missing a bet here and should shift my strategy. I'll do some more reading...
Did you file a bug on our tracker regarding the stdlib breakage? Is it related to timeout.rb? It looks like there is a MacRuby problem that should be fixed :) Laurent
On Dec 6, 2009, at 6:41 PM, Laurent Sansonetti wrote:
Did you file a bug on our tracker regarding the stdlib breakage? Is it related to timeout.rb? It looks like there is a MacRuby problem that should be fixed :)
I didn't file a bug because the problem I am working on requires Net::HTTP over SSL to retrieve the XML. That was simply not working, so I just went ahead and used NSURL.

As I understand the issue, Net::HTTP does not protect the buffer in a multithreaded environment. Here's where I ran up against the problem (pseudo code):

- HTTPS Get photo_id for nth 20 pictures
- In groups of 5 spin off a thread to HTTPS Get details for the pictures

The numbers 20 and 5 simply correspond to what the Web Service gives me. The idea is that because of network latency and the blocking nature of the HTTP traffic, there would likely be some parallelism achieved. It turns out this strategy is a huge win ... except that there is no concept of thread-local scope (some might contend that is an evil concept :), and thus the buffers used by Net::HTTP are at risk of corruption when used this way in a multithreaded environment.

Long way of saying, if there's a bug, I am sure it's in MRI as well.
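The pseudo code above can be sketched in plain Ruby. This only illustrates the grouping and threading pattern: `fetch_details` is a hypothetical stand-in for the HTTPS request (it sleeps instead of touching the network), and each thread collects its results in its own local array, so nothing is shared while the threads run.

```ruby
# Hypothetical stand-in for the HTTPS request that fetches one photo's
# details; it sleeps briefly to imitate network latency.
def fetch_details(photo_id)
  sleep 0.01
  { id: photo_id, title: "photo #{photo_id}" }
end

# Split the ids into groups and fetch each group on its own thread.
# Every thread builds a private array; the results are only combined
# after each thread has finished (Thread#value joins the thread).
def fetch_in_groups(photo_ids, group_size)
  threads = photo_ids.each_slice(group_size).map do |group|
    Thread.new(group) do |ids|
      ids.map { |id| fetch_details(id) }
    end
  end
  threads.flat_map(&:value)
end

photo_ids = (1..20).to_a            # the "nth 20 pictures"
details = fetch_in_groups(photo_ids, 5)
puts "fetched #{details.length} photos"
```

Passing `group` into `Thread.new` as an argument gives each thread its own reference, which is the closest plain Ruby comes to the thread-local scope mentioned above.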
Laurent--

Sorry to be pesty about this XML thing, but I have Objective-C like this:

    NSString* xml = @"<people><person><name>steve</name><address>123 main</address></person><person><name>bob</name><address>345 First</address></person></people>";
    NSError* error = nil;
    NSXMLDocument* xmlDoc = [[NSXMLDocument alloc] initWithXMLString:xml options:NSXMLDocumentTidyXML error:&error];
    NSLog(@"Created XML Document %@", xmlDoc);
    NSLog(@"Root element is %@", [xmlDoc rootElement]);

    NSArray* nodes = [xmlDoc nodesForXPath:@"//person" error:&error];
    NSLog(@"nodes are %@", nodes);
    for(int i = 0; i < [nodes count]; i++) {
        NSXMLElement* node = [nodes objectAtIndex:i];
        NSLog(@"Node %d is %@", i, node);
        // Relative paths: "//name" would search the whole document, not just this node
        NSXMLNode *name = [[node nodesForXPath:@"./name" error:&error] objectAtIndex:0];
        NSXMLNode *address = [[node nodesForXPath:@"./address" error:&error] objectAtIndex:0];
        NSLog(@"Name: %@", [name stringValue]);
        NSLog(@"Address: %@", [address stringValue]);
    }

Which behaves exactly as one might expect. It creates an XML document on which I can apply XPath queries.

Using the nightly build from 5-December, the following MacRuby is different:

    error = Pointer.new_with_type("@")
    s = NSMutableString.new("<people><person><name>steve</name><address>123 main</address></person><person><name>bob</name><address>345 First</address></person></people>")
    xmlDoc = NSXMLDocument.alloc.initWithData(s, options:1 << 10, error:error) # <== Fails
    NSLog("Created XML Document %@", xmlDoc)
    NSLog("Root element is %@", xmlDoc.rootElement)

This produces the output:

    initializing <people><person><name>steve</name><address>123 main</address></person><person><name>bob</name><address>345 First</address></person></people>
    2009-12-06 12:39:50.318 macruby[5229:903] Created XML Document
    2009-12-06 12:39:50.321 macruby[5229:903] Root element is (null)

Note the following:

* initWithString is simply unrecognized as a method
* NSXMLDocumentTidyXML constant is not defined so I just transcribed the equivalent bitshift
* The resultant XML document is null

I'm not filing this as a bug because I think it might be a failure on my part to understand how this set of methods maps onto MacRuby.

Thanks,

Steve
Hi Laurent: Can't comment on NSXMLDocument but I have been using REXML with MacRuby without any problems. Great job! Bob Rice On Dec 6, 2009, at 3:43 PM, s.ross wrote:
Laurent--
Sorry to be pesty about this XML thing, but I have Objective-C like this:
* initWithString is simply unrecognized as a method

Please look at the documentation first. It's initWithXMLString not initWithString

* NSXMLDocumentTidyXML constant is not defined so I just transcribed the equivalent bitshift

If you do framework 'Cocoa', NSXMLDocumentTidyXML is properly defined...

* The resultant XML document is null

It works fine if you use initWithXMLString:options:error:
Yes, well I did read the documentation but thanks for the suggestion. I just mistyped in my email. I'm also specifying the Cocoa framework.

None of that seems to explain why initWithData:error returns null.

I'll give it another whirl.

Hunted and pecked from my iPhone

On Dec 6, 2009, at 2:27 PM, Vincent Isambart <vincent.isambart@gmail.com> wrote:
* initWithString is simply unrecognized as a method Please look at the documentation first. It's initWithXMLString not initWithString * NSXMLDocumentTidyXML constant is not defined so I just transcribed the equivalent bitshift If you do framework 'Cocoa', NSXMLDocumentTidyXML is properly defined...
* The resultant XML document is null It works fine if you use initWithXMLString:options:error:
On Sun, Dec 6, 2009 at 3:02 PM, S. Ross <cwdinfo@gmail.com> wrote:
Yes, well I did read the documentation but thanks for the suggestion. I just mistyped in my email. I'm also specifying the Cocoa framework.
None of that seems to explain why initWithData:error returns null.
In the original code you showed, you were passing an NSMutableString object to initWithData:options:error:. If you want to use initWithData:options:error: you need to convert your string to a data object:

    xml_data = s.dataUsingEncoding(NSUTF8StringEncoding)

Or, if the source data is coming from some other source, you might be able to create an NSData object directly.
This was a combination of what Vincent said [read the documentation... sorry about my snippy reply earlier] and what is apparently my misreading of the "How Does MacRuby Work?" Wiki page. Brian, your pointer led me down the path toward understanding the underlying mapping of types that caused the confusion in my code. I'll explain so if anyone else runs into this, they might find it useful.

In http://www.macruby.org/trac/wiki/HowDoesMacRubyWork, (How Does MacRuby Work?) the following statement is made:

------
Because NSString was not designed to handle bytestrings, MacRuby will automatically (and silently) create an NSData object when necessary, attach it to the string object, and proxy the methods to its content. This will typically be used when you read binary data from a file or a network socket.
------

I understand the difference between an NSString and an NSMutableString, but my reading of the above is that you can use an NSString (or, in my reading, NSMutableString) anywhere an NSData is required and kinda vice-versa. Evidently that's not the case. So explicitly converting to NSData using the proper encoding is the cure that makes initWithData:data options:options error:error work. The caveat in the above quote about where the silent creation of an NSData object occurs was not sufficient to set off alarm bells when I read it.

So now that I knew that I could create an NSXMLDocument from an NSData, I wanted to dig a bit further and find out why I couldn't make one from an NSString. I learned that there are at least three different initWithXMLString methods in Cocoa. Depending on the class, the selector is formed differently. So I used the Xcode help to search for initWithXMLString and found (id)initWithXMLString:(NSString *)string. Made sense, so I didn't look further. Again, Vincent, my apologies for not having done that. This method belongs to NSXMLDTDNode.

So the working code, using the information all of you have so patiently given me is:

    #!/path/to/macruby
    # shebang so I can just Cmd+R inside Textmate
    framework "Cocoa"

    # error and s as defined in my earlier snippet
    error = Pointer.new_with_type("@")
    s = NSMutableString.new("<people><person><name>steve</name><address>123 main</address></person><person><name>bob</name><address>345 First</address></person></people>")

    xmlDoc = NSXMLDocument.alloc.initWithXMLString(s, options:NSXMLDocumentTidyXML, error:error)
    NSLog("Created XML Document %@", xmlDoc)
    NSLog("Root element is %@", xmlDoc.rootElement)
    NSLog("Error is %@", error.class)

    nodes = xmlDoc.nodesForXPath("//person", error:error)
    nodes.each do |node|
      NSLog("Node is %@", node)
      # [0] is nil when the child element is missing, so fall back to 'n/a' when logging
      name = node.nodesForXPath("./name", error:error)[0]
      address = node.nodesForXPath("./address", error:error)[0]
      NSLog("Name: %@", name ? name.stringValue : 'n/a')
      NSLog("Address: %@", address ? address.stringValue : 'n/a')
    end

This does a whole bunch of logging, but illustrates what I haven't seen clearly in all my searching through Google and reading the example code: that, as Laurent says, Cocoa has a wonderful XML document class that will do what I was using REXML to do.

Thanks...

Steve

On Dec 6, 2009, at 6:18 PM, Brian Chapados wrote:
On Sun, Dec 6, 2009 at 3:02 PM, S. Ross <cwdinfo@gmail.com> wrote:
Yes, well I did read the documentation but thanks for the suggestion. I just mistyped in my email. I'm also specifying the Cocoa framework.
None of that seems to explain why initWithData:error returns null.
In the original code you showed, you were passing an NSMutableString object to initWithData:options:error:. If you want to use initWithData:options:error: you need to convert your string to a data object:
xml_data = s.dataUsingEncoding(NSUTF8StringEncoding)
Or, if the source data is coming from some other source, you might be able to create an NSData object directly.
On Dec 6, 2009, at 7:17 PM, s.ross wrote:
So the working code, using the information all of you have so patiently given me is:
You might want to create a stand-alone example from that and submit it as sample code, seeing how much work was involved in creating it. :) - Jordan
I tidied up the code and put a bunch of comments in it. You can see it on: http://github.com/sxross/MacRuby-Examples/tree/master/nsxml_example/ --steve On Dec 6, 2009, at 7:20 PM, Jordan K. Hubbard wrote:
On Dec 6, 2009, at 7:17 PM, s.ross wrote:
So the working code, using the information all of you have so patiently given me is:
You might want to create a stand-alone example from that and submit it as sample code, seeing how much work was involved in creating it. :)
- Jordan
participants (7)
- Brian Chapados
- Jordan K. Hubbard
- Laurent Sansonetti
- Robert Rice
- S. Ross
- s.ross
- Vincent Isambart