[MacRuby-devel] Patch for multi-byte characters

Benjamin Stiglitz ben at tanjero.com
Wed Mar 12 12:03:41 PDT 2008


> I made a testbed patch to deal with multi-byte characters.

In imp_rb_string_characterAtIndex, you can just return the result of
-characterAtIndex: from the initialized string instead of jumping
through the data step. The value returned should be in the native
encoding.

@@ -2444,26 +2446,41 @@
static UniChar
imp_rb_string_characterAtIndex(void *rcv, SEL sel, NSUInteger idx)
{
-    if (idx >= RARRAY_LEN(rcv))
+    VALUE rstr;
+    NSString* ocstr;
+    NSData* data;
+    int length = NUM2INT(rb_str_length((VALUE)rcv));
+    UniChar c;
+
+    if (idx >= length)
	[NSException raise:@"NSRangeException"
-	    format:@"index (%d) beyond bounds (%d)", idx, RARRAY_LEN(rcv)];
-    /* FIXME this is not quite true for multibyte strings */
-    return (UniChar)RSTRING_PTR(rcv)[idx];
+	    format:@"index (%d) beyond bounds (%d)", idx, length];
+
+    rstr = rb_str_substr((VALUE)rcv, idx, 1);
+    ocstr = [NSString stringWithCString:RSTRING_PTR(rstr) encoding:NSUTF8StringEncoding];
+    return [ocstr characterAtIndex:0];
}
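
To illustrate the FIXME that the patch removes (indexing RSTRING_PTR by
byte is wrong for multi-byte strings), here is a minimal plain-C sketch.
It assumes the string is well-formed UTF-8, and utf8_char_at is a
hypothetical helper written for this example, not a MacRuby or Ruby
function. It walks the bytes and returns the code point at a given
character index, which is roughly what the substring-plus-NSString route
in the patch does for you.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MacRuby): return the Unicode code point
 * at character index idx of a NUL-terminated UTF-8 string, or -1 if idx is
 * out of range or the bytes are not well-formed UTF-8. */
static long utf8_char_at(const char *s, size_t idx)
{
    const unsigned char *p;

    assert(s != NULL);
    p = (const unsigned char *)s;

    while (*p) {
        size_t len;
        long cp;

        /* The lead byte tells us how many bytes this character occupies. */
        if (*p < 0x80)                { len = 1; cp = *p; }
        else if ((*p & 0xE0) == 0xC0) { len = 2; cp = *p & 0x1F; }
        else if ((*p & 0xF0) == 0xE0) { len = 3; cp = *p & 0x0F; }
        else if ((*p & 0xF8) == 0xF0) { len = 4; cp = *p & 0x07; }
        else return -1;               /* stray continuation or invalid byte */

        /* Fold in the continuation bytes (each must look like 10xxxxxx).
         * A truncated sequence hits the NUL terminator and is rejected. */
        for (size_t i = 1; i < len; i++) {
            if ((p[i] & 0xC0) != 0x80)
                return -1;
            cp = (cp << 6) | (p[i] & 0x3F);
        }

        if (idx == 0)
            return cp;
        idx--;
        p += len;
    }
    return -1;  /* idx is past the last character */
}
```

For the string "h\xC3\xA9llo" ("héllo" in UTF-8), byte 1 is 0xC3, which
is only half of the second character, while utf8_char_at(s, 1) gives
0xE9 (U+00E9). Note that -characterAtIndex: returns UTF-16 code units,
so characters outside the Basic Multilingual Plane come back as
surrogate pairs rather than as a single code point like this sketch
returns.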


-Ben

