15 May 2011, 9:50 a.m.
Hi,

I just wrote a simple text-processing script and ran into a problem with String#sub/gsub.

Original text: UTF-8 encoded, ASCII-only text
Replacement text: UTF-8 encoded text with ASCII and non-ASCII characters (including Japanese characters)
Resulting text: all the non-ASCII characters came out as garbage

When I instead split the original text at the strings to be replaced and inserted the replacement text at those places, the resulting String object was fine; all the characters were preserved as valid UTF-8.

I checked the tickets but couldn't find anything like this. Is this a known issue?

Best,
Yasu
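For anyone trying to reproduce this, here is a minimal sketch of the two approaches described above. The sample strings and the `describe` helper are hypothetical, since the actual data from the script is not shown here. Printing each string's encoding tag and validity may help narrow down where the garbage comes from; it is not meant as a diagnosis.

```ruby
# encoding: utf-8

# Hypothetical sample data (assumption: the real strings are not in the post).
original    = "Hello NAME, welcome to PLACE."   # ASCII-only text
replacement = "日本語 text"                      # ASCII plus Japanese characters

# Approach 1: String#gsub, the call that reportedly produced garbage.
via_gsub = original.gsub("NAME", replacement)

# Approach 2: split at the target string and insert the replacement.
# The -1 limit keeps trailing empty fields, so a match at the very end
# of the string is not silently dropped.
via_split = original.split("NAME", -1).join(replacement)

# Hypothetical helper: show the string, its encoding tag, and whether
# its bytes are valid in that encoding.
def describe(label, str)
  puts "#{label}: #{str.inspect} (#{str.encoding}, valid=#{str.valid_encoding?})"
end

describe("original   ", original)
describe("replacement", replacement)
describe("via gsub   ", via_gsub)
describe("via split  ", via_split)
```

If the two results differ, comparing the `encoding` of each input string is probably the first thing to check, since a string read from a file or socket can carry a different encoding tag than a literal in the source.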