Problem with latest MacRuby stable and Unicode regexp
Hey all,

This is my first foray into MacRuby dev so go easy :) I'm trying to make a fairly simple web editor with Haml and Sass parsing behind the scenes, and including Sass into my project throws the following error on build:

    regexp `[\u{80}-\u{D7FF}\u{E000}-\u{FFFD}\u{10000}-\u{10FFFF}]' compilation error: U_REGEX_BAD_ESCAPE_SEQUENCE (RegexpError)

The offending line is here: https://github.com/nex3/sass/blob/master/lib/sass/scss/rx.rb#L55

Any ideas what I can do apart from forking Sass and opting for the 1.8 compatible regexp?

Thanks,
Glenn
On 2011-03-27, at 13:05, Glenn Gillen wrote:
regexp `[\u{80}-\u{D7FF}\u{E000}-\u{FFFD}\u{10000}-\u{10FFFF}]' compilation error: U_REGEX_BAD_ESCAPE_SEQUENCE (RegexpError)
Any ideas what I can do apart from forking Sass and opting for the 1.8 compatible regexp?
MacRuby ditched Oniguruma in favour of ICU regexps. They're mostly compatible, but not 100%.

Here's the ICU equivalent for that pattern:

    /[\u0080-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]/

It uses the \uhhhh and \Uhhhhhhhh notation. It seems the \u{h…} notation is not supported by ICU. It also seems that Oniguruma does not support the ICU style, so neither spelling works in both engines.

If you want to patch Sass in a compatible way, I'd look into just pre-expanding the escapes into actual characters before building the regexp. This seems to work fine in both 1.9 and MacRuby:

    s = "\u{80}-\u{D7FF}\u{E000}-\u{FFFD}\u{10000}-\u{10FFFF}"
    r = /[#{s}]/

Notice I used double quotes for the string, so what you get is a string with actual Unicode chars in it, not escape sequences. The r regexp built using that string seems to work correctly in both rubies. At least it evaluated without error.
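To go one step past "it evaluated without error", here is a minimal sketch of the same pre-expansion idea with a couple of match checks. It assumes Ruby 1.9 or later and has not been run on MacRuby. The names NON_ASCII_RANGES, NON_ASCII and non_ascii? are made up for illustration and are not taken from Sass.

```ruby
# Hypothetical names, for illustration only (not Sass's own constants).

# Double-quoted string: Ruby expands the \u{...} escapes itself, so the
# string holds the real characters with "-" between each range's endpoints.
NON_ASCII_RANGES = "\u{80}-\u{D7FF}\u{E000}-\u{FFFD}\u{10000}-\u{10FFFF}"

# Interpolating the string means the regexp engine only ever sees literal
# characters inside the character class, never a \u{...} escape.
NON_ASCII = /[#{NON_ASCII_RANGES}]/

# Returns true if the text contains at least one character in those ranges.
def non_ascii?(text)
  !!(text =~ NON_ASCII)
end

non_ascii?("abc")         # plain ASCII, no match expected
non_ascii?("caf\u00E9")   # U+00E9 falls in the first range
non_ascii?("\u{1F600}")   # U+1F600 falls in the last range
```

Checking a few inputs like this on each interpreter is a quick way to confirm the ranges behave the same, rather than only that the regexp compiles.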
participants (2)
- Caio Chassot
- Glenn Gillen