Where can I find the documentation on the modifiers for gsub? \a \b \c \1 \2 \3 %a %b %c $1 $2 %3 etc.?
Specifically, I'm looking at this code...
If you use a block with sub/gsub, you can access the groups like this:
>> rx = /(ab(cd)ef)/
>> s = "-abcdef-abcdef"
>> s.gsub(rx) { $2 }
=> "-cd-cd"
gsub is also a string substitution function in the Lua language (string.gsub).
In Lua patterns, %u represents the upper-case character class, i.e. it matches any upper-case letter. Similarly, %l matches any lower-case letter.
Lua Pattern Character Classes
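For comparison, here is a rough Ruby sketch of what Lua's %u and %l would do, assuming the POSIX bracket classes [[:upper:]] and [[:lower:]] are an acceptable stand-in. The sample string and variable names are made up for illustration.

```ruby
s = "Hello World"

# [[:upper:]] plays the role of Lua's %u: it matches upper-case letters.
without_upper = s.gsub(/[[:upper:]]/, '')

# [[:lower:]] plays the role of Lua's %l: it matches lower-case letters.
shouted = s.gsub(/[[:lower:]]/) { |c| c.upcase }

puts without_upper  # ello orld
puts shouted        # HELLO WORLD
```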
First off, %u is nothing special in Ruby regex:
mixonic@pandora ~ $ irb
irb(main):001:0> '%u'.gsub(/%u/,'heyhey')
=> "heyhey"
The definitive documentation for Ruby 1.8 regex is in the Ruby Doc Bundle:
Strings delimited by slashes are regular expressions. The characters right after the closing slash denote the options to the regular expression. Option i means that the regular expression is case insensitive. Option o means that the regular expression does expression substitution only once, the first time it is evaluated. Option x means extended regular expression, which means whitespace and comments are allowed in the expression. Option p denotes POSIX mode, in which newlines are treated as normal characters (matched by dots).
%r/STRING/ is another form of the regular expression.
^          beginning of a line or string
$          end of a line or string
.          any character except newline
\w         word character [0-9A-Za-z_]
\W         non-word character
\s         whitespace character [ \t\n\r\f]
\S         non-whitespace character
\d         digit, same as [0-9]
\D         non-digit
\A         beginning of a string
\Z         end of a string, or before newline at the end
\z         end of a string
\b         word boundary (outside [] only)
\B         non-word boundary
\b         backspace (0x08) (inside [] only)
[ ]        any single character of set
*          0 or more previous regular expression
*?         0 or more previous regular expression (non greedy)
+          1 or more previous regular expression
+?         1 or more previous regular expression (non greedy)
{m,n}      at least m but most n previous regular expression
{m,n}?     at least m but most n previous regular expression (non greedy)
?          0 or 1 previous regular expression
|          alternation
( )        grouping regular expressions
(?# )      comment
(?: )      grouping without backreferences
(?= )      zero-width positive look-ahead assertion
(?! )      zero-width negative look-ahead assertion
(?ix-ix)   turns on (or off) `i' and `x' options within regular expression. These modifiers are localized inside an enclosing group (if any).
(?ix-ix: ) turns on (or off) `i' and `x' options within this non-capturing group.
Backslash notation and expression substitution available in regular expressions.
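To make a few of the entries above concrete, here is a small sketch. The sample strings and variable names are invented for illustration, and it was written against a current Ruby rather than 1.8, so treat it as a demonstration of the syntax and not as a reference.

```ruby
# x and i options: whitespace and comments inside the pattern are ignored,
# and letters match regardless of case.
price_rx = /
  (?:usd|eur)   # currency code, grouped without creating a backreference
  \s*
  (\d+)         # whole amount, captured as group 1
  (?=\.00)      # look-ahead: must be followed by .00, which is not consumed
/ix

text = "USD 15.00, eur 7.50, Usd 3.00"
amounts = text.scan(price_rx).flatten

# %r{...} is the alternative literal form; handy when the pattern contains slashes.
html = "<b>one</b><b>two</b>"
greedy = html[%r{<b>.*</b>}]    # * takes as much as it can
lazy   = html[%r{<b>.*?</b>}]   # *? stops at the first possible end

# ^ and $ work per line, while \A, \Z and \z refer to the whole string.
lines = "first\nsecond\n"
per_line        = lines.scan(/^\w+$/)
at_start        = lines =~ /\Asecond/
before_final_nl = lines =~ /second\Z/
at_very_end     = lines =~ /second\z/

p amounts          # ["15", "3"]
p greedy           # "<b>one</b><b>two</b>"
p lazy             # "<b>one</b>"
p per_line         # ["first", "second"]
p at_start         # nil
p before_final_nl  # 6
p at_very_end      # nil
```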
Good luck!
Zenspider's Quickref contains a section explaining which escape sequences can be used in regexen and one listing the pseudo variables that get set by a regexp match. In the second argument to gsub you simply write the name of the variable with a backslash instead of a $ and it will be replaced with the value of that variable after applying the regexp. If you use a double quoted string, you need to use two backslashes.
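A minimal sketch of the two-argument form described above, using a made-up input string. It shows the single-quoted and double-quoted spellings side by side, plus \0 for the whole match.

```ruby
s = "John Smith"
rx = /(\w+) (\w+)/

# Single-quoted replacement: one backslash is enough.
swapped_single = s.gsub(rx, '\2, \1')

# Double-quoted replacement: the backslash itself has to be escaped.
swapped_double = s.gsub(rx, "\\2, \\1")

# \0 stands for the whole match, like $& after a match.
wrapped = s.gsub(/\w+/, '<\0>')

puts swapped_single  # Smith, John
puts swapped_double  # Smith, John
puts wrapped         # <John> <Smith>
```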
When using the block-form of gsub you can simply use the variables directly. If you return a string containing e.g. \1 from the block, that will not be replaced with $1. That only happens when using the two-argument form.
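The difference between the block form and the two-argument form can be seen in a short sketch. It reuses the string and pattern from the earlier answer in this thread; the variable names are only for illustration.

```ruby
s = "-abcdef-abcdef"
rx = /(ab(cd)ef)/

# Block form: $1, $2, ... are set for each match and can be used directly.
upcased = s.gsub(rx) { $2.upcase }

# A backslash reference returned from the block stays as literal text.
literal = s.gsub(rx) { '\2' }

# The same string passed as the second argument is expanded to group 2.
expanded = s.gsub(rx, '\2')

puts upcased   # -CD-CD
puts literal   # -\2-\2
puts expanded  # -cd-cd
```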
For Ruby 1.9's Oniguruma, there is good documentation of the regular expression syntax here.