recursive-regex | 易学教程

Java regex library with recursion support

Question: I am looking for a Java regexp library with support for recursion, such as "<a+(?0)>". The JDK does not support it, and neither does ORO. Does anyone know of one? Thanks, Ondra

Edit: See http://www.php.net/manual/en/regexp.reference.recursive.php
I need it for this expression: (?:mUi)^/--++ *+(.*)(?: *(?<= |^)\\.((?:\$[^)\\n]+\$|\\[[^\\]\\n]+\\]|\\{[^}\\n]+\\}|<>|>|=|<){1,4}?))?$((?:\\n.*+)*)(?:\\n(?0)|\\n\\\\--.*$|\\z)

Answer 1: The Stevesoft Pat library has some recursive-matching capability (documented
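When an engine has no recursion construct, one common workaround is to expand the pattern to a fixed nesting depth. The sketch below is in Python, whose standard `re` module also lacks `(?0)`. It is not how the Stevesoft Pat library works. The helper name `bounded_nesting` is hypothetical, and the recursion is made optional here (an assumption, so that the nesting can bottom out).

```python
import re

def bounded_nesting(depth):
    """Compile '<a+ ... >' allowing at most `depth` nested levels inside the outer pair."""
    pattern = r"<a+>"  # innermost level: no further nesting
    for _ in range(depth):
        # Wrap the pattern built so far in one more optional level.
        pattern = rf"<a+(?:{pattern})?>"
    return re.compile(pattern)

nested_twice = "<a<aa<a>>>"
print(bool(bounded_nesting(2).fullmatch(nested_twice)))  # within a bound of 2
print(bool(bounded_nesting(1).fullmatch(nested_twice)))  # deeper than a bound of 1
```

The trade-off is that the expanded pattern grows with the depth bound and rejects input nested more deeply than the bound, which true recursion would accept.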

How to fix a BBcode regular expression

Question: I have a regular expression that grabs BBCode tags. It works well except for a minor glitch. Here is the current expression:

\[([^=\[\]]+)[=\x22']*([^ \[\]]*)['\x22]*\](.+)\[/\1\]

Here is some text it matches successfully, along with the groups it builds:

[url=http://www.google.com]Go to google![/url]
1: url
2: http://www.google.com
3: Go to google!

[img]http://www.somesite.com/someimage.jpg[/img]
1: img
2: NULL
3: http://www.somesite.com/someimage.jpg

[quote][quote]first nested quote[/quote
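The two successful matches listed in the question can be reproduced with a short Python sketch. The pattern is copied unchanged from the question. The helper name `bbcode_groups` is hypothetical, and Python stands in here for whatever engine the asker used.

```python
import re

# Pattern taken verbatim from the question; \x22 is the double-quote character.
BBCODE = re.compile(r"\[([^=\[\]]+)[=\x22']*([^ \[\]]*)['\x22]*\](.+)\[/\1\]")

def bbcode_groups(text):
    """Return (tag, attribute, body) for the first BBCode tag found, or None."""
    match = BBCODE.search(text)
    return match.groups() if match else None

print(bbcode_groups("[url=http://www.google.com]Go to google![/url]"))
print(bbcode_groups("[img]http://www.somesite.com/someimage.jpg[/img]"))
```

Where the question shows NULL for the second group, Python's `re` reports an empty string, because the group takes part in the match but consumes nothing.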

Ruby regex for matching the simplest Ruby regexes

Question: I want to match regexes (at least the basic ones, not all their possible kinds... for now...) in the text of a Ruby script. It's something like a... \/\^? oh my god... \$?\/[eimnosux]* Maybe I need a recursive regex here.

Answer 1: As I commented above, you'll need to parse Ruby to differentiate division slashes from regex delimiters. But for the simplest, SIMPLEST case, without worrying about this, how about: regex_match = %r{/(?:[^/\\]|\\.)+/[mgixo]*} That is "A forward slash, followed by one or more things that either aren't a forward slash or a backslash, or are a backslash followed by something else,
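The answer's pattern ports directly to Python, which makes both its behavior and its stated limitation easy to try out. This is only a sketch: the pattern body is the one from the answer, while the helper name `find_regex_literals` and the sample strings are illustrative assumptions.

```python
import re

# Same pattern body as the Ruby %r{...} literal in the answer.
REGEX_LITERAL = re.compile(r"/(?:[^/\\]|\\.)+/[mgixo]*")

def find_regex_literals(source):
    """Return every substring that looks like a /.../flags regex literal."""
    return REGEX_LITERAL.findall(source)

# An escaped slash inside the literal is consumed by the \\. branch.
print(find_regex_literals(r"x =~ /ab\/c/i"))
# Division slashes are indistinguishable from delimiters at this level.
print(find_regex_literals("total = a / b / c"))
```

The second call reports a match between the two division operators, which is the ambiguity the answer warns can only be resolved by actually parsing Ruby.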

Regex to parse functions with arbitrary depth

Question: I'm parsing a simple language (Excel formulas) for the functions contained within. A function name must start with a letter, followed by any number of letters or digits, and end with an open paren (no spaces in between), for example MyFunc(. The function can contain any arguments, including other functions, and must end with a close paren ). Of course, math within parens is allowed, as in =MyFunc((1+1)), and (1+1) shouldn't be detected as a function because it fails the function rule I've just
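One way to handle arbitrary depth without a recursive regex is to let a regex find each function name and let a depth counter find the matching close paren. The sketch below follows the naming rule stated in the question. The helper name `find_calls` is hypothetical, and parentheses inside string arguments are not handled (an assumption made to keep the example short).

```python
import re

# A letter, then letters or digits, immediately followed by "(" (not consumed).
NAME_BEFORE_PAREN = re.compile(r"(?<![A-Za-z0-9])[A-Za-z][A-Za-z0-9]*(?=\()")

def find_calls(formula):
    """Return (function_name, argument_text) pairs, outer calls first."""
    calls = []
    for match in NAME_BEFORE_PAREN.finditer(formula):
        depth = 0
        for i in range(match.end(), len(formula)):
            if formula[i] == "(":
                depth += 1
            elif formula[i] == ")":
                depth -= 1
                if depth == 0:  # this paren closes the call's own "("
                    calls.append((match.group(), formula[match.end() + 1:i]))
                    break
    return calls

print(find_calls("=MyFunc((1+1))"))
print(find_calls("=Outer(Inner(1),(2+3))"))
```

A bare group such as (1+1) never produces a pair, because no name precedes its opening paren.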

Regex for nested XML attributes

Let's say I have the following string: "<aa v={<dd>sop</dd>} z={ <bb y={ <cc x={st}>ABC</cc> }></bb> }></aa>" How can I write a general-purpose regex (tag names change, attribute names change) to match the content inside {}, either <dd>sop</dd> or <bb y={ <cc x={st}>ABC</cc> }></bb>? The regex I wrote, "(\s*\w*=\s*\{)\s*(<.*>)\s*(\})", matches "<dd>sop</dd>} z={ <bb y={ <cc x={st}>ABC</cc> }></bb>", which is not correct.

In generic regex there's no good way to handle nesting. Hence all the whining when a question like this comes up: never use regex to parse XML/HTML. In some simple cases it might be
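Since a plain regex cannot track nesting, the content of each outermost {} pair can be collected with a depth counter instead. This is a minimal sketch rather than a parser: the helper name `top_level_braces` is hypothetical, and it assumes braces are never escaped or quoted.

```python
def top_level_braces(text):
    """Return the stripped content of every outermost {...} group."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i + 1  # content begins after the opening brace
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i].strip())
    return groups

sample = "<aa v={<dd>sop</dd>} z={ <bb y={ <cc x={st}>ABC</cc> }></bb> }></aa>"
for group in top_level_braces(sample):
    print(group)
```

Applied to the string from the question, this yields the two attribute values the asker wanted, with the inner braces of the second one left intact.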

Why will this recursive regex only match when a character repeats 2^n - 1 times?

After reading polygenelubricants's series of articles on advanced regular expression techniques (particularly "How does this Java regex detect palindromes?"), I decided to attempt to create my own PCRE regex to parse a palindrome, using recursion (in PHP). What I came up with was:

^(([a-z])(?1)\2|[a-z]?)$

My understanding of this expression is that it should either match zero or one characters (every string of fewer than 2 characters is implicitly a palindrome, and this also accounts for palindromes with odd lengths in the recursion), or two of the same character separated by a recursion of the
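The behavior in the title can be reproduced by simulating the pattern by hand. The sketch below assumes that a recursive call such as (?1) is atomic, as older PCRE documentation describes: once the inner call has matched, the engine does not backtrack into it. The function names are hypothetical, and the input is assumed to contain only characters from [a-z].

```python
def atomic_group_end(s, pos=0):
    """End index of the single match that group 1 commits to when tried at pos."""
    if pos < len(s):
        # Alternative 1: ([a-z]) (?1) \2
        inner_end = atomic_group_end(s, pos + 1)  # atomic: only one result is kept
        if inner_end < len(s) and s[inner_end] == s[pos]:
            return inner_end + 1
        # Alternative 2: [a-z]? consuming one character
        return pos + 1
    # At the end of the input, [a-z]? matches the empty string.
    return pos

def pattern_matches(s):
    """Emulate ^(([a-z])(?1)\\2|[a-z]?)$ under the atomic-recursion assumption."""
    return atomic_group_end(s) == len(s)

print([n for n in range(20) if pattern_matches("a" * n)])
```

Under that assumption the simulation accepts a run of one repeated letter only at lengths 0, 1, 3, 7 and 15 within the tested range, because each inner call greedily commits to a length that the outer level cannot renegotiate.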

Remove all empty HTML tags?

I am imagining a function which I figure would use Regex, and it would be recursive for instances like to remove all empty HTML tags within a string. This would have to account for whitespace too, if possible. There would be no crazy instances where the < character is used in an attribute value. I am pretty terrible at regex, but I imagine this is possible. How can you do it? Here is the method I have so far:

Public Shared Function stripEmptyHtmlTags(ByVal html As String) As String
Dim newHtml As String = Regex.Replace(html, "/(<.+?>\s*</.+?>)/Usi", "")
If html <>
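The "recursive" part of the idea can be expressed as a loop that repeats the substitution until nothing changes, so that a tag which becomes empty after its children are removed is caught on the next pass. This Python sketch is not a translation of the VB method above. The pattern and the name `strip_empty_tags` are assumptions, and it relies on the question's premise that < never appears inside an attribute value.

```python
import re

# An opening tag, optional whitespace, then the closing tag of the same name.
EMPTY_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>\s*</\1>", re.IGNORECASE)

def strip_empty_tags(html):
    """Remove empty elements repeatedly until the string stops changing."""
    while True:
        stripped = EMPTY_TAG.sub("", html)
        if stripped == html:
            return html
        html = stripped

print(strip_empty_tags("<div><p> </p><span></span></div><b>x</b>"))
```

Requiring the closing tag name to equal the opening one through the backreference keeps the pattern from pairing an opening tag with an unrelated closing tag.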

Can I use Perl regular expressions to match balanced text?

I would like to match text enclosed in brackets, etc., in Perl. How can I do that?

This is a question from the official perlfaq. We're importing the perlfaq to Stack Overflow. This is the official FAQ answer minus any subsequent edits.

Your first try should probably be the Text::Balanced module, which has been in the Perl standard library since Perl 5.8. It has a variety of functions to deal with tricky text. The Regexp::Common module can also help by providing canned patterns you can use. As of Perl 5.10, you can match balanced text with regular expressions using recursive patterns. Before Perl 5.10
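For comparison, the same task can be done without any regex by keeping a stack of expected closing delimiters. The sketch below is in Python and does not reproduce the Text::Balanced or Regexp::Common interfaces. The function name `first_balanced` and the choice of three bracket types are assumptions.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())

def first_balanced(text):
    """Return the first balanced bracketed substring, or None if there is none."""
    for start, opener in enumerate(text):
        if opener not in PAIRS:
            continue
        expected = []  # stack of closing delimiters still owed
        for i in range(start, len(text)):
            ch = text[i]
            if ch in PAIRS:
                expected.append(PAIRS[ch])
            elif ch in CLOSERS:
                if not expected or expected.pop() != ch:
                    break  # mismatched closer: try the next opening bracket
                if not expected:
                    return text[start:i + 1]
    return None

print(first_balanced("f(a[1], {b: (c)}) + g(x)"))
```

Unlike a single depth counter, the stack also rejects crossed delimiters such as "(]".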