lookbehind

Variable-Width Lookbehind Issue in Python

隐身守侯 posted on 2019-11-30 18:55:47
I have the following scenarios:

1) car on the right shoulder
2) car on the left shoulder
3) car on the shoulder

I want to match "shoulder" only when left|right is not present, so only 3) should return "shoulder".

re.compile(r'(?<!right|right\s*)shoulder')
sre_constants.error: look-behind requires fixed-width pattern

It seems I can't use \s* and "|" inside the look-behind. How can I solve this? Thanks in advance!

zx81: regex module: variable-width lookbehind

In addition to the answer by HamZa, for a regex of any complexity in Python I recommend the outstanding regex module by Matthew Barnett. It supports infinite
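A minimal sketch of two workarounds that stay within Python's standard re module, which only accepts fixed-width look-behinds. The names FIXED, OPTIONAL and has_bare_shoulder are illustrative assumptions, and the first pattern assumes exactly one whitespace character before "shoulder".

```python
import re

# Workaround 1: one fixed-width look-behind per alternative.
# Assumption: exactly one whitespace character separates the words.
FIXED = re.compile(r'(?<!\bleft\s)(?<!\bright\s)shoulder')

# Workaround 2: match the optional qualifier and inspect the group,
# which tolerates any amount of whitespace.
OPTIONAL = re.compile(r'(?:\b(left|right)\s+)?shoulder')


def has_bare_shoulder(text):
    """Return True if 'shoulder' occurs without left/right before it."""
    return any(m.group(1) is None for m in OPTIONAL.finditer(text))


for sample in ("car on the right shoulder",
               "car on the left shoulder",
               "car on the shoulder"):
    print(sample, "->", bool(FIXED.search(sample)), has_bare_shoulder(sample))
```

The second form trades the look-behind for a post-match check, which is often easier to read when the excluded prefix has variable width.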

RegEx: Look-behind to avoid odd number of consecutive backslashes

戏子无情 posted on 2019-11-30 07:27:29
Question: I have user input where some tags are allowed inside square brackets. I've already written the regex pattern to find and validate what's inside the brackets. In the user input field, an opening bracket ([) can be escaped with a backslash, and a backslash can be escaped with another backslash (\). I need a look-behind sub-pattern to reject an odd number of consecutive backslashes before the opening bracket. At the moment I have to work with something like this:

(?<!\\)(?:\\\\)*\[(?<inside brackets>.*?)]

It works

Issue with a Look-behind Regular expression (Ruby)

老子叫甜甜 posted on 2019-11-29 13:27:34
I wrote this regex to match all href and src links in an HTML page (I know I should be using a parser; this is just an experiment):

/((href|src)\=\").*?\"/ # Without look-behind

It works fine, but when I try to rewrite the first portion of the expression as a look-behind:

/(?<=(href|src)\=\").*?\"/ # With look-behind

it throws an error stating 'invalid look-behind pattern'. Any ideas what's going wrong with the look-behind?

sawa: Lookbehind has restrictions:

(?<=subexp) look-behind
(?<!subexp) negative look-behind

Subexp of look-behind must be fixed character length. But different
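The same restriction shows up in Python's re module, where alternatives of different widths (href is four characters, src is three) are rejected inside a look-behind. A minimal Python sketch of the usual way around it, using a capture group instead; the name link_targets and the sample HTML are illustrative assumptions.

```python
import re

# Match the attribute name outside any look-behind and capture only the value.
LINK = re.compile(r'(?:href|src)="(.*?)"')


def link_targets(html):
    """Return the quoted values of all href and src attributes."""
    return LINK.findall(html)


print(link_targets('<a href="a.html"><img src="b.png"></a>'))
```

Because the attribute name is consumed rather than asserted, its width no longer matters.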

How to non-greedy multiple lookbehind matches

孤人 posted on 2019-11-29 05:22:00
Source: <prefix><content1><suffix1><prefix><content2><suffix2>
Engine: PCRE
RegEx1: (?<=<prefix>)(.*)(?=<suffix1>)
RegEx2: (?<=<prefix>)(.*)(?=<suffix2>)
Result1: <content1>
Result2: <content1><suffix1><prefix><content2>

The desired result for RegEx2 is just <content2>, but it is obviously greedy. How do I make RegEx2 non-greedy and use only the last matching lookbehind?

[I hope I have translated this correctly from the NoteTab syntax. I don't do much RegEx coding. The <prefix>, <content> & <suffix> terms are just meant to represent arbitrary strings. Only the "<" in the "?<=" lookbehind
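A hedged Python sketch of one way to get <content2>. Making .* lazy does not help here, because the match still starts after the first <prefix>; the sketch instead forbids the match from running across another <prefix>. The placeholder strings are used literally, and the names GREEDY, LAZY and TEMPERED are illustrative assumptions.

```python
import re

source = '<prefix><content1><suffix1><prefix><content2><suffix2>'

# Greedy and lazy quantifiers both start right after the FIRST <prefix>.
GREEDY = re.compile(r'(?<=<prefix>)(.*)(?=<suffix2>)')
LAZY = re.compile(r'(?<=<prefix>)(.*?)(?=<suffix2>)')

# Each consumed character must not begin another <prefix>, so only the
# <prefix> closest to <suffix2> can anchor the match.
TEMPERED = re.compile(r'(?<=<prefix>)((?:(?!<prefix>).)*)(?=<suffix2>)')

print(GREEDY.search(source).group(1))
print(LAZY.search(source).group(1))
print(TEMPERED.search(source).group(1))
```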

RegEx: Look-behind to avoid odd number of consecutive backslashes

空扰寡人 posted on 2019-11-29 03:58:20
I have user input where some tags are allowed inside square brackets. I've already written the regex pattern to find and validate what's inside the brackets. In the user input field, an opening bracket ([) can be escaped with a backslash, and a backslash can be escaped with another backslash (\). I need a look-behind sub-pattern to reject an odd number of consecutive backslashes before the opening bracket. At the moment I have to work with something like this:

(?<!\\)(?:\\\\)*\[(?<inside brackets>.*?)]

It works fine, but the problem is that this code still matches possible pairs of consecutive backslashes in front of
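A minimal Python sketch of how the leading backslash pairs can be ignored: the pattern is kept as posted, but only the named group is read back, so the consumed pairs never reach the caller. The group name inside, the function unescaped_tags and the sample input are illustrative assumptions (Python spells named groups (?P<name>...) and does not allow spaces in the name).

```python
import re

# An even run of backslashes (possibly empty) may precede the bracket;
# the look-behind ensures the run is not preceded by one more backslash.
TAG = re.compile(r'(?<!\\)(?:\\\\)*\[(?P<inside>[^\]]*)\]')


def unescaped_tags(text):
    """Return the contents of brackets that are not backslash-escaped."""
    return [m.group('inside') for m in TAG.finditer(text)]


# [b] is plain, \[c] is escaped, \\[d] is an escaped backslash plus a tag.
print(unescaped_tags(r'a [b] \[c] \\[d]'))
```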

Regular Expression to match only one angle bracket

你。 posted on 2019-11-28 10:59:41
Question: I'm looking for a regular expression that matches the '>' in a > b > b> ... but not two or more angle brackets, i.e. it should not match a>>b >> b>> ... I was sure I could do that with lookaheads or lookbehinds, but for some reason neither \>(?!\>) nor (?<!\>)\> works. Thanks!

Answer 1: Perl syntax:

/(?<!>)>(?!>)/

Without using lookahead or lookbehind:

/(?:^|[^>])>(?:[^>]|$)/

Answer 2: perreal's first regex is correct. However, the second regex given in that answer subtly fails in one condition. Since it
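A short Python sketch of the combined look-behind and look-ahead from Answer 1, which works unchanged with the standard re module. The name single_gt_positions and the sample strings are illustrative assumptions.

```python
import re

# A '>' that is neither preceded nor followed by another '>'.
SINGLE_GT = re.compile(r'(?<!>)>(?!>)')


def single_gt_positions(text):
    """Return the indexes of every lone '>' in text."""
    return [m.start() for m in SINGLE_GT.finditer(text)]


print(single_gt_positions('a>b >> c> d'))
print(single_gt_positions('a>>b'))
```

Using only one of the two assertions still matches one end of a ">>" pair, which is why neither pattern in the question works on its own.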

Javascript/RegExp: Lookbehind Assertion is causing a “Invalid group” error

有些话、适合烂在心里 posted on 2019-11-28 09:23:29
I'm using a simple lookbehind assertion to get a segment of the URL (example below), but instead of the match I get the following error:

Uncaught SyntaxError: Invalid regular expression: /(?<=\#\!\/)([^\/]+)/: Invalid group

Here is the script I'm running:

var url = window.location.toString();
// url == http://my.domain.com/index.php/#!/write-stuff/something-else
// lookbehind to only match the segment after the hash-bang.
var regex = /(?<=\#\!\/)([^\/]+)/i;
console.log('test this url: ', url, 'we found this match: ', url.match( regex ) );

The result should be write-stuff. Can anyone shed
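A hedged sketch of the capture-group alternative, written in Python for consistency with the other examples on this page; the pattern itself uses no look-behind, so it should also be acceptable to regex engines that lack that feature. The variable names are illustrative assumptions, and the URL is the one from the question.

```python
import re

url = 'http://my.domain.com/index.php/#!/write-stuff/something-else'

# Consume the hash-bang prefix and capture only the segment after it.
match = re.search(r'#!/([^/]+)', url)
segment = match.group(1) if match else None

print(segment)
```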

Java regex error - Look-behind group does not have an obvious maximum length

穿精又带淫゛_ posted on 2019-11-27 22:11:59
I get this error:

java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 22
([a-z])(?!.*\1)(?<!\1.+)([a-z])(?!.*\2)(?<!\2.+)(.)(\3)(.)(\5)
                      ^

I'm trying to match COFFEE, but not BOBBEE. I'm using Java 1.6.

Java doesn't support look-behind of unbounded length (here, the .+ inside the look-behind). In this case, it seems you can simply drop it (assuming your entire input is one word):

([a-z])(?!.*\1)([a-z])(?!.*\2)(.)(\3)(.)(\5)

Neither lookbehind adds anything: the first asserts at least two characters where you only had one, and the second checks the second character is
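A quick check of the simplified pattern, sketched in Python rather than Java; the pattern text is copied from the answer and uses only syntax that both engines share. The IGNORECASE flag is an assumption added here so that the upper-case sample words can match [a-z], and the name WORD is illustrative.

```python
import re

# Simplified pattern from the answer: no look-behinds, only look-aheads
# that forbid the first two letters from appearing again later.
WORD = re.compile(r'([a-z])(?!.*\1)([a-z])(?!.*\2)(.)(\3)(.)(\5)',
                  re.IGNORECASE)

for word in ('COFFEE', 'BOBBEE'):
    print(word, bool(WORD.fullmatch(word)))
```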

Does lookbehind work in sed?

陌路散爱 posted on 2019-11-27 22:03:18
I created a test using grep, but it does not work in sed.

grep -P '(?<=foo)bar' file.txt

This works correctly by returning bar.

sed 's/(?<=foo)bar/test/g' file.txt

I was expecting footest as output, but it did not work.

GNU sed does not support lookaround assertions. You could use a more powerful language such as Perl, or possibly experiment with ssed, which supports Perl-style regular expressions.

perl -pe 's/(?<=foo)bar/test/g' file.txt

Note that most of the time you can avoid a lookbehind (or a lookahead) by using a capture group and a backreference in the replacement string:

sed 's/\
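The equivalence between the two approaches can be sketched in Python, whose re.sub supports both look-behinds and backreferences in the replacement. This illustrates the technique only and is not the sed command from the answer; the variable names and the sample line are assumptions.

```python
import re

line = 'foobar'

# Look-behind: "foo" is asserted but not consumed, so only "bar" is replaced.
with_lookbehind = re.sub(r'(?<=foo)bar', 'test', line)

# Capture group: "foo" is consumed, then restored through the \1 backreference.
with_group = re.sub(r'(foo)bar', r'\1test', line)

print(with_lookbehind, with_group)
```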

Python : Fixed Length Regex Required?

淺唱寂寞╮ posted on 2019-11-27 16:29:32
I have this regex that uses lookaheads and lookbehinds:

import re
re.compile("<!inc\((?=.*?\)!>)|(?<=<!inc\(.*?)\)!>")

I'm trying to port it from C# to Python but keep getting the error

look-behind requires fixed-width pattern

Is it possible to rewrite this in Python without losing meaning? The idea is for it to match something like

<!inc(C:\My Documents\file.jpg)!>

Update: I'm using the lookarounds to parse HTTP multipart text that I've modified:

body = r"""------abc
Content-Disposition: form-data; name="upfile"; filename="file.txt"
Content-Type: text/plain
<!inc(C:\Temp\file.txt)!>
-----
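A minimal sketch of one possible rewrite for the standard re module: rather than matching the two delimiters separately with lookarounds, match the whole <!inc(...)!> marker and capture the path. This changes what the match object covers, so it is an assumption about what the surrounding code needs; the names INC and included_paths are hypothetical.

```python
import re

# The variable-width look-behind from the question is rejected by re.
try:
    re.compile(r'(?<=<!inc\(.*?)\)!>')
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# Match the whole marker and capture what sits between the delimiters.
INC = re.compile(r'<!inc\((.*?)\)!>')


def included_paths(text):
    """Return every path wrapped in <!inc( ... )!> markers."""
    return INC.findall(text)


print(lookbehind_ok)
print(included_paths(r'Content-Type: text/plain <!inc(C:\Temp\file.txt)!>'))
```

If the delimiters themselves need to be stripped, INC.sub(r'\1', text) keeps only the captured path.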