lookahead

Using lookahead with generators

99封情书 submitted on 2019-12-17 15:33:24
Question: I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value):
    for token in scan("a(b)"):
        print token
would print
    ("literal", "a")
    ("l_paren", "(")
    ...
The next task is to parse the token stream, and for that I need to be able to look one item ahead of the current one without moving the pointer ahead as well. The fact that iterators and generators do not provide the complete sequence of items at once but each item as
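One common way to get a single item of lookahead from a generator is to wrap it in a small buffering iterator. The sketch below is only an illustration in Python 3: the `Peekable` class and its `peek` method are hypothetical names, and `scan` is a made-up stand-in for the asker's scanner, whose real code is not shown in the excerpt.

```python
class Peekable:
    """Wrap an iterator and allow looking one item ahead without consuming it."""

    _EMPTY = object()  # sentinel meaning "nothing is buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered item first, if peek() already fetched one.
        if self._buffer is not self._EMPTY:
            item, self._buffer = self._buffer, self._EMPTY
            return item
        return next(self._it)

    def peek(self, default=None):
        """Return the next item without advancing, or `default` at the end."""
        if self._buffer is self._EMPTY:
            try:
                self._buffer = next(self._it)
            except StopIteration:
                return default
        return self._buffer


def scan(text):
    # Hypothetical stand-in for the scanner described in the question.
    kinds = {"(": "l_paren", ")": "r_paren"}
    for ch in text:
        yield (kinds.get(ch, "literal"), ch)


tokens = Peekable(scan("a(b)"))
first = next(tokens)      # consumes the first token
upcoming = tokens.peek()  # looks at the second token, pointer unchanged
second = next(tokens)     # returns the same token that peek() showed
```

The wrapper keeps the generator lazy: at most one item is ever fetched ahead of the consumer.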

XML schema restriction pattern for not allowing specific string

一曲冷凌霜 submitted on 2019-12-17 10:02:37
Question: I need to write an XSD schema with a restriction on a field, to ensure that the value of the field does not contain the substring FILENAME at any location. For example, all of the following must be invalid:
    FILENAME
    ORIGINFILENAME
    FILENAMETEST
    123FILENAME456
In a regular expression language that supports negative lookahead, I could do this by writing /^((?!FILENAME).)*$ but the XSD pattern language does not support negative lookahead. How can I implement
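To make the behaviour of the negative-lookahead pattern concrete, here is a small sketch using Python's `re` module rather than XSD. It only illustrates what the pattern quoted in the question accepts and rejects; it does not solve the XSD limitation. The `valid` sample strings are made up for the example.

```python
import re

# The pattern from the question: at every position, assert that
# "FILENAME" does not start there, then consume one character.
NO_FILENAME = re.compile(r"^((?!FILENAME).)*$")

invalid = ["FILENAME", "ORIGINFILENAME", "FILENAMETEST", "123FILENAME456"]
valid = ["README", "FILE_NAME", ""]  # made-up examples without the substring

rejected = [s for s in invalid if NO_FILENAME.match(s) is None]
accepted = [s for s in valid if NO_FILENAME.match(s) is not None]
```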

Does lookaround affect which languages can be matched by regular expressions?

最后都变了- submitted on 2019-12-17 08:05:30
Question: There are some features in modern regex engines which allow you to match languages that couldn't be matched without that feature. For example, the following regex using back references matches the language of all strings that consist of a word that repeats itself: (.+)\1. This language is not regular and can't be matched by a regex that does not use back references. Does lookaround also affect which languages can be matched by a regular expression? I.e. are there any languages that can be
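The back-reference example from the question can be tried directly. This is a minimal sketch in Python's `re` module; `fullmatch` is used so that the whole string, not just a substring, has to be a repeated word. The helper name `is_repeated_word` is made up for the illustration.

```python
import re

# (.+) captures a non-empty word, \1 requires the same text again.
REPEATED_WORD = re.compile(r"(.+)\1")


def is_repeated_word(s):
    """True if s consists of some non-empty word written twice in a row."""
    return REPEATED_WORD.fullmatch(s) is not None


half = REPEATED_WORD.fullmatch("abcabc").group(1)  # the repeated word
```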

Regex to match all permutations of {1,2,3,4} without repetition

感情迁移 submitted on 2019-12-17 04:35:19
Question: I am implementing the following problem in Ruby. Here's the pattern that I want: 1234, 1324, 1432, 1423, 2341 and so on, i.e. each digit of the four-digit number should be in the range [1-4] and no digit should repeat. To illustrate with a simpler case, take a two-digit pattern, for which the solution should be: 12, 21, i.e. each digit should be either 1 or 2 and no digit should repeat. To make sure that they are non-repetitive I want to use $1 for the condition for my second digit but
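One way to express "four digits from 1 to 4, none repeated" is a negative lookahead containing a back reference. The sketch below uses Python's `re` instead of Ruby, and the pattern is one possible approach, not necessarily the answer given to the original question.

```python
import re
from itertools import permutations, product

# (?!.*(.).*\1) rejects any string in which some character occurs twice;
# [1-4]{4} then requires exactly four digits in the range 1 to 4.
PERMUTATION = re.compile(r"(?!.*(.).*\1)[1-4]{4}")


def is_permutation(s):
    """True if s is an arrangement of the digits 1, 2, 3, 4 with no repeats."""
    return PERMUTATION.fullmatch(s) is not None


# Cross-check against itertools: every true permutation is accepted ...
all_perms = ["".join(p) for p in permutations("1234")]
# ... and exactly those are accepted among all 4-digit strings over 1-4.
accepted = ["".join(c) for c in product("1234", repeat=4)
            if is_permutation("".join(c))]
```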

Regex to match all permutations of {1,2,3,4} without repetition

坚强是说给别人听的谎言 submitted on 2019-12-17 04:35:11
Question: I am implementing the following problem in Ruby. Here's the pattern that I want: 1234, 1324, 1432, 1423, 2341 and so on, i.e. each digit of the four-digit number should be in the range [1-4] and no digit should repeat. To illustrate with a simpler case, take a two-digit pattern, for which the solution should be: 12, 21, i.e. each digit should be either 1 or 2 and no digit should repeat. To make sure that they are non-repetitive I want to use $1 for the condition for my second digit but

Spirit Qi sequence parsing issues

社会主义新天地 submitted on 2019-12-12 18:37:41
Question: I have some issues writing a parser with Spirit::Qi 2.4. I have a series of key-value pairs to parse in the following format: <key name>=<value>. The key name can be [a-zA-Z0-9] and is always followed by an = sign, with no white-space between the key name and the = sign. The key name is also always preceded by at least one space. The value can be almost any C expression (spaces are possible as well), with the exception of expressions containing the = char and code blocks { }. At the end of the sequence of the key

Regular expression how to prevent match if followed by a specific word. Something like first character inclusive lookahead?

耗尽温柔 submitted on 2019-12-12 16:24:56
Question: I am using a regular expression to match WHERE conditions in a SQL query. I want WHERE <ANY CONDITION>, but with the exception of WHERE ROWNUM <WHATEVER>. So I do not want ROWNUM to appear after the WHERE keyword. I used lookaheads to achieve that. My regex is WHERE (.*(?! ROWNUM )+). The problem is, it still matches WHERE ROWNUM < 1000. If I delete the space before ROWNUM in the regex, then any column with a name ending with ROWNUM won't match. If I delete the space after WHERE then
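The pattern in the question still matches because the greedy `.*` consumes the rest of the line first, and the lookahead is only checked at the end, where nothing follows and it trivially succeeds. A possible fix is to place the negative lookahead directly after `WHERE `, before anything is consumed. The sketch below uses Python's `re`; the helper name `where_condition` and the sample queries are made up, and the `\b` word boundary is an assumption about the intended behaviour (columns such as MYROWNUM or ROWNUMBER remain allowed).

```python
import re

# Check for ROWNUM immediately after "WHERE ", before consuming the condition.
# \b ensures that only the whole word ROWNUM is excluded.
WHERE_NOT_ROWNUM = re.compile(r"WHERE (?!ROWNUM\b)(.*)")


def where_condition(sql):
    """Return the condition after WHERE, or None if it starts with ROWNUM."""
    m = WHERE_NOT_ROWNUM.search(sql)
    return m.group(1) if m else None


blocked = where_condition("SELECT * FROM t WHERE ROWNUM < 1000")
plain = where_condition("SELECT * FROM t WHERE ID = 5")
suffix = where_condition("SELECT * FROM t WHERE MYROWNUM = 3")
prefix = where_condition("SELECT * FROM t WHERE ROWNUMBER > 2")
```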

Regex lookaround construct in Java: advise on optimization needed

亡梦爱人 submitted on 2019-12-12 13:37:34
Question: I am trying to search for filenames in a comma-separated list:
    text.txt,temp_doc.doc,template.tmpl,empty.zip
I use Java's regex implementation. Requirements for the output are as follows:
    Display only filenames and not their respective extensions
    Exclude files that begin with "temp_"
It should look like:
    text
    template
    empty
So far I have managed to write a more or less satisfactory regex to cope with the first task: [^\\.,]++(?=\\.[^,]*+,?+) I believe to make it comply with the second
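Both requirements can be combined by anchoring each match to the start of a list entry and adding a negative lookahead for the `temp_` prefix. The sketch below is in Python's `re` rather than Java and omits the possessive quantifiers from the question's pattern; it is one possible approach under those assumptions, not the accepted answer.

```python
import re

# (?<![^,])  the match must start at the beginning of the string or after a comma
# (?!temp_)  the entry must not begin with "temp_"
# [^.,]+     the filename itself, up to the first dot
# (?=\.)     which must be followed by an extension
NAME = re.compile(r"(?<![^,])(?!temp_)[^.,]+(?=\.)")

listing = "text.txt,temp_doc.doc,template.tmpl,empty.zip"
names = NAME.findall(listing)
```

Note that `template` is kept: the lookahead rejects only entries whose first five characters are exactly `temp_`.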

Match two quotes not preceded by opening bracket

大憨熊 submitted on 2019-12-12 04:37:57
Question: I need a regex matching all occurrences of two quotes ( '' ) not preceded by an opening bracket ( ( ). I did a negative lookahead for the bracket followed by a quote. But why is this not working: /(?!\()''/g, for example with this string: (''test''test It should match the second occurrence but not the first one, but it matches both. When I use exactly the same solution but with a check for a new line instead of the bracket, it works fine: /(?!^)''/g With this string: ''test''test It matches as expected only
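A negative lookahead inspects what comes after the current position, so `(?!\()''` asks whether the text at the start of the match is a bracket. It never is, because the match starts with a quote. "Not preceded by" calls for a negative lookbehind instead. The sketch below compares the two in Python's `re`, with the `g` flag approximated by `finditer`.

```python
import re

text = "(''test''test"

# Lookahead version from the question: always true, since '' never starts with "(".
lookahead_hits = [m.start() for m in re.finditer(r"(?!\()''", text)]

# Lookbehind version: checks the character immediately before the quotes.
lookbehind_hits = [m.start() for m in re.finditer(r"(?<!\()''", text)]
```

Here `lookahead_hits` contains both occurrences, while `lookbehind_hits` contains only the second one.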

How to explain the same structure expression `(?=\w{6,10})\d+ and (?=abc)ad`? [duplicate]

旧城冷巷雨未停 submitted on 2019-12-12 04:37:52
Question: This question already has an answer here: Reference - What does this regex mean? (1 answer). Closed 2 years ago.
    debian@wifi:~$ echo "348dfgeccvdf" | grep -oP "\d+(?=\w{6,10})"
    348
    debian@wifi:~$ echo "348dfgeccvdf" | grep -oP "(?=\w{6,10})\d+"
    348
For \d+(?=\w{6,10}), it is the standard positive lookahead expression. As Wiktor Stribiżew says in the post position and negative lookbehind: The negative lookbehind syntax starts with (?<! and ends with the unescaped ) . Whether it appears at the
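The two grep commands can be reproduced with Python's `re` to see why both print 348. A lookahead is checked at whatever position it appears in the pattern: placed after `\d+` it looks at the text following the digits, placed before `\d+` it looks at the text starting at the digits themselves. The sketch below also tries the `(?=abc)ad` pattern from the title against a made-up input string.

```python
import re

s = "348dfgeccvdf"

# Lookahead after the digits: "dfgeccvdf" (9 word characters) must follow them.
trailing = re.search(r"\d+(?=\w{6,10})", s).group()

# Lookahead before the digits: 6 to 10 word characters must start at the
# same position as the digits, which "348dfg..." satisfies.
leading = re.search(r"(?=\w{6,10})\d+", s).group()

# (?=abc)ad can never match: the lookahead demands "abc" at the position
# where "ad" must also begin, so the second character would have to be
# both "b" and "d".
contradictory = re.search(r"(?=abc)ad", "abcad")
```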