lookahead | 易学教程

Using lookahead with generators

Question: I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value): for token in scan("a(b)"): print token would print ("literal", "a") ("l_paren", "(") ... The next task involves parsing the token stream, and for that I need to be able to look one item ahead of the current one without advancing the pointer as well. The fact that iterators and generators do not provide the complete sequence of items at once but each item as
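
One common way to get one-item lookahead over a generator is to wrap it in a small buffering iterator. The sketch below is only an illustration under stated assumptions: `Peekable` and its `peek` method are hypothetical names, the `scan` function is a stand-in for the asker's scanner (which is not shown in the excerpt), and the code targets Python 3 rather than the Python 2 `print` statement used in the question.

```python
class Peekable:
    """Wrap an iterable and allow looking one item ahead without consuming it."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered item first, if peek() already fetched one.
        if self._buffer is not self._EMPTY:
            item, self._buffer = self._buffer, self._EMPTY
            return item
        return next(self._it)

    def peek(self, default=None):
        """Return the next item without advancing; `default` if exhausted."""
        if self._buffer is self._EMPTY:
            try:
                self._buffer = next(self._it)
            except StopIteration:
                return default
        return self._buffer


def scan(text):
    """Hypothetical stand-in for the question's scanner."""
    for ch in text:
        if ch == "(":
            yield ("l_paren", ch)
        elif ch == ")":
            yield ("r_paren", ch)
        else:
            yield ("literal", ch)


tokens = Peekable(scan("a(b)"))
first = next(tokens)      # consumes ("literal", "a")
upcoming = tokens.peek()  # looks at ("l_paren", "(") without consuming it
second = next(tokens)     # now actually consumes it
```

The wrapper stays lazy: it never pulls more than one item beyond what the caller has consumed, so it also works on unbounded generators.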

XML schema restriction pattern for not allowing specific string

Question: I need to write an XSD schema with a restriction on a field, to ensure that the value of the field does not contain the substring FILENAME at any location. For example, all of the following must be invalid: FILENAME ORIGINFILENAME FILENAMETEST 123FILENAME456 None of these values should be valid. In a regular expression language that supports negative lookahead, I could do this by writing /^((?!FILENAME).)*$ but the XSD pattern language does not support negative lookahead. How can I implement
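
To make the intended behaviour concrete, here is a small Python check of the negative-lookahead pattern quoted in the question. It only demonstrates what that pattern accepts and rejects in an engine that supports lookahead; it is not an XSD solution, and the "valid" sample values are invented for illustration.

```python
import re

# The pattern from the question: at every position, refuse to step over
# a character if "FILENAME" starts there.
NO_FILENAME = re.compile(r"^((?!FILENAME).)*$")

# Values the question says must be invalid.
invalid = ["FILENAME", "ORIGINFILENAME", "FILENAMETEST", "123FILENAME456"]
# Hypothetical values that should remain valid.
valid = ["FILE", "NAME", "report.txt"]

rejected = [s for s in invalid if NO_FILENAME.match(s) is None]
accepted = [s for s in valid if NO_FILENAME.match(s) is not None]
```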

Does lookaround affect which languages can be matched by regular expressions?

Question: There are some features in modern regex engines which allow you to match languages that couldn't be matched without that feature. For example, the following regex using back references matches the language of all strings that consist of a word that repeats itself: (.+)\1 . This language is not regular and can't be matched by a regex that does not use back references. Does lookaround also affect which languages can be matched by a regular expression? I.e. are there any languages that can be
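
The back-reference example from the question can be tried directly in Python. This is a minimal sketch; `is_doubled` is a hypothetical helper name, and `fullmatch` is used so that the whole string must consist of a word followed by an exact copy of itself.

```python
import re

# (.+) captures a non-empty word, \1 requires the same text again.
REPEATED = re.compile(r"(.+)\1")


def is_doubled(s):
    """True if `s` is some non-empty word written twice in a row."""
    return REPEATED.fullmatch(s) is not None


results = {s: is_doubled(s) for s in ["abcabc", "abab", "aa", "abcabd", "a"]}
```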

Regex to match all permutations of {1,2,3,4} without repetition

Question: I am implementing the following problem in Ruby. Here's the pattern that I want: 1234, 1324, 1432, 1423, 2341 and so on, i.e. the digits in the four-digit number should be in [1-4] and should also be non-repetitive. To put it simply, take a two-digit pattern; the solution should be: 12, 21, i.e. the digits should be either 1 or 2 and should be non-repetitive. To make sure that they are non-repetitive I want to use $1 for the condition for my second digit but
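
One possible approach, sketched here in Python rather than Ruby, combines a negative lookahead with a back reference: the lookahead rejects any string in which some character occurs twice, and the rest of the pattern requires exactly four digits from 1 to 4. This is not necessarily the approach the asker ended up with, and `is_permutation` is a hypothetical helper name.

```python
import re
from itertools import permutations, product

# (?!.*(.).*\1) fails if any character appears again later in the string;
# [1-4]{4} then requires exactly four digits in the range 1-4.
PERMUTATION = re.compile(r"(?!.*(.).*\1)[1-4]{4}")


def is_permutation(s):
    """True if `s` is a permutation of the digits 1, 2, 3, 4."""
    return PERMUTATION.fullmatch(s) is not None


# Cross-check against every 4-character string over the alphabet "1234".
candidates = ("".join(c) for c in product("1234", repeat=4))
matched = {s for s in candidates if is_permutation(s)}
expected = {"".join(p) for p in permutations("1234")}
```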

Spirit Qi sequence parsing issues

Question: I have some issues writing a parser with Spirit::Qi 2.4. I have a series of key-value pairs to parse in the following format: <key name>=<value> . Key name can be [a-zA-Z0-9] and is always followed by an = sign with no white-space between the key name and the = sign. Key name is also always preceded by at least one space. Value can be almost any C expression (spaces are possible as well), with the exception of expressions containing the = char and code blocks { } . At the end of the sequence of the key

Regular expression how to prevent match if followed by a specific word. Something like first character inclusive lookahead?

Question: I am using a regular expression to match where conditions in a SQL query. I want WHERE <ANY CONDITION> , but with the exception of WHERE ROWNUM <WHATEVER> . So I do not want ROWNUM to appear after the WHERE keyword. I used lookaheads to achieve that. My regex is WHERE (.*(?! ROWNUM )+) . The problem is, it still matches WHERE ROWNUM < 1000 . If I delete the space before ROWNUM in the regex, then any column with a name ending with ROWNUM won't match. If I delete the space after WHERE then
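
The quoted regex still matches because the greedy .* first consumes the rest of the line, and the negative lookahead is then tested at the end of the input, where " ROWNUM " trivially does not follow. A minimal Python sketch of one way around this places the lookahead directly after the keyword, before anything is consumed. The helper name `where_condition` and the sample queries are assumptions for illustration only.

```python
import re

# The lookahead sits right after WHERE and covers the whitespace itself,
# so extra spaces before ROWNUM cannot be used to sneak past it.
# \b after ROWNUM keeps columns such as ROWNUMBER matchable.
WHERE_NOT_ROWNUM = re.compile(
    r"\bWHERE\b(?!\s+ROWNUM\b)\s+(.*)", re.IGNORECASE
)


def where_condition(sql):
    """Return the condition after WHERE, or None for WHERE ROWNUM ..."""
    m = WHERE_NOT_ROWNUM.search(sql)
    return m.group(1) if m else None


samples = [
    "SELECT * FROM t WHERE ROWNUM < 1000",
    "SELECT * FROM t WHERE   ROWNUM < 5",
    "SELECT * FROM t WHERE id = 5",
    "SELECT * FROM t WHERE MYROWNUM = 1",
]
conditions = [where_condition(s) for s in samples]
```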

Regex lookaround construct in Java: advise on optimization needed

Question: I am trying to search for filenames in a comma-separated list: text.txt,temp_doc.doc,template.tmpl,empty.zip I use Java's regex implementation. Requirements for the output are as follows: (1) display only filenames and not their respective extensions; (2) exclude files that begin with "temp_". It should look like: text template empty So far I have managed to write a more or less satisfactory regex to cope with the first task: [^\\.,]++(?=\\.[^,]*+,?+) I believe to make it comply with the second
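
As a rough illustration of both requirements together, the Python sketch below anchors each match at the start of a list entry and adds a negative lookahead for the "temp_" prefix. It is not the asker's Java pattern: the possessive quantifiers are dropped, and the entry anchor `(?:^|(?<=,))` is an addition made here so that a match cannot begin in the middle of an excluded name.

```python
import re

# (?:^|(?<=,))  start of the list or right after a comma
# (?!temp_)     skip entries that begin with "temp_"
# [^.,]+        the filename up to the first dot
# (?=\.)        which must be followed by an extension
NAME = re.compile(r"(?:^|(?<=,))(?!temp_)[^.,]+(?=\.)")

listing = "text.txt,temp_doc.doc,template.tmpl,empty.zip"
names = NAME.findall(listing)
```

Without the entry anchor, the engine would simply retry one character later and report "emp_doc" for the excluded file, which is why the lookahead alone is not enough.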

Match two quotes not preceded by opening bracket

Question: I need a regex matching all occurrences of two quotes ( '' ) not preceded by an opening bracket ( ( ). I used a negative lookahead for the bracket followed by a quote. But why is this not working: /(?!\()''/g for example with this string: (''test''test It should match the second occurrence but not the first one, but it matches both. When I use exactly the same solution but with a check for a new line instead of a bracket, it works fine: /(?!^)''/g With this string: ''test''test It matches as expected only
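
A lookahead inspects the text at and after the current position, so (?!\() placed before '' tests the first quote itself, which is never an opening bracket, and the check always passes. "Not preceded by" calls for a lookbehind instead. The following Python sketch compares the two on the string from the question; note that lookbehind support varies between engines, so this is an illustration rather than a drop-in fix for every flavor.

```python
import re

text = "(''test''test"

# Lookahead version from the question: the assertion looks at the quote
# that is about to be matched, so it never rejects anything.
lookahead_starts = [m.start() for m in re.finditer(r"(?!\()''", text)]

# Lookbehind version: the assertion looks at the character before the
# quotes and rejects the pair that follows "(".
lookbehind_starts = [m.start() for m in re.finditer(r"(?<!\()''", text)]
```

The (?!^) variant works in the question's second example because ^ is itself a zero-width test of the current position, so negating it does describe "not at the start".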

How to explain the same structure expression `(?=\w{6,10})\d+ and (?=abc)ad`? [duplicate]

Question: This question already has an answer here: Reference - What does this regex mean? (1 answer) Closed 2 years ago. debian@wifi:~$ echo "348dfgeccvdf" | grep -oP "\d+(?=\w{6,10})" 348 debian@wifi:~$ echo "348dfgeccvdf" | grep -oP "(?=\w{6,10})\d+" 348 For \d+(?=\w{6,10}) , it is the standard positive lookahead expression. As Wiktor Stribiżew says in the post position and negative lookbehind The negative lookbehind syntax starts with (?<! and ends with the unescaped ) . Whether it appears at the
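
The two grep commands can be reproduced with Python's `re` module, which helps show why both print 348 while (?=abc)ad can never match. In every case the lookahead is tested at the position where it appears in the pattern and consumes nothing. This is a sketch; the sample string "abcad" is an assumption added for the second expression.

```python
import re

s = "348dfgeccvdf"

# Lookahead after \d+: digits must be followed by 6-10 word characters.
trailing = re.search(r"\d+(?=\w{6,10})", s).group()

# Lookahead before \d+: 6-10 word characters must start at the position
# where the digits start; the digits themselves count towards them.
leading = re.search(r"(?=\w{6,10})\d+", s).group()

# (?=abc) demands "abc" at the current position, while the pattern then
# demands "ad" at that same position. Both cannot hold, so no match.
impossible = re.search(r"(?=abc)ad", "abcad")
```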