lookahead

Using lookahead with generators

别来无恙 posted on 2019-11-27 18:35:39
I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value): for token in scan("a(b)"): print(token) would print ("literal", "a") ("l_paren", "(") ... The next task is to parse the token stream, and for that I need to be able to look one item ahead of the current one without advancing the pointer. Because iterators and generators do not provide the complete sequence of items at once, but only each item as needed, lookaheads are a bit trickier than with lists, since the next item is not known unless __next__()
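One common way to get a single item of lookahead is to wrap the generator in a small buffering iterator. The following is only a sketch: the Peekable class and the scan() stand-in (including the "r_paren" token name) are assumptions made for illustration, not the asker's actual scanner.

```python
from collections import deque


class Peekable:
    """Wrap any iterable and allow looking at the next item without consuming it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()  # items already pulled from the source but not yet consumed

    def __iter__(self):
        return self

    def __next__(self):
        # Serve a previously peeked item first, otherwise advance the source.
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, default=None):
        """Return the next item without consuming it, or default when exhausted."""
        if not self._buffer:
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                return default
        return self._buffer[0]


def scan(text):
    """Hypothetical stand-in for the scanner described in the question."""
    for ch in text:
        if ch == "(":
            yield ("l_paren", ch)
        elif ch == ")":
            yield ("r_paren", ch)
        else:
            yield ("literal", ch)


tokens = Peekable(scan("a(b)"))
first = next(tokens)      # consumes ("literal", "a")
upcoming = tokens.peek()  # looks at ("l_paren", "(") without consuming it
second = next(tokens)     # returns the same item that peek() showed
```

The buffer holds at most one item here, so the wrapper stays lazy: nothing is read from the generator until the parser asks for it.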

How to match multiple words in regex

江枫思渺然 posted on 2019-11-27 17:42:20
Question: Just a simple regex I don't know how to write. The regex has to make sure a string matches all 3 words. I see how to make it match any of the 3: /advancedbrain|com_ixxocart|p\=completed/ but I need to make sure that all 3 words are present in the string. Here are the words: advancebrain com_ixxocart p=completed Answer 1: Use lookahead assertions: ^(?=.*advancebrain)(?=.*com_ixxocart)(?=.*p=completed) will match if all three terms are present. You might want to add \b word boundaries around your
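As a quick check of the lookahead approach, here is a minimal Python sketch using the three words listed in the question. The helper name has_all_terms and the sample strings are made up for illustration.

```python
import re

# Each (?=...) is checked from the start of the string and consumes nothing,
# so the three terms may appear in any order.
ALL_THREE = re.compile(r"^(?=.*advancebrain)(?=.*com_ixxocart)(?=.*p=completed)")


def has_all_terms(s):
    """Return True if all three terms occur somewhere in s."""
    return ALL_THREE.search(s) is not None


in_order = has_all_terms("advancebrain/index.php?option=com_ixxocart&p=completed")
reordered = has_all_terms("p=completed&option=com_ixxocart&ref=advancebrain")
missing_one = has_all_terms("option=com_ixxocart&p=completed")
```

Because `.` does not match a newline by default, this sketch assumes single-line input.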

XML schema restriction pattern for not allowing specific string

谁说我不能喝 posted on 2019-11-27 16:22:43
I need to write an XSD schema with a restriction on a field, to ensure that the value of the field does not contain the substring FILENAME at any location. For example, all of the following must be invalid: FILENAME ORIGINFILENAME FILENAMETEST 123FILENAME456 None of these values should be valid. In a regular expression language that supports negative lookahead, I could do this by writing /^((?!FILENAME).)*$ but the XSD pattern language does not support negative lookahead. How can I implement an XSD pattern restriction with the same effect as /^((?!FILENAME).)*$ ? I need to use pattern, because
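To make the intended behavior concrete, the sketch below runs the lookahead pattern from the question in Python, which does support negative lookahead. It only illustrates what the XSD restriction is supposed to accept and reject; it is not an XSD solution, and the name is_valid and the accepted sample values are assumptions.

```python
import re

# At every position, (?!FILENAME) refuses to advance if "FILENAME" starts there.
NO_FILENAME = re.compile(r"^((?!FILENAME).)*$")


def is_valid(value):
    """Return True if value does not contain the substring FILENAME."""
    return NO_FILENAME.match(value) is not None


rejected = [v for v in ("FILENAME", "ORIGINFILENAME", "FILENAMETEST", "123FILENAME456")
            if not is_valid(v)]
accepted = [v for v in ("REPORT", "FILENAM", "") if is_valid(v)]
```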

Does lookaround affect which languages can be matched by regular expressions?

蓝咒 posted on 2019-11-27 06:12:22
There are some features in modern regex engines which allow you to match languages that couldn't be matched without that feature. For example the following regex using back references matches the language of all strings that consist of a word that repeats itself: (.+)\1 . This language is not regular and can't be matched by a regex that does not use back references. Does lookaround also affect which languages can be matched by a regular expression? I.e. are there any languages that can be matched using lookaround that couldn't be matched otherwise? If so, is this true for all flavors of
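The back-reference example from the question can be tried directly in Python. This is a small sketch; the function name is_repeated_word is hypothetical, and fullmatch is used so that the whole string must consist of the repeated word.

```python
import re

# \1 must match exactly the text captured by the first group.
REPEATED = re.compile(r"(.+)\1")


def is_repeated_word(s):
    """Return True if s is some non-empty word written twice in a row."""
    return REPEATED.fullmatch(s) is not None


doubled = is_repeated_word("abcabc")      # "abc" followed by "abc"
not_doubled = is_repeated_word("abcabd")  # second half differs
```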

String negation using regular expressions

爱⌒轻易说出口 posted on 2019-11-27 01:28:48
Question: Is it possible to do string negation in regular expressions? I need to match all strings that do not contain the string ".." . I know you can use ^[^\.]*$ to match all strings that do not contain "." but I need to match more than one character. I know I could simply match a string containing ".." and then negate the return value of the match to achieve the same result, but I just wondered if it was possible. Answer 1: You can use negative lookaheads: ^(?!.*\.\.).*$ That causes the expression to not
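A short Python sketch of the negative lookahead from the answer, assuming single-line input strings. The name has_no_double_dot is made up for illustration.

```python
import re

# (?!.*\.\.) fails at the start of the string if ".." occurs anywhere after it.
NO_DOUBLE_DOT = re.compile(r"^(?!.*\.\.).*$")


def has_no_double_dot(s):
    """Return True if s does not contain two consecutive dots."""
    return NO_DOUBLE_DOT.match(s) is not None


single_dots = has_no_double_dot("a.b.c")  # dots are present but never adjacent
double_dot = has_no_double_dot("a..b")    # contains ".."
```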

Javascript won't split using regex

两盒软妹~` posted on 2019-11-26 21:29:45
Question: Since I started writing this question, I think I have figured out the answers to every question I had, but I thought I'd post anyway, as it might be useful to others and more clarification might be helpful. I was trying to use a regular expression with lookahead with the JavaScript function split. For some reason it was not splitting the string, even though it finds a match when I call match. I originally thought the problem came from using lookahead in my regular expression. Here is a simplified

Regex to match all permutations of {1,2,3,4} without repetition

坚强是说给别人听的谎言 posted on 2019-11-26 19:06:48
I am implementing the following problem in Ruby. Here's the pattern that I want: 1234, 1324, 1432, 1423, 2341 and so on, i.e. each digit of the four-digit number should be in the range [1-4] and no digit should repeat. As a simpler illustration, take a two-digit pattern, for which the solution should be: 12, 21, i.e. the digits should be either 1 or 2 and should not repeat. To make sure that they do not repeat, I want to use $1 in the condition for my second digit, but it's not working. Please help me out, and thanks in advance. polygenelubricants: You can use this (see on
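The answer above is cut off, so the following is not a reconstruction of it. It is one possible approach, sketched in Python rather than Ruby: a negative lookahead containing a back-reference rejects any string in which some character occurs twice. The pattern and the name is_permutation are assumptions for illustration.

```python
import re
from itertools import permutations, product

# (?!.*(.).*\1) fails if any character is followed later by the same character;
# [1-4]{4} then requires exactly four digits from 1 to 4.
PERMUTATION = re.compile(r"^(?!.*(.).*\1)[1-4]{4}$")


def is_permutation(s):
    """Return True if s is a permutation of the digits 1, 2, 3, 4."""
    return PERMUTATION.match(s) is not None


# Cross-check against a brute-force enumeration of all 4-digit candidates.
expected = {"".join(p) for p in permutations("1234")}
candidates = ["".join(c) for c in product("1234", repeat=4)]
accepted = {c for c in candidates if is_permutation(c)}
```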

Regular expression negative lookahead

可紊 posted on 2019-11-26 14:28:20
In my home directory I have a folder drupal-6.14 that contains the Drupal platform. From this directory I use the following command: find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz This command gzips the folder drupal-6.14, excluding all subfolders of drupal-6.14/sites/ except sites/all and sites/default, which it includes. My question is about the regular expression: grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' The expression works to exclude all the folders I want excluded, but I don't quite
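The nested lookahead can be read as "reject a path that continues with sites, unless that sites is itself followed by /all or /default". The Python sketch below runs the same pattern against a few made-up paths. The file names and the helper is_included are assumptions, and re.search is used here to approximate how grep selects any line containing a match.

```python
import re

# Outer (?!...) rejects "sites" right after the slash, but the inner (?!...)
# makes that rejection fail when "sites" is followed by "/all" or "/default".
PATTERN = re.compile(r"drupal-6.14/(?!sites(?!/all|/default)).*")


def is_included(path):
    """Return True if grep -P with this pattern would keep the path."""
    return PATTERN.search(path) is not None


paths = [
    "drupal-6.14/index.php",
    "drupal-6.14/sites/all/modules/views/views.module",
    "drupal-6.14/sites/default/settings.php",
    "drupal-6.14/sites/example.com/settings.php",
]
kept = [p for p in paths if is_included(p)]
```

Only the sites/example.com path is filtered out; the other three are kept.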

How does the regular expression ‘(?<=#)[^#]+(?=#)’ work?

江枫思渺然 posted on 2019-11-26 13:22:13
I have the following regex in a C# program, and have difficulty understanding it: (?<=#)[^#]+(?=#) I'll break it down into what I think I understood:
(?<=#) a group, matching a hash. What's `?<=`?
[^#]+ one or more non-hashes (used to achieve non-greediness)
(?=#) another group, matching a hash. What's the `?=`?
So the problem I have is the ?<= and ?< part. From reading MSDN, ?<name> is used for naming groups, but in this case the angle bracket is never closed. I couldn't find ?= in the docs, and searching for it is really difficult, because search engines will mostly ignore those special
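For reference, (?<=#) is a lookbehind assertion and (?=#) is a lookahead assertion: each checks for a hash next to the match without including it in the match. The sketch below uses Python instead of C#, and the sample strings are made up; the same pattern syntax works in Python's re module.

```python
import re

# (?<=#) requires a '#' immediately before the match,
# (?=#) requires a '#' immediately after it; neither hash is consumed.
BETWEEN_HASHES = re.compile(r"(?<=#)[^#]+(?=#)")

found = BETWEEN_HASHES.findall("a#b#c#d")
# found == ['b', 'c']: 'a' has no hash before it, and 'd' has no hash after it.

single = BETWEEN_HASHES.findall("#tag#")
# single == ['tag']: the surrounding hashes are not part of the result.
```

Because the hashes are not consumed, the hash between b and c serves as the closing delimiter of one match and the opening delimiter of the next.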

Regular expression negative lookahead

China☆狼群 posted on 2019-11-26 03:55:38
Question: In my home directory I have a folder drupal-6.14 that contains the Drupal platform. From this directory I use the following command: find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz This command gzips the folder drupal-6.14, excluding all subfolders of drupal-6.14/sites/ except sites/all and sites/default, which it includes. My question is about the regular expression: grep -P 'drupal-6.14/(?!sites(?!