regex-group

Unicode re.sub() doesn't work with \g<0> (group 0)

萝らか妹 submitted on 2020-02-03 13:25:06
Question: Why doesn't \g<0> work with a Unicode regex? When I use \g<0> to insert a space before and after the group with a normal string regex, it works:
    >>> punct = """,.:;!@#$%^&*(){}{}|\/?><"'"""
    >>> rx = re.compile('[%s]' % re.escape(punct))
    >>> text = '''"anständig"'''
    >>> rx.sub(r" \g<0> ",text)
    ' " anst\xc3\xa4ndig " '
    >>> print rx.sub(r" \g<0> ",text)
     " anständig "
But with a Unicode regex, the space isn't added:
    >>> punct = u""",–−—’‘‚”“‟„!£"%$'&)(+*-€/.±°´·¸;:=<?>@§#¡•[˚]»_^`≤…\«¿¨{}|
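
A minimal sketch, assuming Python 3, where every str is already Unicode and \g<0> in a replacement template stands for the whole match. The punctuation set is a shortened, hypothetical stand-in because the original list is cut off in the excerpt, and the name pad_punctuation is illustrative only.

```python
import re

# Hypothetical, shortened punctuation set (the original list is truncated).
PUNCT = ',.:;!?"\'«»„“”'
RX = re.compile("[%s]" % re.escape(PUNCT))


def pad_punctuation(text: str) -> str:
    """Surround every punctuation character with single spaces."""
    # \g<0> re-emits the entire match, so no capturing group is needed.
    return RX.sub(r" \g<0> ", text)


print(pad_punctuation('"anständig"'))
```

Because the pattern and the text are both str here, there is no byte/Unicode mismatch of the kind the Python 2 session above can run into.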

RegEx for matching HTML tags with specific attributes [duplicate]

寵の児 submitted on 2020-01-30 13:15:27
Question: This question already has answers here: RegEx match open tags except XHTML self-contained tags (34 answers). Closed 8 months ago. I have a string like
    <span title="use a <label>">Some Content</span> <span title="use a <div>">Some Other Content</span>
I need a regex to get only the Some Content or Some Other Content text, ignoring the tags, even if the tags have other tags inside.
Answer 1: Use a document parser and DOM methods to get the content, not regular expressions. Regex is decidedly the wrong
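
As a rough illustration of the parser-based approach the answer recommends, here is a sketch using Python's standard html.parser module. It is an event-based parser rather than a DOM, so it stands in for the idea and is not the answerer's method; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class SpanText(HTMLParser):
    """Collect the text content of each top-level <span> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <span> elements are currently open
        self.buffer = []    # text chunks of the span being read
        self.texts = []     # finished span contents

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self.buffer))
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def span_texts(html: str) -> list:
    parser = SpanText()
    parser.feed(html)
    parser.close()
    return parser.texts


sample = ('<span title="use a <label>">Some Content</span> '
          '<span title="use a <div>">Some Other Content</span>')
print(span_texts(sample))
```

The parser treats the quoted title value as attribute data, so the angle brackets inside it do not confuse the extraction.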

RegEx for matching HTML tags with specific attributes [duplicate]

允我心安 submitted on 2020-01-30 13:14:06
Question: This question already has answers here: RegEx match open tags except XHTML self-contained tags (34 answers). Closed 8 months ago. I have a string like
    <span title="use a <label>">Some Content</span> <span title="use a <div>">Some Other Content</span>
I need a regex to get only the Some Content or Some Other Content text, ignoring the tags, even if the tags have other tags inside.
Answer 1: Use a document parser and DOM methods to get the content, not regular expressions. Regex is decidedly the wrong

Regexp Replace - Append String to Second Occurrence Using R's Sub

纵然是瞬间 submitted on 2020-01-25 20:53:36
Question: I'm trying to append a string to the second occurrence of a match. The code below replaces the second occurrence with a static replacement string, but I need it to be flexible because the match can be, for example, either of (cat|dog). Below is what I'm using to replace with the static string fish.
    string <- "xxx cat xxx cat xxx cat"
    sub('^((.*?cat.*?){1})cat', "\\1\\fish", string, perl=TRUE)
    [1]'xxx cat xxx fish xxx cat'
But what I'm trying to get is:
    string <- "xxx cat xxx cat xxx cat"
    sub('^((.*?(cat
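
The desired output is cut off in the excerpt, so the following is only a sketch under one assumption: "append" means keeping the second matched word and adding a suffix to it. It is written in Python rather than R, and the function name is hypothetical.

```python
import re

# Group 1: everything up to and including the text before the second word.
# Group 2: the second occurrence itself (cat or dog).
SECOND = re.compile(r"^(.*?(?:cat|dog).*?)(cat|dog)")


def append_to_second(text: str, suffix: str) -> str:
    """Append `suffix` to the second occurrence of cat or dog."""
    # \g<1> and \g<2> put back what was matched, so only the suffix is new.
    return SECOND.sub(lambda m: m.group(1) + m.group(2) + suffix, text, count=1)


print(append_to_second("xxx cat xxx cat xxx cat", "fish"))
```

Capturing the second occurrence in its own group is what makes the replacement flexible: the matched word is reused instead of being hard-coded.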

Regexp Replace - Append String to Second Occurrence Using R's Sub

末鹿安然 submitted on 2020-01-25 20:53:30
Question: I'm trying to append a string to the second occurrence of a match. The code below replaces the second occurrence with a static replacement string, but I need it to be flexible because the match can be, for example, either of (cat|dog). Below is what I'm using to replace with the static string fish.
    string <- "xxx cat xxx cat xxx cat"
    sub('^((.*?cat.*?){1})cat', "\\1\\fish", string, perl=TRUE)
    [1]'xxx cat xxx fish xxx cat'
But what I'm trying to get is:
    string <- "xxx cat xxx cat xxx cat"
    sub('^((.*?(cat

Kotlin Regex named groups support

柔情痞子 submitted on 2020-01-22 17:50:27
Question: Does Kotlin have support for named regex groups? A named regex group looks like this: (?<name>...)
Answer 1: According to this discussion, this will be supported in Kotlin 1.1: https://youtrack.jetbrains.com/issue/KT-12753
Kotlin 1.1 EAP is already available to try.
    """(\w+?)(?<num>\d+)""".toRegex().matchEntire("area51")!!.groups["num"]!!.value
You'll have to use kotlin-stdlib-jre8.
Answer 2: As of Kotlin 1.0 the Regex class doesn't provide a way to access matched named groups in MatchGroupCollection
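
For comparison only, here is a sketch of the same pattern in Python, which spells named groups (?P<name>...) rather than (?<name>...). This is not Kotlin code, and the helper name trailing_number is hypothetical.

```python
import re

# Same shape as the Kotlin pattern: a lazy word prefix, then a named digit run.
PATTERN = re.compile(r"(\w+?)(?P<num>\d+)")


def trailing_number(text: str):
    """Return the digits captured by the named group, or None if no match."""
    m = PATTERN.fullmatch(text)   # like matchEntire: the whole input must match
    return m.group("num") if m else None


print(trailing_number("area51"))
```

Looking the group up by name keeps the calling code stable if more groups are later added to the pattern.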

Re-use of a regular expression capture group in Python

橙三吉。 submitted on 2020-01-22 02:30:11
Question: The following Python code works, but the regular expression search is evaluated twice:
    # my_string varies, it gets the following strings = ['%10s', 'comma%11d', 'comma%-6.2f', '%-8s'] in a loop
    output_string = '|'
    re_compiled_pat_int = re.compile(r'comma%(\d+)d')
    re_compiled_pat_fp = re.compile(r'comma%-?(\d+)\.(\d+)f')
    re_compiled_pat_str = re.compile(r'%(-?\d+)s')
    if re_compiled_pat_int.search(my_string):
        output_string += f' %s{re_compiled_pat_int.search(my_string).group(1)}s |' #
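
One way to avoid the repeated search is to keep the match object and test it afterwards. The sketch below assumes the three patterns from the question; the function name classify and the kind labels are hypothetical, and since the remaining branches of the original code are cut off, it only reports which pattern matched and its captured groups.

```python
import re

PATTERNS = [
    ("int", re.compile(r"comma%(\d+)d")),
    ("fp", re.compile(r"comma%-?(\d+)\.(\d+)f")),
    ("str", re.compile(r"%(-?\d+)s")),
]


def classify(spec: str):
    """Return (kind, groups) for the first pattern that matches `spec`."""
    for kind, pattern in PATTERNS:
        m = pattern.search(spec)   # each pattern is searched once
        if m:                      # reuse the stored match object
            return kind, m.groups()
    return None, ()


for spec in ["%10s", "comma%11d", "comma%-6.2f", "%-8s"]:
    print(spec, classify(spec))
```

On Python 3.8 and later, the assignment and the test can also be combined with an assignment expression, as in `if (m := pattern.search(spec)):`.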

Regex behaving weird when finding floating point strings [duplicate]

我与影子孤独终老i submitted on 2020-01-17 14:06:19
Question: This question already has answers here: re.findall behaves weird (3 answers). Closed yesterday. So doing this (in Python 3.7.3):
    >>> from re import findall
    >>> s = '7.95 + 10 pieces'
    >>> findall(r'(\d*\.)?\d+', s)
    ['7.', '']  # Expected return: ['7.95', '10']
I'm not sure why it doesn't find all the floats inside. Is this possibly some Python quirk about capturing groups? My logic behind the regex: (\d*\.)? matches either one or none of any number of digits followed by a period. \d+ then
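
When a pattern contains a capturing group, re.findall returns the group contents instead of the whole match, which explains the output above. A short sketch of the usual workaround, making the group non-capturing with (?:...):

```python
from re import findall

s = "7.95 + 10 pieces"

# Capturing group: findall reports only what the group captured.
with_group = findall(r"(\d*\.)?\d+", s)

# Non-capturing group: findall reports the full match.
without_group = findall(r"(?:\d*\.)?\d+", s)

print(with_group)     # ['7.', '']
print(without_group)  # ['7.95', '10']
```

The pattern matches the same text in both cases; only what findall hands back changes.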

Regex match for new lines

。_饼干妹妹 submitted on 2020-01-16 15:46:12
Question: I am trying to get the log statements in my code using Java. I am using this regex:
    Log\.[dD].*\);
Test data:
    Log.d(TAG, "Construct product info"); test test test test Log.d(TAG, "Loading Product Id: %d, Version: %s", productInfo.getProductId(), productInfo.getProductVersion()); test test test test Log.d(TAG, "Loading Product Id: " + ProductId + "Version:" + Version); test test test test
For some reason, the regex only picks up the first line, which is Log.d(TAG, "Construct product info"); Correct
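
The excerpt is flattened, so where the line breaks fall in the test data is an assumption. If a Log.d call wraps across lines, a plain `.` will not cross the newline. The sketch below shows the idea in Python with re.DOTALL and a lazy quantifier (Java has a corresponding Pattern.DOTALL flag); the sample source text is hypothetical.

```python
import re

# DOTALL lets "." match newlines; ".*?" stops at the first ");" it reaches.
LOG_CALL = re.compile(r"Log\.[dD].*?\);", re.DOTALL)

# Hypothetical source text: the second call is assumed to span two lines.
source = (
    'Log.d(TAG, "Construct product info");\n'
    "test test\n"
    'Log.d(TAG, "Loading Product Id: %d, Version: %s",\n'
    "        productInfo.getProductId(), productInfo.getProductVersion());\n"
    "test test\n"
)

calls = LOG_CALL.findall(source)
for call in calls:
    print(repr(call))
```

The lazy quantifier matters once DOTALL is on: a greedy `.*` would run from the first Log.d to the last `);` in the file.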

Negating a backreference in Regular Expressions

纵饮孤独 submitted on 2020-01-14 06:47:12
Question: If a string has this expected format:
    value = "hello and good morning"
where the " (double quote) might also be ' (single quote), and the closing character (' or ") will be the same as the opening one, I want to match the string between the quotation marks.
    \bvalue\s*=\s*(["'])([^\1]*)\1
(The two \s* are there to allow any spaces around the = sign.) The first captured group (inside the first pair of parentheses) should match the opening quotation mark, which should be either ' or ". Then I'm supposed to allow
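
The rest of the question is cut off, and the regex flavor is not stated. In Python, at least, \1 inside a character class is not treated as a backreference, so [^\1] cannot mean "anything but the opening quote". A common substitute, sketched here in Python with a hypothetical helper name, is a negative lookahead applied one character at a time.

```python
import re

# (?:(?!\1).)* consumes characters one by one, as long as the next
# character is not the quote captured by group 1.
VALUE = re.compile(r"""\bvalue\s*=\s*(["'])((?:(?!\1).)*)\1""")


def quoted_value(text: str):
    """Return the text between matching quotes after 'value =', or None."""
    m = VALUE.search(text)
    return m.group(2) if m else None


print(quoted_value('value = "hello and good morning"'))
print(quoted_value("value = 'it is \"quoted\" here'"))
```

With this form, the other kind of quote is allowed inside the value, while a closing quote that differs from the opening one does not end the match.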