regex-group | 易学教程

java.util.regex.Matcher confused group

Question: I'm having trouble getting the right group of a regex match. My code boils down to the following:

    Pattern fileNamePattern = Pattern.compile("\\w+_\\w+_\\w+_(\\w+)_(\\d*_\\d*)\\.xml");
    Matcher fileNameMatcher = fileNamePattern.matcher("test_test_test_test_20110101_0000.xml");
    System.out.println(fileNameMatcher.groupCount());
    if (fileNameMatcher.matches()) {
        for (int i = 0; i < fileNameMatcher.groupCount(); ++i) {
            System.out.println(fileNameMatcher.group(i));
        }
    }

I expect the output to be: 2 test
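The question turns on how groups are numbered: group 0 is the entire match, and the group count covers only the capturing groups, so a loop bounded by `i < groupCount()` stops one short of the last group. A minimal sketch of the same numbering in Python's `re` module follows. Python is a swapped-in language here, not the asker's Java, and the `list_groups` helper name is hypothetical.

```python
import re

# Same pattern and file name as in the question, written for Python's re module.
FILE_NAME_PATTERN = re.compile(r"\w+_\w+_\w+_(\w+)_(\d*_\d*)\.xml")


def list_groups(file_name):
    """Return [group 0, group 1, ..., group N] for a full match, or [] otherwise."""
    match = FILE_NAME_PATTERN.fullmatch(file_name)
    if match is None:
        return []
    # Pattern.groups counts capturing groups only; group 0 (the whole match)
    # is extra, so the range must include the count itself.
    return [match.group(i) for i in range(FILE_NAME_PATTERN.groups + 1)]


groups = list_groups("test_test_test_test_20110101_0000.xml")
for index, value in enumerate(groups):
    print(index, value)
```

In the Java original, the equivalent change would be to bound the loop with `i <= fileNameMatcher.groupCount()`.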

RegEx for adding underscore before capitalized letters

Question: How do I add an underscore (_) before each capitalized letter in a string, except the first one?

    [1] "VarLengthMean" "VarWidthMean"

I want it to become:

    [1] "Var_Length_Mean" "Var_Width_Mean"

I considered using str_replace_all from stringr, but I can't figure out which regexp I should use. How do I solve this problem?

Answer 1: One option would be to capture the lowercase letter and the following uppercase letter, and then insert the _ while adding the backreference (\\1, \\2) of the captured

RegEx for HTML tag conversion

Question: For some reason, I want to convert strings which contain <p style="text-align:center; others-style:value;">Content</p> to <center>Content</center> in PHP. The text-align values can be either left, right, or center. When there are other styles, I want to omit them. How can I do that in PHP? Edit: Maybe I was not clear enough in my original question. What I mean is that I want to convert contents with text-align:center to be wrapped by <center>, and contents with text-align:right to be

Regex match everything between two {}

Question: I was looking at different answers here, but unfortunately none of them fit my case, so I hope you don't mind me asking. I need to match everything between two curly brackets {}, except when the match starts with @, and without the curly brackets themselves, e.g.:

    "This is a super text { match_this }"
    "{ match_this }"
    "This is another example @{deal_with_it}"

Here are my test strings; 1, 2 and 3 are valid, while the last one shouldn't be:

    1 {eww}
    2 r23r23{fetwe}
    3 #{d2dded}
    4 @{d2dded}

I was
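One way to express the requirement stated in the question is a negative lookbehind that rejects an opening brace directly preceded by `@`. This is a minimal sketch in Python's `re` module, not the accepted answer (the excerpt is cut off before any answer). It assumes braces are not nested and that surrounding whitespace inside the braces should be dropped; the `brace_contents` name is hypothetical.

```python
import re

# Capture the text inside {...} unless the opening brace is directly preceded by "@".
BRACE_PATTERN = re.compile(r"(?<!@)\{([^{}]*)\}")


def brace_contents(text):
    """Return the stripped contents of every {...} not preceded by '@'."""
    return [inner.strip() for inner in BRACE_PATTERN.findall(text)]


# The four test strings from the question.
samples = ["1 {eww}", "2 r23r23{fetwe}", "3 #{d2dded}", "4 @{d2dded}"]
results = {sample: brace_contents(sample) for sample in samples}
for sample, found in results.items():
    print(sample, "->", found)
```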

using regular expression substitution command to insert leading zeros in front of numbers less than 10 in a string of filenames

Question: I am having trouble figuring out how to make this work with the substitution command, which is what I have been instructed to use. I am using this text as a variable:

    text = 'file1, file2, file10, file20'

I want to search the text and substitute in a zero in front of any numbers less than 10. I thought I could do an if statement depending on whether or not re.match or findall would find only one digit after the text, but I can't seem to execute it. Here is my starting code where I am trying to
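A single substitution can do this without an if statement. The sketch below is one possible approach, not the asker's code (which is cut off in the excerpt): it treats "a number less than 10" as a digit with no digit on either side, which assumes the numbers are non-negative integers without existing leading zeros.

```python
import re

text = 'file1, file2, file10, file20'

# A digit with no digit immediately before or after it is a one-digit number;
# the replacement puts "0" in front of the captured digit.
padded = re.sub(r"(?<!\d)(\d)(?!\d)", r"0\1", text)
print(padded)  # file01, file02, file10, file20
```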

Extract multiple values if present from cell in Google Spreadsheets

Using Google re2 https://github.com/google/re2/blob/master/doc/syntax.txt

From a couple of lines like

    I love Rock
    I love Rock and scissors
    I hate paper
    I like Rock, paper and scissors
    I'd love myself

I want to extract "Rock", "paper" and "scissors" from each line. I want the regex to match all the above five lines and give me Rock, paper and scissors where it found something. I'm predominantly using this in Google Sheets, but any Google re2 regex should help. I've tried:

    ".*(([Rock]{0,4})).*"
    ".*(([Rock]{4})|([Rock]{0})).*"
    =REGEXEXTRACT(A2,".*(Rock{0,2}).*(paper{0,2}).*(scissors{0,2}).*")
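In the attempts above, `[Rock]` is a character class that matches a single character out of R, o, c and k rather than the word Rock, and `Rock{0,2}` applies the quantifier to the final `k` only. A plain alternation of the three words avoids both. The sketch below uses Python's `re` module as a stand-in, so it is an assumption that the same idea carries over to the asker's setup; it says nothing about how REGEXEXTRACT returns multiple matches.

```python
import re

# Plain alternation of the three literal words; no character classes involved.
WORD_PATTERN = re.compile(r"Rock|paper|scissors")

# The five sample lines from the question.
lines = [
    "I love Rock",
    "I love Rock and scissors",
    "I hate paper",
    "I like Rock, paper and scissors",
    "I'd love myself",
]

# One list of found words per line; a line with none yields an empty list.
found = [WORD_PATTERN.findall(line) for line in lines]
for line, words in zip(lines, found):
    print(line, "->", words)
```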

Regex Non-Duplicate Bigrams

Question: I want a PCRE regex to create bigram pairings similar to this question, but without duplicate words.

    Full Match: apple orange plum
    Group 1: apple orange
    Group 2: orange plum

The closest I’ve gotten to it is this, but ‘orange’ isn’t captured in the second group.

    (\b.+\b)(\g<1>)\b

Answer 1: You're looking for this:

    /(?=(\b\w+\s+\w+))/g

Here's a quick perl one-liner to demonstrate it:

    $ perl -e 'while ("apple orange plum" =~ /(?=(\b\w+\s+\w+))/g) { print "$1\n" }'
    apple orange
    orange plum

This uses
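The same capturing-group-inside-a-lookahead pattern from the answer can be tried in Python's `re` module, which supports this construct. This is a sketch in a swapped-in language (the answer itself uses Perl); only the pattern is taken from the answer.

```python
import re

# The lookahead consumes no text, so the next search can start inside the
# previous bigram; the capturing group inside it still records the two words.
BIGRAM_PATTERN = re.compile(r"(?=(\b\w+\s+\w+))")

bigrams = BIGRAM_PATTERN.findall("apple orange plum")
print(bigrams)  # ['apple orange', 'orange plum']
```

Because the overall match is empty, `findall` returns the contents of group 1 at each position where the lookahead succeeds.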

Regex to match whatsapp chat log

I've been trying to create a regex for a WhatsApp chat log. So far I've been able to achieve this (Click Here for the test link) by creating the following regex:

    (?P<datetime>\d{2}\/\d{2}\/\d{4},\s\d(?:\d)?:\d{2} [pa].m.)\s-\s(?P<name>[^:]*):(?P<message>.*)

The problem with this regex is that it cannot match long messages that span multiple lines. You can see the issue in the link provided above. Help would be appreciated. Thank you.

There you go:

    ^
    (?P<datetime>\d{2}/\d{2}/\d{4}[^-]+)\s+-\s+
    (?P<name>[^:]+):\s+
    (?P<message>[\s\S]+?)
    (?=^\d{2}|\Z)

See your modified demo on
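The multi-line pattern from the answer relies on verbose and multiline modes. The sketch below shows one way to run it with Python's `re` module; the sample log, the `parse_chat` name and the stripping of trailing newlines are assumptions made for illustration, not part of the original answer.

```python
import re

# Verbose, multiline version of the answer's pattern: a message runs lazily
# until the next line that starts with two digits, or the end of the input.
CHAT_PATTERN = re.compile(
    r"""
    ^
    (?P<datetime>\d{2}/\d{2}/\d{4}[^-]+)\s+-\s+
    (?P<name>[^:]+):\s+
    (?P<message>[\s\S]+?)
    (?=^\d{2}|\Z)
    """,
    re.VERBOSE | re.MULTILINE,
)


def parse_chat(log):
    """Return (datetime, name, message) tuples, one per chat entry."""
    return [
        (m.group("datetime"), m.group("name"), m.group("message").strip())
        for m in CHAT_PATTERN.finditer(log)
    ]


# Made-up sample log; the second line continues Alice's message.
sample_log = (
    "12/03/2019, 9:15 p.m. - Alice: Hello\n"
    "this continues\n"
    "13/03/2019, 10:02 a.m. - Bob: Hi\n"
)
entries = parse_chat(sample_log)
for entry in entries:
    print(entry)
```

One limitation of this approach: a continuation line that itself starts with two digits ends the current message early, since the lookahead only checks for `^\d{2}`.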

RegEx for extracting a value from Open3.popen3 stdout

Question: How do I get the output of an external command and extract values from it? I have something like this:

    stdin, stdout, stderr, wait_thr = Open3.popen3("#{path}/foobar", configfile)
    if /exit 0/ =~ wait_thr.value.to_s
      runlog.puts("Foobar exited normally.\n")
      puts "Test completed."
      someoutputvalue = stdout.read("TX.*\s+(\d+)\s+")
      puts "Output value: " + someoutputvalue
    end

I'm not using the right method on stdout, since Ruby tells me it can't convert String into Integer. So for instance, if the

RegEx for adding underscore before capitalized letters

How do I add an underscore (_) before each capitalized letter in a string, except the first one?

    [1] "VarLengthMean" "VarWidthMean"

I want it to become:

    [1] "Var_Length_Mean" "Var_Width_Mean"

I considered using str_replace_all from stringr, but I can't figure out which regexp I should use. How do I solve this problem?

Answer (akrun): One option would be to capture the lowercase letter and the following uppercase letter, and then insert the _ while adding the backreference (\\1, \\2) of the captured group

    sub("([a-z])([A-Z])", "\\1_\\2", v1)
    #[1] "Var_Length" "Var_Width"

If there are more instances,
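The capture-and-backreference substitution can be sketched in Python (a swapped-in language; the question and answer use R). Python's `re.sub` replaces every match by default, whereas R's `sub` replaces only the first match and `gsub` replaces all of them. The `add_underscores` name is hypothetical.

```python
import re


def add_underscores(name):
    """Insert "_" between a lowercase letter and the uppercase letter after it."""
    # Group 1 is the lowercase letter, group 2 the uppercase letter that follows.
    return re.sub(r"([a-z])([A-Z])", r"\1_\2", name)


names = ["VarLengthMean", "VarWidthMean"]
converted = [add_underscores(name) for name in names]
print(converted)  # ['Var_Length_Mean', 'Var_Width_Mean']
```

The first capital is left alone because no lowercase letter precedes it.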