regex-greedy

Strange behavior in regexes

浪尽此生 posted on 2020-01-10 05:07:31
Question: While trying to answer another question about regexes, I found some more strange behavior.

    String x = "X";
    System.out.println(x.replaceAll("X*", "Y"));

This prints YY. Why?

    String x = "X";
    System.out.println(x.replaceAll("X*?", "Y"));

And this prints YXY. Why doesn't the reluctant regex match the 'X' character? The input is "nothing", then X, then "nothing". So why doesn't the first regex match all three symbols at once, instead of matching two and then one? And why does the second regex match only the "nothing"s and not the X?

Answer 1: Let's consider them
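The YY result can be pictured by listing where a greedy X* matches inside "X". The sketch below is only an illustration: it uses Python's re module rather than Java's java.util.regex (an assumption made here, since the two engines are not guaranteed to treat every empty match the same way), and it inspects match positions instead of performing the replacement. The helper name match_spans is hypothetical.

```python
import re

def match_spans(pattern, text):
    """Return (start, end, matched_text) for each successive match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

# Greedy X* first consumes the whole "X", then matches the empty
# string that remains at the end of the input.
spans = match_spans(r"X*", "X")
print(spans)  # [(0, 1, 'X'), (1, 1, '')]
```

Two matches mean two replacements, which is consistent with the YY output reported in the question.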

perl non-greedy problem

浪子不回头ぞ posted on 2020-01-03 18:33:05
Question: I am having a problem with a non-greedy regular expression. I've seen that there are questions regarding non-greedy regexes, but they don't answer my problem. Problem: I am trying to match the href of the "lol" anchor. Note: I know this can be done with Perl HTML parsing modules, and my question is not about parsing HTML in Perl. My question is about the regular expression itself; the HTML is just an example. Test case: I have 4 tests for .*? and [^"] . The first 2 produce the expected

Regular Expression nongreedy is greedy

£可爱£侵袭症+ posted on 2020-01-02 02:31:10
Question: I have the following text: tooooooooooooon. According to the book I'm reading, when ? follows any quantifier, the quantifier becomes non-greedy. My regex to*?n is still returning tooooooooooooon. It should return ton, shouldn't it? Any idea why?

Answer 1: A regular expression can only match a fragment of text that actually exists. Because the substring 'ton' doesn't exist anywhere in your string, it can't be the result of a match. A match will only return a substring of the original string. EDIT: To
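The point of the answer can be checked with a short sketch. It uses Python's re module as a stand-in for whichever engine the question used (an assumption; the lazy-quantifier semantics shown here are common to backtracking engines). The variable names are illustrative.

```python
import re

text = "tooooooooooooon"

# With a trailing "n" required, lazy and greedy must both consume every "o",
# because the match has to be a contiguous substring of the text.
lazy_full = re.search(r"to*?n", text).group()
greedy_full = re.search(r"to*n", text).group()

# Without the trailing "n", laziness becomes visible: the lazy form
# stops as early as it can, the greedy form as late as it can.
lazy_prefix = re.search(r"to*?", text).group()
greedy_prefix = re.search(r"to*", text).group()

print(lazy_full, greedy_full, lazy_prefix, greedy_prefix)
```

The lazy quantifier only changes which of several possible matches is chosen; it cannot skip characters in the middle of the string.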

How can I fix my regex to not match too much with a greedy quantifier? [duplicate]

假装没事ソ posted on 2019-12-30 03:29:11
Question: This question already has answers here: My regex is matching too much. How do I make it stop? (5 answers). Closed 10 months ago.

I have the following line:

    "14:48 say;0ed673079715c343281355c2a1fde843;2;laka;hello ;)"

I parse this by using a simple regexp:

    if($line =~ /(\d+:\d+)\ssay;(.*);(.*);(.*);(.*)/) {
        my($ts, $hash, $pid, $handle, $quote) = ($1, $2, $3, $4, $5);
    }

But the ; at the end messes things up and I don't know why. Shouldn't the greedy operator handle "everything"?

Answer 1: The
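To see what the greedy groups actually capture, here is a small sketch in Python (the question uses Perl; Python's re is assumed to behave the same way for this pattern). The second pattern, which restricts the first three fields to non-semicolon characters, is one common workaround and is not taken from the truncated answer.

```python
import re

line = "14:48 say;0ed673079715c343281355c2a1fde843;2;laka;hello ;)"

# Original pattern: every (.*) is greedy, so the first one grabs as much
# as it can while still leaving three ";" separators for the rest.
greedy = re.compile(r"(\d+:\d+)\ssay;(.*);(.*);(.*);(.*)")

# Assumed fix: fields that cannot contain ";" are written as [^;]*,
# and only the final free-text field keeps (.*).
bounded = re.compile(r"(\d+:\d+)\ssay;([^;]*);([^;]*);([^;]*);(.*)")

greedy_groups = greedy.search(line).groups()
bounded_groups = bounded.search(line).groups()

print(greedy_groups)
print(bounded_groups)
```

With the greedy pattern the hash field swallows ";2" and the quote shrinks to ")", because the extra ";" inside the smiley is used as a field separator.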

Regular Expression specific to a particular message pattern with mandatory elements

点点圈 posted on 2019-12-25 19:39:13
Question: As I'm pretty new to regular expressions, I'm looking for a regular expression that validates whether the entire string is separated by | and each value is a $ followed by an integer.

Valid values:

    ABC=$2|CDE=$1|Msg=$4|Ph.No=$3|TIME=$5
    ABC=$2|CDE=$1|Msg123=$4|Ph.No=$3|TIME_23=$5
    abc=$2|123=$1|cfg=$4|Ph.No=$3

Invalid values:

    ABC=$2CDE=$1Msg=$4
    ABC=2|CDE=1|Msg123=$4|Ph.No=$3|TIME_23=$5
    abc$2|123$1|cfg$4|Ph.No=$3
    Msg123=$ |Ph.No=$ |TIME_23=5
    abcdefgh|1234|eghjik
    Msg123=$*|Ph.No=$()
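A minimal sketch of one way to express this validation, assuming the grammar is "key=$digits" entries joined by single | characters. The excerpt does not say which characters a key may contain, so the character class below (anything except |, = and $) is a guess, and the names ENTRY, PATTERN and is_valid are hypothetical.

```python
import re

# Assumed grammar: key=$<digits>, repeated, separated by single "|".
# The key character class is an assumption, not part of the question.
ENTRY = r"[^|=$]+=\$\d+"
PATTERN = re.compile(rf"{ENTRY}(?:\|{ENTRY})*")

def is_valid(message: str) -> bool:
    """Return True only if the whole message fits the assumed grammar."""
    return PATTERN.fullmatch(message) is not None

print(is_valid("ABC=$2|CDE=$1|Msg=$4|Ph.No=$3|TIME=$5"))
print(is_valid("ABC=$2CDE=$1Msg=$4"))
```

Using fullmatch rather than search is what enforces "the entire string": a partial match in the middle of an otherwise malformed message is rejected.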

Python re.search() and re.findall() [duplicate]

匆匆过客 posted on 2019-12-24 13:20:57
Question: This question already has answers here: Python re.search (2 answers). Closed 6 years ago.

I am trying to solve this problem from Hackerrank. It is a machine learning problem. Initially, I tried to read all the words from the corpus file to build unigram frequencies. According to this ML problem, a word is defined as follows: a word is a sequence of characters containing only letters from a to z (lowercase only), and it can contain hyphens ( - ) and apostrophes ( ' ). Word should begin and end with

Parsing regex with alternatives and optionals

╄→尐↘猪︶ㄣ posted on 2019-12-24 03:37:08
Question: I'm building a chatbot subset of RiveScript and trying to build the pattern-matching parser with regular expressions. Which three regexes match the following three examples?

ex1: I am * years old
valid match:
- "I am 24 years old"
invalid match:
- "I am years old"

ex2: what color is [my|your|his|her] (bright red|blue|green|lemon chiffon) *
valid matches:
- "what color is lemon chiffon car"
- "what color is my some random text till the end of string"

ex3: [*] told me to say *
valid matches:
-

How to replace “^@” with “\r” [duplicate]

♀尐吖头ヾ posted on 2019-12-23 17:20:10
Question: This question already has answers here: Beginner scripting: ^@ appearing at the end of text (2 answers). Closed 4 years ago.

I have a script that contains certain functions, replacements, and so on, but its output contains some junk characters such as ^@ . How can I replace this ^@ with a newline from within the script? The command %s/<CTRL-2>//g works on the vim command line, but not from the script.

Answer 1: ^@ is ASCII 0. You could use:

    sed 's/\x00/\n/g'

Answer 2: Type the command in vim. If you want to fix the result, you could do this in vim:
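Since ^@ is how editors display the NUL byte (ASCII 0), the same substitution can be sketched outside sed. The Python version below is only an illustration of the idea in Answer 1; the function name nul_to_newline is hypothetical, and working on bytes is an assumption that avoids guessing the file's text encoding.

```python
def nul_to_newline(data: bytes) -> bytes:
    """Replace every NUL byte (shown as ^@ in vim) with a newline."""
    return data.replace(b"\x00", b"\n")

sample = b"first\x00second\x00"
cleaned = nul_to_newline(sample)
print(cleaned)  # b'first\nsecond\n'
```

To produce a carriage return instead, as the title asks, b"\r" can be substituted for b"\n".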

PHP preg_replace non-greedy trouble

倾然丶 夕夏残阳落幕 posted on 2019-12-23 09:40:36
Question: I've been using the following site to test a PHP regex so I don't have to constantly upload: http://www.spaweditor.com/scripts/regex/index.php

I'm using the following regex:

    /(.*?)\.{3}/

on the following string (replacing with nothing):

    Non-important data...important data...more important data

preg_replace is returning:

    more important data

yet I expect it to return:

    important data...more important data

I thought ? was the non-greedy modifier. What's going on here?

Answer 1: Your non-greedy
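The reported result is what a global replacement produces: the lazy pattern matches twice, once for each "..." in the string. The sketch below reproduces this with Python's re.sub, used here as a stand-in for PHP (an assumption). Its count argument plays a role comparable to the limit parameter of preg_replace; whether the truncated answer recommends that approach is not known from the excerpt.

```python
import re

text = "Non-important data...important data...more important data"
pattern = r"(.*?)\.{3}"

# Replacing every match removes both "...data..." prefixes.
all_removed = re.sub(pattern, "", text)

# Limiting the replacement to the first match gives the expected output.
first_removed = re.sub(pattern, "", text, count=1)

print(all_removed)
print(first_removed)
```

The quantifier is non-greedy in both calls; only the number of replacements differs.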

regular expression greedy on left side only (.net)

回眸只為那壹抹淺笑 posted on 2019-12-23 04:34:07
Question: I am trying to capture matches between two strings. For example, I am looking for all text that appears between Q and XYZ, using the "soonest" match (not continuing to expand outwards).

This string:

    circus Q hello there Q SOMETEXT XYZ today is the day XYZ okay XYZ

should return:

    Q SOMETEXT XYZ

But instead, it returns:

    Q hello there Q SOMETEXT XYZ

Here is the expression I'm using:

    Q.*?XYZ

It's going too far back to the left. It's working fine on the right side when I use the question mark after
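A regex engine reports the leftmost match, and a lazy quantifier only shortens the right end, which explains the result above. One common workaround, sketched here in Python rather than .NET (an assumption; the negative lookahead used is also available in .NET), is to forbid another Q between the opening Q and XYZ. This pattern is an illustration and is not taken from the excerpt.

```python
import re

text = "circus Q hello there Q SOMETEXT XYZ today is the day XYZ okay XYZ"

# Lazy match: starts at the first Q, stops at the first XYZ after it.
lazy = re.search(r"Q.*?XYZ", text).group()

# Tempered match: each consumed character must not be a Q, so a match
# can only start at the last Q before an XYZ.
tempered = re.search(r"Q(?:(?!Q).)*?XYZ", text).group()

print(lazy)
print(tempered)
```

If the opening delimiter is a single character, as it is here, the simpler Q[^Q]*?XYZ has the same effect; the lookahead form generalizes to multi-character delimiters.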