regex-greedy

Non greedy regex

我是研究僧i submitted on 2019-12-22 18:44:32
Question: I need to get the values inside some tags in a comment in a PHP file like this: php code /* this is a comment !- <titulo>titulo3</titulo> <funcion> <descripcion>esta es la descripcion de la funcion 6</descripcion> </funcion> <funcion> <descripcion>esta es la descripcion de la funcion 7</descripcion> </funcion> <otros> <descripcion>comentario de otros 2a hoja</descripcion> </otros> -! */ some php code. As you can see, the file has newlines and repetitions of tags like <funcion></funcion>, and I need to
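
The question is cut off, so the following is only a minimal Python sketch of the lazy-matching idea its title points at, not the accepted answer. It assumes the comment is laid out one tag per line (the excerpt above is flattened) and that only the <descripcion> values inside each <funcion> block are wanted; the variable names are made up for illustration.

```python
import re

# Assumed layout of the file: the excerpt is flattened, the real line breaks are a guess.
source = """<?php
/* this is a comment
!-
<titulo>titulo3</titulo>
<funcion>
<descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
<descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
<descripcion>comentario de otros 2a hoja</descripcion>
</otros>
-!
*/
echo "some php code";
"""

# Lazy .*? stops at the first closing tag, so each <funcion> block is matched
# on its own; DOTALL lets "." cross the newlines inside a block.
funcion_blocks = re.findall(r"<funcion>(.*?)</funcion>", source, flags=re.DOTALL)

# Greedy .* runs to the LAST </funcion>, merging both blocks into one match.
greedy_blocks = re.findall(r"<funcion>(.*)</funcion>", source, flags=re.DOTALL)

descriptions = [
    re.search(r"<descripcion>(.*?)</descripcion>", block, flags=re.DOTALL).group(1)
    for block in funcion_blocks
]
print(descriptions)
```

The same pattern idea carries over to PHP's preg_match_all with the s modifier, but that translation is not shown in the excerpt.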

Regex to pick out artist name and song title, issue with lazy matching

江枫思渺然 submitted on 2019-12-22 18:26:36
Question: I'm trying to build a flexible regular expression to pick out the artist name and song title of a media file. I'd like it to support all of the following: 01 Example Artist - Example Song.mp3; 01 Example Song.mp3 (in this example there's no artist, so that group should be null); Example Artist - Example Song.mp3; Example Song.mp3 (again, no artist). I've come up with the following (in .NET syntax, particularly for named capture groups): \d{0,2}\s*(?<artist>[^-]*)?[\s-]*(?<songname
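
The poster's pattern is truncated, so here is an independent sketch in Python rather than a repair of it. It assumes the artist, when present, is always separated from the title by " - " and that the track number is one or two digits followed by whitespace; `parse_filename` is a hypothetical helper name.

```python
import re

FILENAME = re.compile(
    r"^(?:\d{1,2}\s+)?"             # optional track number, e.g. "01 "
    r"(?:(?P<artist>.+?)\s+-\s+)?"  # optional artist, lazily up to the first " - "
    r"(?P<songname>.+?)"            # song title
    r"\.mp3$"
)

def parse_filename(name):
    """Return (artist, songname); artist is None when no ' - ' separator exists."""
    match = FILENAME.match(name)
    if match is None:
        return None
    return match.group("artist"), match.group("songname")

for name in [
    "01 Example Artist - Example Song.mp3",
    "01 Example Song.mp3",
    "Example Artist - Example Song.mp3",
    "Example Song.mp3",
]:
    print(name, "->", parse_filename(name))
```

Making the whole artist-plus-separator group optional, instead of only the artist capture, is what lets the group come back empty. A title that itself starts with a number would be misread as a track number under these assumptions.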

Regex to match whatsapp chat log

99封情书 submitted on 2019-12-22 17:58:01
Question: I've been trying to create a regex for a WhatsApp chat log. So far I have come up with the following regex (the original post links to a test page): (?P<datetime>\d{2}\/\d{2}\/\d{4},\s\d(?:\d)?:\d{2} [pa].m.)\s-\s(?P<name>[^:]*):(?P<message>.*) The problem with this regex is that it cannot match long messages that span multiple lines. You can see the issue on the linked test page. Help would be appreciated. Thank you. Answer 1: There you go: ^ (?P<datetime>\d{2}/\d{2
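
The answer is truncated, so the sketch below is one possible approach in Python and not necessarily what the answer proposed. It assumes every entry starts on its own line with the timestamp format from the question; the sample log and the names Alice and Bob are invented. The message is matched lazily and a lookahead stops it just before the next timestamp or at the end of the text.

```python
import re

# Timestamp format taken from the question's regex, with the dots escaped.
DATETIME = r"\d{2}/\d{2}/\d{4},\s\d{1,2}:\d{2} [pa]\.m\."

ENTRY = re.compile(
    rf"^(?P<datetime>{DATETIME})\s-\s(?P<name>[^:]*):\s"
    rf"(?P<message>.*?)(?=\n{DATETIME}\s-\s|\s*\Z)",
    flags=re.DOTALL | re.MULTILINE,  # "." crosses newlines, "^" matches at line starts
)

# Hypothetical sample log for illustration only.
log = (
    "12/05/2019, 9:15 p.m. - Alice: Hello\n"
    "this message continues\n"
    "on a third line\n"
    "12/05/2019, 10:02 p.m. - Bob: Short reply\n"
)

entries = [
    (m.group("datetime"), m.group("name"), m.group("message"))
    for m in ENTRY.finditer(log)
]
for entry in entries:
    print(entry)
```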

Python non-greedy regex to clean xml

情到浓时终转凉″ submitted on 2019-12-22 13:56:10
Question: I have an 'xml file' that has some unwanted characters in it: <data> <tag>blar </tag><tagTwo> bo </tagTwo> some extra characters not enclosed that I want to remove <anothertag>bbb</anothertag> </data> I thought the following non-greedy substitution would remove the characters that are not properly enclosed in <sometag></sometag>: re.sub("</([a-zA-Z]+)>.*?<","</\\1><",text) Here text is the xml text; ([a-zA-Z]+) is annotated "remember tag"; the replacement "</\\1><" is annotated "put tag back without and reopen next tag"; and .*?< is annotated "read everything until the next
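
The question breaks off before stating what went wrong, so this Python sketch only shows how the substitution behaves; it does not claim to reproduce the poster's failure. The second sample with line breaks is an assumption, added because "." does not match a newline unless re.DOTALL is set.

```python
import re

PATTERN = r"</([a-zA-Z]+)>.*?<"
REPLACEMENT = r"</\1><"

# The sample from the question, on a single line.
text = (
    "<data> <tag>blar </tag><tagTwo> bo </tagTwo> some extra characters "
    "not enclosed that I want to remove <anothertag>bbb</anothertag> </data>"
)
cleaned = re.sub(PATTERN, REPLACEMENT, text)
print(cleaned)

# Hypothetical multi-line variant: stray text sits on its own line.
multiline = "<a>x</a>\njunk\n<b>y</b>"
unchanged = re.sub(PATTERN, REPLACEMENT, multiline)                  # "." stops at "\n"
fixed = re.sub(PATTERN, REPLACEMENT, multiline, flags=re.DOTALL)     # "." crosses "\n"
print(repr(unchanged), repr(fixed))
```

Note that the substitution also removes plain whitespace between a closing tag and the next tag, which may or may not be wanted.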

python regex match more than once per index of search string

时间秒杀一切 submitted on 2019-12-22 13:48:39
Question: I'm looking for a way to make the finditer function of the Python re module, or of the newer regex module, match all possible variations of a particular pattern, overlapping or otherwise. I am aware of using lookaheads to get matches without consuming the search string, but I still only get one match per index, where I could get more than one. The regex I am using is something like this: (?=A{2}[BA]{1,6}A{2}) so in the string AABAABBAA it should be able to match: AABAA AABAABBAA AABBAA but
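
One brute-force way to get every variation, sketched in Python under the assumption that the input is short enough for a quadratic scan: test each (start, end) slice with fullmatch instead of relying on finditer. `all_matches` is a hypothetical helper, not part of re or regex.

```python
import re

def all_matches(pattern, text):
    """Return (start, substring) for every slice of text the pattern fully matches."""
    compiled = re.compile(pattern)
    results = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            # fullmatch with pos/endpos checks exactly the slice text[start:end]
            if compiled.fullmatch(text, start, end):
                results.append((start, text[start:end]))
    return results

matches = all_matches(r"A{2}[BA]{1,6}A{2}", "AABAABBAA")
print(matches)
```

This runs the pattern on the order of n squared times, so it is a demonstration of the idea, not something to run on long inputs.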

Matching text between delimiters: greedy or lazy regular expression?

会有一股神秘感。 submitted on 2019-12-20 09:37:26
Question: For the common problem of matching text between delimiters (e.g. < and >), there are two common patterns: using the greedy * or + quantifier in the form START [^END]* END, e.g. <[^>]*>, or using the lazy *? or +? quantifier in the form START .*? END, e.g. <.*?>. Is there a particular reason to favour one over the other? Answer 1: Some advantages: [^>]*: More expressive. Captures newlines regardless of the /s flag. Considered quicker, because the engine doesn't have to backtrack to find a
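
The newline point from the answer can be checked with a small Python sketch; the sample strings are invented, and re.DOTALL plays the role of the /s flag.

```python
import re

# Hypothetical input: the second tag contains a line break.
text = "<a>\n<b\nclass='x'> <c>"

negated = re.findall(r"<[^>]*>", text)                  # [^>] matches newlines too
lazy = re.findall(r"<.*?>", text)                       # "." stops at a newline
lazy_dotall = re.findall(r"<.*?>", text, flags=re.DOTALL)

# For contrast: a plain greedy dot runs to the last ">" on the line.
greedy = re.findall(r"<.*>", "<a> <c>")

print(negated, lazy, lazy_dotall, greedy, sep="\n")
```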

Converting a Regex Expression that works in Chrome to work in Firefox [duplicate]

做~自己de王妃 submitted on 2019-12-20 07:56:04
Question: This question already has answers here: Javascript: negative lookbehind equivalent? (13 answers). Closed 6 months ago. I have this regex that works in Chrome but does not work in Firefox, where it fails with SyntaxError: invalid regexp group. It has something to do with lookbehinds, which Firefox does not support. I need this to work in Firefox. Can someone help me convert it so that it works in Firefox and filters out the tags as well? return new RegExp(`(?!<|>|/|&amp|_)(?<!</?[^>]*|&[^;]*)(${term

Priority in regex manipulating

南笙酒味 submitted on 2019-12-20 06:18:33
Question: I wrote some Java code to split a string into an array of strings. First, I split the string using the regex pattern "\\,\\,|\\," and then I split it using the pattern "\\,|\\,\\," . Why is there a difference between the output of the first and the output of the second? public class Test2 { public static void main(String[] args){ String regex1 = "\\,\\,|\\,"; String regex2 = "\\,|\\,\\,"; String a = "20140608,FT141590Z0LL,0608103611018634TCKJ3301000000018667,3000054789,IDR1742630000001,80507,1000,6012,TCKJ3301,6.00E
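
The excerpt ends before any answer, so the following Python sketch is only an illustration of the likely cause: in a backtracking engine, alternatives are tried left to right and the first one that matches wins. The sample string is invented (the poster's string is truncated), and Java's String.split additionally drops trailing empty strings, which this sketch does not exercise.

```python
import re

sample = "a,,b,c"  # hypothetical input with one doubled comma

# ",," is tried first, so the doubled comma is consumed as a single separator.
double_first = re.split(r",,|,", sample)

# "," is tried first and always succeeds, so ",," is never reached;
# the doubled comma becomes two separators with an empty field between them.
single_first = re.split(r",|,,", sample)

print(double_first)
print(single_first)
```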

Java Regex - Ilegal Repetition character

爷,独闯天下 submitted on 2019-12-20 05:56:17
Question: My regex is (?:--|#|\/\*|{) When I compile this using Pattern.compile() in Java, I get an "Illegal Repetitive Character" error. I tested this regex: (a|\/\*|b) and it compiles without error. Why does this occur? Answer 1: It is because of {. It is used to specify how many times something should be repeated. For instance, x{2,4} will match x repeated 2 (xx), 3 (xxx) or 4 (xxxx) times. If you want the regex to match a literal {, it needs to be escaped: (?:--|#|\/\*|\{) Source: https:/
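
A quick check of the escaped pattern and of the x{2,4} quantifier, sketched in Python. Python's re happens to accept a bare { in this position, so the Java error itself is not reproduced here; the escaped form is simply the portable spelling. The sample strings and the `first_token` helper are made up.

```python
import re

# Escaped "{" is a literal brace; "/" needs no escaping in Python.
TOKEN = re.compile(r"(?:--|#|/\*|\{)")

def first_token(line):
    """Return the first token the pattern finds in line, or None."""
    match = TOKEN.search(line)
    return match.group(0) if match else None

samples = ["-- dashes", "# hash", "/* block */", "{ brace }", "plain text"]
found = [first_token(s) for s in samples]
print(found)

# Unescaped braces after an atom are a repetition count, as the answer describes.
repeats = [bool(re.fullmatch(r"x{2,4}", s)) for s in ["x", "xx", "xxx", "xxxx", "xxxxx"]]
print(repeats)
```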

Need regexp to find substring between two tokens

断了今生、忘了曾经 submitted on 2019-12-20 01:36:12
Question: I suspect this has already been answered somewhere, but I can't find it, so... I need to extract a string from between two tokens in a larger string, in which the second token will probably appear again. That is (pseudocode): myString = "A=abc;B=def_3%^123+-;C=123;" ; myB = getInnerString(myString, "B=", ";" ) ; method getInnerString(inStr, startToken, endToken){ return inStr.replace( EXPRESSION, "$1"); } So, when I run this using the expression " .+B=(.+);.+ " I get "def_3%^123+-;C=123;"
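
A minimal Python sketch of the lazy version of this idea, assuming the goal is the text up to the first end token after the start token. It searches for the capture group instead of replacing the whole string, and it escapes the tokens so they are treated literally; `get_inner_string` mirrors the pseudocode's name but is otherwise hypothetical.

```python
import re

def get_inner_string(in_str, start_token, end_token):
    """Return the text between start_token and the first following end_token."""
    pattern = re.escape(start_token) + r"(.+?)" + re.escape(end_token)
    match = re.search(pattern, in_str)
    return match.group(1) if match else None

my_string = "A=abc;B=def_3%^123+-;C=123;"

inner = get_inner_string(my_string, "B=", ";")
print(inner)

# For contrast, a greedy group runs on to the LAST ";" in the string.
greedy_inner = re.search(r"B=(.+);", my_string).group(1)
print(greedy_inner)
```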