findall

how to extract string inside single quotes using python script

坚强是说给别人听的谎言 submitted on 2019-11-28 10:38:27
Question: I have a set of strings as follows:

    text:u'MUC-EC-099_SC-Memory-01_TC-25'
    text:u'MUC-EC-099_SC-Memory-01_TC-26'
    text:u'MUC-EC-099_SC-Memory-01_TC-27'

I extracted this data from an XLS file and converted it to strings. Now I need to extract the text inside the single quotes and put it in a list. Expected output:

    [MUC-EC-099_SC-Memory-01_TC-25, MUC-EC-099_SC-Memory-01_TC-26,MUC-EC-099_SC-Memory-01_TC-27]

Thanks in advance.

Answer 1: Use re.findall:

    >>> import re
    >>> strs = """text:u'MUC-EC-099
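The answer excerpt above is cut off before its pattern appears, so the following is only a minimal sketch of the re.findall approach it names. The pattern `'([^']*)'` and the variable name `quoted` are assumptions for illustration, not the original answer's code.

```python
import re

# Sample input rebuilt from the question text.
strs = """text:u'MUC-EC-099_SC-Memory-01_TC-25'
text:u'MUC-EC-099_SC-Memory-01_TC-26'
text:u'MUC-EC-099_SC-Memory-01_TC-27'"""

# Assumed pattern: a single quote, then a captured run of non-quote
# characters, then the closing quote. With exactly one capture group,
# findall() returns the captured text of every match.
quoted = re.findall(r"'([^']*)'", strs)

print(quoted)
```

Because the pattern has a single capture group, the result is a flat list of strings with the quotes and the `text:u` prefix left out.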

python - regex search and findall

大城市里の小女人 submitted on 2019-11-27 15:03:42
I need to find all matches in a string for a given regex. I've been using findall() to do that until I came across a case where it wasn't doing what I expected. For example:

    regex = re.compile('(\d+,?)+')
    s = 'There are 9,000,000 bicycles in Beijing.'
    print re.search(regex, s).group(0)
    > 9,000,000
    print re.findall(regex, s)
    > ['000']

In this case search() returns what I need (the longest match) but findall() behaves differently, although the docs imply it should be the same:

    findall() matches all occurrences of a pattern, not just the first one as search() does.

Why is the behaviour different?
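The difference comes from the capturing group: when a pattern contains one group, findall() returns that group's text rather than the whole match, and a repeated group keeps only its last repetition. The sketch below reproduces the behaviour in Python 3 and shows two possible ways around it. The variable names are illustrative.

```python
import re

s = "There are 9,000,000 bicycles in Beijing."

# Pattern from the question: one capturing group that repeats.
capturing = re.compile(r"(\d+,?)+")

# search() exposes the whole match through group(0).
whole = capturing.search(s).group(0)

# findall() returns the group instead, and a repeated group
# only remembers its final repetition.
last_repetition = capturing.findall(s)

# Option 1: (?:...) groups without capturing, so findall()
# falls back to returning whole matches.
non_capturing = re.compile(r"(?:\d+,?)+")
whole_matches = non_capturing.findall(s)

# Option 2: keep the original pattern and read group(0)
# from each match object produced by finditer().
via_finditer = [m.group(0) for m in capturing.finditer(s)]

print(whole, last_repetition, whole_matches, via_finditer)
```

Option 2 is convenient when the groups are still needed for other purposes, since each match object gives access to both the full match and the individual groups.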

Python - re.findall returns unwanted result

二次信任 submitted on 2019-11-27 02:21:50
    re.findall("(100|[0-9][0-9]|[0-9])%", "89%")

This returns only [89], and I need it to return the whole 89%. Any ideas how to do this?

The trivial solution:

    >>> re.findall("(100%|[0-9][0-9]%|[0-9]%)","89%")
    ['89%']

More beautiful solution:

    >>> re.findall("(100%|[0-9]{1,2}%)","89%")
    ['89%']

The prettiest solution:

    >>> re.findall("(?:100|[0-9]{1,2})%","89%")
    ['89%']

    >>> re.findall("(?:100|[0-9][0-9]|[0-9])%", "89%")
    ['89%']

When there are capture groups, findall returns only the captured parts. Use ?: to prevent the parentheses from being a capture group. Use an outer group, with the inner
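To see the effect on a string with several percentages, here is a small side-by-side sketch. The sample text and variable names are made up for illustration. The two patterns differ only in whether the parentheses capture.

```python
import re

# Hypothetical sample text with three percentages.
text = "Scores: 89%, 100% and 7%"

# Capturing group: findall() returns only what the group matched,
# so the trailing % is lost.
captured_only = re.findall(r"(100|[0-9]{1,2})%", text)

# Non-capturing group: no groups in the pattern, so findall()
# returns each whole match including the %.
whole_matches = re.findall(r"(?:100|[0-9]{1,2})%", text)

print(captured_only)
print(whole_matches)
```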

Finding all references to a method with Roslyn

被刻印的时光 ゝ submitted on 2019-11-26 18:45:27
I'm looking to scan a group of .cs files to see which ones call the Value property of a Nullable<T> (finding all references). For example, this would match:

    class Program
    {
        static void Main()
        {
            int? nullable = 123;
            int value = nullable.Value;
        }
    }

I found out about Roslyn and looked at some of the samples, but many of them are outdated and the API is huge. How would I go about doing this? I'm stuck after parsing the syntax tree. This is what I have so far:

    public static void Analyze(string sourceCode)
    {
        var tree = CSharpSyntaxTree.ParseText(sourceCode);
        tree./* ??? What goes here? */
    }

You're

python - regex search and findall

旧城冷巷雨未停 submitted on 2019-11-26 16:59:31
Question: I need to find all matches in a string for a given regex. I've been using findall() to do that until I came across a case where it wasn't doing what I expected. For example:

    regex = re.compile('(\d+,?)+')
    s = 'There are 9,000,000 bicycles in Beijing.'
    print re.search(regex, s).group(0)
    > 9,000,000
    print re.findall(regex, s)
    > ['000']

In this case search() returns what I need (the longest match) but findall() behaves differently, although the docs imply it should be the same: findall() matches

Beautiful Soup findAll doesn't find them all

时光怂恿深爱的人放手 submitted on 2019-11-26 16:10:22
I'm trying to parse a website and get some info with BeautifulSoup.findAll, but it doesn't find them all. I'm using Python 3; the code is this:

    #!/usr/bin/python3
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    page = urlopen("http://mangafox.me/directory/")
    # print(page.read())
    soup = BeautifulSoup(page.read())
    manga_img = soup.findAll('a', {'class': 'manga_img'}, limit=None)
    for manga in manga_img:
        print(manga['href'])

It only prints half of them...

Different HTML parsers deal differently with broken HTML. That page serves broken HTML, and the lxml parser is not

Python - re.findall returns unwanted result

萝らか妹 submitted on 2019-11-26 12:33:37
Question:

    re.findall("(100|[0-9][0-9]|[0-9])%", "89%")

This returns only [89], and I need it to return the whole 89%. Any ideas how to do this?

Answer 1: The trivial solution:

    >>> re.findall("(100%|[0-9][0-9]%|[0-9]%)","89%")
    ['89%']

More beautiful solution:

    >>> re.findall("(100%|[0-9]{1,2}%)","89%")
    ['89%']

The prettiest solution:

    >>> re.findall("(?:100|[0-9]{1,2})%","89%")
    ['89%']

Answer 2:

    >>> re.findall("(?:100|[0-9][0-9]|[0-9])%", "89%")
    ['89%']

When there are capture groups findall returns only

Finding all references to a method with Roslyn

依然范特西╮ submitted on 2019-11-26 05:27:20
Question: I'm looking to scan a group of .cs files to see which ones call the Value property of a Nullable<T> (finding all references). For example, this would match:

    class Program
    {
        static void Main()
        {
            int? nullable = 123;
            int value = nullable.Value;
        }
    }

I found out about Roslyn and looked at some of the samples, but many of them are outdated and the API is huge. How would I go about doing this? I'm stuck after parsing the syntax tree. This is what I have so far:

    public static void Analyze(string

Beautiful Soup findAll doesn't find them all

廉价感情. submitted on 2019-11-26 04:43:17
Question: I'm trying to parse a website and get some info with BeautifulSoup.findAll, but it doesn't find them all. I'm using Python 3; the code is this:

    #!/usr/bin/python3
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    page = urlopen("http://mangafox.me/directory/")
    # print(page.read())
    soup = BeautifulSoup(page.read())
    manga_img = soup.findAll('a', {'class': 'manga_img'}, limit=None)
    for manga in manga_img:
        print(manga['href'])

It only prints half of them...

Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method “find”, “findall”

本小妞迷上赌 submitted on 2019-11-25 21:46:50
I want to use the "findall" method of the ElementTree module to locate some elements in a source XML file. However, the source XML file (test.xml) declares a namespace. Here is a truncated part of the XML file as a sample:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <XML_HEADER xmlns="http://www.test.com">
        <TYPE>Updates</TYPE>
        <DATE>9/26/2012 10:30:34 AM</DATE>
        <COPYRIGHT_NOTICE>All Rights Reserved.</COPYRIGHT_NOTICE>
        <LICENSE>newlicense.htm</LICENSE>
        <DEAL_LEVEL>
            <PAID_OFF>N</PAID_OFF>
        </DEAL_LEVEL>
    </XML_HEADER>

The sample Python code is below:

    from xml.etree import ElementTree as ET
    tree = ET.parse(r"test
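The question's own code is cut off above, so the following is a sketch of two common ways to make findall() work with a default namespace, not the original poster's or any answer's code. It parses the sample XML from an in-memory bytes literal instead of test.xml. The prefix `t` and the variable names are assumptions chosen for illustration.

```python
from xml.etree import ElementTree as ET

# Sample document from the question, held in memory instead of test.xml.
XML = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<XML_HEADER xmlns="http://www.test.com">
  <TYPE>Updates</TYPE>
  <DATE>9/26/2012 10:30:34 AM</DATE>
  <COPYRIGHT_NOTICE>All Rights Reserved.</COPYRIGHT_NOTICE>
  <LICENSE>newlicense.htm</LICENSE>
  <DEAL_LEVEL>
    <PAID_OFF>N</PAID_OFF>
  </DEAL_LEVEL>
</XML_HEADER>"""

root = ET.fromstring(XML)

# Option 1: keep the namespace and map a prefix of our own choosing
# ("t" is arbitrary) to the namespace URI when searching.
ns = {"t": "http://www.test.com"}
with_prefix = [el.text for el in root.findall("t:DEAL_LEVEL/t:PAID_OFF", ns)]

# Option 2: strip the "{uri}" part from every tag after parsing,
# so plain tag names work. This modifies the tree in place.
for el in root.iter():
    if isinstance(el.tag, str) and el.tag.startswith("{"):
        el.tag = el.tag.split("}", 1)[1]
stripped = [el.text for el in root.findall("DEAL_LEVEL/PAID_OFF")]

print(with_prefix, stripped)
```

Option 1 leaves the tree untouched and is the safer choice when the document will be written back out. Option 2 discards namespace information, which is only reasonable when tag names cannot collide across namespaces.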